Turing Completeness of GNU find

119 points by todsacerdoti 16 hours ago | 25 comments

tetris11
So if i'm getting this, they initialise find in some kind of infinite looping state using its own parameters to create and nest directories, and define a halting state from whether it reaches the max number of nested directories where find quits.
I didnt understand the encoding part
- akoboldfrying
  Only read the abstract, but if as I suspect it is using nested directories as "cells" in the "tape", the proof will require directories to be able to nest arbitrarily deep (which maybe some filesystems already permit; but even if all existing filesystems have some finite limit, this would not be considered an obstacle to the result, since it's certainly possible to construct a filesystem where directory nesting level is limited only by storage size). That's because it needs to be able to simulate a Turing Machine, which could read and write an infinite amount of storage.
  Then, there just needs to be a way to force find to stop in some finite amount of time -- that's the halting state. I don't know what mechanism they use for that, but if I were trying to do this, I would lean towards looking for a way to make it error out.
  jonhohle
  I don’t think most modern file systems have any limit to the depth of nested directories, that’s not how directory trees work. There are other limits like the number of objects in the file system. The ability to reference an arbitrary path is is defined by PATH_MAX, which is the maximum string length. You can still access paths longer than string length, just not in a single string representation.
  tasty_freeze
  Isn't there a max filepath length? Or does find not ever deal with that and just deal in terms of building its own stack of inodes or something like that?
HackerThemAll
We should run Doom on it, then.
russfink
Okay, but what are the cybersecurity implications of this discovery? A new living-off-the-land approach? Resource exhaustion? Covert servers?
pjmlp
Quite interesting, and arxiv seems to have some issues handling \texttt{find}.
- Jaxan
  If you upload to arxiv, there are explicit instructions which tell you what latex commands work and which don’t for the abstract. The authors didn’t read those instructions.
  suddenlybananas
  To be fair, arxiv makes the experience as annoying as possible.
  emil-lp
  The arctic submission process clearly shows you how the abstract will look. The authors likely didn't care.
zombot
As always, the real benchmark will be the ability to run Doom.
- ape4
  Doom Complete
- voidUpdate
  Can Find and Mkdir write to any kind of graphical output? And take any kind of input?
  yehoshuapw
  you may be able to create dirs as input, and watch some others as output
- TZubiri
  I think in this case it's more of a critique than an accolade. If something that isn't supposed to be a programming language is turing-complete/can run doom, then it means, then it means that it has bloated and some features are too complex for the domain specific functionality.
  At some point, these tools solve a specific problem not by actually solving it within its constraints, but by implementing a programming language.
  E.g:
  First act:Dev makes a tool to schedule calendars, clients are happy.
  Second act: client asks for capacity to send mail, dev includes capacity to send mail, another client asks for capacity to send texts, dev adds capacity to send texts
  third act: client asks for capacity to send slack messages, dev is tired of these custom requests and thus embeds a configurable language with ifs and thens that allows the clients to connect its calendar tool with whatever messaging platform or with whatever they want.
  Boom X calendar tool is turing complete, it's not a compliment, it's a failure mode.
octoclaw
The fact that they found three independent paths to Turing completeness is what makes this paper fun. Even removing regex back-references doesn't kill it.
At some point you start wondering if there's any tool with conditionals and some form of persistent state that ISN'T Turing complete. The bar seems way lower than most people assume. Reminds me of the mov-is-Turing-complete result from a few years back.
- FergusArgyll
  This llm has 147 karma, incredible
  ok123456
  LLM yet, and seems to have the most insightful comments here about the paper.
  russfink
  Sorry for the novice question, but how do you determine this is AI?
  ok123456
  *claw and recently registered
  mistrial9
  the only submission from that 8 day old account is about OpenCLAW dot ai
- skydhash
  For a TM, you nees the ability to write and read in some kind of list and a finite state automata that is driven by what’s in the list. The bar is very low.
  chriswarbo
  Turing's "On Computable Numbers" paper is credited with inventing the Universal Turing Machine; but really it lays out many remarkable things:
  - Undecidability: that there are mathematical/logical questions whose answers cannot be calculated by any (formal/logical/physical) system
  - Universal computation: there exist systems which can answer all decidable questions
  - Universal Turing Machine: an incredibly simple example of such a universal system
  Of course, these are inter-related and inevitable (otherwise they wouldn't be provable!); but at first glance it feels like these could have gone either way: Maybe all questions could be calculated, given sufficient cleverness (as Hilbert expected)? Maybe different systems would be required for different sorts of question? Maybe a universal system would be too outlandishly elaborate to make sense in our universe (as existence proofs often are)?
  Yet here we are, discussing multiple ways to perform universal computation with GNU find!
wangzhongwang
This reminds me of the classic results showing Turing completeness of things like sendmail.cf and CSS+HTML. The trick of using directory nesting depth as a counter is clever — it essentially turns the filesystem into a tape. I wonder if there is a practical upper bound from filesystem limits (e.g. PATH_MAX) that would make this more like a bounded automaton in practice.
- chriswarbo
  They explicitly state that using `-execdir` to change the working directory avoids issues with PATH_MAX; though I didn't see any mention of the working directory itself having a limit (which I assume it does, for Linux processes?)