• This is a really great test for vibe coding. The game isn't easy: it took me several hours to pass. Vibe coding the results is ... not exactly faster. I had to keep reminding it to output logs (I'm just doing this in chat and manually copy/pasting the code), it kept getting hung up on 'the maximum wait time' exactly equaling the challenge limit, etc. Opus was able to generate a passing implementation up to level 7 on the first try but can't seem to pass level 12. Sonnet had to iterate on every level up to level 5, and couldn't pass that level.
  • Almost the same game but in C++ was an assignment in my Software Development class. The last challenge had multiple elevators that each would only service specific floors. You were intended to do some kind of pathfinding to get all passengers to their destination in time. But I just tried multiple heuristics until one passed the hidden test set.
  • Solving it with Claude is a totally different kind of fun, of course. But anyway, the Claude browser extension is very good at it. I sent it the initial prompt, and then asked it to continue on each next challenge. It passed the first 5 challenges on the fly, and started to struggle on challenge 6, which it solved after 4 attempts. I stopped at that point because the fun was depleted.

    It's like role-playing the story of a software developer in the era of AI, but accelerated. The results are truly good and fast. Coding fun: zero. The new fun is prompt/context engineering.

    <elevator_saga_solver_prompt> You are a JavaScript developer. On this page you are presented with a coding challenge to solve: an elevator to program in JavaScript. Analyze the page, take a screenshot to understand the floor and elevator layout (how many floors, how many elevators), see the sample code in the solution text box and replace it with your solution for the challenge. Keep the solution simple, just sophisticated enough to solve the task at hand, do not over-engineer or optimize, not unless your initial solution fails. After you insert the solution into the text box, click the "Start" button to test it. After the time limit set for a solution (it is indicated on the page), verify if the solution worked: read the page or take a screenshot. If it didn't work, try a new, better solution. If it worked, your task is complete. See the API documentation here: https://play.elevatorsaga.com/documentation.html#docs . </elevator_saga_solver_prompt>

  • AKA the hard drive scheduling game. Takes me back to my first algorithms class in school thirty-five years ago.
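For anyone who hasn't seen the disk-scheduling connection: the classic SCAN policy (often called the "elevator algorithm") serves pending requests in the current direction of travel, then reverses and serves the rest. A rough sketch in JavaScript, assuming positions are plain numbers; `scanOrder` and its parameters are made-up names for illustration, not part of the game's API:

```javascript
// Hypothetical helper: order pending requests SCAN-style.
// current:  current position (a floor, or a disk cylinder)
// goingUp:  true if currently sweeping toward higher positions
// requests: array of requested positions (duplicates allowed)
function scanOrder(current, goingUp, requests) {
    var unique = Array.from(new Set(requests));
    var ascending = function (a, b) { return a - b; };
    var descending = function (a, b) { return b - a; };
    var ahead, behind;
    if (goingUp) {
        // Serve everything at or above us on the way up...
        ahead = unique.filter(function (r) { return r >= current; }).sort(ascending);
        // ...then sweep back down through whatever is left.
        behind = unique.filter(function (r) { return r < current; }).sort(descending);
    } else {
        // Mirror image when heading down.
        ahead = unique.filter(function (r) { return r <= current; }).sort(descending);
        behind = unique.filter(function (r) { return r > current; }).sort(ascending);
    }
    return ahead.concat(behind);
}

// At floor 3 heading up, with calls for floors 1, 5, 2, 7 and 4:
console.log(scanOrder(3, true, [1, 5, 2, 7, 4])); // [ 4, 5, 7, 2, 1 ]
```

This only orders a fixed batch of requests; in the game new calls keep arriving, so a real solution would have to re-plan as it goes.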
  • When I was at AWS over a decade ago, there were endless complaints about the elevator algorithms by engineers, with the usual egotistical tech-bro insistence that they could do better. Things used to really suck at lunchtime when folks would flood to the elevators and be stuck waiting for ages. Those same geniuses could never figure out the benefit of staggering lunch times.

    Someone got really tired of it, and somehow organised a hackathon weekend, with the elevator company, and let teams of engineers have at it.

    Every single team failed to come up with better algorithms. All the complaints stopped dead.

    • Is AWS an environment where attempts to discover improvements are commonly mocked?
  • I love this, but I have also been having a lot of fun with https://www.codingame.com/
  • I haven't played this since probably around 2015, but I think about it regularly. I want there to be more programming games like this, but have yet to come up with any ideas as perfect as this one.
  • I remember playing this ~10 years ago. It was very humbling.
  • I thought it was fun to search for a solution that can beat every level (eventually found one!) As far as I know, no LLM can do this on its own, which tells us something about the kind of problems they’re weak at.
  • This kind of stuff can be a great LLM benchmark, as Opus basically screwed it up and created a monstrosity of a solution on the first try.
    • Interesting! It did well for a first try. This was my prompt:

      Let's play elevator saga! Here's the initial implementation:

          {
              init: function(elevators, floors) {
                  var elevator = elevators[0]; // Let's use the first elevator
                  // Whenever the elevator is idle (has no more queued destinations) ...
                  elevator.on("idle", function() {
                      // let's go to all the floors (or did we forget one?)
                      elevator.goToFloor(0);
                      elevator.goToFloor(1);
                  });
              },
              update: function(dt, elevators, floors) {
                  // We normally don't need to do anything here
              }
          }
      
      and documentation attached in the PDF.
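For context, a common first step beyond that starter code is to react to button presses instead of cycling through a fixed list of floors. The sketch below is one naive way to do that with the documented events (`floor_button_pressed`, `up_button_pressed`, `down_button_pressed`) and the documented `destinationQueue` property. It is not what Opus produced, it assumes a single elevator, and it will not beat the later challenges. The `solution` variable and the `visit` helper are illustrative names; in the game you would paste only the object literal.

```javascript
// Illustrative sketch only: first-come-first-served, one elevator.
var solution = {
    init: function (elevators, floors) {
        var elevator = elevators[0]; // assumption: only the first elevator is used

        // Hypothetical helper: queue a floor unless it is already queued.
        function visit(floorNum) {
            if (elevator.destinationQueue.indexOf(floorNum) === -1) {
                elevator.goToFloor(floorNum);
            }
        }

        // A passenger inside the elevator pressed a floor button.
        elevator.on("floor_button_pressed", visit);

        // Someone waiting on a floor pressed the up or down call button.
        floors.forEach(function (floor) {
            floor.on("up_button_pressed", function () { visit(floor.floorNum()); });
            floor.on("down_button_pressed", function () { visit(floor.floorNum()); });
        });
    },
    update: function (dt, elevators, floors) {
        // Nothing to do per tick in this sketch.
    }
};
```

Serving calls strictly in arrival order ignores travel direction and elevator load, which is presumably part of why the harder levels need something smarter.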
  • Fun!

    Reminds me that one of my favourite exercises in TLA+ is to design an elevator call system.

  • I've been fascinated by elevator algorithms since visiting NYC as a kid. The interesting stuff starts to happen when you account for popular floors, people going to work, coming home at the end of the day, dog-walking times, subway arrivals, all the semi-deterministic behavior we see in real life.