• An excellent source for this architecture is Mitch Alsup and his Usenet posts going back to the late 1980s (he was still posting regularly in the 2020s).
  • I still have some of the 88000 reference manuals, and it was really my first introduction to RISC architecture; I thought it was great. But I never figured out why companies like Apple never chose it for their CPUs.
    • I believe it was the first RISC that Apple prototyped building a Mac around, including a 68K emulator. IIRC from Gary Davidian’s CHM oral history, it was corporate dealmaking that led to AIM and PPC more than any technical negatives for the 88K.

      https://computerhistory.org/blog/transplanting-the-macs-cent...

      • Yeah, this is probably closest to the right answer. Apple DID choose the 88K, and then changed. Reportedly they put some 88K systems in a Mac chassis.

        I do wonder what the exact reasons were. Maybe the PPC (complete systems) could be made cheaper? Maybe Apple was worried about relying on a single vendor? I am kind of skeptical of the “corporate dealmaking” angle, because it seems like there are valid technical reasons to NOT choose the 88K. Namely, that it requires companion chips, and the whole system (board + chips) ends up being complicated and expensive.

        • What I always read was that Apple did not want to be stuck relying only on Motorola again like they were with the 680x0. And it worked out, kinda: Apple had IBM to rely on to make the G5 (until IBM also lost interest).

          I remember reading that the successor 88110 design, with the support chips integrated, was announced mainly to woo Apple, but I don't know how true that is.

          • Bitsavers has some documents about the Jaguar RISC project[1] that do indicate Apple's feedback went into the 88110; for example, the System ERS states: "The main processor for the Jaguar is a new version of the Motorola 88000 family which has been enhanced (with input from Jaguar's team) in several areas over the existing implementation. This processor (which will be the MC88110) will be referred to as XJS in the ERS." There's also an architecture document describing changes Apple wanted to make to the 88000 ISA, although I'm not sure how much of this actually got through into the final 88110 (Apple wanted to break binary compatibility, and I'm not sure if that happened).

            [1] The high-end RISC machine project that went nowhere, which AFAIK became known as Tesseract when it switched to PPC, before it fizzled out.

    • Timing. The 68k still had legs, i.e. the 68040 provided a great drop-in performance boost and had an enormous ecosystem and economies of scale. By the time the RISC wars were reaching fever pitch, the POWER architecture and AIM alliance seemed like a blessing to combine ecosystems and economies of scale for the A and M constituents. And it was: successful product lines for 2-3 decades, from all sorts of embedded systems to G5 workstations to spacecraft.
    • The 88000 was implemented across three large ICs. This took an enormous amount of board space and would have been unfeasible in the smaller Macs.
    • They did, basically. What happened is that Apple's own CPU project crashed and burned. Then they had meetings with various parties, including DEC for Alpha, and IBM. IBM offered POWER and was also willing to go in on some other projects, like the next-gen OS Taligent.

      But Apple didn't want to drop Motorola fully. So Motorola, Apple and IBM figured out that with some tweaks to the 88000 they could turn it into something POWER-like. And that thing was PowerPC, which Motorola supplied to Apple. That's my understanding.

    • Complicated, expensive CPU marketed to very high end workstation use? Nobody thought it was worth picking up even if it was faster than the alternatives.
      • Nobody wanted to bet payroll on a weird new ISA with no volume story. The "faster" part only matters if your compiler and OS aren't tripping over oddball silicon limits every patch release, and that was a huge if back then, because once the toolchain, ABI, and kernel are all fighting the chip, the benchmark win dies fast.
        • Except of course that Apple internally spent outrageous amounts of resources on their own CPU project, which also wouldn't have had a volume story. It's only because that project failed that they started looking at alternatives.
    • Both Apple and NeXT had machines prototyped around it, but it was initially very expensive I believe, and I think Apple was easily convinced to go with PowerPC ... and rather than evolve the 88K and push it further, Motorola dropped it in favour of going in on PowerPC.

      The sad thing is Intel showed there was still life left in CISC, and Motorola themselves ended up circling back on 68k in the form of ColdFire which proved you could do for 68k what Intel did w/ the Pentium. But by then all their 68k customers had moved on from the 68k ISA.

      • 68k was much harder to optimize than x86, being way more CISC-y

        68k, like VAX, was seen as a dead avenue, and not only in comparison to RISC

          • Motorola had made a few design mistakes, like adding memory-indirect addressing in the MC68020, which were removed much later, in Motorola's ColdFire CPUs.

            But Intel had made many more design mistakes in the x86 ISA.

          The truth is that the success of Intel and the failure of Motorola had absolutely nothing to do with the technical advantages or disadvantages of their CPU architectures.

          Intel won and Motorola failed simply because IBM had chosen the Intel 8088 for the IBM PC.

            Being chosen by IBM was partly due to luck and partly due to a bad commercial strategy of Motorola, which had chosen to develop two incompatible CPU architectures in parallel: the MC68000, intended for the high end of the market, and the MC6809 for the low end.

            Perhaps more due to luck than to wise planning, Intel had chosen not to divert their efforts into developing two distinct architectures (they were already working in parallel on the 432 architecture for their future CPUs, which was a flop), so after developing the 8086 for the high end of the market they simply crippled it a little into the 8088 for the low end.

            Both the 8086 and the MC68000 were considered too expensive by IBM, but the 8088 seemed a better choice than the Z80 or MC6809, mainly by allowing more than 64 kB of memory, which was already rather little in 1980.

            In the following years, until the 80486, Motorola managed to maintain a consistent lead in performance over Intel and always introduced various innovations a few years before Intel, but they never managed to match Intel again in price and manufacturing reliability, because Intel had the advantage of producing an order of magnitude more CPUs, which helped solve all the problems.

            Eventually Intel matched and then exceeded the performance of the Motorola CPUs, despite the disadvantages of their architecture, thanks to access to superior manufacturing, so Motorola had to restrict their proprietary ISAs to the embedded market, switching to IBM POWER for general-purpose computers.

            • Analysis of the issues in making more performant 68k and VAX implementations is a major part of what led to RISC development, with complex addressing (even in the earliest 68000) being part of the problem. People think of x86 as the archetypal CISC when reading about CISC vs RISC, but x86 was not much of a consideration when the industry was switching to RISC-style designs - it was hitting walls on complex ISAs, especially VAX (which was allowed to live for way too long), but also to an extent the 68k.

              N.b. the 68000 was supposed to be a 16-bit extension of the 6800, which among other things resulted in a hilarious two layers of microcoding.

              As for the IBM PC, the 68000 had the major flaw of being newer, while the 8086 had been available for longer and with second sources - the 68000 was released at the same time as the reduced-capability 8088, while the equivalent reduced-capability model for the 68k (the 68008) arrived in 1982.

        • > 68k was much harder to optimize than x86

          Harder to optimize, or easier to write code for because of its orthogonal instruction set?

          • Harder to optimize at the microarchitectural level, because each individual instruction represents a far more complex execution model, starting with even decoding what the CPU is supposed to do.

            x86 is comparatively simple, with indirect addressing support limited enough that it can be inlined in the execution pipeline, and many instructions either being genuinely "simple" to implement or acceptable to handle in a slow path. M68k (and VAX even more so) are comparatively harder to build a modern superscalar chip for.
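
            As a rough illustration (C here; the instruction sequences in the comments are approximate, not exact compiler output), a doubly-indirect access is the kind of thing a single 68020 memory-indirect mode can express in one instruction, while x86 splits it into two simple loads:

              /* table[0][index] needs two dependent memory accesses. */
              int load_entry(int **table, int index) {
                  return table[0][index];
                  /* 68020: one instruction can perform both accesses via a
                            memory-indirect mode, e.g. move.l ([a0],d0.l*4),d1,
                            so the effective-address calculation itself touches memory.
                     x86:   two instructions, each with a simple address:
                            mov eax,[ebx]  then  mov eax,[eax+ecx*4]            */
              }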

          • What matters is how easy it is to create an out-of-order implementation of an ISA, there isn't a 680x0 equivalent of the Pentium Pro.
        • Respectfully, this is nonsense.

          «More CISC-y» does not by itself mean «harder to optimise for». For compilers, what matters far more is how regular the ISA is: how uniform the register file is, how consistent the condition codes are, how predictable the addressing modes are, and how many nasty special cases the backend has to tiptoe around.

          The m68k family was certainly CISC, but it was also notably regular and fairly orthogonal (the legacy of the PDP-11 ISA, which was a major influence on the m68k). Motorola’s own programming model gives you 16 programmer-visible 32-bit registers, with data and address registers used systematically, and consistent condition-code behaviour across instructions.

          Contrast that with old x86, which was full of irregularities and quirks that compilers hate: segmented addressing, fewer truly general registers (5 general purpose registers), multiple implicit operands, and addressing rules tied to specific registers and modes. Even modern GCC documentation still has to mention x86 cases where a specific register role reduces register-allocation freedom, which is exactly the sort of target quirk that makes optimisation more awkward.

          So…

            68k: complex, but tidy
          
            x86: complex, and grubby
          
          What worked for x86, though, was the sheer size of the x86 market, which resulted in better compiler support, more tuning effort, and vastly more commercial optimisation work than m68k. But that is not the same claim as «68k was harder to optimise because it was more CISC-y».
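
          To make the «grubby» part concrete, here is a small sketch (not exact compiler output; the instructions named in the comments are just the classic 32-bit forms) of two places where x86 pins values to specific registers in a way an orthogonal ISA like the m68k does not:

            unsigned shift_then_divide(unsigned x, unsigned n, unsigned d) {
                unsigned shifted = x << n;  /* x86: a variable shift count must sit in CL;
                                               68k: lsl.l Dn,Dm takes any data register   */
                return shifted / d;         /* x86: DIV reads and writes the EDX:EAX pair;
                                               68k: divu targets any data register        */
            }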
          • Notice I didn't write "harder to optimize for" - I am not talking about optimizing code, but about optimizing the actual internal microarchitecture.

            Turns out m68k orthogonality results in an explosion of complexity in the physical implementation and is way harder to optimize, especially since compilers did use those features. The far more limited x86 was harder to write code generation for, but it meant simpler execution in silicon and less need to pander to slow-path-only instructions. And on top of that, Intel's scale meant they could have two or three teams working on separate x86 CPUs at the same time.

  • I confess I have a soft spot for these machines - the road not taken is always tempting to explore. Sadly, it didn't do well on the market, even less in Europe, so there are very few working machines around me and even fewer floating around on eBay. :-(
  • The 88k multi-chip cache/MMU architecture is fascinating, especially how it could be designed with a single cache chip, or a split I/D cache across two or more different chips.
  • Weird to see Omron mentioned. I have a digital weight scale from them in my bathroom :)
  • m88k is an ISA primarily designed by Mitch Alsup.

    Mitch Alsup has extensive experience in ISA design and has participated (tangentially) in informing the RISC-V design process.

    Recently, he's designed my66000, an interesting, fresh take on ISA design that I recommend exploring.

    • Alsup has written a lot about his my66000 on Usenet, but does not share documents about it with everyone. (Yes, I've emailed him and been ignored. I have had to piece together what I know about it from multiple posts.) Apparently it runs in an FPGA and there are assemblers and compiler back-ends for it.

      Like the 88000, the register file is shared between integer and floating point units. One interesting detail is that it supports CRAY-style vector operations using the same architectural registers, and downgrades to scalar operation automatically on interrupts. This means that the register state to load/store on context switches is small.
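
      A loose sketch of what that implies for the per-thread state a kernel would save on a context switch (the register count and field names here are my guesses pieced together from those posts, not from any official document):

        #include <stdint.h>

        /* Guesswork layout, not taken from an official My 66000 document. */
        struct thread_context {
            uint64_t gpr[32];  /* a single shared file: integer, FP and in-flight
                                  vector-loop values all live in these registers  */
            uint64_t pc;
            /* No separate FP or vector register file to spill: since vector
               loops collapse back to scalar register state at an interrupt,
               this small structure is (per the description above) essentially
               all a context switch has to save and restore.                      */
        };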
