Showing posts from 2025

Enter the IBM z17 mainframe with Telum II (more clues for Power11?)


IBM is announcing their new z17 mainframe, based on the Telum II (see our notes on the original Telum CPU). IBM first announced the Telum II last year and the z17, its intended first deployment, has now emerged just about bang on time.
Still, we're obviously more interested in Power ISA around here, and IBM has yet to say much of substance about Power11 beyond the usual assertions of better power efficiency, more cores and higher clocks. It is also expected to offer DDR5 support for enhanced memory bandwidth, though this is all but certain to require OMI DDR5, not direct-attached RAM as in our Raptor boxes. But it's often instructive to look at what's going on with IBM mainframes for microarchitectural clues, now that Z-machines and IBM's "big" Power chips often share the same underlying design.

The first Telum strongly emphasized cache. Interestingly, it did so by dropping dedicated L3 and L4 altogether: instead, IBM developed a strategy where cores could reach into the L2 of other cores and treat that as L3, and reach into other chips' cache and treat that as L4. Each chip had eight cores and 32MB of L2 per core, giving lots of opportunity for more efficient utilization. The picture of the Telum II die above shows that IBM has not substantially deviated from this plan, using the same 128K/128K L1 but increasing L2 to 36MB per core. IBM's documentation says that there are eight cores per chip, but at a cursory glance there appear to be ten on the die, likely for yield reasons (two cores would be fused off). Assuming these dud cores still have usable cache, however, ten 36MB L2s would match IBM's specs of up to 360MB of effective L3 and a whopping 2.88GB of L4 per system.
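For anyone who wants to check the cache figures above, here is a quick back-of-the-envelope sketch in Python. The function names are made up for illustration, and the chip count is not something IBM is quoted on here: it is simply what falls out of dividing the stated L4 figure by the stated L3 figure, assuming decimal units (1GB = 1000MB).

```python
# Back-of-the-envelope check of the "virtual" cache numbers quoted in the post.
# Assumption: every L2 region on a chip can be borrowed as effective L3, and
# effective L4 is the sum of effective L3 across chips.

def effective_l3_mb(l2_regions, l2_mb_each):
    """Effective L3 per chip if every L2 region is reachable by other cores."""
    return l2_regions * l2_mb_each

# Original Telum: eight cores, 32MB of L2 each.
telum_l3 = effective_l3_mb(8, 32)

# Telum II: 36MB of L2 each; compare eight active cores against ten regions on the die.
telum2_l3_active_only = effective_l3_mb(8, 36)
telum2_l3_all_regions = effective_l3_mb(10, 36)

# IBM's quoted 2.88GB of effective L4, expressed in MB (decimal units assumed).
quoted_l4_mb = 2880
implied_chips, remainder = divmod(quoted_l4_mb, telum2_l3_all_regions)

print("Telum effective L3 (MB):", telum_l3)
print("Telum II effective L3, 8 regions (MB):", telum2_l3_active_only)
print("Telum II effective L3, 10 regions (MB):", telum2_l3_all_regions)
print("Chips implied by the L4 figure:", implied_chips, "remainder:", remainder)
```

Only the ten-region case lines up with the quoted 360MB, which is why the fused-off cores' cache presumably remains in play.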

The cores top out at 5.5GHz with various microarchitectural improvements such as better branch prediction and faster store writeback and address translation, all the typical kinds of tweaks that would also likely show up in Power11. Power11 is also expected to remain on 7nm with a "refined" process instead of moving to 5nm. (It's possible that Power12, whenever that arrives, may skip 5nm entirely.)

Of course, the marketing material on z17 is all AI all the time. IBM's claimed AI improvements seem to descend from an enhanced "DPU" ("data processing unit") with its own 64K (32K instruction/32K data) L1 cache, capable of 24 trillion INT8 operations per second. It's the kind of bolt-on hardware that could also be incorporated into other products or scaled down for them. In fact, such a product exists already, shown above: IBM's Spyre Accelerator, which is basically 32 more DPUs. These attach over PCIe and would be a good alternative to our having to scrabble around with iffy GPU support, assuming that IBM supports this in Linux (but they already do for LinuxONE systems, so it shouldn't be much of a stretch).

If you have the money and a convenient IBM salesdroid who actually answers the phone, you too can horrify your electrical utility starting in June. As for those of us on the small systems side, Power11 in whatever form it ends up taking is not anticipated to emerge until Q3 2025, presumably as what will be the E1100 series starting with the E1180 and going down. This further shrinks the production and sales window for the long-anticipated Raptor S1 systems, however, and there hasn't been a lot of news about those — to say nothing of what the Trump tariffs could mean for rolling out a new system.

Plan 9 finally comes to the POWER9


A great announcement today from the 9front team: ongoing support for ppc64. Today's release finally brings the groundbreaking Plan 9 operating system to our platform as a preferred architecture, with big endian a first-class citizen. It's idiosyncratic, sometimes baffling and at times deliberately outright user-hostile, but now you have a choice of operating systems and can truly use our refreshingly different computers with a refreshingly different user interface paradigm. I'm downloading and installing it on the Blackbird right now and I'll report back with a full review in detail.

Also, I'm well aware of the calendar, thanks.

Real-world Blackbird does real-world stuff for Apache (really)


Let's call it a #ShowUsYourTalos moment. This video from Savoir Technologies shows off their own Blackbird system, carrying an 8-core POWER9 CPU with a 3U HSF, a 4-slot NVMe riser card, two 64GB DDR4 DIMMs and a 500W PSU running on the onboard ASPEED framebuffer.

But this machine isn't just a bragging rights toy: it provides substantial support for the Apache products Savoir works on. These are primarily Java-based, and there are three main choices for JDKs on POWER9: Adoptium's Temurin, Eclipse OpenJ9 (descended from IBM's original J9, which I personally run on my AIX POWER6), and Red Hat's build of OpenJDK. Savoir tests on all three.

As anyone working on Java will attest, it's not enough just to make sure it works on different JVMs. This machine is dedicated to improving ppc64le support, stability and performance actually on the architecture itself. (Linus would agree.) Savoir does multiple builds to tamp down broken unit tests and find glitches due to Power ISA's different memory model guarantees. One example they cite in the video was a stress test they did on this very box, running one billion SOAP requests through Apache CXF with no errors.
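To make the memory model point a little more concrete, here is a toy sketch in Python of the classic "message passing" litmus test. It is emphatically not a simulation of POWER9 or of any JVM, and all of the names in it are invented for illustration. It just enumerates every interleaving of a writer (store data, then store flag) and a reader (load flag, then load data), first with each thread's operations kept in program order, then with them allowed to reorder, as a weakly ordered machine like Power may do when there are no barriers.

```python
from itertools import permutations

def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute one global schedule; return what the reader saw as (flag, data)."""
    mem = {"data": 0, "flag": 0}
    seen = {}
    for op, name in schedule:
        if op == "store":
            mem[name] = 1
        else:  # load
            seen[name] = mem[name]
    return seen["flag"], seen["data"]

def outcomes(writer_orders, reader_orders):
    """Collect every (flag, data) result over all allowed orders and interleavings."""
    results = set()
    for w in writer_orders:
        for r in reader_orders:
            for schedule in interleavings(w, r):
                results.add(run(schedule))
    return results

writer = [("store", "data"), ("store", "flag")]
reader = [("load", "flag"), ("load", "data")]

# Program order preserved within each thread (toy "strong" model).
strong = outcomes([writer], [reader])
# Each thread's two operations may be reordered (toy "weak" model, no barriers).
weak = outcomes(list(permutations(writer)), list(permutations(reader)))

print("strong:", sorted(strong))
print("weak:  ", sorted(weak))
```

The interesting result is (1, 0): the reader sees the flag set but stale data. It cannot happen in the in-order model but shows up once reordering is allowed, which is the sort of bug that only appears when code missing proper synchronization (`volatile`, locks and so on in Java) gets tested on the real hardware.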

I'm not involved with or linked to Savoir in any way; I'm just delighted to see real hardware in the real world doing real things for real people. Right now, I don't think you're going to get throughput like this from anything with the current crop of RISC-V chips in it, and I'm hopeful that S1 is still in the pipeline to give us the shot in the arm we need to stay ahead of the curve on open hardware.

Microwatt goes multiprocessor


It's been a while since we dropped in on Microwatt, the OpenPOWER VHDL softcore. Microwatt now runs on multiple FPGA boards or can be run (slowly) in simulation, and is capable of booting Linux. Raptor uses Microwatt for the Arctic Tern soft BMC. Although it still doesn't support vector instructions, recent commits have added an FPU and many of the standard special-purpose registers, and the newest ones now add support for SMP.

The newest pull request, which has yet to be committed, allows more than one processor core to be created by adding an NCPUS option to soc.vhdl. These cores can be debugged separately with JTAG, share the same view of memory and the same timebase value, and can be individually activated. For interrupts, they each have their own presentation controller in the XICS.

Although Microwatt cores are currently of only modest performance, more cores (if you have the space) can certainly improve its throughput and the range of applications it could be practical for. Unfortunately, we have yet to hear anything new about the Solid Silicon S1 or how libre Power11 will end up being. Hopefully, as the Microwatt design gets more efficient, at least the very smallest Power ISA systems will now have some additional flexibility to work with.