Keynote notes from Day 1 of OpenPOWER Summit NA (and introducing the Condor)
I'm still catching up on everything since I have to do this after $DAYJOB, but among all the great vendor and technology announcements, the big news from the Day 1 keynote at the OpenPOWER Summit was the last of the POWER9s and the next Raptor system.
Although there were many great pieces in the keynote, the IBM Power roadmap is of course of particular interest. The big news there was a subtle but significant change in announced specs. Although one more generation of POWER9 is planned before POWER10, compare this slide to what we posted last year:
For the (now) 2020 "Advanced I/O" POWER9, there's still the same number of PCIe lanes, same signaling speed, same CAPI, NVLink and OpenCAPI 4.0 options. But memory bandwidth went from 350 GB/s to 650 GB/s.
This whopping difference appears to be from OpenCAPI agnostic buffered memory, implemented as OMI, the Open Memory Interface. For POWER8, IBM introduced Centaur, a way of getting around the inherent limitations of running DDR RAM on a large number of channels by putting an intermediate controller in between. Instead of the CPU driving the RAM directly (as in POWER9 scale-out CPUs like the Sforzas in our Talos II systems), Centaur accepts high-level read and write commands from the CPU(s) and abstracts away the details of getting the data to and from the actual DIMMs, including reordering requests and caching them as needed. Each differential memory interface channel on the CPU has its own Centaur, which in the current implementation offers four DDR4 memory channels, giving a single CPU up to eight buffered channels to memory and effectively 32 channels to the underlying DDR4. Centaur is also supported on POWER9 scale-up so that people's investment in RAM won't go to waste, but a complex chip like that adds various board engineering constraints, which is why POWER9 scale-out with direct memory attachment was also offered as an option for systems that didn't quite need all Centaur had to offer. (Scale-out's emphasis on PCIe lanes makes a difference in that market segment, too.)
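To make the fan-out arithmetic above concrete, here's a quick sketch in Python. The function and parameter names are my own, purely for illustration, and aren't from any IBM documentation.

```python
def effective_ddr_channels(buffered_channels: int, ddr_channels_per_buffer: int) -> int:
    """Number of DDR channels reachable behind a set of memory buffer chips."""
    return buffered_channels * ddr_channels_per_buffer


# Figures quoted above: up to eight buffered channels per CPU,
# and four DDR4 channels behind each Centaur.
print(effective_ddr_channels(8, 4))  # 32
```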
The idea with OMI is to strike a balance between buffering memory and directly attaching it, and AIO POWER9 will be the first CPU to support it. By being "agnostic" it has no ties to any particular underlying memory technology, meaning it can grow as new technologies emerge. OMI runs at 25 Gbps per lane with a latency of just 5 ns instead of the 10 ns of present-day Centaur. Best of all, it will be non-proprietary, meaning any vendor that wants to make an OMI-compliant memory system can do so and hopefully increase the economies of scale. In fact, one of them did:
Microchip subsidiary Microsemi's OMI-compliant "differential DIMMs" (DDIMMs) should be available simultaneously with the AIO POWER9 next year, using their custom on-board DDR4 OMI interface with an eight-lane channel for a full 25 GB/s. I have to say I'm a little cold on yet another RAM standard (looking at the weirdo RAM in my SGI Fuel), but as long as the prices are competitive and the performance is stonking, I could be convinced. Alternatively, the OMI controller could simply be on the board and fan out to regular DIMM slots more or less as things work now, though this robs the standard of some of the future-proofing I think it's intended to have.
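As a sanity check on that 25 GB/s figure, here's a back-of-the-envelope sketch in Python. It assumes every bit on the wire is payload (no encoding or protocol overhead, which the figures above don't break out) and counts one direction only. The function name is made up for illustration.

```python
def raw_channel_gbytes_per_sec(lanes: int, gbits_per_lane: float) -> float:
    """Raw one-direction bandwidth of a channel in GB/s, at 8 bits per byte."""
    return lanes * gbits_per_lane / 8


# An eight-lane OMI channel at the quoted 25 Gbps per lane.
print(raw_channel_gbytes_per_sec(8, 25))  # 25.0
```

Real-world throughput would come in under this once overhead is counted, so treat it as an upper bound on a single channel rather than a measured number.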
Back to IBM. The 14nm "Bandwidth Beast," as they're nicknaming the AIO POWER9, will have 16 x8 OMI channels at 25 GT/s and -- there it is -- up to 650 GB/s peak bandwidth. Microchip's buffer won't get that high, though, which makes it a puzzling thing to pair with; it seems to top out at "only" 410 GB/s (I know, cry me a river). Onboard will be up to 24 SMT-4 cores, up to 120 MB eDRAM L3 cache, 48 PCIe 4.0 lanes (yes, same as our scale-out Sforzas) at 16 GT/s, and up to 48 lanes each for NVLink and OpenCAPI 4.0 attaches. Clearly IBM intends this to replace both scale-up and scale-out simultaneously, so I guess AIO also stands for "all in one":
Oh yeah ... Raptor was there too. Here's Hugh Blemings introducing Tim Pearson:
I'll gently needle Raptor here and say they need a PowerPoint or LibreOffice deity to sex up their slides a bit. But who needs eye candy when you can announce this?
Yes, friends, you too can have a big, intimidating vulture of a computer -- in a form factor smaller than the T2. The Condor is that mythical LaGrange system we heard about last fall. It's a single-socket system so that it fits in an ATX form factor, as opposed to the hulking EATX T2 I'm typing this on, so it won't take advantage of the extra X-Bus capacity, although a multi-socket LaGrange would probably have been too pricey and power-hungry (and too big) for our rarefied workstation market anyway. It will have 4 PCIe slots and 8 DDR4 slots (42 PCIe lanes as opposed to Sforza's 48, but double Sforza's four DDR4 channels), which for my money would slot it between the T2 and T2 Lite, and Raptor seems to be encouraging this comparison with the board size. The extra PCIe slot would probably entice some buyers who don't find the Blackbird or T2 Lite expandable enough but don't want to go the whole hog, as well as those looking for a less expensive platform to experiment with OpenCAPI (it offers one slot).
We would expect nothing less from Raptor than for it to be a fully open, blob-free platform, and it will be available in Q1 2020. Price wasn't announced, but my guess is it will be commensurate with that same product placement.
One final miscellaneous note: I don't recall anything said about this, but Raptor seems to have it on its wiki now, so I'll assume the "embargo" is lifted. While you're waiting for the AIO, the Sforza DD2.3 stepping should be emerging soon, which will fix various errata including the DAWR for hardware watchpoints. Finally! This should drop right into your existing Talos and Blackbird systems.