As is now becoming tradition, Ampere Computing has published an annual roadmap update. In the 2024 disclosure, the company highlights wins for its AmpereOne product line, collaborations with partners, how it offers SKU differentiation, and some product/performance data. Some of this news is simply 'good stuff for business', but it also contains some competitive analysis. I've been following Ampere for years, and while this update shows the company in a good light and moving forward, I still have some long-tail questions (and they know, I keep asking them!).
Quick Recap
Ampere Computing, or simply 'Ampere', is a rarity in the semiconductor space. After more than a dozen companies tried and failed, it is the only one to succeed in selling Arm-based server processors on the open market. Part of that success is attributed to the staff - as CEO Renee James explained to me in an interview in December 2023, they hired experts from those failed attempts to learn from them and avoid the same pitfalls. Today, there are many Arm-based enterprise server processors: there's NVIDIA's Grace, Amazon's Graviton series, Google's Axion, and Microsoft's Cobalt. The difference is that those processors aren't sold on the open market, at retail - whereas Ampere's product line is designed to be.
Ampere has two product lines - Ampere Altra/Altra Max, and AmpereOne. Technically, the name is probably the Ampere AmpereOne (like the Ferrari LaFerrari). The product lines are differentiated in core count, core microarchitecture, and target market, with Altra being released in 2020 and AmpereOne being released in 2023.
The Altra/Max SoC, codenamed 'Mystique', offers up to 128 Arm Neoverse N1 cores using TSMC N7 and eight channels of DDR4. It is monolithic, and the 80-core Altra came first, followed by the 128-core Altra Max. At the top of the stack is the M128-30, which breaks down as Mystique, 128 cores, at 3.0 GHz. Ampere ensures that the cores run at 3.0 GHz with any workload, regardless of complexity, in order to provide predictable performance.
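The naming breakdown above can be sketched as a small parser. This is only an illustration of the pattern described in this article (a family letter, a core count, then frequency in tenths of a GHz); the function and type names are made up for the example, and the only family letter mapped is the 'M' for Mystique given above.

```python
import re
from typing import NamedTuple

# Only the mapping stated in the article; other letters are passed through as-is.
FAMILY_NAMES = {"M": "Mystique"}


class AltraSku(NamedTuple):
    family: str
    cores: int
    ghz: float


def parse_altra_sku(sku: str) -> AltraSku:
    """Split a name such as 'M128-30' into family, core count, and GHz.

    Assumes the pattern <letter><cores>-<GHz x 10> described in the text.
    """
    match = re.fullmatch(r"([A-Z])(\d+)-(\d{2})", sku)
    if match is None:
        raise ValueError(f"unrecognised SKU format: {sku!r}")
    letter, cores, freq = match.groups()
    family = FAMILY_NAMES.get(letter, letter)
    return AltraSku(family, int(cores), int(freq) / 10)


print(parse_altra_sku("M128-30"))
print(parse_altra_sku("M96-28"))
```

The second call uses the M96-28 that shows up later in this piece in the NETINT server, which decodes the same way: 96 cores at 2.8 GHz.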
The AmpereOne SoC, codenamed 'Siryn', uses a completely custom, internally developed Arm v9 core. Some minor details are known about this core - one of the overhanging questions to Ampere (which I always ask) is around microarchitecture disclosures. The part up to now has offered 128 to 192 cores, built with TSMC N5, and using eight channels of DDR5.
For US users who want to use Ampere Altra/Max today without going to a server OEM, ASRock Rack has a motherboard that often goes up for sale paired with a CPU over at Newegg for $1500. Reviews of the hardware have recommended buying a different CPU cooler than the one in the box, which is more suitable for servers and not really for under-desk towers. AmpereOne is not yet available in a similar way.
The company has over 1500 people worldwide, and focuses on 'sustainable computing' - i.e. extracting efficiency at the CPU socket for the cloud. The main customer (we believe) so far has been Oracle, and Altra has appeared in a number of cloud instances online from Google Cloud and AWS.
2024 Roadmap Update
For this year, Ampere Computing is announcing an expansion to the AmpereOne family. Alongside the top-of-line 192-core part, there will be a 256-core version. This one, interestingly enough, is built on TSMC N3, but will contain the same core design as before. It currently sits ready to go at the fab, looking to make its way to partners either by end-of-year 2024 or early 2025.
While the OG AmpereOne is 192-core with 8-channel DDR5, there will be a 12-channel DDR5 version as well. The 256-core part will only be available with 12 channels. These 12-channel parts will use a different motherboard because of how many pins DDR5 needs.
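To put those configurations side by side, here is a quick back-of-the-envelope sketch of how many cores share each memory channel. It uses only the core and channel counts above; it deliberately ignores DDR5 speed, which isn't stated here, so it says nothing about absolute bandwidth.

```python
# (label, cores, DDR5 channels) for the configurations described above
configs = [
    ("AmpereOne 192-core, 8-channel", 192, 8),
    ("AmpereOne 192-core, 12-channel", 192, 12),
    ("AmpereOne 256-core, 12-channel", 256, 12),
]


def cores_per_channel(cores: int, channels: int) -> float:
    """Average number of cores sharing one memory channel."""
    return cores / channels


for label, cores, channels in configs:
    print(f"{label}: {cores_per_channel(cores, channels):.1f} cores per channel")
```

On these numbers alone, the 256-core part ends up with fewer cores per channel than the original 192-core, 8-channel part, despite having more cores.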
Performance Comparisons
Take all the following numbers with the usual caution given their first-party nature, but Ampere claims that the 256-core will offer 40% better per-socket performance than anything available in the market.
When it comes to the current 192-core, Ampere claims up to 50% better performance per watt than AMD's Genoa offering, and 15% better performance per watt over Bergamo. Ampere styles its hardware as 'Cloud Native', so we expect to see most comparisons against other designs focused on cloud customers - that means AMD Bergamo, and in the future, Intel Sierra Forest.
On the cloud native workload side, Ampere is pulling a few strings with this one. With web and database workloads such as NGINX, MySQL, and others, the company compares performance/rack numbers as well as what is required to meet a goal in performance and power. In this case, Ampere claims you can meet the same workload target with 15% fewer servers and at 35% lower power, while experiencing up to 58% better performance per rack on some tests.
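As a rough sanity check on what those two percentages imply together, the sketch below works out the per-server power ratio. It assumes the 15% and 35% figures describe the same deployment doing the same total work, which Ampere's slide may or may not intend; the function name is just for illustration.

```python
def per_server_power_ratio(fewer_servers: float, lower_power: float) -> float:
    """Per-server power of the new fleet relative to the baseline fleet.

    fewer_servers: fractional reduction in server count (0.15 for 15%)
    lower_power:   fractional reduction in total power (0.35 for 35%)
    """
    server_ratio = 1.0 - fewer_servers
    power_ratio = 1.0 - lower_power
    return power_ratio / server_ratio


ratio = per_server_power_ratio(0.15, 0.35)
print(f"Each server draws {ratio:.3f}x the power of a baseline server")
```

In other words, if 100 baseline servers become 85, and total power drops to 65% of the original, each of those 85 servers would be drawing roughly three quarters of what a baseline server does.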
The company also likes to point out that their processors are designed to run at near 100% with predictable utilization, whereas x86 processors often are not used to the fullest, thereby perhaps increasing costs with both server deployments and licensing costs.
New Partnerships
Ampere is still technically a startup, though it has been well funded over time. The main cited partner is The Carlyle Group, but Oracle also has a financial and technical partnership. Startups love to talk about how they're integrating with other players in the industry, and Ampere is focusing on servers that use Altra or Altra Max as the CPU.
First up is NETINT - a video transcoding ASIC company (that's a gross oversimplification) which has a number of large customers that serve video that aren't called Google. The NETINT Smart VPU deals with H264, HEVC, VP9, AV1, at a wide variety of resolutions, and if I were to do an article about the video transcoding ASIC business, they'd be one of the key parts of it. In this annual update, Ampere is showcasing that NETINT has chosen Ampere Altra Max (an M96-28, that's 96 cores at 2.8 GHz) as the CPU in its Quadra Video Server.
Another partnership is with Qualcomm and Supermicro - it looks like one of Supermicro's offerings in the AI LLM space will be a combination Altra Max/AmpereOne and Qualcomm Cloud AI 100 Ultra server. The goal here is to provide substantial CPU compute paired with a number (8, perhaps more?) of high-performance language inference cards. Qualcomm doesn't speak about the AI 100 Ultra much, but we're starting to see it pop up paired with other hardware (Cerebras, NeuReality).
These systems will be pre-qualified for specific LLMs up to and including 100b+ parameters. The headline metric for this platform is going to be a 5x improvement in inferences-per-dollar against NVIDIA hardware.
Ampere is also keen to point out that it is a member of the AI Platform Alliance - a group of companies wanting to create standards around AI hardware to address the scale of inference that is expected to be required globally. This covers hardware, frameworks, and silicon integration, with the aim of providing a quicker overall time-to-market.
All the companies here are startups, funnily enough. It makes sense that a group of them want to bunch together to ensure they have a common competitive platform. Ampere is the only 'CPU' in this group, which likely bodes well for them if anything comes out of it. Most of the rest are inference hardware startups, and a good portion of those are Korea based (Furiosa, Rebellions, Sapeon). Not sure what that means in the grand scheme of things, but it's half of the Korean companies I'm tracking right now.
AmpereOne SKU Updates
So while I continually ask about what's in the AmpereOne core, one thing I do know is that AmpereOne is chiplet based. It has a main monolithic die which has all the cores, and then several chiplets around the outside which have the PCIe and DDR controllers. It's hard to say how many - in most pimp shots Ampere keeps all of that under the very thick shim. But put one up to the light, as I did way back in 2023, and you can clearly see chiplets inside. If anyone is familiar with Amazon's Graviton 3/4, they are doing that as well. There may be chiplets inside AmpereOne doing other things than PCIe and DDR, but what it means is that depending on the customer, Ampere can swap out these chiplets if more DDR is needed, or more PCIe is needed, or perhaps something else entirely. This is how the company is able to offer the 192-core version in 8-channel and 12-channel variants. Ampere has designed the chiplets and IP themselves, I was told, which isn't easy, but it does showcase how chiplets can be used to create different end-products.
To that point, the hood is being lifted a little at this update. AmpereOne contains two new technologies - FlexSpeed and FlexSKU.
FlexSpeed is designed to increase frequency where power is available. Ampere is quick to point out that this is NOT TURBO as other companies suggest. In this case, the processor will never go above the power limit of the hardware as sold, whereas other turbo options will. This is simply using extra power headroom that may be there when binning leaves you with a good chip. Ampere says that FlexSpeed is predictable and deterministic - users can run in frequency-fixed or frequency-variable mode, but in both cases, power-fixed will always be enabled. (I should note here that AMD offers very much the same thing.)
FlexSKU, however, is a feature more suited to multi-tenant operation. FlexSKU allows specific groups of cores on the chip to run at different frequencies, optimized for power or memory accesses depending on the workload. This means 10 cores could run at 3.0 GHz, whereas the others will run at 2.6 GHz, and that 10-core segment can be walled off virtually for a specific customer. Ampere tells me that it's actually customers that run multiple workloads on one machine that want this, so it does ultimately require a full performance evaluation of the workload. (I should note here that Intel offers something similar, called SST, though it's an incredibly long document of turbo tables even if you manage to find the documentation - Ampere told me that their solution is flexible and easy without that mess.)
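To make the 10-core example concrete, here is a purely illustrative model of the idea of core groups running at different frequencies. To be clear, this is not Ampere's interface: no FlexSKU API or configuration format has been disclosed, so every name here (`CoreGroup`, `partition`, the group labels) is hypothetical, as is the choice of a 192-core part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreGroup:
    """Hypothetical description of a set of cores sharing one frequency."""
    name: str
    cores: range
    ghz: float


def partition(total_cores: int, fast_cores: int,
              fast_ghz: float, base_ghz: float) -> list[CoreGroup]:
    """Split a chip into one faster group and one group for everything else."""
    if not 0 <= fast_cores <= total_cores:
        raise ValueError("fast_cores must be between 0 and total_cores")
    return [
        CoreGroup("walled-off tenant", range(0, fast_cores), fast_ghz),
        CoreGroup("everyone else", range(fast_cores, total_cores), base_ghz),
    ]


# The example from the text: 10 cores at 3.0 GHz, the rest at 2.6 GHz.
groups = partition(total_cores=192, fast_cores=10, fast_ghz=3.0, base_ghz=2.6)
for group in groups:
    print(f"{group.name}: {len(group.cores)} cores at {group.ghz} GHz")
```

The point of the model is only that the groups are disjoint and cover the whole chip; how the real hardware assigns, isolates, or power-budgets those groups is exactly the sort of detail that hasn't been shared.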
2024 Thoughts on Ampere
Honestly, sometimes it's hard to know where to position Ampere. They're the only company selling Arm-based CPU silicon like this. They don't publish financial results because they don't need to - it's still a startup. I'm told most of their customers do like to keep quiet, making it hard for the company to promote its wins and allow the outside to assess how the company is scaling. Despite the launch of AmpereOne, the company is still being hush-hush on the core design, so it's hard to compare.
Breaking into an established market is hard. Qualcomm's going to try it with the Arm-based Oryon chips later this year in PCs, attempting to break the x86 stranglehold. Ampere is trying to do similar in the cloud, with Intel and AMD at the lead. Software is always a challenge; however, with the advent of cloud providers rolling their own Arm silicon, cloud software is shifting to being architecture agnostic (to a certain degree). But the fact that cloud providers are rolling their own and not buying Ampere would suggest the company has difficulty getting adopted. The cloud providers are optimizing for their own workloads - they only sell to themselves, whereas Ampere has to sell to everyone.
The corollary to this is that there are more than seven cloud vendors in the world. There are plenty of tier 2, tier 3, or sovereign installations around the world, servicing small and medium businesses or localized services for the big players. Some of these sovereign clouds have showcased budgets in the billions, and that money has traditionally been spent on x86 hardware. This could be a potential avenue, given that the big cloud custom silicon won’t be available to that market.
On the CPU core side, with the shift in the market to AI, less emphasis is being put in those systems on what the CPU core is doing - NVIDIA is using an Arm Neoverse core ‘because customers are more concerned about +10% performance on the GPU’, according to them. Ampere by contrast says that their customers fight over every last bit of performance on their hardware, even the CPU, and going down a custom core design allows for a greater differentiation versus everyone that uses the standard off-the-shelf designs. If/when the company does IPO, these annual updates are going to be under a lot more scrutiny than they are right now.
However, my interview with CEO Renee James a few months ago was enlightening - I did come away with a greater appreciation for the company. They're bigger than you think, and it's clear there's a good amount of secret sauce in the chips that is likely customer-specific, and they'll never tell us about that. But if you want the hardware, simply to play with, then you can go and buy it today.
I'm hoping Ampere will disclose more about the AmpereOne/Siryn core design in due course. If they do, you'll be the first to hear my thoughts.