Early this year, one of the AI hardware startups did something I’d been asking them all to do for a very long time: offer chips for developers. For ALL developers, not just your preferred clients. These companies need to offer something that someone can buy with $$$ to be able to use at their desk for development work, without the need for extensive engagements or large interactions back and forth.
Tenstorrent in 2023 offered two PCIe cards featuring one of their first silicon chips direct from their website - Grayskull e75 and Grayskull e150, for $599 and $799 respectively. The goal of these chips wasn’t a revenue exercise, but a chance to get their architecture and software stack in the hands of people who develop for these sorts of things. Feedback on the hardware or software has been a plus for Tenstorrent, enabling features and approaches that were outside of the box for a traditional startup.
Today, Tenstorrent is enabling a second phase of the development kit strategy. The next generation chip, Wormhole, is going to have a similar launch. There will be cards, and systems.
For the cards:
Wormhole n150 ($999), a PCIe card with one Wormhole chip
Wormhole n300 ($1399), a PCIe card with two Wormhole chips
As the names imply, the n150 is a 150 W card with a single chip, while the n300 is a 300 W card with two chips, each at 150 W. Each chip has direct access to 12 GB of GDDR6 (192-bit), and features 80 Tensix cores at 1 GHz for a total of 292 TOPS of FP8 performance.
What’s important here is perhaps the networking - each chip has 3.2 Tbps of Ethernet, which breaks down to sixteen 200 Gbps connections enabling full bandwidth scale-out. Tenstorrent’s philosophy here is that the scale-out performance of the system is the ultimate goal, and you can’t get there with overpowered hardware that is limited by connectivity. Tenstorrent shifts the balance towards connectivity, allowing users to scale multiple chips together in a mesh. That’s where the developer systems come into play.
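The per-chip networking figure is easy to sanity-check. The snippet below is only a back-of-the-envelope sketch using the link count and link speed quoted above; the function and variable names are made up for illustration and are not part of any Tenstorrent tooling.

```python
# Per-chip Ethernet figures as quoted in the article.
LINKS_PER_CHIP = 16    # sixteen connections per Wormhole chip
GBPS_PER_LINK = 200    # each connection runs at 200 Gbps


def aggregate_tbps(links: int, gbps_per_link: int) -> float:
    """Total Ethernet bandwidth for one chip, in terabits per second."""
    return links * gbps_per_link / 1000


per_chip_tbps = aggregate_tbps(LINKS_PER_CHIP, GBPS_PER_LINK)

# The same figure in gigabytes per second (8 bits per byte, raw line rate).
per_chip_gigabytes_per_s = LINKS_PER_CHIP * GBPS_PER_LINK / 8

print(f"{per_chip_tbps} Tbps per chip, {per_chip_gigabytes_per_s} GB/s raw")
```

This is raw line rate only; usable bandwidth after protocol overhead will be lower.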
The first system is an air cooled design:
The TT-LoudBox, despite the name, is aimed at developers that need a bit more meat in their AI development topology. It is designed to sit under the desk, as with many workstations, and uses a 2x4 mesh topology to allow developers to understand how data transfer happens between multiple cards both with the TT-Buda (high-level) and TT-Metallium (low-level) software stacks.
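To get a feel for what a 2x4 mesh means for data movement, here is a minimal sketch that models the eight chips as a plain 2D grid. It assumes each chip links only to its immediate grid neighbours with no wraparound, which may not match how the cards are actually cabled; the coordinates and function names are hypothetical and unrelated to the TT-Buda or TT-Metallium APIs.

```python
from itertools import product

ROWS, COLS = 2, 4  # the 2x4 mesh quoted for the eight-chip system


def mesh_neighbors(row, col, rows=ROWS, cols=COLS):
    """Grid neighbours of (row, col) in a 2D mesh without wraparound (assumed)."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def mesh_links(rows=ROWS, cols=COLS):
    """Set of undirected chip-to-chip links in the assumed mesh."""
    links = set()
    for r, c in product(range(rows), range(cols)):
        for neighbor in mesh_neighbors(r, c, rows, cols):
            links.add(frozenset({(r, c), neighbor}))
    return links


def hops(a, b):
    """Minimum number of mesh hops between two chips (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


print(len(mesh_links()), "links in the mesh")
print("corner chip neighbours:", mesh_neighbors(0, 0))
print("worst-case hops:", hops((0, 0), (ROWS - 1, COLS - 1)))
```

Under these assumptions, corner chips have two neighbours, the rest have three, and data crossing the whole mesh passes through several chips on the way. That kind of multi-hop behaviour is what a developer would be exploring on a system like this.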
The full specifications are:
TT-LoudBox, a developer air-cooled workstation with eight Wormhole chips
2 x Intel Xeon 4309Y (3rd Gen Ice Lake, 8C/16T, 2.8 GHz base, 105 W TDP each)
512 GB DDR4-3200 ECC RDIMM (16x32 GB)
3.8 TB U.2 NVMe PCIe 4.0 x4
4 x Wormhole n300 Tensix Processors (8 chips)
4 x Warp 100 interconnects + 2 x QSFP-DD 400 GbE Cables, 2 ft
2 x 1200 W PSUs (110V) or 1 x 2200 W PSU (240V), Titanium Rated
2 x RJ45 10GBase-T via Intel X550-AT, IPMI
Based on Supermicro SYS-740GP-TNRT
$12000 pre-order, ships in 4-6 weeks
Tenstorrent is keen to stress that this is a developer workstation, rather than a production environment. For that, there are the full server rack Galaxy systems, featuring 32 Wormhole chips, or Galaxy racks, with 192 Wormhole chips. On pricing, the TT-LoudBox is quite competitive compared to other development systems.
The other system being made available is a similar design but uses water cooling: the TT-QuietBox.
It’s funny, because developer kits or workstations are usually designed to be functional, and the TT-LoudBox has a functional design. This one, however, actually looks the part. The TT-QuietBox aims to provide a quieter experience, with the liquid cooling perhaps extending raw performance a bit further.
TT-QuietBox, a developer water-cooled workstation, also with eight Wormhole chips
1 x AMD EPYC 8124P (Zen 4, 16C/32T, up to 3 GHz, 125W TDP)
Tyan Tomcat HX S8040 Motherboard
512 GB DDR5-4800 ECC RDIMM (8 x 64 GB)
3.8 TB U.2 NVMe PCIe 4.0 x4
4 x Wormhole n300 Tensix Processors (8 chips)
4 x Warp 100 interconnects + 2 x QSFP-DD 400 GbE Cables, 2 ft
Custom Liquid Cooling for CPU and all Wormhole PCIe cards
1650 W Platinum Power Supply
2 x RJ45 10GBase-T via Intel X710 + 2 x 1GBase-T, IPMI
$15000 pre-order, ships in 6-8 weeks
Honestly, for the extra 25% cost here, you get a higher-performance processor (Zen 4 vs Ice Lake), much faster memory (DDR5-4800 vs DDR4-3200), a single memory domain on the CPU (vs dual socket NUMA), and a quieter system.
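For anyone who wants to check the comparison, the arithmetic is sketched below using only the list prices and memory speeds quoted in the two spec lists. The dictionary keys and helper name are purely illustrative. The memory figure is the per-DIMM transfer rate only, not total system bandwidth, which also depends on how many memory channels each platform populates.

```python
# Figures taken from the two spec lists in the article.
loudbox = {"price_usd": 12000, "mem_mtps": 3200}   # DDR4-3200
quietbox = {"price_usd": 15000, "mem_mtps": 4800}  # DDR5-4800


def pct_increase(old: float, new: float) -> float:
    """Percentage increase going from old to new."""
    return (new - old) / old * 100


price_premium = pct_increase(loudbox["price_usd"], quietbox["price_usd"])
mem_rate_uplift = pct_increase(loudbox["mem_mtps"], quietbox["mem_mtps"])

print(f"Price premium: {price_premium:.0f}%")
print(f"Memory transfer rate uplift: {mem_rate_uplift:.0f}%")
```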
Both systems ship without an OS, but Tenstorrent recommends Ubuntu 20.04 (Focal Fossa). The driver stack and development stack are both available through Tenstorrent’s website and GitHub.
I should be visiting the company soon around Hot Chips time - hopefully we’ll be able to get hands on with the units and see how easy it is for developers to get to grips with them.
For those interested in the previous generation Grayskull hardware, which Tenstorrent will continue to supply while stocks last, you can read/watch my unboxing of these cards, along with the interview with Dr Jasmina Vasiljevic, at the link below.