Over the past year, I’ve had the chance to quiz every major foundry about their plans for the future. The latest in this series is TSMC, as part of their recent announcements around their roadmaps and upcoming process node technologies. As the world’s leading manufacturer of leading-edge and EUV-based logic, as well as the company packaging the vast majority of the big AI chips, the pressure on TSMC to execute at speed and scale has only increased over the last few years. As part of their commitments to next-generation technologies, the company announced its A16 process node, Super Power Rail (the marketing name for its backside power delivery), ventures into co-packaged optics, and System-on-Wafer (SoW) technology, addressing the demand for bigger substrates for the biggest chips. Here’s my latest video on these announcements.
These announcements were made at TSMC’s US Technology Symposium earlier this year. The symposium is a roadshow, covering continents and key markets for the company. Just before the EU Symposium, I had a chance to catch up with Dr. Kevin Zhang, SVP and Deputy Co-COO of TSMC, for an interview. We covered a wide range of topics, from Kevin’s thoughts on Moore’s Law to how the market has changed under the weight of AI (also, the activations of AI).
Here is a video of the interview; a transcript is also available below.
IC: When CEOs give presentations about the future of computing, some say that Moore’s Law is dead. They say they’re having to innovate on the architecture side because they're not getting much from packaging and process node technologies. What is TSMC’s position on this?
KZ: Well, my simple answer is - I don't care. As long as we can continue to drive the technology scaling, I don't care if Moore’s Law is alive or dead.
But the reality is that many have narrowly defined Moore’s Law based on two-dimensional scaling - that's no longer the case. As you look at the innovation happening in our industry, we actually continue to find different ways to integrate more functions and more capabilities into smaller form factors. We continue to achieve a higher level of performance and a higher level of power efficiency. So from that perspective, I think that Moore’s Law, or technology scaling, will continue. We will continue to innovate and carry the industry forward.
Q: So should we redefine Moore’s Law? Should we have a new law?
A: (laughing) I'll leave that for somebody else to define.
Q: TSMC is known for being incremental with its process node updates - you have a major process node and then minor variations that iterate on it. How much of TSMC’s success would you attribute to this incremental strategy?
A: Well, I don't particularly like the word incremental! If you look at our technology roadmap, from 5nm to 3nm, and now from 3nm to 2nm, the energy efficiency improvement is not incremental - it’s over 30% per generation. But in between the major nodes, we continue to drive incremental enhancements. The reason we do that is to allow our customers to continue to harvest each generation of technology in terms of the scaling benefit. So in between major nodes - yes, we drive incremental improvements. But from one major node to the next, the power, performance, and density improvement is very substantial.
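(A quick editorial aside: it’s worth seeing what that figure compounds to. The sketch below is my arithmetic, not TSMC’s, and reads "over 30% per generation" as roughly 30% lower energy per operation at iso-performance per major node jump.)

```python
# My back-of-the-envelope, not TSMC data: compound the quoted ~30%
# energy-efficiency gain per major node jump (N5 -> N3 -> N2).
energy = 1.0  # normalized energy per operation at N5
for node in ["N3", "N2"]:
    energy *= 1 - 0.30  # assume >=30% lower energy per op per major node
    print(f"{node}: <= {energy:.2f}x the energy/op of N5")
# N3: <= 0.70x, N2: <= 0.49x -- two major jumps roughly halve energy per op
```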
Q: Is that because when a customer goes into a major node, they have a lot of upfront costs to develop the chip? By using those updates to those major nodes, they can leverage the same or at least similar designs rather than spending a big bucket of money?
A: Correct. For example, after our customer moves to 5nm, they can continue to leverage the incremental enhancements: from N5 to N5P you get a performance boost, then you can go to N4 and N4P and get further density improvements. Those incremental enhancements, after you jump on a major node, allow our customers to continue to harvest the scaling benefits and the investment they made upfront.
Q: How many of these enhancements to a major node come from your own internal development, versus inbound customer demand?
A: We work very closely with our customers to pick the right technology node to intercept their product. This is often based on the specific product design, to see where they can best achieve their optimum product-level benefit. So we work very closely with our customers to make the right choice.
Q: Have you had any surprising requests from customers?
A: No. We don't want surprises from our customers. We actually work very closely with them - we're open with our customers to make sure they choose the right technology. Remember, we are in a foundry business model. Our goal is to help customers achieve a successful product. My boss often tells me “Kevin, you know, we're in the foundry business, we work together, we achieve success, but there is a sequence, the customer has to succeed first, then we can be successful”.
Q: We're here at the TSMC EU Tech Symposium; you've just had the US Symposium. The major announcements are a new major node, A16, and this new Super Power Rail technology also coming in with that generation. What do these bring to the table?
A: A16 is a major technology enhancement, and is revolutionary in terms of bringing further power and performance to future high-performance applications, especially those targeting HPC and AI. A16 features nanosheet transistors, the industry's most advanced transistor architecture.
At the same time, we have added a very innovative backside power rail design. This allows the customer to move the power supply routing from the front to the back, opening up space to enhance performance and, at the same time, improve the power supply. Our approach is very different from the conventional design for BSPDN - in a conventional backside power rail, you just drill a hole to connect the backside metal to the front-side metal. In doing so you burn space, and you have to enlarge the footprint of the library cell. But in our design, we have a very innovative approach - we move the contact of the transistor, the source of the transistor, to the back without changing the footprint of the library cells. This clever approach allows us to maintain the footprint and provide maximum flexibility to our customers.
Q: Does that mean that the traditional manufacturing steps go a little bit out of order in order to enable that?
A: I don't want to get into this specific process step because our R&D team would not be very happy, but yes.
Q: It is very much a sandwich design, transistors, signal, and power. Surely that would add a lot of cost to the manufacturing?
A: Absolutely, it definitely will have a cost associated with it, but if you look at the density, the power and the performance benefit, I think it outweighs the costs. This is particularly important for HPC and AI applications where energy efficient compute is the key driver.
Q: So if somebody chooses the A16 node, do they have to take the Super Power Rail with it?
A: A16 by definition will have Super Power Rail, yes. But we do offer a technology option to allow our customers to continue to leverage their existing design collateral and not have to use the backside power. For example, in mobile applications where the power supply routing is not as intense, you don't have to use the backside power.
Q: Normally at these events, whether it's you or your competitors, the announcement comes a few years before production. So, where in the timeline are we expecting A16?
A: So we are targeting the second half of 2026 to go into production for A16 for lead customers.
Q: Does that mean you're roughly at version 0.1 in the PDK right now?
A: I don't want to go into detail on our collateral schedule, but in general our collateral schedule is designed to target the customer production dates. As I said earlier, we hope to have A16 going into production by the second half of 2026, so our collateral schedule would support that kind of timeline.
Q: We're expecting this all to be manufactured in Taiwan?
A: A16 will start in Taiwan.
Q: Last year, you introduced this term FINFLEX - the ability to take N3 and use fin depopulation to decide whether you wanted high performance or high efficiency. Now you're doing NanoFlex with N2. How does NanoFlex differ from what we understand of FINFLEX?
A: This is a very innovative approach. You have probably heard about Design-Technology Co-Optimization (DTCO), where we continue to drive collaboration between design and technology in order to further optimize our technology offering. For FINFLEX, in a fin transistor architecture, the number of fins is digitized. So in the past, before this innovative FINFLEX approach, you had to use either three fins or four fins - you couldn't swap them easily. Our FINFLEX technology at 3nm allows designers to mix and match different fin-based library designs. For the nanosheet, we call it NanoFlex. This is a similar idea, allowing the designer to mix and match libraries of different heights, but here it is the sheet width that differs, so you can alternate between different sizes. Different library heights allow designers to pick and choose, based on their specific design targets, the optimum balance of power, performance, and density.
Q: So the nanosheet structure is still three sheets high, which seems to be an industry standard?
A: Yes, but you can vary the sheet width, which determines the height of the library.
Q: Then that obviously means you have different VT options as well? NanoFlex and VT?
A: Yes, you do have a lot of options as a designer!
Q: CoWoS (Chip on Wafer on Substrate) is in high demand - we only have to look at NVIDIA, AMD, Intel demanding more. Where are you on being able to supply what the market needs? How is the expansion of CoWoS progressing?
A: For us, CoWoS is the workhorse for AI accelerators. If you look at all the large AI accelerator designs today, they’re pretty much all based on TSMC N5 or N4 technology plus CoWoS. CoWoS is in high demand! Last year, the AI surge took a lot of people, including ourselves, by surprise, and CoWoS demand has grown tremendously over the last year.
We are rapidly expanding our CoWoS capacity now - I think the CAGR we're talking about is well above 60%. It’s very, very high, but demand is still continuing to grow. We work very closely with our customers to make sure we meet their most critical needs. But that's the capacity side; we're also expanding the capability of CoWoS itself.
If you look at today's state-of-the-art AI accelerator, the CoWoS interposer size is roughly 3x the reticle size, and the reticle size is about 800mm² - that provides the capability to integrate a full-reticle SoC along with up to eight HBM stacks. But in the future, just two years from now, we will have the ability to expand the interposer size to 4.5x the reticle size, allowing our customers to integrate up to 12 HBM stacks. We're not going to stop there - our R&D team has already started expanding the CoWoS interposer size beyond that, to 7x or 8x the reticle size.
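(For scale, here is my own back-of-the-envelope arithmetic on those multiples, using the ~800mm² reticle figure quoted above - these are illustrative areas, not TSMC numbers.)

```python
# Rough interposer areas implied by the quoted reticle multiples.
RETICLE_MM2 = 800  # approximate reticle limit quoted in the interview

for label, multiple in [("today", 3.0), ("~2 years out", 4.5), ("in R&D", 8.0)]:
    area = multiple * RETICLE_MM2
    print(f"{label}: {multiple}x reticle ≈ {area:,.0f} mm² of interposer")
# today: ≈2,400 mm²; ~2 years out: ≈3,600 mm²; in R&D: ≈6,400 mm²
```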
Q: Is 12 HBM stacks enough? I keep hearing that people want more.
A: So at this symposium, we also announced another innovative system-level integration technology we call System-on-Wafer (SoW). If you think about it, the maximum size you can process in a wafer fab is a single 300mm wafer, so we take the wafer as our base layer, and we can bring all the logic and high-bandwidth DRAM together, integrated across the whole wafer area. If you measure this in CoWoS terms, as a multiple of the reticle size, it is about 40x - humongous. This is how we allow our customers to continue to integrate more compute functions and more memory bandwidth to address future AI requirements.
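(Another editorial aside: a full 300mm wafer is actually far more than 40 reticles of raw area, so the 40x figure presumably reflects the usable, roughly rectangular region of the round wafer. The sketch below is my arithmetic under that assumption, not TSMC's.)

```python
import math

# Compare raw 300mm wafer area against the quoted 40x-reticle SoW figure.
RETICLE_MM2 = 800                      # approximate reticle limit quoted above
wafer_mm2 = math.pi * (300 / 2) ** 2   # full wafer: ~70,700 mm²

print(f"Full wafer: {wafer_mm2:,.0f} mm² ≈ {wafer_mm2 / RETICLE_MM2:.0f}x reticle")
print(f"Quoted SoW figure: 40x reticle = {40 * RETICLE_MM2:,.0f} mm² usable")
# ~88x raw vs 40x quoted -- wafer edges and the non-rectangular shape eat the rest
```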
Q: It's very well-known that there are two main companies looking at wafer scale today (Cerebras and Tesla), and they both use you. How much do you assist the customers when it comes to cooling and power management on that side?
A: We work very closely with the customer - we do the wafer-level integration, and the customer obviously has to design the back end of the system, in terms of how to bring coolant into the system. Obviously there is lots of collaboration - we work very closely with our customer and the system provider to find the optimum thermal solution.
Q: For this System-On-Wafer, it's due in 2026/2027?
A: We already have a limited production run, but as you pointed out, there will be more high-performance AI customers who want to leverage this system-level integration to address their future needs.
Q: You're speaking about 3x, 4.5x, and 40x the reticle - in the future, when reticle sizes have to get smaller due to the technology (High-NA EUV halves the exposure field), are we going to have to double those numbers?
A: I hope we don't have to reduce the reticle size! What we see is people want to integrate them all so they function closely together.
Q: So does this mean that even as process node technology gets more efficient, the demand for compute power keeps increasing? You've now said we're at wafer-scale packaging. Where's the limit?
A: The sky is the limit. I think we'll continuously see this trend - the demand for energy-efficient compute is insatiable. If we're talking about AI models like ChatGPT, GPT-4 already uses a lot of AI chips for training. So we continue to expand our capability at the transistor level - we have 3nm, then we're going to 2nm next year, and then A16 will continue to drive energy-efficient compute at the transistor level. At the same time, we talk about CoWoS to expand the wafer-level integration. We will also bring optical signaling into the package. Putting all this together, we're really talking about providing customers a platform that allows them to integrate more compute functions and more memory bandwidth together to address future AI requirements.
Q: When I speak to the companies integrating onboard optical, they're dealing with your competitors for foundry. But you guys have been doing optical for a while and you've got this new technology.
A: It’s the Compact Universal Photonic Engine - that's what COUPE stands for.
We have been doing silicon photonics for quite some time. Actually, we have fabricated components for our customers who put silicon photonics together with an electrical transceiver – these form the optical pluggable transceivers that are widely used in the data center.
But what we're doing today is going one step further, leveraging our most advanced 3D stacking technology. We use hybrid bonding techniques to bring the electronic die and photonic die closer together to form a small form factor optical engine. That's where you do the electron-to-photon conversion. We know that electrons are good at compute, but photons are better for signaling. So we build this compact optical engine, then we integrate it into the advanced packaging - whether on today's substrates, or in the future leveraging something like CoWoS - to significantly improve the bandwidth and power efficiency. If you look at today's pure-copper, all-electronic systems, a 50Tb/s switch can burn over 2000 watts. By using this tiny small form factor optical engine, we can actually bring the power down by at least 40%. So this is very efficient in terms of achieving high data bandwidth at the lowest possible power.
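(Those two figures imply an interesting energy-per-bit number; the arithmetic below is mine, not TSMC's, and assumes the quoted 2000W is spent entirely on moving the 50Tb/s of traffic.)

```python
# Implied energy per bit from the quoted switch figures (my assumption:
# the full power budget goes to the 50 Tb/s of I/O).
bandwidth_bps = 50e12          # 50 Tb/s switch
electrical_power_w = 2000      # "over 2000 watts" all-electrical
optical_saving = 0.40          # "at least 40%" reduction with the optical engine

pj_per_bit = electrical_power_w / bandwidth_bps * 1e12
optical_w = electrical_power_w * (1 - optical_saving)
print(f"All-electrical: ~{pj_per_bit:.0f} pJ/bit")
print(f"With optical engine: ~{optical_w:.0f} W, "
      f"~{pj_per_bit * (1 - optical_saving):.0f} pJ/bit")
# ~40 pJ/bit electrical vs ~24 pJ/bit (~1200 W) with co-packaged optics
```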
Q: Will you end up with some customers who want wafer scale, and some who want to go optical?
A: Yes, but I think the key is to bring them together because the compute still has to be done by electrons.
Q: So you're talking about having two separate chips, the electrical and the optical, and bonding them together with your most advanced hybrid bonding technology - but some companies are saying they actually want all of that on the same chip. Is that something you can do?
A: It's difficult to make some photonic features on an advanced electronic die - it's difficult to do a monolithic die. However, I think by using our hybrid bonding technique, we achieve the kind of connectivity and power efficiency almost like a monolithic solution, but at the same time it allows us to optimize the electronic die and the photonic die separately. I think this is how you get the best of both worlds.
Q: You spoke about pluggable transceivers for the networking side of things, but what we're talking about here is more about integrated photonics, direct die-to-die into the package. What about a pluggable version of that?
A: The pluggable version exists today, actually - if you look at the data center today, the prevailing way to do this is using pluggables at the board level, so you convert from electrons to photons at the board level. In the future, you're converting within the die, so the signal coming out of the die has already been converted from electrons to photons. That's where you gain the efficiency.
Q: But will that ever be pluggable?
A: It is already pluggable; it has an optical fiber that you can plug into your die. You just don't want to take it out!
Q: I recently went on a fab tour, to see the latest and greatest from ASML – a new high-NA next-generation EUV machine. Intel is very forthright in talking about this technology and the fact that they want to be the first to deploy it. Where does TSMC officially stand on high-NA?
A: Let's step back - TSMC is the leader in terms of bringing EUV into high-volume manufacturing. Back at our 7nm generation, we were the very first in the industry to bring EUV into an HVM (high-volume manufacturing) environment. We're still the leader today in terms of production use of EUV and production efficiency. Our R&D team will continue to look at new EUV capabilities, including, obviously, High-NA EUV. We're going to pick the right place to intercept our technology nodes. There are lots of factors you have to consider: there is obviously the scalability factor, and there are also cost and manufacturability factors.
Q: There are also discussions about you expanding worldwide production with multiple new fabs around the world announced. How is that progressing?
A: It's progressing very well, and very fast too. If you look at our manufacturing footprint, it has expanded quite significantly. Just over the last few years we have been expanding quickly at our Arizona site. We built the first fab, focusing on 4nm, which we're going to move into production next year. We are also building a second fab there, and we have announced a third. We're going to continue to bring the most advanced nodes to North America, as that's where our largest customer base is - from 4nm to 3nm, down to 2nm, or even A16 in the future. It’s very exciting!
At the same time, we're expanding our specialty technology in both Japan and Europe. In Japan, the Kumamoto project has gone well, and we're going to start production in the second half of this year. We're going to bring the most advanced MCU embedded non-volatile memory, which is very important for the auto industry here in Europe.
Q: Does that extend to packaging in any way?
A: We are evaluating the option, but at the same time, right now we are working closely with our partners to bring up the manufacturing capabilities in the US.
Q: Given the demand for AI and machine learning, are you seeing, through your customers, that TSMC's R&D is pivoting more towards catering to those customers?
A: AI is becoming the world's major technology platform, consuming the most advanced silicon. But don't forget mobile. Mobile continues to be a large-volume consumer, and it also requires the most advanced technology. So we are addressing all the needs with optimized technology for different applications and segments. Our R&D works very closely with different customers on different applications. So I think this is something very exciting. We will continue to drive our technology customization to address future product needs.