# LSI design toward 2010 and low-power technology

Takayasu Sakurai

Center for Collaborative Research, University of Tokyo, 7-22-1 Roppongi, Minato-ku, Tokyo, 106-8558 Japan. Phone: +81-3-3402-6226. E-mail: tsakurai@iis.u-tokyo.ac.jp

#### **Biography**

Takayasu Sakurai received the B.S., M.S., and Ph.D. degrees in EE from the University of Tokyo, Japan, in 1976, 1978, and 1981, respectively. In 1981 he joined Toshiba Corporation, where he designed CMOS DRAM, SRAM, and BiCMOS ASICs. He also worked on interconnect delay and capacitance modeling, known as the Sakurai model, and on the alpha-power-law MOS model. From 1988 through 1990, he was a visiting researcher at the University of California, Berkeley, doing research in the field of VLSI CAD. Back at Toshiba from 1990, he managed RISC, media processor, and MPEG2 LSI designs. Since 1996, he has been a professor at the Institute of Industrial Science, University of Tokyo, working on low-power and high-performance system LSI designs. Prof. Sakurai has served as a conference chair for the Symposium on VLSI Circuits, a vice chair for ASPDAC, and a program committee member for ISSCC, CICC, DAC, ICCAD, FPGA workshop, ISLPED, TAU, and other international conferences. He also consults for US startup companies.

#### **Summary:**

A careful look at the scaling law reveals three crises that will constrain the LSIs of the year 2010: a power crisis, an interconnection crisis, and a complexity crisis.

As for the power crisis, efforts to lower power consumption span the device level, the circuit level, and the system level. Lowering the supply voltage ( $V_{DD}$ ) is very effective in reducing power, but the threshold voltage ( $V_{TH}$ ) must be reduced at the same time to maintain high-speed operation. A low  $V_{TH}$ , however, increases the leakage current. To overcome this, control of  $V_{TH}$  and  $V_{DD}$  through multiple  $V_{TH}$ , variable  $V_{TH}$ , multiple  $V_{DD}$ , and variable  $V_{DD}$  is being intensively pursued, and some of these techniques have already reached products. At the system level, a system LSI approach is promising for realizing low power. The new trend is to exploit cooperation between software and hardware. In sub-1-volt designs, the abnormal temperature dependence of the drain current must be taken into account.

Interconnections, rather than MOSFETs, will determine the cost, delay, power, reliability, and turn-around time of future LSIs. The RC delay problem can be solved through LSI architectures that realize "the further, the less communication" with the help of local memories.

It is simply impossible to design LSIs with 100 million transistors from scratch. The complexity issue can only be solved by sharing and re-using design data, so IP-based design will be preferable. Virtual components are put together on a single piece of silicon to build billion-transistor LSIs, much as present systems are implemented with pre-manufactured LSI components.

In the year 2012, sensors / actuators can be integrated on a chip with 0.06 $\mu$m 2G Si FETs with V<sub>TH</sub> and V<sub>DD</sub> control. Globally asynchronous LSIs with locally synchronous 10GHz clocks will be implemented.

ICVC '99/10

# LSI Design Toward 2010 and Low-Power Technology

Prof. Takayasu Sakurai Center for Collaborative Research, and Institute of Industrial Science, University of Tokyo E-mail:tsakurai@iis.u-tokyo.ac.jp

- 1 Scaling and three crises
- 2 Power crisis
- 3 Interconnection crisis
- 4 Complexity crisis

Fig.1 Title

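#### Scaling Law

Figure: MOSFET (drain, source) scaled to size 1/2; 0.2 micron.

| Favorable effects | Factor | Unfavorable effects | Factor |
|-------------------|--------|---------------------|--------|
| Size              | x1/2   | Power               | x1.6   |
| Voltage           | x1/2   | RC delay            | x3.6   |
| Electric field    | x1     | Current density     | x1.8   |
| Speed             | x2     | Voltage noise       | x2.5   |
| Cost              | x1/4   | Design complexity   | x4     |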



# Three crises in VLSI designs

T.Sakurai

- Power crisis
- Interconnection crisis
- Complexity crisis



# **Ever Increasing VLSI Power**

(Power consumption of processors published in ISSCC)



Fig.6 Ever increasing VLSI power







Fig.3 Limit of miniaturization



Fig.7 VDD, power and current trend

# **Necessity for Low-Power Design**

| Power range | Concerns                                                 | Typical applications (all need high performance)    |
|-------------|----------------------------------------------------------|-----------------------------------------------------|
| < 0.1W      | · Battery life                                           | Portable<br>· PDA<br>· Communications               |
| ~ 1W        | · Inexpensive package limit<br>· System heat (10W / box) | Consumer<br>· Set-Top-Box<br>· Audio-Visual         |
| > 10W       | · Ceramic package limit<br>· IR drop of power lines      | Processor<br>· High-end MPU's<br>· Multimedia DSP's |


Fig.8 Necessity of low-power design







#### Voltage waveform of CMOS inverter






Fig.11 Voltage waveform of CMOS inverter

#### Short-circuit power dissipation formula



K. Nose and T. Sakurai, "Closed-Form Expressions for Short-Circuit Power of Short-Channel CMOS Gates and Its Scaling Characteristics," ITC-CSCC (Korea), July 1998.

Fig.12 Short-circuit power dissipation formula




Fig.13 Comparison between proposed formula and other formulas



Fig.14 The change of the short-circuit power dissipation with scaling



Voltage dependent gate cap. effect

Fig.15 Voltage dependent gate capacitance effect





# Solving power issues









Fig. 18 Power and delay
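The trade-off among supply voltage, threshold voltage, power, and delay can be explored with a minimal sketch. It assumes the standard switching-power expression $P = a f C V_{DD}^2$, a gate delay proportional to $V_{DD} / (V_{DD} - V_{TH})^{\alpha}$ from the alpha-power-law model, and a subthreshold leakage proportional to $10^{-V_{TH}/S}$. The function names, the value $\alpha = 1.5$, the subthreshold swing of 0.1 V/decade, and the operating points are illustrative assumptions, not values read from the figure.

```python
def dynamic_power(activity, freq_hz, cap_f, vdd):
    """Switching power P = a * f * C * VDD^2, in watts."""
    return activity * freq_hz * cap_f * vdd ** 2


def relative_delay(vdd, vth, alpha=1.5):
    """Gate delay up to a constant factor: t ~ VDD / (VDD - VTH)^alpha."""
    if vdd <= vth:
        raise ValueError("VDD must exceed VTH for the gate to switch")
    return vdd / (vdd - vth) ** alpha


def relative_leakage(vth, swing=0.1):
    """Subthreshold leakage up to a constant factor: I ~ 10^(-VTH / S).

    swing is the assumed subthreshold swing in volts per decade.
    """
    return 10.0 ** (-vth / swing)


if __name__ == "__main__":
    # Hypothetical operating points (VDD, VTH) in volts, normalized to the first.
    ref_vdd, ref_vth = 2.5, 0.5
    for vdd, vth in [(2.5, 0.5), (1.0, 0.5), (1.0, 0.2)]:
        power = dynamic_power(1, 1, 1, vdd) / dynamic_power(1, 1, 1, ref_vdd)
        delay = relative_delay(vdd, vth) / relative_delay(ref_vdd, ref_vth)
        leak = relative_leakage(vth) / relative_leakage(ref_vth)
        print(f"VDD={vdd} V, VTH={vth} V: "
              f"power x{power:.2f}, delay x{delay:.2f}, leakage x{leak:.0f}")
```

Under these assumptions, lowering $V_{DD}$ alone cuts switching power but slows the gate considerably; lowering $V_{TH}$ as well recovers most of the speed at the price of a much larger leakage current, which is the motivation for $V_{TH}$ and $V_{DD}$ control.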





# Super Cut-off CMOS Scheme (SCCMOS)



Fig.20 Super cut-off CMOS scheme

# **Delay characteristics (inverter & NAND)**



Fig.21 Delay characteristics of SCCMOS





Fig.22 Losing information in standby

# Dynamic Leakage Cut-off



Fig.23 Dynamic leakage cut-off SRAM

# Leakage Reduction of DLC SRAM



Fig.24 Leakage reduction of DLC SRAM



Fig.25 Power distribution in CMOS LSI's



Fig.27 Positive temperature coefficient in low-voltage region


# Cause of positive temp. dependence of I<sub>DS</sub>

- $\alpha$-power law model ($T$ = temperature, $\mu$ = mobility):
  - $I_{DS} \propto \mu(T) \, (V_{DD} - V_{TH}(T))^{\alpha}$
  - $\mu(T) = \mu(T_0)(T / T_0)^{-m}$
  - $V_{TH}(T) = V_{TH}(T_0) - \kappa(T - T_0)$
- Typical values: $\alpha = 1.5$, $m = 1.5$, $\kappa = 2.5$ mV/K

Effects of $V_{TH}$ and $\mu$ on $I_{DS}$ when the temperature goes up by 100 K:

| Condition                                    | V<sub>TH</sub> effect | $\mu$ effect |
|----------------------------------------------|-----------------------|--------------|
| V<sub>DD</sub>=2.5V, V<sub>TH</sub>=0.5V     | 10% increase          | 35% decrease |
| V<sub>DD</sub>=1.0V, V<sub>TH</sub>=0.2V     | 55% increase          | 35% decrease |

Fig.28 Cause of positive temperature dependence of I<sub>DS</sub>
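The two competing effects in Fig.28 can be evaluated directly from the $\alpha$-power-law expressions. The following is a minimal sketch: the function names, the reference temperature $T_0 = 300$ K, and the 0.7 V operating point are assumptions added for illustration. The mobility term reproduces the roughly 35% drop for a 100 K rise, while the computed $V_{TH}$ percentages depend on how the effect is evaluated and need not match the rounded figures on the slide.

```python
def temperature_effects(vdd, vth0, t0=300.0, t=400.0,
                        alpha=1.5, m=1.5, kappa=2.5e-3):
    """Return (mobility_factor, vth_factor, total_factor) for I_DS at t vs t0.

    Model: I_DS ~ mu(T) * (VDD - VTH(T))^alpha
           mu(T)  = mu(T0) * (T / T0)^(-m)
           VTH(T) = VTH(T0) - kappa * (T - T0), kappa in V/K
    """
    mobility_factor = (t / t0) ** (-m)
    vth = vth0 - kappa * (t - t0)
    vth_factor = ((vdd - vth) / (vdd - vth0)) ** alpha
    return mobility_factor, vth_factor, mobility_factor * vth_factor


def zero_tc_overdrive(t, alpha=1.5, m=1.5, kappa=2.5e-3):
    """Gate overdrive VDD - VTH at which dI_DS/dT = 0 at temperature t.

    From d(ln I_DS)/dT = -m / T + alpha * kappa / (VDD - VTH) = 0.
    """
    return alpha * kappa * t / m


if __name__ == "__main__":
    # The 0.7 V point is a hypothetical sub-1-V example, not from the slide.
    for vdd, vth0 in [(2.5, 0.5), (1.0, 0.2), (0.7, 0.2)]:
        mu_f, vth_f, total = temperature_effects(vdd, vth0)
        print(f"VDD={vdd} V, VTH={vth0} V: mobility x{mu_f:.2f}, "
              f"VTH x{vth_f:.2f}, I_DS x{total:.2f}")
    print(f"zero-TC overdrive at 300 K: {zero_tc_overdrive(300.0):.2f} V")
```

With these parameters the mobility loss dominates at high overdrive, and the threshold-voltage shift dominates once the overdrive falls below roughly $\alpha \kappa T / m$, so the drain current then increases with temperature.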



Fig.29 D-Type CMOS

SOI Processors in ISSCC'99

| Paper#       | WP25.1                                                   | WP25.3                                          | WP25.7                                                | WP25.4                     |
|--------------|----------------------------------------------------------|-------------------------------------------------|-------------------------------------------------------|----------------------------|
| Company      | IBM (East Fishkill)                                      | IBM (Essex & Austin)                            | IBM (Rochester)                                       | Samsung                    |
| Target       | PowerPC 604e, 32b                                        | PowerPC 750 for Apple                           | PowerPC, 64b                                          | Alpha, 64b                 |
| PD/FD        | PD                                                       | PD (SIMOX)                                      | PD (SIMOX)                                            | FD (SIMOX/Unibond no dep.) |
| Rule         | 0.25um                                                   |                                                 | 0.2um (Leff=0.12um)                                   | 0.25um                     |
| Interconnect | 5 Al + Wlocal                                            | Cu                                              | 6 Cu                                                  | 4 Al                       |
| Area         | 49mm2                                                    |                                                 | 139mm2                                                | 209mm2                     |
| # of Tr's    | 6.5M                                                     |                                                 | 34M                                                   | 9.7M                       |
| Freq.        | 500MHz                                                   | 580MHz@85C, fast proc.                          | 550MHz                                                | 600MHz                     |
| VDD          | 1.7V                                                     | 2V                                              | 1.8V                                                  | 1.5V (2V I/O)              |
| Power        |                                                          | 5.1W @2V,400MHz                                 | 24W                                                   | 40W                        |
| Speed gain o | 25-30%<br>22% Ctotal reduction<br>10-15% more Ids        | 20%<br>12% by Cj<br>15-25% by less body-bia     | 20%<br>15-20% simple gates<br>25-40% complex gates    | 30%@1.2V, 20%@1.5V SRAW    |

Fig.30 SOI processors in ISSCC'99


# **Hi-Speed is Low-Power**






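Approach to low-power LSI

Example of MPEG2 decoding, ordered from high flexibility to low power:

| Implementation               | Power   |
|------------------------------|---------|
| Processor (software)         | ~ 25W   |
| DSP                          | ~ 4W    |
| Dedicated system LSI (SW/HW) | ~ 0.7W  |

Fig.32 Approach to low-power LSI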



Integration (system LSI) is the key to low-power

| Operation       | Energy/Op (pJ) |
|-----------------|----------------|
| Add             | 7              |
| 3-2 Add         | 2              |
| Multiply        | 40             |
| Latch           | 1.8            |
| Internal read   | 36             |
| Internal write  | 71             |
| I/O             | 80             |
| External memory | 16000          |

B. M. Gordon, E. Tsern, T. Meng, "Design of a Low Power Video Decompression Chip Set for Portable Applications," J. of VLSI Signal Processing Systems 13, pp. 125-142, 1996.
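The energy-per-operation figures above make the case for integration concrete. The sketch below totals the energy of a small operation mix using the table values; the function name and the multiply-accumulate workload, with two operand reads served either on chip or from external memory, are hypothetical and chosen only for illustration.

```python
# Energy per operation in picojoules, taken from the table above.
ENERGY_PJ = {
    "add": 7,
    "3-2 add": 2,
    "multiply": 40,
    "latch": 1.8,
    "internal read": 36,
    "internal write": 71,
    "i/o": 80,
    "external memory": 16000,
}


def total_energy_pj(op_counts):
    """Sum the energy of a mix of operations given as {operation: count}."""
    return sum(ENERGY_PJ[op] * count for op, count in op_counts.items())


if __name__ == "__main__":
    # Hypothetical workload: one multiply-accumulate with two operand fetches.
    mac_on_chip = {"multiply": 1, "add": 1, "internal read": 2}
    mac_off_chip = {"multiply": 1, "add": 1, "external memory": 2}

    e_on = total_energy_pj(mac_on_chip)
    e_off = total_energy_pj(mac_off_chip)
    print(f"on-chip operands:  {e_on} pJ")
    print(f"off-chip operands: {e_off} pJ ({e_off / e_on:.0f}x)")
```

In this illustrative mix the arithmetic itself is a minor contributor; fetching operands from external memory raises the energy by more than two orders of magnitude, which is why keeping data on the same chip matters.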

Software feedback loop for low-power










Fig.33 Homogeneous vs heterogeneous

#### DRAM Embedding



K.Sawada, T.Sakurai, et al, "A 72K CMOS Channelless Gate Array with Embedded 1Mbit Dynamic RAM," in Proc. CICC'88, pp.20.3.1-20.3.4, May 1988.

Two orders of magnitude improvement in bandwidth and power


Fig.36 DRAM embedding

# **Neural chip**



Three orders of magnitude lower power consumption for recognition than a software implementation.

S. Takeuchi and T. Sakurai, ICCD'98, Oct. 1998.




# Compact yet High-Performance (CyHP) Library for Low-Power Technologies



Fig.38 Compact cell library for quick TAT



Fig.39 Lorentz force MOS for micro IDDQ test

# Interconnect determines cost & perf.







#### Interconnect parameters trend







# RC delay and gate delay

Fig.42 RC delay and gate delay
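Why RC delay overtakes gate delay for long wires can be illustrated with a minimal sketch. It assumes the commonly used approximation that the 50% delay of a distributed RC line is about $0.38\,RC$, where $R$ and $C$ are the total wire resistance and capacitance, so the delay grows with the square of the wire length. The function names, the per-length resistance and capacitance, and the gate delay are hypothetical values, not data from the figure.

```python
import math


def wire_delay_s(length_m, r_per_m, c_per_m):
    """Approximate 50% delay of a distributed RC line: 0.38 * R_total * C_total."""
    return 0.38 * (r_per_m * length_m) * (c_per_m * length_m)


def critical_length_m(gate_delay_s, r_per_m, c_per_m):
    """Wire length at which the RC delay equals the given gate delay."""
    return math.sqrt(gate_delay_s / (0.38 * r_per_m * c_per_m))


if __name__ == "__main__":
    # Hypothetical technology values, for illustration only.
    R_PER_M = 1e5        # 0.1 ohm per micrometer
    C_PER_M = 2e-10      # 0.2 fF per micrometer
    GATE_DELAY_S = 50e-12

    for length_mm in (0.1, 1.0, 10.0):
        delay = wire_delay_s(length_mm * 1e-3, R_PER_M, C_PER_M)
        print(f"{length_mm:5.1f} mm wire: {delay * 1e12:8.2f} ps "
              f"(gate delay {GATE_DELAY_S * 1e12:.0f} ps)")

    l_crit = critical_length_m(GATE_DELAY_S, R_PER_M, C_PER_M)
    print(f"RC delay exceeds gate delay beyond about {l_crit * 1e3:.2f} mm")
```

Because the delay is quadratic in length, short local wires stay well below the gate delay while chip-crossing wires exceed it many times over, which is consistent with keeping communication local ("the further, the less communication").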






Fig.44 The further, the less



Fig.45 Locality in space and time



Fig.47 Coupling noise in RC bus





Fig.48 Coupling among interconnections



**Fig.54 Conclusions**