# **Technical Paper**

### A 0.9V 150MHz 10mW 4mm<sup>2</sup> 2-D Discrete Cosine Transform Core Processor with Variable-Threshold-Voltage Scheme

Tadahiro Kuroda, Tetsuya Fujita, Shinji Mita, Tetsu Nagamatu, Shinichi Yoshioka, Fumihiko Sano, Masayuki Norishima, Masayuki Murota, Makoto Kako, Masaaki Kinugawa, Masakazu Kakumu, Takayasu Sakurai

Toshiba Corp., Kawasaki, Japan

This two-dimensional 8x8 discrete cosine transform (DCT) core processor, intended for portable multimedia equipment with HDTV resolution, is fabricated in a 0.3$\mu$m CMOS triple-well double-metal technology. It operates at 150MHz from a 0.9V power supply and consumes 10mW, only 2% of the power dissipation of a previous 3.3V DCT [1]. Circuit techniques for dynamically varying the threshold voltage reduce active power dissipation with negligible overhead in speed, standby power, and chip area.

### FA 10.3

Lowering both the supply voltage, $V_{DD}$, and the threshold voltage, $V_{th}$, enables high-speed low-power operation, but raises two problems: 1) degradation of worst-case speed due to $V_{th}$ fluctuation at low $V_{DD}$ [2], and 2) increased standby power dissipation at low $V_{th}$ [3]. The variable threshold-voltage scheme (VT scheme) in Figure 1 solves these two problems by controlling the substrate bias, $V_{BB}$, with substrate-bias feedback control circuits. $V_{th}$ is controlled at 0.1V in the active mode and 0.5V in the standby mode. $V_{BB}$ of -0.5V is applied in the active mode and -3.3V in the standby mode.

Figure 2 depicts the VT scheme block diagram. It consists of leakage current monitors (LCMs), self substrate bias (SSB) circuits, and a substrate charge injector (SCI) circuit. In the active mode, the SSB controls $V_{\rm BB}$ to compensate for $V_{\rm th}$ fluctuation. In the standby mode, the SSB applies a deeper $V_{\rm BB}$ to increase $V_{\rm th}$ and cut off leakage. The SCI provides a fast transition from standby to active mode. Although the other parts of the chip operate from a 0.9V $V_{\rm DD}$, the VT circuit itself operates from a 3.3V $V_{\rm DD}$, which is usually available on a chip for standard interfaces with other chips.

As illustrated in Figure 3, the VT scheme uses four voltage levels for the $V_{BB}$ control: $V_{active(+)}$ = -0.3V, $V_{active}$ = -0.5V, $V_{active(-)}$ = -0.7V, and $V_{standby}$ = -3.3V. After power-on, the SSB begins to draw 100µA from the substrate to lower $V_{BB}$ using a 50MHz ring oscillator. When $V_{BB}$ goes lower than $V_{active(+)}$, the SSB drops to 5MHz and draws 10µA to control $V_{BB}$ more precisely. The SSB stops when $V_{BB}$ drops below $V_{active}$. $V_{BB}$, however, rises gradually due to device leakage current through MOS transistors and junctions, and reaches $V_{active}$ to activate the SSB again. In this way, $V_{BB}$ is controlled at $V_{active}$. When "SLEEP" is asserted ("1") for the standby mode, the SCI is disabled and the SSB is activated. The SSB draws 100µA from the substrate until $V_{BB}$ reaches $V_{standby}$. When SLEEP=0, the SSB is disabled and the SCI is activated. The SCI injects 30mA into the substrate until $V_{BB}$ reaches $V_{active(-)}$. The active-to-standby transition takes about 100µs, and the standby-to-active transition about 0.1µs.
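The hysteretic control described above can be sketched as a small behavioural model. This is an illustration only, not the authors' circuit: the voltage levels and pump currents are the values quoted in the text, while the function names, the lumped substrate capacitance `C_SUB`, the constant leakage current `I_LEAK`, and the time step are hypothetical choices made for the example. The simulated transition times are therefore not expected to reproduce the measured 100µs and 0.1µs.

```python
# Behavioural sketch of the V_BB control loop (not the authors' circuit).
# Voltage levels and pump currents are taken from the text; C_SUB, I_LEAK
# and the time step are hypothetical values chosen for illustration.

V_ACTIVE_PLUS = -0.3   # V, SSB switches from the 50MHz to the 5MHz oscillator
V_ACTIVE = -0.5        # V, SSB stops below this level
V_ACTIVE_MINUS = -0.7  # V, SCI stops once V_BB has risen to this level
V_STANDBY = -3.3       # V, target in standby mode

I_SSB_FAST = 100e-6    # A, drawn from the substrate at 50MHz
I_SSB_SLOW = 10e-6     # A, drawn from the substrate at 5MHz
I_SCI = 30e-3          # A, injected into the substrate by the SCI

C_SUB = 1e-9           # F, assumed lumped substrate capacitance
I_LEAK = 1e-6          # A, assumed leakage that slowly raises V_BB


def control_current(v_bb: float, sleep: bool) -> float:
    """Current forced into the substrate by SSB/SCI (negative lowers V_BB)."""
    if sleep:
        # Standby: SCI disabled, SSB pumps until V_BB reaches V_standby.
        return -I_SSB_FAST if v_bb > V_STANDBY else 0.0
    if v_bb < V_ACTIVE_MINUS:
        # Waking up: SCI injects charge until V_BB reaches V_active(-).
        return I_SCI
    if v_bb > V_ACTIVE_PLUS:
        return -I_SSB_FAST   # coarse pumping after power-on
    if v_bb > V_ACTIVE:
        return -I_SSB_SLOW   # fine pumping near the target
    return 0.0               # between V_active(-) and V_active: idle


def simulate(v_bb: float, sleep: bool, duration: float, dt: float = 1e-9) -> float:
    """Forward-Euler integration of dV_BB/dt = (I_control + I_LEAK) / C_SUB."""
    for _ in range(round(duration / dt)):
        v_bb += (control_current(v_bb, sleep) + I_LEAK) * dt / C_SUB
    return v_bb


if __name__ == "__main__":
    v = simulate(0.0, sleep=False, duration=50e-6)   # power-on, settle in active mode
    print(f"active:  V_BB = {v:.3f} V")
    v = simulate(v, sleep=True, duration=100e-6)     # SLEEP = 1
    print(f"standby: V_BB = {v:.3f} V")
    v = simulate(v, sleep=False, duration=1e-6)      # SLEEP = 0
    print(f"wake-up: V_BB = {v:.3f} V")
```

In this model the SSB and leakage form a simple bang-bang loop around $V_{active}$, and the gap between $V_{active(-)}$ and $V_{active}$ acts as a dead band in which neither the SSB nor the SCI is driving the substrate.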

Figure 4 depicts a circuit schematic of the leakage current monitor (LCM), a key to accurate control in the VT scheme. Transistors M1 and M2 in a bias generator operate in the subthreshold region. When an MOS transistor is in the subthreshold region, its drain current is:

 $I_{DS} = (I_0 / W_0) \cdot W \cdot 10^{(V_{GS} - V_T)/S}$ (1)

where $V_{GS}$ is the gate-source voltage, $S$ is the subthreshold swing, $V_T$ is the threshold voltage, $I_0 / W_0$ is the current density that defines $V_T$, and $W$ is the channel width. By applying (1), the output voltage of the bias generator, $V_b$, is:

 $V_b = S \cdot \log (W_2/W_1)$ (2)

where $W_1$ and $W_2$ are the channel widths of M1 and M2, respectively. The leakage current of the DCT, $I_{leak.DCT}$, and the monitored leakage current, $I_{leak.LCM}$, can be calculated from (1), and the current magnification factor of the LCM, $X_{LCM}$, can be expressed as

 $X_{LCM} = I_{leak.LCM} / I_{leak.DCT} = (W_2 / W_1) \cdot (W_{LCM} / W_{DCT})$ (3)


where $W_{\rm DCT}$ is the total channel width of the DCT and $W_{\rm LCM}$ is the channel width of M4. This implies that $X_{\rm LCM}$ is determined only by the transistor size ratios and is independent of the power supply voltage, temperature, and process fluctuation. Figure 5 shows the simulated variation of $X_{\rm LCM}$ due to circuit condition changes and process fluctuation. The variation is within 15%, resulting in less than 1% error in $V_{\rm th}$ control. The power overhead of the monitor circuit is about 0.1% and 10% of the total power dissipation in the active and the standby mode, respectively. Transistor M3 isolates the Nout node from the N1 node and the parasitic capacitance of M4. This keeps the signal swing on N1 small, which reduces delay and improves dynamic $V_{\rm th}$ controllability.
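The cancellation behind equation (3) can be checked numerically. The sketch below evaluates (1) and (2) directly and compares the resulting current ratio with the width ratios in (3). It rests on several assumptions that are a reading of the text rather than statements from it: the logarithm in (2) is base 10, M4 is biased at $V_{GS} = V_b$, the DCT transistors are off ($V_{GS} = 0$), and all devices share the same $V_T$, $S$, and $I_0/W_0$. The function names, widths, and device parameters are hypothetical.

```python
import math


def subthreshold_current(width, v_gs, v_t, swing, i0_per_w0):
    """Equation (1): subthreshold drain current of a MOS transistor."""
    return i0_per_w0 * width * 10 ** ((v_gs - v_t) / swing)


def bias_voltage(w1, w2, swing):
    """Equation (2): output voltage of the bias generator (base-10 log)."""
    return swing * math.log10(w2 / w1)


def magnification(w1, w2, w_lcm, w_dct, v_t, swing, i0_per_w0=1e-7):
    """X_LCM = I_leak.LCM / I_leak.DCT, evaluated directly from (1) and (2).

    Assumes M4 is biased at V_GS = V_b, the DCT transistors are off
    (V_GS = 0), and all devices share V_T, S and I_0/W_0.
    """
    v_b = bias_voltage(w1, w2, swing)
    i_lcm = subthreshold_current(w_lcm, v_b, v_t, swing, i0_per_w0)
    i_dct = subthreshold_current(w_dct, 0.0, v_t, swing, i0_per_w0)
    return i_lcm / i_dct


if __name__ == "__main__":
    # Hypothetical widths in micrometres; only their ratios matter.
    w1, w2, w_lcm, w_dct = 1.0, 10.0, 100.0, 1.0e6
    expected = (w2 / w1) * (w_lcm / w_dct)  # equation (3)
    for v_t, swing in [(0.1, 0.08), (0.1, 0.10), (0.5, 0.10)]:
        x = magnification(w1, w2, w_lcm, w_dct, v_t, swing)
        print(f"V_T={v_t} V, S={swing} V/decade: X_LCM={x:.6e}, eq. (3)={expected:.6e}")
```

Because $V_b$ scales with $S$, the factor $10^{V_b/S}$ reduces to $W_2/W_1$ whatever the swing, and the common $10^{-V_T/S}$ term cancels in the ratio. In this idealized model the ratio is exact; the 15% variation reported for Figure 5 comes from effects the model does not include.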

The area penalty induced by the VT scheme is negligible. Since the substrate current generated by impact ionization is four orders of magnitude smaller at a 0.9V $V_{\rm DD}$ than at a 3.3V $V_{\rm DD}$, the pumping current in the SSB is only several percent of that in DRAMs. Not many substrate contacts are needed. To reduce substrate noise induced by drain-substrate capacitive coupling, most of the substrate diffusions in the DCT macro are replaced by source diffusions and the rest are used for the substrate-bias separation, which imposes a 0.5% area penalty.




In the VT scheme, no transistor sees high-voltage stress on its gate oxide or junctions. Transistors are optimized for use at 3.3V. The gate oxide thickness is 8nm. The maximum voltage that assures reliability of the gate oxide is $V_{DD}$+10%, or 3.6V. The substrate charge injector (SCI) in Figure 6 receives a control signal that swings between $V_{DD}$ and GND at node N1 to drive the substrate from $V_{standby}$ to GND. In the standby-to-active transition, $V_{DD} + |V_{standby}|$ is applied between N1 and N2. $|V_{GS}|$ and $|V_{GD}|$ of M1 and M2, however, never exceed the larger of $V_{DD}$ and $|V_{standby}|$. All other transistors in the VT circuits and the DCT macro receive $(V_{DD} - V_{th})$ on their gate oxide in the depletion and inversion modes, and less than $|V_{standby}|$ in the accumulation mode. $V_{standby}$ should therefore be limited to $-V_{DD}$. A $V_{standby}$ of $-V_{DD}$, however, can shift $V_{th}$ enough to reduce leakage current in standby by four orders of magnitude below that in the active mode. The body effect coefficient, $\gamma$, can be adjusted independently of $V_{th}$ in any device generation by controlling the doping concentration in the channel-substrate depletion layer.
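To give a feel for the numbers involved, the sketch below estimates the body effect coefficient that would map the quoted substrate biases (-0.5V and -3.3V) onto the quoted threshold voltages (0.1V and 0.5V). It uses the textbook long-channel body-effect relation for an nMOS device with grounded source, which is not given in the paper, and an assumed surface potential term $2\phi_F$ = 0.6V. It also relates a $V_{th}$ shift to decades of leakage through equation (1), using an assumed subthreshold swing. All function names and both assumed values are hypothetical.

```python
import math


def body_effect_gamma(vth_active, vth_standby, vbb_active, vbb_standby,
                      two_phi_f=0.6):
    """Gamma (in sqrt(V)) that maps the two substrate biases onto the two
    threshold voltages, using the textbook long-channel relation
        V_th = V_th0 + gamma * (sqrt(2*phi_F - V_BB) - sqrt(2*phi_F)).
    two_phi_f (V) is an assumed surface-potential value, not from the paper.
    """
    root_active = math.sqrt(two_phi_f - vbb_active)
    root_standby = math.sqrt(two_phi_f - vbb_standby)
    return (vth_standby - vth_active) / (root_standby - root_active)


def leakage_decades(delta_vth, swing):
    """Decades of leakage reduction for a V_th shift, from equation (1)."""
    return delta_vth / swing


if __name__ == "__main__":
    gamma = body_effect_gamma(0.1, 0.5, -0.5, -3.3)
    print(f"required gamma ~ {gamma:.2f} sqrt(V) for 2*phi_F = 0.6 V")
    # With an assumed swing of 0.1 V/decade, a 0.4 V shift gives 4 decades.
    print(f"leakage reduction: {leakage_decades(0.5 - 0.1, 0.1):.1f} decades")
```

Under these assumptions a 0.4V threshold shift corresponds to four decades of leakage only if the swing is about 0.1V/decade; the paper does not state the swing, so this is a consistency check rather than a reported value.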

A chip micrograph appears in Figure 7. The VT circuits occupy $0.58 \times 0.74\,\text{mm}^2$. The increase in cost and turn-around time from introducing the triple-well process is less than 5%.






Acknowledgments:

The authors acknowledge the encouragement of A. Kanuma, J. Iwamura, K. Maeguchi, O. Ozawa, and Y. Unno.

References:

[1] Matsui, M., et al., "200MHz Video Compression Macrocells Using Low-Swing Differential Logic," ISSCC Digest of Technical Papers, pp. 76-77, Feb. 1994.

[2] Kobayashi, T., and T. Sakurai, "Self-Adjusting Threshold-Voltage Scheme (SATS) for Low-Voltage High-Speed Operation," Proc. 1994 CICC, pp. 271-274, May 1994.

[3] Seta, K., et al., "50% Active-Power Saving without Speed Degradation Using Standby Power Reduction (SPR) Circuit," ISSCC Digest of Technical Papers, pp. 318-319, Feb. 1995.






Figure 1: Variable threshold-voltage (VT) scheme.




Figure 2: VT block diagram.













Figure 4: Leakage current monitor (LCM).






Figure 5: Current magnification factor of LCM, $X_{LCM}$: dependence on circuit and process deviations.











Figure 7: Chip micrograph.






#### Source

1996 IEEE International Solid-State Circuits Conference 1996 Digest of Technical Papers, pp. 166-167.
