
Effective shift between VTs and Drive Strength for maximum timing benefit with less leakage power dissipation

February 28, 2022


The current technology trend is toward smaller technology nodes, which make it easy to accommodate a greater number of transistors on a die of the same size. Scaling down the technology node has both benefits and drawbacks. We reduce the supply voltage to protect the cells from an excessive electric field across the gate oxide and the conducting channel. Reducing the supply voltage saves dynamic power, but it also slows down the CMOS transistor. As voltages are scaled down, static power dissipation becomes equal to or greater than dynamic power dissipation. At lower nodes, leakage power can consume more than 50% of the overall chip power; hence, high-performance chips have enormous power dissipation, even in standby mode.

During the signoff phase, our main concern is how fast we can close timing on the design. We generally set power optimization aside while fixing timing with the standard VT swap and drive strength change techniques. We tried a different approach to timing optimization that can also improve leakage power in lower technology node designs.

Sources of power dissipation

  1. Static Power (leakage power)
  2. Dynamic Power

  1.  Static Power

Static power is the power consumed when there is no circuit activity, or we can say when the circuit is in quiescent mode. When a supply voltage is present, even if we withdraw the clocks, and don’t change the circuit inputs, it will continue to consume power, and that is called static power consumption.

It is mainly due to the leakage currents that flow when the transistor is in an off state. There are many types of leakage currents. In the diagram below, we have shown only two common leakage currents.

In nanometre designs, leakage power is a big concern. Leakage power dissipation occurs mainly due to sub-threshold current, which increases exponentially as the threshold voltage is reduced. Lowering the threshold voltage, on the other hand, reduces the cell delay:

Delay: $T_d = \dfrac{C_L \cdot V_{dd}}{(V_{dd} - V_t)^{a}}$, where:

      Td – propagation delay
      CL – load capacitance
      Vdd – supply voltage
      Vt – threshold voltage
      a – coefficient (the exponent on Vdd – Vt)

Figure 1: Leakage currents in a PMOS transistor
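To make the delay side of the trade-off concrete, here is a minimal sketch of the delay expression, reading a as the exponent on (Vdd – Vt). The capacitance, supply voltage, coefficient, and per-VT threshold voltages are hypothetical values chosen only to show the trend; they are not taken from any library.

```python
def propagation_delay(c_load, vdd, vt, a):
    """Delay from the expression above: Td = (CL * Vdd) / (Vdd - Vt)^a."""
    if vdd <= vt:
        raise ValueError("Vdd must be greater than Vt")
    return (c_load * vdd) / (vdd - vt) ** a


# Hypothetical values, for illustration only (not library data).
C_LOAD = 1.0  # normalised load capacitance
VDD = 0.8     # supply voltage in volts
A = 1.3       # coefficient 'a'

# Assumed threshold voltages for the three VT flavours, in volts.
ASSUMED_VT = {"SVT": 0.40, "LVT": 0.30, "ULVT": 0.20}

delays = {
    name: propagation_delay(C_LOAD, VDD, vt, A)
    for name, vt in ASSUMED_VT.items()
}

for name, td in delays.items():
    print(f"{name}: relative delay {td:.3f}")
```

With these assumed numbers the delay falls as Vt falls, which is why a VT swap helps timing while the leakage cost grows.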

  2.  Dynamic Power

Dynamic power is the power consumed when the circuit is in operation. It means that we apply supply voltage, clocks, and change the inputs. It occurs mainly due to the dynamic currents, such as capacitance currents (switching power) and short-circuit currents (short-circuit power).
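As a rough illustration of why reducing the supply voltage saves dynamic power, the sketch below uses the standard switching-power estimate (activity factor × capacitance × Vdd² × frequency). The formula is textbook material rather than something stated in this article, short-circuit power is left out, and every number is a hypothetical operating point.

```python
def switching_power(activity, c_load, vdd, freq):
    """Standard switching-power estimate in watts: activity * C * Vdd^2 * f."""
    return activity * c_load * vdd ** 2 * freq


# Hypothetical operating point, for illustration only.
ACTIVITY = 0.2   # fraction of clock cycles in which the node toggles
C_LOAD = 10e-15  # 10 fF of switched capacitance
FREQ = 1e9       # 1 GHz clock

p_nominal = switching_power(ACTIVITY, C_LOAD, 1.0, FREQ)
p_scaled = switching_power(ACTIVITY, C_LOAD, 0.8, FREQ)

print(f"Vdd = 1.0 V: {p_nominal * 1e6:.2f} uW")
print(f"Vdd = 0.8 V: {p_scaled * 1e6:.2f} uW")
print(f"ratio: {p_scaled / p_nominal:.2f}")
```

Because the supply voltage enters as a square, a 20% reduction in Vdd removes roughly a third of the switching power in this estimate.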

Leakage power experiments:

We collected data from the library and compared it as shown in the charts below. The aim was to see how timing signoff iterations can be done without taking too large a hit on leakage power.

Experiment 1: Average Net length (40 um to 60 um)

The tables below list the candidate variants of the victim cell that contributes to the path delay. Changing this cell to a suitable down model can close the path. However, to choose a preferable down model, we must check both its cell delay and its leakage power. The D2ULVT and D8LVT rows (boxed in red in the original tables) show that D2ULVT gives better timing with less leakage than D8LVT.

Cell Type    Cell delay (ps)   Slack (ps)   Leakage (mW)
AN2D2SVT     0.040564          -0.00976     7.32E-06
AN2D4SVT     0.033764          -0.00179     1.07E-05
AN2D6SVT     0.037529          -0.00529     1.82E-05
AN2D8SVT     0.037254          -0.00571     2.64E-05
AN2D2LVT     0.028544          0.00374      3.28E-05
AN2D4LVT     0.024059          0.008856     4.90E-05
AN2D6LVT     0.026528          0.006611     8.56E-05
AN2D8LVT     0.025945          0.00671      1.24E-04
AN2D2ULVT    0.02269           0.010199     1.20E-04
AN2D4ULVT    0.019279          0.014089     1.79E-04
AN2D6ULVT    0.021125          0.012461     3.12E-04
AN2D8ULVT    0.020622          0.01236      4.50E-04

Table 1: Data based on a weak driver

Cell Type    Cell delay (ps)   Slack (ps)   Leakage (mW)
AN2D2SVT     0.044853          -0.01703     7.32E-06
AN2D4SVT     0.040014          -0.01239     1.07E-05
AN2D6SVT     0.043736          -0.01586     1.82E-05
AN2D8SVT     0.044748          -0.01945     2.64E-05
AN2D2LVT     0.031974          -0.0028      3.28E-05
AN2D4LVT     0.028685          -0.00043     4.90E-05
AN2D6LVT     0.031716          -0.00322     8.56E-05
AN2D8LVT     0.032456          -0.00633     1.24E-04
AN2D2ULVT    0.02527           0.004412     1.20E-04
AN2D4ULVT    0.022764          0.005812     1.79E-04
AN2D6ULVT    0.025085          0.003701     3.12E-04
AN2D8ULVT    0.025664          0.000603     4.50E-04

Table 2: Data based on a strong driver

There can be two possibilities: the driver of this cell can be weak or strong. The two tables and graphs represent the behaviour of the cell in terms of its delay and leakage power in each case.

Figure 2: Strong driver (D4SVT)

 

Figure 3: Weak Driver (D2SVT)

 

  • We can state that in both cases, switching directly to ULVT/LVT costs us leakage power. We can choose the correct drive strength with a suitable VT and still close the timing. Also, note that D2ULVT gives more benefit than D8LVT.
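The selection rule described above (meet timing first, then take the variant with the least leakage) can be sketched as follows. The slack and leakage values are copied from Table 1 and Table 2; the function name and the `min_slack` parameter are hypothetical helpers introduced for illustration, not part of any signoff tool.

```python
# cell: (slack, leakage in mW), copied from Table 1.
TABLE_1 = {
    "AN2D2SVT": (-0.00976, 7.32e-06), "AN2D4SVT": (-0.00179, 1.07e-05),
    "AN2D6SVT": (-0.00529, 1.82e-05), "AN2D8SVT": (-0.00571, 2.64e-05),
    "AN2D2LVT": (0.00374, 3.28e-05), "AN2D4LVT": (0.008856, 4.90e-05),
    "AN2D6LVT": (0.006611, 8.56e-05), "AN2D8LVT": (0.00671, 1.24e-04),
    "AN2D2ULVT": (0.010199, 1.20e-04), "AN2D4ULVT": (0.014089, 1.79e-04),
    "AN2D6ULVT": (0.012461, 3.12e-04), "AN2D8ULVT": (0.01236, 4.50e-04),
}

# cell: (slack, leakage in mW), copied from Table 2.
TABLE_2 = {
    "AN2D2SVT": (-0.01703, 7.32e-06), "AN2D4SVT": (-0.01239, 1.07e-05),
    "AN2D6SVT": (-0.01586, 1.82e-05), "AN2D8SVT": (-0.01945, 2.64e-05),
    "AN2D2LVT": (-0.0028, 3.28e-05), "AN2D4LVT": (-0.00043, 4.90e-05),
    "AN2D6LVT": (-0.00322, 8.56e-05), "AN2D8LVT": (-0.00633, 1.24e-04),
    "AN2D2ULVT": (0.004412, 1.20e-04), "AN2D4ULVT": (0.005812, 1.79e-04),
    "AN2D6ULVT": (0.003701, 3.12e-04), "AN2D8ULVT": (0.000603, 4.50e-04),
}


def lowest_leakage_passing_cell(table, min_slack=0.0):
    """Return the cell with the least leakage whose slack is >= min_slack."""
    passing = {cell: v for cell, v in table.items() if v[0] >= min_slack}
    if not passing:
        return None  # no variant closes timing on its own
    return min(passing, key=lambda cell: passing[cell][1])


choice_1 = lowest_leakage_passing_cell(TABLE_1)
choice_2 = lowest_leakage_passing_cell(TABLE_2)
print("Table 1 choice:", choice_1)
print("Table 2 choice:", choice_2)
```

Raising `min_slack` models a design that needs extra margin; the rule then moves to a faster variant only when the cheaper ones no longer pass.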

Experiment 2: Minimum Net length and Maximum Net length

 

Let’s run another experiment with a different cell, using the minimum and maximum net lengths. In both graphs, the drivers are strong relative to the net length. This further supports the point about choosing the right down model.

This case is free from crosstalk. In the minimum net length case, we just have to convert the cell from D2SVT to D4SVT, and that closes the timing. In the maximum net length case, transition is the issue; we can break the net into several segments and then choose the correct down model.

  • When cells are sitting in the vicinity of other cells, nets are not too long. Let’s see how the delay of that path and leakage go together.

Figure 4: Behavior of slack when the net length is minimum

 

  • We often deal with longer nets and transition issues. Let’s see how our analysis goes with longer nets. Note that lower drive strength cells give poor results on longer nets.

 

Figure 5: Behavior of cell when the net length is maximum

                                                         

Experiment 3: Crosstalk Dominated Nets

We also looked at what happens when the net is moderately long (around 100 um) and is the victim of 20 to 35 ps of crosstalk. All the experiments above were free from the crosstalk effect. Let’s add crosstalk to the recipe now.

Crosstalk can be a key factor while fixing timing, and we cannot overlook it. Let’s look at a crosstalk-dominated timing path with a boosted driver, as shown below.

Figure 6: Behavior of crosstalk-dominated net when the driver is strong

 

  • Let’s see what happens when we have a medium-range driver for a certain amount of crosstalk. As in the strong driver case, the data shows cell delay decreasing for higher drive strength cells and the slack holding up well.

Figure 7: Behavior of crosstalk-dominated net when the driver is medium

  • One of the causes of crosstalk is a long net in a congested area that is also driven by a weak driver. This scenario shows a reversed cell-delay-to-slack characteristic: timing degrades when going from D24SVT to D2LVT. The same holds when moving from D24LVT to D2ULVT.

Figure 8: Behavior of crosstalk-dominated net when the driver is weak

Outcome and Observations:

While converting SVT cells to LVT/ULVT cells for timing optimization, we can keep the data below in mind and use it to optimize for both timing and power.

 

For the functional cell (AND cell) case in the tables above, choosing D2ULVT over D8LVT moves the slack from -0.00633 to 0.004412 while, surprisingly, costing slightly less leakage power (1.20E-04 mW against 1.24E-04 mW). This is something we generally miss while fixing the timing.

Cell name    Slack (ps)   Leakage (mW)
AN2D8LVT     -0.00633     1.24E-04
AN2D2ULVT    0.004412     1.20E-04

Table 3: D8LVT and D2ULVT

                                                                     

Let's look at some practical scenarios:

  • Based on the AND cell table, if we have five D2SVT cells and the path violates by -20ps, we can convert D2SVT to D4SVT, gain five to six ps, and take a smaller hit on leakage power than we would by choosing two D4LVT cells.

Cell Name   Slack      Leakage power (mW)   Scenario                        Total leakage   Path slack
AN2D2SVT    -0.01703   7.32E-06             5 cells of D2SVT                3.66E-05        -20ps
AN2D4SVT    -0.01239   1.07E-05             5 cells of D4SVT                5.37E-05        +3ps
AN2D4LVT    -0.00043   4.90E-05             3 cells D2SVT + 2 cells D4LVT   1.20E-04        +2ps

Table 4: Real-time scenario

We usually have a case like the one below, where mixed types of cells are present in one path. We can observe from the table that to bring the -25ps path to closure, we can convert D2SVT to D4SVT based on the earlier study, but we can optimize further by using two D4LVT cells in place of D16SVT cells, which has less impact on leakage and can help close the timing.

Table 5 shows the total leakage of D2SVT + D16SVT, D4SVT + D16SVT, and D2SVT + D16SVT + D4LVT.

                       

Practical Scenario (all cells are inverters below)                            Total leakage   Path Slack
3 D2SVT (3.93E-06 * 3) + 3 D16SVT (3.35E-05 * 3)                              1.12E-04        -15 ps
3 D4SVT (7.22E-06 * 3) + 3 D16SVT (3.35E-05 * 3)                              1.22E-04        -2 ps
3 D2SVT (3.93E-06 * 3) + 1 D16SVT (3.35E-05 * 1) + 2 D4LVT (3.09E-05 * 2)     1.07E-04        +8 ps

Table 5: Practical Scenario
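The total-leakage column of Table 5 is a count-weighted sum of per-cell leakage, which the short sketch below reproduces. The per-cell values are the ones quoted in the table; the scenario labels and the `path_leakage` helper are hypothetical names. The third scenario is taken as one D16SVT cell, since that count is what reproduces its listed total.

```python
# Per-cell inverter leakage in mW, as quoted in Table 5.
LEAKAGE_MW = {
    "D2SVT": 3.93e-06,
    "D4SVT": 7.22e-06,
    "D16SVT": 3.35e-05,
    "D4LVT": 3.09e-05,
}


def path_leakage(cell_counts):
    """Sum leakage over a path described as {cell type: number of cells}."""
    return sum(LEAKAGE_MW[cell] * count for cell, count in cell_counts.items())


# Scenario labels are illustrative; the cell counts follow the three rows.
SCENARIOS = {
    "original path": {"D2SVT": 3, "D16SVT": 3},
    "upsize D2SVT to D4SVT": {"D4SVT": 3, "D16SVT": 3},
    "swap two D16SVT for D4LVT": {"D2SVT": 3, "D16SVT": 1, "D4LVT": 2},
}

totals = {name: path_leakage(counts) for name, counts in SCENARIOS.items()}

for name, total in totals.items():
    print(f"{name}: {total:.2E} mW")
```

The same helper can be pointed at any path by changing the cell counts, which makes it easy to compare a VT swap against an upsizing before committing to either.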

Table 6 shows the summary of how D4LVT has less delay and less leakage power as compared to some SVT cells that we can use while optimizing the timing.

 

  • We converted cells to different down models, and the table below shows the impact on slack. For example, if we have D18SVT and convert it to D4LVT, we see positive slack in the path and less leakage.

Cell Name    Path slack   leakage    slack diff   leakage diff
INVD4LVT     0.006295     3.09E-05
INVD16SVT    -0.009806    3.35E-05   0.016101     -2.63E-06
INVD18SVT    -0.012951    3.77E-05   0.019246     -6.82E-06
INVD20SVT    -0.014544    4.07E-05   0.020839     -9.87E-06
INVD22SVT    -0.014544    4.80E-05   0.020839     -1.72E-05

Table 6: Summary of D4LVT vs High strength SVTs
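The difference columns in Table 6 can be recomputed from the slack and leakage columns, as in this minimal sketch. It assumes each difference is the INVD4LVT value minus the SVT cell's value, which matches the listed slack differences; the leakage differences come out slightly off in the last digit because the table's leakage values are rounded. `swap_deltas` is a hypothetical helper name.

```python
# (cell, path slack, leakage) rows copied from Table 6.
REFERENCE = ("INVD4LVT", 0.006295, 3.09e-05)
SVT_CELLS = [
    ("INVD16SVT", -0.009806, 3.35e-05),
    ("INVD18SVT", -0.012951, 3.77e-05),
    ("INVD20SVT", -0.014544, 4.07e-05),
    ("INVD22SVT", -0.014544, 4.80e-05),
]


def swap_deltas(reference, other):
    """Slack gained and leakage change when 'other' is replaced by 'reference'."""
    _, ref_slack, ref_leakage = reference
    _, slack, leakage = other
    return ref_slack - slack, ref_leakage - leakage


deltas = {row[0]: swap_deltas(REFERENCE, row) for row in SVT_CELLS}

for cell, (slack_gain, leakage_change) in deltas.items():
    print(f"{cell} -> {REFERENCE[0]}: slack {slack_gain:+.6f}, "
          f"leakage {leakage_change:+.2E}")
```

A positive slack figure together with a negative leakage figure marks a swap that helps both timing and power.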

  • The case below is one where the driver is weak and is not able to drive bigger cells. In such a scenario, choosing D2ULVT is preferable (all cells are inverters). Despite being a ULVT cell, it has less leakage than the LVT cells, which is surprising, and it can be a key factor in saving power without affecting the timing.

Most leakage optimization flows convert cells from ULVT to LVT where timing allows, but the table below shows that even if we have good timing with a D8 or D12 LVT cell, we can convert it to a D2ULVT cell and have less leakage.

 

Cell         path slack   Leakage    slack diff   leakage diff
INVD2ULVT    0.0051       6.18E-05
INVD8LVT     0.003844     6.21E-05   0.0013       3.20E-07
INVD10LVT    0.002791     7.84E-05   0.0023       1.66E-05
INVD12LVT    0.001499     9.49E-05   0.0036       3.31E-05
INVD14LVT    -0.000925    1.12E-04   0.0060       4.97E-05
INVD16LVT    -0.003176    1.51E-04   0.0083       8.89E-05
INVD18LVT    -0.005885    1.66E-04   0.0110       1.04E-04
INVD20LVT    -0.007082    1.83E-04   0.0122       1.22E-04
INVD22LVT    -0.007082    2.17E-04   0.0122       1.55E-04
INVD24LVT    -0.011441    2.50E-04   0.0165       1.88E-04

Table 7: Summary of D2ULVT vs high strength LVTs

Limitation:

 

  • This study has one limitation: when the net is too long, it causes high net delay, and when the net is the victim of crosstalk, it also has high delay. In those cases, we cannot choose D2ULVT or D4LVT cells over high drive strength cells.
  • As mentioned, in the crosstalk-dominated case where the driver is weak and the net is very long, smaller cells like D2ULVT tend to show the reverse characteristic on timing.
  • The data below shows that D2ULVT makes the path slack more negative than D24LVT. We must look out for these cases while applying the above strategy.

 

Cell Name    Long nets/crosstalk path slack   Normal path slack   leakage
BUFD24SVT    -0.24288                         -0.040899           6.64E-05
BUFD2LVT     -0.364352                        -0.007607           3.09E-05

BUFD24LVT    -0.217539                        -0.00642            3.00E-04
BUFD2ULVT    -0.320156                        0.016439            1.13E-04

Table 8: Limitation
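The reversal in Table 8 is easy to see when the best-slack cell is picked separately for each kind of path. This is only a sketch over the four rows of Table 8; `best_slack_cell` and its boolean flag are hypothetical names, and a real flow would also weigh leakage.

```python
# cell: (long-net/crosstalk path slack, normal path slack, leakage), from Table 8.
TABLE_8 = {
    "BUFD24SVT": (-0.24288, -0.040899, 6.64e-05),
    "BUFD2LVT": (-0.364352, -0.007607, 3.09e-05),
    "BUFD24LVT": (-0.217539, -0.00642, 3.00e-04),
    "BUFD2ULVT": (-0.320156, 0.016439, 1.13e-04),
}


def best_slack_cell(table, long_net_or_crosstalk):
    """Return the cell with the highest slack for the given kind of path."""
    column = 0 if long_net_or_crosstalk else 1
    return max(table, key=lambda cell: table[cell][column])


best_on_long_net = best_slack_cell(TABLE_8, long_net_or_crosstalk=True)
best_on_normal_path = best_slack_cell(TABLE_8, long_net_or_crosstalk=False)

print("Long net / crosstalk path:", best_on_long_net)
print("Normal path:", best_on_normal_path)
```

On the normal path the small ULVT buffer wins, while on the long or crosstalk-affected net the high drive strength cell has the better slack, so the path type has to be known before the low-drive swap is applied.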

  • From the graphs above, we can state the relationship between cell delay and leakage, which is another way of looking at the relationship between drive strength and leakage power.
  • We do not suggest always choosing a certain drive strength or VT over another. The choice depends purely on the problem that causes the timing failure.
  • When the driver is weak, we should boost the driver first, and when the net is long, we should break the net. We generally use a D8 or D16 buffer, or an equivalent inverter pair, to break the net. This study can now help us choose the appropriate drive strength for that buffer or inverter.
  • This method applies not only while fixing the timing; we can also propose it in earlier stages, where the timing paths have margin, and convert cells as suits each case.

Conclusion:

Based on the experiments above, we can state that choosing the right VT and drive strength benefits us in terms of leakage power. We usually consider ULVT a power-hungry VT type, but this study shows that this is not always true. We must still take care of special cases, such as crosstalk-dominated nets and long nets, and address their specific problems.

With the power optimization technique described above, we met the timing while saving as much leakage power as possible. The goal is to find the optimal trade-off point between the leakage power and the delay of a cell, keeping lower technology node designs in mind. The results convinced us that fixing the timing with these points in mind can save a lot of power.

About the Authors

Prerak Dalia works as an Engineer at eInfochips. He holds a Bachelor of Engineering degree in Electronics and Communication from CSPIT, Changa, India. With over 3.6 years of experience in physical design, he has developed expertise in PnR, physical verification, and static timing analysis.

Jaimini Prajapati works as a Senior Engineer (Level 2) at eInfochips. She holds a Master of Engineering degree in VLSI from LCIT, Bhandu, India, and has over 6.5 years of experience in lower technology nodes for complex networking SoCs. She has developed expertise in PnR, static timing analysis, and physical verification.

Abhishek Chhajer is a Senior Technical Lead at eInfochips. He holds a Bachelor of Engineering degree in Information Communication Technology (ICT) from DAIICT Engineering College, India. With over 9.5 years of experience in physical design, he has developed expertise in PnR, physical verification, static timing analysis, full-chip timing, and full-chip physical verification.

Kapil Saxena is an ASIC Delivery Manager (Level 2) at eInfochips. He holds a Bachelor of Engineering degree in Electronics and Communication from IT BHU, India. With over 25 years of experience in the physical design of SoCs, he delivers complex SoC products and manages large teams.





© Copyright nasscom. All Rights Reserved.