Design of VLSI Circuits and Systems
Academic Year 2010-2011
Lecture 6, 18.3.2011
Combinational Logic

Combinational vs. Sequential Logic
[Figure: a combinational logic circuit (In to Out) next to a sequential circuit, in which the combinational logic circuit is combined with a State block]
Combinational: Output = f(In)
Sequential: Output = f(In, Previous In)

Static CMOS Circuit
At every point in time (except during the switching transients) each gate output is connected to either VDD or VSS via a low-resistance path.
The outputs of the gates assume at all times the value of the Boolean function implemented by the circuit (ignoring, once again, the transient effects during switching periods).
This is in contrast to the dynamic circuit class, which relies on temporary storage of signal values on the capacitance of high-impedance circuit nodes.

Static Complementary CMOS
[Figure: pull-up network (PUN, PMOS only) between VDD and the output F(In1, In2, …, InN); pull-down network (PDN, NMOS only) between the output and ground; both networks are driven by In1, In2, …, InN]
PUN and PDN are dual logic networks.

NMOS Transistors in Series/Parallel Connection
• A transistor can be thought of as a switch controlled by its gate signal
• An NMOS switch closes when the switch control input is high
• Series connection (A, B): Y = X if A AND B
• Parallel connection (A, B): Y = X if A OR B
• NMOS transistors pass a "strong" 0 but a "weak" 1

PMOS Transistors in Series/Parallel Connection
• A PMOS switch closes when the switch control input is low
• Series connection (A, B): Y = X if $\bar{A}$ AND $\bar{B}$, that is, $\overline{A + B}$
• Parallel connection (A, B): Y = X if $\bar{A}$ OR $\bar{B}$, that is, $\overline{AB}$
• PMOS transistors pass a "strong" 1 but a "weak" 0

Threshold Drops
[Figure: in the PUN, a PMOS device charges CL from 0 to VDD, whereas an NMOS device charges it only from 0 to VDD - VTn; in the PDN, an NMOS device discharges CL from VDD to 0, whereas a PMOS device discharges it only from VDD to |VTp|]

Complementary CMOS Logic Style
[Figure]

Example Gate: NAND
[Figure]

Complex CMOS Gate
$OUT = \overline{D + A \cdot (B + C)}$
[Figure: transistor-level schematic with inputs A, B, C, D]

Constructing a Complex Gate
Figure panels (sub-nets SN1 to SN4 are marked on the schematic): (a) pull-down network; (b) deriving the pull-up
network hierarchically by identifying sub-nets; (c) complete gate.

Cell Design
• Standard Cells
– General purpose logic
– Can be synthesized
– Same height, varying width
• Datapath Cells
– For regular, structured designs (arithmetic)
– Includes some wiring in the cell
– Fixed height and width

Standard Cells
[Figure: standard-cell layout showing the N well, the VDD and GND rails, the In and Out pins, and the cell boundary]
• Cell height: 12 metal tracks
• Metal track is approx. 3λ + 3λ
• Pitch = repetitive distance between objects
• Cell height is "12 pitch"
• Rails ~10λ

Standard Cells
[Figure: layout and schematic of a 2-input NAND gate with inputs A and B, output Out, and VDD and GND rails]

Stick Diagrams
• Contain no dimensions
• Represent the relative positions of transistors
[Figure: stick diagrams of an inverter (In, Out) and a NAND2 (A, B, Out) between VDD and GND]

Stick Diagrams: Logic Graph
$X = \overline{C \cdot (A + B)}$
[Figure: schematic and logic graph of the gate; the PUN graph connects VDD, node i and X, the PDN graph connects X, node j and GND, with edges labelled A, B, C]

Two Versions of $\overline{C \cdot (A + B)}$
[Figure: two stick diagrams between VDD and GND with output X, one with input ordering A C B and one with input ordering A B C]

Consistent Euler Path
[Figure: Euler path through the logic graph (nodes X, i, j, VDD, GND) giving the input ordering A B C]

OAI22 Logic Graph
$X = \overline{(A + B) \cdot (C + D)}$
[Figure: schematic and logic graph with PUN and PDN; inputs A, B, C, D]

Example: $x = \overline{ab + cd}$
[Figure: (a) logic graphs for $\overline{ab + cd}$; (b) Euler paths {a b c d}; (c) stick diagram for ordering {a b c d}]

Properties of Complementary CMOS Gates Snapshot
High noise margins: VOH and VOL are at VDD and GND, respectively.
No static power consumption: there never exists a direct path between VDD and VSS (GND) in steady-state mode.
Comparable rise and fall times (under appropriate sizing conditions).

CMOS Properties
• Full rail-to-rail swing; high noise margins
• Logic levels not dependent upon the relative device sizes; ratioless
• Always a path to VDD or GND in steady state; low output impedance
• Extremely high input resistance; nearly zero steady-state input current
• No direct path between power and ground in steady state; no static power dissipation
• Propagation delay is a function of the load capacitance and the resistance of the transistors

Switch Delay Model
[Figure: switch-level RC models (equivalent resistance Req) of an inverter, a NAND2 and a NOR2, built from Rp, Rn, internal capacitance Cint and load capacitance CL]

Input Pattern Effects on Delay
• Delay is dependent on the pattern of inputs
• Low to high transition
– both inputs go low: delay is 0.69 (Rp/2) CL
– one input goes low: delay is 0.69 Rp CL
• High to low transition
– both inputs go high: delay is 0.69 (2Rn) CL

Delay Dependence on Input Patterns
Input Data Pattern    Delay (psec)
A = B = 0→1           67
A = 1, B = 0→1        64
A = 0→1, B = 1        61
A = B = 1→0           45
A = 1, B = 1→0        80
A = 1→0, B = 1        81
NMOS = 0.5 µm/0.25 µm, PMOS = 0.75 µm/0.25 µm, CL = 100 fF
[Figure: Voltage [V] versus time [ps] for the patterns A = B = 1→0; A = 1→0, B = 1; and A = 1, B = 1→0]

Transistor Sizing
[Figure: sized NAND2 (PMOS devices of size 2, series NMOS devices of size 2) and sized NOR2 (series PMOS devices of size 4, NMOS devices of size 1), with Rp, Rn, Cint and CL]

Transistor Sizing a Complex CMOS Gate
$OUT = \overline{D + A \cdot (B + C)}$
[Figure: sized schematic; NMOS sizes D = 1 and A = B = C = 2; PMOS devices annotated with the size pairs 8/6, 8/6, 4/3 and 4/6]

Fan-In Considerations
[Figure: 4-input NAND with inputs A, B, C, D, internal node capacitances C1, C2, C3 and load CL]
Distributed RC model (Elmore delay):
$t_{pHL} = 0.69\, R_{eqn}\, (C_1 + 2C_2 + 3C_3 + 4C_L)$
Propagation delay deteriorates rapidly as a function of fan-in, quadratically in the worst case.

tp as a Function of Fan-In
Gates with a fan-in greater than 4 should be avoided.
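The Elmore expression from the Fan-In Considerations slide can be sketched in a few lines of Python to see how the high-to-low delay grows with fan-in. This is only an illustrative sketch: the function names `elmore_tphl` and `nand_tphl`, and the simplification that every series NMOS has the same resistance Reqn and every internal node the same capacitance, are assumptions and do not come from the slides.

```python
def elmore_tphl(r_eq, node_caps):
    """Elmore estimate of tpHL for a series NMOS stack.

    node_caps lists the node capacitances starting at the node closest to
    ground (C1) and ending with the output load (CL). The k-th node
    discharges through k series resistances r_eq, so it is weighted by k.
    """
    return 0.69 * r_eq * sum(k * c for k, c in enumerate(node_caps, start=1))


def nand_tphl(fan_in, r_eq, c_int, c_load):
    """tpHL of a fan_in-input NAND, assuming (hypothetically) that all
    internal nodes have the same capacitance c_int."""
    node_caps = [c_int] * (fan_in - 1) + [c_load]
    return elmore_tphl(r_eq, node_caps)


if __name__ == "__main__":
    # Normalized units: r_eq = 1, c_int = c_load = 1 (assumed values).
    for n in (2, 4, 8, 16):
        print(n, nand_tphl(n, r_eq=1.0, c_int=1.0, c_load=1.0))
```

With four equal capacitances the sum is C1 + 2C2 + 3C3 + 4CL, exactly the expression on the slide; doubling the fan-in from 4 to 8 more than triples the estimate, which is the quadratic trend the slide warns about.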
[Figure: tp (psec) versus fan-in (2 to 16); tpHL grows quadratically and tpLH linearly with fan-in]

tp as a Function of Fan-Out
[Figure: tp (psec) versus effective fan-out (2 to 16) for NOR2, NAND2 and INV]
All gates have the same drive current.
Slope is a function of "driving strength".

tp as a Function of Fan-In and Fan-Out
• Fan-in: quadratic due to increasing resistance and capacitance
• Fan-out: each additional fan-out gate adds two gate capacitances to CL
$t_p = a_1 FI + a_2 FI^2 + a_3 FO$

Fast Complex Gates: Design Technique 1
• Transistor sizing, as long as fan-out capacitance dominates
• Progressive sizing
[Figure: series stack M1 (closest to ground) to MN (closest to the output) with inputs In1 to InN, internal capacitances C1, C2, C3 and load CL, modelled as a distributed RC line]
M1 > M2 > M3 > … > MN (the MOS transistor closest to the output is the smallest)
Can reduce delay by more than 20%; the gains decrease as technology shrinks.

Fast Complex Gates: Design Technique 2
• Transistor ordering
[Figure: two orderings of a three-transistor stack in which In1 makes the 0→1 transition (critical path) while In2 = In3 = 1]
– Critical input In1 drives M1, closest to ground: CL, C1 and C2 are all charged, so the delay is determined by the time to discharge CL, C1 and C2
– Critical input In1 drives M3, closest to the output: C1 and C2 are already discharged, so the delay is determined by the time to discharge CL

Fast Complex Gates: Design Technique 3
• Isolating fan-in from fan-out using buffer insertion
[Figure: a high fan-in gate driving CL directly, and the same gate driving CL through a buffer]

Fast Complex Gates: Design Technique 4
• Reducing the voltage swing
$t_{pHL} = 0.69 \cdot \frac{3}{4} \cdot \frac{C_L V_{DD}}{I_{DSATn}}$ for a full swing, which becomes $0.69 \cdot \frac{3}{4} \cdot \frac{C_L V_{swing}}{I_{DSATn}}$ for a reduced swing
– linear reduction in delay
– also reduces power consumption
• But the following gate is much slower!
• Or requires the use of "sense amplifiers" on the receiving end to restore the signal level (memory design)

Sizing Logic Paths for Speed
• Frequently, the input capacitance of a logic path is constrained
• Logic also has to drive some capacitance
• Example: the ALU load in an Intel microprocessor is 0.5 pF
• How do we size the ALU datapath to achieve maximum speed?
• We have already solved this for the inverter chain. Can we generalize it to any type of logic?

Buffer Example
[Figure: chain of N inverters (1, 2, …, N) from In to Out, driving CL]
$\text{Delay} = \sum_{i=1}^{N} (p_i + g_i f_i)$ (in units of $t_{inv}$)
For given N: $C_{i+1}/C_i = C_i/C_{i-1}$
To find N: $C_{i+1}/C_i \approx 4$
How to generalize this to any logic path?

Logical Effort
$\text{Delay} = k\, R_{unit} C_{unit} \left(1 + \frac{C_L}{\gamma C_{in}}\right) = t\,(p + g f)$
p: intrinsic delay ($3 k R_{unit} C_{unit}$); gate parameter, not a function of W
g: logical effort ($k R_{unit} C_{unit}$); gate parameter, not a function of W
f: effective fanout
Normalize everything to an inverter: $g_{inv} = 1$, $p_{inv} = 1$
Divide everything by $t_{inv}$ (everything is measured in unit delays $t_{inv}$)
Assume $\gamma = 1$.

Delay in a Logic Gate
Gate delay: d = h + p (effort delay plus intrinsic delay)
Effort delay: h = g f (logical effort times effective fanout, f = Cout/Cin)
Logical effort is a function of topology, independent of sizing.
Effective fanout (electrical effort) is a function of load/gate size.

Logical Effort
• The inverter has the smallest logical effort and intrinsic delay of all static CMOS gates
• The logical effort of a gate is the ratio of its input capacitance to the inverter input capacitance when the gate is sized to deliver the same current
• Logical effort increases with gate complexity

Logical Effort
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter with the same output current.
[Figure: inverter (PMOS 2, NMOS 1), g = 1; 2-input NAND (PMOS 2, series NMOS 2), g = 4/3; 2-input NOR (series PMOS 4, NMOS 1), g = 5/3]

Logical Effort of Gates
[Figure: normalized delay versus fanout f for the inverter (g = 1, p = 1) and the 2-input NAND (g = 4/3, p = 2); the intercept of each line is the intrinsic delay and the part that grows with f is the effort delay]

Add Branching Effort
Branching effort: $b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$

Multistage Networks
$\text{Delay} = \sum_{i=1}^{N} (p_i + g_i f_i)$
Stage effort: $h_i = g_i f_i$
Path electrical effort: $F = C_{out}/C_{in}$
Path logical effort: $G = g_1 g_2 \cdots g_N$
Branching effort: $B = b_1 b_2 \cdots b_N$
Path effort: $H = GFB$
Path delay: $D = \sum d_i = \sum p_i + \sum h_i$

Optimum Effort per Stage
When each stage bears the same effort: $h^N = H$, so $h = \sqrt[N]{H}$
Stage efforts: $g_1 f_1 = g_2 f_2 = \cdots = g_N f_N$
Effective fanout of each stage: $f_i = h / g_i$
Minimum path delay: $\hat{D} = \sum (g_i f_i + p_i) = N H^{1/N} + P$

Optimal Number of Stages
For a given load and a given input capacitance of the first gate, find the optimal number of stages and the optimal sizing.
$D = N H^{1/N} + N p_{inv}$
$\frac{\partial D}{\partial N} = -H^{1/N} \ln\left(H^{1/N}\right) + H^{1/N} + p_{inv} = 0$
Substitute the "best stage effort" $h = H^{1/\hat{N}}$.

Logical Effort
[Figure: table of logical efforts, from Sutherland, Sproull]

Method of Logical Effort
• Compute the path effort: $H = GFB$ (written F = GBH in the notation of Sutherland, Sproull and Harris, where F is the path effort and H the electrical effort)
• Find the best number of stages: $N \approx \log_4 H$
• Compute the stage effort: $h = H^{1/N}$
• Sketch the path with this number of stages
• Working from either end, find the sizes: $C_{in} = C_{out} \cdot g / h$
Reference: Sutherland, Sproull, Harris, "Logical Effort", Morgan Kaufmann, 1999.

Summary
[Figure: summary table, from Sutherland, Sproull, Harris]

Power
• If well designed, the dynamic power prevails
• $P = \alpha_{0 \to 1}\, C_L V_{DD}^2 f$, where f is the switching frequency and $\alpha_{0 \to 1}$ is the activity factor (= 0.5 for the inverter)

Ratioed Logic
[Figure: (a) resistive load RL; (b) depletion load NMOS (VT < 0); (c) pseudo-NMOS (PMOS load); each load sits between VDD and the output F, above a PDN with inputs In1, In2, In3 connected to VSS]
Goal: to reduce the number of devices over complementary CMOS.

Ratioed Logic
[Figure: resistive load RL above a PDN with inputs In1, In2, In3 and output F]
• N transistors + load
• $V_{OH} = V_{DD}$
• $V_{OL} = \frac{R_{PN}}{R_{PN} + R_L} V_{DD}$
• Asymmetrical response
• Static power consumption
• $t_{pLH} = 0.69\, R_L C_L$

Active Loads
[Figure: depletion load NMOS (VT < 0) and PMOS load (pseudo-NMOS), each above a PDN with inputs In1, In2, In3 and output F]

Pseudo-NMOS
[Figure: pseudo-NMOS gate with inputs A, B, C, D, output F and load CL]
$V_{OH} = V_{DD}$ (similar to complementary CMOS)
$k_n \left( (V_{DD} - V_{Tn}) V_{OL} - \frac{V_{OL}^2}{2} \right) = \frac{k_p}{2} \left( V_{DD} - |V_{Tp}| \right)^2$
$V_{OL} = (V_{DD} - V_T) \left[ 1 - \sqrt{1 - \frac{k_p}{k_n}} \right]$ (assuming that $V_T = V_{Tn} = |V_{Tp}|$)
SMALLER AREA & LOAD BUT STATIC POWER DISSIPATION!!!

Pseudo-NMOS VTC
[Figure: Vout [V] versus Vin [V] for W/Lp = 4, 2, 1, 0.5 and 0.25]

Pass-Transistor Logic

Pass-Transistor Logic
[Figure: switch network driven by inputs A, B and their complements, producing Out]
• N transistors
• No static consumption

Example: AND Gate
[Figure: pass-transistor AND gate, F = AB; B passes A to the output and $\bar{B}$ passes 0]

NMOS-Only Logic
[Figure: NMOS pass transistor driving node x and an inverter to Out; device sizes 1.5 µm/0.25 µm, 0.5 µm/0.25 µm and 0.5 µm/0.25 µm; transient plot of Voltage [V] versus Time [ns] for In, x and Out]

NMOS Only Logic: Level Restoring Transistor
[Figure: pass transistor Mn (inputs A, B) driving node X, inverter M1/M2 to Out, and level restorer Mr connected to VDD]
• Advantage: full swing
• Restorer adds capacitance and takes away pull-down current at X
• Ratio problem

Solution 2: Transmission Gate
[Figure: transmission gate between A and B controlled by C and $\bar{C}$; example with C = 2.5 V, $\bar{C}$ = 0 V, A = 2.5 V and load CL at B]

Resistance of Transmission Gate
[Figure: resistance (ohms) versus Vout (V) for Rn, Rp and Rn || Rp, with the gate biased at 2.5 V and 0 V]

Transmission Gate XOR
[Figure: XOR built from M1, M2 and the transmission gate M3/M4, with inputs A, B, $\bar{B}$ and output F]

Delay in Transmission Gate Networks
[Figure: (a) chain of transmission gates from In through nodes V1, …, Vn, each loaded by C; (b) equivalent RC network with resistances Req; (c) the chain with a buffer inserted every m gates]

Delay Optimization
[Figure]

Dynamic Logic

Dynamic CMOS
• In static circuits, at every point in time (except when switching) the output is connected to either GND or VDD via a low-resistance path.
– fan-in of n requires 2n (n N-type + n P-type) devices
• Dynamic circuits rely on the temporary storage of signal values on the capacitance of high-impedance nodes.
– requires only n + 2 (n+1 N-type + 1 P-type) transistors

Dynamic Gate
[Figure: generic dynamic gate with precharge transistor Mp, PDN (inputs In1, In2, In3), evaluate transistor Me and load CL on Out, next to an example gate with inputs A, B, C]
Two-phase operation:
Precharge (Clk = 0)
Evaluate (Clk = 1)

Dynamic Gate
[Figure: the same gate annotated for both phases]
Two-phase operation:
Precharge (Clk = 0): Mp on, Me off, Out = 1
Evaluate (Clk = 1): Mp off, Me on, $Out = \overline{(AB) + C}$

Conditions on Output
• Once the output of a dynamic gate is discharged, it cannot be charged again until the next precharge operation.
• Inputs to the gate can make at most one transition during evaluation.
• The output can be in the high-impedance state during and after evaluation (PDN off); the state is stored on CL.

Properties of Dynamic Gates
• Logic function is implemented by the PDN only
– number of transistors is N + 2 (versus 2N for static complementary CMOS)
• Full swing outputs (VOL = GND and VOH = VDD)
• Non-ratioed: sizing of the devices does not affect the logic levels
• Faster switching speeds
– reduced load capacitance due to lower input capacitance (Cin)
– reduced load capacitance due to smaller output loading (Cout)
– no Isc, so all the current provided by the PDN goes into discharging CL

Properties of Dynamic Gates
• Overall power dissipation usually higher than static CMOS
– no static current path ever exists between VDD and GND (including Psc)
– no glitching
– higher transition probabilities
– extra load on Clk
• PDN starts to work as soon as the input signals exceed VTn, so VM, VIH and VIL are equal to VTn
– low noise margin (NML)
• Needs a precharge/evaluate clock

Solution to Charge Leakage
[Figure: dynamic gate (Mp, inputs A and B, Me, load CL) with keeper transistor Mkp driven from Out]
Same approach as the level restorer for pass-transistor logic.

Issues in Dynamic Design 2: Charge Sharing
[Figure: dynamic gate with Mp, inputs A and B = 0, Me, output capacitance CL and internal node capacitances CA and CB]
Charge stored originally on CL is redistributed (shared) over CL and CA, leading to reduced
robustness.

Charge Sharing
[Figure: dynamic gate with Mp, Ma (input A), Mb (input B = 0), Me, internal node X, and capacitances CL, Ca and Cb]
Case 1) if $\Delta V_{out} < V_{Tn}$:
$C_L V_{DD} = C_L V_{out}(t) + C_a \left( V_{DD} - V_{Tn}(V_X) \right)$
or
$\Delta V_{out} = V_{out}(t) - V_{DD} = -\frac{C_a}{C_L} \left( V_{DD} - V_{Tn}(V_X) \right)$
Case 2) if $\Delta V_{out} > V_{Tn}$:
$\Delta V_{out} = -V_{DD} \left( \frac{C_a}{C_a + C_L} \right)$

Solution to Charge Redistribution
[Figure: dynamic gate (Mp, inputs A and B, Me) with an extra clocked precharge transistor Mkp on the internal node]
Precharge internal nodes using a clock-driven transistor (at the cost of increased area and power).

Issues in Dynamic Design 3: Backgate Coupling
[Figure: dynamic NAND (A = 0, B = 0, Out1 = 1, load CL1) driving a static NAND (second input In, Out2 = 0, load CL2)]

Other Effects
• Capacitive coupling
• Substrate coupling
• Minority charge injection
• Supply noise (ground bounce)

Cascading Dynamic Gates
[Figure: two cascaded dynamic gates (In to Out1 to Out2) and waveforms of Clk, In, Out1 and Out2 versus t, with VTn and the voltage loss ΔV on Out2 marked]
Only 0 → 1 transitions allowed at inputs!

Domino Logic
[Figure: dynamic stage (Mp, PDN with inputs In1, In2, In3, Me) followed by a static inverter with keeper Mkp, driving a second stage (PDN with inputs In4, In5); Out1 makes a 1 → 1 or 1 → 0 transition, Out2 a 0 → 0 or 0 → 1 transition]

Why Domino?
[Figure: chain of clocked domino stages, each with a PDN and inputs Ini, Inj]
Like falling dominos!

Properties of Domino Logic
• Only non-inverting logic can be implemented
• Very high speed
– static inverter can be skewed, only L-H transition
– input capacitance reduced, smaller logical effort

Designing with Domino Logic
[Figure: two cascaded domino stages (Mp, PDN, Me) with level restorer Mr, outputs Out1 and Out2, and inputs In1 to In4]
Can be eliminated!
Inputs = 0 during precharge

Footless Domino
[Figure: chain of footless domino stages with outputs Out1, Out2, …, Outn and inputs In1, In2, In3, …, Inn, annotated with their 0 → 1 and 1 → 0 transitions]
• The first gate in the chain needs a foot switch
• Precharge is rippling: short-circuit current
• A solution is to delay the clock for each stage

Differential (Dual Rail) Domino
[Figure: dual-rail domino gate with precharge transistors Mp, cross-coupled keepers Mkp, inputs A, !A, B, !B, evaluate transistor Me, and outputs $Out = AB$ and $Out = \overline{AB}$]
Solves the problem of non-inverting logic.

np-CMOS
[Figure: PDN stage (inputs In1, In2, In3; Out1 makes a 1 → 1 or 1 → 0 transition) followed by a PUN stage (inputs In4, In5; Out2 makes a 0 → 0 or 0 → 1 transition and goes to the next PDN), clocked by Clk and its complement]
Only 0 → 1 transitions allowed at inputs of PDN.
Only 1 → 0 transitions allowed at inputs of PUN.
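
Worked example: Method of Logical Effort
As a closing illustration, the steps listed on the Method of Logical Effort slide earlier in this lecture can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `size_path`, its argument names, and the example path (inverter, NAND2, NOR2 with a load of 3.6 unit capacitances) are hypothetical; only the logical efforts g = 1, 4/3 and 5/3 are taken from the slides. The number of stages is assumed to be fixed by the given path, and branching efforts default to 1.

```python
import math


def size_path(logical_efforts, c_in, c_load, branching=None):
    """Size a fixed logic path so that every stage bears the same effort.

    logical_efforts: g_i of each stage, first stage first.
    c_in:            input capacitance of the first stage (constrained).
    c_load:          capacitance driven by the last stage.
    branching:       b_i of each stage; assumed 1 (no branching) if omitted.
    Returns (H, h, input_caps), where input_caps[i] is the input
    capacitance of stage i.
    """
    n = len(logical_efforts)
    if branching is None:
        branching = [1.0] * n

    G = math.prod(logical_efforts)   # path logical effort
    B = math.prod(branching)         # path branching effort
    F = c_load / c_in                # path electrical effort
    H = G * F * B                    # path effort
    h = H ** (1.0 / n)               # equal effort per stage

    # Work backwards from the load: Cin = Cout * b * g / h.
    input_caps = [0.0] * n
    c_out = c_load
    for i in range(n - 1, -1, -1):
        input_caps[i] = c_out * branching[i] * logical_efforts[i] / h
        c_out = input_caps[i]
    return H, h, input_caps


if __name__ == "__main__":
    # Hypothetical path: inverter -> NAND2 -> NOR2 driving 3.6 unit loads.
    H, h, caps = size_path([1.0, 4 / 3, 5 / 3], c_in=1.0, c_load=3.6)
    print(H, h, caps)
```

For this assumed path the path effort comes out at about 8 and the stage effort at about 2, giving input capacitances of roughly 1, 2 and 3 units. The first value reproduces the constrained input capacitance, which is a convenient sanity check on the backward pass.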