Posts

ZWL (Zero Wire Load): why is it preferred?

Image
1) If we provide the wire details to the tool before the PNR stage, it is called a wire load model (we provide a parasitic information file to the tool).
2) With ZWL we do not provide any wire information, so the tool assumes the wire load is zero (no parasitic information is given to the tool).

Then how does the tool estimate the parasitics? It does not: the tool treats the net delay as zero and, while checking timing, picks up only the cell delays from the standard cell libraries. This gives timing information that depends only on the standard cells used in the design.

How does this help with timing? If you run the tool with and without the parasitic information file, the setup and hold results of the chip will differ. With ZWL analysis the aim is to get setup timing clean and hold under control across the corners before real wire delays are added. Run time is also faster and implementation takes less time.
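The difference between the two modes can be sketched in a few lines of Python. This is only an illustration, not how any PNR or STA tool is implemented: the `Stage` class, the `path_delay` function and all delay numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    cell_delay_ns: float  # taken from the standard cell library
    net_delay_ns: float   # comes from parasitics, unknown before PNR


def path_delay(stages, zero_wire_load=False):
    """Sum cell and net delays; under ZWL every net delay counts as zero."""
    total = 0.0
    for stage in stages:
        net = 0.0 if zero_wire_load else stage.net_delay_ns
        total += stage.cell_delay_ns + net
    return total


# Hypothetical three-stage path (values are assumptions for illustration).
path = [Stage(0.10, 0.03), Stage(0.12, 0.05), Stage(0.08, 0.02)]

zwl_delay = path_delay(path, zero_wire_load=True)
full_delay = path_delay(path)
print(f"ZWL delay: {zwl_delay:.2f} ns, with parasitics: {full_delay:.2f} ns")
```

The ZWL number is always the smaller one, which is why it is the optimistic case for setup.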

SDC main checks for chip/block level.

1 Clock definitions: unclocked flops, and clocks defined on hierarchical pins
2 IO delays: unconstrained inputs and outputs, and conflicting constant values on the same pin
3 set_case_analysis: signal values that conflict with a different enable signal
4 Clock-to-clock false paths: missing timing exceptions between clocks whose periods are not integer multiples of each other
5 Generated clocks: generated clocks that do not define their master clock

The engineer needs to check the clock definitions, clock groups, synchronous relationships, and even the mode information of the design, all of which are available in a well-made SDC file. In fact, if the SDC is done right, major issues can be solved at the complete chip level or block level, and it lets you automatically create high-quality setup files for many other downstream tasks and tools.
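Check 4 can be illustrated with a small Python sketch. It assumes the clock names and periods have already been pulled out of the SDC into a dictionary; the function name `non_integer_pairs` and the clock values are hypothetical, and a real constraint checker does far more than this.

```python
from fractions import Fraction
from itertools import combinations


def non_integer_pairs(periods_ns):
    """Return clock pairs whose period ratio is not an integer.

    Such pairs usually need an explicit exception or clock group.
    """
    flagged = []
    for (name_a, p_a), (name_b, p_b) in combinations(sorted(periods_ns.items()), 2):
        slow, fast = max(p_a, p_b), min(p_a, p_b)
        # Build exact fractions from the decimal text to avoid float rounding.
        ratio = Fraction(str(slow)) / Fraction(str(fast))
        if ratio.denominator != 1:
            flagged.append((name_a, name_b))
    return flagged


# Hypothetical clocks, periods in ns.
clocks = {"clk_core": 2.0, "clk_bus": 4.0, "clk_io": 3.0}
suspicious = non_integer_pairs(clocks)
print(suspicious)
```

Here `clk_bus` and `clk_core` have a clean 2:1 ratio, while both pairs involving `clk_io` are flagged for review.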

ICG placement and its problems

Image
To overcome the timing problem when inserting clock gating, we need to insert the clock gate at the correct position so that it does not affect timing. Adding an ICG for every flop is not the answer either, because it leads to more dynamic power. For example, an ICG may be cloned for every 20 flops; the number again depends on the technology.
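As a rough sketch of the cloning idea, the Python below splits the flops behind one ICG into groups of at most 20, one group per cloned ICG. The limit of 20 is the example figure from the text, the function name is hypothetical, and the grouping here is purely by list order; a real tool would normally group by placement as well.

```python
def split_for_cloning(flops, max_fanout=20):
    """Split gated flops into groups, one group per cloned ICG."""
    return [flops[i:i + max_fanout] for i in range(0, len(flops), max_fanout)]


# Hypothetical design with 50 flops behind a single enable.
flops = [f"ff_{i}" for i in range(50)]
groups = split_for_cloning(flops)
group_sizes = [len(g) for g in groups]
print(len(groups), "ICGs with fanouts", group_sizes)
```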

Clock gating to optimize the leakage and dynamic power

Image
In the gating circuit above, which uses a MUX on the data path, the clocking strategy means the flop is clocked every cycle, even though the data (d) signal is valid only when the enable line goes high. Many clock cycles are therefore wasted. To overcome this problem we insert an AND gate on the clock instead of the mux (any suitable gate can be used, not only an AND gate). This optimizes the circuit for power consumption by ensuring that the flop is clocked only when the data (d) line is validated by the enable signal.
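A small cycle-based Python model makes the saving visible. It is a behavioural sketch only: the `simulate` function and the input vectors are made up, and it ignores glitch issues that a real gating cell has to handle.

```python
def simulate(d_per_cycle, en_per_cycle, gated, q0=0):
    """Return the flop output per cycle and how many clock pulses reach the flop."""
    q = q0
    q_trace = []
    clock_pulses = 0
    for d, en in zip(d_per_cycle, en_per_cycle):
        if gated:
            # Gated clock: the flop only sees a pulse when enable is high.
            if en:
                clock_pulses += 1
                q = d
        else:
            # Mux on data: the flop is clocked every cycle and recirculates q.
            clock_pulses += 1
            q = d if en else q
        q_trace.append(q)
    return q_trace, clock_pulses


d = [1, 1, 0, 1, 1, 0, 1, 0]
en = [0, 1, 0, 0, 1, 0, 0, 1]

mux_q, mux_pulses = simulate(d, en, gated=False)
gated_q, gated_pulses = simulate(d, en, gated=True)
print("same output:", mux_q == gated_q)
print("clock pulses, mux:", mux_pulses, "gated:", gated_pulses)
```

Both versions produce the same output sequence, but the gated flop receives a clock pulse only in the cycles where enable is high.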

AOCV

Image
AOCV (Advanced On-Chip Variation)

AOCV derating is calculated from the depth and the distance; the figure above gives the idea. For the figure above we take the depth as 6, because path-1 is the longest path, so the tool looks up depth 6 in the AOCV file, and the distance is calculated as the diagonal distance. AOCV is calculated from the diverging point only.

Note: for ease of understanding I used buffers in the diagram above; in place of the buffers there may be gates, flops, etc.

AOCV settings can be applied in both PNR tools and STA tools. During analysis we sometimes follow both graph-based analysis (GBA) and path-based analysis (PBA). With the help of AOCV we can recover about 3 to 4% of pessimism.

Let's see some differences between GBA and PBA.

GBA: takes the worst slew (transition), so it is not realistic; fast analysis; less memory required; worst AOCV derating is applied; etc.

PBA: the actual slew is propagated, so it is more realistic; slow analysis; more m
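The depth and distance lookup described in this post can be sketched in Python as a simple table read. Everything in the table is an assumption for illustration: the axis values, the derate numbers and the function names are invented, and real AOCV files and tools have their own formats and interpolation rules. This sketch just picks the nearest lower table entry.

```python
import math
from bisect import bisect_right

# Hypothetical late-derate table: rows are distance (um), columns are depth.
DEPTHS = [1, 2, 4, 6, 8]
DISTANCES_UM = [0, 500, 1000]
LATE_DERATE = [
    [1.12, 1.10, 1.08, 1.06, 1.05],
    [1.14, 1.12, 1.10, 1.08, 1.07],
    [1.16, 1.14, 1.12, 1.10, 1.09],
]


def floor_index(axis, value):
    """Index of the largest axis entry that is <= value (clamped at 0)."""
    return max(0, bisect_right(axis, value) - 1)


def diagonal_distance(points):
    """Diagonal of the box that encloses the cell locations."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return math.hypot(max(xs) - min(xs), max(ys) - min(ys))


def late_derate(depth, distance_um):
    row = floor_index(DISTANCES_UM, distance_um)
    col = floor_index(DEPTHS, depth)
    return LATE_DERATE[row][col]


# Depth 6 as in the post; cell locations are made up.
distance = diagonal_distance([(0, 0), (300, 400)])
derate = late_derate(6, distance)
print(f"distance {distance:.0f} um, depth 6 -> late derate {derate}")
```

In this made-up table the derate shrinks as depth grows and increases with distance, which is the general trend AOCV tables are built around.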

Noise models

Image
Figure labels: Above high, Above low, Below low, Below high

Beyond the rails: this is the more critical part for the block; if it occurs there is a high possibility of gate damage.

Between the rails: covered by the power model files such as CCS, NLPM, etc., depending on the technology.

To avoid beyond-the-rails effects in the block: we first do some reliability checks and then add some margin. This is not part of the noise models; the margin is set manually with the set_noise_margin command.

How much margin? It again depends on what the gate can withstand, which comes from the foundry; it is generally 30 to 40% of VDD.

Scan Chain Insertion

Image
All the flip-flops present in the design are replaced with scan flip-flops (for a full-scan design). The scan flip-flops are connected together in the form of a chain, which we call a scan chain. The scan chain acts as a shift register when the design is in test mode, that is, when Scan_EN (the test enable signal) is active high. The first flip-flop of the scan chain is connected to the scan input port and the last flop of the scan chain is connected to the scan output port (scan_op). When some of the flip-flops are intentionally not converted to scan flops, the design is called a partial-scan design. Making a design full scan makes it more testable for manufacturing defects at the cost of complexity, area and power. There are three stages of scan chain operation
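The shift-register behaviour of a scan chain can be modelled in a few lines of Python. This is a behavioural sketch under simple assumptions (one chain, one bit per flop, a hypothetical `ScanChain` class); it is not tied to any DFT tool.

```python
class ScanChain:
    """Flops in a chain; index 0 is nearest the scan input port."""

    def __init__(self, length):
        self.flops = [0] * length

    def clock(self, scan_en, scan_in=0, functional_d=None):
        """Apply one clock edge and return the scan output seen before it."""
        scan_out = self.flops[-1]
        if scan_en:
            # Test mode: every flop takes the value of its neighbour.
            self.flops = [scan_in] + self.flops[:-1]
        elif functional_d is not None:
            # Functional mode: every flop captures its own D input.
            self.flops = list(functional_d)
        return scan_out


chain = ScanChain(4)

# Shift a pattern in with Scan_EN high.
for bit in [1, 0, 1, 1]:
    chain.clock(scan_en=1, scan_in=bit)
loaded = list(chain.flops)

# Shift the contents back out, feeding zeros in.
shifted_out = [chain.clock(scan_en=1, scan_in=0) for _ in range(4)]
print("loaded:", loaded, "shifted out:", shifted_out)
```

The first bit shifted in ends up in the flop nearest the scan output, so the bits leave the chain in the same order they entered.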

KEEPOUT MARGIN

Image
Keepout margin: This is one of the techniques used during placement of macros and standard cells in the core area. It is the region around the boundary of fixed cells in a block in which no other cells are placed. The width of the keepout margin on each side of the fixed cell can be the same or different. Keeping other cells and macros out of this region helps avoid congestion, net detouring and timing problems, for the best quality of results. How do we decide the margin? The margin is decided based on the macros or IPs. These macros and IPs come from vendors, and the vendor document clearly states the expected margin values around the macro. We take the values from the document, and we may increase or decrease them depending on the problems faced in the core area. For the std cells, we apply a keepout margin mainly to AOI, OAI and multibit flops, etc. Why only AOI, OAI and multibit flops? Because these cells will

Clock_skew , (+)SKEW, (-)SKEW, Useful skew

Image
Clock skew: the difference in clock arrival time at two spatially distinct points (TCLK1 and TCLK2).

Positive skew: if the capture clock arrives later than the launch clock, it is called +skew. +skew can lead to hold violations.

Negative skew: if the capture clock arrives earlier than the launch clock, it is called -skew. -skew can lead to setup violations.

Useful skew: the concept of deliberately delaying the clock path of the capturing flop. This helps meet the setup requirement of the launch-to-capture timing path, but the hold requirement must still be met for the design. Useful skew techniques can be used to fix both setup and hold violations. One disadvantage of this technique is that if the design has multiple modes of operation, useful skew can potentially cause a problem in another mode.
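The effect of skew on setup and hold can be checked with the usual single-cycle slack equations. The Python below is a sketch with made-up delay values; skew is taken as capture clock arrival minus launch clock arrival, and the function and parameter names are chosen only for this example.

```python
def setup_slack(period, skew, t_clk_q, t_logic_max, t_setup):
    """Setup slack for a single-cycle path; positive skew adds margin."""
    return (period + skew) - (t_clk_q + t_logic_max + t_setup)


def hold_slack(skew, t_clk_q, t_logic_min, t_hold):
    """Hold slack for the same path; positive skew removes margin."""
    return (t_clk_q + t_logic_min) - (t_hold + skew)


# Hypothetical path, all values in ns.
params = dict(t_clk_q=0.10, t_logic_max=0.90, t_setup=0.05)
hold_params = dict(t_clk_q=0.10, t_logic_min=0.02, t_hold=0.05)

for skew in (-0.10, 0.0, 0.10):
    s = setup_slack(1.0, skew, **params)
    h = hold_slack(skew, **hold_params)
    print(f"skew {skew:+.2f}: setup slack {s:+.2f}, hold slack {h:+.2f}")
```

With these numbers the path fails setup at zero skew and passes with +0.10 ns of skew, but the same +0.10 ns turns a passing hold check into a violation, which is the trade-off useful skew has to manage.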

Basic gates using 2x1 mux: identify the gates A, B, C, D, E, F

Image
 

3x1 Mux using 2X1 Mux

Image
 

Fullchip Design (sample)

Image
 

Logic_Synthesis_single_shot_overview

Image

Layout versus Schematic and Waveforms

Image
Figure panels: Inverter, Layout, Transient Response

Clock_Tree_TYPES__Clock_Mesh

Image
Advantage: very small skew.
Disadvantage: the time to propagate from the clock root to the global mesh is basically the same, but the time to propagate from the global mesh to each local tree register is different. The clock mesh structure also consumes more routing resources, and its redundant interconnect structure brings more power consumption.

Clock_Tree_TYPES__Fish bone (spine Tree)

Image
Advantage: it makes it easy to reduce the skew.
Disadvantage: it is affected by process parameters and has a problem with phase delay.

Clock_Tree_TYPES__Balanced --Tree

Image
Advantage: it is easy to adjust the capacitance of the driving net to better achieve the skew requirement.
Disadvantage: the dummy cells used to balance the load increase power and area.

CLOCK Tree TYPES___H---TREE

Image
H-Tree
Advantage: it is easy to reduce clock skew.
Disadvantage: it is difficult to fix register placement.

Sample Block

Image
This picture is from smartplay.

Power Planning

Image