Synthesis : The Soul of Physical Design
ASIC physical design consists of several stages, each with its own importance. The quality of each stage directly impacts the stages that follow, so high-quality execution of the early stages is essential for an efficient and effective physical design implementation. This post discusses the significance of synthesis in detail.
Synthesis can be considered "The Soul" of physical design. It serves as the critical bridge between the high-level design abstraction (RTL) and the physical design implementation. Below are a few points that make synthesis the "Soul" of physical design.
- Translation of RTL into a technology-dependent gate-level netlist: An ASIC is technology dependent (90nm, 65nm, 28nm...), while RTL (register transfer level) is technology independent. Synthesis is the first stage where the RTL is converted into a technology-dependent form.
- Optimization: During synthesis, the design is optimized for key goals such as power, performance, and area.
- Foundation for timing closure: Synthesis applies the optimizations needed for the design to meet its target frequency and timing constraints. It is therefore critical for achieving timing closure in the later stages of physical design.
- Hierarchical design support: For larger designs, synthesis enables hierarchical design methodologies, which are effective in handling large, complex ASIC chips.
- Seamless handoff to physical design: The output of synthesis, a gate-level netlist, is the key input for physical design. A well-synthesized netlist ensures a smooth transition from PnR to the signoff stages.
Hence, synthesis breathes life into physical design by providing a robust and optimized representation of the design.
Logic Synthesis Flow
Below is the general logic synthesis flow.
Inputs:
- RTL code
- Timing library (.lib) files
- Design constraints: several types of design constraints are available:
- Timing constraints: define the clocking structure and timing requirements.
- Timing exceptions: modify or relax the requirements on specific paths.
- Area constraints: guide the tool to optimize area.
- Design rule constraints: define design rules such as timing DRVs (max_capacitance, max_transition, max_fanout...).
- Power constraints: guide the tool to optimize power.
- UPF (only for low power design)
Elaborate:
- During elaboration, the synthesis tool infers sequential and combinational elements from the high-level RTL syntax.
- posedge/negedge --> flip-flops are inferred
- if...else --> multiplexers (if any condition is missing, a latch is inferred along with the mux)
- case statements --> multiplexers or decoder/encoder logic, depending on what the case describes
- High-level optimization, such as removing combinational loops, floating inputs, etc.
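To make the if/else point concrete, here is a minimal behavioral sketch in Python (not RTL, and not how any synthesis tool works internally). The names `mux2` and `Latch` are hypothetical. A complete if/else depends only on its current inputs, so it behaves like a multiplexer. An `if` with no `else` has to hold its previous value when the condition is false, and that storage behavior is what a latch provides.

```python
def mux2(sel: int, a: int, b: int) -> int:
    """Complete if/else: output depends only on current inputs (a 2:1 mux)."""
    if sel:
        return a
    else:
        return b


class Latch:
    """Incomplete if (no else): the old value must be remembered."""

    def __init__(self) -> None:
        self.q = 0  # stored state; the initial value is an assumption

    def update(self, enable: int, d: int) -> int:
        if enable:
            self.q = d
        # No else branch: q keeps its previous value, so storage is implied.
        return self.q


if __name__ == "__main__":
    print(mux2(1, 5, 9), mux2(0, 5, 9))
    latch = Latch()
    first = latch.update(1, 1)   # enabled: the latch takes the new value
    second = latch.update(0, 0)  # disabled: the latch holds the old value
    print(first, second)
```

This is why a missing branch in combinational RTL is usually treated as a coding issue to review rather than something to leave to the tool.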
Sanity checks:
- Before proceeding to further stages, it is really important to check the quality of the inputs.
- For more information about sanity checks, refer to the link below:
Mapping:
- This is the stage where the RTL is converted to a gate-level representation. Mapping is done in two stages:
- Generic mapping: cells are mapped to a generic library that is independent of the technology library.
- Technology mapping: the tool converts the generic gate-level representation to a technology-dependent one, based on the provided timing libraries.
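The two mapping stages can be pictured with a toy sketch in Python. Everything here is an assumption for illustration: the library contents, the cell names, the area numbers, and the "pick the smallest cell" rule. A real tool selects cells by weighing timing, power, and area together using the data in the .lib files.

```python
# Toy technology library: generic function -> candidate cells as (name, area).
# All cell names and area values are made up for illustration.
TOY_LIBRARY = {
    "AND2": [("AND2_X1", 1.2), ("AND2_X2", 1.8)],
    "OR2": [("OR2_X1", 1.2)],
    "INV": [("INV_X1", 0.6), ("INV_X4", 1.4)],
}


def map_to_technology(generic_netlist):
    """Replace each generic gate with the smallest-area library cell."""
    mapped = []
    for instance, function in generic_netlist:
        candidates = TOY_LIBRARY[function]
        cell_name, _area = min(candidates, key=lambda cell: cell[1])
        mapped.append((instance, cell_name))
    return mapped


if __name__ == "__main__":
    # Result of "generic mapping": instances with technology-independent gates.
    generic = [("u1", "AND2"), ("u2", "OR2"), ("u3", "INV")]
    for instance, cell in map_to_technology(generic):
        print(instance, "->", cell)
```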
Optimization:
- This is the main step, where the design is optimized against the given power, performance, and area constraints.
- Logic optimization in this stage takes power, performance, and area into account, using techniques such as Boolean logic optimization. Timing optimization is driven by the given timing constraints, such as clock information and timing exceptions.
- Logic optimization example:
- Y = AB + AC
- This can be restructured into Y = A(B + C), which needs fewer logic gates and therefore reduces area and power.
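The restructuring above is only valid because both forms compute the same function. A quick way to convince yourself is to compare them over every input combination, as in this small Python sketch (the function names are hypothetical):

```python
from itertools import product


def original(a: int, b: int, c: int) -> int:
    return (a & b) | (a & c)      # Y = AB + AC: two ANDs and one OR


def restructured(a: int, b: int, c: int) -> int:
    return a & (b | c)            # Y = A(B + C): one OR and one AND


def equivalent() -> bool:
    """Compare both forms over all 8 input combinations."""
    return all(
        original(a, b, c) == restructured(a, b, c)
        for a, b, c in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print(equivalent())
```

The restructured form uses two gates instead of three, which is where the area and power saving comes from.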
Advanced Optimization techniques
Below are a few advanced optimization techniques used to achieve better PPA (power, performance, area):
- Sequential/Combinational Merging
- Sequential/combinational merging is an effective technique for reducing power and area.
- Combinational merging:
- This process identifies multiple individual combinational cells that can be implemented more efficiently as a single logic structure and merges them. The main thing to note is that functionality is not affected by this process.
- While this process reduces power and area, the synthesis tool also ensures that it doesn't degrade timing quality. To protect timing, the tool excludes combinational cells on timing-critical paths from merging.
- Sequential merging/multi-bit flip-flops:
- This combines multiple single-bit flops into a single multi-bit flop. The technique is used to reduce power and area, and to help timing, at advanced nodes.
- Reducing the number of flop cells reduces the number of clock tree cells, which helps to reduce skew (ultimately helping timing) and power.
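The effect of multi-bit flops on the clock tree can be estimated with simple arithmetic. The sketch below assumes that each multi-bit cell has a single clock pin and that every flop can be merged; the function name and the numbers are hypothetical. In practice only compatible flops (for example, same clock and same control signals) can be merged, so the real reduction is smaller.

```python
import math


def clock_sinks(num_flop_bits: int, bits_per_cell: int) -> int:
    """Clock pins the clock tree must drive if every bit is merged."""
    return math.ceil(num_flop_bits / bits_per_cell)


if __name__ == "__main__":
    total_bits = 1000  # illustrative register count
    for width in (1, 2, 4):
        print(f"{width}-bit cells: {clock_sinks(total_bits, width)} clock sinks")
```

Fewer clock sinks generally means fewer clock buffers, which is the source of the power and skew benefit described above.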
- Retiming:
- In this process, the positions of flip-flops are adjusted along the datapath without affecting the functionality of the design. Retiming can reduce the delay of critical logic and so helps the design operate at a higher frequency.
- Retiming is limited to paths within the same clock domain.
- For example, suppose one path has a delay of 500 and is critical, while the next path has a delay of only 200. By moving the middle flops, the delay of the first path can be reduced from 500 to 400.
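The retiming example above can be worked through in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the delay unit is whatever the 500 and 200 are measured in, the 100 units of logic moved out of the first stage are assumed to land in the second stage (the example only states the 500 to 400 change), and flop setup time and clock-to-Q delay are ignored.

```python
def min_clock_period(stage_delays):
    """The slowest register-to-register stage limits the clock period."""
    return max(stage_delays)


def retime(stage_delays, index, moved_delay):
    """Move `moved_delay` of logic from stage `index` to the next stage."""
    retimed = list(stage_delays)
    retimed[index] -= moved_delay
    retimed[index + 1] += moved_delay
    return retimed


if __name__ == "__main__":
    before = [500, 200]
    after = retime(before, 0, 100)
    print("before:", before, "period limit:", min_clock_period(before))
    print("after: ", after, "period limit:", min_clock_period(after))
```

The total logic delay across the two stages is unchanged. Only the balance between them improves, which is why retiming helps most when neighboring stages are uneven.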
- Ungrouping/boundary merging:
- If the module boundary is not restricted, the synthesis tool can flatten/ungroup small modules to help reduce the overall datapath delay.
- For example, consider modules A, B, and C that are all part of one timing path. To reduce the datapath delay, the tool can merge and flatten module B and module C.
- Sequential removal:
- To reduce area and power, the synthesis tool can remove registers that do not contribute to the design functionality.
- Removing unwanted/unused flops reduces area, and fewer clock tree cells are needed during clock tree synthesis.
- There are two main categories of unused flops in a design.
- Unloaded registers: flops whose outputs are floating are not used in the design and can be removed easily without affecting design functionality.
- Constant registers:
- Flops whose data input pins are tied to a constant (0 or 1) don't contribute to the design functionality, so they can be removed without affecting functionality.
- If constant flops are removed, the Formality constraints need to be updated accordingly.
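The two categories above can be sketched as a simple check over a toy netlist. The data structure, the flop names, and `find_removable_flops` are all hypothetical and only illustrate the classification. A real tool also has to consider things such as set/reset behavior before treating a flop as constant.

```python
CONSTANTS = {"0", "1"}


def find_removable_flops(flops):
    """Return {flop_name: reason} for unloaded and constant registers."""
    removable = {}
    for name, pins in flops.items():
        if not pins["q_loads"]:
            removable[name] = "unloaded"   # output drives nothing
        elif pins["d"] in CONSTANTS:
            removable[name] = "constant"   # data input tied to 0 or 1
    return removable


if __name__ == "__main__":
    toy_flops = {
        "ff_a": {"d": "n1", "q_loads": ["u5/A"]},  # used: keep
        "ff_b": {"d": "n2", "q_loads": []},         # output floating
        "ff_c": {"d": "0", "q_loads": ["u7/B"]},   # data input tied low
    }
    print(find_removable_flops(toy_flops))
```

A list like this is also a natural starting point when reviewing removed registers with the RTL designer.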
- Boundary optimization: Using boundary optimization techniques, the synthesis tool can optimize across the hierarchical interfaces of design, module, cell, and pin objects. This helps achieve better PPA metrics. Below are a few methods to optimize the design with respect to boundaries.
- Constant propagation:
- For example, if one input of an AND gate is tied off, enabling boundary optimization lets the synthesis tool remove the AND gate.
- equal/opposite propagation
- In below example upper flops is connected to one
- unload propagation
- For example, a flop and a buffer that are unloaded are removed during boundary optimization.
- Assign statements:
- If boundary optimization is not enabled, the synthesis tool can add buffers wherever an input pin is connected directly to an output pin, to avoid assign statements in the netlist. Dummy modules that contain only pin connectivity lead to a lot of additional buffers being added to fix assign statements.
- Consider a scenario where modules B, C, and D are empty modules containing only connectivity. With boundary optimization disabled, 3 buffers are added in each module to fix the assign statements.
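The constant propagation case described earlier (an AND gate with one input tied off) comes down to a simple Boolean rule. The sketch below is a hypothetical helper, not a tool algorithm: it shows what happens to a 2-input AND gate once a constant is known on one of its inputs. The link to boundary optimization is that the constant often arrives through a module port, so the tool can only apply the rule if it is allowed to look across the hierarchy boundary.

```python
def simplify_and(input_a, input_b):
    """Simplify a 2-input AND when an input is a known constant.

    Inputs are net names (str) or the constants 0 / 1.
    Returns a constant, a net name (the gate becomes a wire), or the
    tuple ("AND", a, b) when nothing can be simplified.
    """
    for const, other in ((input_a, input_b), (input_b, input_a)):
        if const == 0:
            return 0          # anything AND 0 is 0
        if const == 1:
            return other      # x AND 1 is x
    return ("AND", input_a, input_b)


if __name__ == "__main__":
    print(simplify_and("n1", 0))     # gate replaced by constant 0
    print(simplify_and(1, "n2"))     # gate replaced by a wire to n2
    print(simplify_and("n1", "n2"))  # no constant: gate stays
```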
- DFT insertion:
- In this step, DFT is inserted, which adds logic related to test and scan chains.
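The scan chain added during DFT insertion can be pictured with a small behavioral model. This is only a sketch in Python: `clock_scan_chain` and its arguments are hypothetical names, and real scan flops and test control logic are far more involved. With scan enable high, the flops form a shift register. With it low, they capture their functional inputs.

```python
def clock_scan_chain(state, scan_enable, scan_in, functional_d):
    """Return the next state of a chain of scan flops after one clock edge.

    state        : current flop values, first element is nearest scan-in
    functional_d : values the functional logic would capture
    """
    if scan_enable:
        # Shift mode: each flop takes the value of the previous flop.
        return [scan_in] + state[:-1]
    # Functional (capture) mode: each flop takes its normal D input.
    return list(functional_d)


if __name__ == "__main__":
    state = [0, 0, 0]
    for bit in (1, 0, 1):  # shift a test pattern in, one bit per clock
        state = clock_scan_chain(state, 1, bit, [0, 0, 0])
        print(state)
```

The mux in front of each flop's D input is part of the extra logic this step adds, which is one reason an optimization pass after DFT insertion is useful.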
- Incremental optimization:
- After DFT insertion, an incremental optimization is recommended to better meet the PPA requirements.
- Output:
- After synthesis completes, a gate-level netlist is generated as the output.
- Along with the netlist, additional reports are generated that help to analyze the quality of results.
Physical Synthesis :
- In physical synthesis, the flow is the same as in logical synthesis, with a few additional inputs.
- Below are the additional inputs required for physical synthesis:
- Floorplan information (Floorplan shape/size and Port, Macro placement)
- Technology file (.tf)
- Physical library (.lef)
- Physical constraints (such as min-max routing layer)
- RC coefficient file (.tluplus)
- MMMC (multi-mode multi-corner) file.
To learn more about physical synthesis, and for a detailed comparison between logical and physical synthesis with respect to practical aspects, click on the post below:
In this post I have tried to cover synthesis. If there is any additional information that should be added, please comment. That would be great for enhancing the concept.
If you have any queries or suggestions, let me know in the comments.
I hope you find this post useful. If yes, like and share the post with friends and colleagues.
Hi Jignesh, nice explanation.
Could you please add which methods to follow when we hit an LOL issue at the synthesis stage, and also in which cases we can use the pipelining method, along with its advantages and disadvantages?
Thank you for being the first one to comment.
By LOL, I am assuming you mean large levels of logic.
Synthesis tools can do register retiming. By default, the synthesis tool considers both timing and register count. If you want to focus only on timing, at the cost of more registers, timing can be optimized further.
Regarding manual pipelining, it will increase throughput, so adding pipelining requires design-level information (the PD owner can't make the call to add pipelining).
Many-to-many connections (e.g. priority decoder logic) end up as large logic levels. Such issues can be fixed by restructuring the RTL code (though you need to convince the RTL designer for that.....!!)
Hi,
Could you please explain how the synthesis tool understands that it has to convert RTL to a gate-level netlist? Or what guidance does the synthesis guy have to provide to the synthesis tool?
Well, your question is very deep.
As far as I know, the synthesis tool's algorithm converts technology-independent RTL to a technology-dependent gate-level netlist. This is part of the internal algorithm of the synthesis tool. The user just needs to run the commands for synthesis optimization.
However, the user can guide the tool toward different optimizations based on PPA requirements.
Thank you for the comprehensive explanation.
Following up on your explanation:
Sequential Removal:
Question: Is this option enabled in the tool?
Context: I remember this required additional confirmation from the frontend team in a previous project. Please advise on the process.
Boundary Optimization:
Request: Sometimes it may lead to functional inequivalence at the block level. What are the specific risks or conditions we should watch for?
Also, please include the exact tool commands needed for each stage you described.
Hi Sushmita,
Thank you.
Sequential removal:
This is not enabled by default.
Yes, Formality needs to be checked. If it is PASS, it should be fine. (If the RTL designer wants to keep a few flops from being deleted, ask for the list and apply size_only.)
Boundary optimization:
Yes, Formality must be run (this will come as the very next post..! Stay tuned).
If Formality is PASS, you can use both methods, as they help to reduce power/area drastically.
As this is a global platform, I am not allowed to share the tool commands.
How do we fix congestion in synthesis? I mean, if we are facing pin density issues or any other congestion, how do we fix it?
I am assuming you are referring to physical synthesis here (planning to create a separate post on this topic... stay tuned).
However, the methods for fixing congestion are the same for PnR and physical synthesis.
Thank you so much, Jignesh sir...
No need to call me Sir.. Jignesh is enough..! (I am also learning along with you all.!)