Full digital flow with Cadence tools and NCSU standard library

 

This post walks through the steps required to take a digital design to a fabricated chip in the Cadence environment, using the NCSU 0.5um standard digital library. Many of the details can be adapted to other CAD tools or technologies, and the same flow can be used for mixed-signal projects by including the digital layout inside Virtuoso. I find it helpful because this kind of content is difficult to find on the internet. I am not going to go deep into the tools' capabilities, only explore the basic steps needed to reach a final product. With the exception of the Cadence tools, you can get a totally free chip using the NCSU design kit and the MOSIS service (if you are attached to a university). The process steps may differ depending on the system being used.

The digital flow starts with Verilog/VHDL code that describes the desired project. For the sake of simplicity, let’s use Verilog code that implements an 8-bit counter:

module up_counter(
  out,
  enable,
  clk,
  reset
);

output [7:0] out;
input enable, clk, reset;
reg [7:0] out;

always @(posedge clk)
if (reset) begin
  out <= 8'b0 ;
end else if (enable) begin
  out <= out + 1;
end

endmodule

And its testbench:


`include "up_counter.v"

module up_counter_tb;

wire [7:0] out;
reg enable, clk, reset;

up_counter U0(
  .out (out),
  .enable (enable),
  .clk (clk),
  .reset (reset)
);

initial begin

  $monitor("out=%8b enable=%b clk=%b reset=%b",
            out,enable,clk,reset);
  enable=1;
  reset=0;
  clk=0;
  #1
  reset=1;
  #1
  clk=1;
  #1
  reset=0;
  clk=0;
  #1
  repeat(300) #20 clk = ~clk;

end

endmodule
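
Before looking at the waveform, it helps to know which values to expect. The following is a minimal Python reference model of the counter. It is not part of the original flow, and the names `next_count` and `run_testbench_model` are made up for illustration. It assumes the stimulus of the testbench above: one rising clock edge while reset is high, followed by 150 rising edges (300 clock toggles) with enable high.

```python
def next_count(out, enable, reset, width=8):
    """Value the counter register takes on a rising clock edge."""
    if reset:
        return 0
    if enable:
        return (out + 1) % (1 << width)  # wraps around after 2**width - 1
    return out


def run_testbench_model(rising_edges=150):
    """Mimic the stimulus: one reset edge, then `rising_edges` enabled edges."""
    # The register is X before the reset edge in simulation; 0 is a stand-in here.
    out = next_count(0, enable=1, reset=1)
    trace = [out]
    for _ in range(rising_edges):
        out = next_count(out, enable=1, reset=0)
        trace.append(out)
    return trace


if __name__ == "__main__":
    trace = run_testbench_model()
    print("final count:", format(trace[-1], "08b"))
```

Under these assumptions the last value printed by `$monitor` should be 150, that is `10010110`.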

The simulation can be performed with the NCSim tool “irun”, either in a graphical window or in the terminal. To run it in a graphical window (SimVision):

irun -gui -access rwc up_counter_tb.v

After the window pops up, click on the instance “up_counter_tb” and then on “Send To: Waveform”. Press “Run” and the simulation results can be seen:

SimVision

With the Verilog functionality verified, it’s time to synthesize it! The “Encounter RTL Compiler” translates the Verilog code into a combination of digital cells from a standard library, NCSU in our case. The file describing the digital cells usually has a “.lib” suffix; for the NCSU 0.5um technology it is named “osu05_stdcells.lib”. Make sure to copy this file into your working directory. Next, run the compiler:

"path_to_RC"/bin/rc

When the RC terminal opens, insert the following commands:

  1. set_attribute library osu05_stdcells.lib
  2. set interconnect_mode wireload
  3. read_hdl up_counter.v
  4. elaborate
  5. synthesize -to_mapped
  6. report area (optional)
  7. report gates (optional)
  8. report timing (optional)
  9. write_hdl > up_counter_synth.v
  10. exit
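
Since the RC prompt is a Tcl shell, the commands above can also be kept in a file and loaded with Tcl's `source` command instead of being typed every time. The sketch below is a small Python helper that writes such a file. The helper names and the file name `synth.tcl` are hypothetical; the commands themselves are copied from the list above.

```python
from pathlib import Path


def rc_commands(design, lib="osu05_stdcells.lib", reports=True):
    """Return the RC command sequence for `design` (without the .v suffix)."""
    cmds = [
        f"set_attribute library {lib}",
        "set interconnect_mode wireload",
        f"read_hdl {design}.v",
        "elaborate",
        "synthesize -to_mapped",
    ]
    if reports:  # the three optional report steps
        cmds += ["report area", "report gates", "report timing"]
    cmds += [f"write_hdl > {design}_synth.v", "exit"]
    return cmds


def write_rc_script(design, path="synth.tcl", **kwargs):
    """Write the command sequence to `path`, one command per line."""
    Path(path).write_text("\n".join(rc_commands(design, **kwargs)) + "\n")
    return path


if __name__ == "__main__":
    print("\n".join(rc_commands("up_counter")))
```

Keeping the sequence in one place makes it easy to rerun the synthesis after every change to the Verilog source.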

For more detail on these commands, please refer to the user manual inside the tool package. After the steps above, the tool writes the synthesized Verilog file:

module up_counter(out, enable, clk, reset);
  input enable, clk, reset;
  output [7:0] out;
  wire enable, clk, reset;
  wire [7:0] out;
  wire n_0, n_1, n_2, n_3, n_4, n_5, n_6, n_7;
  wire n_10, n_11, n_12, n_14, n_15, n_16, n_17, n_18;
  wire n_19, n_20, n_21, n_22, n_23, n_24, n_25, n_26;
  wire n_27, n_28, n_29, n_30, n_31, n_32, n_33, n_34;
  wire n_35, n_36, n_37, n_38, n_39, n_40, n_52;
  DFFPOSX1 \out_reg[7] (.CLK (clk), .D (n_40), .Q (out[7]));
  NOR2X1 g129(.A (reset), .B (n_38), .Y (n_40));
  DFFPOSX1 \out_reg[6] (.CLK (clk), .D (n_39), .Q (out[6]));
  NOR2X1 g132(.A (reset), .B (n_36), .Y (n_39));
  DFFPOSX1 \out_reg[5] (.CLK (clk), .D (n_37), .Q (out[5]));
  AOI22X1 g131(.A (enable), .B (n_34), .C (n_35), .D (out[7]), .Y
       (n_38));
  NOR2X1 g137(.A (reset), .B (n_32), .Y (n_37));
  DFFPOSX1 \out_reg[4] (.CLK (clk), .D (n_33), .Q (out[4]));
  AOI22X1 g134(.A (enable), .B (n_29), .C (n_35), .D (out[6]), .Y
       (n_36));
  OAI21X1 g133(.A (out[7]), .B (n_30), .C (n_31), .Y (n_34));
  NOR2X1 g143(.A (reset), .B (n_28), .Y (n_33));
  DFFPOSX1 \out_reg[3] (.CLK (clk), .D (n_27), .Q (out[3]));
  AOI22X1 g139(.A (enable), .B (n_26), .C (n_35), .D (out[5]), .Y
       (n_32));
  NAND2X1 g136(.A (out[7]), .B (n_30), .Y (n_31));
  DFFPOSX1 \out_reg[2] (.CLK (clk), .D (n_25), .Q (out[2]));
  OAI21X1 g138(.A (out[6]), .B (n_23), .C (n_24), .Y (n_29));
  AOI22X1 g145(.A (enable), .B (n_19), .C (n_35), .D (out[4]), .Y
       (n_28));
  NOR2X1 g149(.A (reset), .B (n_22), .Y (n_27));
  OAI21X1 g144(.A (out[5]), .B (n_21), .C (n_20), .Y (n_26));
  NOR2X1 g155(.A (reset), .B (n_18), .Y (n_25));
  NAND2X1 g141(.A (out[6]), .B (n_23), .Y (n_24));
  OR2X2 g142(.A (n_23), .B (n_0), .Y (n_30));
  AOI22X1 g151(.A (enable), .B (n_15), .C (n_35), .D (out[3]), .Y
       (n_22));
  OR2X2 g147(.A (n_21), .B (n_3), .Y (n_23));
  NAND2X1 g148(.A (out[5]), .B (n_21), .Y (n_20));
  OAI21X1 g150(.A (out[4]), .B (n_16), .C (n_17), .Y (n_19));
  AOI22X1 g157(.A (enable), .B (n_52), .C (n_35), .D (out[2]), .Y
       (n_18));
  DFFPOSX1 \out_reg[1] (.CLK (clk), .D (n_14), .Q (out[1]));
  NAND2X1 g153(.A (out[4]), .B (n_16), .Y (n_17));
  OR2X2 g154(.A (n_16), .B (n_2), .Y (n_21));
  OAI21X1 g156(.A (out[3]), .B (n_11), .C (n_12), .Y (n_15));
  NOR2X1 g161(.A (reset), .B (n_10), .Y (n_14));
  NAND2X1 g159(.A (out[3]), .B (n_11), .Y (n_12));
  OR2X2 g160(.A (n_11), .B (n_1), .Y (n_16));
  DFFPOSX1 \out_reg[0] (.CLK (clk), .D (n_7), .Q (out[0]));
  AOI22X1 g163(.A (enable), .B (n_4), .C (n_35), .D (out[1]), .Y
       (n_10));
  NAND2X1 g165(.A (out[2]), .B (n_6), .Y (n_11));
  NOR2X1 g167(.A (reset), .B (n_5), .Y (n_7));
  MUX2X1 g170(.A (n_35), .B (enable), .S (out[0]), .Y (n_5));
  HAX1 g169(.A (out[1]), .B (out[0]), .YC (n_6), .YS (n_4));
  INVX1 g174(.A (out[5]), .Y (n_3));
  INVX1 g173(.A (out[4]), .Y (n_2));
  INVX1 g171(.A (out[3]), .Y (n_1));
  INVX1 g175(.A (enable), .Y (n_35));
  INVX1 g172(.A (out[6]), .Y (n_0));
  XOR2X1 g2(.A (out[2]), .B (n_6), .Y (n_52));
endmodule

The code above is the synthesis result: starting from behavioral Verilog, the tool produces structural code built only from cells in the provided standard library.

Now it is time to verify the design again, after synthesis. This requires the Verilog file “osu05_stdcells.v”, which contains the simulation models of the digital cells. After copying it into your working directory, replace the include line of the previously created testbench with:

`include "osu05_stdcells.v"
`include "up_counter_synth.v"

Then, just run the simulation again with the “irun” tool. Double-check the results and correct the project if they are wrong.
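
One way to double-check is to save the `$monitor` output of the behavioral and the synthesized runs (however you capture them) and compare the counter values. The following Python sketch assumes the log lines have exactly the format printed by the testbench above; the function names are hypothetical. It samples `out` on each falling clock edge, so extra lines caused by gate delays right after a rising edge do not produce false mismatches.

```python
import re

# Matches one line printed by the $monitor call of the testbench.
LINE = re.compile(
    r"out=([01xzXZ]{8}) enable=([01xzXZ]) clk=([01xzXZ]) reset=([01xzXZ])"
)


def settled_outputs(log_text):
    """Return `out` as sampled on each falling clock edge of a $monitor log."""
    samples, prev_clk = [], None
    for match in LINE.finditer(log_text):
        out, _enable, clk, _reset = match.groups()
        if prev_clk == "1" and clk == "0":
            samples.append(out)
        prev_clk = clk
    return samples


def first_mismatch(rtl_log, gate_log):
    """Index of the first differing sample, or None if both runs agree."""
    rtl, gate = settled_outputs(rtl_log), settled_outputs(gate_log)
    for i, (a, b) in enumerate(zip(rtl, gate)):
        if a != b:
            return i
    # Same prefix: the runs still differ if one log has more samples.
    return None if len(rtl) == len(gate) else min(len(rtl), len(gate))
```

A return value of `None` means both simulations produced the same settled sequence; any other value is the index of the first clock cycle worth inspecting in the waveform.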

The next step is wrapping the synthesized core with the chip pads. The NCSU pad arrangement that is compatible with the Cadence placement tool is described in the file “ex_encounter.io” inside the NCSU package. To add the pads, create another module inside the synthesized Verilog file that instantiates the core and connects its wires to the pads:

module up_counter_synth_pads(out_pad,enable_pad,clk_pad,reset_pad);

  input enable_pad, clk_pad, reset_pad;
  output [7:0] out_pad;
  wire enable, clk, reset;
  wire [7:0] out;

  up_counter CORE(
    .out(out),
    .enable(enable),
    .clk(clk),
    .reset(reset)
  );

   PADOUT p00 (.YPAD(out_pad[0]),
	.DO(out[0]));
   PADOUT p01 (.YPAD(out_pad[1]),
	.DO(out[1]));
   PADOUT p02 (.YPAD(out_pad[2]),
	.DO(out[2]));
   PADOUT p03 (.YPAD(out_pad[3]),
	.DO(out[3]));
   PADOUT p04 (.YPAD(out_pad[4]),
	.DO(out[4]));
   PADOUT p05 (.YPAD(out_pad[5]),
	.DO(out[5]));
   PADOUT p06 (.YPAD(out_pad[6]),
	.DO(out[6]));
   PADOUT p07 (.YPAD(out_pad[7]),
	.DO(out[7]));
   PADINC p08 (.YPAD(enable_pad),
	.DI(enable));
   PADINC p09 (.YPAD(clk_pad),
	.DI(clk));
   PADINC p10 (.YPAD(reset_pad),
	.DI(reset));
   PADGND p11 ();
   PADVDD p12 ();
   PADGND p20 ();
   PADVDD p21 ();
   PADGND p30 ();
   PADVDD p31 ();

   PADNC p13 ();
   PADNC p14 ();
   PADNC p15 ();
   PADNC p16 ();
   PADNC p17 ();
   PADNC p18 ();
   PADNC p19 ();

   PADNC p22 ();
   PADNC p23 ();
   PADNC p24 ();
   PADNC p25 ();
   PADNC p26 ();
   PADNC p27 ();
   PADNC p28 ();
   PADNC p29 ();

   PADNC p32 ();
   PADNC p33 ();
   PADNC p34 ();
   PADNC p35 ();
   PADNC p36 ();
   PADNC p37 ();
   PADNC p38 ();
   PADNC p39 ();

   PADFC c01 ();
   PADFC c02 ();
   PADFC c03 ();
   PADFC c04 ();
endmodule

“PADOUT” and “PADINC” are the I/O pads (output and input, respectively), “PADNC” is a dummy pad, “PADVDD” and “PADGND” are the power supply pads, and “PADFC” is the corner pad.
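
Because the pad list is long and repetitive, it can be generated instead of typed by hand. The sketch below is a hypothetical Python helper (`pad_frame` is not part of the NCSU kit). It assumes the 40-slot frame and the instance names `p00` to `p39` and `c01` to `c04` used in the wrapper above, with the supply pads pinned to fixed slots and the signals filling the remaining slots in order.

```python
def pad_frame(outputs, inputs, supplies, slots=40):
    """Return Verilog pad instances; unused slots become PADNC dummies.

    outputs, inputs: lists of (pad_net, core_net) tuples.
    supplies: dict mapping a slot number to "VDD" or "GND".
    """
    signals = [("PADOUT", "DO", pad, core) for pad, core in outputs]
    signals += [("PADINC", "DI", pad, core) for pad, core in inputs]
    free = [s for s in range(slots) if s not in supplies]
    if len(signals) > len(free):
        raise ValueError("more signals than free pad slots")
    placed = dict(zip(free, signals))  # signals take the lowest free slots

    lines = []
    for s in range(slots):
        if s in supplies:
            lines.append(f"PAD{supplies[s]} p{s:02d} ();")
        elif s in placed:
            cell, pin, pad, core = placed[s]
            lines.append(f"{cell} p{s:02d} (.YPAD({pad}), .{pin}({core}));")
        else:
            lines.append(f"PADNC p{s:02d} ();")
    lines += [f"PADFC c{c:02d} ();" for c in range(1, 5)]  # four corners
    return lines


if __name__ == "__main__":
    outs = [(f"out_pad[{i}]", f"out[{i}]") for i in range(8)]
    ins = [(f"{n}_pad", n) for n in ("enable", "clk", "reset")]
    supplies = {11: "GND", 12: "VDD", 20: "GND", 21: "VDD", 30: "GND", 31: "VDD"}
    print("\n".join(pad_frame(outs, ins, supplies)))
```

With the arguments shown in the `__main__` section, the helper yields the same pad assignment as the hand-written wrapper, only in strict slot order.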

The Verilog is ready for layout placement with the Cadence Encounter tool. Before that, make sure “ex_encounter.io”, “osu05_stdcells.stacks.lef” and “osu05_stdcells.lef” are in your directory. The “.lef” files contain the layout information of the digital cells, such as their area and the coordinates of their ports; they play a role similar to that of a symbol in a Virtuoso schematic. Open Encounter:

"path_to_Encounter"/bin/encounter

And open File->Import Design…

  1. Files: up_counter_synth.v
  2. Top Cell: By User: up_counter_synth_pads
  3. LEF Files: osu05_stdcells.lef osu05_stdcells.stacks.lef
  4. IO Assignment File: ex_encounter.io
  5. Power Nets: vdd!
  6. Ground Nets: gnd!
  7. OK

import_design

Result:

pads_placed

Moving forward, let’s add some space for power management and net routing. Open Floorplan->Specify Floorplan…->Die/IO/Core Coordinates and subtract 100 units of space from the width and height (Core LL->UR):

core_margins

You should also change the orientation of the pads, since they are rotated 180 degrees from the correct position (with the exception of the corners, which will be adjusted later). Floorplan->Specify Floorplan…->Advanced->Bottom IO Pad Orientation->R180

IO_orientation

Result:

core_space

Then, let’s insert the rings and stripes. Open Power->Connect Global Nets…

  1. Tie High->Pin name:vdd->Apply All->To Global Net:vdd!->Add to List
  2. Tie Low->Pin name:gnd->Apply All->To Global Net:gnd!->Add to List
  3. Apply

global_nets

Power->Power Planning->Add Ring…

  1. Net(s): vdd! gnd!
  2. Width Top/Bottom/Left/Right: 15
  3. Spacing Top/Bottom/Left/Right: 10
  4. Center in channel
  5. OK

rings

Result:

rings_fig

Power->Power Planning->Add Stripes

  1. Net(s): vdd! gnd!
  2. Width: 15
  3. Spacing: 10
  4. Number of sets: 1
  5. Relative from core or selected area->X from left: 350, i.e. (1100-400)/2
  6. OK

stripes.png

Result:

stripes_fig

Route->Special Route->Net(s): vdd! gnd!->OK

special_r

After the power supply configuration, it is time to place the standard cells in the layout, as well as the filler cells to meet design constraints. Do Place->Standard Cell->Run Full Placement->Include Pre-Place Optimization->OK

placement

Place->Physical Cell->Add Filler->Cell Name->Select->FILL->Add

fill_menu

Result:

fill_fig

Route the placed instances: Route->NanoRoute->Route->OK

nano_route

Finally, finish the Encounter work by saving your design: File->Save->Netlist and File->Save->DEF. The netlist file will be used to import the schematic into Cadence Virtuoso, and the DEF file to import the layout.

Now it is time to check whether the produced layout meets the project specifications, as well as the technology design rules (DRC). Open the Virtuoso software:

virtuoso

In the Library Manager, create a new library to import the design into, “counter_import” for example, and attach it to “NCSU_TechLib_ami06”. Then go to the main window and do File->Import->Verilog

  1. Target Library Name: counter_import
  2. Reference Libraries: OSU_stdcells_ami05 NCSU_Analog_Parts
  3. Verilog Files To Import: “path_to_encounter_file”/up_counter_encounter.v
  4. OK

verilog_import

Furthermore, import the layout: File->Import->DEF

  1. DEFIn File Name: “path_to_def_file”/up_counter_encounter.def
  2. Target Library Name: counter_import
  3. Ref. Technology Libraries: *leave it blank*
  4. Target Cell Name: up_counter_synth_pads
  5. Target View Name: layout
  6. OK

def_in

After importing both the Verilog and DEF files, you will be able to see the created schematics and the placed layout instances. However, the layout layers can’t be seen yet, because the instances still need to be mapped to the layout view of each cell.

pre_layer

Shift+s->Search for “inst” in “current cellView”->Replace “view name”->layout->Apply->Replace All:

layers_nois

The corners appear rotated by 180 degrees. To fix that, right-click on the instance and choose a rotation property that puts the corner in the right position:

corner_prop

Result:

corner_good

Some remarks: for a reason I haven’t figured out yet, Virtuoso could not import the layout with the power supply vias between the stripes and the pads, so the designer has to add those vias manually to fix this power supply issue. Also, watch out for unconnected nets on the pad ports.

no_vias
net_crazy

Before being sent to fabrication, the chip needs to be checked against the design rules (DRC) and against the schematic (LVS); other post-layout analyses are not covered in this post. The pad rules are not the same as the standard rules of the technology, so the pads have to be removed in order to run DRC and LVS successfully and check that the core is functional. To do so, create a clone of the whole library, and remove the pads from the layout and schematic of the clone.

The next image shows the final layout of an FPGA Configurable Logic Block from a research project:

chip_clb word.png

The layout above was sent to fabrication (MOSIS) after fixing the DRC errors, such as tiling, and the LVS errors too. I will make a post about testing the physical chip soon!

Controlling Virtuoso schematic inputs with Digital Vector File

A vector file makes it possible to control multiple input and output signals independently. It is useful for a variety of circuits that need different pulses at specific times, and it removes the effort of managing PWL voltage sources in the schematic.

Note: comments begin with the “;” character

radix 1 1 4 4
; radix specifies the number of ports and number of bits for each port,
; the base number is also specified ( from binary '1' to hexadecimal '4' )

io i i i i

; io defines the state of each port, as input "i", output "o"
; or bidirectional "b"
vname Primeiro_Sinal Segundo_Sinal Terceiro_Sinal[0:3] Quarto_Sinal,Quinto_Sinal,Sexto_Sinal,Setimo_Sinal

; vname sets the name of each port

tunit 1ns

; tunit configures the time unit of simulation

trise 1ps
; trise defines the rise time of each vector

tfall 1ps
; tfall defines fall time of each vector

vih 5.0
; vih defines logical '1' voltage

vil 0.0
; vil defines logical '0' voltage
period 10.0

; period determines the time interval between each vector step, the
; number is multiplied by the previously set time unit

idelay 5 0 1 1 0

; idelay inserts a delay at the selected ports multiplied by the time unit

; The next lines will define the states of each port as the
; transient simulation happens, each line represents a time step

; The binary number can be represented with 0 or 1, and the
; hexadecimals with 0-F

0 0 0 0 ; All signals initially set to 0
1 0 F F ; First signal with a logical '1', second with a logical '0' , the other ones set to a logical '1'
0 1 C C ; And goes on
1 0 A A
1 1 3 3

Port declarations and values are matched by position, from left to right.
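
For longer stimulus sequences, the vector rows can be generated by a script instead of being typed by hand. The sketch below is a hypothetical Python helper (`vector_row` is not part of any Cadence tool). It assumes, as in the file above, that each port is a single digit whose width in bits is given by the matching `radix` entry, and it formats one time step per call.

```python
def vector_row(values, radix):
    """Format one time step of a vector file.

    values: one integer per port, in declaration order.
    radix:  bits per port, e.g. [1, 1, 4, 4] for "radix 1 1 4 4".
    """
    if len(values) != len(radix):
        raise ValueError("one value per port is required")
    digits = []
    for value, bits in zip(values, radix):
        if not 1 <= bits <= 4:
            raise ValueError("each digit holds 1 to 4 bits")
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bit(s)")
        digits.append(format(value, "X"))  # a single hexadecimal digit
    return " ".join(digits)


if __name__ == "__main__":
    radix = [1, 1, 4, 4]
    steps = [(0, 0, 0, 0), (1, 0, 15, 15), (0, 1, 12, 12), (1, 0, 10, 10), (1, 1, 3, 3)]
    for step in steps:
        print(vector_row(step, radix))
```

The `steps` list in the `__main__` section reproduces the five rows of the example file; the range check catches values that are too wide for their port before the simulator does.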


SRAM write and read basics (1/2)

Static Random Access Memory (SRAM) is widely used in digital circuits because it offers a good trade-off among memory architectures: it is not as large as a register file, and not as slow as DRAM or hard disk storage. It is also compatible with the standard CMOS process.

sram_t

The standard SRAM cell architecture is shown above. Two pMOS/nMOS pairs form cross-coupled inverters that can hold their state without having their input driven by an external signal.

The word line “WL” enables the bit lines “BL” and “!BL” to drive the net “Q” to a logical value (either 0 or 1), and it also exposes the cell data so that it can be read by dedicated circuitry.

Sizing those transistors correctly is required. If the inverter transistors “M1” to “M4” are considerably stronger than the access transistors “M5” and “M6”, the current through “M5” and “M6” will not be enough to flip the state of the memory cell: the stable state held by the cross-coupled inverters needs to be overpowered to change its logical level. In the opposite situation, when the access transistors are too strong (large), noise on the bit lines can change the logical state even with the word line disabled. The same problem can happen during the read operation, which will be presented later.
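
The relative strengths discussed here are commonly expressed as two ratios of W/L: the cell ratio (pull-down nMOS over access transistor) and the pull-up ratio (pull-up pMOS over access transistor). The Python sketch below only computes these first-order ratios. The function names and the dimensions in the example are illustrative assumptions, the model ignores the mobility difference between nMOS and pMOS, and acceptable limits depend on the process, so none are hard-coded.

```python
def strength(width, length):
    """First-order drive strength of a MOSFET: its W/L ratio."""
    return width / length


def cell_ratios(pull_down, pull_up, access):
    """Return (cell_ratio, pull_up_ratio) for a 6T SRAM cell.

    Each argument is a (W, L) tuple in the same length unit.
    A larger cell ratio favors stability of the stored value;
    a smaller pull-up ratio makes the cell easier to overwrite.
    """
    cell_ratio = strength(*pull_down) / strength(*access)
    pull_up_ratio = strength(*pull_up) / strength(*access)
    return cell_ratio, pull_up_ratio


if __name__ == "__main__":
    # Hypothetical dimensions in micrometers, for illustration only.
    cr, pr = cell_ratios(pull_down=(3.0, 0.6), pull_up=(1.5, 0.6), access=(1.5, 0.6))
    print(f"cell ratio = {cr:.2f}, pull-up ratio = {pr:.2f}")
```

Sweeping the access transistor width in such a script gives a quick feel for the trade-off before running the actual transient simulations.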
