Early power analysis in System-on-Chip (SoC) design is a hot topic today. The power budget of electronic devices keeps shrinking even as area and complexity increase. This holds for both wired and wireless devices: a wireless device cannot afford higher power because it drains the battery faster, whereas for a wired device higher power means higher cooling costs. In a typical design flow, the common practice is to wait until the RTL code is complete, then synthesize it and run power analysis on the netlist. Revisiting and refactoring the RTL at this stage is difficult, however, because it means repeating simulation, linting, synthesis, etc., which increases costs substantially. An improved design flow accounts for power analysis and reduction at an early stage so that power budgeting is complete as early as possible. Tools such as ‘Joules RTL Power Solution’ by Cadence and PowerArtist by Ansys perform power analysis and power reduction early in the design cycle. In this blog, we discuss a PowerArtist feature called ‘PowerBots’ that can be used for early RTL power reduction.
PowerBots by PowerArtist:
PowerBots are power reduction modules in PowerArtist. Each PowerBot scans the design for a specific design feature, then performs analysis and checks to report estimated power savings or power wastage, helping the designer decide where to save power. If a PowerBot can determine an example change for the design, it can also provide a code snippet. The figure below shows how PowerBots fit into the PowerArtist flow.
There are two types of PowerBots: ‘Power Reduction PowerBots’ and ‘Power Linter PowerBots’. ‘Power Reduction PowerBots’ look at the implementation and analyse the power saving, power penalty and area impact of a different implementation technique, while ‘Power Linter PowerBots’ help to identify the areas of the design where power is wasted. Power Linter PowerBots do not recommend a means of recovering the wasted power, as the method of recovery depends heavily on the design and test pattern. The supported PowerBots are listed below:
- Power Reduction PowerBots
a. Low-Activity Non-Enabled Register (LNR)
b. Datapath Operator Isolation (DOI)
c. Local Explicit Clock Enable (LEC)
d. Split Memory Words (SMW)
e. Gate Memory Clock (GMC)
f. Power Reduction in State Machines (PRISM)
g. Observability Don’t Care (ODC)
- Power Linter PowerBots
a. Clock Enable Condition Linter (CEC)
b. Memory Power Linter (MEM)
c. Register Power Linter (REG)
d. MUX Power Linter (MUX)
Power Reduction PowerBots:
Low-Activity Non-Enabled Register (LNR) PowerBot:
This PowerBot identifies and reports all registers in the design that do not change frequently and do not have an enable. The PowerBot also estimates the savings gained if a clock enable is generated based on changes at the register input. An example design is shown below:
In the above circuit, power is wasted because most of the clock toggles are unused and unnecessary. Gating the clock so that it is enabled only when the ‘D’ input changes would avoid this waste. Note, however, that the extra logic added to save power also has an area impact.
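To make the idea of a low-activity register concrete, here is a minimal Python sketch that measures how often the D input actually changes between clock edges. It is only an illustration, not how PowerArtist implements LNR; the function name `lnr_report` and the 10% threshold are assumptions made for this example.

```python
def lnr_report(d_values, threshold=0.1):
    """Return (activity, is_candidate) for a non-enabled register.

    d_values: value sampled on the D input at each clock edge.
    activity: fraction of clock edges at which D differs from the
              previous sample.
    threshold: hypothetical cut-off below which the register is
               treated as low-activity.
    """
    if len(d_values) < 2:
        return 0.0, False
    changes = sum(1 for prev, cur in zip(d_values, d_values[1:]) if prev != cur)
    activity = changes / (len(d_values) - 1)
    return activity, activity < threshold


# D changes only once in 100 samples, so almost every clock edge is wasted.
trace = [0x5A] * 50 + [0x3C] * 50
activity, candidate = lnr_report(trace)
print(f"activity={activity:.3f}, candidate={candidate}")
```

A register flagged this way is only a candidate: the added enable logic still has to be weighed against its area cost.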
Datapath Operator Isolation (DOI) PowerBot:
The LNR PowerBot, as explained in the previous section, suggests adding an enable to registers, especially low-activity ones. But consider the case where a clock enable is already present and there are datapath operators, such as a multiplier or an adder, in front of the enabled register. The DOI PowerBot is used here to estimate the power savings gained from latching the datapath inputs when the output is not read.
For example, the above figure shows datapath operators feeding a register that has an explicit clock enable. When the enable is off, the register ignores the datapath output even though the datapath still consumes power to compute a result. The DOI PowerBot analyzes the power saved if the inputs to the datapath are kept quiet by isolating them with the same enable, using added logic such as a latch, as in the figure below.
NOTE: As in most cases, the power saving technique suggested by the PowerBot can add extra area, and a latch can negatively impact timing, especially if the datapath is on a critical path.
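The wasted datapath activity described above can be approximated from a simulation trace. The sketch below counts operand bit flips that reach the datapath while the register enable is low; these are the flips an isolation latch would block. The function name and the trace format are assumptions for illustration only.

```python
def wasted_operand_toggles(operands, enable):
    """Count operand bit flips that arrive in cycles where enable is 0.

    operands: integer operand value per cycle.
    enable:   register clock-enable value (0 or 1) per cycle.
    Returns (wasted_flips, total_flips).
    """
    wasted = total = 0
    for i in range(1, len(operands)):
        flips = bin(operands[i] ^ operands[i - 1]).count("1")
        total += flips
        if not enable[i]:
            wasted += flips  # the result of this evaluation is never stored
    return wasted, total


operands = [0b0000, 0b1111, 0b0000, 0b1010, 0b1010]
enable = [1, 0, 0, 1, 1]
print(wasted_operand_toggles(operands, enable))  # (8, 10)
```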
Local Explicit Clock Enable (LEC) PowerBot:
The LEC PowerBot is used to estimate the power savings if a register that uses a mux in a feedback loop as its enable is replaced by a register with a gated clock.
The figure above shows a simple register with a mux-based enable. If D and Q of the register are multi-bit buses and the enable is low for a significant percentage of the circuit operation, then a lot of clock power is wasted. A good synthesis tool with the right constraints can automatically add clock-gating cells for the above circuit during synthesis. But a power analysis report early in the development cycle gives the designer the information needed to make the right decision: either insert a clock gate manually or add the right constraints so that the synthesis tool clock-gates only the chosen registers.
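A small behavioral model shows why the two implementations are interchangeable. The Python sketch below simulates a mux-feedback register and a gated-clock register on the same stimulus and counts how many clock edges each one sees. Both function names are hypothetical, and the model ignores real clock-gating cell behavior such as latch-based glitch protection.

```python
def mux_feedback_reg(d, en, q0=0):
    """Register clocked every cycle; a feedback mux recirculates Q when en is 0."""
    q, out, clock_edges = q0, [], 0
    for di, ei in zip(d, en):
        q = di if ei else q   # mux selects D or the old Q
        clock_edges += 1      # the flop sees every clock edge
        out.append(q)
    return out, clock_edges


def gated_clock_reg(d, en, q0=0):
    """Same register with the clock gated by en: no edge arrives when en is 0."""
    q, out, clock_edges = q0, [], 0
    for di, ei in zip(d, en):
        if ei:
            q = di
            clock_edges += 1
        out.append(q)
    return out, clock_edges


d = [3, 7, 7, 9, 1, 4]
en = [1, 0, 0, 1, 0, 0]
print(mux_feedback_reg(d, en))  # ([3, 3, 3, 9, 9, 9], 6)
print(gated_clock_reg(d, en))   # ([3, 3, 3, 9, 9, 9], 2)
```

The outputs match cycle for cycle, while the gated version receives a clock edge only in the two enabled cycles.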
Split Memory Words (SMW) PowerBot:
The SMW PowerBot is designed to reduce dynamic memory power by splitting a memory into smaller symmetrical or asymmetrical parts. Consider the diagram below, which shows a memory with many words split into two smaller memories, each with half the number of words.
Assume that the power of one full-size memory is Fd (dynamic) plus Fs (static), so that the total power (Pf) of the full-size memory (e.g. RAM0) is:
Pf (RAM0) = Fd + Fs
The SMW PowerBot looks in the library for a half-size memory; let's name the two instances RAM00 and RAM01, each with power Hd (dynamic) and Hs (static). Now, if only one of the half-size memories (e.g. RAM00) is active, the total power of the two half-size memories is:
Ph (RAM00 + RAM01) = Hd (RAM00) + Hs (RAM00) + Hs (RAM01)
PowerArtist also considers the power consumed by the additional circuitry (Pc), so that the total power (Pt) is:
Pt = Ph + Pc
SMW accepts a memory substitution only if Pt < Pf. The user can set constraints so that SMW considers splitting a memory into more than two memories.
For example, if the option is ‘3’ and the memory size is 2048x32, then SMW will also consider smaller-memory combinations such as three memories of 1024x32 + 512x32 + 512x32.
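The acceptance test Pt < Pf for the two-way split can be written out directly from the formulas above. In this Python sketch the function name and the power numbers (given in microwatts) are made up for illustration; they are not taken from any real memory library.

```python
def split_is_accepted(fd, fs, hd, hs, pc):
    """Apply the SMW acceptance rule for a two-way split.

    fd, fs: dynamic and static power of the full-size memory.
    hd, hs: dynamic and static power of one half-size memory.
    pc:     power of the additional circuitry.
    Returns (accepted, pf, pt).
    """
    pf = fd + fs        # Pf = Fd + Fs
    ph = hd + hs + hs   # Ph = Hd(RAM00) + Hs(RAM00) + Hs(RAM01)
    pt = ph + pc        # Pt = Ph + Pc
    return pt < pf, pf, pt


# Hypothetical numbers in microwatts.
print(split_is_accepted(fd=10000, fs=2000, hd=6000, hs=1200, pc=500))
# (True, 12000, 8900)
```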
For most architectures, a read or write to a half-size memory consumes far less power than a read or write to the full-size memory. The SMW PowerBot partitions a memory based on the following:
- Activity on the MSB and LSB of the address bus
- State of the chip enable – it must be active
- Whether there is activity on any other bits of the memory
It selects either LSB or MSB of the address bus – whichever has lower activity.
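The choice between the LSB and the MSB of the address bus can be sketched as a comparison of toggle rates over an address trace. The helper names below are hypothetical, the tie-break in favor of the MSB is an assumption, and the chip-enable check is left out for brevity.

```python
def toggle_rate(addresses, bit):
    """Fraction of consecutive accesses in which the given address bit flips."""
    toggles = sum(
        1 for a, b in zip(addresses, addresses[1:]) if ((a ^ b) >> bit) & 1
    )
    return toggles / max(len(addresses) - 1, 1)


def pick_split_bit(addresses, addr_width):
    """Return the index of the address bit (MSB or LSB) with lower activity."""
    msb, lsb = addr_width - 1, 0
    # Assumption: prefer the MSB when both bits are equally active.
    if toggle_rate(addresses, msb) <= toggle_rate(addresses, lsb):
        return msb
    return lsb


sequential = list(range(16))   # LSB flips on every access
strided = [0, 8, 0, 8, 0, 8]   # MSB flips on every access, LSB never
print(pick_split_bit(sequential, 4))  # 3 (split on the MSB)
print(pick_split_bit(strided, 4))     # 0 (split on the LSB)
```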
NOTE: SMW will also report area overhead for the memory substitutions. Area and routing congestion are key trade-offs that should be used by a designer to decide the change.
Gate Memory Clock (GMC) PowerBot:
The GMC PowerBot identifies redundant memory accesses, including redundant write cycles for memories with byte write enables. GMC can determine a way to disable the clock during a redundant read/write access. GMC considers a memory access redundant if:
- Read/Write data at the memory output does not change or
- Read/Write data is not observed in the downstream cone of logic
GMC can also take a different approach: it identifies a memory model that lacks internal clock gating and turns off the clock when the memory select signal is not asserted.
During this analysis, GMC performs three critical calculations:
- Estimates power saved by implementing potential reduction
- Estimates penalty power to implement additional circuitry for potential reduction
- Estimates Area penalty
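The first redundancy condition, read data that does not change at the memory output, can be illustrated on an access trace. This Python sketch covers only reads; the trace format, the function name and the choice to read unwritten locations as 0 are assumptions made for the example.

```python
def count_redundant_reads(accesses):
    """Count reads whose data equals what the memory output already holds.

    accesses: list of ('w', addr, data) or ('r', addr) tuples.
    Returns (redundant_reads, total_reads).
    """
    mem = {}
    last_out = None   # nothing has been read yet
    redundant = reads = 0
    for acc in accesses:
        if acc[0] == "w":
            mem[acc[1]] = acc[2]
        else:
            reads += 1
            value = mem.get(acc[1], 0)  # assumption: unwritten words read as 0
            if value == last_out:
                redundant += 1          # output would not change
            last_out = value
    return redundant, reads


accesses = [
    ("w", 0, 5), ("r", 0), ("r", 0),   # second read repeats the first
    ("w", 1, 5), ("r", 1),             # different address, same data
    ("w", 0, 9), ("r", 0),             # data changed, read is needed
]
print(count_redundant_reads(accesses))  # (2, 4)
```

Whether gating those accesses pays off still depends on the saved power exceeding the penalty power of the added circuitry.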
Power Reduction in State Machines (PRISM) PowerBot:
The PRISM PowerBot looks for opportunities where registers downstream of enabled registers can also be enabled using a cycle-delayed version of the existing enables. PRISM looks for chains of registers where a register early in the chain is enabled while registers later in the chain are not. The PowerBot then tries to determine whether the enable signal of the upstream gated register can be used to gate the downstream registers; if so, it estimates the power savings and penalties to determine the effectiveness of gating.
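The cycle-delayed enable can be checked with a tiny two-stage pipeline model. In the Python sketch below, stage 1 is enabled by `en`, and stage 2 is either clocked every cycle or gated by `en` delayed by one cycle. The function name and the reset values are assumptions for illustration.

```python
def pipeline(d, en, gate_stage2):
    """Simulate a two-register chain and count stage-2 clock edges.

    Stage 1 loads d when en is 1. Stage 2 copies stage 1 either on
    every edge, or only when the one-cycle-delayed enable is 1.
    """
    q1 = q2 = 0   # assumed reset values
    en_d = 0      # delayed enable; both registers start out equal
    out, stage2_edges = [], 0
    for di, ei in zip(d, en):
        # Both registers update on the same edge, so stage 2 samples
        # the value q1 held before this edge.
        if (not gate_stage2) or en_d:
            q2 = q1
            stage2_edges += 1
        if ei:
            q1 = di
        en_d = ei
        out.append(q2)
    return out, stage2_edges


d = [1, 2, 3, 4, 5, 6]
en = [1, 0, 0, 1, 1, 0]
print(pipeline(d, en, gate_stage2=False))  # ([0, 1, 1, 1, 4, 5], 6)
print(pipeline(d, en, gate_stage2=True))   # ([0, 1, 1, 1, 4, 5], 3)
```

Stage 2 produces the same output in both cases, but with the delayed enable it is clocked only in the cycles that follow an update of stage 1.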
Observability Don’t Care (ODC) PowerBot:
The ODC PowerBot generates enable signals by examining the topology of the circuit and determining the conditions under which the outputs of registers are not observable by downstream registers. These conditions are then used to derive clock enable signals for the upstream register. Depending on the design, this may save a significant amount of dynamic power at the cost of increased area and a slight timing impact.
ODC does the following:
- Locates register banks that are not clock-gated, which are the candidate registers
- Locates the entire downstream cone of logic
- Locates all 2-1 muxes, unencoded muxes and tri-states in the paths that connect to all downstream registers. These instances form critical steering logic that determines if the register output is observed downstream
- PowerBot then examines select lines of all steering logic to determine conditions under which register output is not observable downstream
- If such conditions exist, then that becomes a potential candidate for the enable
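The steps above can be sketched with a simplified model of the steering logic. Here each path from the candidate register to a downstream register is described by the select values that keep it open, and the code enumerates the select combinations under which no path is open. The data structure and function names are hypothetical, and the sketch ignores the cycle alignment between the select lines and the register enable.

```python
from itertools import product


def is_observable(paths, selects):
    """A register is observable if at least one path has every mux steered its way."""
    return any(
        all(selects[name] == value for name, value in path.items())
        for path in paths
    )


def dont_care_conditions(paths, select_names):
    """Enumerate select assignments under which no path observes the register."""
    result = []
    for values in product((0, 1), repeat=len(select_names)):
        selects = dict(zip(select_names, values))
        if not is_observable(paths, selects):
            result.append(selects)
    return result


# Two paths to downstream registers: one open when sel_a is 1,
# the other open when sel_a is 0 and sel_b is 1.
paths = [{"sel_a": 1}, {"sel_a": 0, "sel_b": 1}]
print(dont_care_conditions(paths, ["sel_a", "sel_b"]))
# [{'sel_a': 0, 'sel_b': 0}]
```

In this example the register output is unobservable only when both selects are 0, so the complement of that condition is a potential clock enable.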
Power Linter PowerBots:
Memory Power Linter (MEM) PowerBot:
The MEM PowerBot monitors the data inputs of all memories in the design to see whether toggles on the data input ports were wasted because the memory was not selected for a write access. As in the figure below, power is wasted when DATA toggles more than once while WE is disabled. This behavior could be desirable or undesirable, and there are several different ways to implement a more efficient circuit. The PowerBot is essentially an analysis tool that points out the areas of concern. The analysis is very much design and simulation dependent.
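One way to picture this check is to count, for each interval between writes, the DATA changes beyond the single change that the next write actually needs. The Python sketch below does this on a cycle-based trace; the function name is hypothetical, and changes after the final write are ignored for simplicity.

```python
def wasted_data_changes(data, we):
    """Count DATA changes that no write ever stores.

    Between two writes only the final DATA value matters, so at most
    one change per interval is needed; further changes count as wasted.
    """
    wasted = changes = 0
    for i in range(1, len(data)):
        if data[i] != data[i - 1]:
            changes += 1
        if we[i]:                          # a write closes the interval
            wasted += max(changes - 1, 0)
            changes = 0
    return wasted


data = [1, 2, 3, 4, 4, 5, 6]
we = [0, 0, 0, 1, 0, 0, 0]
print(wasted_data_changes(data, we))  # 2
```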
MUX Power Linter (MUX) PowerBot:
The MUX PowerBot monitors the data inputs of all multiplexers in the design to see whether toggles on the data input ports were wasted because the data input was not selected. As in the figure below, power is wasted when DATA ‘A’ toggles more than once while SEL is still ‘0’. This behavior could be desirable or undesirable, and there are several different ways to implement a more efficient circuit. The PowerBot is essentially an analysis tool that points out the areas of concern. The analysis is very much design and simulation dependent.
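A rough version of this check counts, per mux input, the value changes that occur while that input is not selected. In the sketch below `sel` holds the index of the selected input in each cycle, which is a convention chosen for this example and may differ from the select polarity in the figure; the function name is also an assumption.

```python
def unselected_toggles(inputs, sel):
    """Count value changes on each mux input while it is not selected.

    inputs: list of per-input sample lists (one value per cycle).
    sel:    index of the selected input in each cycle.
    Returns a list with one wasted-change count per input.
    """
    wasted = [0] * len(inputs)
    for k, samples in enumerate(inputs):
        for t in range(1, len(samples)):
            if samples[t] != samples[t - 1] and sel[t] != k:
                wasted[k] += 1
    return wasted


a = [0, 1, 0, 1, 1]
b = [5, 5, 5, 6, 6]
sel = [1, 1, 1, 0, 0]   # input b selected first, then input a
print(unselected_toggles([a, b], sel))  # [2, 1]
```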
Register Power Linter (REG) PowerBot:
The REG PowerBot monitors the data inputs of all registers in the design to see whether the data input ports toggled more than once before the register clock completed one cycle. As in the figure below, power is wasted when DATA ‘D’ toggles more than once before CLK completes one cycle. This behavior could be desirable or undesirable, and there are several different ways to implement a more efficient circuit. The PowerBot is essentially an analysis tool that points out the areas of concern. The analysis is very much design and simulation dependent.
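Given the timestamps of D transitions from a simulation, the extra toggles per clock cycle can be counted as follows. This is a simplified sketch: the function name is hypothetical, and cycles are assumed to start at time 0 with a fixed period.

```python
from collections import Counter


def extra_toggles_per_cycle(toggle_times, period):
    """Map cycle index -> number of D toggles beyond the first in that cycle."""
    per_cycle = Counter(int(t // period) for t in toggle_times)
    return {cycle: n - 1 for cycle, n in sorted(per_cycle.items()) if n > 1}


# Times in ns with a 10 ns clock: three toggles in cycle 0, two in cycle 2.
toggle_times = [1.0, 3.5, 7.2, 12.0, 21.0, 22.5]
print(extra_toggles_per_cycle(toggle_times, period=10))  # {0: 2, 2: 1}
```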
Clock Enable Condition Linter (CEC) PowerBot:
The CEC PowerBot monitors clock gating situations where the data input to the register is driven by a feedback mux. The goal is to find situations where the mux select line, which acts as the clock gate enable signal, is not optimally designed. As in the figure below, power is wasted when ‘ENABLE’ is high but data ‘D’ doesn’t toggle for some time. This behavior could be desirable or undesirable, and there are several different ways to implement a more efficient circuit. The PowerBot is essentially an analysis tool that points out the areas of concern. The analysis is very much design and simulation dependent.
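A simple way to quantify a sub-optimal enable is to count the enabled cycles in which the register would reload the value it already holds. The function name and the reset value in this Python sketch are assumptions for illustration.

```python
def ineffective_enables(d, en, q0=0):
    """Count enabled cycles in which D equals the current Q.

    Returns (ineffective_cycles, enabled_cycles).
    """
    q, ineffective, enabled = q0, 0, 0
    for di, ei in zip(d, en):
        if ei:
            enabled += 1
            if di == q:
                ineffective += 1   # the clock edge stores nothing new
            q = di
    return ineffective, enabled


d = [4, 4, 4, 7, 7, 7]
en = [1, 1, 1, 1, 1, 0]
print(ineffective_enables(d, en))  # (3, 5)
```

Here three of the five enabled cycles reload an unchanged value, which suggests the enable condition could be tightened.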
Power tools are usually regarded as power reporting tools that are used at a later stage of the design cycle. But features such as PowerBots make PowerArtist well suited for early-stage analysis, much like linting and CDC tools, especially for power-critical devices. PowerBots can be used at different stages of the design cycle with RTL, mixed RTL, gate, etc. and can be configured as per the design requirements.