Analysis of Adiabatic Hybrid Full Adder and 32-Bit Adders for Portable Mobile Applications

In VLSI design, power optimization is a primary criterion for portable mobile applications because of its impact on system performance, and the performance of the adder has a significant impact on the overall performance of a digital system. This paper discusses adiabatic logic (AL), an emerging research domain for optimizing power in VLSI circuits with high switching activity, as a means of implementing adder circuits. Full adder designs in various adiabatic logic styles are reviewed, and a multiplexer-based hybrid full adder topology is designed and implemented in the ECRL and 2PASCL AL styles. In addition, 32-bit adders, namely the Ripple Carry Adder (RCA), Carry Select Adder (CSLA), Carry Save Adder (CSA), Carry Skip Adder (CSKA) and Brent Kung Adder (BKA), are realized using the proposed ECRL and 2PASCL adiabatic full adders. All the adders are implemented and simulated in the TANNER EDA tool with 22 nm technology. Power, area, delay and power delay product (PDP) are observed for all the adders at different operating frequencies, with a supply voltage of 0.95 V and a load capacitance of 0.5 pF. The observed parameters are compared with those of existing adiabatic full adder designs, and it is concluded that the proposed adiabatic full adders have lower power, delay and transistor count. The ECRL full adder is 31% faster, has equal PDP and occupies less area than the 2PASCL full adder. At 1000 MHz, the ECRL 32-bit carry save adder has the lowest delay among all the 32-bit adders and 65% less PDP than the 2PASCL adder; it is therefore concluded that the ECRL 32-bit carry save adder can be selected for implementing circuits used in portable mobile applications.
Keywords: Adiabatic Logic, Full Adder, ECRL, 2PASCL, Ripple Carry Adder (RCA), Carry Select Adder (CSLA), Carry Save Adder (CSA), Carry Skip Adder (CSKA) and Brent Kung Adder (BKA)


Introduction
The use of battery-operated portable devices has increased drastically. The rapid scaling of device dimensions drives growth in the operating frequency and processing capacity per chip, which results in increased power dissipation [1] [2]. According to Moore's law, device density doubles roughly every eighteen months. All such devices are powered by batteries, but battery capacity is not growing at a rate comparable to device density, and it is generally concluded that battery technology alone will not solve the low-power problem in the near future. A further concern is the demand for high-performance computing systems, together with the direct impact of power dissipation on the packaging cost of the chip and the cooling cost of the system. These factors have motivated VLSI designers to find new methods for designing digital circuits that consume less power while maintaining high performance [1]-[4].
The full adder cell is the core of all arithmetic operations, which are widely employed in dedicated algorithms such as transformations, digital filtering, correlation and convolution. Moreover, portable mobile applications incorporate digital circuits such as media processors and DSP processors, which use these algorithms to process input data effectively and efficiently [5]. The efficiency and performance of these algorithms are largely affected by the design of the full adder cell. A power-optimized full adder cell design is therefore essential to make such devices commercially feasible [6].
Many full adder designs have been proposed in the literature. The most basic is the conventional static CMOS full adder, which provides a full-swing output with good driving capability but has the drawbacks of high input capacitance, reduced speed and larger on-chip area [6]. Another approach is the pass transistor 16T full adder cell, which reduces power consumption but suffers from low output swing and cannot sustain low-voltage operation [7]. M. Aguirre et al. [8] presented an alternative internal structure for the full adder cell. These full adders are based on double pass transistor logic (DPL) and swing restored complementary pass transistor logic (SR-CPL); they have full voltage swing but are not efficient in power consumption or transistor count (28T and 26T). Kumar et al. [9] presented two energy-efficient full adder cells that occupy less area than the adders in [6]-[8]. Their most important drawback is a high carry propagation delay, caused by sharing the input carry between two modules. Y. Jiang et al. proposed a full adder cell comprising two XOR gates and one multiplexer [10]. This adder was later implemented with the GDI technique by Mohan Shoba et al. [11]; it offers an improved, though still not full, output voltage swing, and its carry propagation delay is lower than that of the full adder proposed by Kumar et al.
In this paper, different adiabatic logic styles are selected to implement the full adder cell. Adiabatic logic (AL) is a new approach to optimizing power and hence heat. Adiabatic switching promises large reductions in power because, ideally, very little energy is dissipated as heat. In addition to minimizing power dissipation, adiabatic logic reduces the power consumed by using a power supply that is capable of recycling or recovering energy, which is made possible by an AC power supply called the power clock [12].
In previous work on adiabatic logic, different full adder designs have been implemented using different adiabatic logic styles. The main drawbacks of these works are high area consumption, long critical path delay and long carry generation time [29] [30]. The full adder design proposed here overcomes these drawbacks, as discussed later in the paper. Section II presents a study of different adiabatic logic styles. Section III reviews different full adder designs and identifies the selected adiabatic logic styles. Section IV presents the implementation methodology of the proposed full adder design and of the 32-bit adders using different adiabatic logic styles. Section V discusses the simulation results and performance analysis of all the adder circuits. Finally, Section VI provides the main conclusions drawn from this work.

Conventional CMOS switching and adiabatic switching
In conventional CMOS switching, a large amount of energy is dissipated through the devices as heat, and the circuit does not permit recycling or recovery of energy [12] [13]. Fig.1. (a) shows conventional switching for an inverter and (b) shows the equivalent circuits during charging and discharging [12]. The energy dissipated per transition in conventional switching is given by equation (1):
$$E_{CMOS} = \frac{1}{2} C_L V_{dd}^2 \qquad (1)$$
The power dissipation in conventional CMOS can be reduced by lowering the supply voltage $V_{dd}$, by reducing the node capacitance $C_L$, or by reducing the switching activity α of the circuit. Instead of varying these parameters, adiabatic logic design is preferred over conventional CMOS logic design [12]. The main feature of adiabatic logic is that it reduces power consumption at the circuit level by powering the circuit from a time-varying supply called the power clock [6], whereas a conventional CMOS circuit uses a DC voltage source as its power supply. In adiabatic switching, energy from the power clock is stored in the load capacitor during the charging process, and the energy stored in the capacitor is recovered back to the power clock during the discharging process [12] [13] [14]. Since the whole cycle consists of charging and recovery, and the recovery process dissipates the same amount of energy as charging, the overall energy dissipation of adiabatic logic (AL) is given by equation (2):
$$E_{AL} = 2\,\frac{R C_L}{T}\, C_L V_{dd}^2 \qquad (2)$$
where $R$ is the resistance of the charging path and $T$ is the transition time of the power clock. From equation (2) it is observed that the operating speed affects the energy dissipation in adiabatic switching: the more slowly the circuit is charged, the less energy is dissipated. Equating (1) and (2) gives $T = 4RC_L$, which is the lower limit of the transition time; for $T > 4RC_L$, adiabatic circuits are more energy efficient than conventional CMOS circuits [12].
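As a rough numerical illustration of this comparison, the sketch below assumes the commonly used expressions (1/2)·C_L·Vdd² for conventional switching and 2·(R·C_L/T)·C_L·Vdd² for adiabatic switching. The load capacitance (0.5 pF) and supply voltage (0.95 V) are the values used in this work; the path resistance of 1 kΩ and all function names are hypothetical and chosen only for illustration.

```python
def conventional_energy(c_load, vdd):
    """Energy dissipated per transition in conventional CMOS: (1/2) C V^2."""
    return 0.5 * c_load * vdd ** 2


def adiabatic_energy(r_path, c_load, vdd, t_ramp):
    """Energy dissipated over one charge-and-recover cycle: 2 (RC/T) C V^2."""
    return 2.0 * (r_path * c_load / t_ramp) * c_load * vdd ** 2


def break_even_ramp_time(r_path, c_load):
    """Ramp time at which both expressions give the same energy: T = 4RC."""
    return 4.0 * r_path * c_load


C_LOAD = 0.5e-12  # load capacitance from the paper (0.5 pF)
VDD = 0.95        # supply voltage from the paper (0.95 V)
R_PATH = 1.0e3    # ASSUMPTION: hypothetical 1 kOhm charging-path resistance

if __name__ == "__main__":
    e_cmos = conventional_energy(C_LOAD, VDD)
    for multiple in (1, 4, 16, 64):
        t_ramp = multiple * R_PATH * C_LOAD
        e_al = adiabatic_energy(R_PATH, C_LOAD, VDD, t_ramp)
        verdict = "adiabatic wins" if e_al < e_cmos else "no saving"
        print(f"T = {multiple:>2} RC: E_AL / E_CMOS = {e_al / e_cmos:.3f} ({verdict})")
```

The ratio depends only on T/(RC) in this idealized model, so the hypothetical resistance affects the absolute break-even time but not the trend.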

Adiabatic logic circuits
Based on the amount of energy recovered back to the power supply, adiabatic logic circuits are classified into partial and fully adiabatic logic circuits.
Partial adiabatic logic circuits: In these circuits, energy is not fully recovered back to the power supply because of unintended energy loss during circuit operation, but the energy consumption is still lower than that of conventional CMOS logic circuits. Partial adiabatic logic circuit architectures are easy to implement and analyse with a power clock system [15]. Some of the popular partial adiabatic logic styles are [16]:
• Efficient Charge Recovery Logic (ECRL)
Fully adiabatic logic circuits: In these circuits, the power supply helps the load capacitance recover all the charge stored in the circuit. Fully adiabatic logic design is complicated and faces problems with speed of operation and with synchronization of the power clock with the inputs. Some of the fully adiabatic logic styles are:
• Pass transistor adiabatic logic (PAL)
• Split Rail charge recovery logic (SCRL)
In this paper, ECRL and 2PASCL are selected from the partial adiabatic logic styles for implementing the different adders, because of their lower design complexity, greater feasibility and wider use compared with the fully adiabatic logic styles [15] [16]. Both are discussed in detail in the following sections.

Efficient charge recovery logic (ECRL)
Moon and Jeong [17] proposed Efficient Charge Recovery Logic (ECRL). The ECRL structure consists of a pair of cross-coupled PMOS transistors that recover charge from the outputs to the power clock (pck). The source terminals of both PMOS transistors are connected to pck, and the gate of each PMOS transistor is connected to the drain of the other. These nodes form the complementary output signals. The function blocks $F$ and $\overline{F}$, which are complements of each other, each consist of a series of pull-down NMOS transistors and realize the actual logic function. The basic ECRL structure is shown in Fig.1. (a). This logic family generates both out and outb, so the power clock generator always drives a constant load capacitance independent of the input signal. To recover the charge delivered by pck efficiently, ECRL uses a four-phase clocking rule whose phases are Precharge and Evaluation, Hold, Recover and Wait [18], as shown in Fig.1.(b). ECRL performs precharge and evaluation simultaneously; it eliminates the precharge diode and dissipates less energy than other adiabatic circuits [18] [19].
The ECRL inverter is shown in Fig.1. (c), and its operation is as follows. At the beginning of the clock cycle shown in Fig.2. (c), it is assumed that in is 'high' and its complement $\overline{in}$ is 'low'. During the evaluation phase, as the power clock pck rises from zero to Vdd, the in signal turns on MN2, so out remains at ground level and MP1 is therefore ON. Because MP1 is ON, outb follows pck.
In the hold phase, when pck reaches Vdd, the outputs hold valid logic levels. These levels are maintained throughout the hold phase and are used as inputs for the evaluation of the next stage. After the hold phase, pck decreases gradually from Vdd to 0 and outb returns its energy to the power clock through MP1, so the delivered charge is recovered. The wait phase is inserted for clock symmetry.
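The four-phase power clock described above can be sketched as an idealized trapezoidal waveform. The code below is only an illustration: it assumes that the four phases have equal duration and that the ramps are perfectly linear, neither of which is stated in the text, and the function name is hypothetical.

```python
def ecrl_power_clock(t, period, vdd):
    """Return (phase name, pck voltage) at time t for an idealized clock.

    ASSUMPTION: the evaluate, hold, recover and wait phases each last
    one quarter of the period, and the ramps are linear.
    """
    quarter = period / 4.0
    t_in = t % period
    if t_in < quarter:
        # Precharge/evaluation: pck ramps from 0 up to Vdd.
        return "evaluate", vdd * t_in / quarter
    if t_in < 2 * quarter:
        # Hold: pck stays at Vdd so the next stage can evaluate.
        return "hold", vdd
    if t_in < 3 * quarter:
        # Recover: pck ramps back down and charge returns to the clock.
        return "recover", vdd * (1.0 - (t_in - 2 * quarter) / quarter)
    # Wait: pck stays at 0 for clock symmetry.
    return "wait", 0.0


if __name__ == "__main__":
    for step in range(8):
        t = step * 0.5
        phase, volts = ecrl_power_clock(t, period=4.0, vdd=0.95)
        print(f"t = {t:.1f}: {phase:<8} pck = {volts:.3f} V")
```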
ECRL is classified as a partial adiabatic logic style because, when pck approaches the threshold voltage of the PMOS transistor, i.e. │Vtp│, MP1 turns off and the charge recovery path from outb to pck is disconnected, which results in incomplete recovery of energy. The advantage of the ECRL style is that it offers full voltage swing because of the cross-coupled PMOS transistors [19] [20]. Owing to this advantage, and with properly chosen power clock phases, ECRL circuits can be operated in a pipelined style without a separate pipeline circuit between the stages.

Two Phase Adiabatic Static CMOS Logic (2PASCL)
Fig.2. (a) shows the basic structure of 2PASCL adiabatic logic. It uses two MOSFET diodes, MN2 and MP2, to recycle charge from the output node and to improve the discharging speed of internal signal nodes. One diode is connected between the output node and the power clock, and the other is connected between the NMOS logic and the other power source. Because of these diode connections, the circuit nodes do not necessarily discharge during every clock cycle. Node switching activity is therefore suppressed to a significant extent, and energy dissipation is reduced accordingly. Fig.2. (b) shows the power clock of 2PASCL logic. It uses a two-phase, split-level trapezoidal power supply, in which pck and its complement $\overline{pck}$ replace Vdd and Vss, respectively. With these two split-level trapezoidal waveforms, the voltage difference between the current-carrying electrodes is minimized, and power consumption is consequently suppressed. This is made possible by the two MOSFET diodes, which recycle charge and improve the discharging speed of internal nodes [21] [22].
The circuit operation is divided into two phases, namely:
• Evaluation
• Hold [11].
Fig.2 (c) shows the 2PASCL inverter circuit. During the evaluation phase, pck swings up and $\overline{pck}$ swings down, whereas in the hold phase, $\overline{pck}$ swings up and pck swings down. The working of the 2PASCL inverter is as follows:
1) Evaluation phase
a) When the output node Y is LOW and the pMOS tree is ON, Cload is charged through the pMOS transistor; hence, the output goes to the HIGH state.
b) When node Y is LOW and the nMOS is ON, no transition occurs.
c) When the output node is HIGH and the pMOS is ON, no transition occurs.
d) When node Y is HIGH and the nMOS is ON, discharging via the nMOS and D2 causes the logic state of the output to be "0" [11]-[13].

2) Hold phase
a) When the output node Y is LOW and the nMOS is ON, no transition occurs.
b) When the output node is HIGH and the pMOS is ON, discharging occurs through D1.
One of the advantages of the 2PASCL circuit is that it can be made to behave as a static logic circuit [22]. The 2PASCL circuit also has the advantages of low power dissipation and high fan-out [23].

Adiabatic full adder design -A
A full adder is a combinational circuit that computes the arithmetic sum of three bits, A, B and a carry-in from a previous addition, and produces the corresponding SUM and CARRY out [24]. The Boolean expressions for sum and carry, respectively, are
$$SUM = \overline{A}\,\overline{B}\,C_{in} + \overline{A}\,B\,\overline{C_{in}} + A\,\overline{B}\,\overline{C_{in}} + A\,B\,C_{in} \qquad (1)$$
$$CARRY = A B + B C_{in} + A C_{in} \qquad (2)$$
The most common full adder implemented using adiabatic logic is based on Boolean equations (1) and (2), which are obtained from the truth table of the full adder; the sum and carry circuits are shown in Fig.3. (a) and (b), respectively. The main drawback of this type of implementation is that the sum circuit and the carry circuit are implemented separately, which increases both the critical path delay and the design complexity. These circuits cannot achieve full voltage swing, and hence cascading is limited to a few stages [24].

Adiabatic full adder design -B
The second full adder topology implemented using adiabatic logic is based on equations (3) and (4):
$$SUM = A \oplus B \oplus C_{in} \qquad (3)$$
$$CARRY = A B + C_{in}\,(A \oplus B) \qquad (4)$$
Its equivalent logic diagram is shown in Fig.4. In this implementation, the basic gates involved in the full adder design, namely XOR, AND and OR gates, are implemented first and then connected according to the logic diagram. The drawbacks of this design are higher area consumption and delay, and the need for three different logic gates, which adds design complexity. Moreover, sum is generated after two XOR propagation delays, whereas carry is generated through a different chain of gates, so sum and carry are produced with different time delays.

Multiplexer based full adder design
The full adder is implemented using equations (5) and (6), where carry is generated using a multiplexer instead of the logic gates of equation (4); this adder is also called a hybrid adder [25]. This multiplexer-based full adder design has lower design complexity and area consumption. The sum and carry outputs take the same time to generate, i.e. two gate delays. Because of these advantages, this full adder topology is chosen in this work for implementing and analysing the selected adiabatic logic styles [25] [26].
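The three full adder formulations reviewed in this section can be checked against each other at the logic level. The sketch below is a behavioural model only, not a circuit-level description. It assumes the usual sum-of-products form for design A, the XOR/AND/OR form for design B, and, for the multiplexer-based design, a 2:1 multiplexer that selects Cin when A XOR B is 1 and A otherwise; that select arrangement and all function names are assumptions made for illustration.

```python
from itertools import product


def design_a(a, b, cin):
    """Sum-of-products form read directly from the truth table."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin
    total = (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)
    carry = (a & b) | (b & cin) | (a & cin)
    return total, carry


def design_b(a, b, cin):
    """XOR / AND / OR form: carry = AB + Cin (A xor B)."""
    p = a ^ b
    return p ^ cin, (a & b) | (cin & p)


def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def design_mux(a, b, cin):
    """Two XOR gates for the sum, one multiplexer for the carry.

    ASSUMPTION: the multiplexer is selected by A xor B and passes
    Cin when the select is 1 and A when it is 0.
    """
    p = a ^ b
    return p ^ cin, mux2(p, a, cin)


def all_designs_agree():
    """Compare every design with integer addition on all 8 input rows."""
    for a, b, cin in product((0, 1), repeat=3):
        expected = ((a + b + cin) & 1, (a + b + cin) >> 1)
        for design in (design_a, design_b, design_mux):
            if design(a, b, cin) != expected:
                return False
    return True


if __name__ == "__main__":
    print("all designs match the truth table:", all_designs_agree())
```

When A and B are equal the carry-out equals either operand, so passing B instead of A on the unselected multiplexer input would give the same function.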

Implementation of Proposed Adiabatic Adder Circuits
In this paper, the proposed full adder is implemented using the ECRL and 2PASCL adiabatic logic styles, and 32-bit adders, namely the Ripple Carry Adder, Carry Select Adder, Carry Save Adder, Carry Skip Adder and Brent Kung Adder, are built from the proposed ECRL and 2PASCL full adders. All circuits are implemented in the TANNER EDA tool and simulated with the PTM 22 nm technology model. In this model, the MOSFET has a metal gate with high-K gate oxide material, and the supply voltage (Vdd) is 0.95 V.

Multiplexer based full adder using ECRL and 2PASCL logic
In this paper, the multiplexer-based full adder is implemented using the selected adiabatic logic styles, i.e. ECRL and 2PASCL. This implementation requires two basic logic circuits: an XOR gate and a 2:1 multiplexer.

32-bit ripple carry adder (RCA)
The ripple carry adder is constructed by cascading full adder (FA) blocks in series, so that the carry-out of each full adder is the carry-in of the next most significant full adder. The delay of the ripple carry adder is linearly proportional to n, the number of bits, so the performance of the RCA is limited as n grows [24] [26]. Fig.10.(a) shows the 32-bit ripple carry adder implemented using the proposed ECRL adiabatic full adder.
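The cascading described above can be sketched behaviourally as follows. This is a bit-level model in software, not the adiabatic circuit itself; the function names are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum bit, carry-out bit)."""
    p = a ^ b
    return p ^ cin, (a & b) | (cin & p)


def ripple_carry_add(x, y, cin=0, width=32):
    """Add two width-bit integers by chaining full adders LSB to MSB."""
    result = 0
    carry = cin
    for i in range(width):
        # The carry-out of stage i becomes the carry-in of stage i + 1.
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


if __name__ == "__main__":
    total, cout = ripple_carry_add(123456789, 987654321)
    print(total, cout)
```

The loop makes the linear dependence on n visible: bit i cannot be finalized until the carry from bit i - 1 is available.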

Carry save adder (CSA)
The carry-save unit consists of n full adders, each of which computes a single sum bit and a single carry bit from the corresponding bits of the three input numbers. In carry-save addition, carry propagation happens only in the last step, while in all the other steps a partial sum and a sequence of carries are generated separately [24] [26]. Fig.10. (b) shows the 32-bit carry save adder implemented using the proposed ECRL adiabatic full adder.
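A behavioural sketch of one carry-save step is given below. Each bit position uses an independent full adder, so no carry ripples between positions; the final carry-propagating addition is represented here simply by Python integer addition, which stands in for whatever final adder a real design would use. Function names are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum bit, carry-out bit)."""
    p = a ^ b
    return p ^ cin, (a & b) | (cin & p)


def carry_save_step(x, y, z, width=32):
    """Reduce three operands to a partial-sum vector and a carry vector."""
    sum_vec = 0
    carry_vec = 0
    for i in range(width):
        # Every position is independent: the third operand takes the
        # place of the carry-in, so nothing propagates between bits.
        s, c = full_adder((x >> i) & 1, (y >> i) & 1, (z >> i) & 1)
        sum_vec |= s << i
        carry_vec |= c << (i + 1)  # a carry has the weight of the next bit
    return sum_vec, carry_vec


if __name__ == "__main__":
    s, c = carry_save_step(0xFFFFFFFF, 1, 2)
    # Carry propagation happens only in this last addition.
    print(s + c == 0xFFFFFFFF + 1 + 2)
```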

Carry skip adder (CSKA)
A carry-skip adder, also known as a carry-bypass adder, improves on the delay of a ripple carry adder with little effort compared to other adders [24] [26]. The design of a carry-skip adder is based on the generate and propagate signals, defined as
$$P_i = X_i \oplus Y_i, \qquad G_i = X_i \cdot Y_i$$
where $P_i$ is the propagate signal, $G_i$ is the generate signal, and $X_i$ and $Y_i$ are the input operands to the i-th adder cell. The carry-out of the i-th adder cell is expressed as
$$C_{i+1} = G_i + P_i \cdot C_i$$
where $C_i$ is the carry input to the i-th cell.
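The propagate and generate signals can be used in a behavioural sketch of a carry-skip adder. The block size of 4 bits is a hypothetical choice for illustration and is not taken from this work; the function name is likewise hypothetical. In software the skip path and the ripple path always produce the same carry value, so the model shows the structure of the bypass rather than its timing benefit.

```python
def carry_skip_add(x, y, cin=0, width=32, block=4):
    """Behavioural carry-skip adder.

    ASSUMPTION: fixed blocks of `block` bits (4 here is illustrative).
    """
    result = 0
    carry = cin
    for start in range(0, width, block):
        block_cin = carry
        block_propagate = 1
        for i in range(start, min(start + block, width)):
            xi = (x >> i) & 1
            yi = (y >> i) & 1
            p = xi ^ yi              # propagate: P_i = X_i xor Y_i
            g = xi & yi              # generate:  G_i = X_i and Y_i
            result |= (p ^ carry) << i
            carry = g | (p & carry)  # C_{i+1} = G_i + P_i C_i
            block_propagate &= p
        # Skip multiplexer: if every bit in the block propagates, the
        # block carry-in bypasses the ripple chain to the next block.
        carry = block_cin if block_propagate else carry
    return result, carry


if __name__ == "__main__":
    print(carry_skip_add(0x0000FFFF, 0x00000001))
```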

Carry select adder (CSLA)
In many computational systems, the carry select adder is used to ease the problem of carry propagation delay by generating multiple carries independently and then selecting one of them to generate the sum. A carry-select adder computes the sum of the two input operands A and B for both possible values of the carry-in. These outputs are given to a multiplexer, which chooses the correct output based on the actual carry-in. Carry-select adders are made by linking two adders together, one fed a constant carry of 0 and the other a constant carry of 1 [26]. Fig.11.(a) shows the 32-bit carry select adder implemented using the proposed 2PASCL adiabatic full adder.
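The select mechanism can be sketched behaviourally as follows. Each block is computed twice, once with carry-in 0 and once with carry-in 1, and the actual carry chooses between the two results. The 4-bit block size and the function names are hypothetical and used only for illustration.

```python
def ripple_block(x, y, cin, width):
    """Ripple-carry addition of two width-bit values with a fixed carry-in."""
    result = 0
    carry = cin
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        p = a ^ b
        result |= (p ^ carry) << i
        carry = (a & b) | (carry & p)
    return result, carry


def carry_select_add(x, y, cin=0, width=32, block=4):
    """Behavioural carry-select adder.

    ASSUMPTION: uniform blocks of `block` bits (4 here is illustrative).
    """
    result = 0
    carry = cin
    for start in range(0, width, block):
        w = min(block, width - start)
        mask = (1 << w) - 1
        xb = (x >> start) & mask
        yb = (y >> start) & mask
        # Both candidates are computed independently of the real carry.
        sum0, cout0 = ripple_block(xb, yb, 0, w)
        sum1, cout1 = ripple_block(xb, yb, 1, w)
        # Multiplexer: the actual carry-in selects one candidate pair.
        chosen_sum, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= chosen_sum << start
    return result, carry


if __name__ == "__main__":
    print(carry_select_add(123456789, 987654321))
```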

Brent Kung Adder (BKA)
The Brent Kung Adder (BKA) is a parallel prefix adder with low power consumption, as it uses less circuitry to obtain the result. "Parallel" in the name indicates that the operation is executed in parallel: the computation is segmented into smaller pieces that are then computed in parallel, so all the bits of the sum are processed simultaneously, which leads to faster execution with reduced delay [27]. This adder uses fewer propagate and generate cells than other parallel prefix adders, and its cost and wiring complexity are lower [28]. Expressions for the delay of the structure and for the number of computation nodes are given in [27] [28]. Fig.11 (b) shows the 32-bit Brent Kung adder implemented using the proposed 2PASCL adiabatic full adder.

Simulation Results and Discussions
All the adders, i.e. the 1-bit ECRL and 2PASCL full adders and the five different 32-bit adders, are implemented and simulated in the TANNER EDA tool at 22 nm technology with a supply voltage Vdd = 0.95 V and a load capacitance of 0.5 pF. The number of transistors required to implement the selected adiabatic full adders is listed in Table 1. Area, power consumed, delay and power delay product (PDP) are observed for the proposed full adders at different operating frequencies, and the comparison between the existing and proposed adiabatic full adders is listed in Table 2. Fig. 13 is the graph of PDP (pJ) of different full adder designs for 2PASCL logic at different frequencies. In Fig. 12, at 250 MHz, power savings of 93% and 83% are observed for the proposed 2PASCL full adder design compared to design A and design B, respectively. At 1000 MHz, 69% less PDP is observed for the proposed 2PASCL full adder than for design B.
At 1000 MHz, 34% power savings are observed for the proposed 2PASCL full adder compared with the ECRL full adder, and both proposed full adders have almost the same PDP. The ECRL full adder also shows consistent PDP at all the mentioned frequencies. In conclusion, the ECRL full adder is 31% faster, has equal PDP, is more consistent and has less area than the 2PASCL full adder. Fig.13. (a) and (b) show the comparison of power for different 32-bit adders using the ECRL and 2PASCL adiabatic logic styles, respectively, at different operating frequencies. Fig.14. (a) and (b) show the corresponding comparison of PDP. From Table 3, among the 32-bit ECRL adders at 1000 MHz, the ripple carry adder consumes less power than all the other adders, but its delay is high; the carry save adder has the lowest delay and PDP. Among the 2PASCL 32-bit adders at 1000 MHz, the Brent Kung adder consumes the least power and the carry save adder has the lowest delay and PDP. The 1000 MHz frequency is selected because commercially available multicore, fixed-point and floating-point digital signal processors operate at this frequency for different applications. For both adiabatic logic styles, the carry select adder has higher PDP than the other adders. Fig.15. (a) and (b) show the power delay product (PDP) of the 32-bit ECRL and 2PASCL adders at 500 MHz and 1000 MHz, respectively. From Fig.15. (a) and (b), it is observed that the 32-bit ECRL adders have better PDP than the 2PASCL adders. At 500 MHz, the ECRL carry save adder has 18% less PDP than the 2PASCL adder. At 1000 MHz, the ECRL 32-bit carry save adder has the lowest delay among all the 32-bit adders and 65% less PDP than the 2PASCL adder. It is therefore concluded that the ECRL 32-bit carry save adder can be selected for implementing multipliers, which play a key role in signal processing in high-frequency mobile applications.

Conclusion
In this paper, a new approach for implementing an adiabatic full adder is proposed and implemented using the ECRL and 2PASCL adiabatic logic styles. An in-depth analysis of the proposed adders is conducted, and from the observations it is concluded that the proposed adiabatic full adders show high savings not only in power but also in delay and area. Of the two proposed adiabatic full adders, the ECRL full adder is more consistent in all parameters across frequencies, is faster and has less area than the 2PASCL full adder. Along with the full adder, five 32-bit adders are implemented using both proposed full adders. At 1000 MHz, all the 32-bit ECRL adders show better performance than the 2PASCL adders. For high-frequency applications, the 32-bit ECRL carry save adder has lower PDP than all the other 32-bit adders and can be a choice for designing high-performance, low-power VLSI circuits such as multipliers and multiply-accumulate units used in mobile applications.