Search Content

RRAM-based PUF: design and applications in cryptography

Description

The recent flurry of security breaches have raised serious concerns about the security of data communication and storage. A promising way to enhance the security of the system is through physical root of trust, such as, through use of physical unclonable functions (PUF). PUF leverages the inherent randomness in physical…

The recent flurry of security breaches have raised serious concerns about the security of data communication and storage. A promising way to enhance the security of the system is through physical root of trust, such as, through use of physical unclonable functions (PUF). PUF leverages the inherent randomness in physical systems to provide device specific authentication and encryption.

In this thesis, first the design of a highly reliable resistive random access memory (RRAM) PUF is presented. Compared to existing 1 cell/bit RRAM, here the sum of the read-out currents of multiple RRAM cells are used for generating one response bit. This method statistically minimizes any early-lifetime failure due to RRAM retention degradation at high temperature or under voltage stress. Using a device model that was calibrated using IMEC HfOx RRAM experimental data, it was shown that an 8 cells/bit architecture achieves 99.9999% reliability for a lifetime >10 years at 125℃ . Also, the hardware area overhead of the proposed 8 cells/bit RRAM PUF architecture was smaller than 1 cell/bit RRAM PUF that requires error correction coding to achieve the same reliability.

Next, a basic security primitive is presented, where the RRAM PUF is embedded in the cryptographic module, SHA-256. This architecture is referred to as Embedded PUF or EPUF. EPUF has a security advantage over SHA-256 as it never exposes the PUF response to the outside world. Instead, in each round, the PUF response is used to change a few bits of the message word to produce a unique message digest for each IC. The use of EPUF as a key generation module for AES is also shown. The hardware area requirement for SHA-256 and AES-128 is then analyzed using synthesis results based on TSMC 65nm library. It is shown that the area overhead of 8 cells/bit RRAM PUF is only 1.08% of the SHA-256 module and 0.04% of the AES-128 module. The security analysis of the PUF based systems is also presented. It is shown that the EPUF-based systems are resistant towards standard attacks on PUFs, and that the security of the cryptographic modules is not compromised.

ContributorsShrivastava, Ayush (Author) / Chakrabarti, Chaitali (Thesis advisor) / Yu, Shimeng (Committee member) / Cao, Yu (Committee member) / Arizona State University (Publisher)

Created2015

Reliable arithmetic circuit design inspired by SNP systems

Description

ABSTRACT Developing new non-traditional device models is gaining popularity as the silicon-based electrical device approaches its limitation when it scales down. Membrane systems, also called P systems, are a new class of biological computation model inspired by the way cells process chemical signals. Spiking Neural P systems (SNP systems), a…

ABSTRACT Developing new non-traditional device models is gaining popularity as the silicon-based electrical device approaches its limitation when it scales down. Membrane systems, also called P systems, are a new class of biological computation model inspired by the way cells process chemical signals. Spiking Neural P systems (SNP systems), a certain kind of membrane systems, is inspired by the way the neurons in brain interact using electrical spikes. Compared to the traditional Boolean logic, SNP systems not only perform similar functions but also provide a more promising solution for reliable computation. Two basic neuron types, Low Pass (LP) neurons and High Pass (HP) neurons, are introduced. These two basic types of neurons are capable to build an arbitrary SNP neuron. This leads to the conclusion that these two basic neuron types are Turing complete since SNP systems has been proved Turing complete. These two basic types of neurons are further used as the elements to construct general-purpose arithmetic circuits, such as adder, subtractor and comparator. In this thesis, erroneous behaviors of neurons are discussed. Transmission error (spike loss) is proved to be equivalent to threshold error, which makes threshold error discussion more universal. To improve the reliability, a new structure called motif is proposed. Compared to Triple Modular Redundancy improvement, motif design presents its efficiency and effectiveness in both single neuron and arithmetic circuit analysis. DRAM-based CMOS circuits are used to implement the two basic types of neurons. Functionality of basic type neurons is proved using the SPICE simulations. The motif improved adder and the comparator, as compared to conventional Boolean logic design, are much more reliable with lower leakage, and smaller silicon area. This leads to the conclusion that SNP system could provide a more promising solution for reliable computation than the conventional Boolean logic.

ContributorsAn, Pei (Author) / Cao, Yu (Thesis advisor) / Barnaby, Hugh (Committee member) / Chakrabarti, Chaitali (Committee member) / Arizona State University (Publisher)

Created2013

Digitally controlled average current mode buck converter

Description

During the past decade, different kinds of fancy functions are developed in portable electronic devices. This trend triggers the research of how to enhance battery lifetime to meet the requirement of fast growing demand of power in portable devices. DC-DC converter is the connection configuration between the battery and the…

During the past decade, different kinds of fancy functions are developed in portable electronic devices. This trend triggers the research of how to enhance battery lifetime to meet the requirement of fast growing demand of power in portable devices. DC-DC converter is the connection configuration between the battery and the functional circuitry. A good design of DC-DC converter will maximize the power efficiency and stabilize the power supply of following stages. As the representative of the DC-DC converter, Buck converter, which is a step down DC-DC converter that the output voltage level is smaller than the input voltage level, is the best-fit sample to start with. Digital control for DC-DC converters reduces noise sensitivity and enhances process, voltage and temperature (PVT) tolerance compared with analog control method. Also it will reduce the chip area and cost correspondingly. In battery-friendly perspective, current mode control has its advantage in over-current protection and parallel current sharing, which can form different structures to extend battery lifetime. In the thesis, the method to implement digitally average current mode control is introduced; including the FPGA based digital controller design flow. Based on the behavioral model of the close loop Buck converter with digital current control, the first FPGA based average current mode controller is burned into board and tested. With the analysis, the design metric of average current mode control is provided in the study. This will be the guideline of the parallel structure of future research.

ContributorsFu, Chao (Author) / Bakkaloglu, Bertan (Thesis advisor) / Cao, Yu (Committee member) / Vermeire, Bert (Committee member) / Arizona State University (Publisher)

Created2011

45-nm radiation hardened cache design

Description

Circuits on smaller technology nodes become more vulnerable to radiation-induced upset. Since this is a major problem for electronic circuits used in space applications, designers have a variety of solutions in hand. Radiation hardening by design (RHBD) is an approach, where electronic components are designed to work properly in certain…

Circuits on smaller technology nodes become more vulnerable to radiation-induced upset. Since this is a major problem for electronic circuits used in space applications, designers have a variety of solutions in hand. Radiation hardening by design (RHBD) is an approach, where electronic components are designed to work properly in certain radiation environments without the use of special fabrication processes. This work focuses on the cache design for a high performance microprocessor. The design tries to mitigate radiation effects like SEE, on a commercial foundry 45 nm SOI process. The design has been ported from a previously done cache design at the 90 nm process node. The cache design is a 16 KB, 4 way set associative, write-through design that uses a no-write allocate policy. The cache has been tested to write and read at above 2 GHz at VDD = 0.9 V. Interleaved layout, parity protection, dual redundancy, and checking circuits are used in the design to achieve radiation hardness. High speed is accomplished through the use of dynamic circuits and short wiring routes wherever possible. Gated clocks and optimized wire connections are used to reduce power. Structured methodology is used to build up the entire cache.

ContributorsXavier, Jerin (Author) / Clark, Lawrence T (Thesis advisor) / Cao, Yu (Committee member) / Allee, David R. (Committee member) / Arizona State University (Publisher)

Created2012

Built-in-self test of transmitter I/Q mismatch and nonlinearities using self-mixing envelope detector

Description

Built-in-Self-Test (BiST) for transmitters is a desirable choice since it eliminates the reliance on expensive instrumentation to do RF signal analysis. Existing on-chip resources, such as power or envelope detectors, or small additional circuitry can be used for BiST purposes. However, due to limited bandwidth, measurement of complex specifications, such…

Built-in-Self-Test (BiST) for transmitters is a desirable choice since it eliminates the reliance on expensive instrumentation to do RF signal analysis. Existing on-chip resources, such as power or envelope detectors, or small additional circuitry can be used for BiST purposes. However, due to limited bandwidth, measurement of complex specifications, such as IQ imbalance, is challenging. In this work, a BiST technique to compute transmitter IQ imbalances using measurements out of a self-mixing envelope detector is proposed. Both the linear and non linear parameters of the RF transmitter path are extracted successfully. We first derive an analytical expression for the output signal. Using this expression, we devise test signals to isolate the effects of gain and phase imbalance, DC offsets, time skews and system nonlinearity from other parameters of the system. Once isolated, these parameters are calculated easily with a few mathematical operations. Simulations and hardware measurements show that the technique can provide accurate characterization of IQ imbalances. One of the glaring advantages of this method is that, the impairments are extracted from analyzing the response at baseband frequency and thereby eliminating the need of high frequency ATE (Automated Test Equipment).

ContributorsByregowda, Srinath (Author) / Ozev, Sule (Thesis advisor) / Cao, Yu (Committee member) / Yu, Hongbin (Committee member) / Arizona State University (Publisher)

Created2012

Programmable analog device array (PANDA): a methodology for transistor-level analog emulation

Description

The design and development of analog/mixed-signal (AMS) integrated circuits (ICs) is becoming increasingly expensive, complex, and lengthy. Rapid prototyping and emulation of analog ICs will be significant in the design and testing of complex analog systems. A new approach, Programmable ANalog Device Array (PANDA) that maps any AMS design problem…

The design and development of analog/mixed-signal (AMS) integrated circuits (ICs) is becoming increasingly expensive, complex, and lengthy. Rapid prototyping and emulation of analog ICs will be significant in the design and testing of complex analog systems. A new approach, Programmable ANalog Device Array (PANDA) that maps any AMS design problem to a transistor-level programmable hardware, is proposed. This approach enables fast system level validation and a reduction in post-Silicon bugs, minimizing design risk and cost. The unique features of the approach include 1) transistor-level programmability that emulates each transistor behavior in an analog design, achieving very fine granularity of reconfiguration; 2) programmable switches that are treated as a design component during analog transistor emulating, and optimized with the reconfiguration matrix; 3) compensation of AC performance degradation through boosting the bias current. Based on these principles, a digitally controlled PANDA platform is designed at 45nm node that can map AMS modules across 22nm to 90nm technology nodes. A systematic emulation approach to map any analog transistor to PANDA cell is proposed, which achieves transistor level matching accuracy of less than 5% for ID and less than 10% for Rout and Gm. Circuit level analog metrics of a voltage-controlled oscillator (VCO) emulated by PANDA, match to those of the original designs in 90nm nodes with less than a 5% error. Voltage-controlled delay lines at 65nm and 90nm are emulated by 32nm PANDA, which successfully match important analog metrics. And at-speed emulation is achieved as well. Several other 90nm analog blocks are successfully emulated by the 45nm PANDA platform, including a folded-cascode operational amplifier and a sample-and-hold module (S/H)

ContributorsXu, Cheng (Author) / Cao, Yu (Thesis advisor) / Blain Christen, Jennifer (Committee member) / Bakkaloglu, Bertan (Committee member) / Arizona State University (Publisher)

Created2012

Path selection based branching for coarse grained reconfigurable arrays

Description

Coarse Grain Reconfigurable Arrays (CGRAs) are promising accelerators capable of

achieving high performance at low power consumption. While CGRAs can efficiently

accelerate loop kernels, accelerating loops with control flow (loops with if-then-else

structures) is quite challenging. Techniques that handle control flow execution in

CGRAs generally use predication. Such techniques execute both branches of an

if-then-else…

Coarse Grain Reconfigurable Arrays (CGRAs) are promising accelerators capable of

achieving high performance at low power consumption. While CGRAs can efficiently

accelerate loop kernels, accelerating loops with control flow (loops with if-then-else

structures) is quite challenging. Techniques that handle control flow execution in

CGRAs generally use predication. Such techniques execute both branches of an

if-then-else structure and select outcome of either branch to commit based on the

result of the conditional. This results in poor utilization of CGRA s computational

resources. Dual-issue scheme which is the state of the art technique for control flow

fetches instructions from both paths of the branch and selects one to execute at

runtime based on the result of the conditional. This technique has an overhead in

instruction fetch bandwidth. In this thesis, to improve performance of control flow

execution in CGRAs, I propose a solution in which the result of the conditional

expression that decides the branch outcome is communicated to the instruction fetch

unit to selectively issue instructions from the path taken by the branch at run time.

Experimental results show that my solution can achieve 34.6% better performance

and 52.1% improvement in energy efficiency on an average compared to state of the

art dual issue scheme without imposing any overhead in instruction fetch bandwidth.

ContributorsRajendran Radhika, Shri Hari (Author) / Shrivastava, Aviral (Thesis advisor) / Christen, Jennifer Blain (Committee member) / Cao, Yu (Committee member) / Arizona State University (Publisher)

Created2014

Statistical characterization and decomposition of SRAM cell variability and aging

Description

Memories play an integral role in today's advanced ICs. Technology scaling has enabled high density designs at the price paid for impact due to variability and reliability. It is imperative to have accurate methods to measure and extract the variability in the SRAM cell to produce accurate reliability projections for…

Memories play an integral role in today's advanced ICs. Technology scaling has enabled high density designs at the price paid for impact due to variability and reliability. It is imperative to have accurate methods to measure and extract the variability in the SRAM cell to produce accurate reliability projections for future technologies. This work presents a novel test measurement and extraction technique which is non-invasive to the actual operation of the SRAM memory array. The salient features of this work include i) A single ended SRAM test structure with no disturbance to SRAM operations ii) a convenient test procedure that only requires quasi-static control of external voltages iii) non-iterative method that extracts the VTH variation of each transistor from eight independent switch point measurements. With the present day technology scaling, in addition to the variability with the process, there is also the impact of other aging mechanisms which become dominant. The various aging mechanisms like Negative Bias Temperature Instability (NBTI), Channel Hot Carrier (CHC) and Time Dependent Dielectric Breakdown (TDDB) are critical in the present day nano-scale technology nodes. In this work, we focus on the impact of NBTI due to aging in the SRAM cell and have used Trapping/De-Trapping theory based log(t) model to explain the shift in threshold voltage VTH. The aging section focuses on the following i) Impact of Statistical aging in PMOS device due to NBTI dominates the temporal shift of SRAM cell ii) Besides static variations , shifting in VTH demands increased guard-banding margins in design stage iii) Aging statistics remain constant during the shift, presenting a secondary effect in aging prediction. iv) We have investigated to see if the aging mechanism can be used as a compensation technique to reduce mismatch due to process variations. Finally, the entire test setup has been tested in SPICE and also validated with silicon and the results are presented. The method also facilitates the study of design metrics such as static, read and write noise margins and also the data retention voltage and thus help designers to improve the cell stability of SRAM.

ContributorsRavi, Venkatesa (Author) / Cao, Yu (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Clark, Lawrence (Committee member) / Arizona State University (Publisher)

Created2013

Energy-efficient DSP system design based on the Redundant Binary number system

Description

Redundant Binary (RBR) number representations have been extensively used in the past for high-throughput Digital Signal Processing (DSP) systems. Data-path components based on this number system have smaller critical path delay but larger area compared to conventional two's complement systems. This work explores the use of RBR number representation for…

Redundant Binary (RBR) number representations have been extensively used in the past for high-throughput Digital Signal Processing (DSP) systems. Data-path components based on this number system have smaller critical path delay but larger area compared to conventional two's complement systems. This work explores the use of RBR number representation for implementing high-throughput DSP systems that are also energy-efficient. Data-path components such as adders and multipliers are evaluated with respect to critical path delay, energy and Energy-Delay Product (EDP). A new design for a RBR adder with very good EDP performance has been proposed. The corresponding RBR parallel adder has a much lower critical path delay and EDP compared to two's complement carry select and carry look-ahead adder implementations. Next, several RBR multiplier architectures are investigated and their performance compared to two's complement systems. These include two new multiplier architectures: a purely RBR multiplier where both the operands are in RBR form, and a hybrid multiplier where the multiplicand is in RBR form and the other operand is represented in conventional two's complement form. Both the RBR and hybrid designs are demonstrated to have better EDP performance compared to conventional two's complement multipliers. The hybrid multiplier is also shown to have a superior EDP performance compared to the RBR multiplier, with much lower implementation area. Analysis on the effect of bit-precision is also performed, and it is shown that the performance gain of RBR systems improves for higher bit precision. Next, in order to demonstrate the efficacy of the RBR representation at the system-level, the performance of RBR and hybrid implementations of some common DSP kernels such as Discrete Cosine Transform, edge detection using Sobel operator, complex multiplication, Lifting-based Discrete Wavelet Transform (9, 7) filter, and FIR filter, is compared with two's complement systems. It is shown that for relatively large computation modules, the RBR to two's complement conversion overhead gets amortized. In case of systems with high complexity, for iso-throughput, both the hybrid and RBR implementations are demonstrated to be superior with lower average energy consumption. For low complexity systems, the conversion overhead is significant, and overpowers the EDP performance gain obtained from the RBR computation operation.

ContributorsMahadevan, Rupa (Author) / Chakrabarti, Chaitali (Thesis advisor) / Kiaei, Sayfe (Committee member) / Cao, Yu (Committee member) / Arizona State University (Publisher)

Created2011

Aging predictive models and simulation methods for analog and mixed-signal circuits

Description

Negative bias temperature instability (NBTI) and channel hot carrier (CHC) are important reliability issues impacting analog circuit performance and lifetime. Compact reliability models and efficient simulation methods are essential for circuit level reliability prediction. This work proposes a set of compact models of NBTI and CHC effects for analog and…

Negative bias temperature instability (NBTI) and channel hot carrier (CHC) are important reliability issues impacting analog circuit performance and lifetime. Compact reliability models and efficient simulation methods are essential for circuit level reliability prediction. This work proposes a set of compact models of NBTI and CHC effects for analog and mixed-signal circuit, and a direct prediction method which is different from conventional simulation methods. This method is applied in circuit benchmarks and evaluated. This work helps with improving efficiency and accuracy of circuit aging prediction.

ContributorsZheng, Rui (Author) / Cao, Yu (Thesis advisor) / Yu, Hongyu (Committee member) / Bakkaloglu, Bertan (Committee member) / Arizona State University (Publisher)

Created2011

Filtering by