# An FPGA-Based Advanced Lightweight Cryptography Architecture for IoT Security and Its Cryptanalysis

## Introduction: Background and Driving Forces

In this smart era of IoT-based data transfer and communication, digital and information security have become indispensable (Stallings, 2002; Forouzan, 2007; Kaur, 2009). The primary goal, confidentiality, is achieved here by a proposed cipher (Stallings, 2002; Forouzan, 2007) with a lightweight implementation. The scheme is simple, and its results are comparable with those of the well-known and widely used RSA and TDES ciphers (Stallings, 2002; Forouzan, 2007). A cipher is considered lightweight when its implementation fits within roughly 2000-2500 gate equivalents (GEs). Symmetric ciphers, or private key block ciphers, use the same key for both encryption and decryption; if two keys are used, each must be derivable from the other. Symmetric ciphers have some advantages over asymmetric ciphers, or public key ciphers, which are as follows:

- Symmetric ciphers are comparable with asymmetric ciphers with respect to cryptographic strength and hardware parameters.
- Symmetric ciphers need less encryption and decryption time (throughput time and execution time) than asymmetric ciphers.
- Symmetric ciphers are easy to embed in resource-constrained communicating hardware such as mobile phones and IoT devices, which is quite impractical for asymmetric ciphers.

The proposed cipher (encryption and decryption) has been designed and implemented on FPGA-based devices (Navabi, 2008; Wolf, 2009) of the Spartan 3E series, and for this it has been programmed in IEEE VHDL (Bhasker, 2004; Kaur, 2009). The Spartan 3E series is sufficient for implementing a lightweight security solution, and symmetric key cryptography is widely used for this purpose. The main reasons for the FPGA-based design and implementation are:

- It helps to create new useful products and lowers their cost by requiring fewer people, fewer resources and less time.
- It is suitable for, and widely accepted as part of, the "green" environment concept.
- Total throughput time and total design area are smaller.

The above discussion motivates research on a new technique, the Rotational Conical Cipher (RCC), combined with a substitution-permutation network (SPN) and cipher block chaining (CBC) mode, which is proposed and described in this chapter. It is an FPGA-based lightweight (Stallings, 2002; Forouzan, 2007) symmetric key, or private key, block cipher. Figure 7.1 (Chatterjee and Mandal, 2013) shows the formation of the cone from the source block. We take n bits and XOR each pair of successive bits to get the next row of (n - 1) bits. This XORing continues until a single bit remains. The rows thus form a cone, as shown in Figure 7.1; the source block may be the plaintext or the ciphertext. There are four options for encryption and decryption. Option A takes all the leftmost bits from top to bottom, Option B takes all the leftmost bits from bottom to top, Option C takes all the rightmost bits from top to bottom, and Option D takes all the rightmost bits from bottom to top. Every option therefore yields an n-bit output. Encryption and decryption are done by the same process, and finally a circular left rotation of the plaintext is performed to obtain the ciphertext.
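The cone construction and the four read-out options can be illustrated with a short software model. This is a minimal Python sketch rather than the chapter's VHDL implementation; the function names `build_cone`, `rcc_select` and `rotate_left` are hypothetical, and bits are modelled as a list of 0/1 integers with the leftmost bit first.

```python
from typing import List


def build_cone(bits: List[int]) -> List[List[int]]:
    """Return every row of the cone.

    Row 0 is the source block; each later row XORs adjacent bits of the
    previous row, so it is one bit shorter, down to a single bit.
    """
    if not bits:
        raise ValueError("source block must contain at least one bit")
    rows = [list(bits)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([prev[i] ^ prev[i + 1] for i in range(len(prev) - 1)])
    return rows


def rcc_select(bits: List[int], option: str) -> List[int]:
    """Read n output bits off the cone edge according to Option A, B, C or D."""
    rows = build_cone(bits)
    left = [row[0] for row in rows]    # leftmost bit of each row, top to bottom
    right = [row[-1] for row in rows]  # rightmost bit of each row, top to bottom
    if option == "A":
        return left
    if option == "B":
        return left[::-1]
    if option == "C":
        return right
    if option == "D":
        return right[::-1]
    raise ValueError("option must be one of 'A', 'B', 'C', 'D'")


def rotate_left(bits: List[int], k: int) -> List[int]:
    """Circular left rotation of a bit list by k positions."""
    k %= len(bits)
    return bits[k:] + bits[:k]


source = [1, 0, 0, 1, 0]  # a 5-bit toy block; the architecture uses 128 bits
for opt in "ABCD":
    print(opt, rcc_select(source, opt))
```

In this model, applying Option A twice returns the original block, which is consistent with the statement that encryption and decryption use the same process. The sketch does not check that property for the other options, and it does not specify how the rotation is combined with the read-out.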

### Literature Review

The proposed block cipher, RCC, is inspired by the NIST report on lightweight cryptography published in 2017 (McKay et al., 2017). In this report, Kerry A. McKay et al. set out the scope of NIST's lightweight cryptography project, covering the existing cryptographic primitives and modes that are needed in resource-constrained environments. The report's emphasis is on lightweight cryptography, lightweight hashing, lightweight authentication and some protocols. Long-term security is very much needed: the algorithms should either aim at post-quantum security themselves, or the applications using them should allow them to be easily replaced by algorithms designed for post-quantum security.

FIGURE 7.1

Formation of cone from source block (SB).

The next survey was carried out in Part VI of the Springer book by Peng et al. (2020). In Chapter 19, Dhanda et al. (2020) give a comprehensive security overview of the perception layer, which is considered the lowermost layer of the IoT architecture and is also used in wireless sensor networks (WSN) and many resource-constrained embedded hardware devices. These devices are very limited in memory, computation, power and energy, so a large number of attacks are mounted against them. Engineers and researchers have therefore suggested a number of solutions, such as lightweight cryptography and various protocols. Jha et al. in Chapter 20 (Jha et al., 2020) discuss security threats to the lightweight time synchronization protocols used in the IoT ecosystem. The sensors are generally located in remote and unattended environments, where there is a high chance of malicious nodes being present and interfering. Bhanu Chander and Kumaravelan Gopalakrishnan in Chapter 21 (Chander and Gopalakrishnan, 2020) outline the latest vulnerability issues in IoT architecture and security, discussing cyber-attacks in the IoT.

Sebastian Modersheim and Luca Vigano defined a privacy notion for cryptographic protocols named Alpha-Beta Privacy (Modersheim and Vigano, 2019). It is based on specifying two formulae α and β in first-order logic with Herbrand universes, where α represents the intentionally released information and β represents the actual cryptographic messages the intruder can observe. (α, β)-privacy then means that the intruder cannot derive from β any statement that is not already derivable from α.

Andy Rupp et al. proposed a new lightweight cryptographic payment technique for transit systems (Rupp et al., 2015) known as P4R (Privacy-Preserving Pre-Payments with Refunds), which is well suited to low-cost user IoT devices in resource-constrained environments. The scheme builds on Brands' e-cash technique for the prepayment system and on Boneh-Lynn-Shacham (BLS) signatures for the refund option. Estimates show that the data required for 20 rides consumes less than 10 KB of memory, and that the payment and refund transactions during a ride take less than half a second. The system also protects the privacy of honest users: transactions are anonymous (except for deposits) and trips are unlinkable. The complete system has been implemented on a microprocessor-based system.

The final survey is based on an SCI-indexed publication (Colombier and Bossuet, 2014) that reviews hardware security for integrated circuit (IC) design and its use in protecting intellectual property (IP). Total cost and total performance are considered the main parameters of IC design, so designing and implementing a secure, efficient and lightweight protection scheme for design data is a serious challenge for the hardware security community. Existing techniques protect design data in several ways, including functional locking, hardware obfuscation and IC/IP identification. The study also surveys academic research on the protection of design data. It concludes that an efficient protection technique based on several hardware and security properties still needs to be designed and implemented.

Section 7.2 describes the lightweight security architecture using RCC, the substitution and permutation technique and CBC; Section 7.3 gives the key generation process; Section 7.4 provides the cryptanalysis; Section 7.5 gives the lightweight cryptography analysis through simulation; Section 7.6 gives applications of the proposed cipher; and Section 7.7 draws a conclusion.

## The Lightweight Security Architecture

We have studied many cryptographic algorithms for security, but the question is how to implement the security requirement in hardware (Navabi, 2008; Kaur, 2009; Wolf, 2009). By the definition of lightweight cryptography used here, an implementation may use at most about 2500 GEs. A hardware implementation is also suitable for achieving IoT security. Figure 7.2 gives the micro-architecture (Bhasker, 2004; Navabi, 2008; Kaur, 2009; Wolf, 2009) of lightweight cryptography.

This is the proposed working model of an actual lightweight security implementation. The micro-architecture consists of external signals (Navabi, 2008; Kaur, 2009; Wolf, 2009) and internal modules (Navabi, 2008; Kaur, 2009; Wolf, 2009). There are five types of external signals/buses (Bhasker, 2004; Navabi, 2008; Kaur, 2009; Wolf, 2009): a 130-bit key bus, a 20-bit control bus, a 1-bit option bus, a 128-bit data bus, and the power, ground and clock signals. Internally there is a single 130-bit multiplexed bus, which is connected to every internal module. The internal modules are the 130-bit key memory register, the 21-bit control memory register, the encryption and decryption module with its RCC unit, SPN network (Stallings, 2002; Forouzan, 2007) and CBC unit sub-modules, the 128-bit data memory register, 64 KB of general purpose memory, the multiplexer and de-multiplexer (MUX-DEMUX) module and the control unit. Each module and bus is described in Sections 7.2.1 to 7.2.6.

### External Signals and Buses

The 130-bit key input bus provides the 130-bit key block used for encryption and decryption. The 20-bit control bus selects the various modes of operation of this micro-architecture, which are carried out by the control unit. The 1-bit option bus states whether encryption or decryption is performed: value "0" for encryption and value "1" for decryption. The 128-bit data bus is bi-directional. When encryption is performed the input is 128-bit plaintext and the output is 128-bit ciphertext, and when decryption is performed the input is 128-bit ciphertext and the output is 128-bit plaintext. The general power and ground signals are provided. The clock signal is also provided so that this micro-architecture can work in both synchronous and asynchronous modes when connected to an IoT network.

FIGURE 7.2

Micro-architecture of lightweight cryptography model.

### Internal Bus

The internal bus proposed in this architecture is 130 bits wide, as this is the maximum amount of data transferred per clock cycle. The bus is multiplexed and controlled by the control unit. All the other internal modules are connected to this bus.

### Internal Memory Registers

Four types of internal registers (Bhasker, 2004; Navabi, 2008; Kaur, 2009; Wolf, 2009) are proposed in this micro-architecture:

- The 130-bit key memory register holds the 130-bit key, which is generated by the key generation algorithm discussed in Section 7.3 and supplied to the architecture externally. The key generation module is kept external to reduce the gate requirements of the proposed micro-architecture. Moreover, the only requirement on the key is that it be a pseudo-random number, so the user can also use other key generation algorithms.
- The 21-bit control memory holds the 1-bit option signal and the 20 bits that select the modes of operation of this micro-architecture, which are explained in the discussion of the control unit.
- The 128-bit data memory holds the plaintext or ciphertext to be encrypted or decrypted. It is similar to the accumulator register used in microprocessors: it is both the source and the destination of one cycle of encryption or decryption per clock/machine cycle.
- The 64 KB general purpose register stores up to 64 KB of plaintext or ciphertext. It enables the micro-architecture to process 4096 blocks of encryption or decryption at a time (64 KB divided by 16 bytes per 128-bit block gives 4096 blocks). External buses are generally slower than internal buses, which is why this register is expected to provide up to 4096 times faster encryption or decryption than a single-block architecture.

### Internal Encryption and Decryption Module

This is the main module of the architecture and contains all the logic circuits needed to perform encryption and decryption. This module has three sub-components:

- RCC Module: RCC encrypts and decrypts one 128-bit block at a time, as shown in Figure 7.1. Here, 128 bits of plaintext are taken and successive bits are XORed to get the next row of 127 bits. This process is repeated until a single bit remains, forming a cone. Any of the four options can be chosen to produce the resultant ciphertext, and the same operation is used for decryption:
  - Option A (00) takes all the left bits from top to bottom.
  - Option B (01) takes all the left bits from bottom to top.
  - Option C (10) takes all the right bits from top to bottom.
  - Option D (11) takes all the right bits from bottom to top.
- SPN Network: The next phase of encryption is the SPN, where the S-box of AES is used for substitution. For permutation, the ShiftRows and MixColumns operations of AES are performed.
- CBC Unit: This is the multiple-block encryption mode. The number of blocks is provided on the control bus. Figure 7.3 illustrates the CBC mode of operation. During encryption, the plaintext is broken into 128-bit blocks; the first block is XORed with the initialization vector (IV) and fed to the encryption function, and the result is the first ciphertext block. In the second round, the first ciphertext block is XORed with the second plaintext block and the result is encrypted to produce the second ciphertext block. This process is repeated until all blocks are encrypted. Decryption is the reverse process, where the input is ciphertext and the output is plaintext, as shown in Figure 7.3. The IV is the same for encryption and decryption.

FIGURE 7.3

CBC mode of encryption and decryption.
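The chaining described for the CBC unit can be sketched in software as follows. This is an illustrative Python model only: `toy_encrypt` and `toy_decrypt` are hypothetical placeholder block functions (a key XOR followed by a 1-bit rotation) standing in for the RCC and SPN stages, which are not reproduced here, and blocks are modelled as 128-bit integers.

```python
from typing import Callable, List

BLOCK_BITS = 128
MASK = (1 << BLOCK_BITS) - 1


def toy_encrypt(block: int, key: int) -> int:
    """Placeholder block function, NOT the RCC/SPN cipher: XOR key, rotate left by 1."""
    x = (block ^ key) & MASK
    return ((x << 1) | (x >> (BLOCK_BITS - 1))) & MASK


def toy_decrypt(block: int, key: int) -> int:
    """Inverse of toy_encrypt: rotate right by 1, then XOR key."""
    x = ((block >> 1) | (block << (BLOCK_BITS - 1))) & MASK
    return (x ^ key) & MASK


def cbc_encrypt(blocks: List[int], key: int, iv: int,
                enc: Callable[[int, int], int] = toy_encrypt) -> List[int]:
    """XOR each plaintext block with the previous ciphertext block (IV first), then encrypt."""
    out, prev = [], iv
    for p in blocks:
        c = enc(p ^ prev, key)
        out.append(c)
        prev = c
    return out


def cbc_decrypt(blocks: List[int], key: int, iv: int,
                dec: Callable[[int, int], int] = toy_decrypt) -> List[int]:
    """Decrypt each ciphertext block, then XOR with the previous ciphertext block (IV first)."""
    out, prev = [], iv
    for c in blocks:
        out.append(dec(c, key) ^ prev)
        prev = c
    return out


key = 0x0123456789ABCDEF0123456789ABCDEF
iv = 0xFEDCBA9876543210FEDCBA9876543210
plaintext = [0x42, 0x42, 0x1234]  # two identical blocks on purpose
ciphertext = cbc_encrypt(plaintext, key, iv)
assert cbc_decrypt(ciphertext, key, iv) == plaintext
```

Because each block is mixed with the previous ciphertext block, the two identical plaintext blocks in the example encrypt to different ciphertext blocks. Passing the block function as a parameter keeps the chaining logic independent of whichever cipher is plugged in.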

### Internal Mux-DeMux Module

The multiplexer and de-multiplexer (Bhasker, 2004; Navabi, 2008; Kaur, 2009; Wolf, 2009) control the internal data bus. During encryption or decryption, or when plaintext/ciphertext is exchanged between the 64 KB general purpose memory and the 128-bit data memory through the 128-bit external data bus, the module enables the 128 most significant bits (MSBs) of the internal data bus and puts the 2 least significant bits (LSBs) in the high-impedance state. During key exchange, the module enables all 130 bits of the internal data bus. During control flow, it enables the 21 MSBs of the internal data bus and puts the remaining 109 bits in the high-impedance state.
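The lane selection described above can be modelled as a 130-bit enable mask. This is a hedged Python sketch: the function name `enabled_lanes` and the string labels for the transfer types are assumptions made for illustration, with a 1 bit meaning "lane driven" and a 0 bit meaning "lane in high impedance".

```python
BUS_WIDTH = 130

# Number of most significant lanes driven for each transfer type, per the text.
LANE_WIDTHS = {"data": 128, "key": 130, "control": 21}


def enabled_lanes(transfer: str) -> int:
    """Return a 130-bit mask: 1 = lane enabled, 0 = lane in high impedance."""
    if transfer not in LANE_WIDTHS:
        raise ValueError("transfer must be 'data', 'key' or 'control'")
    width = LANE_WIDTHS[transfer]
    full = (1 << BUS_WIDTH) - 1
    # Keep the `width` most significant lanes and clear the remaining low lanes.
    return full & ~((1 << (BUS_WIDTH - width)) - 1)


for name in LANE_WIDTHS:
    mask = enabled_lanes(name)
    print(name, bin(mask).count("1"), "lanes enabled")
```

A real FPGA design would typically express this with tri-state buffers or a multiplexer in VHDL; the integer mask is only a compact way to check the bit counts stated in the text.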

### Internal Control Unit

The control unit (Bhasker, 2004; Navabi, 2008; Kaur, 2009; Wolf, 2009) works on the value stored in the 21-bit control memory. Figure 7.4 gives the layout of these 21 bits.

The internal control unit is driven by the 21-bit control memory. In the notation used here, CB00 denotes control bit 0, the least significant bit (LSB), and CB20 denotes control bit 20, the most significant bit (MSB). The control unit works as follows:

- Option/CB20: This bit holds the value of the 1-bit external option bus. Value "0" represents encryption and value "1" represents decryption.
- Set-Reset/CB19: This bit holds the reset value of the proposed micro-architecture. When the value is "1" the architecture resets: all the data registers and key registers become zero and encryption/decryption is set to the initial vector.
- RCC Option0 and Option1/CB18 and CB17: This is a 2-bit value. As already seen, RCC encrypts and decrypts in four ways, so 00 selects Option A encryption/decryption, 01 selects Option B, 10 selects Option C, and 11 selects Option D, as described in Section 7.1. A left circular rotation is done for encryption and a right circular rotation for decryption. The number of bits rotated depends on the block number: the nth block is rotated n times during both encryption and decryption.
- CBC Enable/CB16: This bit enables the CBC mode of encryption/decryption when there are multiple blocks to be encrypted/decrypted. All the blocks are transferred from the 64 KB general purpose memory and the results are stored in the same location. This continues until the number of blocks given in CB11 - CB00 has been processed.
- Sync/Async/CB15: This bit selects between the synchronous and asynchronous modes of operation of this micro-architecture.
- Key/Data/Control Transfer/CB14-CB13: This is a 2-bit value, where "00" means 128-bit data transfer, "01" means 130-bit key transfer, "10" means 21-bit control signal transfer through the internal data bus, and "11" means memory access.
- Impedance/CB12: This flip-flop switches the internal common bus between the high-impedance state and the data/memory/control value transfer state.
- BS11 - BS00/CB11 - CB00: Since there is 64 KB of internal memory, 4096 blocks can be encrypted/decrypted using the CBC mode of encryption/decryption. Since 2^{12} = 4096, these 12 bits store the address of 4096 blocks of 128 bits each. This is basically the memory address, or equivalently the 12 bits state how many blocks are to be encrypted/decrypted.

FIGURE 7.4

21-bits control memory structure.
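The control memory layout can be illustrated by decoding a 21-bit word into its fields. This Python sketch rests on stated assumptions: the 12-bit block field is taken to occupy CB11 down to CB00 (the only placement that gives 12 bits and a 21-bit total), CB18 is treated as the high bit of the RCC option pair, and the names `ControlWord` and `decode_control` are hypothetical. The text does not say which value of CB15 or CB12 selects which state, so those bits are returned unchanged.

```python
from dataclasses import dataclass

RCC_OPTIONS = {0b00: "A", 0b01: "B", 0b10: "C", 0b11: "D"}
TRANSFERS = {0b00: "data", 0b01: "key", 0b10: "control", 0b11: "memory"}


@dataclass(frozen=True)
class ControlWord:
    decrypt: bool      # CB20: False = encryption, True = decryption
    reset: bool        # CB19: True resets the architecture
    rcc_option: str    # CB18-CB17: Option A, B, C or D
    cbc_enable: bool   # CB16: True enables CBC mode
    sync_async: int    # CB15: raw bit, polarity not specified in the text
    transfer: str      # CB14-CB13: data, key, control or memory access
    impedance: int     # CB12: raw bit, polarity not specified in the text
    block_field: int   # CB11-CB00: block address or block count (assumed placement)


def decode_control(word: int) -> ControlWord:
    """Split a 21-bit control word into the fields described above."""
    if not 0 <= word < (1 << 21):
        raise ValueError("control word must fit in 21 bits")
    return ControlWord(
        decrypt=bool((word >> 20) & 1),
        reset=bool((word >> 19) & 1),
        rcc_option=RCC_OPTIONS[(word >> 17) & 0b11],
        cbc_enable=bool((word >> 16) & 1),
        sync_async=(word >> 15) & 1,
        transfer=TRANSFERS[(word >> 13) & 0b11],
        impedance=(word >> 12) & 1,
        block_field=word & 0xFFF,
    )


# Decryption, RCC Option C, CBC enabled, key transfer, block field 15.
example = (1 << 20) | (0b10 << 17) | (1 << 16) | (0b01 << 13) | 0x00F
print(decode_control(example))
```

The field widths add up to 1 + 1 + 2 + 1 + 1 + 2 + 1 + 12 = 21 bits, matching the size of the control memory.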

This section discussed in detail how lightweight cryptography can be realized through an architecture and how it can be used for IoT security. Section 7.4 gives an idea of the cryptanalysis through which a block cipher can be optimized. Section 7.5 gives some simulation-based results through which an FPGA security micro-architecture can be optimized.