This model is obtained by adding TDLs to the FFNN structure. The RVTDNN was proposed in [14] and has been found to be effective in modeling strongly dynamic nonlinear systems, such as wideband PAs and wireless transmitters. This model has also been designated as a real-valued focused time-delay neural network (RVFTDNN) [12]. As illustrated in Figure 7.10, the RVFTDNN architecture builds on the previously discussed static DIDO-CC-NN architecture and further accounts for memory effects by assuming that the output of the amplifier depends only on the present and the 2p past input values, but not on the network's output values.

Figure 7.10 shows a three-layer RVFTDNN with two real inputs (I_{in} and Q_{in}) and two real outputs (I_{out} and Q_{out}). The inputs I_{in} and Q_{in} are each delayed by p samples using two sets of TDLs. The first set of TDLs is made of p branches and is applied to the in-phase component of the input signal (I_{in}), while the second set, also made of p branches, is applied to the quadrature-phase component of the input signal (Q_{in}). Here, p represents the memory depth of the system, and the length of the input vector is 2(p + 1)-by-1. The delayed response is achieved by using a plurality of unit delay operators (z^{-1}), where the unit delay operator yields the delayed sample x(k - 1) when operating on sample x(k).

Figure 7.10 Block diagram of a three-layer real-valued focused time-delay neural network

The input-output relationship for this model with n neurons and two layers can be written as follows.
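As a concrete illustration of the tapped delay lines, the following sketch assembles the 2(p + 1)-by-1 network input vector from the present and p delayed samples of each Cartesian component. The function and variable names here (tdl_input_vector, i_seq, q_seq) are illustrative, not from the original text.

```python
import numpy as np

def tdl_input_vector(i_seq, q_seq, k, p):
    """Build the 2(p+1)-by-1 RVFTDNN input vector at time instant k."""
    # Present sample plus p delayed samples per component:
    # x(k), x(k-1), ..., x(k-p), as produced by a chain of z^-1 operators.
    i_taps = [i_seq[k - d] for d in range(p + 1)]
    q_taps = [q_seq[k - d] for d in range(p + 1)]
    # Stack the in-phase taps over the quadrature taps into a column vector.
    return np.array(i_taps + q_taps, dtype=float).reshape(-1, 1)
```

For example, with p = 2 and k = 3, the vector holds [I(3), I(2), I(1), Q(3), Q(2), Q(1)], i.e., length 2(p + 1) = 6.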

At any time instant k, the output value of neuron n of layer l is given by:

O_{n}^{l}(k) = f(\sum_{j} w_{nj}^{l} O_{j}^{l-1}(k) + b_{n}^{l})   (7.14)

where w_{nj}^{l} is the synaptic weight between the output of neuron j at layer (l - 1) and the input of neuron n at layer l, b_{n}^{l} refers to the bias applied to neuron n at layer l, and O_{j}^{l-1}(k) is the output, at time instant k, of neuron j at layer (l - 1). O_{j}^{l-1}(k) is given by:

The synaptic weights w_{nj}^{l}, between the output of neuron j at layer (l - 1) and the input of neuron n at layer l, are chosen such that the output values of all neurons lie at the transition between the linear and saturated parts of the sigmoidal activation function, f, when no initial knowledge of the weights is assumed. Very high initial weight values can drive the NN into the saturated part of the activation function, which slows down the learning process. Conversely, very small initial values may lead the network to operate in the flat region, stopping the training for that neuron [1]. To avoid the extreme values of -1 and 1 for the activation function, the weights are initialized randomly within the interval [-0.8, 0.8]. The weights then gradually converge to their optimal values as the training proceeds. The hidden layers are fully connected, as shown in Figure 7.10.
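The initialization rule above can be sketched as follows; the interval [-0.8, 0.8] is from the text, while the function name, shapes, and seeding are illustrative assumptions.

```python
import numpy as np

def init_layer(n_in, n_out, limit=0.8, seed=None):
    """Randomly initialize one layer's weights and biases in [-limit, limit]."""
    rng = np.random.default_rng(seed)
    # Uniform draws in [-0.8, 0.8] keep the sigmoidal neurons out of both
    # the saturated region (very large weights) and the flat region
    # (very small weights) at the start of training.
    w = rng.uniform(-limit, limit, size=(n_out, n_in))
    b = rng.uniform(-limit, limit, size=(n_out, 1))
    return w, b
```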

The output of any layer works as an input for the next layer. Thus, Equation 7.14 is applicable for all neurons (n = 1, ..., N) and all layers except the final one (l = 1, ..., L - 1). For the final layer,

O_{n}^{L}(k) = \sum_{j} w_{nj}^{L} O_{j}^{L-1}(k) + b_{n}^{L}

The output layer has a purely linear activation function (sometimes referred to as a purelin function), which sums up the outputs of the hidden neurons and linearly maps them to the output. The activation function, f, for the two hidden layers can be chosen as one of the functions depicted in Figure 7.2. It is worth noting that when the memory depth is zero (i.e., p = 0), the two TDLs are eliminated, and in this case the RVFTDNN architecture reduces to an RVFFNN architecture.
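A minimal forward-pass sketch of this architecture follows, with tanh standing in for one of the sigmoidal activation choices of Figure 7.2 and a purely linear output layer; the function name and layer representation are assumptions for illustration.

```python
import numpy as np

def rvftdnn_forward(x, layers):
    """Forward pass: x is the 2(p+1)-by-1 delayed I/Q input vector;
    layers is a list of (w, b) tuples, the last being the output layer.
    Returns the 2-by-1 output [I_out; Q_out]."""
    a = x
    for w, b in layers[:-1]:
        a = np.tanh(w @ a + b)    # hidden layers: sigmoidal activation
    w_out, b_out = layers[-1]
    return w_out @ a + b_out      # output layer: purely linear (purelin)
```

With p = 0 the input vector shrinks to the two present samples [I(k); Q(k)] and the same code realizes the RVFFNN case.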

Training is carried out in batch mode, supervised by the back-propagation algorithm. Detailed descriptions of the back-propagation algorithm are given in [1, 16, 19, 20]. To summarize, two passes are made during one ensemble of iterations (commonly referred to as an epoch, which can include tens to hundreds of actual iterations): a forward propagation and a backward propagation. During the forward propagation, the cost function is calculated by:

e(k) = \frac{1}{2}[(I_{out}(k) - \hat{I}_{out}(k))^{2} + (Q_{out}(k) - \hat{Q}_{out}(k))^{2}]   (7.17)

where I_{out}(k) and Q_{out}(k) are the desired model outputs representing the Cartesian components of the system's complex output, and \hat{I}_{out}(k) and \hat{Q}_{out}(k) are their values predicted by the actual NN model.
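The cost over a batch of samples can be sketched as below; the 1/2 factor is the usual back-propagation convention and may differ from the exact scaling of Equation 7.17, and the function name is illustrative.

```python
import numpy as np

def cost(i_des, q_des, i_pred, q_pred):
    """Sum of squared errors over both Cartesian output components."""
    e_i = np.asarray(i_des, dtype=float) - np.asarray(i_pred, dtype=float)
    e_q = np.asarray(q_des, dtype=float) - np.asarray(q_pred, dtype=float)
    # Errors on I and Q contribute symmetrically to the scalar cost.
    return 0.5 * float(np.sum(e_i ** 2) + np.sum(e_q ** 2))
```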

Based on the error signal given by Equation 7.17, a backward computation is performed to adjust the synaptic weights of the network in layer l according to:

w_{nj}^{l}(k + 1) = w_{nj}^{l}(k) + \Delta w_{nj}^{l}(k)   (7.18)

In Equation 7.18, w_{nj}^{l}(k) and w_{nj}^{l}(k + 1) denote the values of the synaptic weight w_{nj}^{l} during the training step at instants k and (k + 1), respectively; and \Delta w_{nj}^{l}(k) is the adjustment applied to modify the value of the synaptic weight w_{nj}^{l}(k) to obtain its new value at instant (k + 1). \Delta w_{nj}^{l}(k) is calculated, at instant k, using the one-dimensional Levenberg-Marquardt (LM) algorithm [20]. The LM algorithm was found to be the most appropriate among various algorithms for its fast convergence properties, as shown in the next section. The whole procedure is carried out until the desired performance is met, or the NN fails the validation procedure by drifting away from the generalization criterion [1, 16, 21, 22].
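The generic LM weight adjustment can be sketched as follows for a least-squares error vector; this is the standard damped Gauss-Newton update, not necessarily the exact one-dimensional variant of [20], and the function name and sign convention (error = prediction minus target) are assumptions.

```python
import numpy as np

def lm_step(jac, err, mu):
    """One Levenberg-Marquardt adjustment delta_w, so that
    w(k+1) = w(k) + delta_w, as in Equation 7.18.

    jac : Jacobian of the error vector with respect to the weights
    err : current error vector (prediction minus target)
    mu  : damping factor blending Gauss-Newton (small mu) and
          gradient descent (large mu)
    """
    jtj = jac.T @ jac
    # Damped normal equations: (J^T J + mu I) delta_w = -J^T e
    return np.linalg.solve(jtj + mu * np.eye(jtj.shape[0]), -jac.T @ err)
```

For a linear error model the undamped step (mu = 0) solves the problem in one adjustment, while increasing mu shortens the step toward a scaled gradient direction.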

Due to their dynamic modeling capability, the RVRNN and the RVFTDNN are considered good candidates, among the other NN topologies, for the dynamic modeling and digital predistortion-based linearization of PAs and transmitters.

In [12], the performance of the RVFTDNN was benchmarked against the RVRNN for a Doherty PA prototype driven by a multi-carrier WCDMA signal. Each model had two hidden layers, and the number of neurons in each of these layers was decided through the same optimization process. The output layer of each model contained two linear neurons. The ability of the models to predict the magnitude and phase of the output signal is reported in Figures 7.11 and 7.12, respectively. Both figures illustrate the superior performance of the RVFTDNN, as it can accurately predict the output signal's magnitude and phase.