# Weibull Model

The Weibull model is defined as

which is another generalization of the one-hit model. In fact for *к=*1, the one- hit (linear) model is resulted. When *к <* 1, the model is concave (sub-linear) and when *k>* 1, the model is convex (supra-linear). The risk estimates derived using the Weibull model tend to lie between the estimates from the one-hit and multi-hit models.

# Linearized Multistage Model

This model which is an approximation derived from the multistage model is probably the most popular for dichotomous responses in cancer modeling. It is based on the assumption that several genetic changes are required to transform a normal cell to a cancer cell. The model is most widely used by the regulatory agencies and is given by

where *k* is the number of stages. Clearly, the linearized multistage model is a generalization of the one-hit and the Weibull models. In practice, often a quadratic exponent suffices for an adequate fit to the data. We will discuss the multistage models of carcinogenesis in the next section of this chapter.

Example 4.2

Littlefield et al. (1980) describe an experiment conducted at the National Center for Toxicological Research (NCTR) in which BALB/c female mice were exposed to the drug 2-acetylaminofluorene (2-AAF) through diet.

The number of animals in each dose group and the number dying with bladder carcinoma along with the associated proportion is given in Table 4.3. We will refer to this data set as the bladder carcinoma dose- response (BCDR) data set.

Note that as the dose increases, the number of animals in the non-zero dose groups decreases. This is because animals die of other causes as the exposure level rises. Additionally, even though a relatively large number of doses was used with a large number of animals in each dose group, it is still not possible to establish a specific dose-response model.

To fit a model, the simplest approach would be to make a transformation to linearize the function, if possible and apply linear regression. However, that

TABLE 4.3

Dose-Response Data for Proportion of Mice with Bladder Carcinoma Exposed to 2-Acetylaminofluorene (2-AAF) (Littlefield et al., 1980)

Dose |
Number in Group |
Number Affected |
Proportion |

0 |
360 |
3 |
0.00833 |

30 |
1432 |
15 |
0.01047 |

35 |
707 |
4 |
0.00566 |

45 |
347 |
4 |
0.01153 |

60 |
242 |
4 |
0.01653 |

75 |
248 |
4 |
0.01613 |

100 |
124 |
39 |
0.31452 |

150 |
119 |
88 |
0.73950 |

approach often does not lead to satisfactory estimates. For example, to fit the extended one-hit model, we can linearize as

Using a simple linear regression for the BCDR data set, we obtain the following estimates for *f) _{0}* and

with standard errors of respectively 0.1603 and 0.00212. But these estimates lack many of the desired statistical properties, especially since the correlations between the transformed probabilities are ignored. Improved estimates are derived by using the maximum likelihood method. However, this latter approach leads to nonlinear equations that need to be solved using a computer program. Using the binomial distribution, the likelihood is given by

where *g* is the number of dose levels with *g*_{0}=0 being the background dose, P, is the probability of response at dose *i* and *X,* is the number of responses at that dose level. Now, *p,* is replaced by an assumed model and the likelihood is maximized in order to derive the parameter estimates. To illustrate, once again assuming the extended one-hit model, the log-likelihood is given by

Differentiating this function with respect to Д, and *fS _{x}* yields

Upon setting the derivatives equal to 0, we get the following two likelihood equations

s-l

where N = */ n _{:}* is the total number of animals in all dose groups and

*i=0*

1 ^ ^{1}

*d =* — is the weighted average dose. Equations (4.14) and (4.15) clearly

^{v} i=0

cannot be solved analytically and a numerical algorithm such as the Fisher's method of scoring has to be used in order to find the roots. Here, we will not discuss the details for this method since there are a number of computer programs that can be applied to solve these equations. In fact, using the nlm() function in R, which is a nonlinear minimization function in R, we find that maximum likelihood estimates for *fJ _{0}* and //, are respectively given as

Similarly, we can use any of the other mechanistic dose-response models discussed above.

# Multistage Models of Carcinogenesis

According to Whittemore and Keller (1978) perhaps the earliest attempt to model the process of carcinogenesis can be attributed to Iverson and Arley (1950) who postulated the one-stage theory of carcinogenesis. According to this one-stage theory, normal cells are transformed at a certain rate and become malignant. If we denote by P_{0}(f) the probability that a cell is normal and at time f, and by *P _{t}(t)* that it is malignant i.e. it is transformed at time t, then we can write (Whittemore and Keller, 1978)

where 2(f) is the transition probability rate. From (4.16), we have

Integrating both sides and imposing the initial condition, we obtain

as the solution of (4.16). Similarly noting that *P _{0}(t) +* P,(f) = 1, we find the solution of (4.17) as

Let Tj and *W* be the transformation time and growth time of the tumor with distributions gj(f) and g_{vv}(f) respectively. Then assuming *T _{y}* and IV are independent, the distribution of the total time to detection of tumor

*T = T*W is the convolution of /i

_{y}+_{1}(f) and /i

_{lv}(f) given by

Now, from (4.18) we can find *g _{%}(t)* as the transformation rate given by
Substituting in (4.19), we get

Once the transition rate 4(f) and the tumor growth distribution g_{vv}(f) are specified, then *g(t*) can be determined from (4.21). Iverson and Arley (1950) assume that 4(f) is a linear function of the dose and that the growth of the tumor is governed by a pure birth process with a constant birth rate. Specifically, if it can be assumed that *X* is constant, then (4.21) reduces to

The one-stage theory of Iverson and Arley was soon overshadowed by the fact that many investigators observed that the incidence rate was proportional to a certain power (about fifth or sixth) of exposure duration. This led to the belief that a single cell goes through a finite number of stages to become malignant and generate a tumor. The pioneers of this theory were Muller (1951) and Nordling (1953).

The multistage theory is based on the assumption that cancer is the result of a sequence of irreversible genetic transformations. A normal cell can become malignant only after gong through a series of mutational changes induced by exposure. This theory was utilized by Armitage and Doll (1954) to derive a mathematical description of the cancer process and derive the Armitage-Doll multistage model, which is still a widely accepted model for cancer. We outline the derivation of the Armitage-Doll model. For more details, we refer to Meza (2010).

Suppose that a normal cell would have to go through *n* transformations (stages) to become malignant (see Figure 4.2).

Let *Pj(t); j = 0,...,n* be the probability that the cell is in stage *j* at time *t. *Then the process of cell malignancy can be thought of as a pure birth process with an absorbing barrier at the *nth* stage. By using the Chapman- Kolmogorov equations (see for example Ross, S., 2019), we can express the state probabilities P,(f) as the following set of equations:

where *A _{k}* is defined to be 0. By rearranging the above equations, dividing by

*h,*and taking the limit as

*h*-> 0, we get

Note that the set of *(n* + 1) differential equations represented in (4.23) are the natural extension of (4.16) and (4.17) with constant transition rates *A _{0}, A_{u} ..*

FIGURE 4.2

Multistage model.

A„. In the more general case, the rates may be time dependent, but for simplicity, here we assume they are constants. The solution of the above system of equations may be derived as

Which by repeated integration leads to

Now, if we assume that there is a total of N susceptible cells independently going through full malignancy transformation at times *T _{b} T_{2},* ...,

*T*respectively and if

_{N}*T=*min(T„

*T*is the time to cancer development, the survival function for T is given by

_{2}, T_{N})

and the overall hazard for malignancy is given by

By writing *P„(t)* as the cumulative distribution function of a random variable which is the sum of *к* independent exponential variables with rates Ao,..., A„_i, Moolgavkar (1991) shows that the problem of computing P((f) reduces to the convolution of *к* independent exponentially distributed random variables. If we assume further that the malignancy at the cell level occurs with negligible probability, i.e. *P _{k}(t*)« 0 , then by using Taylor expansion, we can show that the hazard function

*h(t)*can be expressed as approximately

j»-l

where A is the mean of the transition rates, i.e. A = — and /(/, *t)* is a

*i*

function that involves second- and higher-order terms in the product of transition rates. Further, by keeping only the first non-zero term of the Taylor series, we obtain

which is the Armitage-Doll approximation of the hazard function. This approximation indicates that the age-specific hazard rate is proportional to the power of age. This result is consistent with the epidemiological observed rates in many different types of cancer. In fact Armitage-Doll (1985) states that if the number of distinct stages is between 2 and 6, the agreement with the epidemiological results is quite good.

Denoting the cumulative distribution of the time to tumor development *T *by *F(t),* i.e. the probability of having cancer by time *t,* then we have

and so Thus

Substituting from (4.26), we get

**with **_{A=}^{NAoAi}-^{A|}'-^{1}** **. *n*

This is the cumulative distribution function of the Weibull model. Thus the hazard function given in (4.26) corresponds to a Weibull model. The significance of the Armitage-Doll model is that it demonstrates that the hazard (at least approximately) behaves as the power of age, which is observed to be consistent with many types of cancer. On the other hand, a major limitation of the model is that it does not allow for cell death implying that any susceptible cell will eventually become a cancer cell. However, because of its simplicity and wide applications, the model is one of the most commonly used tools for cancer modeling. Note also that if we further assume that the each transition rate, Я,- is a linear function of dose, i.e.

then from (4.8) the probability distribution of time to tumor, that is, the dose- response model associated with the Armitage-Doll model, can be written as

which can be simplified as

where *Q(d)* is a polynomial of degree *n* in *d* with coefficients determined by the product of the *n* linear functions *a,+l),d.* In toxicological bioassay experiments, the animal exposure is through lifetime, and therefore *t* is a constant and the dose-response model reduces to

This model has been extensively used in toxicological dose-response model and risk assessment. In practice, often a liner function or a polynomial of degree 2 suffices and it is shown that not much is gained in using polynomials of higher degree. Using the US Environmental Protection Agency (EPA) Integrated Risk Information System (IRIS) database, Nicheva et al. (2007) showed that in animal carcinogenicity experiments the addition of the second-order term improves the fit of the model in only 20% of cases and addition of higher-order terms does not contribute to the fit of the model at all.

One of the mathematical properties of the dose-response model in (4.29) in risk assessment is that if the relative risk defined in Chapter 2 as

is used as the measure of risk, then by substituting from (4.29) we have By expanding the exponential term, we get

Now, since humans are typically exposed to much lower levels of carcinogens than the dose levels in animal bioassay experiments, to determine the risk at the low human dosage levels, one needs to extrapolate from the dose-response model below the lowest experimental non-zero dosage level. Equation (4.21) shows that, in practice, to estimate the cancer risk for low- dose extrapolation, i.e. as *d* approaches 0, ignoring the terms involving the quadratic and higher orders of *d,* the practical implication is that one needs to consider only the slope of the line connecting the lowest animal exposure level to the origin. That is, for small values of *d* we have

This is called the linearized multistage model which has been used frequently for low-dose extrapolation (see Crump (2018) and references therein). Equation (4.15) further means that the marginal increase in cancer risk for small doses is approximately equivalent to the slope of the risk function at that dose level. This clearly implies that the rise in risk for small increase in exposure level at a given dose is determined by the first derivative of the risk function evaluated at the given dose level.

## Solution of the Armitage–Doll Model

By solving the system of differential equations (4.3), the exact solution for the survival and hazard functions of the Armitage-Doll model can be derived. As noted by Meza (2010), the solution depends on eigenvalues of the system, which in this situation are exactly the transition rates of the model. If all the transition rates are different and no two of the transition rates are equal, then (Meza, 2010) survival and hazard functions are respectively given by

where

In the special case when all transition rates are equal, the survival and hazard functions are given by

# Two-Stage Clonal Expansion Model

The Armitage-Doll multistage model is probably the most widely used model in animal carcinogenicity studies. The mathematical simplicity and biological plausibility of the model has attracted the attention of many toxicologists. However, the model is based on certain assumptions that, if violated, could make the model completely invalid. For example, Moolgavkar (1978) points out that if the transition rates are not sufficiently small, the Armitage-Doll model is inappropriate. The two-stage clonal expansion (TSCE) model which was first developed by Moolgavkar and Venzon(1979) and Mookgavkar and Knudson(1981), is based on specifics of cell kinetics such as cell growth and death or differentiation and provides a plausible alternative to the multistage model. The original model has undergone some extensive development to make it amenable for studying various human cancers and for applications in tumor promotion studies. The TSCE is based on the assumption that the transition of a normal cell to a malignant cell is the result of two irreversible events. A normal cell is first changed into an initiated or intermediate cell. This is a rare event and it is a process in which normal cells are transformed so that they are able to form tumors and it is considered to be the first phase in tumor development. This process is usually modeled as a non-homogeneous time-dependent Poisson process. In the second or the promotion stage, the initiated cells undergo clonal expansion which means that the cells either are divided into two initiated cells, die or are differentiated, or divide into initiated and malignant cells. The conversion of the initiated cell to a malignant cell is also called progression and can involve one or more mutations. This process is modeled as a stochastic birth-death process. Figure 4.3 depicts the process of TSCE.

The TSCE model has been considered by many authors. Not only have several investigators discussed the biological plausibility application of the model in different types of cancer (see e.g., Kruse et al., 2002), the statistical properties and various extensions have also been the subject of numerous research papers (e.g., Brouwer et al., 2017). Here, we outline the general mathematical derivation of the TSCE model, and for more details we refer to Moolgavkar and Luebeck (1990).

Let *X(t)* be the number of normal susceptible cells at time *t.* In its most general form, the TSCE model assumes that *X(t)* is a stochastic process. But, for simplicity, here we consider only the situation where the normal stem cells follow a deterministic process. As mentioned before, the process of initiation of normal cells into intermediate cells is modeled as a non-homogeneous Poisson process. Let *i/(t)* be the first mutation rate per cell at time *t.* Then, *u(t)*

FIGURE 4.3

Two-stage clonal expansion model (Meza, 2010).

*X(t)* is the intensity of the Poisson process. Now, according to the TSCE model the initiated cells divide into two normal cells, die, or divide into a normal and a malignant cell. Let the rates at which the three events occur be respectively *a(t), ]){(),* and //(f). Let Y(f) and Z(f), respectively, denote the number of intermediate and malignant cells at time f. Let

Then, the vector |Y(f), Z(f)} can be regarded as a two-dimensional Markov process and therefore it satisfies the Chapman-Kolmogorov equation. Thus we have

Subtracting P_{; k}(f) on both sides of the above equation, dividing by *h,* and taking the limit as *h* approaches 0, we get the Kolmogorov differential equation (see Moolgavkar et al., 1988) as

Now, let f' (y, z; *t)* denote the joint probability generating function of Y(f) and *Z(t),* that is

then multiplying (4.34) by *yi z ^{k}* and summing over

*j*and

*k,*the Kolmogorov forward differential equation is resulted as

Assuming that all cells are initially normal, the initial condition for the above equation is *4'* (y, 2; 0) = 1. Note also that *4'* (1, 0; *t) = P(Z(t)=0)* is the survival function *S(t)* for time T to the first malignant cell. The corresponding hazard function is given by

Therefore, we see that in order to derive expressions for the survival and hazard functions, we need to evaluate *4** (1, 0; *t*) and its derivative. Note that from (4.35) above, we have

and therefore since

Now, as pointed out before, since the occurrence of cancer is a rare event,

i.e. *P[Z(t)* = 0] « 1, we can replace the above conditional expectation with the unconditional expectation. Substituting in (4.35) and solving, we finally find the following approximation for the hazard function

Dewanji (2008) notes that the above approximation besides being inapplicable in animal carcinogenicity studies, also lacks flexibility as it depends on two parameters, namely the net cell proliferation rate (« - /У) and the product *fi.vX.* In other words, the above solution gives us no information about the role of parameters *a* and *fi* alone. Also, in experimental data, often the probability of malignancy is not small and therefore approximating the conditional expectation with the unconditional one may lead to inaccuracy. The hazard calculated from the above approximation and the exact solution could be very different in carcinogenesis experiments.

## Exact Hazard for Piecewise Constant Rate Parameters

In its most general form, the model parameters are age-dependent and in order to evaluate the carcinogenic effects with age, we need to have solutions for the hazard and survival functions. However, closed form solutions for these functions are not available for age-dependent parameters. But, it turns out that the computation of the hazard and survival functions is possible when the rate parameters are piecewise constant. In this case, the model leads to a recursive procedure that results in closed form expressions for the hazard and survival functions of the TSCE model. The complete derivation of the solution is complex and involves solving a certain Ricatti differential equation. Here, we give an outline and refer the reader to Heidenreich et al. (1997). In practice, constant rate parameters most frequently occur in epidemiologic data and indeed in animal bioassay experiments.

Suppose therefore that the parameters a, *fi,* and *v* are constant within the time interval h], i.e.

where we note that the time interval [0,f] is broken down to n subintervals f,] for *i=,* ..., *n.* Now, the derivation of the exact solution of the Kolmogorov equation (4.15) requires the associated characteristic equations that are given by

In order to derive the hazard, as was shown earlier, we need to only solve for *4'* (1,0, *t).* This means that we only need to find *4'* (y, *z; t*) along the characteristic through (y(0), 0,0) with y(f) = 1. Now, the Kolmogorov ordinary differential equation can be solved along characteristics, which gives the following solution:

for which the initial condition is ^{l}f'_{0} = 4K(y(0),z(0);0) = 1. The equation for y(s, 0 is a Ricatti differential equation which can be solved analytically when the model parameters are piecewise constant. In this case the solution to the characteristic equation is (Heidenreich et al., 1997; Meza, 2010)

where

with *p,* and *q,* being the lower and upper roots of the quadratic equation

and

From the expressions, the equations for the hazard and survival functions when the rate parameters are age-dependent can be derived as