# Model Parameter Identifiability and Study Design

Although the work in [1,2] had an enormous impact by revealing how rapid within-host viral dynamics are, [1] reported 68% confidence intervals for the cell-free viral RNA clearance rate, c, because the 95% intervals were too wide to be meaningful. This suggests that the viral load observations used to estimate c carry very limited "information" about that parameter. In other words, observed viral load is not "sensitive" to the viral clearance parameter, c, in the model used in that work.

Similarly, Lewin et al. [23] used measurements of hepatitis B viral load after potent antiviral therapy to evaluate the complex decay profiles that follow treatment of hepatitis B infection. Their mathematical model included parameters for the death rate of infected cells, δ, the clearance rate of free virions, c, and two parameters, ε and η, representing drug efficacy at different points in the viral replication cycle. In that work, they were unable to estimate η, which represents the efficacy of drug therapy in preventing new cells from becoming infected. They therefore estimated the remaining three parameters with η fixed at 0.5 and repeated the analysis with η = 0 and η = 1. The resulting estimates of c, δ, and ε differed very little across the three analyses, suggesting that hepatitis B viral load is not "sensitive" to the parameter η in the model used in their work.

This inability to accurately estimate certain parameters is well known in a variety of fields. In numerical linear algebra, when not all parameters can be estimated from the available data, the problem is referred to as "ill-posed" (if no unique solution exists) or "ill-conditioned" (if solutions are unstable) [24]. In the study of differential equations, parameters are referred to as "sloppy" or "stiff," depending on whether their uncertainty is large or they require small step sizes to be accurately estimated from the data, respectively; see, for example, [25,26,27]. In statistics, parameters that cannot be estimated from the available data are referred to as nonidentifiable.

In the mathematical differential equations literature, sensitivity analysis is widely used to provide guidance on which observations should be obtained, and at which time points, in order to give the most precise estimates of individual model parameters. Specifically, sensitivity analysis evaluates which compartments in a system of ODEs are *sensitive* to changes in a specific parameter by calculating the rate of change of each compartment with respect to that parameter over time. It thereby identifies the times at which a compartment is most sensitive to (i.e., varies the most with respect to) a specific parameter in the system. Unfortunately, sensitivity analysis is a univariate method, and it is not clear how to evaluate sensitivity when one wants to estimate two or more parameters simultaneously, for example, the viral clearance and infected cell decay parameters for HIV or hepatitis B.
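As a rough illustration of the calculation described above, the sketch below approximates the sensitivity of viral load to each of two parameters over time. The two-compartment model, its closed-form solution, and every numerical value (`p`, `I0`, `V0`, the rates, the time grid) are assumptions chosen for illustration; they are not the models or estimates of [1] or [23]. Sensitivities are approximated by central finite differences rather than by solving the sensitivity equations.

```python
import math

# Hypothetical toy model (not the model of [1] or [23]):
#   dI/dt = -delta * I,   dV/dt = p * I - c * V
# V(t) is available in closed form when c != delta.
def viral_load(t, c, delta, p=100.0, I0=1.0, V0=1000.0):
    decay_c = math.exp(-c * t)
    decay_delta = math.exp(-delta * t)
    return V0 * decay_c + p * I0 * (decay_delta - decay_c) / (c - delta)

def sensitivity(t, c, delta, param, rel_step=1e-6):
    """Central finite-difference estimate of dV/d(param) at time t."""
    if param == "c":
        h = rel_step * c
        return (viral_load(t, c + h, delta) - viral_load(t, c - h, delta)) / (2 * h)
    if param == "delta":
        h = rel_step * delta
        return (viral_load(t, c, delta + h) - viral_load(t, c, delta - h)) / (2 * h)
    raise ValueError("param must be 'c' or 'delta'")

def time_of_peak(times, values):
    """Time at which the sensitivity is largest in absolute value."""
    index = max(range(len(values)), key=lambda i: abs(values[i]))
    return times[index]

c, delta = 3.0, 0.5                      # illustrative rates (per day)
times = [0.25 * k for k in range(41)]    # days 0 to 10 in quarter-day steps
sens_c = [sensitivity(t, c, delta, "c") for t in times]
sens_delta = [sensitivity(t, c, delta, "delta") for t in times]

t_peak_c = time_of_peak(times, sens_c)
t_peak_delta = time_of_peak(times, sens_delta)
print(f"V is most sensitive to c at t = {t_peak_c} days")
print(f"V is most sensitive to delta at t = {t_peak_delta} days")
```

In this toy setting the two sensitivity curves peak at different times, which is the kind of information sensitivity analysis offers about when a compartment should be measured to learn about a given parameter. Each curve is still computed one parameter at a time.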

A powerful statistical tool for simultaneous parameter estimation is the Fisher information matrix (FIM), which has been well described in the statistical literature. However, the properties of ODE systems are not widely studied in the statistical literature, while the mathematical literature generally does not focus on identifiability and inference for ODE parameter estimates. Fortunately, there is a simple relationship between sensitivity analysis and the FIM. This relationship allows matrix characteristics of the FIM to be explored by evaluating the corresponding characteristics of a simpler matrix derived from sensitivity analysis, which will be referred to as the sensitivity matrix. As a consequence, sensitivity analysis and the FIM can be combined to determine how data should be collected to ensure stable estimation of specific parameters in systems of ODEs. That is, the two tools can be used together for study design: they identify which model compartments should be observed, and at which time points, in order to precisely estimate the parameters of interest.
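One common form of this relationship, assumed here for illustration, arises when observations carry independent Gaussian errors with known constant variance σ²: the FIM is then SᵀS / σ², where row i of the sensitivity matrix S holds the derivatives of the model output at time tᵢ with respect to each parameter. The sketch below builds S for a hypothetical two-parameter viral decay model and compares two candidate sampling schedules. The model, parameter values, `sigma`, and both schedules are illustrative assumptions, not taken from the cited studies.

```python
import math

# Hypothetical toy model: dI/dt = -delta*I, dV/dt = p*I - c*V (c != delta).
def viral_load(t, c, delta, p=100.0, I0=1.0, V0=1000.0):
    decay_c = math.exp(-c * t)
    decay_delta = math.exp(-delta * t)
    return V0 * decay_c + p * I0 * (decay_delta - decay_c) / (c - delta)

def sensitivity_row(t, c, delta, rel_step=1e-6):
    """One row of S: (dV/dc, dV/ddelta) at time t, by central differences."""
    hc, hd = rel_step * c, rel_step * delta
    dV_dc = (viral_load(t, c + hc, delta) - viral_load(t, c - hc, delta)) / (2 * hc)
    dV_dd = (viral_load(t, c, delta + hd) - viral_load(t, c, delta - hd)) / (2 * hd)
    return dV_dc, dV_dd

def fisher_information(times, c, delta, sigma):
    """Entries (a, b, d) of the symmetric 2x2 matrix S^T S / sigma^2 = [[a, b], [b, d]]."""
    rows = [sensitivity_row(t, c, delta) for t in times]
    a = sum(r[0] * r[0] for r in rows) / sigma**2
    b = sum(r[0] * r[1] for r in rows) / sigma**2
    d = sum(r[1] * r[1] for r in rows) / sigma**2
    return a, b, d

def summarize(fim):
    """Condition number and Cramer-Rao lower bounds on the standard errors."""
    a, b, d = fim
    det = a * d - b * b
    trace = a + d
    lam_max = (trace + math.sqrt(max(trace * trace - 4 * det, 0.0))) / 2
    lam_min = det / lam_max              # avoids cancellation in (trace - root) / 2
    return {
        "cond": lam_max / lam_min,
        "se_c": math.sqrt(d / det),      # sqrt of [FIM^-1] entry for c
        "se_delta": math.sqrt(a / det),  # sqrt of [FIM^-1] entry for delta
    }

c, delta, sigma = 3.0, 0.5, 5.0          # illustrative values only
early = summarize(fisher_information([0.25, 0.5, 1, 2, 4, 7], c, delta, sigma))
late = summarize(fisher_information([2, 3, 4, 5, 6, 7], c, delta, sigma))
print("schedule with early samples:", early)
print("schedule with late samples only:", late)
```

Comparing the two summaries shows how a sampling schedule changes the lower bound on the standard error of each parameter when both are estimated together. In this toy example, the schedule that includes samples from the first day yields a much smaller bound for c than the schedule that starts at day 2.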