# Integer Programming

Integer Programming (IP; Dantzig et al. 1954; Gomory 1958) is a widely used and powerful optimization technique to solve a variety of decision-making problems (Wolsey 1998; Jünger et al. 2010). IP methods maximize or minimize a linear objective function subject to linear constraints (equalities or inequalities) when one or more variables are restricted to be integers. Theoretically solving IP is NP-hard. However, thanks to powerful solvers like CPLEX (2012) and GUROBI (Gurobi Optimization Inc. 2013), problems with thousands of variables and constraints can be solved optimally within reasonable time (Jünger et al. 2010; and references therein).

The first application of IP in conservation problems goes back to (Cocks and Baird 1989), who solved the reserve selection (Problem 4) under species richness. Such IP formulations have been extended to more realistic scenarios (Underhill 1994; Church et al. 1996; Possingham et al. 2000), to maximize PD (Rodrigues and Gaston 2002; Rodrigues et al. 2005), and to maximize SD (Minh et al. 2010).

Here, we show how to model biodiversity optimization problems 1–5 in IP parlance, which allows available IP software packages to solve the problem. We first introduce some notations and definitions and further exemplify IP formulations for Problems 1–5 using the pheasant data set.

#### IP for Taxon Selection Problems

Given a set of n taxa, we encode a subset S by an n-element binary vector with entries of 0 and 1 indicating the absence and presence of the corresponding taxa in

S. The elements of this vector are called taxon variables. For the pheasant data set there are ten taxon variables xPB, xPC, xPE, xPG, xPI, xPM, xGG, xGL, xGS, xGV (indices follow initials of species names). We say that a split σ = A|B is preserved in S if A and B each contain at least one taxon from S. For each split σ we introduce a binary split variable , where = 1 if σ is preserved in S and 0 otherwise.

Each is fully identified from taxon variables by two split constraints as follows. σ1 is a trivial split that separates P. bicalcaratum from the remaining taxa. σ1 is preserved (i.e. y1 = 1) if P. bicalcaratum and at least another taxon are preserved (see Fig. 2a for the definition of the splits). This condition is expressed by two inequalities:

y1 £ xPB , y1 £ xPC + xPE + xPG + xPI + xPM + xGG + xGL + xGS + xGV

In fact, the second inequality always holds because k ≥ 2 and thus is ignored. Now consider the non-trivial split σ17, which separates G. gallus, G. lafayetii, and G. varius from the remaining taxa. σ17 is preserved if at least one of G. gallus, G. lafayetii, and G. varius and one of the remaining taxa are preserved. Therefore,

y17 £ xGG + xGL + xGV , y17 £ xPB + xPC + xPI + xPG + xPE + xPM + xGS

The remaining split constraints are listed in Table 3.

Based on split variables one can rewrite SD of S as:

where λσ is the weight of split σ. This is the objective function that we want to maxi-

mize for all problems (1–5).

In the taxon selection Problem 1 the size of an optimal subset is constrained by a

predefined number k, meaning that:

xPB + xPC + xPE + xPG + xPI + xPM + xGG + xGL + xGS + xGV £ k (2)

We also require that taxon and split variables are binary

Suppose we are given a total budget B. Let ci denote conservation costs for taxon

i. We can then substitute constraint (2) by the budget constraint

Together with previous constraints we have the IP formulation of Problem 2 by:

We now model viability constraints that operate on taxon variables as follows.

G. sonneratii depends on P.malacense and P.germaini (Fig. 4). Therefore, the viability constraint for G. sonneratii is simply

xPM + xPG ³ xGS

This ensures that xGS is 1 (i.e., G. sonneratii is selected for conservation) only if at least one of xPM and xPG is also 1. Viability constraints for all the other taxa are listed in Table 3. Now, the IP formulation for viable taxon selection can be obtained by simply including viability constraints to Problem 1:

#### IP for Reserve Selection Problems

For reserve selection we encode a subset W of m areas by a binary vector (z1,z2,…,zm), where zr is 1 if area r is present in W, and 0 otherwise. We call zr area variables. For the pheasant habitat (Table 1) we have eight area variables zID, zLK, zBT, zIN, zPH, zMY, zTH, zVN (indices follow two-letter country codes). We now redefine split constraints

in terms of area variables instead of taxon variables as follows.

Split σ18, which separates G. lafayetii and G. varius from the others, is preserved if (1) G. lafayetii or G. varius is preserved and (2) at least one of the remaining taxa is preserved. Because G. lafayetii or G. varius occur in Indonesia and Sri Lanka, condition 1 is equivalent to: y18 £ zID + zLK

Similarly condition 2 is equivalent to:

y18 £ zBT + zID + zIN + zPH + zMY + zTH + zVN

since the remaining taxa are found in all areas except Sri Lanka. Such area-split

constraints for all other splits are listed in Table 4.

The subset size constraint has to be rewritten for countries:

zID + zLK + zBT + zIN + zPH + zMY + zTH + zVN £ k (8)

We keep binary constraints for split variables and also include such for area variables

Reserve selection problem is then formulated as follows:

For budgeted reserve selection we are given a total budget B. Let cID, cLK, cBT, cIN, cPH, cMY, cTH, cVN denote conservation costs for each country. Then a budget constraint for areas is

To obtain the IP formulation for Problem 5 we simply substitute subset size constraint (8) by the budget constraint (11).

# Other Algorithms

While greedy algorithms and IP are general strategies for all Problems 1–5, other algorithms have been applied to solve special cases. For example, simulated annealing algorithms (Possingham et al. 2000) were introduced to solve the reserve selection Problems 4 and 5 under species richness with an opportunity to minimize the connectivity between the areas such as the boundary lengths. Dynamic programming algorithms (DPA) have been applied to solve Problem 2 under PD (Pardi and Goldman 2007). DPA was further extended to maximize SD on circular split networks (Minh et al. 2009a, b). Other special types of split networks were exploited to solve Problem 1 (Spillner et al. 2008; Bordewich et al. 2009).