# Discussion of scaling strategies

In Section 4.14, we described three quite different strategies for scaling a sparse matrix: scaling the entries to make their absolute values close to one (Section 4.15.1), scaling the row and column norms (Section 4.15.2) and I-matrix scaling (Sections 4.15.3 and 13.5.2). They all seem entirely reasonable. So the question is: ‘which should be used and is one better than another?’. The answer depends on context and criteria.

As to context, Duff (2007) has compared these scaling algorithms when used with MA57 for symmetric indefinite matrices. In Figure 13.5.1, we show a performance profile (Dolan and More 2002) from the thesis of Pralet (2004), who ran the scaling algorithms on a set of augmented system matrices. For a given value x on the x-axis, the plots show the proportion of matrices for which the factorization time after scaling with each algorithm was within the factor x of the best of the five. We see that all do better than no scaling, that I- matrix scaling is best, and that scaling the rows and columns (in either norm) is between this and scaling the entries (Section 4.15.1). We believe that the success of I-matrix scaling is because it permutes entries onto the diagonal to maximize the size of their product and most of these entries are retained when symmetry is restored (Section 13.5.2). This is not true of the other scalings. Similar conclusions were made by Hogg and Scott (2008). Clearly, scaling is very helpful from this perspective.

Note that we have not paid attention to the right-hand sides in this section. It is possible that scaling the matrix with right-hand sides appended may be desirable. We discussed this for full matrices at the start of Section 4.15, and the same considerations can apply in the large sparse case.

We note that the criteria being used here is the speed of the factorization time. More difficult to assess but equally as important is whether the resulting factorization yields a solution to the original problem with high accuracy. The problem here is that different automatic scaling methods assign meaning to different small numbers. In making this assignment, they lead to different scaled matrices. Each resulting matrix has a stable factorization that can be achieved rapidly as shown. However, the solutions using these factors may look different Fig. 13.5.1. Figure from Duff and Pralet (2005). Performance profiles for factorization times on a set of augmented system matrices. MC30 scales the entries, MC77 scales the rows and columns in the infinity or one norm, and MC64 uses I-matrix scaling.

from each other. Which one was the better solution to the problem intended by the modeller? It is easy to demonstrate this dilemma with a 2 x 2 matrix, which we invited the reader to explore in Research Exercise R.4.1. We invite the reader to explore the effect of different scalings on solution accuracy for larger, relevant problems in Research Exercise R13.2.