Hypersparsity and linear programming
In the previous section, we assumed that the right-hand side or the solution was full or nearly so. We exploited the occasional zero by the test on zero in the code of Figure 10.5.1. However, if the right-hand side is very sparse and the matrix is reducible, more tests for zero may be performed than actual arithmetic operations. This situation often occurs in linear programming and was considered by Hall and McKinnon (2005), who call it hypersparsity and report speedups by a factor of about 5 from exploiting it.
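To see why the test on zero alone is not enough, consider forward substitution with L held by columns, sketched here in Python (a hypothetical illustration with our own names and data layout; the book's codes are in Fortran, and this is not Figure 10.5.1 itself). Every column index is still visited, so a very sparse right-hand side incurs n zero tests but few arithmetic operations:

```python
def forward_solve_zero_test(Lcol, n, b):
    """Solve L*x = b for unit lower triangular L held by columns.
    Lcol[j] is a list of pairs (i, L[i,j]) with i > j.
    b is a full-length list of right-hand side values."""
    x = list(b)
    for j in range(n):          # every column is visited ...
        xj = x[j]
        if xj == 0.0:           # ... even if only to test for zero
            continue
        for i, lij in Lcol.get(j, []):
            x[i] -= lij * xj    # arithmetic only where x[j] != 0
    return x
```

If b has a single entry, the loop still performs n tests; a hypersparse code must instead work only with the nonzero pattern, which is what the Gilbert-Peierls algorithm discussed next achieves.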
One of the mechanisms that they use is the Gilbert-Peierls algorithm, which we met in Section 10.4. There, we were applying a sequence of elementary Gaussian elimination operations to a column of the matrix that usually begins and ends as sparse. Exactly the same process is applicable to solving
    Lz = b
when L is held by columns and both L and b are sparse. The total computational effort needed is proportional to the number of nonzero arithmetic operations. For the back-substitution
    Ux = z,
it is important for hypersparsity to hold U by columns, for then the Gilbert-Peierls algorithm can again be applied.
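A minimal sketch of this process in Python (our own naming and data layout, not taken from the book): a depth-first search over the pattern of L finds the solution entries reachable from the nonzeros of b, in topological order, and the numeric phase then visits only those columns, so the total work is proportional to the nonzero operations:

```python
def reach(Lcol, b_pattern):
    """Indices reachable in the graph of L (edge j -> i for each
    entry L[i,j]) from the nonzeros of b, in topological order."""
    visited, post = set(), []
    for s in b_pattern:
        if s in visited:
            continue
        visited.add(s)
        stack = [(s, iter(Lcol.get(s, ())))]
        while stack:
            j, it = stack[-1]
            for i, _ in it:
                if i not in visited:
                    visited.add(i)
                    stack.append((i, iter(Lcol.get(i, ()))))
                    break
            else:               # column j fully explored
                post.append(j)
                stack.pop()
    return post[::-1]           # reverse postorder

def gp_lower_solve(Lcol, b):
    """Solve L*z = b, unit lower triangular L held by columns
    (Lcol[j] = list of (i, L[i,j])), sparse b given as a dict
    {index: value}.  Returns the sparse solution as a dict."""
    x = dict(b)
    for j in reach(Lcol, b):
        xj = x.get(j, 0.0)
        if xj == 0.0:           # exact cancellation: nothing to do
            continue
        for i, lij in Lcol.get(j, []):
            x[i] = x.get(i, 0.0) - lij * xj
    return x
```

The same routine serves for the back-substitution once U is held by columns, since only triangularity of the pattern, not the ordering of the indices, is used.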
If it is desired to solve several systems
    A^T x = b
for the transposed matrix with hypersparsity, which is the case for linear programming where A is the basis matrix, it is worthwhile to hold copies of L and U by rows so that the factorization
    A^T = U^T L^T
is held by columns and the Gilbert-Peierls algorithm is applicable.
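Forming the row-held copies is a single pass over the column-held factors. In the sparse-dict layout of the earlier sketches (our own, hypothetical, representation), a row-held L is exactly a column-held L^T, so the same solve can be reused unchanged for the transposed system:

```python
def held_by_rows(Acol):
    """Convert a column-held sparse matrix {j: [(i, v), ...]} into a
    row-held copy {i: [(j, v), ...]}.  The result, read as columns,
    is the transpose of the input."""
    Arow = {}
    for j, entries in Acol.items():
        for i, v in entries:
            Arow.setdefault(i, []).append((j, v))
    return Arow
```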
Hall and McKinnon (2005) pay detailed attention to hypersparsity for every part of the linear programming calculation. It is outside the scope of this book to describe it all and we refer the reader to their paper. However, we will mention one simple device that may be applicable elsewhere. There is a need to calculate the product p^T N of a sparse vector p (found by a hypersparse solution) with the sparse matrix N that is the nonbasic part of the constraint matrix. Usually, N is held by columns. In this case, each inner product p^T N_{:j} for a single column may be computed with effort proportional to the number of entries in N_{:j}. However, many of the inner products p^T N_{:j} may be zero although N_{:j} is not, so much of this effort is wasted. The solution is to hold a copy of N by rows and form p^T N as the sum over j of p_j N_{j:}. Now only nonzero products are calculated.
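The device can be sketched in a few lines (again with our own hypothetical layout): with N held by rows, only the rows matching nonzeros of p are touched, and zero inner products cost nothing:

```python
def sparse_vT_times_N(p, Nrow):
    """Form p^T N for sparse p ({j: value}) and N held by rows
    ({j: [(k, N[j,k]), ...]}).  Only nonzero products are computed;
    rows of N not matched by an entry of p are never visited."""
    out = {}
    for j, pj in p.items():
        if pj == 0.0:
            continue
        for k, njk in Nrow.get(j, []):
            out[k] = out.get(k, 0.0) + pj * njk
    return out
```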
Table 10.7.1 Speeds (GFlops) for vector operations on a dual-socket 3.10 GHz Xeon E5-2687W, using the Intel compiler with -O2 optimization. x is addressed directly or indirectly; y is addressed directly.

                 x = x + ay                    a = x^T y
         Direct            Indirect    Direct            Indirect
  Size   BLAS   In line    In line     BLAS   In line    In line
     5    0.5    1.6        1.2         0.6    1.2        1.1
    10    0.8    1.7        1.2         1.0    1.8        1.6
    20    1.4    1.8        1.4         1.6    2.0        1.6
    50    1.6    1.7        1.4         1.9    2.1        1.7
   100    1.5    1.7        1.4         2.0    2.0        1.7
   200    1.5    1.5        1.4         2.1    1.9        1.5
   500    1.6    1.8        1.4         2.2    2.3        1.7
  1000    1.7    1.8        1.4         2.2    2.3        1.8