# Methods

A signal f e <d can be represented as a sparse linear combination of prototype signals from an overcomplete dictionary , D e <d*K, K > d. These prototype signals are the columns of D, usually termed “atoms”, and designated {aj}f=1. The representation coefficients constitute the vector x e RK, so that for exact signal representation f = Dx. We seek the sparsest representation of f, using the dictionary D. This sparsest representation is the solution to the problem in Eq. (1). The solution of Eq. (1) is an NP-hard problem. However as long as sparse enough solution exists, its uniqueness can be verified [5].

The overcomplete dictionary D that leads to sparse representation can be chosen in advance, or especially adapted to fit a training set of signal examples. This latest approach is adopted here. The dictionary is trained using K-SVD algorithm [5], which minimizes the following objective function [Eq. (2)].

where T0 is a predetermined number of non-zero entries in a coefficient vector and F is a matrix of the size d xN, containing N original data points f 2 Rd, as its columns. X is a matrix of the size К x N, containing N coefficients vectors X 2 RK, as its

columns. The notation ||AF stands for Frobenius norm, given by ||A||f = Щ- K-SVD iteratively minimizes the function in Eq. (2), in two stages: first D is fixed and the best coefficient matrix X is found using Orthogonal Matching Pursuit (OMP [6]). This stage is termed the sparse coding stage. The second stage is the codebook update stage. It searches for a better dictionary, by fixing all the columns of D but one, and finding a new column and its updated coefficients such that the mean square error, ||F — DX||F , is reduced.

In case of fibers, the matrix FdxN contains the original fibers as its columns and XKxN contains the sparse representations of the fibers as its columns. N is the number of fibers, d is the length of original fiber representation and K is the length of the sparse representation. A fiber, f, is represented as a sequence of m points in a 3D space, sampled equidistantly: f = {xi, yb zi, ,xm, ym, zm},where m = 20 for all fibers (d = 60). This choice of m was found to be a good compromise between the representation length and representation precision [7]. Once the dictionary is finalized, the fibers can be sparsely represented as a combination of dictionary atoms. This step is termed sparse coding, or atom decomposition. The representation coefficients x are found based on the given signal f and the dictionary D. It requires solving the problem in stated in Eq. (1). OMP algorithm that is used is a greedy algorithm that selects at each step the atom with the highest correlation to the current residual. The signal is then orthogonally projected to the span of the previously selected atoms and the residual is iteratively recomputed [5]. The compressed fibers set will consist of the matrix X containing the new sparse representations as its columns, and of the dictionary D. An approximate reconstruction of original representation is achieved by

Instead of learning the dictionary on an entire fiber set, a representative sub-set of fibers is used in order to reduce the learning time. Simple down-sampling of the fiber set may compromise the outcome by randomly excluding parts of the fiber- set that may be important. Here, we propose to employ the coreset concept [8] in order to select a meaningful sub-set of fibers for learning the dictionary. Coresets are an innovative paradigm in data science, useful in tasks requiring large computation time and/or memory [8]. Selective reduction of large fiber-sets using coresets was explored in our previous work [9]. A novel algorithm called Density Coreset (DS) was proposed, that selects a subset of fibers using an iterative non-uniform sampling process. In DS, the effective sampling rate is set to be inversely proportional to the fibers density in brain space. This reduction process tends to choose representative fibers, while omitting redundancies (which are the fibers nearly identical to the ones selected). Feigin et al. [10] have shown that the optimization problem of learning D can be solved on a coreset, which is much smaller, without sacrificing too much accuracy.