RANDOMIZED NUMERICAL LINEAR ALGEBRA FOR KERNEL MATRIX COMPRESSION
Abstract
Matrices are widely used to store data in applications across Data Science, Computer Science, Statistics, and Applied Mathematics. Matrix computations such as matrix multiplication, matrix inversion, eigenvalue decomposition, and singular value decomposition are essential in real-world applications. Unfortunately, many of these operations are so expensive in time and memory that they become prohibitive when the scale of the data is large. Moreover, when the data contains a large amount of meaningless information, called noise, machine-precision matrix operations are not necessary, and one can sacrifice a reasonable amount of accuracy for computational efficiency. In Machine Learning, Linear Algebra, Partial Differential Equations, and Optimization as well, the data usually boils down to an m × n matrix A, and it is often very helpful to derive matrix approximations to the original matrix A when the data is otherwise unmanageable. We seek an approximation matrix A_k of a particular rank k (considerably smaller than m and n), in other words, a low-rank approximation. Methods such as Singular Value Decomposition and QR Decomposition can be used to achieve such matrix approximations, but these algorithms usually take a lot of time: superlinear in the number of nonzero elements of the matrix. We want algorithms that remain practical in applications where the data sets form extremely large matrices. So, in this thesis, we primarily focus on a new approach called Randomized Numerical Linear Algebra. We will introduce two methods, the first known as Random Projections and the second known as Random Sampling. Random projections have recently emerged as a powerful method for dimensionality reduction. We will present experimental results carried out on matrices generated by kernel functions, and work with a specific type of kernel function called Green's function.
We will perform low-rank approximation on these matrices using Randomized Numerical Linear Algebra and report approximation errors to better understand the behavior of the new approach, known as RandNLA. We will show that using a sparse random matrix gives additional computational savings. Projecting the data onto a random lower-dimensional subspace yields results comparable to conventional dimensionality reduction methods such as SVD and QR: the similarity of data vectors is well preserved under random projection, while this approach is computationally significantly less expensive than the conventional ones.
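A sparse random projection of the kind alluded to above can be sketched as follows. This is an assumed setup, not the thesis code: it uses the well-known {+1, 0, −1} construction of Achlioptas, in which two thirds of the projection matrix's entries are exactly zero, yet pairwise distances between data vectors are still approximately preserved.

```python
# Minimal sketch (assumed parameters): sparse random projection with
# entries sqrt(3)*{+1, 0, -1} taken with probabilities {1/6, 2/3, 1/6},
# scaled by 1/sqrt(k) so that squared norms are preserved in expectation.
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 100, 1000, 400           # n data vectors in d dims, target dim k

X = rng.standard_normal((n, d))    # data vectors as rows (illustrative)

# Sparse projection matrix: most entries are zero, so computing X @ R
# needs far fewer arithmetic operations than a dense Gaussian projection.
R = rng.choice([np.sqrt(3.0), 0.0, -np.sqrt(3.0)],
               size=(d, k), p=[1 / 6, 2 / 3, 1 / 6]) / np.sqrt(k)

Y = X @ R                          # projected data, shape (n, k)

# Distances between vectors are approximately preserved.
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
```

The ratio `proj / orig` concentrates around 1 as k grows, which is the sense in which the similarity of data vectors survives the projection at a fraction of the cost of SVD or QR.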