Thresholding sparse matrices in Matlab

Here are the methods I tried:

function [tA] = hard_threshold(A, t)

    tic;
    tA = sparse(size(A));
    tA(abs(A) >= t) = A(abs(A) >= t);
    toc;

    clear tA;
    tic;
    tA = A;
    tA(abs(tA) < t) = 0;
    toc;

    clear tA;
    tic;
    tA = A;
    find_A = find(A);
    find_tA = find(abs(A) >= t);
    victim_tA = setdiff(find_A, find_tA);
    tA(victim_tA) = 0;
    toc;

    fprintf('numel(A):%i nnz(A):%i nnz(tA):%i \n', numel(A), nnz(A), nnz(tA)');
end

I first tried a small sparse matrix with 100k elements, 1% sparsity, removing 50% of nonzeros:

A = sprand(1e5,1,0.01); tA = hard_threshold(A, 0.5);
Elapsed time is 0.128991 seconds.
Elapsed time is 0.007644 seconds.
Elapsed time is 0.003038 seconds.
numel(A):100000 nnz(A):995 nnz(tA):489

I next repeated with 1m elements:

A = sprand(1e6,1,0.01); tA = hard_threshold(A, 0.5);
Elapsed time is 15.456836 seconds.
Elapsed time is 0.082908 seconds.
Elapsed time is 0.018396 seconds.
numel(A):1000000 nnz(A):9966 nnz(tA):5019

With 100m elements, excluding the first, slowest, method:

A = sprand(1e8,1,0.01); tA = hard_threshold(A, 0.5);
Elapsed time is 16.405617 seconds.
Elapsed time is 0.259951 seconds.
numel(A):100000000 nnz(A):994845 nnz(tA):498195

The time differential is about the same even when the thresholded matrix is much sparser than the original:

A = sprand(1e8,1,0.01); tA = hard_threshold(A, 0.95);
Elapsed time is 12.980427 seconds.
Elapsed time is 0.238180 seconds.
numel(A):100000000 nnz(A):995090 nnz(tA):49950

The second method fails due to memory constraints for really large sparse matrices:

Error using < 
Requested 1000000000x1 (7.5GB) array exceeds maximum array size preference. Creation of arrays greater than this limit may
take a long time and cause MATLAB to become unresponsive. See array size limit or preference panel for more information. Error in hard_threshold (line 10)
 tA(abs(tA) < t) = 0;

After excluding the second method, the third method gives:

A = sprand(1e9,1,0.01); tA = hard_threshold(A, 0.5);
Elapsed time is 1.894251 seconds.
numel(A):1000000000 nnz(A):9950069 nnz(tA):4977460

Are there any other approaches that are faster?

 

Sampling from Multivariate Gaussian distribution in Matlab

tl;dr: Don’t use mvnrnd in Matlab for large problems; do it manually instead.

The first improvement uses the Cholesky decomposition, allowing us to sample from a univariate normal distribution. The second improvement uses the Cholesky decomposition of the sparse inverse covariance matrix, not the dense covariance matrix. The third improvement avoids computing the inverse, instead solving a (sparse) system of equations.


n = 10000;
Lambda = gallery('tridiag',n,-0.3,1,-0.3); % sparse
tic;
x_mvnrnd = mvnrnd(zeros(n,1),inv(Lambda));
toc;

tic;
z = randn(n,1); % univariate random
Sigma = inv(Lambda);
A = chol(Sigma,'lower'); % sparse
x_fromSigma = A*z; % using cholesky of Sigma
toc;

tic;
z = randn(n,1); % univariate random
L_Lambda = chol(Lambda,'lower'); % sparse
A_fromLambda = (inv(L_Lambda))'; % sparse
x_fromLambda = A_fromLambda*z;
toc;

tic;
z = randn(n,1); % univariate random
L_Lambda = chol(Lambda,'lower'); % sparse
x_fromLambda = L_Lambda'\z;
toc;

Results:

Elapsed time is 4.514641 seconds.
Elapsed time is 2.734001 seconds.
Elapsed time is 1.740317 seconds.
Elapsed time is 0.012431 seconds.

Matlab: different colormaps for subplots

I often want different subplots in one Matlab figure to have different colormaps. However, colormap is a figure property, so it’s not trivial, except that it is… with these utilities:

This works for everything except colorbars: http://www.mathworks.com/matlabcentral/fileexchange/7943-freezecolors—unfreezecolors

Post-2010, Matlab refreshes colorbars with each subplot, so you’ll need this to freeze colorbars: http://www.mathworks.com/matlabcentral/fileexchange/24371-colormap-and-colorbar-utilities–feb-2014-