Here are the methods I tried:

function [tA] = hard_threshold(A, t) tic; tA = sparse(size(A)); tA(abs(A) >= t) = A(abs(A) >= t); toc; clear tA; tic; tA = A; tA(abs(tA) < t) = 0; toc; clear tA; tic; tA = A; find_A = find(A); find_tA = find(abs(A) >= t); victim_tA = setdiff(find_A, find_tA); tA(victim_tA) = 0; toc; fprintf('numel(A):%i nnz(A):%i nnz(tA):%i \n', numel(A), nnz(A), nnz(tA)'); end

I first tried a small sparse matrix with 100k elements, 1% sparsity, removing 50% of nonzeros:

A = sprand(1e5,1,0.01); tA = hard_threshold(A, 0.5); Elapsed time is 0.128991 seconds. Elapsed time is 0.007644 seconds. Elapsed time is 0.003038 seconds. numel(A):100000 nnz(A):995 nnz(tA):489

I next repeated with 1m elements:

A = sprand(1e6,1,0.01); tA = hard_threshold(A, 0.5); Elapsed time is 15.456836 seconds. Elapsed time is 0.082908 seconds. Elapsed time is 0.018396 seconds. numel(A):1000000 nnz(A):9966 nnz(tA):5019

With 100m elements, excluding the first, slowest, method:

A = sprand(1e8,1,0.01); tA = hard_threshold(A, 0.5); Elapsed time is 16.405617 seconds. Elapsed time is 0.259951 seconds. numel(A):100000000 nnz(A):994845 nnz(tA):498195

The time differential is about the same even when the thresholded matrix is much sparser than the original:

A = sprand(1e8,1,0.01); tA = hard_threshold(A, 0.95); Elapsed time is 12.980427 seconds. Elapsed time is 0.238180 seconds. numel(A):100000000 nnz(A):995090 nnz(tA):49950

The second method fails due to memory constraints for really large sparse matrices:

Error using < Requested 1000000000x1 (7.5GB) array exceeds maximum array size preference. Creation of arrays greater than this limit may take a long time and cause MATLAB to become unresponsive. See array size limit or preference panel for more information. Error in hard_threshold (line 10) tA(abs(tA) < t) = 0;

After excluding the second method, the third method gives:

A = sprand(1e9,1,0.01); tA = hard_threshold(A, 0.5); Elapsed time is 1.894251 seconds. numel(A):1000000000 nnz(A):9950069 nnz(tA):4977460

Are there any other approaches that are faster?

You must be logged in to post a comment.