
sparse and transpose on sparse matrices could be faster #12998

Closed
KristofferC opened this issue Sep 7, 2015 · 10 comments
Labels: performance (Must go faster), potential benchmark (Could make a good benchmark in BaseBenchmarks), sparse (Sparse arrays)

Comments

@KristofferC
Member

I thought this was worth a separate issue.

In #12981 @mb1234 posted a short comparison script to benchmark a few methods on sparse matrices in Julia, Octave, and Matlab. The script that produced the table below can be found in #12981 (comment), except that I added an explicit call to gc() between method invocations. Each entry is the average of 5 runs, with the standard deviation in parentheses.

| Func  | Julia               | Octave              | Matlab (single thread) |
|-------|---------------------|---------------------|------------------------|
| sparse| 2.478101 (0.161879) | 1.233007 (0.098839) | 2.678221 (0.104867)    |
| 2*A   | 0.052062 (0.000805) | 0.120093 (0.042131) | 0.070358 (0.001547)    |
| A'    | 1.131908 (0.010331) | 0.810197 (0.075346) | 1.159815 (0.246967)    |
| A+B   | 0.246413 (0.001277) | 0.487021 (0.088988) | 0.202465 (0.000276)    |
| A*x   | 0.169672 (0.003784) | 0.340456 (0.009046) | 0.162381 (0.016590)    |
| A'*x  | 0.146576 (0.003835) | 1.191680 (0.099767) | 0.135291 (0.003267)    |

In general, we are doing really well! However, we can see that for sparse and transpose we still have some way to go to match Octave. Right now, sparse builds a CSR matrix and transposes it at the end. It should be possible to construct the CSC matrix directly and avoid the transpose.

For transpose, I looked at the code and I am not sure what could be improved. Does anyone have any ideas here?
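
For reference, transposing a CSC matrix is commonly done with a counting-sort style scatter over the columns. The sketch below is only an illustration of that idea on plain (colptr, rowval, nzval) arrays, not the actual Base implementation:

# Illustrative counting-sort transpose of an m-by-n CSC matrix stored as
# (colptr, rowval, nzval); sketch only, not Base's implementation.
function csc_transpose(m, n, colptr, rowval, nzval)
    nnzA = colptr[n + 1] - 1
    Tcolptr = zeros(Int, m + 1)
    Trowval = zeros(Int, nnzA)
    Tnzval  = zeros(eltype(nzval), nnzA)

    # Count the entries in each row of A, i.e. in each column of A'.
    for k in 1:nnzA
        Tcolptr[rowval[k] + 1] += 1
    end
    Tcolptr[1] = 1
    for i in 2:m + 1                  # prefix sum -> column pointers of A'
        Tcolptr[i] += Tcolptr[i - 1]
    end

    # Scatter every entry into its column of A'; `next` tracks the next free
    # slot of each output column. Iterating the columns of A in order keeps
    # the row indices of A' sorted without any extra sorting step.
    next = copy(Tcolptr)
    for j in 1:n, k in colptr[j]:(colptr[j + 1] - 1)
        r = rowval[k]
        p = next[r]
        Trowval[p] = j
        Tnzval[p]  = nzval[k]
        next[r] += 1
    end
    return Tcolptr, Trowval, Tnzval
end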

@tkelman added the performance (Must go faster) and sparse (Sparse arrays) labels Sep 7, 2015
@KristofferC
Member Author

Maybe Octave doesn't sum duplicates in its sparse?

@andreasnoack
Member

It appears that they do.

octave:1> sparse([1,1], [1,1], [1,1])
ans = Compressed Column Sparse (rows = 1, cols = 1, nnz = 1 [100%])

  (1, 1) ->  2
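
For comparison, Julia's sparse combines duplicate (i, j) entries in the same way, summing them by default:

A = sparse([1, 1], [1, 1], [1, 1])  # needs `using SparseArrays` on Julia >= 0.7; in Base at the time
A[1, 1]                             # 2 -- the duplicates were added together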

@ViralBShah
Member

The reason for creating a CSR matrix and transposing it is that we get everything in sorted order. For direct CSC construction, you need an explicit sorting step. Any idea what Octave does? Also, it may be worth testing a few different cases for the constructor.
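
To illustrate the trade-off, a direct CSC construction along these lines has to sort the triplets by (column, row) before the column pointers can be built. A minimal sketch (illustration only, without combining duplicates):

# Sketch of direct CSC construction with an explicit sort; n is the number
# of columns. Duplicate (i, j) entries are not combined here.
function sparse_sorted(I, J, V, n)
    # Sort the triplets by column first, then row, so the CSC arrays come out ordered.
    p = sortperm(collect(zip(J, I)))
    rowval = I[p]
    nzval  = V[p]
    # Build the column pointers from the per-column counts.
    colptr = zeros(Int, n + 1)
    colptr[1] = 1
    for j in J
        colptr[j + 1] += 1
    end
    for j in 2:n + 1
        colptr[j] += colptr[j - 1]
    end
    return colptr, rowval, nzval
end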

@ViralBShah
Member

Related: see #13400 for a pathological case caused by the CSR construction.

@KristofferC
Member Author

I took a stab at improving our sparse but failed. My approach was to use a bucket for each column and sort the rows into it, using a linear search through each bucket to insert each row at the proper place (a rough sketch of the idea is included after the timings below).

Here is my attempt: https://gist.github.com/KristofferC/554badaa7b9f494f45ee

And results:

n = 10^5
sparsity = 20 / n
accums = 5
nzs_1 = repmat(rand(1:n, Int(sparsity * n^2)), accums)
nzs_2 = repmat(rand(1:n, Int(sparsity * n^2)), accums)
vals = rand(Float64, length(nzs_1))

A2 = @time sparse2(nzs_1, nzs_2, vals)
#   5.090254 seconds (970.25 k allocations: 152.123 MB, 3.83% gc time)
A = @time sparse(nzs_1, nzs_2, vals)
#   1.284557 seconds (23 allocations: 186.918 MB, 3.46% gc time)
A == A2
# true

Maybe someone can find an obvious way to speed up my attempt.
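
For readers who don't want to open the gist, here is a rough reconstruction of the bucket-per-column idea (illustration only; the gist above has the actual code):

# One sorted vector of (row, value) pairs per column; rows are inserted with
# a linear search and duplicates are summed on the fly. n is the column count.
function sparse_buckets(I, J, V, n)
    buckets = [Tuple{Int, eltype(V)}[] for _ in 1:n]
    for k in eachindex(I)
        b, r, v = buckets[J[k]], I[k], V[k]
        pos = 1
        while pos <= length(b) && b[pos][1] < r   # linear search for the insertion slot
            pos += 1
        end
        if pos <= length(b) && b[pos][1] == r
            b[pos] = (r, b[pos][2] + v)           # sum duplicate entries
        else
            insert!(b, pos, (r, v))               # keep the bucket row-sorted
        end
    end
    # Assemble the CSC arrays from the buckets.
    colptr = ones(Int, n + 1)
    for j in 1:n
        colptr[j + 1] = colptr[j] + length(buckets[j])
    end
    rowval = [rv[1] for b in buckets for rv in b]
    nzval  = [rv[2] for b in buckets for rv in b]
    return colptr, rowval, nzval
end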

@mauro3
Contributor

mauro3 commented Oct 1, 2015

Here is a post on binary search vs. linear search: https://schani.wordpress.com/2010/04/30/linear-vs-binary-search/

@KristofferC
Member Author

Cool, I'll play around with some techniques there to see if I can get the search faster.

@JuliaLang JuliaLang locked and limited conversation to collaborators Oct 2, 2015
@JuliaLang JuliaLang unlocked this conversation Oct 2, 2015
@StefanKarpinski
Member

I have deleted @ScottPJones's comments and all of the back-and-forth that wasted everyone's time (@KristofferC's, primarily), and unlocked the issue so that we can make forward progress on it. @ScottPJones, you may not post here. If you do, you will be blocked from JuliaLang for a week.

@ViralBShah
Member

@KristofferC I used insertion-sort-based logic a long time back and switched to the current approach since it was faster.

@KristofferC
Member Author

We are pretty much on par with MATLAB now, so I don't think there is a need to keep this issue open.
