Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance of Linear Algebra operations of AbstractTriangular Sparse Matrices #28451

Closed
KlausC opened this issue Aug 5, 2018 · 4 comments
Closed

Comments

@KlausC
Copy link
Contributor

KlausC commented Aug 5, 2018

I looked with surprise at the benchmark results of left multiplication/division of some wrapped matrix types.
In the example all times could be as low as 1.7ms, because the algorithms for multiplication and division is rather similar for triangular matrices.
Actually, I observed: 1.6s for some multiplications (factor 1000) and 500ms for some multiplications and divisions (factor 300).
Only 6 of 24 cases had the expected 1-2ms (partially as a consequence of #28242).

I want to investigate the multiplication and division methods in order to avoid the fallback methods for LinearAlgebra.AbstractTriangular in the sparse case.
It is possible, that new multiplication algorithms analogous to the triangular solvers have to be coded.
The types Unit...Triangular also deserve special handling for * and \ .

  | | |_| | | | (_| |  |  Version 1.0.0-DEV.18 (2018-08-05 08:25 UTC)

julia> n = 10000;
julia> Random.seed!(0); A = sprandn(n, n, 0.01); nnz(A)
1000598
julia> A += I*2 - Diagonal(diag(A));
julia> B = ones(n);
julia> using Statistics
julia> using BenchmarkTools

julia> function benchtrwrappers(A::AbstractArray, B)
           for tr in (identity, adjoint, transpose)
               for wrapper in (UpperTriangular, LowerTriangular, UnitUpperTriangular, UnitLowerTriangular)
                   TRI = tr(wrapper(A))
                   for op in (*, \)
                       t = @benchmark $op($TRI, $B)
                       println("$tr($wrapper(A)) $op B: $(median(t))")
                   end
               end
           end
       end
benchtrwrappers (generic function with 1 method)


julia> benchtrwrappers(A, B)
identity(UpperTriangular(A)) * B: TrialEstimate(1.571 s)
identity(UpperTriangular(A)) \ B: TrialEstimate(1.807 ms)
identity(LowerTriangular(A)) * B: TrialEstimate(1.658 s)
identity(LowerTriangular(A)) \ B: TrialEstimate(1.697 ms)
identity(UnitUpperTriangular(A)) * B: TrialEstimate(1.646 s)
identity(UnitUpperTriangular(A)) \ B: TrialEstimate(561.277 ms)
identity(UnitLowerTriangular(A)) * B: TrialEstimate(1.623 s)
identity(UnitLowerTriangular(A)) \ B: TrialEstimate(522.930 ms)
adjoint(UpperTriangular(A)) * B: TrialEstimate(473.104 ms)
adjoint(UpperTriangular(A)) \ B: TrialEstimate(1.238 ms)
adjoint(LowerTriangular(A)) * B: TrialEstimate(507.524 ms)
adjoint(LowerTriangular(A)) \ B: TrialEstimate(1.367 ms)
adjoint(UnitUpperTriangular(A)) * B: TrialEstimate(459.261 ms)
adjoint(UnitUpperTriangular(A)) \ B: TrialEstimate(447.648 ms)
adjoint(UnitLowerTriangular(A)) * B: TrialEstimate(466.578 ms)
adjoint(UnitLowerTriangular(A)) \ B: TrialEstimate(464.726 ms)
transpose(UpperTriangular(A)) * B: TrialEstimate(533.262 ms)
transpose(UpperTriangular(A)) \ B: TrialEstimate(1.241 ms)
transpose(LowerTriangular(A)) * B: TrialEstimate(528.717 ms)
transpose(LowerTriangular(A)) \ B: TrialEstimate(1.352 ms)
transpose(UnitUpperTriangular(A)) * B: TrialEstimate(458.291 ms)
transpose(UnitUpperTriangular(A)) \ B: TrialEstimate(447.555 ms)
transpose(UnitLowerTriangular(A)) * B: TrialEstimate(468.688 ms)
transpose(UnitLowerTriangular(A)) \ B: TrialEstimate(465.549 ms)

julia> 
@andreasnoack
Copy link
Member

These cases would also be easy to detect if we could make scalar indexing fail during tests as mentioned in JuliaLang/LinearAlgebra.jl#545

@KlausC
Copy link
Contributor Author

KlausC commented Aug 6, 2018

I agree, if the objective is to avoid inefficient calls * and \ for wrapped triangular matrices.

Nevertheless, I would like to not to be forced to use ugly workarounds for those cases, but rather just use for example L' \ b, also if L happens to be a LinearAlgebra.UnitLowerTriangular.
I think X = sparse(tril(L)'; X += 1.0I - Diagonal(diag(L)); X \ b would be an appropriate workaround, which is not appealing at all.

Please wait a day or so for my next PR to resolve that.

@KlausC
Copy link
Contributor Author

KlausC commented Aug 7, 2018

The figures have drastically improved by the change in the last commit:

julia> benchtrwrappers(A, B)
identity(UpperTriangular(A)) * B: TrialEstimate(1.448 ms)
identity(UpperTriangular(A)) \ B: TrialEstimate(1.802 ms)
identity(LowerTriangular(A)) * B: TrialEstimate(1.451 ms)
identity(LowerTriangular(A)) \ B: TrialEstimate(1.639 ms)
identity(UnitUpperTriangular(A)) * B: TrialEstimate(1.409 ms)
identity(UnitUpperTriangular(A)) \ B: TrialEstimate(1.787 ms)
identity(UnitLowerTriangular(A)) * B: TrialEstimate(1.430 ms)
identity(UnitLowerTriangular(A)) \ B: TrialEstimate(1.613 ms)
adjoint(UpperTriangular(A)) * B: TrialEstimate(1.671 ms)
adjoint(UpperTriangular(A)) \ B: TrialEstimate(1.269 ms)
adjoint(LowerTriangular(A)) * B: TrialEstimate(1.707 ms)
adjoint(LowerTriangular(A)) \ B: TrialEstimate(1.324 ms)
adjoint(UnitUpperTriangular(A)) * B: TrialEstimate(1.654 ms)
adjoint(UnitUpperTriangular(A)) \ B: TrialEstimate(1.325 ms)
adjoint(UnitLowerTriangular(A)) * B: TrialEstimate(1.757 ms)
adjoint(UnitLowerTriangular(A)) \ B: TrialEstimate(1.412 ms)
transpose(UpperTriangular(A)) * B: TrialEstimate(1.718 ms)
transpose(UpperTriangular(A)) \ B: TrialEstimate(1.359 ms)
transpose(LowerTriangular(A)) * B: TrialEstimate(1.817 ms)
transpose(LowerTriangular(A)) \ B: TrialEstimate(1.470 ms)
transpose(UnitUpperTriangular(A)) * B: TrialEstimate(1.787 ms)
transpose(UnitUpperTriangular(A)) \ B: TrialEstimate(1.320 ms)
transpose(UnitLowerTriangular(A)) * B: TrialEstimate(1.752 ms)
transpose(UnitLowerTriangular(A)) \ B: TrialEstimate(1.419 ms)

julia> versioninfo()
Julia Version 1.0.0-DEV.43
Commit f7d9a8be97* (2018-08-06 07:42 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz

@KlausC
Copy link
Contributor Author

KlausC commented Aug 8, 2018

The recent version improved runtimes another time:

  | | |_| | | | (_| |  |  Version 1.0.0-rc1.5 (2018-08-08 14:53 UTC)
 _/ |\__'_|_|_|\__'_|  |  sparsearrays-triangular-operations/70c8b73b7c* (fork: 5 commits, 0 days)

julia> function testtrwrappers(A::AbstractArray, B)
           for tr in (identity, adjoint, transpose)
               for wrapper in (UpperTriangular, LowerTriangular, UnitUpperTriangular, UnitLowerTriangular)
                   TRI = tr(wrapper(A)); TRI2 = tr(wrapper(Matrix(A)))
                   for (op, opm) in ( (*, *), (\, \) )
                       r1 = op(TRI, copy(B))
                       r2 = opm(TRI2, B)
                       d = norm(r1 - r2, Inf)
                       println("$op($tr($wrapper)) $(norm(r1)) $(d / norm(r1, Inf))")
                       #@Test.test r1 ≈ r2
                   end
               end
           end
       end
testtrwrappers (generic function with 1 method)

julia> function benchtrwrappers(A::AbstractArray, B)
           for tr in (identity, adjoint, transpose)
               for wrapper in (UpperTriangular, LowerTriangular, UnitUpperTriangular, UnitLowerTriangular)
                   TRI = tr(wrapper(A))
                   for op in (*, \)
                       t = @benchmark $op($TRI, $copy(B))
                       println("$tr($wrapper(A)) $op B: $(median(t))")
                   end
               end
           end
       end
benchtrwrappers (generic function with 1 method)

julia> benchtrwrappers(A, B)
identity(UpperTriangular(A)) * B: TrialEstimate(1.344 ms)
identity(UpperTriangular(A)) \ B: TrialEstimate(1.523 ms)
identity(LowerTriangular(A)) * B: TrialEstimate(1.444 ms)
identity(LowerTriangular(A)) \ B: TrialEstimate(1.372 ms)
identity(UnitUpperTriangular(A)) * B: TrialEstimate(1.316 ms)
identity(UnitUpperTriangular(A)) \ B: TrialEstimate(1.549 ms)
identity(UnitLowerTriangular(A)) * B: TrialEstimate(1.411 ms)
identity(UnitLowerTriangular(A)) \ B: TrialEstimate(1.408 ms)
adjoint(UpperTriangular(A)) * B: TrialEstimate(1.554 ms)
adjoint(UpperTriangular(A)) \ B: TrialEstimate(1.210 ms)
adjoint(LowerTriangular(A)) * B: TrialEstimate(1.625 ms)
adjoint(LowerTriangular(A)) \ B: TrialEstimate(1.309 ms)
adjoint(UnitUpperTriangular(A)) * B: TrialEstimate(1.532 ms)
adjoint(UnitUpperTriangular(A)) \ B: TrialEstimate(1.197 ms)
adjoint(UnitLowerTriangular(A)) * B: TrialEstimate(1.602 ms)
adjoint(UnitLowerTriangular(A)) \ B: TrialEstimate(1.304 ms)
transpose(UpperTriangular(A)) * B: TrialEstimate(1.560 ms)
transpose(UpperTriangular(A)) \ B: TrialEstimate(1.204 ms)
transpose(LowerTriangular(A)) * B: TrialEstimate(1.621 ms)
transpose(LowerTriangular(A)) \ B: TrialEstimate(1.309 ms)
transpose(UnitUpperTriangular(A)) * B: TrialEstimate(1.531 ms)
transpose(UnitUpperTriangular(A)) \ B: TrialEstimate(1.199 ms)
transpose(UnitLowerTriangular(A)) * B: TrialEstimate(1.600 ms)
transpose(UnitLowerTriangular(A)) \ B: TrialEstimate(1.306 ms)

mcognetta added a commit to mcognetta/julia that referenced this issue Sep 15, 2018
…These should go in another PR so this one can be merged more quickly.

Revert "added sparse multiplication and division for triangular matrices. Fix JuliaLang#28451"

This reverts commit 11c1d1d.
mcognetta added a commit to mcognetta/julia that referenced this issue Sep 16, 2018
andreasnoack pushed a commit that referenced this issue Dec 11, 2018
* added sparse multiplication and division for triangular matrices. Fix #28451

* merge with master

* merge with master 2

* fixed symtridiagonal + bidiagonal

* improved find diagonal part

* refactored to purge name space of SparseArrays

* additional test cases and bug fix

* specializing some structured matrix operations

* added constructors for Triangular(::Diagonal). Removed redundant code from binops of special.jl so that broadcasting takes over. Cleaned up some of the tests for special.jl

* fix whitespace

* actually fixed whitespace

* fixed a typo in Diagonal*Bi/Tridiag. Changed the multiplication methods to more explicit constructors so that matrices with BigFloat dont error

* fixed bidiag+/-diag speed regression

* fixed +/- regressions for the other structured matrix types (bidiag, tridiag, symtridiag, diag)

* Revert "merged with master"

This reverts commit 3a58908, reversing
changes made to 0facd1d.

* Removing the speedups for sparse matrix multiplication and division. These should go in another PR so this one can be merged more quickly.

Revert "added sparse multiplication and division for triangular matrices. Fix #28451"

This reverts commit 11c1d1d.

* Revert "additional test cases and bug fix"

This reverts commit 21592db.

* reverting sparse changes

* removing extra whitespace and comments

* fixing BiTriSym*BiTriSym sparse eltype

* fixing the cases where we have two structured matrices and the resulting diagonals are of different types. This still fails when the representation is a range and we get a step size of 0

* Fixes the issue where we try to add structured matrices and one has an eltype <: AbstractArray

See PR 27289

* remove adjoint and transpose methods that I never changed

* fixing tridiagonal constructor to save time/memory

* fixing bidiag * diag return type

* adding multiplication to binops tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants