more methods for Hessenberg factorizations #625

stevengj · 2019-04-16T15:17:23Z

I just noticed that the LinearAlgebra package only supports a very small number of specialized functions for Hessenberg factorization objects (#7965), which limits their practical utility. Some things that would be nice to have:

Fast H \ b: this can be done in O(n²) operations. Currently it gives a MethodError. Actually, as discussed below, we can even support fast (H-µI) \ b, where different shifts µ can be used without making copies. (H+μI) \ x solvers for Hessenberg factorizations julia#31853
Fast QR factorization.
Fast eigenvalues etc.
H*scalar and H+µ*I should return another Hessenberg object; currently these give a MethodError. Ideally, as discussed below, the shift µ should be stored in the Hessenberg object itself, so that multiple shifts can be applied without allocating a whole new matrix. (H+μI) \ x solvers for Hessenberg factorizations julia#31853
determinant (and logdet etc) of H+µ*I in O(n²) operations, e.g. using theorem 2.1 of this paper or maybe using Givens rotations to make H upper-triangular (though avoiding copying the whole matrix would be good). #31853
hessenberg(::Hermitian) or real-symmetric — in this case the Hessenberg matrix is tridiagonal with associated speedups. (H+μI) \ x solvers for Hessenberg factorizations julia#31853

This came up on discourse, where someone wanted to solve (A - λ*I) \ b problems repeatedly for many different λ — a nice way to do this is via a Hessenberg factorization of A = QHQ' since A - λ*I = Q(H-λ*I)Q' and subsequent solves are O(n²). You could also use Schur for this sort of thing, but computing the Schur factorization is a lot more expensive (≈7×).

The text was updated successfully, but these errors were encountered:

mfalt · 2019-04-23T14:32:08Z

I could try to implement these features. AFAIK H\b is usually solved as in
https://github.com/JuliaLinearAlgebra/GenericLinearAlgebra.jl/blob/master/src/eigenGeneral.jl#L28
using some simple iterations like

"""Assuming Hessenberg H"""
function ldiv!(H, rhs)
    n = size(H,1)
    for i = 1:n-1
        Gi, r = givens(H, i, i+1, i)
        lmul!(Gi, H)  # Note: slow indexing
        lmul!(Gi, rhs)
    end
    # H is now upper triangular
    ldiv!(UpperTriangular(H), rhs)
    return rhs
end

However, the operation lmul!(Gi, H) modifies two rows of H, which can be quite slow compared to working on the columns, there are ways to get around this but they are quite ugly, for example:

"""Assuming Hessenberg H"""
function ldiv!(H::AbstractMatrix{T}, rhs) where {T}
    n = size(H,1)
    Ht = H'
    rotations = LinearAlgebra.Givens{T}[] # Can be replaced with LinearAlgebra.Rotation
    for i = n:-1:2
        Gi, r = givens(Ht, i, i-1, i)
        push!(rotations, Gi')
        rmul!(H, Gi')
    end
    # H is now upper triangular
    ldiv!(UpperTriangular(H), rhs)
    # Should be possible to do lmul!(rotation::Rotation, rhs) here
    for i in n-1:-1:1
        lmul!(rotations[i], rhs)
    end
    return rhs
end

where it can be cleaned up a bit if
https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/givens.jl#L368
is relaxed to accept vectors (Rotation*Vector currently throws error).
This second version is a bit faster, for the cost of readability, it uses the fact that

Hx=b <==>
H(G1*G2*...Gn)*(Gn'*Gn-1'*...*G1')x=b <==>
(Gn'*Gn-1'*...*G1')x = (H*(G1*G2*...Gn))\b
x = (Gn'*Gn-1'*...*G1')*(H*(G1*G2*...Gn))\b

Where the G1,G2,...Gn that are needed to make H*(G1*G2*...Gn) upper triangular are calculated as the transpose of givens on the transpose of H, starting with the last column, instead of first row.
Is it worthwhile using this approach or is there a good standard way of making the original version efficient from a memory-layout perspective?

Edit: the original discussion is found here https://discourse.julialang.org/t/how-to-optimize-simple-matrix-equation-solving/23035/16

stevengj · 2019-04-23T15:42:46Z

I'm not sure that Givens QR is state-of-the art for H \ b here. In particular, this paper says it is possible to combine the triangularization with the backsolve into a single pass over the array (without modifying H or allocating any temporary matrix).

Also, since it is extremely common to use Hessenberg factorizations with a shift H - λ*I, it might be nice to build this into the Hessenberg type so that it can be done without allocating a new matrix:

add a λ field into the Hessenberg type which defaults to zero
define ±(H::Hessenberg, D::UniformScaling) = Hessenberg(H, H.λ ± D.λ)
incorporate the shift λ into the algorithm for ldiv! etcetera.

stevengj · 2019-04-25T21:01:17Z

@kshyatt, I removed the "good first issue" tag because implementing these algorithms well is somewhat nontrivial…

stevengj · 2019-04-26T23:01:31Z

For reference, here is a first implementation of the Henry algorithm; it seems to be about 10x faster than the routine by @mfalt above (and allocates much less memory) for a 1000x1000 matrix:

# solve (H - µI)x =b, with output written in-place in b
function ldiv_H!(F::Hessenberg, b::AbstractVector, μ=0)
    n = size(F.factors, 1)
    n != length(b) && throw(DimensionMismatch("wrong right-hand-side length != $n"))
    H = F.factors
    u = H[:,n] # temporary vector
    u[n] -= μ
    x = b # not a copy, just rename to match paper
    cs = Vector{Tuple{eltype(u),eltype(u)}}(undef, length(u)) # store Givens rotations
    @inbounds for k = n:-1:2
        Φ, ρ = givens(u[k], H[k,k-1], 1,2)
        cs[k] = c, s = Φ.c, Φ.s
        x[k] /= ρ
        t₁ = s * x[k]; t₂ = c * x[k]
        @simd for j = 1:k-2
            x[j] -= u[j]*t₂ + H[j,k-1]*t₁
            u[j] = H[j,k-1]*c - u[j]*s'
        end
        x[k-1] -= u[k-1]*t₂ + (H[k-1,k-1] - μ) * t₁
        u[k-1] = (H[k-1,k-1] - μ) * c - u[k-1]*s'
    end
    τ₁ = x[1] / u[1]
    @inbounds for k = 2:n
        τ₂ = x[k]
        c, s = cs[k]
        x[k-1] = c*τ₁ + s*τ₂
        τ₁ = c*τ₂ - s'τ₁
    end
    x[n] = τ₁
    return x
end

olof3 · 2019-12-11T12:17:24Z

Thanks, nice PR! I've been looking at how to use the functionality for computing frequency responses of linear systems. It seems to work nicely, so these are just a few minor things.

Q \ B is defined for HessenbergQ but it does not exploit the structure of Q and seems to be surprisingly slow, the equivalent Q' * B is fast as expected. Being able to use \ would perhaps make the code more natural and avoid the need to remember that Q is unitary.

\ is currently not defined for UpperHessenberg

Calling rdiv!(B, H; shift=s) with s=Inf does not return. Perhaps it would make sense to return a zero vector if A and B are finite? Or just throw an error?

The hessenbergQ function does not seem to be type stable?

using Test
F = hessenberg(randn(4,4))
@inferred getproperty(F, :Q)

gives
return type LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false} does not match inferred return type Any error(::String) at error.jl:33 top-level scope at experiment_freqresp.jl:104

andreasnoack · 2019-12-11T12:32:20Z

It looks like our reflection functions don't agree on this

julia> @inferred LinearAlgebra.HessenbergQ(F)
ERROR: return type LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false} does not match inferred return type LinearAlgebra.HessenbergQ{_A,Array{Float64,2},Array{Float64,1},false} where _A
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope at REPL[21]:1

julia> @code_warntype LinearAlgebra.HessenbergQ(F)
Variables
  #self#::Type{LinearAlgebra.HessenbergQ}
  F::Hessenberg{Float64,UpperHessenberg{Float64,Array{Float64,2}},Array{Float64,2},Array{Float64,1},Bool}

Body::LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}
1 ─ %1 = Base.getproperty(F, :factors)::Array{Float64,2}
│   %2 = LinearAlgebra.eltype(%1)::Core.Compiler.Const(Float64, false)
│   %3 = $(Expr(:static_parameter, 1))::Core.Compiler.Const(Array{Float64,2}, false)
│   %4 = $(Expr(:static_parameter, 2))::Core.Compiler.Const(Array{Float64,1}, false)
│   %5 = Core.apply_type(LinearAlgebra.HessenbergQ, %2, %3, %4, false)::Core.Compiler.Const(LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}, false)
│   %6 = Base.getproperty(F, :uplo)::Char
│   %7 = Base.getproperty(F, :factors)::Array{Float64,2}
│   %8 = Base.getproperty(F, :τ)::Array{Float64,1}
│   %9 = (%5)(%6, %7, %8)::LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}
└──      return %9

olof3 · 2019-12-11T13:08:40Z

Ah, my bad, it is this thing.

andreasnoack · 2019-12-11T13:19:01Z

No. It's not #12. I think it might just be an issue with @inferred.

stevengj · 2019-12-11T13:23:22Z

@olof3, PRs to fix any of those issues would be welcome. In any case, feel free to file a new issue.

olof3 · 2019-12-11T13:39:40Z

@andreasnoack I think it is better if you submit an issue if you think that it is warranted, I know too little about how it is supposed to work.

@stevengj I'll see what I can do.

stevengj · 2021-06-29T12:34:22Z

Closing this issue since at this point we should file more specific issues for remaining functionality.

kshyatt added the good first issue Good for newcomers label Apr 25, 2019

stevengj removed the good first issue Good for newcomers label Apr 25, 2019

stevengj mentioned this issue Apr 27, 2019

(H+μI) \ x solvers for Hessenberg factorizations JuliaLang/julia#31853

Merged

7 tasks

olof3 mentioned this issue Nov 26, 2024

rdiv!(B, F::UpperHessenberg; shift::Number) does not return if shift=Inf #687

Open

stevengj closed this as completed Jun 29, 2021

stevengj mentioned this issue Jun 29, 2021

eigensolvers for Hessenberg matrices #858

Open

KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

more methods for Hessenberg factorizations #625

more methods for Hessenberg factorizations #625

stevengj commented Apr 16, 2019 •

edited

Loading

mfalt commented Apr 23, 2019 •

edited

Loading

stevengj commented Apr 23, 2019 •

edited

Loading

stevengj commented Apr 25, 2019

stevengj commented Apr 26, 2019 •

edited

Loading

olof3 commented Dec 11, 2019

andreasnoack commented Dec 11, 2019

olof3 commented Dec 11, 2019

andreasnoack commented Dec 11, 2019

stevengj commented Dec 11, 2019

olof3 commented Dec 11, 2019

stevengj commented Jun 29, 2021

more methods for Hessenberg factorizations #625

more methods for Hessenberg factorizations #625

Comments

stevengj commented Apr 16, 2019 • edited Loading

mfalt commented Apr 23, 2019 • edited Loading

stevengj commented Apr 23, 2019 • edited Loading

stevengj commented Apr 25, 2019

stevengj commented Apr 26, 2019 • edited Loading

olof3 commented Dec 11, 2019

andreasnoack commented Dec 11, 2019

olof3 commented Dec 11, 2019

andreasnoack commented Dec 11, 2019

stevengj commented Dec 11, 2019

olof3 commented Dec 11, 2019

stevengj commented Jun 29, 2021

stevengj commented Apr 16, 2019 •

edited

Loading

mfalt commented Apr 23, 2019 •

edited

Loading

stevengj commented Apr 23, 2019 •

edited

Loading

stevengj commented Apr 26, 2019 •

edited

Loading