-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sparse matrix support #38
Comments
A non-OO implementation could be an easier start indeed.
In my field, entries are added to a sparse matrix in a quite random way. Therefore, the indices are far to be sorted, even in a row. Therefore, I use |
@jvdp1 eventually we can support all the formats like SciPy does. It's quite a bit of work, and so for my own work I just did COO and CSR, since I didn't have time to implement the others, but for |
For sparse matrix support I suggest you look at the PSBLAS3 project of Filippone and Buttari. Its Object-Oriented Fortran 2003/2008. Go to https://github.com/sfilippone/psblas3 I also recommend you read the papers listed there for ideas about implementing sparse matrix support in OO Fortran and the various "design patterns" associated with sparse matrices. I like there approach to supporting different storage types. COO is used as input and the "bridge" type to all other storage formats. You only write converters to/from COO and your chosen storage format. Reduces the explosion of code trying to go directly from say CRS and other formats. For a "simple" sparse grid implementation, I use an approach similar to Java Sparse Arrays which is sorta like a CRS. I find it makes Matvecs simple |
@rweed Thanks for the suggestion! I agree, and have been using PSBLAS in a number of projects for work. I added the suggestion to the list of prior art at the top. |
Also, I'm not really sure where this class of sparse linear-algebra routines would belong, but I often see very large tridiagonal, and pentadiagonal systems. To my (admittedly limited) knowledge, there are not many good parallel algorithms for the solutions of these systems and there are not parallel algorithms in extant linear algebra libraries. The best/most promising algorithm that I know of is the SPIKE algorithm, but I've never seen a library that included it. I have a half-decent version specialized for tridiagonal matrices, and would love to see this and any other approaches implemented in the linear algebra portion of the stdlib. Should I open a new issue for this, or does it fit within the general sparse matrix support category? |
As the author of PSBLAS, I am very much aware of the need to sort indices,
and this is taken care *internally* in my library, so that there is no
burden on the user.
And no, quicksort is not the best choice. I normally use a special version
of merge sort, adapted from TAOCP.
Hope this helps
Salvatore
…On Thu, Jan 2, 2020 at 5:04 PM Ondřej Čertík ***@***.***> wrote:
Prior art:
- SciPy: https://docs.scipy.org/doc/scipy/reference/sparse.html
- Matlab: https://www.mathworks.com/help/matlab/ref/sparse.html
- Julia:
https://docs.julialang.org/en/v1/stdlib/SparseArrays/index.html
- @certik <https://github.com/certik>:
https://github.com/certik/hfsolver/blob/b4c50c1979fb7e468b1852b144ba756f5a51788d/src/sparse.f90
- @jvdp1 <https://github.com/jvdp1>: https://github.com/jvdp1/libsparse
- @sfilippone <https://github.com/sfilippone>:
https://github.com/sfilippone/psblas3
There are probably many more implementations. I really like the SciPy
simple non-OO implementation, and I have ported it to modern Fortran in the
link above. If people also want an OO implementation, then it can be build
on top as an option.
One thing that I found out is that one must sort the indices
<https://github.com/certik/hfsolver/blob/b4c50c1979fb7e468b1852b144ba756f5a51788d/src/sparse.f90#L185>
and the overall speed very much depends on how quickly one can sort it. I
ended up using quicksort, but it might be even faster to use some
specialized sorting algorithm (such as Timsort
<https://en.wikipedia.org/wiki/Timsort>) because in practice, the indices
have subsections that are already sorted (typically coming from some local
to global mapping as in finite elements), but overall it is not sorted.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38?email_source=notifications&email_token=AD274TY6ANZGMIDAYNGDX5DQ3YNANA5CNFSM4J6NGRLKYY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4ICGREGQ>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD274TZUQE4LA53J5SW6ASTQ3YNANANCNFSM4J6NGRLA>
.
|
Thanks @sfilippone for the feedback! Indeed, I also suspect quicksort is not the best choice. @zbeekman, I would start with COO, CSR that we all agree we need. Then as we are figuring out the API, let's think how tridiagonal (Lapack has those) and pentadiagonal systems fit in. Also, initially I was thinking of sticking to serial implementations. We'll have to figure out how to best tackle parallel algorithms in stdlib, but I think that's an issue on its own, so I created #66 for it. |
@certik good idea. Do you happen to know the name of the Lapack procedure for diagonal matrices? I remember looking for a good serial implementation too and not finding it, but perhaps I was only focused on parallel implementations. |
@zbeekman yes, I have used |
I don't know if this could be useful, but in FEMPAR project we have implemented an OO extendible sparse matrix based on Filipone PSBLAS sparse matrix. It's complicated ... but performs quite well. |
Pentadiagonal matrices can be treated with the banded solvers in LAPACK, but they require a special storage scheme. I have some simple examples available:
The ancient Yale sparse matrix package has routines for ordering and solving both systems of symmetric and asymmetric sparse systems of equations. I don't know if the algorithms they used were any good though. A few other Fortran sparse matrix libraries include:
|
@certik What is the status of this discussion? From the previous posts I understand that there is a number of related projects, some discussion about technical aspects but it is not clear to me if there is a work started to create this library in stdlib (or even have people decided that this proposal for a sparse matrix library has passed step 1 of the Workflow in workflow.md). Sorry for what I am missing. |
@ghwilliams the status of the discussion at this issue as I understand it is that there is a general agreement that we want this, and so now we need to start implementing it, say the COO / CSR subset to get started, then submit a PR with a draft of a specification and then people will comment if they like it or not. It should be designed taking into account the discussion above and all the other libraries as linked at the top post. If you want to go ahead to take a lead on this one, that would be great. I'll help out as much as I can. |
Ok @certik . I will try to direct the discussion toward more concrete conclusions. So I will follow the workglow.md guide line and move on to Step 2. Defining an API. regards |
@ghwilliams I agree, let's move! Good to go forward with step 2 (API) as far as I'm concerned. I'm happy you're here. |
Before trying to reinvent the wheel check out PSBLAS 3 which is a parallel Fortran 2003 OOP |
The psblas software can now be found at
https://github.com/sfilippone/psblas3
The design is still being tweaked, it is very stable as far as the "serial"
(or, intra-node) part is concerned.
For the MPI part, I have some ideas in mind in terms of improvements of the
API, which I plan to tackle later on.
Any suggestions and discussions are welcome.
Salvatore
…On Fri, 15 May 2020, 21:45 rweed, ***@***.***> wrote:
Before trying to reinvent the wheel check out PSBLAS 3 which is a parallel
Fortran 2003 OOP
implementation of sparse matrixes (see
http://people.uniroma2.it/salvatore.filippone/psblas and the
selected publications listed).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD274T7E56VMBU5PQGXGSZTRRWLVLANCNFSM4J6NGRLA>
.
|
In particular the current development branch introduces a very important
feature: I have four different integer KINDs
psb_mpk_ <= psb_ipk_ <= psb_lpk_ <= psb_epk_
where:
psb_mpk_ is always 4 bytes;
psb_ipk_ is the integer kind of indices local to a process
psb_lpk_ is the integer kind of global indices
psb_epk_ is always 8 bytes,
and the size of IPK and LPK is chosen at configure time.
S.
On Fri, May 15, 2020 at 9:52 PM Salvatore Filippone <
[email protected]> wrote:
… The psblas software can now be found at
https://github.com/sfilippone/psblas3
The design is still being tweaked, it is very stable as far as the
"serial" (or, intra-node) part is concerned.
For the MPI part, I have some ideas in mind in terms of improvements of
the API, which I plan to tackle later on.
Any suggestions and discussions are welcome.
Salvatore
On Fri, 15 May 2020, 21:45 rweed, ***@***.***> wrote:
> Before trying to reinvent the wheel check out PSBLAS 3 which is a
> parallel Fortran 2003 OOP
> implementation of sparse matrixes (see
> http://people.uniroma2.it/salvatore.filippone/psblas and the
> selected publications listed).
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#38 (comment)>,
> or unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AD274T7E56VMBU5PQGXGSZTRRWLVLANCNFSM4J6NGRLA>
> .
>
|
I'm going to keep a document for the specification of the API. I will collect all the suggestions posted here for the API. In a first round I will include many suggestions for data structures/algorithms in the previous posts, also taking the other libraries as a reference. After some time I will propose to filter this list to get a final list of features to go in a version 1 of the API. Just as a side note, initially I'm planning to keep Object Oriented solutions out and focus on a clean, procedural design. The reason for that is to notenter an infinite loop of discussions about the pros and cons of one design or another. Let's choose one and move on. I plan to write the API document using MS Word but if anyone has another suggestion for writing the text (latex, notepad, whatever), for me it is ok, just let me know. Williams |
In general most people agree to do both, have a low level procedural (no side effects) design, and then have an optional higher level OO design. I would recommend to start with a serial design for COO and CSR matrices and let's start the discussion on the API for those. I updated all the Fortran implementations in the comment above that were suggested so far. Most of them are either fully OO, or partially OO. The only procedural implementations that I found are:
So those three can serve as an inspiration for the low level API. The rest of the packages can serve as an inspiration for the OO API. |
Ok @certik, the representations of the sparse matrices are certainly in the core of this work, I will start writing some text about the objectives of the API, its ambition and philosophy (e.g. to have a minimum number of functions/subroutines instead of being extensive as I envision now) and in a more technical section I will start writing about these two formats to start with. |
@ghwilliams perfect, looking forward! |
Hello everybody. Of course any suggestions, critiques are welcome. |
Thanks @ghwilliams for starting it. Would it be ok to use Markdown and put the document on our wiki? That way others can easily edit and it's easier to view, and later it will be easier to create the specification document, which also uses Markdown. Regarding indexing, in Fortran the standard is to start from 1, so that's what I would recommend. |
Of course, having the document in the wiki is much better just let me know how to do that. |
@ghwilliams in #189 we have to discuss as a community if this is an API that we would like to use, or if not, how to improve it. I submitted the code so that we have something more concrete to discuss and so that people can compile and run it and play with it. |
Allright, how #189 then relates to this document that Im creating? Because in my document we are going to propose an API with the same purpose right? |
@ghwilliams yes, exactly. The PR #189 provides code implementation of an API that you are writing the specs for in the document. In order to efficiently discuss the API, it is helpful to have code that implements it, so that we can play with it. |
Ok I will continue then writing the API document in the wiki. It is good to keep track of all the posts here related to it. |
Absolutely! Thank you for your help.
…On Sat, May 16, 2020, at 9:54 AM, Williams A. Lima wrote:
Ok I will continue then writing the API document in the wiki. It is
good to keep track of all the posts here related to it.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAFAWA5FXQBGYVC2NFIXE3RR2ZLHANCNFSM4J6NGRLA>.
|
I created the wiki page with all the information in the previous Word file. But there are some formatting issues to be corrected. I've never used Markdown before and I don't know about its capabilities to render math. |
I would like to suggest that as policy COO will be used as the bridge format for implementing new sparse formats, for initializing an existing format and conversion to/from other formats. I believe |
Yes, that's exactly how I proposed it in the PR.
…On Sat, May 16, 2020, at 10:25 AM, rweed wrote:
I would like to suggest that as policy COO will be used as the bridge
format for implementing new sparse formats, for initializing an
existing format and conversion to/from other formats. I believe
IMSL does this for their sparse matrix support and if I remember
correctly so does PSBLAS. The
advantage of this is people implementing a new format only have to
write code to go to/from COO
(or whatever format is selected) instead of having to support direct
conversion to all the other formats that might eventually be
impemented. I'm guessing this is standard practice in most other sparse
matrix packages.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAFAWAIX5BXZTITAHRXU2DRR246XANCNFSM4J6NGRLA>.
|
Yes SPARSKIT does the same. I will add some text about it. |
@certik I would like to include an image in the wiki page. Clicking the wiki edit tool asks me for a url but the file is in my local computer. The alternative, I think, is to have the image file uploaded to the repository but I don't have write access to it, I guess. |
We'll give you access, once I get to a computer. For now you can upload the images, e.g., to a gist:
https://remarkablemark.org/blog/2016/06/16/how-to-add-image-to-gist/
And link them from the wiki.
…On Sat, May 16, 2020, at 10:49 AM, Williams A. Lima wrote:
@certik <https://github.com/certik> I would like to include an image in
the wiki page. Clicking the wiki edit tool asks me for a url but the
file is in my local computer. The alternative, I think, is to have the
image file uploaded to the repository but I don't have write access to
it, I guess.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAAFAWGD4LGT2LH3NUUKPBTRR27YNANCNFSM4J6NGRLA>.
|
The COO format is the most flexible when it comes to shuffling and
transforming data, including:
1. Having multiple coefficients with the same indices (very common in the
matrix construction phase in finite element applications)
2. Applying index transformations (e.g.: transposing)
3. Moving from a row-oriented to a column-oriented ordering.
You will need at least two flags, one flag that declares a sorting has been
applied, another denoting whether it's row- or column-oriented
And you will need to define a processing strategy in dealing with repeated
entries.
On the other hand, it is practically always slower when it comes to
actually applying the matrix operator e.g. in a matrix-vector product.
Salvatore
…On Sat, May 16, 2020 at 6:32 PM Williams A. Lima ***@***.***> wrote:
Yes SPARSKIT does the same. I will add some text about it.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD274T5AYAS3V445TUAHIULRR25YZANCNFSM4J6NGRLA>
.
|
I added (wiki page) some ideas for the relation between the storage format and the routines in the API. |
I don't think it has been linked previously, but on netlib there is a folder of templates for sparse solvers: Section 4.3.1 of the documentation provides a survey of sparse storage formats. |
I know that book. If you can look at the wiki and leave some comments it will be greatly appreciated. Thanks. |
PSBLAS is OO when it comes to the sparse matrices (all storage formats are
derived from a base type) and uses the STATE design pattern (a sparse
matrix is a two-layered object) to allow an individual matrix object to
change storage format at runtime.
The sorting routines are procedural but with compile-time generics
resolution, for performance.
And please, please, please: DO NOT default to quicksort. Please. It's just
not suitable for sorting the kind of distributions you find in the indices
of sparse matrices (multiple repetitions, small range of values with
respect to number of indices, multiple sorted subsequences etc.)
My implementation of mergesort is (in my quite long experience) better in
this context.
…On Fri, May 15, 2020 at 11:26 PM Ondřej Čertík ***@***.***> wrote:
In general most people agree to do both, have a low level procedural (no
side effects) design, and then have an optional higher level OO design.
I would recommend to start with a serial design for COO and CSR matrices
and let's start the discussion on the API for those.
I updated all the Fortran implementations in the comment above that were
suggested so far.
Most of them are either fully OO, or partially OO. The only procedural
implementations that I found are:
- hfsolver (mine)
- Yale
- psblas3 (only some parts, e.g. sorting seems to be implemented in a
procedural way)
So those three can serve as an inspiration for the low level API. The rest
of the packages can serve as an inspiration for the OO API.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD274TZASNDALFU7OHBTCPDRRWXPDANCNFSM4J6NGRLA>
.
|
Thanks @sfilippone . |
Today I found this interesting work: https://www.math.uzh.ch/pages/spam/ |
@sfilippone thanks for the information. As I wrote in #189, I also suggest we do not default to quicksort. |
I've been editing the wiki (https://github.com/fortran-lang/stdlib/wiki/Stdlib-Sparse-matrix-API) and I would like to have some feedback to know if I am in the right direction. Any comments are welcome. |
@ghwilliams thanks a lot for creating the page and suggesting an API. My main feedback is that it looks like this would be the high level API -- or middle level, depending on how we want to view it, but not the low level API. Because it uses a derived type to store the CSR matrix, which I also ended up doing in my application, but I suggest for the low level API to not do that, unless absolutely necessary (which it isn't in this case), because you are then forcing your derived type on all applications. As an example, I would not be able to just use a routine from stdlib in my code, because I already use a different derived type for the CSR matrix. In the same way, unless PSBLAS moves to this derived type, it would not be able to use the subroutines either. So for that reason, I propose for the low level API to be roughly in the sense of #189, and then the high level API roughly what you wrote in your wiki, or what PSBLAS and other libraries are using. |
Thanks @certik . I will work on your suggestions. |
I separated the design now in low and middle level parts. Low level subroutines take plain arrays of integers/real/complex values for the elements in the matrix representation. |
Some amount of data duplication is unavoidable unless you force the user to
stick to one and only one data storage format, which is not practical.
To make the best of the situation you may want to define data conversion
routines in two flavours, CPY and MOV.
S.
…On Wed, May 20, 2020 at 12:21 PM Williams A. Lima ***@***.***> wrote:
I separated the design now in low and middle level parts. Low level
subroutines take plain arrays of integers/real/complex values for the
elements in the matrix representation.
But I would like to discuss here one particular aspect of the design of
this API. It was suggested, and I'm following this suggestion, to have a
preferred matrix format against with all the API core subroutines would be
written and then have conversion routines for the other formats. The
drawback of this approach, as far as I can see now, is that the data will
be duplicated requiring, broadly speaking, two times RAM during the API
calls.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD274T2TTNS2FTRGVHKQB4TRSOVK7ANCNFSM4J6NGRLA>
.
|
@sfilippone I will look at psblas (and other libs) to see how this problem is approached. |
Your code can
Be quick to write and compact
Be very fast
Require the minimum possible amount of memory
but you only get to pick two...
…On Thu, May 21, 2020 at 12:48 PM Williams A. Lima ***@***.***> wrote:
@sfilippone <https://github.com/sfilippone> I will look at psblas (and
other libs) to see how this problem is approached.
I don't like the idea of duplicating data/code so I will try to find a
good solution that will not blamed by users regarding its performance nor
by developers for having to write many versions of a routine for each data
representation..
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#38 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD274T6KLQAFCJFSRERA7FTRSUBIFANCNFSM4J6NGRLA>
.
|
Hi @certik and @jvdp1 . Thanks @ivan-pi for making me aware of the effort. Said that, seeing that the discussion is taking time, please let me share a bit of my experience, too. Concerning the API:
Whatever you end up choosing for the most external API later, I recommend caution with respect to the internals, especially if you start an effort from scratch (see above @certik @sfilippone 's comments about sorting :-) ). Staying flexible with respect to the internals, perhaps engineering a good API+wrappers code generator that would interface with an existing library is maybe the best thing. I'm not discouraging from offering a sane COO/CSR interface to the user -- but maybe joining forces with one or more existing projects, maybe even reusing the internals, may help getting something usable very quickly. And maybe, kick more library implementors in making their libraries substitutable one-to-one for Michele |
Thank you @michelemartone for your comments.
I mostly agree with all your comments. At this stage, I believe that a good (low-level) API is the first thing to define. And the sparse BLAS API might be a good start. |
Prior art in other languages:
In Fortran (I keep this list updated with all implementations posted in this issue):
I really like the SciPy simple non-OO implementation, and I have ported it to modern Fortran in the link above. If people also want an OO implementation, then it can be build on top as an option.
One thing that I found out is that one must sort the indices and the overall speed very much depends on how quickly one can sort it. I ended up using
quicksort
, but it might be even faster to use some specialized sorting algorithm (such as Timsort) because in practice, the indices have subsections that are already sorted (typically coming from some local to global mapping as in finite elements), but overall it is not sorted.The text was updated successfully, but these errors were encountered: