Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start the addition of the module stdlib_sorting #386

Closed
wants to merge 18 commits into from

Conversation

wclodius2
Copy link
Contributor

Added the draft markdown documentation file, stdlib_sorting.md.

[ticket: X]

Added the draft markdown documentation file, stdlib_sorting.md.

[ticket: X]
Removed trailing whitespace from the markdown document, stdlib_sorting.md.

[ticket: X]
Added a reference to stdlib_sorting.md to index.md

[ticket: X]
Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice additions. I added a few comments. I am wondering if the subroutines 'ord_sort' and unord_sort should include an option to sort in a decreasing order.


The `int_size` parameter is used to specify the kind of integer used
in indexing the various arrays. Currently the module sets `int_size`
to the value of `int64` from the intrinsic `ISO_FORTRAN_ENV` module.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should stdlib_kinds be used, instead of iso_fortran_env?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I use stdlib_kinds, but I thought the casual user would be more familiar with iso_fortran_env.

Comment on lines 66 to 82
The [`rust sort` implementation]
(https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs)
is distributed with the header:

Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
file at the top-level directory of this distribution and at
http://rust-lang.org/COPYRIGHT.

Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
<LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
option. This file may not be copied, modified, or distributed
except according to those terms.

so the license for the original code is compatible with the use of
modified versions of the code in the Fortran Standard Library under
the MIT license.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this section be mentioned here, or in a separate section (e.g. entitled "Licences of original codes")?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate section seems appropriate.

...
```

#### `ord_sorting` - creates an arry of sorting indices for an input array.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#### `ord_sorting` - creates an arry of sorting indices for an input array.
#### `ord_sorting` - creates an array of sorting indices for an input array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Comment on lines 47 to 49
* `ORD_SORTING` is intended to provide indices for sorting arrays of
derived type data, based on the ordering of an intrinsic component
of the derived type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be used for sorting other things than derived type data.
E.g. to sort an array following one of its column.

real :: a(:,:)
....
call ord_sorting(a(:, 2), index)

a(:,:) = a(index, :)

Am I correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I'll see how I can reword this.

Comment on lines 284 to 289
`array`: shall be a rank one array of any of the types:
`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`,
`real(real32)`, `real(real64)`, or `real(real128)`. It is an
`intent(inout)` argument. On input it will be an array whose sorting
indices are to be determined. On return it will be the sorted
array.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it should be an intent(in). If it is an intent(in) it could be used as follows:

...
real :: a(:, :)
...
cal ord_sorting(a(:, 3), index)
a(:,:) = a(index, ;)

As the API is now defined, such an approach is not possible.
Furthermore, I guess that getting the array sorted is not important when working with derived types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had it as intent(in), but then I needed an additional scratch array of the same size as array to store a copy of the original array during all the sorting. The scratch array could either be an allocatable (which makes the code more vulnerable to exceeding the stack size limits), or an optional intent(inout) argument. Users can still do

    ...
    real :: a(:, :), b(:)
    ...
    b(:) = a(:,3)
    call ord_sorting(b(:), index)
    a(:,:) = a(index, ;) ! Shouldn't this actually be a do loop?
    ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. In all cases, an additional array is needed. Therefore, it is good for me as is.

Added a licnsing section. Changed the description of `ORD_SORT`, `ORD_SORTING`,
and `UNORD_SORT`. Added more examples for `ORD_SORTING`. Simplified the
discussion of `INT_SIZE`. Used `stdlib_kinds` instead of `iso_fortran_env`.
Changed `processor` to `compiler`. Fixed spelling of array.

[ticket: X]
@gareth-nx
Copy link
Contributor

@wclodius2 I wonder whether the routine ORD_SORTING might be better named ORD_SORT_INDEX (or perhaps ORD_ARGSORT)? To me, this would make the difference with ORD_SORT more obvious.

@wclodius2
Copy link
Contributor Author

I now have three requested changes in the API. Let me know if you have strong preferences.

  1. Change the name of ORD_SORTING. Possibilities include:
    • ORD_SORT_INDEX
    • ORD_ARGSORT
    • ORD_INDEX
    • SORT_INDEX
    • Maybe some other name?
  2. Add a REVERSE subroutine
  3. Add a wrapper for the C/C++ library qsort. This would allow direct sorting of arrays of a derived type.

@gareth-nx
Copy link
Contributor

@wclodius2

  1. I was wondering whether it is straightforward to also support sorting arrays of type character(len=*) in these routines?
  2. Regarding the names - ultimately I suggest aiming for 2 'wrapper' routines that allow the user to choose the sort method, such as sort(array, ...various-optional-arguments...., method='ord_sort') and similarly for the sort_index (or argsort). Then we could simply steer users to 2 routines, for sorting objects of intrinsic type. The qsort_C routine would have to be separate (if included) - but for users we would suggest only using this in complex cases that are not treated by one of the other routines.

@wclodius2
Copy link
Contributor Author

wclodius2 commented Apr 16, 2021

@gareth-nx

  1. It is relatively straight forward (but tedious) to generalize the codes to support sorting arrays of type character(len=*) you just have to be careful in your indexing operations. However I don't know how commonly such arrays are used. I would find sorting a derived type such as
    type string
        character(:), allocatable :: name
    end type string

to be more useful and easier to implement, but liable to bring up a lot of bikesheding over the details of the definition of string. Adding a feature like this is also liable to change the structure of the code from the current single monolithic stdlib_sorting module to a submodule for each generic procedure.

  1. Maybe I have my blinders on, but I don't see much advantage to the wrapper code.

@gareth-nx
Copy link
Contributor

gareth-nx commented Apr 16, 2021

@wclodius2

Thanks. A few more comments below -- but IMO, all these matters can optionally be addressed with future pull requests, I don't mean to suggest you should do it all in your initial implementation.

  1. It is relatively straight forward (but tedious) to generalize the codes to support sorting arrays of type character(len=*) ... However I don't know how commonly such arrays are used. I would find sorting a derived type such as ... to be more useful and easier to implement, but liable to bring up a lot of bikesheding over the details of the definition of string.....

I use arrays of character(len=*) quite often for convenience, notwithstanding the storage inefficiency, but can't comment on how often others use them.

Regarding the string type, it seems there is already one in stdlib. It would be most natural for that module to implement the details (perhaps using sufficiently generic routines from this module, if we provide them), since the details will be tied to the particular definition of the string type.

In any case, something like qsort_C could allow for a low-effort implementation in these and similar cases, where the currently proposed sorting module cannot realistically support such diverse types.

  1. Maybe I have my blinders on, but I don't see much advantage to the wrapper code.

My intuition is that many users will barely read the documentation, and will be drawn to routines with familiar names. If we provide a sort and sort_index that treat 90% of cases featuring intrinsic types in a reasonable way (i.e. using algorithms where even the "unusual case" performance is not too bad), I guess that would be helpful. Then if in the remaining 10% of cases people choose to optimize the performance, we can assume they will be motivated to properly read the documentation. A method option is one way to support them. However, your approach of providing distinct subroutines unord_sort, ord_sort is also good for this purpose -- and indeed might make for easier interfaces than what I suggested.

Did you want me to provide a review at this stage (so you can commit things)? I still need to teach myself the process. Or are you going to append the implementation within this pull request, after there has been enough discussion of the interface?

@wclodius2
Copy link
Contributor Author

@gareth-nx

You may not mean to suggest that I do it all in the initial implementation, but I don't like doing an incomplete job, and I now consider it incomplete.

I definitely think I should add a qsort wrapper. I think it will only take a little over an hour to implement it and the associated testing. Documenting it will take longer. The name qsort_C is acceptable, but if someone thinks of a better name let me know.

Adding character(*) and string_type sorting routines are more work. About an hour each type for each generic, so six hours total. Then a couple of hours writing the testing code. Documenting them all will take about an hour. I think they should be added, but finding a day's plus effort in addition to any other requested changes and a life outside of Git is daunting.

At 2161 lines, the stdlib_sorting.fypp module file is starting to seem bloated so I think I should break it up into submodules. This will take a couple of hours, and requires relatively stable names for the generic functions. As the one with the simplest interface and reasonable all round performance, I suspect UNORD_SORT would be best as SORT, while ORD_SORTING would be better described as SORT_INDEX, and ORD_SORT` can remain as is.

I can't commit anything. That is the job of designated reviewers. I want the sort of feedback you are giving me, so I have a good idea of the best API before submitting the code to join the PR.

@gareth-nx
Copy link
Contributor

gareth-nx commented Apr 17, 2021

Great -- but note I've become unsure whether I'm using qsort correctly -- and have posted a question here to seek clarity. (Edit April 18 -- based on that discussion I corrected the example in the other thread ).

@wclodius2
Copy link
Contributor Author

Well I am not a C interoperability expert. I'll follow Fortran Discourse to see what they tell you.

@awvwgk awvwgk added the specification Discussion and iteration over the API label Apr 17, 2021
@jvdp1
Copy link
Member

jvdp1 commented Apr 17, 2021

I suspect UNORD_SORT would be best as SORT, while ORD_SORTING would be better described as SORT_INDEX, and ORD_SORT` can remain as is.

I think the proposed renaming makes sense.

Regarding C's qsort, I would suggest to keep it for a separate PR. If I am not wrong, it would be the first C wrapper introduced in stdlib. This might require more discussion that might delay the original aim of this PR. I guess that unord_sort, ord_sort, and ord_sorting would cover 90-95% cases. Furthermore the current proposition would be already useful for future implementations (like stat functions)

@wclodius2
Copy link
Contributor Author

I am thinking of dropping the zero_based argument to the ORD_SORTING/SORT_INDEX procedure. It complicates the interface, and its effects are easily achieved by subtracting 1 from the output INDEX. Its effects internally, however, are minor.

@ivan-pi
Copy link
Member

ivan-pi commented Apr 18, 2021

I would also prefer to keep the interface to the C qsort as a separate pull request (as a C compatibility module).

I agree with dropping the zero_based dummy variable. The tools in stdlib should be geared primarily towards usage within Fortran and not as a basis for cross-language API's.

@wclodius2
Copy link
Contributor Author

FWIW you can have zero based dummy variables in pure Fortran. It is natural for translating code from other languages. But it is rare enough that I don't think it needs to be supported.

@wclodius2
Copy link
Contributor Author

I had just about finished implementing sorting for character(*) arrays, when I realized there are two ways of sorting on EBCDIC machines (thank you IBM): LGT (sorting according to the ASCII character set), which is more intuitive, and > (sorting according to the native character set), which is faster on EBCDIC machines. So far I have been using >. Does anyone have any preferences? It's a lot of work to provide both types of sorting.

@gareth-nx
Copy link
Contributor

IMO the version based on > sounds preferable because it is faster. Also I wonder whether the LGT approach will work with non-ascii characters - if not, that would be another reason to prefer >.

Ultimately I think diverse alternatives are best supported by a routine that supports use of a user-specified comparison function (either to sort the data directly, or sort array indices). The qsort_C routine is one such approach -- and there might be good Fortran alternatives (?) -- and as suggested above it seems natural to discuss this as part of a separate pull request.

Added sorting for `character(*)` and `string_type` rank 1 arrays. Improved the
overall documentation. Added newer benchmarks for the addiitons and the Intel
One API compiler system.

[ticket: X]
@wclodius2
Copy link
Contributor Author

I have now committed and pushed a revised version of the markdown document stdlib/doc/specs/stdlib_sorting.md incorporating my additions of sorting for character(*) arrays and string_type arrays. The biggest surprise is how slow sorting of string_type arrays are compared to character(*) arrays with the same information. I had expected a factor of two slowdown from the additional layer of abstraction, but instead I get well over a factor of ten slowdown.

@gareth-nx
Copy link
Contributor

Very interesting regarding the performance of the string_type.

From the timing table, it looks like the routine called sort is typically slower than the routine called ord_sort? If I got that right, consider whether it would be better to rename sort as unord_sort, and ord_sort as sort.

@jvdp1
Copy link
Member

jvdp1 commented Apr 21, 2021

From the timing table, it looks like the routine called sort is typically slower than the routine called ord_sort? If I got that right, consider whether it would be better to rename sort as unord_sort, and ord_sort as sort.

Since unord_sort is more general than ord_sort (that requires knowledge about the array to be sorted), I would keep it as it is now: sort = unord_sort, and ord_sort.

@gareth-nx
Copy link
Contributor

gareth-nx commented Apr 21, 2021

My understanding is that both unord_sort and ord_sort will work in all cases.

To my understanding (see also the plot below which uses timings from the pull request), the difference is that unord_sort is slightly more optimal for random data, but can be much slower for ordered data. Conversely, ord_sort is pretty good for both random and ordered data. Although it can be slightly slower than unord_sort for random data.

Be good to hear from @wclodius2. But the key point is, assuming that most users will jump to sort, I think it's good to give them a routine that has pretty good performance in all cases.

Sorting_performance

@wclodius2
Copy link
Contributor Author

@gareth-nx and @jvdp1

I really like the graph, but what it shows can be misleading. Where ORD_SORT is fastest is on data that has significant regions that are already "presorted". It is slower than SORT on purely random data. Except in limited cases where you know you are dealing with presorted data, e.g. combined analyses of separate previously analyzed data or reanalyses of previously analyzed data that has had new data samples added to the analysis, significant regions that are already "presorted" are expected to be a negligible fraction of the possible data collections. SORT also has the simpler API, and I expect it to be less likely than ORD_SORT to exceed stack memory limits on different systems when dealing with large data sets. Note that SORT's performance is largely independent of presorting. It is not that SORT is doing badly on presorted data, it is that ORD_SORT is doing exceptionally well. Note that your graph appears to be for the gfortran lests: ifort tends to be twice as fast as gfortran on the numeric arrays, they have comparable performance on the default character strings, but gfortran tends to be more than twice as fast as ifort on the string_type arrays.

@wclodius2
Copy link
Contributor Author

wclodius2 commented Apr 21, 2021 via email

@gareth-nx
Copy link
Contributor

Thanks for the clarifications @wclodius2 -- I now agree that unord_sort is a good default option

@wclodius2
Copy link
Contributor Author

wclodius2 commented Apr 21, 2021

The addition of the sorting for string_type arrays has greatly increased the runtime for my test_sorting.f90 test code, compiled with the Intel OneAPI ifort, from a little over a minute to more than ten minutes. Currently each test case is run eight times and the average run time is reported for each case to reduce the "noise" in the reported value. I am tentatively going to change the number of repetitions from eight to four. The alternative is to change the size of the string_type arrays. Are there any other suggestions?

Copy link
Contributor

@gareth-nx gareth-nx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be very enabling for stdlib, thanks.

`quicksort`, `insertion sort`, and `heap sort`. While this algorithm's
runtime performance is always O(N Ln(N)), it is relatively fast on
randomly ordered data, but inconsistent in performance on partly
sorted data.as the official source of the algorithm.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a typo here ((data.as)?

Also, recalling the discussion during review, is it better to say something like "... is relatively fast on randomly ordered data, but does not show a speed improvement on partially sorted data (whereas ord_sort can give faster performance in partially ordered cases)".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to: "is relatively fast on randomly ordered data, but does not show the improvement in performance on partly sorted data found for ORD_SORT."

can show either slower `heap sort` performance, or enhanced
performance by up to a factor of six. Still, even when it shows
enhanced performance, its performance on partially sorted data is
typically an order of magnitude slower than `ORD_SORT`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I suggest mentioning one of the benefits -- flagged during the review -- along the lines that that sort requires less memory than ord_sort, and is less likely to exhaust the stack.

It is good to clear regarding the reasons that ord_sort is not the default (as we previously discussed at #386 (comment)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added "Its memory requirements are also low, being of order O(Ln(N)), while the memory requirements of ORD_SORT and SORT_INDEX are of order O(N)."

! Concatenate the arrays
allocate( array( size(array1) + size(array2) ) )
array( 1:size(array1) ) = array1(:)
array( size(array1)+1:size(array1)+size(array2) ) = array2(:)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider simplifying the previous 3 lines of code with array = [array1, array2]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

##### Description

Returns an integer array whose elements would sort the input array in
the specified direction retaining order stability.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be emphasised that the argument array will be returned sorted, e.g. in the text above, Sorts an array, and also returns an integer array whose elements would sort the input array (prior to sorting) in the specified direction retaining order stability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to "Returns the input array sorted in the direction requested while retaining order stability, and an integer array whose elements would sort the input array to produce the output array."

...
```

#### `sort_index` - creates an array of sorting indices for an input array.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest emphasising that the input array is sorted as well -- this surprised me initially when I saw the interface. Something like ... - sorts the input array, and creates an array of sorting indices for the original input array (prior to sorting).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want a section heading to be too long. Changed to "- creates an array of sorting indices for an input array, while also sorting the array."

!! ! Find the indices to sort a
!! call sort_index(a, index(1:size(a)),&
!! work(1:size(a)/2), iwork(1:size(a)/2))
!! ! Sort b based on the sorting of a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emacs sometimes inserts tabs, that can affect the formatting depending on how viewed. I think I have fixed it.

!!
!!```Fortran
!! subroutine sort_related_data( array, column, work, index, iwork )
!! ! Sort `a_data` in terms or its component `a`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to replace with ! Reorder rows of arraysuch thatarray(:,column)` is sorted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire example was badly written. Replaced by:

	subroutine sort_related_data( array, column, work, index, iwork )
	    ! Reorder rows of `array` such that `array(:, column)` is  sorted
	    integer, intent(inout)           :: array(:,:)
		integer(int32), intent(in)       :: column
		integer(int32), intent(inout)    :: work(:)
		integer(int_size), intent(inout) :: index(:)
		integer(int_size), intent(inout) :: iwork(:)
		integer, allocatable             :: dummy(:)
		integer :: i
		allocate(dummy(size(array, dim=1)))
		! Extract a column of `array`
		dummy(:) = array(:, column)
		! Find the indices to sort the column
		call sort_index(dummy, index(1:size(dummy)),&
		    work(1:size(dummy)/2), iwork(1:size(dummy)/2))
		! Sort a based on the sorting of its column
		do i=1, size(array, dim=2)
		    array(:, i) = array(index(1:size(array, dim=1)), i)
		end do
	end subroutine sort_related_data

!! Sorting an array of a derived type based on the dsta in one component
!!```fortran
!! subroutine sort_a_data( a_data, a, work, index, iwork )
!! ! Sort `a_data` in terms or its component `a`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Similar comment as previous -- mention that a is also sorted on return -- or alternatively consider making a a local variable in the subroutine).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example entirely rewritten - see above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I misread your comment. You missed a typo in "dsta". The quote was changed to

Sorting an array of a derived type based on the data in one component

	subroutine sort_a_data( a_data, a, work, index, iwork )
	    ! Sort `a_data` in terms or its component `a` also sorting `a`.

#:for k1 in INT_KINDS

module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse )
! A modification of `${k1}$_ord_sort` to return an array of indices that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest mentioning that it also sorts array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to see someone start to review the module files.

Changed to

! A modification of ${k1}$_ord_sort to return an array of indices that
! would perform a stable sort of the ARRAY as input, and also sort ARRAY
! as desired. The indices by default

Note I didn't try to change subsequent lines to maintain a consistent column width. Also changed the comments for the other versions of `SORT_INDEX'.

end subroutine char_sort_index

module subroutine string_sort_index( array, index, work, iwork, reverse )
! A modification of `string_ord_sort` to return an array of indices that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include mention that is also sorts the input array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done as noted above.

Changed the markdown document docs/specs/stdlib_sorting.md to correct problems
noted by gareth_nx. Changed comments in the FYPP file
src/stlib_sorting_sort_index.fypp to note that the input array is sorted on
output.

[ticket: X]
Forgot to save an edit prior to the previous commit of
docs/specs/stdlib_sorting.md. Saved and commited that edit.

[ticket: X]
Review-dog caught several misspellings in the comments of the source codes.

[ticket: X]
@gareth-nx
Copy link
Contributor

In the stdlib_sorting.md documentation file, the top description of SORT_INDEX is:

  • SORT_INDEX is intended to provide indices for sorting arrays of
    derived type data, based on the ordering of an intrinsic component
    of the derived type; and

To me this somewhat too specific, because a derived type need not be involved. Also I suggest noting that the input array is still sorted. Consider something like this:

  • SORT_INDEX is like ORD_SORT, but in addition to sorting the input array, it returns indices that map the original array to its sorted version. This enables related arrays to be re-ordered in the same way.

@gareth-nx
Copy link
Contributor

Related to my comment above, in stdlib_sorting.md, there is the following description of SORT_INDEX:

The SORT_INDEX subroutine

The SORT and ORD_SORT subroutines can sort rank 1 isolated
arrays of intrinsic types, but do nothing for the coordinated sorting
of related data, e.g., multiple related rank 1 arrays, higher rank
arrays, or arrays of derived types. For such related data, what is
useful is an array of indices that maps a rank 1 array to its sorted
form. For such a sort, a stable sort is useful, therefore the module
provides a subroutine, SORT_INDEX, that generates such an array of
indices based on the ORD_SORT algorithm.

I suggest editing the end of the last sentence of the above paragraph to be "....that generates such an array of
indices based on the ORD_SORT algorithm, in addition to sorting the input array".

Modified doc/specs/index.md so it mentioned stdlib_sorting.md.

[ticket: X]
@wclodius2
Copy link
Contributor Author

I have committed and pushed the related changes:

  • SORT_INDEX is based on ORD_SORT, but in addition to sorting the
    input array, it returns indices that map the original array to its
    sorted version. This enables related arrays to be re-ordered in the
    same way; and

"....that generates such an array of indices based on the ORD_SORT
algorithm, in addition to sorting the input array".

Added changes in the description of `SORT_INDEX`.

[ticket: X]
@w6ws
Copy link

w6ws commented May 6, 2021

Great idea to add sorting to the library! Some basic sorting intrinsics should have been added to Fortran decades ago. The old High Performance Fortran (i.e., http://hpff.rice.edu/versions/hpf2/hpf-v20.pdf) had specs for HPF intrinsic functions SORT_UP, SORT_DOWN, GRADE_UP, and GRADE_DOWN. Note that these were defined as functions, not subroutines, so that they can be used in expressions. As an old APL programmer, it seems natural to have functional forms of the sorts.

FWIW, I generally just use a version of the LAPACK SLASRT subroutine for the actual sort algorithm. I've 'templated' it using the macro capabilities of the C preprocessor to support various data types - including character(*), reals, and integers. Not sure how it compares with the timsort and other rust-inspired algorithms you've used yet. I need to learn about fypp, and recoding my SLASRT-based code will be a good start.

I noticed some comments above about using some glue to call C qsort. The C qsort suffers from invoking the user-defined comparison routine through a procedure pointer. The C++ STL 'sort' is much better than qsort because it inlines the user-defined comparison code. But the SLASRT code was even faster yet - last time I checked.

@wclodius2
Copy link
Contributor Author

Making them functions would increase memory requirements. SLASRT is a quicksort routine using insertion sort only for small arrays. It will have performance problems for significantly presorted data.

@awvwgk awvwgk added reviewers needed This patch requires extra eyes and removed specification Discussion and iteration over the API labels May 7, 2021
@w6ws
Copy link

w6ws commented May 7, 2021

Making them functions would increase memory requirements. SLASRT is a quicksort routine using insertion sort only for small arrays. It will have performance problems for significantly presorted data.

I appreciate your thoughts. When I've tested performance, I typically used pseudo-random numbers. E.g., what the built-in RANDOM_NUMBER intrinsic provides. So other cases, like the significantly presorted data you mention, could certainly differ.

Another suggestion for eventual inclusion in stdlib are grade up/down versions - which output an integer permutation array rather than the sorted values. Could also be done via an optional output argument. The permutation array would allow easy use with derived types, assuming a single key, without needing user-supplied comparison function (like the C qsort and C++ STL sort do). In fact Iversons original APL spec only had grade up/down operators. Sort up/down were added later. Same with HPF 1.0 vs HPF 2.0.

@wclodius2
Copy link
Contributor Author

How do grade_up and grade_down differ from sort_index?

@w6ws
Copy link

w6ws commented May 7, 2021

How do grade_up and grade_down differ from sort_index?

Please correct me if I'm wrong, but it appears that sort_index() routine returns both the sorted values and the index array. For an array of derived types, one often wants just the index array. E.g., one could write something like:

type mydata_t
character(8) :: key
character(:), allocatable :: text_data
end type
....
type(mydata_t), allocatable :: mydata(:)
integer, allocatable :: indicies(:)
....
indicies = grade_up (mydata(:)%key)
(then access or rearrange mydata as needed using the indicies array)

With the sort_index(), the array argument has intent(inout) - which means that in the above case, the keys would be rearranged, but the associated data would not. The caller would need to explicitly make a temporary key array before calling sort_index().

@w6ws
Copy link

w6ws commented May 7, 2021

....
indicies = grade_up (mydata(:)%key)
(then access or rearrange mydata as needed using the indicies array)

A simple case of sorting a derived type would be clearly written:

mydata = mydata(grade_up (mydata%key))

It is totally safe because the indicies returned by grade_up are unique from one another.

@wclodius2
Copy link
Contributor Author

I had problems with stack overflow with my codes, so I have emphasized minimizing memory use. I use a Mac for my testing, and on a Mac the maximum stack size is 64 Mbytes. I did my testing on arrays with N=2**20 elements, so the input arrays were 4 Mbytes. My initial testing was of a pure Quicksort, which on partly sorted data would have of order N stack frames. About 2**20 stack frames with M bytes per frame easily exceeded the stack limit. Getting away from a pure Quicksort fixed that problem, but

  1. Having an "internal" allocatable array to return as a function result increases the stack size limiting input arrays on a Mac to less than 2**24 elements for the Quicksort based SORT, while without the internal array SORT's stack size is of order Ln(N) so it can handle almost unlimited size arrays;
  2. Similar conditions apply to the Heapsort based ORD_SORT and SORT_INDEX. Both of these codes need "scratch" arrays that require about half as many elements as the input arrays. These were initially allocated internally limiting maximum input array sizes to about 2**25 for ORD_SORT and 2**23 for SORT_INDEX. By allowing the user to optionally provide the "scratch" arrays their input array sizes also became almost unlimited. However the input scratch arrays are intent(inout) which means the standard conforming ORD_SORT and SORT_INDEX cannot be functions. Having an "internal" array so that SORT_INDEX would not modify its input array, would still limit the size of the input array due to stack limits.

@w6ws
Copy link

w6ws commented May 7, 2021

LOL! Yeah - linux defaults to 8 mb stack size limit per process. I set my stack limit to default to unlimited a very long time ago...

Both gfortran and Intel Fortran have compiler options to set a threshold whether to place temporaries on the stack or the heap - including all on the heap. PGI might. I don't think NAG did. Obviously if one sets the default to place everything on the heap, the resulting increase in malloc/free calls could hurt performance.

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are already some comments on the specs. I willl review the code ASAP

randomly ordered data, but does not show the improvement in
performance on partly sorted data found for `ORD_SORT`.

As with `introsort`, `SORT` is an unstable hybrid algorithm.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence could be removed, because already mentioned at L156-160.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


# The `stdlib_sorting` module

(TOC)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(TOC)
[TOC]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


## Overview of the module

The module `stdlib_sorting` defines several public entities one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The module `stdlib_sorting` defines several public entities one
The module `stdlib_sorting` defines several public entities, one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* `ORD_SORT` is intended to sort simple arrays of intrinsic data
that have significant sections that were partially ordered before
the sort;
* `SORT_INDEX` is based on ORD_SORT, but in addition to sorting the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* `SORT_INDEX` is based on ORD_SORT, but in addition to sorting the
* `SORT_INDEX` is based on `ORD_SORT`, but in addition to sorting the

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

`ORD_SORT` begins by traversing the array starting in its tail
attempting to identify `runs` in the array, where a run is either a
uniformly decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or a
non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. Once delimitated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. Once delimitated
non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. First delimitated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

! Process the sorted array
call array_search( array, values )
...
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
```
```fortran
program demo_ord_sort
use stdlib_sorting, only: ord_sort
implicit none
integer, allocatable :: array1(:), work(:)
array1 = [ 5, 4, 3, 1, 10, 4, 9]
allocate(work, mold = array1)
call ord_sort(array1, work)
print*, array1 !print [1, 3, 4, 4, 5, 9, 10]
end program demo_ord_sort
```

suggestion: a small program that could be copyied/pasted, and compiled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


##### Example

```fortran
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the previous example, I suggest to treplace this example by a small demo program.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

wclodius2 and others added 3 commits May 8, 2021 19:59
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
@wclodius2
Copy link
Contributor Author

I made a mistake. I updated my local copy while also accepting some of Jeremy's suggestions, so my local copy is inconsistent with that in this repository. I am not an expert on GIT. How do I merge the repository with my local copy?

@gareth-nx
Copy link
Contributor

I'm also not a git expert, but would try doing:
git fetch origin pull/386/head:SortingStdlibPullRequest
to make a new local branch SortingStdlibPullRequest that contains the most recent version of this pull request (i.e. number 386).

Then I would use git diff SortingStdlibPullRequest to view the differences with the local branch, and git merge SortingStdlibPullRequest to try to merge it (actually before doing the latter, I'd probably checkout a new branch with my local changes and try to merge there first, in case something breaks). If done successfully then the local branch should be consistent with the most recent version of the pull request. At that point I expect you can proceed as usual?

@wclodius2
Copy link
Contributor Author

After the merge, I have wound up with a new branch at #408.

@jvdp1
Copy link
Member

jvdp1 commented May 16, 2021

@wclodius2 for clarity, I will close this PR in favour of #408
It can be reopened if needed.

@jvdp1 jvdp1 closed this May 16, 2021
@awvwgk awvwgk removed the reviewers needed This patch requires extra eyes label Jul 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants