-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add local to global index mapping #1707
Conversation
1070d82
to
8392d07
Compare
c43c48a
to
c3b27c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not finish review yet.
why is it required for PGM?
after merging row and column, do they still need to map back to the global index?
you need to reindex all of them because the size will be the smaller than the original one after coarsening.
// | ||
// SPDX-License-Identifier: BSD-3-Clause | ||
|
||
#ifndef GINKGO_PARTITION_HPP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
#ifndef GINKGO_PARTITION_HPP | |
#ifndef GKO_CORE_DISTRIBUTED_DEVICE_PARTITION_HPP_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is probably time now to move to #pragma once
?
exec->run(partition::make_build_ranges_by_part( | ||
part_ids_.get_const_data(), get_num_ranges(), get_num_parts(), | ||
range_ids, num_ranges_per_part)); | ||
ranges_by_part_ = segmented_array<size_type>::create_from_sizes( | ||
std::move(range_ids), num_ranges_per_part); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently, this rely on the atomic add to count the size.
then segmented_array will convert size to offset.
Maybe you can implement a similar kernel with convert_idxs_to_ptrs + additional index map from i to range_id[i] because range_part[range_id[i]]
is sorted. then segmented_array can create_from_offsets
directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be future improvement not in this pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for now I would just use atomics. I don't think this is a performance critical part, so it should be fine for a while.
@yhmtsai I don't understand your point, could you rephrase this? |
rank 0: holds row 0, 1, 4 |
yes, but if PGM does not need it, do you still need this function for something else? |
I think I need at least parts of it to keep the old distributed matrix constructor around. @fritzgoebel also seems to need it for his domain decomposition matrix. Anyway, this still sounds like a discussion for #1639. |
c3b27c2
to
bc75a0b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, but mostly looks good!
// | ||
// SPDX-License-Identifier: BSD-3-Clause | ||
|
||
#ifndef GINKGO_PARTITION_HPP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is probably time now to move to #pragma once
?
return {num_parts, | ||
partition->get_num_empty_parts(), | ||
partition->get_size(), | ||
partition->get_range_bounds(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there be a check whether these pointers are on device executors ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a bit of a temporary solution until we fully introduce device-side views for all of our (relevant) types. So I would leave handling that issue until then.
TEST_F(IndexMap, CanGetGlobalWithLocalISWithInvalid) | ||
{ | ||
gko::array<global_index_type> global_ids(ref); | ||
gko::array<local_index_type> local_ids(ref, {5, 4, 10, 3, 2, 1, 0, 100, 4}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The partition contains 18 elements, so I think it would be relevant to check so that global_id=17 is output correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The global_id=17 is not part of the local index set of rank 1, which is used here. I changed it to rank 0 for this test, and now it also outputs gid=17.
5233e98
to
7863f01
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple more comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exec->run(partition::make_build_ranges_by_part( | ||
part_ids_.get_const_data(), get_num_ranges(), get_num_parts(), | ||
range_ids, num_ranges_per_part)); | ||
ranges_by_part_ = segmented_array<size_type>::create_from_sizes( | ||
std::move(range_ids), num_ranges_per_part); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be future improvement not in this pr
// | ||
// SPDX-License-Identifier: BSD-3-Clause | ||
|
||
#pragma once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still prefer the official supporting macro approach not relying on the compiler support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are only very obscure edge cases where pragma once doesn't work properly (because some filesystems don't make it easy to see if two files have the same identity), I think we could safely move from header guards to #pragma once
if we wanted to. Every relevant compiler supports it.
assert(range_id < | ||
std::distance(partition.offsets_begin, partition.offsets_end) - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert(range_id < | |
std::distance(partition.offsets_begin, partition.offsets_end) - 1); | |
assert(range_id < | |
std::distance(partition.offsets_begin, partition.offsets_end)); |
do you need -1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the offsets array has #range-ids + 1
entries, since it's an offset array. For an empty partition, offsets = {0}
, so the begin and end pointers are different.
auto local_range_id = | ||
it != local_ranges_size ? it : max(int64(0), it - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this happen? it seems to be broken if we reach the else branch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that I implemented this a bit incorrectly. I'm now using the correct predicate for the binary search, which makes the check here unnecessary. If also added a comment to clarify this a bit.
thrust::stable_sort_by_key(policy, range_parts_ptr, | ||
range_parts_ptr + num_ranges, range_ids_ptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it uses the truth that range ids is also sorted, right?
such that it is sorted by (part_id, range_id)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, although the part_id
is not used after this anymore. It's only relevant that the range_id
is first sorted by part_id
, and then by range_id
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor change requests and comments
@@ -69,6 +69,9 @@ class Partition : public CommonTestFixture { | |||
gko::make_array_view( | |||
this->exec, dpart->get_num_parts(), | |||
const_cast<local_index_type*>(dpart->get_part_sizes()))); | |||
|
|||
GKO_ASSERT_SEGMENTED_ARRAY_EQ(part->get_ranges_by_part(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side-node: Since this is a function returning void
, this will not prevent subsequent operations from executing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be implemented the same way as the GKO_ASSERT_ARRAY_EQ
macro. The segmented_array_equal
function will return something, so I don't know if there is anything more to do here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean the overall approach of implementing a custom assert_something
function may cause tests to continue executing after the assertion fails, which may or may not be desirable. It's fine here, but something to keep in mind. To fix this, we would need to return testing::AssertionResult
and wrap the call in a macro. But nothing to hold back the code here, just a comment that people should be aware of.
// | ||
// SPDX-License-Identifier: BSD-3-Clause | ||
|
||
#pragma once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are only very obscure edge cases where pragma once doesn't work properly (because some filesystems don't make it easy to see if two files have the same identity), I think we could safely move from header guards to #pragma once
if we wanted to. Every relevant compiler supports it.
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be kept
|
||
|
||
template <typename LocalIndexType, typename GlobalIndexType> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should kept
92b9692
to
a75e164
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
a75e164
to
7e19ec3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
8db4eea
to
5b5ddac
Compare
5b5ddac
to
5be2c7f
Compare
37f5bba
to
d7d3d40
Compare
Signed-off-by: Marcel Koch <[email protected]>
5be2c7f
to
285ff97
Compare
Signed-off-by: Marcel Koch <[email protected]>
Signed-off-by: Marcel Koch <[email protected]>
Signed-off-by: Marcel Koch <[email protected]>
Co-authored-by: Pratik Nayak <[email protected]>
- reduce tests - update docs - minor refactoring - fix binary search usage in cuda/hip - refactor - add tests for segmented array assertion Co-authored-by: Yu-Hsiang M. Tsai <[email protected]> Co-authored-by: Pratik Nayak <[email protected]> Co-authored-by: Tobias Ribizel <[email protected]>
285ff97
to
71ac6a4
Compare
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #1707 +/- ##
===========================================
- Coverage 89.17% 88.58% -0.59%
===========================================
Files 817 818 +1
Lines 67816 68151 +335
===========================================
- Hits 60474 60374 -100
- Misses 7342 7777 +435 ☔ View full report in Codecov by Sentry. |
This PR adds a local-to-global mapping to the
index_map
class. It is necessary for the follow-up PR #1639. This also introduces a device-side view of a partition, and adds another member to the partition which stores the range-ids segmented by their part-id.(I know that the branch name is wrong...)