Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add local to global index mapping #1707

Merged
merged 8 commits into from
Feb 28, 2025
Merged

Add local to global index mapping #1707

merged 8 commits into from
Feb 28, 2025

Conversation

MarcelKoch
Copy link
Member

This PR adds a local-to-global mapping to the index_map class. It is necessary for the follow-up PR #1639. This also introduces a device-side view of a partition, and adds another member to the partition which stores the range-ids segmented by their part-id.

(I know that the branch name is wrong...)

@MarcelKoch MarcelKoch added the 1:ST:ready-for-review This PR is ready for review label Oct 23, 2024
@MarcelKoch MarcelKoch added this to the Ginkgo 1.9.0 milestone Oct 23, 2024
@MarcelKoch MarcelKoch self-assigned this Oct 23, 2024
@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. mod:all This touches all Ginkgo modules. labels Oct 23, 2024
@MarcelKoch MarcelKoch mentioned this pull request Oct 23, 2024
2 tasks
@fritzgoebel fritzgoebel self-requested a review October 30, 2024 10:36
@MarcelKoch MarcelKoch modified the milestones: Ginkgo 1.9.0, Ginkgo 1.10.0 Dec 9, 2024
@MarcelKoch MarcelKoch force-pushed the global-to-local branch 2 times, most recently from c43c48a to c3b27c2 Compare December 17, 2024 11:26
Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not finish review yet.
why is it required for PGM?
after merging row and column, do they still need to map back to the global index?
you need to reindex all of them because the size will be the smaller than the original one after coarsening.

//
// SPDX-License-Identifier: BSD-3-Clause

#ifndef GINKGO_PARTITION_HPP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
#ifndef GINKGO_PARTITION_HPP
#ifndef GKO_CORE_DISTRIBUTED_DEVICE_PARTITION_HPP_

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is probably time now to move to #pragma once ?

Comment on lines +133 to +137
exec->run(partition::make_build_ranges_by_part(
part_ids_.get_const_data(), get_num_ranges(), get_num_parts(),
range_ids, num_ranges_per_part));
ranges_by_part_ = segmented_array<size_type>::create_from_sizes(
std::move(range_ids), num_ranges_per_part);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently, this rely on the atomic add to count the size.
then segmented_array will convert size to offset.
Maybe you can implement a similar kernel with convert_idxs_to_ptrs + additional index map from i to range_id[i] because range_part[range_id[i]] is sorted. then segmented_array can create_from_offsets directly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be future improvement not in this pr

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for now I would just use atomics. I don't think this is a performance critical part, so it should be fine for a while.

@MarcelKoch
Copy link
Member Author

after merging row and column, do they still need to map back to the global index?
you need to reindex all of them because the size will be the smaller than the original one after coarsening.

@yhmtsai I don't understand your point, could you rephrase this?

@yhmtsai
Copy link
Member

yhmtsai commented Dec 18, 2024

rank 0: holds row 0, 1, 4
rank 1: holds row 2, 3
if we mege 0, 1 and 2, 3
rank 0: 0, 4
rank 1: 2
we need to compute the size of each gap from different node and then reindex the global index
rank 0: 0, 2
rank 1: 1

@MarcelKoch
Copy link
Member Author

@yhmtsai I think this discussion makes more sense in #1639. I don't see much relevance to this PR.

@yhmtsai
Copy link
Member

yhmtsai commented Dec 18, 2024

yes, but if PGM does not need it, do you still need this function for something else?

@MarcelKoch
Copy link
Member Author

I think I need at least parts of it to keep the old distributed matrix constructor around. @fritzgoebel also seems to need it for his domain decomposition matrix.

Anyway, this still sounds like a discussion for #1639.

Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, but mostly looks good!

//
// SPDX-License-Identifier: BSD-3-Clause

#ifndef GINKGO_PARTITION_HPP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is probably time now to move to #pragma once ?

return {num_parts,
partition->get_num_empty_parts(),
partition->get_size(),
partition->get_range_bounds(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be a check whether these pointers are on device executors ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit of a temporary solution until we fully introduce device-side views for all of our (relevant) types. So I would leave handling that issue until then.

TEST_F(IndexMap, CanGetGlobalWithLocalISWithInvalid)
{
gko::array<global_index_type> global_ids(ref);
gko::array<local_index_type> local_ids(ref, {5, 4, 10, 3, 2, 1, 0, 100, 4});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The partition contains 18 elements, so I think it would be relevant to check so that global_id=17 is output correctly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The global_id=17 is not part of the local index set of rank 1, which is used here. I changed it to rank 0 for this test, and now it also outputs gid=17.

Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more comments.

@upsj upsj self-requested a review February 13, 2025 11:50
Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the new testing method GKO_ASSERT_SEGMENTED_ARRAY_EQ needs a test in core/test/utils/assertions_test.cpp as @pratikvn mentioned
some comments from @pratikvn seems not to be checked yet

Comment on lines +133 to +137
exec->run(partition::make_build_ranges_by_part(
part_ids_.get_const_data(), get_num_ranges(), get_num_parts(),
range_ids, num_ranges_per_part));
ranges_by_part_ = segmented_array<size_type>::create_from_sizes(
std::move(range_ids), num_ranges_per_part);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be future improvement not in this pr

//
// SPDX-License-Identifier: BSD-3-Clause

#pragma once
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still prefer the official supporting macro approach not relying on the compiler support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only very obscure edge cases where pragma once doesn't work properly (because some filesystems don't make it easy to see if two files have the same identity), I think we could safely move from header guards to #pragma once if we wanted to. Every relevant compiler supports it.

Comment on lines +90 to +91
assert(range_id <
std::distance(partition.offsets_begin, partition.offsets_end) - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert(range_id <
std::distance(partition.offsets_begin, partition.offsets_end) - 1);
assert(range_id <
std::distance(partition.offsets_begin, partition.offsets_end));

do you need -1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the offsets array has #range-ids + 1 entries, since it's an offset array. For an empty partition, offsets = {0}, so the begin and end pointers are different.

Comment on lines 336 to 337
auto local_range_id =
it != local_ranges_size ? it : max(int64(0), it - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this happen? it seems to be broken if we reach the else branch

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that I implemented this a bit incorrectly. I'm now using the correct predicate for the binary search, which makes the check here unnecessary. If also added a comment to clarify this a bit.

Comment on lines +155 to +156
thrust::stable_sort_by_key(policy, range_parts_ptr,
range_parts_ptr + num_ranges, range_ids_ptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it uses the truth that range ids is also sorted, right?
such that it is sorted by (part_id, range_id)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, although the part_id is not used after this anymore. It's only relevant that the range_id is first sorted by part_id, and then by range_id.

Copy link
Member

@upsj upsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor change requests and comments

@@ -69,6 +69,9 @@ class Partition : public CommonTestFixture {
gko::make_array_view(
this->exec, dpart->get_num_parts(),
const_cast<local_index_type*>(dpart->get_part_sizes())));

GKO_ASSERT_SEGMENTED_ARRAY_EQ(part->get_ranges_by_part(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side-node: Since this is a function returning void, this will not prevent subsequent operations from executing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be implemented the same way as the GKO_ASSERT_ARRAY_EQ macro. The segmented_array_equal function will return something, so I don't know if there is anything more to do here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the overall approach of implementing a custom assert_something function may cause tests to continue executing after the assertion fails, which may or may not be desirable. It's fine here, but something to keep in mind. To fix this, we would need to return testing::AssertionResult and wrap the call in a macro. But nothing to hold back the code here, just a comment that people should be aware of.

//
// SPDX-License-Identifier: BSD-3-Clause

#pragma once
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only very obscure edge cases where pragma once doesn't work properly (because some filesystems don't make it easy to see if two files have the same identity), I think we could safely move from header guards to #pragma once if we wanted to. Every relevant compiler supports it.

@MarcelKoch MarcelKoch requested a review from upsj February 14, 2025 09:56
Comment on lines 22 to 23


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be kept

Comment on lines 18 to 20


template <typename LocalIndexType, typename GlobalIndexType>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should kept

Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@upsj upsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@MarcelKoch MarcelKoch added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Feb 18, 2025
@MarcelKoch MarcelKoch force-pushed the global-to-local branch 2 times, most recently from 8db4eea to 5b5ddac Compare February 18, 2025 15:46
@MarcelKoch MarcelKoch changed the base branch from develop to neighborhood-communicator February 20, 2025 09:35
@MarcelKoch MarcelKoch force-pushed the neighborhood-communicator branch from 37f5bba to d7d3d40 Compare February 25, 2025 07:35
Base automatically changed from neighborhood-communicator to develop February 26, 2025 12:16
MarcelKoch and others added 7 commits February 27, 2025 13:37
Signed-off-by: Marcel Koch <[email protected]>
- reduce tests
- update docs
- minor refactoring
- fix binary search usage in cuda/hip
- refactor
- add tests for segmented array assertion

Co-authored-by: Yu-Hsiang M. Tsai <[email protected]>
Co-authored-by: Pratik Nayak <[email protected]>
Co-authored-by: Tobias Ribizel <[email protected]>
Copy link

codecov bot commented Feb 28, 2025

Codecov Report

Attention: Patch coverage is 96.53846% with 9 lines in your changes missing coverage. Please review.

Project coverage is 88.58%. Comparing base (d7d3d40) to head (71ac6a4).
Report is 10 commits behind head on develop.

Files with missing lines Patch % Lines
core/distributed/index_map.cpp 0.00% 7 Missing ⚠️
core/device_hooks/common_kernels.inc.cpp 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1707      +/-   ##
===========================================
- Coverage    89.17%   88.58%   -0.59%     
===========================================
  Files          817      818       +1     
  Lines        67816    68151     +335     
===========================================
- Hits         60474    60374     -100     
- Misses        7342     7777     +435     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@MarcelKoch MarcelKoch merged commit 7404a82 into develop Feb 28, 2025
13 of 14 checks passed
@MarcelKoch MarcelKoch deleted the global-to-local branch February 28, 2025 07:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:ready-to-merge This PR is ready to merge. mod:all This touches all Ginkgo modules. reg:testing This is related to testing.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants