Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(sampler): clean up sampler attributes, part 3 #5402

Merged
merged 8 commits into from
Sep 12, 2022

Conversation

mananshah99
Copy link
Contributor

@mananshah99 mananshah99 commented Sep 9, 2022

This PR continues the effort to consolidate PyG's sampling interface in preparation for moving sample(...) behind the GraphStore interface. This effort is somewhat large in scope and will be broken into multiple PRs for ease of review. It resolves some TODOs from #5365.

The major change removes special handling of input_type and perm / perm_dict in the neighbor loader. This enables a full separation between the sampler interface and the NeighborLoader (the only other interaction that is not clearly documented or exposed well in the sampler class is *SamplerOutput.metadata, which will take a bit more work to properly define.

@github-actions github-actions bot added loader and removed data labels Sep 9, 2022
@codecov
Copy link

codecov bot commented Sep 9, 2022

Codecov Report

Merging #5402 (8e6b3cb) into master (2c54dae) will decrease coverage by 0.01%.
The diff coverage is 95.23%.

❗ Current head 8e6b3cb differs from pull request most recent head f92b8ea. Consider uploading reports for the commit f92b8ea to get more accurate results

@@            Coverage Diff             @@
##           master    #5402      +/-   ##
==========================================
- Coverage   83.35%   83.33%   -0.02%     
==========================================
  Files         343      342       -1     
  Lines       18800    18784      -16     
==========================================
- Hits        15671    15654      -17     
- Misses       3129     3130       +1     
Impacted Files Coverage Δ
torch_geometric/loader/utils.py 81.25% <ø> (ø)
torch_geometric/sampler/base.py 96.77% <66.66%> (-3.23%) ⬇️
torch_geometric/loader/link_neighbor_loader.py 90.90% <100.00%> (-0.40%) ⬇️
torch_geometric/loader/neighbor_loader.py 90.90% <100.00%> (+0.10%) ⬆️
torch_geometric/sampler/neighbor_sampler.py 93.42% <100.00%> (+0.13%) ⬆️
torch_geometric/nn/__init__.py 100.00% <0.00%> (ø)
torch_geometric/nn/encoding.py

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@mananshah99 mananshah99 requested review from rusty1s, wsad1 and a team September 9, 2022 16:24
Copy link
Contributor

@Padarn Padarn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall, left some very minor comments

@mananshah99 mananshah99 merged commit 000a133 into master Sep 12, 2022
@mananshah99 mananshah99 deleted the remote_backend_4 branch September 12, 2022 18:14
mananshah99 added a commit that referenced this pull request Sep 13, 2022
…n classes, part 4 (#5404)

This PR continues the effort to consolidate PyG's sampling interface in preparation for moving `sample(...)` behind the `GraphStore` interface. This effort is somewhat large in scope and will be broken into multiple PRs for ease of review. It builds off of #5402, and makes a significant move to abstract data loading behind a `data: Union[Data, HeteroData, Tuple[FeatureStore, GraphStore]]` and a `sampler: BaseSampler`.

It does so by introducing two base implementation classes: `NodeLoader` and `LinkLoader`. `NodeLoader` performs sampling from nodes (using `sample_from_nodes`), and `LinkLoader` does the same from edges (using `sample_from_edges`). They both expose parameters in their initializers that are intended for **loading** (that is, the process of using a sampler to get subgraphs, using a feature fetcher to get features, and joining these together to construct a `HeteroData` object to pass downstream). Samplers are intended to expose parameters that are used for **sampling** (that are particular to the sampling method).

The implementations of `NeighborLoader` and `LinkNeighborLoader` are now very simple: they pass the `NeighborSampler` and any necessary initialization parameters directly in `__init__`, with no other change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants