Not all LINKX datasets are available #4569
Comments
Hey @OlegPlatonov I've opened a PR here to address this #4570. Please take a look and let me know what you think. Happy to have your input especially on the features for the |
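Hi @Padarn! I'm afraid there is a slight misunderstanding. The repository for the “Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods” paper <https://arxiv.org/abs/2110.14446> stores some datasets from prior works, and these are the datasets you have added. However, they are already present in PyG (for example, WikipediaNetwork <https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html#torch_geometric.datasets.WikipediaNetwork> and DeezerEurope <https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html#torch_geometric.datasets.DeezerEurope>). What I meant were the new datasets introduced in the paper, specifically pokec, arXiv-year, snap-patents, genius, twitch-gamers and wiki. |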
Oh I see, thanks for the clarification! I didn't look carefully enough at
what already exists. I'll update tomorrow.
|
Hey @OlegPlatonov - actually many of the datasets are already in PyG. For example, 'twitch-gamers' is available here. However, in the paper they say they have updated some of these datasets.
The only one totally new is the I've updated the MR to just add 'genius' which did seem to be missing before. |
Hey @Padarn - indeed most datasets from the paper are not entirely new, but unless I'm missing something, they are not available in PyG (at least not in the form used in the paper). I've just checked and could not find arXiv-year, snap-patents, genius and wiki datasets in PyG. pokec dataset is available here, but it does not contain node features that were defined in the paper. As for twitch dataset, the version in PyG is a collection of 6 different graphs, which is different from the single twitch-gamers graph used in the paper (the number of nodes does not match). |
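For anyone tracking the follow-up work, the availability described in the comment above can be summarized in a few lines of Python. This is only a bookkeeping sketch: the `STATUS` labels ("missing", "partial") and the helper `datasets_with_status` are made up for illustration and are not part of PyG, and the statuses simply restate what was reported in this thread at the time.

```python
# Availability of the paper's datasets in PyG, as reported in this thread.
# The status labels are informal and chosen only for this sketch.
REQUESTED = ["pokec", "arXiv-year", "snap-patents", "genius", "twitch-gamers", "wiki"]

STATUS = {
    "pokec": "partial",          # available, but without the node features defined in the paper
    "arXiv-year": "missing",
    "snap-patents": "missing",
    "genius": "missing",         # the open PR proposes adding this one
    "twitch-gamers": "partial",  # PyG has a collection of 6 Twitch graphs, not the single graph
    "wiki": "missing",
}


def datasets_with_status(status):
    """Return the requested datasets that carry the given status, in request order."""
    return [name for name in REQUESTED if STATUS[name] == status]


print("missing:", datasets_with_status("missing"))
print("partial:", datasets_with_status("partial"))
```

Splitting the list this way also suggests a natural order for separate PRs: the entirely missing datasets first, then the ones that exist in a different form.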
Yeah understand, they're not all there and some are updated in the paper,
it just wasn't immediately clear what the best thing to do was with updated
datasets that we already have.
Maybe it's easier to tackle them separately across a few PRs? I have one
open for genius now, maybe we could prioritize the others?
|
Thanks @Padarn for your work on adding some of these datasets. I think adding the remaining ones is definitely of interest to the community, especially in order to accelerate GNN research on heterophilous graphs. Let's try to tackle this in follow-up PRs. |
Yep aligned. I think adding |
🚀 The feature, motivation and pitch
Hi! I've noticed that PyG now has datasets from the “Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods” paper, however, for some reason not all datasets proposed in the paper are provided. It would be great if all the other datasets from the paper (pokec, genius, wiki, etc.) were added. There are not many heterophilous graph datasets and it will be very useful to have all of them in one place.
Alternatives
No response
Additional context
No response