Skip to content

Unify all Graph Dataset types #173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Dsantra92
Copy link
Collaborator

Related to: #169

Added the GraphDataset struct which can replace all custom structs for each Data source.

The metadata will have fields like OGBDatasets metadata, including name, source, etc. We also can add data citations. This is also required for saving datasets in parquet format, as defining storing method for each struct is troublesome.

@CarloLucibello if you can review the initial struct, I can start with replacing the data source structs.

@codecov-commenter
Copy link

codecov-commenter commented Sep 10, 2022

Codecov Report

Merging #173 (f053ef4) into master (e8ec56e) will decrease coverage by 0.13%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #173      +/-   ##
==========================================
- Coverage   32.03%   31.90%   -0.14%     
==========================================
  Files          42       42              
  Lines        2132     2141       +9     
==========================================
  Hits          683      683              
- Misses       1449     1458       +9     
Impacted Files Coverage Δ
src/graph.jl 44.30% <0.00%> (-5.70%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@CarloLucibello CarloLucibello marked this pull request as ready for review September 11, 2022 07:49
@CarloLucibello
Copy link
Member

I'm not okay with replacing every graph dataset with a unique structure. At least we should first have a long discussion about that.

Each dataset in MLDatasets used to be its own module, then with a large and breaking change datasets became types. I don't see enough compelling reasons to go through another heavily breaking change.

Having datasets as their own types means we can customize their behavior with specialized methods, we don't want to lose that flexibility.

We can still expose a GraphDataset for the user's convenience in defining their own datasets, similarly to the containers in https://github.com/JuliaML/MLDatasets.jl/tree/master/src/containers

@Dsantra92
Copy link
Collaborator Author

I didn't look into containers enough, I will look into that.

@Dsantra92 Dsantra92 closed this Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants