Do not prefetch when possible #12101
Conversation
LGTM! Some comments.
Small comments.
@@ -393,6 +393,9 @@ option when using sequential data.
 to ``limit_{mode}_batches``, if it is set to 1.0 it will run for the whole dataset, otherwise it will throw an exception.
 Here ``mode`` can be train/val/test/predict.

+When iterable datasets are used, Lightning will pre-fetch 1 batch (in addition to the current batch) so it can detect
+when the training will stop and run validation if necessary.
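To make the docs change concrete, here is a minimal sketch (not Lightning's actual implementation; `PrefetchIterator` and its `done` flag are hypothetical names) of how prefetching one batch ahead lets the loop know, while processing batch N, whether batch N+1 exists, i.e. whether the epoch is about to end:

```python
from collections.abc import Iterable, Iterator


class PrefetchIterator:
    """Wrap an iterable and look one batch ahead.

    While the consumer is handling the batch just returned, ``done`` tells it
    whether that batch was the last one, so end-of-epoch work (e.g. running
    validation) can be triggered without waiting for StopIteration.
    """

    def __init__(self, iterable: Iterable) -> None:
        self._it: Iterator = iter(iterable)
        self.done = False
        self._advance()  # prefetch the first batch immediately

    def _advance(self) -> None:
        try:
            self._next = next(self._it)
        except StopIteration:
            self._next = None
            self.done = True

    def __iter__(self) -> "PrefetchIterator":
        return self

    def __next__(self):
        if self.done and self._next is None:
            raise StopIteration
        batch = self._next
        self._advance()
        # After this call, ``self.done`` is True iff ``batch`` was the last one.
        return batch
```

A caller could then write `for batch in (it := PrefetchIterator(loader)): ...` and check `it.done` inside the loop body to run validation on the final batch of the epoch.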
Suggested change:
- when the training will stop and run validation if necessary.
+ when the training epoch will end and run validation if necessary.
While the comment here is not wrong, the real reason we have the prefetching is actually to avoid starting a new epoch when the dataloader has no batches left. Relying on a `StopIteration` check alone is not possible for this reason.
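The reviewer's point can be illustrated with a toy comparison (hypothetical helper names, not Lightning code): without a look-ahead, a fit loop only discovers an exhausted loader *after* it has already started the epoch and fired its hooks; prefetching one batch first lets it skip the empty epoch entirely.

```python
def run_epochs_naive(make_loader, max_epochs, on_epoch_start, on_batch):
    """Relies on iteration simply stopping: the epoch-start hook fires
    even when the loader turns out to have no batches at all."""
    for epoch in range(max_epochs):
        on_epoch_start(epoch)  # fires before we know the loader is empty
        for batch in make_loader():
            on_batch(epoch, batch)


def run_epochs_prefetch(make_loader, max_epochs, on_epoch_start, on_batch):
    """Prefetch one batch before firing any epoch hook, so an empty
    epoch is never started."""
    for epoch in range(max_epochs):
        it = iter(make_loader())
        try:
            first = next(it)  # prefetch before any hook runs
        except StopIteration:
            return  # no batches left: do not start this epoch
        on_epoch_start(epoch)
        on_batch(epoch, first)
        for batch in it:
            on_batch(epoch, batch)
```

With an empty loader, the naive loop still calls `on_epoch_start` for every epoch, while the prefetching loop calls nothing.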
Co-authored-by: Rohit Gupta <[email protected]>
What does this PR do?
Do not prefetch unless we require it. That is when:
This is good for FFCV because they already do prefetching internally.
We no longer error on 0-length map-style datasets, as doing so is inconsistent with iterable datasets.
Part of #11538
Follow-up to #11606
Reverts #1280
Does your PR introduce any breaking changes? If yes, please list them.
None knowingly
Before submitting
PR review
cc @Borda @justusschock @awaelchli @ninginthecloud @akihironitta