Commit 5ba1728

Gerard Bentley (gerardrbentley), William Falcon (williamFalcon), and Jirka Borovec (Borda) authored; akarnachev committed
Simplify progress bar args (Lightning-AI#1108)
* show progress bar dependent on refresh_rate
* test progress_bar_refresh control show bar
* remove show_progress_bar from other tests
* borda fixes
* flake8 fix
* changelog update prog bar refresh rate
* move show_progress_bar to deprecated 0.9 api
* rm show_progress_bar references, test deprecated
* Update pytorch_lightning/trainer/__init__.py
* fix test
* changelog
* minor CHANGELOG.md format
* Update pytorch_lightning/trainer/__init__.py
* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Gerard Bentley <[email protected]>
Co-authored-by: William Falcon <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: J. Borovec <[email protected]>
1 parent 3cffda3 commit 5ba1728

15 files changed, +79 -55 lines

CHANGELOG.md

+10 -11
@@ -26,6 +26,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
 ### Changed
 
+- Changed `progress_bar_refresh_rate` trainer flag to disable progress bar when set to 0. ([#1108](https://github.com/PyTorchLightning/pytorch-lightning/pull/1108))
 - Enhanced `load_from_checkpoint` to also forward params to the model ([#1307](https://github.com/PyTorchLightning/pytorch-lightning/pull/1307))
 - Updated references to self.forward() to instead use the `__call__` interface. ([#1211](https://github.com/PyTorchLightning/pytorch-lightning/pull/1211))
 - Added option to run without an optimizer by returning `None` from `configure_optimizers`. ([#1279](https://github.com/PyTorchLightning/pytorch-lightning/pull/1279))
@@ -44,6 +45,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 ### Deprecated
 
 - Deprecated Trainer argument `print_nan_grads` ([#1097](https://github.com/PyTorchLightning/pytorch-lightning/pull/1097))
+- Deprecated Trainer argument `show_progress_bar` ([#1108](https://github.com/PyTorchLightning/pytorch-lightning/pull/1108))
 
 ### Removed
 
@@ -72,9 +74,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
 ### Added
 
-- Added automatic sampler setup. Depending on DDP or TPU, lightning configures the sampler correctly (user needs to do nothing) ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
-- Added `reload_dataloaders_every_epoch=False` flag for trainer. Some users require reloading data every epoch ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
-- Added `progress_bar_refresh_rate=50` flag for trainer. Throttle refresh rate on notebooks ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
+- Added automatic sampler setup. Depending on DDP or TPU, lightning configures the sampler correctly (user needs to do nothing) ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
+- Added `reload_dataloaders_every_epoch=False` flag for trainer. Some users require reloading data every epoch ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
+- Added `progress_bar_refresh_rate=50` flag for trainer. Throttle refresh rate on notebooks ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
 - Updated governance docs
 - Added a check to ensure that the metric used for early stopping exists before training commences ([#542](https://github.com/PyTorchLightning/pytorch-lightning/pull/542))
 - Added `optimizer_idx` argument to `backward` hook ([#733](https://github.com/PyTorchLightning/pytorch-lightning/pull/733))
@@ -97,7 +99,6 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Added TPU gradient clipping ([#963](https://github.com/PyTorchLightning/pytorch-lightning/pull/963))
 - Added max/min number of steps in `Trainer` ([#728](https://github.com/PyTorchLightning/pytorch-lightning/pull/728))
 
-
 ### Changed
 
 - Improved `NeptuneLogger` by adding `close_after_fit` argument to allow logging after training ([#908](https://github.com/PyTorchLightning/pytorch-lightning/pull/1084))
@@ -109,17 +110,17 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Freezed models `hparams` as `Namespace` property ([#1029](https://github.com/PyTorchLightning/pytorch-lightning/pull/1029))
 - Dropped `logging` config in package init ([#1015](https://github.com/PyTorchLightning/pytorch-lightning/pull/1015))
 - Renames model steps ([#1051](https://github.com/PyTorchLightning/pytorch-lightning/pull/1051))
-  * `training_end` >> `training_epoch_end`
-  * `validation_end` >> `validation_epoch_end`
-  * `test_end` >> `test_epoch_end`
+  - `training_end` >> `training_epoch_end`
+  - `validation_end` >> `validation_epoch_end`
+  - `test_end` >> `test_epoch_end`
 - Refactor dataloading, supports infinite dataloader ([#955](https://github.com/PyTorchLightning/pytorch-lightning/pull/955))
 - Create single file in `TensorBoardLogger` ([#777](https://github.com/PyTorchLightning/pytorch-lightning/pull/777))
 
 ### Deprecated
 
 - Deprecated `pytorch_lightning.logging` ([#767](https://github.com/PyTorchLightning/pytorch-lightning/pull/767))
 - Deprecated `LightningModule.load_from_metrics` in favour of `LightningModule.load_from_checkpoint` ([#995](https://github.com/PyTorchLightning/pytorch-lightning/pull/995), [#1079](https://github.com/PyTorchLightning/pytorch-lightning/pull/1079))
-- Deprecated `@data_loader` decorator ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
+- Deprecated `@data_loader` decorator ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
 - Deprecated model steps `training_end`, `validation_end` and `test_end` ([#1051](https://github.com/PyTorchLightning/pytorch-lightning/pull/1051), [#1056](https://github.com/PyTorchLightning/pytorch-lightning/pull/1056))
 
 ### Removed
@@ -309,9 +310,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
 ### Added
 
-- Added the flag `log_gpu_memory` to `Trainer` to deactivate logging of GPU
-  memory utilization
-- Added SLURM resubmit functionality (port from test-tube)
+- Added the flag `log_gpu_memory` to `Trainer` to deactivate logging of GPU memory utilization
 - Added optional weight_save_path to trainer to remove the need for a checkpoint_callback when using cluster training
 - Added option to use single gpu per node with `DistributedDataParallel`

pytorch_lightning/trainer/__init__.py

+4 -5
@@ -646,6 +646,8 @@ def on_train_end(self):
     # default used by the Trainer
     trainer = Trainer(progress_bar_refresh_rate=1)
 
+    # disable progress bar
+    trainer = Trainer(progress_bar_refresh_rate=0)
 
 reload_dataloaders_every_epoch
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -702,12 +704,9 @@ def on_train_end(self):
 show_progress_bar
 ^^^^^^^^^^^^^^^^^
 
-If true shows tqdm progress bar
+.. warning:: .. deprecated:: 0.7.2
 
-Example::
-
-    # default used by the Trainer
-    trainer = Trainer(show_progress_bar=True)
+    Set `progress_bar_refresh_rate` to 0 instead. Will remove 0.9.0.
 
 test_percent_check
 ^^^^^^^^^^^^^^^^^^
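
A quick usage sketch of the new control, consistent with the defaults documented above and the notebook-throttling value mentioned in the 0.7.1 changelog:

```python
from pytorch_lightning import Trainer

# default used by the Trainer: refresh the tqdm bar every batch
trainer = Trainer(progress_bar_refresh_rate=1)

# throttle updates, e.g. in notebooks
trainer = Trainer(progress_bar_refresh_rate=50)

# disable the progress bar entirely (replaces the deprecated show_progress_bar=False)
trainer = Trainer(progress_bar_refresh_rate=0)
```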

pytorch_lightning/trainer/deprecated_api.py

+19
@@ -87,3 +87,22 @@ def nb_sanity_val_steps(self, nb):
             "`num_sanity_val_steps` since v0.5.0"
             " and this method will be removed in v0.8.0", DeprecationWarning)
         self.num_sanity_val_steps = nb
+
+
+class TrainerDeprecatedAPITillVer0_9(ABC):
+
+    def __init__(self):
+        super().__init__()  # mixin calls super too
+
+    @property
+    def show_progress_bar(self):
+        """Back compatibility, will be removed in v0.9.0"""
+        warnings.warn("Argument `show_progress_bar` is now set by `progress_bar_refresh_rate` since v0.7.2"
+                      " and this method will be removed in v0.9.0", DeprecationWarning)
+        return self.progress_bar_refresh_rate >= 1
+
+    @show_progress_bar.setter
+    def show_progress_bar(self, tf):
+        """Back compatibility, will be removed in v0.9.0"""
+        warnings.warn("Argument `show_progress_bar` is now set by `progress_bar_refresh_rate` since v0.7.2"
+                      " and this method will be removed in v0.9.0", DeprecationWarning)

pytorch_lightning/trainer/distrib_data_parallel.py

+1 -1
@@ -281,7 +281,7 @@ def ddp_train(self, gpu_idx, model):
             self.node_rank = 0
 
         # show progressbar only on progress_rank 0
-        self.show_progress_bar = self.show_progress_bar and self.node_rank == 0 and gpu_idx == 0
+        self.progress_bar_refresh_rate = self.progress_bar_refresh_rate if self.node_rank == 0 and gpu_idx == 0 else 0
 
         # determine which process we are and world size
         if self.use_ddp:

pytorch_lightning/trainer/distrib_parts.py

+1 -1
@@ -480,7 +480,7 @@ def tpu_train(self, tpu_core_idx, model):
         self.tpu_global_core_rank = xm.get_ordinal()
 
         # avoid duplicating progress bar
-        self.show_progress_bar = self.show_progress_bar and self.tpu_global_core_rank == 0
+        self.progress_bar_refresh_rate = self.progress_bar_refresh_rate if self.tpu_global_core_rank == 0 else 0
 
         # track current tpu
         self.current_tpu_idx = tpu_core_idx
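
The DDP and TPU hunks apply the same rule: only the first process keeps its refresh rate, every other rank gets 0, so duplicate bars disappear. A small sketch of that rule (the helper name is invented for illustration):

```python
def gated_refresh_rate(refresh_rate: int, node_rank: int, local_idx: int) -> int:
    """Keep the progress bar only on the first process of the first node."""
    return refresh_rate if node_rank == 0 and local_idx == 0 else 0


assert gated_refresh_rate(1, node_rank=0, local_idx=0) == 1  # rank 0 keeps its bar
assert gated_refresh_rate(1, node_rank=0, local_idx=3) == 0  # other local ranks are silenced
assert gated_refresh_rate(0, node_rank=0, local_idx=0) == 0  # a disabled bar stays disabled
```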

pytorch_lightning/trainer/evaluation_loop.py

+2 -3
@@ -163,7 +163,6 @@ class TrainerEvaluationLoopMixin(ABC):
     num_val_batches: int
     fast_dev_run: ...
     process_position: ...
-    show_progress_bar: ...
     process_output: ...
     training_tqdm_dict: ...
     proc_rank: int

@@ -278,7 +277,7 @@ def _evaluate(self, model: LightningModule, dataloaders, max_batches: int, test_
                 dl_outputs.append(output)
 
                 # batch done
-                if batch_idx % self.progress_bar_refresh_rate == 0:
+                if self.progress_bar_refresh_rate >= 1 and batch_idx % self.progress_bar_refresh_rate == 0:
                     if test_mode:
                         self.test_progress_bar.update(self.progress_bar_refresh_rate)
                     else:

@@ -361,7 +360,7 @@ def run_evaluation(self, test_mode: bool = False):
         desc = 'Testing' if test_mode else 'Validating'
         total = max_batches if max_batches != float('inf') else None
         pbar = tqdm(desc=desc, total=total, leave=test_mode, position=position,
-                    disable=not self.show_progress_bar, dynamic_ncols=True, file=sys.stdout)
+                    disable=not self.progress_bar_refresh_rate, dynamic_ncols=True, file=sys.stdout)
         setattr(self, f'{"test" if test_mode else "val"}_progress_bar', pbar)
 
         # run evaluation
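
The two changes in this file cooperate: tqdm's `disable` flag hides the bar when the rate is 0 (`not 0` is `True`), and the new `>= 1` guard keeps `batch_idx % 0` from raising `ZeroDivisionError`. A runnable sketch of the combined pattern:

```python
import sys
from tqdm import tqdm

refresh_rate = 0  # as set by Trainer(progress_bar_refresh_rate=0)

pbar = tqdm(desc='Validating', total=100,
            disable=not refresh_rate,  # rate 0 -> disable=True -> no output
            dynamic_ncols=True, file=sys.stdout)
for batch_idx in range(100):
    # guard first: with rate 0 the modulo alone would raise ZeroDivisionError
    if refresh_rate >= 1 and batch_idx % refresh_rate == 0:
        pbar.update(refresh_rate)
pbar.close()
```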

pytorch_lightning/trainer/trainer.py

+14 -7
@@ -21,7 +21,8 @@
 from pytorch_lightning.trainer.callback_config import TrainerCallbackConfigMixin
 from pytorch_lightning.trainer.callback_hook import TrainerCallbackHookMixin
 from pytorch_lightning.trainer.data_loading import TrainerDataLoadingMixin
-from pytorch_lightning.trainer.deprecated_api import TrainerDeprecatedAPITillVer0_8
+from pytorch_lightning.trainer.deprecated_api import (TrainerDeprecatedAPITillVer0_8,
+                                                      TrainerDeprecatedAPITillVer0_9)
 from pytorch_lightning.trainer.distrib_data_parallel import TrainerDDPMixin
 from pytorch_lightning.trainer.distrib_parts import TrainerDPMixin, parse_gpu_ids, determine_root_gpu_device
 from pytorch_lightning.trainer.evaluation_loop import TrainerEvaluationLoopMixin

@@ -66,12 +67,13 @@ class Trainer(
     TrainerCallbackConfigMixin,
     TrainerCallbackHookMixin,
     TrainerDeprecatedAPITillVer0_8,
+    TrainerDeprecatedAPITillVer0_9,
 ):
     DEPRECATED_IN_0_8 = (
         'gradient_clip', 'nb_gpu_nodes', 'max_nb_epochs', 'min_nb_epochs',
         'add_row_log_interval', 'nb_sanity_val_steps'
     )
-    DEPRECATED_IN_0_9 = ('use_amp',)
+    DEPRECATED_IN_0_9 = ('use_amp', 'show_progress_bar')
 
     def __init__(
             self,

@@ -86,7 +88,7 @@ def __init__(
             gpus: Optional[Union[List[int], str, int]] = None,
             num_tpu_cores: Optional[int] = None,
             log_gpu_memory: Optional[str] = None,
-            show_progress_bar: bool = True,
+            show_progress_bar=None,  # backward compatible, todo: remove in v0.9.0
             progress_bar_refresh_rate: int = 1,
             overfit_pct: float = 0.0,
             track_grad_norm: int = -1,

@@ -161,9 +163,12 @@ def __init__(
 
             log_gpu_memory: None, 'min_max', 'all'. Might slow performance
 
-            show_progress_bar: If true shows tqdm progress bar
+            show_progress_bar:
+                .. warning:: .. deprecated:: 0.7.2
+
+                    Set `progress_bar_refresh_rate` to postive integer to enable. Will remove 0.9.0.
 
-            progress_bar_refresh_rate: How often to refresh progress bar (in steps)
+            progress_bar_refresh_rate: How often to refresh progress bar (in steps). Value ``0`` disables progress bar.
 
             overfit_pct: How much of training-, validation-, and test dataset to check.

@@ -414,7 +419,9 @@ def __init__(
 
         # can't init progress bar here because starting a new process
         # means the progress_bar won't survive pickling
-        self.show_progress_bar = show_progress_bar
+        # backward compatibility
+        if show_progress_bar is not None:
+            self.show_progress_bar = show_progress_bar
 
         # logging
         self.log_save_interval = log_save_interval

@@ -821,7 +828,7 @@ def run_pretrain_routine(self, model: LightningModule):
         pbar = tqdm(desc='Validation sanity check',
                     total=self.num_sanity_val_steps * len(self.val_dataloaders),
                     leave=False, position=2 * self.process_position,
-                    disable=not self.show_progress_bar, dynamic_ncols=True)
+                    disable=not self.progress_bar_refresh_rate, dynamic_ncols=True)
         self.main_progress_bar = pbar
         # dummy validation progress bar
         self.val_progress_bar = tqdm(disable=True)
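
With the default flipped to `None`, the constructor can tell "not supplied" apart from an explicit `True`/`False`, and only routes explicit values through the deprecated setter. A sketch of that resolution logic as a standalone function (the function is illustrative, not Trainer code; unlike the commit's no-op setter, it also performs the boolean-to-rate mapping itself):

```python
import warnings


def resolve_refresh_rate(progress_bar_refresh_rate: int = 1,
                         show_progress_bar=None) -> int:
    """Map the deprecated boolean onto the new flag, warning if it was used."""
    if show_progress_bar is not None:
        warnings.warn("`show_progress_bar` is deprecated since v0.7.2, "
                      "set `progress_bar_refresh_rate` instead", DeprecationWarning)
        return progress_bar_refresh_rate if show_progress_bar else 0
    return progress_bar_refresh_rate


assert resolve_refresh_rate() == 1                          # default: bar enabled
assert resolve_refresh_rate(show_progress_bar=False) == 0   # legacy way to disable
assert resolve_refresh_rate(progress_bar_refresh_rate=50) == 50
```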

pytorch_lightning/trainer/training_loop.py

+1 -1
@@ -623,7 +623,7 @@ def optimizer_closure():
             self.get_model().on_batch_end()
 
             # update progress bar
-            if batch_idx % self.progress_bar_refresh_rate == 0:
+            if self.progress_bar_refresh_rate >= 1 and batch_idx % self.progress_bar_refresh_rate == 0:
                 self.main_progress_bar.update(self.progress_bar_refresh_rate)
                 self.main_progress_bar.set_postfix(**self.training_tqdm_dict)
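
The training loop uses the same guard as the evaluation loop above. Note the pairing: the bar updates every `refresh_rate` batches and advances by `refresh_rate`, so the displayed count still tracks the true number of batches. A tiny check of that invariant:

```python
refresh_rate = 4
shown = 0  # what the bar would display
for batch_idx in range(100):
    if refresh_rate >= 1 and batch_idx % refresh_rate == 0:
        shown += refresh_rate  # mirrors main_progress_bar.update(refresh_rate)

assert shown == 100  # 25 updates of size 4 cover all 100 batches
```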

tests/models/test_amp.py

+1 -5
@@ -21,7 +21,6 @@ def test_amp_single_gpu(tmpdir):
 
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=True,
         max_epochs=1,
         gpus=1,
         distributed_backend='ddp',

@@ -42,7 +41,6 @@ def test_no_amp_single_gpu(tmpdir):
 
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=True,
         max_epochs=1,
         gpus=1,
         distributed_backend='dp',

@@ -66,7 +64,6 @@ def test_amp_gpu_ddp(tmpdir):
 
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=True,
         max_epochs=1,
         gpus=2,
         distributed_backend='ddp',

@@ -90,7 +87,6 @@ def test_amp_gpu_ddp_slurm_managed(tmpdir):
     model = LightningTestModel(hparams)
 
     trainer_options = dict(
-        show_progress_bar=True,
         max_epochs=1,
         gpus=[0],
         distributed_backend='ddp',

@@ -128,7 +124,7 @@ def test_cpu_model_with_amp(tmpdir):
 
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         logger=tutils.get_default_testtube_logger(tmpdir),
         max_epochs=1,
         train_percent_check=0.4,

tests/models/test_cpu.py

+7 -8
@@ -27,7 +27,6 @@ def test_early_stopping_cpu_model(tmpdir):
         gradient_clip_val=1.0,
         overfit_pct=0.20,
         track_grad_norm=2,
-        show_progress_bar=True,
         logger=tutils.get_default_testtube_logger(tmpdir),
         train_percent_check=0.1,
         val_percent_check=0.1,

@@ -48,7 +47,7 @@ def test_lbfgs_cpu_model(tmpdir):
     trainer_options = dict(
         default_save_path=tmpdir,
         max_epochs=2,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         weights_summary='top',
         train_percent_check=1.0,
         val_percent_check=0.2,

@@ -67,7 +66,7 @@ def test_default_logger_callbacks_cpu_model(tmpdir):
         max_epochs=1,
         gradient_clip_val=1.0,
         overfit_pct=0.20,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         train_percent_check=0.01,
         val_percent_check=0.01,
     )

@@ -95,7 +94,7 @@ def test_running_test_after_fitting(tmpdir):
 
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         max_epochs=8,
         train_percent_check=0.4,
         val_percent_check=0.2,

@@ -133,7 +132,7 @@ class CurrentTestModel(LightTrainDataloader, LightTestMixin, TestModelBase):
     checkpoint = tutils.init_checkpoint_callback(logger)
 
     trainer_options = dict(
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         max_epochs=1,
         train_percent_check=0.4,
         val_percent_check=0.2,

@@ -226,7 +225,7 @@ def test_cpu_model(tmpdir):
 
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         logger=tutils.get_default_testtube_logger(tmpdir),
         max_epochs=1,
         train_percent_check=0.4,

@@ -247,7 +246,7 @@ def test_all_features_cpu_model(tmpdir):
         gradient_clip_val=1.0,
         overfit_pct=0.20,
         track_grad_norm=2,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         logger=tutils.get_default_testtube_logger(tmpdir),
         accumulate_grad_batches=2,
         max_epochs=1,

@@ -344,7 +343,7 @@ def test_single_gpu_model(tmpdir):
 
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         max_epochs=1,
         train_percent_check=0.1,
         val_percent_check=0.1,

tests/models/test_gpu.py

+4 -5
@@ -27,7 +27,6 @@ def test_multi_gpu_model_ddp2(tmpdir):
     model, hparams = tutils.get_default_model()
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=True,
         max_epochs=1,
         train_percent_check=0.4,
         val_percent_check=0.2,

@@ -49,7 +48,7 @@ def test_multi_gpu_model_ddp(tmpdir):
     model, hparams = tutils.get_default_model()
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         max_epochs=1,
         train_percent_check=0.4,
         val_percent_check=0.2,

@@ -69,7 +68,7 @@ def test_ddp_all_dataloaders_passed_to_fit(tmpdir):
 
     model, hparams = tutils.get_default_model()
     trainer_options = dict(default_save_path=tmpdir,
-                           show_progress_bar=False,
+                           progress_bar_refresh_rate=0,
                            max_epochs=1,
                            train_percent_check=0.4,
                            val_percent_check=0.2,

@@ -165,7 +164,7 @@ def test_multi_gpu_none_backend(tmpdir):
     model, hparams = tutils.get_default_model()
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         max_epochs=1,
         train_percent_check=0.1,
         val_percent_check=0.1,

@@ -184,7 +183,7 @@ def test_multi_gpu_model_dp(tmpdir):
     model, hparams = tutils.get_default_model()
     trainer_options = dict(
         default_save_path=tmpdir,
-        show_progress_bar=False,
+        progress_bar_refresh_rate=0,
         distributed_backend='dp',
         max_epochs=1,
         train_percent_check=0.1,
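
All test updates follow one mechanical rule: `show_progress_bar=False` becomes `progress_bar_refresh_rate=0`, and `show_progress_bar=True` is simply dropped because a visible bar is now the default (`progress_bar_refresh_rate=1`). For example, inside a pytest test using the usual `tmpdir` fixture:

```python
# before (deprecated in 0.7.2)
trainer_options = dict(
    default_save_path=tmpdir,
    show_progress_bar=False,
    max_epochs=1,
)

# after
trainer_options = dict(
    default_save_path=tmpdir,
    progress_bar_refresh_rate=0,
    max_epochs=1,
)
```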
