Releases: kitzeslab/opensoundscape
v0.12.0 release
Opensoundscape version 0.12.0
This release includes some (but not too many) breaking changes compared to 0.11.0, some bug fixes, and some new functionality.
A few highlights:
- audio_root argument for CNN.predict(), CNN.train(), and similar functions
  - this means your annotation and output csvs can contain (much shorter) relative audio paths rather than absolute audio paths
  - eg my_cnn.predict(['sub1/file1.wav','sub2/file2.wav'], audio_root="/Users/user/folder/project/")
  - This should make it easier to transfer projects between machines and file systems, by simply changing the audio_root argument and leaving the dataframes of labels/scores (with file/start_time/end_time index) untouched.
- bioacoustics_model_zoo module has been removed. Instead, please install the bioacoustics_model_zoo directly from github with
  pip install git+https://github.com/kitzeslab/bioacoustics-model-zoo
- enable automatic mixed precision on mps (this was previously disabled because it wasn't supported yet)
- GradCAM plotting (CNN.generate_cams -> cam.plot()) has improved; it should no longer give ugly results with white pixels from over-saturation (#964)
- reduce default random gain reduction in AudioAugmentationPreprocessor
- changed behavior of data_selection.resample to retain all-0 rows
- quickly get a short audio clip of bird song, or the path to it:
  import opensoundscape as opso
  opso.birds # Audio object with 10s of Pennsylvania bird song
  opso.birds_path # path to 10s audio clip
- Save mp3/opus format audio with custom bitrate mode and compression levels
  audio.save("output.mp3", bitrate_mode="VARIABLE", compression_level=0.8)
- interactive widget showing a preprocessor and its parameters when Preprocessor objects are returned from a notebook cell (expand sections to show the parameters of each action in the preprocessor's pipeline)
- BoxedAnnotations.from_raven_files allows the user to pass a list of options for the name of the annotation column rather than just one option
- added train_test_split method for BoxedAnnotations
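The audio_root idea from the highlights above can be illustrated with a small stand-alone sketch. This is not OpenSoundscape's implementation: resolve_audio_paths and the /data/project root are hypothetical, and the code only shows how short relative paths stored in a table might be joined to a machine-specific root.

```python
from pathlib import PurePosixPath


def resolve_audio_paths(relative_paths, audio_root):
    """Join each relative audio path to audio_root.

    Hypothetical helper for illustration; not part of the OpenSoundscape API.
    """
    root = PurePosixPath(audio_root)
    return [str(root / path) for path in relative_paths]


# The stored label/score tables keep only the short relative paths
clips = ["sub1/file1.wav", "sub2/file2.wav"]

# Moving the project to another machine only changes the root
laptop_paths = resolve_audio_paths(clips, "/Users/user/folder/project/")
server_paths = resolve_audio_paths(clips, "/data/project")

print(laptop_paths)
print(server_paths)
```

Because the relative paths never change, the same table of labels or scores can be reused on both machines.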
What's Changed
- enable passing list of options for annotation_column by @sammlapp in #1099
- implement audio_root argument by @sammlapp in #1103
- Version 0.12.0 release by @sammlapp in #1107
Full Changelog: v0.11.0...v0.12.0
v0.11.0
Opensoundscape v0.11.0
This release includes major feature additions, breaking changes to the API, and bug fixes. We support Python 3.9-3.12 and test on 3.9-3.11.
Major additions:
Machine Learning
- added support for training with Lightning, with opensoundscape.ml.lightning.LightningSpectrogramModule; Lightning implements various training performance enhancements and common workflows/strategies for ml training; supports hardware acceleration including distributed training; and supports logging to various logging platforms.
- major refactor of the CNN class increases the modularity; for instance, optimizer class and parameters can be configured through the .optimizer_params attribute, learning rate scheduler class and parameters can be configured through the .lr_scheduler_params attribute, modular methods can be overridden in subclasses for custom behavior (.training_step, .validation_step, .configure_optimizers, .train_dataloader, .predict_dataloader)
- change the output classes with CNN.change_classes()
- freeze all layers except classifier before training with CNN.freeze_feature_extractor() (see also freeze_layers_except())
- access the layers considered to be the "classifier" with CNN.classifier property
- CNN.__call__ allows intermediate layers to be returned with the intermediate_layers and avgpool_intermediates arguments
- embed() method for CNN and other classifier models to generate embeddings with same API as .predict()
- support for training BirdNet and Perch, tensorflow models from the model zoo (trains classification head, not entire model); see the tutorial notebook training_birdnet_and_perch.ipynb
- transfer learning tools for training shallow classifiers on pre-trained embedding models; see transfer_learning.ipynb tutorial notebook
- support for sparse label datatypes when training machine learning models; adds a class CategoricalLabels that stores annotations in a sparse format and produces various label formats. BoxedAnnotations.clip_labels() can create this type of object as a return type.
- support for automated mixed-precision training: set model.use_amp=True
Localization
- synchronizing audio files recorded with AudioMoth GPS firmware (see tools in opensoundscape.localization.audiomoth_sync.py and tutorials/synchronize_auidomoth_gps_recordings.py for an example script)
- review TDOA/cross correlation alignment for localized sound detections (see example in acoustic_localization.ipynb tutorial)
- new class PositionEstimate for storing the estimated position of a localized sound source separately from the SpatialEvent class
- adds least squares as an option for localization algorithm
- use timestamps in SpatialEvent and SynchronizedRecorderArray, which enables audio files with differing start times to be used within the objects
Audio:
- Audio.reduce_noise() applies the noisereduce algorithm
- MultiChannelAudio: a class for retaining multiple audio channels; similar API to Audio()
- Audio.trim_with_timestamps() to select a segment of audio based on a real-world timestamp
Preprocessing
- Preprocessors and Actions can be saved to and loaded from json or yaml, and created from or output to dictionaries
- CNN/SpectrogramClassifier objects, when saved, now also save the entire preprocessing pipeline and settings, allowing reproducibility without needing to pickle the object. Custom preprocessing steps or classes can be included if the user "registers" them, see preprocess_audio_dataset.ipynb tutorial in the "Save and Load Preprocessors" section
- PCEN Preprocessor class for applying per-channel energy normalization during preprocessing
- added audio-domain augmentations in the action_functions.py module: random_wrap_audio, audio_time_mask, audio_random_gain, audio_add_noise
- new public method load_metadata() to load audio metadata from file without loading the audio data
Annotation
- integration with the crowsetta package, which supports i/o for many common bioacoustics annotation formats. See BoxedAnnotations methods to_crowsetta and from_crowsetta
Notable changes to the API:
In general, if you are using code from an older OpenSoundscape version and something is broken, look at the docstrings of the associated functions and classes. Most of the time, you'll just need to change an argument name or make some other slight change. Reach out to an opensoundscape developer if you need help.
- BoxedAnnotations method renamed from one_hot_clip_labels to clip_labels, and now requires passing the name or position of the annotation column
- localization.py module refactored into a folder with sub-modules
- SpatialEvent.estimate_position() returns a PositionEstimate object, rather than returning self (but usage remains about the same; see acoustic_localization.ipynb tutorial); estimate_position() no longer modifies SpatialEvent in place, instead it returns an object with the position estimate and other data
- note that the CNN class still exists, but is an alias for a new class SpectrogramClassifier
- instead of a .go() method, Action class uses __call__(), which enables simply calling action(sample)
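The move from .go() to __call__() follows a general Python pattern: an object whose class defines __call__ can be invoked like a function. The sketch below uses a hypothetical ScaleAction class as a stand-in; it is not OpenSoundscape's Action class and makes no claim about how real actions handle samples.

```python
class ScaleAction:
    """Hypothetical stand-in for a preprocessing action; not OpenSoundscape's Action class."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, sample):
        # Defining __call__ lets the instance be used like a function
        return [value * self.factor for value in sample]


action = ScaleAction(factor=2)
sample = [0.1, 0.5, -0.25]

# action(sample) is equivalent to action.__call__(sample)
scaled = action(sample)
print(scaled)
```

Code written for older versions that called action.go(sample) would, under this pattern, call action(sample) instead.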
Merged Pull Requests since 0.10.2
- more flexible specification of overlap by @sammlapp in #1003
- avoid using slash character in wandb logging by @sammlapp in #1000
- check for labels outside range [0,1] by @sammlapp in #1001
- Solved Issue 889 preprocessor FutureWarning by @LeonardoViotti in #999
- resolve 911 change labels of Spectrogram.plot() and add kHz arg by @sammlapp in #1006
- only use internal pytorch architectures by @sammlapp in #1004
- resolves Many overlap_fractions don't produce results #945 by @sammlapp in #996
- Issue 855 one hot labels by @LeonardoViotti in #982
- crowsetta integration by @sammlapp in #764
- Issue 930 localization bug by @louisfh in #990
- add kwargs to Spectrogram.from_audio by @sammlapp in #1015
- 922 docs fixes + random typos by @syunkova in #1029
- expose audio.load_metadata in api by @sammlapp in #1037
- implement noise reduction by @sammlapp in #1025
- Implement embed() method by @sammlapp in #1021
- adds MultiChannelAudio class by @sammlapp in #1038
- rename Action.go() to Action.__call__ by @sammlapp in #1039
- use timestamps in localization tools by @sammlapp in #1020
- Refactor machine learning modules to enable Ligthning by @sammlapp in #1036
- Feat 350 pcen by @sammlapp in #1048
- Feat_categorical_labels by @sammlapp in #1053
- Fix index checking by @sammlapp in #1054
- use speed_of_sound attribute in methods by @sammlapp in #1055
- Changed annotation column name/id to one required parameter when loading Raven files into BoxedAnnotations by @syunkova in #1058
- Add audiomoth GPS sync tools and refactor localization into modules by @sammlapp in #1064
- Release 0.11.0 by @sammlapp in #1066
Full Changelog: v0.10.2...v0.11.0
v0.10.2
OpenSoundscape version 0.10.2
This release includes bug fixes to 0.10.1 and adds a few functionalities. It does not implement breaking changes.
New "Classifiers 101" Model Training Tutorial
We've added a brand new section of the documentation called Classifiers 101 that explains the process of training machine learning models, step by step.
New features:
Sample preprocessing time profiling
Profile the time taken by each action in a preprocessor when preprocessing a sample:
sample = preprocessor.forward(sample, profile=True)
# sample now contains a dictionary .runtime listing the time to complete each Action in preprocessor.pipeline
sample.runtime
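To make the idea of per-action profiling concrete, here is a minimal stand-alone sketch of timing each step of a pipeline. The run_pipeline function and the two toy steps are hypothetical; this is not the code behind preprocessor.forward(), only an illustration of how a runtime dictionary keyed by action name could be built.

```python
import time


def run_pipeline(sample, pipeline, profile=False):
    """Apply each (name, function) step in order, optionally timing each step.

    Hypothetical sketch for illustration; not OpenSoundscape's Preprocessor.forward().
    """
    runtime = {}
    for name, action in pipeline:
        start = time.perf_counter()
        sample = action(sample)
        if profile:
            # Record elapsed seconds for this step under its name
            runtime[name] = time.perf_counter() - start
    return sample, runtime


pipeline = [
    ("double", lambda values: [2 * v for v in values]),
    ("shift", lambda values: [v + 1 for v in values]),
]

result, runtime = run_pipeline([1, 2, 3], pipeline, profile=True)
print(result)
print(runtime)
```

A dictionary like this makes it easy to spot which step dominates preprocessing time.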
set_seed
Use opensoundscape.utils.set_seed() to simultaneously set torch, numpy, and random seeds and to get deterministic behavior, for instance when initializing and training pytorch models.
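As a rough illustration of what seeding several random number generators at once involves, the sketch below defines a hypothetical seed_everything helper. It is not the body of opensoundscape.utils.set_seed(); it only seeds Python's random module and, as an assumption about what is installed, numpy and torch when they can be imported.

```python
import random


def seed_everything(seed):
    """Seed Python's random module, plus numpy and torch if they are installed.

    Hypothetical helper for illustration; not opensoundscape.utils.set_seed().
    """
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # numpy is not installed, so there is nothing to seed
    try:
        import torch

        torch.manual_seed(seed)
    except ImportError:
        pass  # torch is not installed, so there is nothing to seed


seed_everything(0)
first = [random.random() for _ in range(3)]

seed_everything(0)
second = [random.random() for _ in range(3)]

# Re-seeding with the same value reproduces the same draws
print(first == second)
```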
Pull requests merged
- Feat_profile_preprocessing by @sammlapp in #937
- Intro to training classifier by @sanruizguz in #966
- Feat set seed by @LeonardoViotti in #929
- resolve spectrum estimation 'wrong' #947 by @sammlapp in #989
- resolves issue 924 futurewarning pandas by @louisfh in #988
- improve AudioTrim action by @sammlapp in #997
- release 0.10.2 by @sammlapp in #1008
Note that a branch release_0_10_2 was merged to master (rather than merging develop to master) because develop now includes breaking changes that will be released in v0.11.0.
Full Changelog: v0.10.1...v0.10.2
v0.10.1
Bug fixes to 0.10.0 and minor updates
This release contains bug fixes to some specific issues, including #881 and #872. It makes a small set of breaking (non backwards compatible) changes to BoxedAnnotations by changing the name of the attribute raven_files to annotation_files and DataFrame column "raven_file" to "annotation_file".
In particular, a bug in 0.10.0 preprocessing created samples including np.nan values, which results in CNN training and prediction also producing nan values. We've restored the behavior of clipping samples to a decibel range (default [-100,-20]) so that they can't contain -inf, which fixes this behavior.
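The effect of clipping to a decibel range can be sketched with plain Python. The power_to_clipped_db helper below is hypothetical and is not OpenSoundscape's preprocessing code; it assumes values are power values converted with 10*log10, and it uses the default range [-100, -20] mentioned above to show that a zero-power value, which would otherwise map to -inf, ends up at the lower bound.

```python
import math


def power_to_clipped_db(values, db_range=(-100.0, -20.0)):
    """Convert power values to decibels and clip them into db_range.

    Hypothetical helper for illustration; not OpenSoundscape's preprocessing code.
    """
    low, high = db_range
    clipped = []
    for power in values:
        # Zero power corresponds to -inf dB; handle it explicitly because math.log10(0) raises
        db = 10 * math.log10(power) if power > 0 else float("-inf")
        clipped.append(min(max(db, low), high))
    return clipped


clipped = power_to_clipped_db([0.0, 1e-5, 1.0])
print(clipped)
```

After clipping, every value is finite and lies within the chosen range.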
This release also adds additional top-level imports, for instance the localization classes SpatialEvent and SynchronizedRecorderArray are now exposed at the top-level (eg, from opensoundscape import SynchronizedRecorderArray). For changes to OpenSoundscape v0.9.x, see the release notes and discussion page for v0.10.0
A new method of the Audio class trim_samples allows trimming Audio by sample positions (as opposed to by time in seconds with .trim())
What's Changed
- loca docs hotfix by @louisfh in #882
- Issue_872_annotations by @sammlapp in #894
- refactor SpectrogramToTensor action by @sammlapp in #892
- 0.10.1 release by @sammlapp in #895
- update version number to 0.10.1 by @sammlapp in #898
Full Changelog: v0.10.0...v0.10.1
v0.10.0
OpenSoundscape v0.10.0
This release contains substantial updates to the OpenSoundscape code base, including some breaking changes, new features, and bug fixes.
Highlights:
- Use pre-trained bird identification models including BirdNET and Google Perch through the Bioacoustics Model Zoo. Examples here with each supported model. A quick example with Google Perch:
import torch
model=torch.hub.load('kitzeslab/bioacoustics_model_zoo', 'Perch')
predictions = model.predict(['test.wav']) #predict on the model's classes
embeddings = model.generate_embeddings(['test.wav']) #generate embeddings on each 5 sec of audio
- The acoustic localization module, still in Beta, has been updated with improved tools. We've also added a tutorial notebook in the documentation
- Enhanced automatic Weights and Biases (wandb) logging from the CNN class now includes GradCAM visualizations of samples during CNN.predict() and per-class area-under-the-curve metrics during CNN.train()
- Apple Silicon GPU (mps) is automatically used by CNN class if available
- Overlay (mixup) augmentation provides the ability to only perform overlay on a subset of training samples
- Audio class has new filtering methods .lowpass() and .highpass()
- The documentation at opensoundscape.org has been updated including refreshed tutorial notebooks, which can now be run interactively on Google Colab without requiring any installation of Python or packages on the user's computer.
- We've added utilities (opensoundscape.ml.utils.collate_audio_samples_to_tensors()) and documentation (see "Orientation for PyTorch users") for PyTorch users using OpenSoundscape only for preprocessing
- The behavior of Spectrogram and MelSpectrogram has been improved, resolving issues that created -inf values and/or clipped the values of the spectrogram to a finite range.
The full API is documented at opensoundscape.org. You'll also find detailed tutorials on the most common use-cases for OpenSoundscape: automated sound identification with machine learning and signal processing; training machine learning algorithms to detect sounds; localizing acoustic sound sources; manipulating audio, spectrograms and annotations.
Also check out the Bioacoustics Cookbook for examples of common workflows, including the use of a config file for CNN training experiments.
What's Changed
- resolve MelSpectrogram has all -inf values #794 by @sammlapp in #802
- resolve #792 no default shape by @sammlapp in #804
- Issue 789 per class metrics by @sammlapp in #806
- Issue 779 load model helpful error message by @sammlapp in #805
- automatically use mps if available by @sammlapp in #809
- dont log_graph in wandb.watch by @sammlapp in #808
- add Audio.highpass and Audio.lowpass by @sammlapp in #815
- Issue 784 boxed ann by @sammlapp in #813
- #769: Raven file not retained in BoxedAnnotations if keep_extra_column not set to True by @syunkova in #812
- Add links to source code in docs by @sammlapp in #836
- Issue 840 posix paths by @louisfh in #846
- fix Audio.from_url raising AxisError #837 by @sammlapp in #843
- use local version of ScoreCAM with bug fix by @sammlapp in #842
- add criterion_fn to overlay (selective overlay) by @sammlapp in #839
- Issue 767 tqdm by @sammlapp in #834
- Support tensorflow model hub inference by @sammlapp in #835
- Issue 768 cam to cpu by @sammlapp in #810
- MPS memory issue on github macos runner workarouns by @louisfh in #851
- Docs refresh by @rhine3 in #848
- Feat localize parallelize by @louisfh in #783
- Add notebook for localization tutorial by @louisfh in #853
- Issue 790 pytorch integration by @sammlapp in #849
- make copy in one_hot_labels_on_time_interval by @sammlapp in #858
- Add GradCAM visualization to wandb during CNN.predict by @sammlapp in #862
- Issue_772_bandpass_warning by @sammlapp in #860
- Tutorial updates by @sammlapp in #866
- Issue 793 spectrogram limits by @sammlapp in #868
- refactor safe_dataset resolve #672 by @sammlapp in #869
- 0.10.0 release by @sammlapp in #867
Full Changelog: v0.9.1...v0.10.0
v0.9.1
OpenSoundscape v0.9.1
This release includes bug fixes and small changes to OpenSoundscape, and moves two modules previously in opensoundscape to separate packages:
- taxa module is now a separate github repo
- aru module has been removed and is now a separate github repo and package available on PyPI: pip install aru_metadata_parser (and is a dependency of OpenSoundscape)
- the localization module has been improved with bug fixes, including corrected cross-correlation algorithm for gcc
- several issues and bugs have been resolved (see below for details)
What's Changed
- Increase tolerance of Audio too short warning by @louisfh in #728
- for metadata values, replace empty string with space by @sammlapp in #713
- fix failing tests for Python 3.10: avoid deprecated pd.Series.append by @sammlapp in #739
- resolve #727 AudioSample repr error by @sammlapp in #740
- change import convention for scipy by @sammlapp in #738
- documentation updates by @sammlapp in #741
- resolve issues for BoxedAnnotations by @sammlapp in #742
- add check for artist field to prevent KeyError by @syunkova in #748
- metadata parsing resolve #745 by @sammlapp in #750
- don't fail in predict() with empty sample list by @sammlapp in #755
- remove taxa module and resources by @sammlapp in #759
- Issue 756 aru package by @sammlapp in #758
- Issue 723 gcc tdoa by @louisfh in #724
- refactor audio.extend by @sammlapp in #763
- version 0.9.1 by @sammlapp in #760
Full Changelog: v0.9.0...v0.9.1
v0.9.0
Opensoundscape v0.9.0
This release represents a major update to the OpenSoundscape library, including new features, bug fixes, and some breaking changes.
New feature highlights
- Localization of sounds from time-synchronized recorders with the localization module
- Class activation mapping in deep learning models with several flavors of GradCAM and guided backpropagation using the cam module or CNN.generate_cams()
- BoxedAnnotations class now supports loading, saving, and manipulating labels (and Raven formatted txt files) for many audio files at once, instead of just one audio file
- New methods and properties for the Audio class, including .show_widget(), .dBFS, .rms, .from_url(), .normalize(), and .apply_gain()
- Methods for loading and saving CNN class across OpenSoundscape versions (.save_torch_dict() and .from_torch_dict())
What's Changed
- add missing mock-imports by @sammlapp in #650
- merge hotfix from master to develop by @sammlapp in #651
- 578: tutorial download links by @syunkova in #658
- Implement wandb.watch in CNN.train() by @sammlapp in #675
- Class activation maps by @sammlapp in #676
- change default resample type to 'soxr_hq' by @sammlapp in #682
- implement save_torch_dict and load_torch_dict by @sammlapp in #683
- Refactor modules by @sammlapp in #686
- warn usr if load_model loads model of old version resolves #661 by @sammlapp in #689
- Issue 679 retain metadata when using Audio methods by @sammlapp in #690
- Resolves #579 boxed annotations by @sammlapp in #688
- Audio class display and from_url by @sammlapp in #691
- add gcc (generalized cross correlation) function and tests by @louisfh in #693
- Feat refact localizer by @louisfh in #697
- v0.9.0 by @sammlapp in #702
Full Changelog: v0.8.0...v0.9.0
v0.8.0
OpenSoundscape 0.8.0 Release
This release represents a substantial update to opensoundscape, containing new features, bug fixes, and breaking changes to some parts of the API.
New Features:
The notebook New in OpenSoundscape 0.8.0 summarizes many of the new features, with code examples. New features include:
- logging progress of model training and prediction to Weights and Biases
- writing mp3 files (thanks to soundfile 1.1.0)
- training CNNs on long audio files without splitting into clips
- updated Audio metadata reading and writing
- Audio classmethods noise and silence
- Audio method normalize
- top level imports of most used classes. Eg, from opensoundscape import Audio, Spectrogram, CNN
- debug preprocessing by passing raise_errors=True to train() and predict(), which raises Preprocessing errors instead of catching them
- option to keep all intermediate versions of preprocessed sample with preprocessor.forward(...,trace=True)
- new augmentations in actions module: audio_random_gain and audio_add_noise
- added support for EfficientNet architecture
- new Audio object properties for .rms and .dBFS
Breaking changes:
- Audio.duration is now a property (use .duration not .duration()), as are several Spectrogram attributes that were previously methods
- CNN.predict() now only returns one argument, the dataframe of scores. Optionally, return a second argument with invalid samples list. Use predict_multi_target_labels or predict_single_target_labels from the metrics module to generate 0/1 outputs from continuous scores.
- a few arguments to functions have changed names and order
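Since CNN.predict() now returns continuous scores only, turning them into 0/1 outputs is a separate step. The sketch below shows the two general approaches on plain lists. The helper names, the >= comparison, and the example scores are assumptions made for illustration; they are not the signatures or behavior of the functions in the metrics module.

```python
def scores_to_multi_target_labels(scores, threshold):
    """Mark every class whose score is at or above threshold (any number of classes per row)."""
    return [[1 if score >= threshold else 0 for score in row] for row in scores]


def scores_to_single_target_labels(scores):
    """Mark only the highest-scoring class in each row."""
    labels = []
    for row in scores:
        best = row.index(max(row))
        labels.append([1 if i == best else 0 for i in range(len(row))])
    return labels


# One row per clip, one column per class (made-up scores)
scores = [[2.3, -1.0, 0.4], [-3.2, 0.1, -0.5]]

multi = scores_to_multi_target_labels(scores, threshold=0.0)
single = scores_to_single_target_labels(scores)

print(multi)
print(single)
```

The multi-target version can flag several classes in one clip, while the single-target version always flags exactly one.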
Merged Pull Requests
- update docs to clarify loading old models by @sammlapp in #572
- update docs to clarify loading old models by @sammlapp in #571
- Handles file with zero annotations by @jatinkhilnani in #593
- Updated BoxedAnnotations docstring by @jatinkhilnani in #594
- Minor documentation updates by @sammlapp in #608
- resolves #591 by @sammlapp in #612
- Adds Python 3.9 version to CI and a macos runner. by @louisfh in #595
- #552 Standardize references to 'model' object by @louisfh in #587
- fixes #558 with named args for Spectrogram methods by @louisfh in #585
- Support writing mp3 (soundfile 0.11) by @sammlapp in #613
- resolves #614 - avoid bug in load channels of mono by @sammlapp in #616
- Add WandB logging and Support training on a dataframe of clip labels by @sammlapp in #609
- log top-scoring samples to wandb (resolves #575) by @sammlapp in #619
- Feat 478 efficientnet and refactor cnn_architectures by @sammlapp in #620
- Bump dependencies by @sammlapp in #621
- Feat 569 audio by @sammlapp in #622
- Resolves #563 mps logit not implemented by @sammlapp in #626
- Add mix and concat by @sammlapp in #623
- explicit arguments for scipy.spectrogram by @sammlapp in #627
- use explicit arguments for softmax() by @sammlapp in #629
- add json comment-string metadata support by @sammlapp in #632
- Issue561 - Flat Module Structure in Docs by @syunkova in #638
- add audio_random_gain in actions by @sammlapp in #637
- modify predict to only return score df by @sammlapp in #628
- Audio add noise augmentation by @sammlapp in #639
- Issue 617 clipdf by @sammlapp in #631
- Issue 636 one output node for single-class classification by @sammlapp in #641
- Change default args from lists to tuple by @sammlapp in #644
- Issue543_v2: Use_pretrained is deprecated by @syunkova in #645
- Fixed - preprocess - actions by @jatinkhilnani in #570
- Issue 562 preprocessing raise errors by @louisfh in #646
- Issue_510_annotations by @sammlapp in #647
- Debug/trace feature by @jatinkhilnani in #596
- version 0.8.0 release by @sammlapp in #648
Full Changelog: v0.7.1...v0.8.0
v0.7.1
OpenSoundscape v0.7.1
This release updates dependencies, fixes various bugs, updates documentation, and adds a few small features.
Notable feature additions
- Parsing metadata from AudioMoth firmware through 1.8.1 and from SongMeter Micros is now supported. The Audio.from_file method will automatically parse metadata and store it in the object's .metadata dictionary unless you specify metadata=False
- Audio.save() now saves metadata, so that you can retain or update .wav file metadata when using OpenSoundscape
- ResNet networks with pre-trained weights and a number of input channels other than 3 now average conv1 weights over the input channels
Dependencies
Python 3.9 no longer causes installation issues. This release updates OpenSoundscape's package dependencies in a way that allows Python 3.9 environments. Users with Apple Silicon chips should be able to install OpenSoundscape hassle-free in a Python 3.9 environment (but may experience issues with Python 3.7 and 3.8).
If you encounter problems with installation, regardless of your platform, please open an issue.
What's Changed: a list of merged Pull Requests
- add scaling option to the Spectrogram class and MelSpectrogram class by @zoharrpg in #505
- Annotation docs hotfix by @rhine3 in #517
- Minor updates to annotation tutorial by @rhine3 in #516
- Fixes Poetry Installer issue during CI by @louisfh in #535
- change filepath of poetry by @louisfh in #536
- Removed redundant empty load method from base model modules by @jatinkhilnani in #537
- Update black version by @sammlapp in #539
- Issue 472 spectrogram methods by @louisfh in #534
- Resolved - Resample with replacement #479 by @jatinkhilnani in #497
- resolve 520 skip loading metadata by @sammlapp in #541
- resolve 513 Spectrogram loses properties on out-of-place method by @sammlapp in #542
- Fix loading of ResampleLoss CNN models by @sammlapp in #544
- Resolve Issue 507 metrics by @sammlapp in #540
- update pre-commit black version to 22.8.0 by @sammlapp in #551
- Issue 531 final clip by @jatinkhilnani in #550
- Issue 496 tests for cnn arch by @louisfh in #545
- different # input channels should retain pre-trained weights by @sammlapp in #546
- Issue #523 AudioMoth 1.5-1.8.1 MetaData by @syunkova in #553
- Sammlapp/issue391 by @sammlapp in #554
- v0.7.1 by @sammlapp in #557
Full Changelog: v0.7.0...v0.7.1
v0.7.0
This release marks a significant change to the OpenSoundscape API, the addition of several new features, and the resolution of various bugs. Updates include:
- simplified training and prediction with CNNs / pytorch machine learning models
- incorporated preprocessing into CNN class so that all necessary parameters and settings are bundled with saved object
- extract audio from AudioMoth files based on a real-world timestamp
- more flexible spectrogram generation parameters in Spectrogram.from_audio
- load specific channels of multi-channel audio file
- more flexible validation step and choice of metrics when training PyTorch models
- refreshed tutorials
as well as fixes to various bugs and resolution of various issues.
Merged branches
- Tutorial updates by @jatinkhilnani in #475
- add function to load separate channels from audio file by @sammlapp in #492
- Issue 401 change mp3s in tutorials to wav by @louisfh in #490
- Refactor preprocessing and models.cnn by @sammlapp in #489
- v0.7.0 minor release by @sammlapp in #499
New Contributors
- @jatinkhilnani made their first contribution in #475
Full Changelog: v0.6.2...v0.7.0