Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maximum Likelihood Demonstration #26

Draft
wants to merge 23 commits into
base: main
Choose a base branch
from
Draft

Conversation

charlesknipp
Copy link
Collaborator

@charlesknipp charlesknipp commented Jan 8, 2025

Overview

I added a quick little MLE demonstration, which should work for almost every AD backend (more on this later). This particular example was taken from Kalman.jl using Optimisers.jl to create a custom Newton's method.

A Note on Automatic Differentiation

All the AD here is done via DifferentiationInterface.jl to support something more universal. This allows us to quickly test various backends, switching to whichever suits the users' needs. Luckily, in this instance, every relevant backend can at least evaluate gradients. Unfortunately, since this is a Newton's method, the Hessian fails when using Enzyme.

Second Order Differentiation

Mooncake is still working on their implementation, which is very exciting news; although that means it's functionality is limited to only basic gradient descent with appropriate hyperparametrs.

Enzyme on the other hand, fully supports the calculation of the Hamilton vector product. In our demonstration however, this computation will fail with both DI and base Enzyme.

Requests

  • Make sure to dev both SSMProblems and GeneralisedFilters before resolving the toml
  • If you have any suggestions, feel free to tear it to shreds

@charlesknipp
Copy link
Collaborator Author

I have been testing various backends and there seem to be some problems. I'm not entirely sure it's our fault, but it seems like Zygote, Enzyme, and Mooncake all yield some sort of error.

Zygote fails rather spectacularly because the observation coefficient matrix is non-square. I'm not sure how this even factors into the computation of the gradient; and, if I'm being honest, if Zygote is the one we can't get working then I'm okay with that.

Both Enzyme and Mooncake error specifically when computing the Hessian. For Mooncake, this is due to some recursive tangent type error; I'm not entirely sure why this wasn't an issue with the gradient, but I think there may be some problems. Enzyme just has issues with the additional context (the Constant seen in the objective), which is an underlying bug in the way it interacts with DifferentiationInterface. Lastly, the gradient computation for both is incredibly slow, even for shorter series. Which leads me to believe there might be something else under the hood.

Takeaways

Of all the working interfaces, I'm glad it's ForwardDiff since it's a stable and widely adopted framework. Although, the future of AD is clearly in the direction of both Mooncake and Enzyme. In the future, I'd really like to get those two backends to work consistently.

@THargreaves
Copy link
Collaborator

I don't have the environment properly set up since GeneralisedFilters is not on the official registry, and I can't figure out how to add it to Project.toml

I've just been deving it using the relative path. But yes, should get it registered soon. Was just waiting to tidy things up a touch more but think it should be there by the end of the week.

@THargreaves
Copy link
Collaborator

All sounds very strange, especially with it not working for non-square H. Glad the ForwardDiff version is working though.

I'll see if Will (Mooncake.jl author) can have a quick look and see if anything stands out to him.

yebai added a commit that referenced this pull request Mar 4, 2025
* Initial commit

* Readme

* Project

* Prototype design.  (#2)

* prototype

* Update SSMProblems.jl

* add logM (#3)

* Convert example into docstring.

* Move `logM`.

---------

Co-authored-by: FredericWantiez <[email protected]>

* export

* example

* Fred/ancestor (#5)

* Gibbs

* Add ancestor resampling

* Better names

* Clean up package (#6)

* Gibbs

* Setup github

* Update Readme

* Docs

* GH actions

* Format

* Upgrade node

* Use GH token

* Fix links

* Clean up

* Write a proper example implementation (#7)

* SMC

* remove old file

* Fix types

* Apply suggestions from code review

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: Hong Ge <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Some minor changes (#8)

* use recommended style for interface methods

* minor changes to example

* Update smc.jl

* Update smc.jl

---------

Co-authored-by: Hong Ge <[email protected]>

* Particle filter example bug fix and refactoring (#10)

* fix: corrected observation generation for
particle filter example

* refactor: tidied particle filter example code

- Removed recursive particle show method which flooded REPL
- Removed redundant resampling logic
- Replaced variance with std in Normal() calls
- Tidied final scatter plot

* Updated formatting for named argument

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: corrected flipped noise standard deviations

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fix format action (#13)

* Fix docs action (#12)

* Fix docs action

* Add DOCUMENTER_KEY

* Use julia-docdeploy action

* Show link to docs preview (#15)

* Update documentation (#16)

* Add details to doc

* Fix source

* Typo

* Update SSM Interface

* Fix linearize bug

* Interface

* Format

* Trying things

* Fix transition!!

* Format

* Fix doc

* Helper

* Format

* Forget about particles

* Optional timestep

* Add utils

* Apply suggestions from code review

Co-authored-by: David Widmann <[email protected]>

* Utils module

---------

Co-authored-by: David Widmann <[email protected]>

* Update README.md (#17)

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update index.md

* Update documentation to match new interface (#18)

* Update documentation to match new interface

* Update index.md

---------

Co-authored-by: Hong Ge <[email protected]>

* Build examples with doc (#19)

* Build examples with doc

* Reduce size of plot

* Colors

* Size option

* Increase size per page

* Update README.md

* Update Project.toml

* Modify SSMProblems to work with AbstractMCMC interface (#22)

* Update documentation to match new interface

* Incorporate SSMProblems into AbstractMCMC

* Update Project.toml (#24)

* Update Project.toml

* Update make.jl

---------

Co-authored-by: Hong Ge <[email protected]>

* Add Kalman filter example (#26)

* Add Kalman filter example

* Fix formatting issues

* Add literate for docs

* add missing deps

* Comments for literate

* Format

* Use `Gaussian` (#28)

* Format, use `Gaussian`

* Fix the maths

* Format

* Tweaks

* Update script.jl

* Update script.jl

---------

Co-authored-by: Hong Ge <[email protected]>

* Update Project.toml

---------

Co-authored-by: FredericWantiez <[email protected]>
Co-authored-by: Hong Ge <[email protected]>
Co-authored-by: Hong Ge <[email protected]>

* Update script.jl

* Update script.jl

* Create DocsNav.yml

* Add example script for PMMH (#37)

* Add example script for PMMH

* Add Literate.jl

* Update script.jl

* Update script.jl

* Update DocsNav.yml

* Update SSMProblems.jl interface (#38)

* Add split dynamics/observation interface with "extra" variables

* Add utilities for forward simulation and distribution definitions

* Removed redundant particle container code

* Update naming convention for initialisation log-density

Co-authored-by: Hong Ge <[email protected]>

* Update initialisation naming

Co-authored-by: Hong Ge <[email protected]>

* Update naming convention

Co-authored-by: Hong Ge <[email protected]>

* Change sampler to AbstractMCMC

Co-authored-by: Hong Ge <[email protected]>

* Add section heading for SSM

Co-authored-by: Hong Ge <[email protected]>

* Remove redundant method check

* Correct dependencies

* Revert to positional arguments

* Correct forward simulation element type

* Update Kalman filter example to new interface

* Fix formatting issue

* Add missing import

Co-authored-by: Charles Knipp <[email protected]>

* Add default rngs to samplers through macro

* Remove unnecessary section heading

* Add missing dependency

* Tidied Kalman filter example

* Update documentation main page

* Fully document Kalman filter example

* Remove outdated examples

* Add documentation for extra argument

* Apply suggestions from code review

* Update script.jl

* Update examples/kalman-filter/script.jl

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update script.jl

* Remove PMMH (until new API)

* Remove ref to 'Utils'

* Fix broken link

* Remove default RNG macro

* Simplify interface methods

* Correct old function names

* Add parametric type to Kalman filter

* Update main doc page

* Update README

* Make parameter order consistent

---------

Co-authored-by: Hong Ge <[email protected]>
Co-authored-by: Charles Knipp <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: FredericWantiez <[email protected]>

* Minor tweaks and typo fixes  (#41)

* Update index.md

* Update index.md

* Update Project.toml

* Suppress output in example script

---------

Co-authored-by: THargreaves <[email protected]>

* Split method definitions to avoid docstring overwriting (#42)

* Function docstring formatting (#45)

* added TagBot & CompatHelper workflows (#47)

* added TagBot & CompatHelper workflows

* using existing Documenter Key for CompatHelper

* CompatHelper: add new compat entry for Distributions at version 0.25, (keep existing compat) (#48)

Co-authored-by: CompatHelper Julia <[email protected]>

* Update DocsPreviewCleanup.yml

* TagBot Permission Issue fixed (#50)

* Add DOCUMENTER_KEY to Docs workflow (#52)

* Update DocsPreviewCleanup.yml

* Correct type signature for forward simulation method

* Alignment of obs/dyn time steps and refactored forward simulation (#55)

* Update interface documentation to align dyn/obs time steps

* Refactor forward simulation, add type parameters, add unit test

* Add type parameters to docstrings

* Update kalman example

* Add extra for initialisation and simplify obs type parameter

* Bump minor version

* Update unit tests to match aligned interface

* Fix code comment rendering

* Interface Changes for Use in Filtering (#56)

* added basic particle methods and filters

* added qualifiers

* added parameter priors

* added adaptive resampling to bootstrap filter (WIP)

* Julia fomatter changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* changed eltype for <: StateSpaceModel

* updated naming conventions

* formatter

* fixed adaptive resampling

* added particle ancestry

* formatter issues

* fixed metropolis and added rejection resampler

* Keep track of free indices using stack

* updated particle types and organized directory

* weakened SSM type parameter assertions

* improved particle state containment and resampling

* added hacky sparse ancestry to example

* fixed RNG in rejection resampling

* improved callbacks and resamplers

* formatting

* added conditional SMC

* improved linear model type structure

* formatter

* replaced extra with kwargs

* formatter

* migrated filtering code

* Add unittests for new interface

* Update documentation to match kwargs

* Rename extras/kwargs docs file

* remove redundant forward simulations

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Tim Hargreaves <[email protected]>

* Bump 0.4.0 (#58)

* Update docs to match kwargs interface

* Add method definitions for batch simulation/log-densities

* Bump minor version

* Correct docstring overwriting for batch methods

* Update type parameters to contain both arithmetic and element type

* Correct docstring indentations

* Correct RBPF forward simulation

* Documentation and Turing Navigation CI improvement (#61)

* Update Docs.yml

* Update DocsNav.yml

* No need of deploydocs() after using new Docs & DocsNav workflows

* Remove research files from repository

* removed SSMProblems README, LICENSE, GHA workflow, JULIAFORMATTER in favor of merger

* SSMProblems: added missing docstring to avoid documentation failure

* added pkg_path in Docs workflow to fix package development

---------

Co-authored-by: Hong Ge <[email protected]>
Co-authored-by: FredericWantiez <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Tor Erlend Fjelde <[email protected]>
Co-authored-by: Tim Hargreaves <[email protected]>
Co-authored-by: David Widmann <[email protected]>
Co-authored-by: Hong Ge <[email protected]>
Co-authored-by: Charles Knipp <[email protected]>
Co-authored-by: THargreaves <[email protected]>
Co-authored-by: Will Tebbutt <[email protected]>
Co-authored-by: CompatHelper Julia <[email protected]>
Co-authored-by: Penelope Yong <[email protected]>
@charlesknipp charlesknipp force-pushed the ck/maximum-likelihood branch from 6b9b3bb to acbc4e1 Compare March 7, 2025 21:02
@shravanngoswamii
Copy link
Member

  • I don't have the environment properly set up since GeneralisedFilters is not on the official registry, and I can't figure out how to add it to Project.toml

I guess, this may help you:
https://pkgdocs.julialang.org/dev/toml-files/#The-[sources]-section

Monorepo example is what we need:

[sources]
Example = {url = "https://github.com/JuliaLang/Example.jl", rev = "custom_branch"}
WithinMonorepo = {url = "https://github.org/author/BigProject", subdir = "SubPackage"}
SomeDependency = {path = "deps/SomeDependency.jl"}

@gdalle
Copy link

gdalle commented Mar 9, 2025

Hi there! I'd be happy to help you figure out AD-related issues.

  • Mooncake is not expected to handle second-order derivatives at the moment, we're working on it with @willtebbutt but it's a daunting task.
  • Zygote's bug is rather disturbing, I can't really explain it.
  • Enzyme even errors for the gradient when I try it on the latest version of everything. This makes it difficult for me to debug the Hessian. Since January, I have changed the DI.gradient bindings to forward as much as possible to Enzyme without meddling, so that might explain the difference wrt to @charlesknipp's first post. Could you maybe try it again to confirm that we're seeing the same things?

In general, when you encounter something you think is an issue with DI, I encourage you to open an issue on the DI repo with an MWE. It would help tremendously :)

@charlesknipp
Copy link
Collaborator Author

charlesknipp commented Mar 10, 2025

Enzyme even errors for the gradient when I try it on the latest version of everything. This makes it difficult for me to debug the Hessian. Since January, I have changed the DI.gradient bindings to forward as much as possible to Enzyme without meddling, so that might explain the difference wrt to @charlesknipp's first post. Could you maybe try it again to confirm that we're seeing the same things?

@gdalle I get an error for illegal type analysis with strict aliasing turned on. The computation succeeds with Enzyme.API.strictAliasing!(false), which is good but suboptimal. it works with the latest commit.

Just to clarify, I'm working with DI version 0.6.43 and Enzyme version 0.13.30 on Julia 1.10.4 (my work computer)

EDIT: I'm not sure why, but forcing symmetry on the innovation covariance caused Enzyme to report a type instability. I converted it to a symmetric type instead, and somehow that fixed the inconsistency.

@gdalle
Copy link

gdalle commented Mar 10, 2025

I'm afraid we'll have to concoct a pure-Enzyme MWE for the Hessian error, otherwise the Enzyme devs won't want to debug it

@yebai
Copy link
Member

yebai commented Mar 10, 2025

Second Order Differentiation

It is better to implement the basic ones first, i.e. first-order optimisation.

@charlesknipp
Copy link
Collaborator Author

charlesknipp commented Mar 11, 2025

@yebai good call. I'll focus on the non-linear methods from now on. The resampling step is likely to cause some trouble. In some variational algorithms here I was able to use Zygote.@ignore to stop the gradient calculation consistent with the literature.

@gdalle I added a minimum working example, isolated from the SSMProblems/GeneralisedFilters interface. The gradient works fine, but the Hessian needs some work. I created some sugar akin to their demo from here, which clearly doesn't work. If you have some ideas let me know. I'm not sure how DI interfaces with Enzyme for the Hessian.

@gdalle
Copy link

gdalle commented Mar 11, 2025

Your MWE looks simple enough to open an issue on Enzyme, so I'd say that's the right move

THargreaves and others added 10 commits March 11, 2025 17:28
The batch Kalman filter was using two prototype custom CUDA kernels for performing fast Cholesky and regular-transposed matrix multiplication. The CUBLAS version of the former (technically LU) has been sped up now by better pointer creation in CUDA.jl. The second is still very slow and is the bottleneck to the computation. The prototype replacement is not fully tested though and even though it passed this packages unit tests, I found a case where it was giving unexplained incorrect values. For safety, it will be removed for now until the custom batch kernels are fully tested.
log_weights gives a more reliable length than particles, which may be a strange type that we haven't defined an interface for yet.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants