Maximum Likelihood Demonstration #26

charlesknipp · 2025-01-08T18:30:58Z

Overview

I added a quick little MLE demonstration, which should work for almost every AD backend (more on this later). This particular example was taken from Kalman.jl using Optimisers.jl to create a custom Newton's method.

A Note on Automatic Differentiation

All the AD here is done via DifferentiationInterface.jl to support something more universal. This allows us to quickly test various backends, switching to whichever suits the users' needs. Luckily, in this instance, every relevant backend can at least evaluate gradients. Unfortunately, since this is a Newton's method, the Hessian fails when using Enzyme.

Second Order Differentiation

Mooncake is still working on their implementation, which is very exciting news; although that means it's functionality is limited to only basic gradient descent with appropriate hyperparametrs.

Enzyme on the other hand, fully supports the calculation of the Hamilton vector product. In our demonstration however, this computation will fail with both DI and base Enzyme.

Requests

Make sure to dev both SSMProblems and GeneralisedFilters before resolving the toml
If you have any suggestions, feel free to tear it to shreds

charlesknipp · 2025-01-08T21:28:42Z

I have been testing various backends and there seem to be some problems. I'm not entirely sure it's our fault, but it seems like Zygote, Enzyme, and Mooncake all yield some sort of error.

Zygote fails rather spectacularly because the observation coefficient matrix is non-square. I'm not sure how this even factors into the computation of the gradient; and, if I'm being honest, if Zygote is the one we can't get working then I'm okay with that.

Both Enzyme and Mooncake error specifically when computing the Hessian. For Mooncake, this is due to some recursive tangent type error; I'm not entirely sure why this wasn't an issue with the gradient, but I think there may be some problems. Enzyme just has issues with the additional context (the Constant seen in the objective), which is an underlying bug in the way it interacts with DifferentiationInterface. Lastly, the gradient computation for both is incredibly slow, even for shorter series. Which leads me to believe there might be something else under the hood.

Takeaways

Of all the working interfaces, I'm glad it's ForwardDiff since it's a stable and widely adopted framework. Although, the future of AD is clearly in the direction of both Mooncake and Enzyme. In the future, I'd really like to get those two backends to work consistently.

THargreaves · 2025-01-08T22:30:18Z

I don't have the environment properly set up since GeneralisedFilters is not on the official registry, and I can't figure out how to add it to Project.toml

I've just been deving it using the relative path. But yes, should get it registered soon. Was just waiting to tidy things up a touch more but think it should be there by the end of the week.

THargreaves · 2025-01-08T22:40:35Z

All sounds very strange, especially with it not working for non-square H. Glad the ForwardDiff version is working though.

I'll see if Will (Mooncake.jl author) can have a quick look and see if anything stands out to him.

* Initial commit * Readme * Project * Prototype design. (#2) * prototype * Update SSMProblems.jl * add logM (#3) * Convert example into docstring. * Move `logM`. --------- Co-authored-by: FredericWantiez <[email protected]> * export * example * Fred/ancestor (#5) * Gibbs * Add ancestor resampling * Better names * Clean up package (#6) * Gibbs * Setup github * Update Readme * Docs * GH actions * Format * Upgrade node * Use GH token * Fix links * Clean up * Write a proper example implementation (#7) * SMC * remove old file * Fix types * Apply suggestions from code review Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> --------- Co-authored-by: Hong Ge <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Some minor changes (#8) * use recommended style for interface methods * minor changes to example * Update smc.jl * Update smc.jl --------- Co-authored-by: Hong Ge <[email protected]> * Particle filter example bug fix and refactoring (#10) * fix: corrected observation generation for particle filter example * refactor: tidied particle filter example code - Removed recursive particle show method which flooded REPL - Removed redundant resampling logic - Replaced variance with std in Normal() calls - Tidied final scatter plot * Updated formatting for named argument Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix: corrected flipped noise standard deviations --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Fix format action (#13) * Fix docs action (#12) * Fix docs action * Add DOCUMENTER_KEY * Use julia-docdeploy action * Show link to docs preview (#15) * Update documentation (#16) * Add details to doc * Fix source * Typo * Update SSM Interface * Fix linearize bug * Interface * Format * Trying things * Fix transition!! * Format * Fix doc * Helper * Format * Forget about particles * Optional timestep * Add utils * Apply suggestions from code review Co-authored-by: David Widmann <[email protected]> * Utils module --------- Co-authored-by: David Widmann <[email protected]> * Update README.md (#17) * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update index.md * Update documentation to match new interface (#18) * Update documentation to match new interface * Update index.md --------- Co-authored-by: Hong Ge <[email protected]> * Build examples with doc (#19) * Build examples with doc * Reduce size of plot * Colors * Size option * Increase size per page * Update README.md * Update Project.toml * Modify SSMProblems to work with AbstractMCMC interface (#22) * Update documentation to match new interface * Incorporate SSMProblems into AbstractMCMC * Update Project.toml (#24) * Update Project.toml * Update make.jl --------- Co-authored-by: Hong Ge <[email protected]> * Add Kalman filter example (#26) * Add Kalman filter example * Fix formatting issues * Add literate for docs * add missing deps * Comments for literate * Format * Use `Gaussian` (#28) * Format, use `Gaussian` * Fix the maths * Format * Tweaks * Update script.jl * Update script.jl --------- Co-authored-by: Hong Ge <[email protected]> * Update Project.toml --------- Co-authored-by: FredericWantiez <[email protected]> Co-authored-by: Hong Ge <[email protected]> Co-authored-by: Hong Ge <[email protected]> * Update script.jl * Update script.jl * Create DocsNav.yml * Add example script for PMMH (#37) * Add example script for PMMH * Add Literate.jl * Update script.jl * Update script.jl * Update DocsNav.yml * Update SSMProblems.jl interface (#38) * Add split dynamics/observation interface with "extra" variables * Add utilities for forward simulation and distribution definitions * Removed redundant particle container code * Update naming convention for initialisation log-density Co-authored-by: Hong Ge <[email protected]> * Update initialisation naming Co-authored-by: Hong Ge <[email protected]> * Update naming convention Co-authored-by: Hong Ge <[email protected]> * Change sampler to AbstractMCMC Co-authored-by: Hong Ge <[email protected]> * Add section heading for SSM Co-authored-by: Hong Ge <[email protected]> * Remove redundant method check * Correct dependencies * Revert to positional arguments * Correct forward simulation element type * Update Kalman filter example to new interface * Fix formatting issue * Add missing import Co-authored-by: Charles Knipp <[email protected]> * Add default rngs to samplers through macro * Remove unnecessary section heading * Add missing dependency * Tidied Kalman filter example * Update documentation main page * Fully document Kalman filter example * Remove outdated examples * Add documentation for extra argument * Apply suggestions from code review * Update script.jl * Update examples/kalman-filter/script.jl Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update script.jl * Remove PMMH (until new API) * Remove ref to 'Utils' * Fix broken link * Remove default RNG macro * Simplify interface methods * Correct old function names * Add parametric type to Kalman filter * Update main doc page * Update README * Make parameter order consistent --------- Co-authored-by: Hong Ge <[email protected]> Co-authored-by: Charles Knipp <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: FredericWantiez <[email protected]> * Minor tweaks and typo fixes (#41) * Update index.md * Update index.md * Update Project.toml * Suppress output in example script --------- Co-authored-by: THargreaves <[email protected]> * Split method definitions to avoid docstring overwriting (#42) * Function docstring formatting (#45) * added TagBot & CompatHelper workflows (#47) * added TagBot & CompatHelper workflows * using existing Documenter Key for CompatHelper * CompatHelper: add new compat entry for Distributions at version 0.25, (keep existing compat) (#48) Co-authored-by: CompatHelper Julia <[email protected]> * Update DocsPreviewCleanup.yml * TagBot Permission Issue fixed (#50) * Add DOCUMENTER_KEY to Docs workflow (#52) * Update DocsPreviewCleanup.yml * Correct type signature for forward simulation method * Alignment of obs/dyn time steps and refactored forward simulation (#55) * Update interface documentation to align dyn/obs time steps * Refactor forward simulation, add type parameters, add unit test * Add type parameters to docstrings * Update kalman example * Add extra for initialisation and simplify obs type parameter * Bump minor version * Update unit tests to match aligned interface * Fix code comment rendering * Interface Changes for Use in Filtering (#56) * added basic particle methods and filters * added qualifiers * added parameter priors * added adaptive resampling to bootstrap filter (WIP) * Julia fomatter changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * changed eltype for <: StateSpaceModel * updated naming conventions * formatter * fixed adaptive resampling * added particle ancestry * formatter issues * fixed metropolis and added rejection resampler * Keep track of free indices using stack * updated particle types and organized directory * weakened SSM type parameter assertions * improved particle state containment and resampling * added hacky sparse ancestry to example * fixed RNG in rejection resampling * improved callbacks and resamplers * formatting * added conditional SMC * improved linear model type structure * formatter * replaced extra with kwargs * formatter * migrated filtering code * Add unittests for new interface * Update documentation to match kwargs * Rename extras/kwargs docs file * remove redundant forward simulations --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Tim Hargreaves <[email protected]> * Bump 0.4.0 (#58) * Update docs to match kwargs interface * Add method definitions for batch simulation/log-densities * Bump minor version * Correct docstring overwriting for batch methods * Update type parameters to contain both arithmetic and element type * Correct docstring indentations * Correct RBPF forward simulation * Documentation and Turing Navigation CI improvement (#61) * Update Docs.yml * Update DocsNav.yml * No need of deploydocs() after using new Docs & DocsNav workflows * Remove research files from repository * removed SSMProblems README, LICENSE, GHA workflow, JULIAFORMATTER in favor of merger * SSMProblems: added missing docstring to avoid documentation failure * added pkg_path in Docs workflow to fix package development --------- Co-authored-by: Hong Ge <[email protected]> Co-authored-by: FredericWantiez <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Tor Erlend Fjelde <[email protected]> Co-authored-by: Tim Hargreaves <[email protected]> Co-authored-by: David Widmann <[email protected]> Co-authored-by: Hong Ge <[email protected]> Co-authored-by: Charles Knipp <[email protected]> Co-authored-by: THargreaves <[email protected]> Co-authored-by: Will Tebbutt <[email protected]> Co-authored-by: CompatHelper Julia <[email protected]> Co-authored-by: Penelope Yong <[email protected]>

shravanngoswamii · 2025-03-08T11:01:57Z

I don't have the environment properly set up since GeneralisedFilters is not on the official registry, and I can't figure out how to add it to Project.toml

I guess, this may help you:
https://pkgdocs.julialang.org/dev/toml-files/#The-[sources]-section

Monorepo example is what we need:

[sources]
Example = {url = "https://github.com/JuliaLang/Example.jl", rev = "custom_branch"}
WithinMonorepo = {url = "https://github.org/author/BigProject", subdir = "SubPackage"}
SomeDependency = {path = "deps/SomeDependency.jl"}

gdalle · 2025-03-09T09:28:18Z

Hi there! I'd be happy to help you figure out AD-related issues.

Mooncake is not expected to handle second-order derivatives at the moment, we're working on it with @willtebbutt but it's a daunting task.
Zygote's bug is rather disturbing, I can't really explain it.
Enzyme even errors for the gradient when I try it on the latest version of everything. This makes it difficult for me to debug the Hessian. Since January, I have changed the DI.gradient bindings to forward as much as possible to Enzyme without meddling, so that might explain the difference wrt to @charlesknipp's first post. Could you maybe try it again to confirm that we're seeing the same things?

In general, when you encounter something you think is an issue with DI, I encourage you to open an issue on the DI repo with an MWE. It would help tremendously :)

charlesknipp · 2025-03-10T16:32:10Z

Enzyme even errors for the gradient when I try it on the latest version of everything. This makes it difficult for me to debug the Hessian. Since January, I have changed the DI.gradient bindings to forward as much as possible to Enzyme without meddling, so that might explain the difference wrt to @charlesknipp's first post. Could you maybe try it again to confirm that we're seeing the same things?

@gdalle ~~I get an error for illegal type analysis with strict aliasing turned on. The computation succeeds with Enzyme.API.strictAliasing!(false), which is good but suboptimal.~~ it works with the latest commit.

Just to clarify, I'm working with DI version 0.6.43 and Enzyme version 0.13.30 on Julia 1.10.4 (my work computer)

EDIT: I'm not sure why, but forcing symmetry on the innovation covariance caused Enzyme to report a type instability. I converted it to a symmetric type instead, and somehow that fixed the inconsistency.

gdalle · 2025-03-10T18:55:49Z

I'm afraid we'll have to concoct a pure-Enzyme MWE for the Hessian error, otherwise the Enzyme devs won't want to debug it

yebai · 2025-03-10T21:38:00Z

Second Order Differentiation

It is better to implement the basic ones first, i.e. first-order optimisation.

charlesknipp · 2025-03-11T14:47:33Z

@yebai good call. I'll focus on the non-linear methods from now on. The resampling step is likely to cause some trouble. In some variational algorithms here I was able to use Zygote.@ignore to stop the gradient calculation consistent with the literature.

@gdalle I added a minimum working example, isolated from the SSMProblems/GeneralisedFilters interface. The gradient works fine, but the Hessian needs some work. I created some sugar akin to their demo from here, which clearly doesn't work. If you have some ideas let me know. I'm not sure how DI interfaces with Enzyme for the Hessian.

gdalle · 2025-03-11T16:37:20Z

Your MWE looks simple enough to open an issue on Enzyme, so I'd say that's the right move

The batch Kalman filter was using two prototype custom CUDA kernels for performing fast Cholesky and regular-transposed matrix multiplication. The CUBLAS version of the former (technically LU) has been sped up now by better pointer creation in CUDA.jl. The second is still very slow and is the bottleneck to the computation. The prototype replacement is not fully tested though and even though it passed this packages unit tests, I found a case where it was giving unexplained incorrect values. For safety, it will be removed for now until the custom batch kernels are fully tested.

log_weights gives a more reliable length than particles, which may be a strange type that we haven't defined an interface for yet.

…/AnalyticFilters.jl into ck/maximum-likelihood

charlesknipp added 3 commits January 8, 2025 13:04

fixed type stability of linear filter

04ecb4e

added MLE demonstration

152917b

flipped sign of objective function

efe1573

charlesknipp added 2 commits March 7, 2025 14:08

Merge remote-tracking branch 'origin/main' into ck/maximum-likelihood

e99be16

reorganized and added Mooncake MWE

acbc4e1

charlesknipp force-pushed the ck/maximum-likelihood branch from 6b9b3bb to acbc4e1 Compare March 7, 2025 21:02

fixed KF type stability in Enzyme

25eb4d6

add MWE for Kalman filtering

6d061b0

replaced second order optimizer and added backend testing

f8f0e59

THargreaves and others added 10 commits March 11, 2025 17:28

Make HierarchicalSSM compatible with the BF (#70)

bd96222

Remove redundant computation from batch KF

c2b12db

Factor out intermediate storage (#72)

ba214d6

Make RB particle creation more robust

d47b1c7

log_weights gives a more reliable length than particles, which may be a strange type that we haven't defined an interface for yet.

Export callback methods and types

e429e6b

fixed type stability of linear filter

da6a54b

added MLE demonstration

39d9f32

flipped sign of objective function

30a4be3

reorganized and added Mooncake MWE

935df10

charlesknipp added 5 commits March 13, 2025 10:05

fixed KF type stability in Enzyme

2cc7d63

add MWE for Kalman filtering

5c9149e

replaced second order optimizer and added backend testing

4936c52

Merge branch 'ck/maximum-likelihood' of https://github.com/TuringLang…

e8e6c02

…/AnalyticFilters.jl into ck/maximum-likelihood

fixed Mooncake errors for Bootstrap filter

56c6dc0

charlesknipp mentioned this pull request Mar 13, 2025

Thinking about the guided filter #73

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Maximum Likelihood Demonstration #26

Maximum Likelihood Demonstration #26

charlesknipp commented Jan 8, 2025 •

edited

Loading

charlesknipp commented Jan 8, 2025

THargreaves commented Jan 8, 2025

THargreaves commented Jan 8, 2025

shravanngoswamii commented Mar 8, 2025

gdalle commented Mar 9, 2025

charlesknipp commented Mar 10, 2025 •

edited

Loading

gdalle commented Mar 10, 2025

yebai commented Mar 10, 2025

charlesknipp commented Mar 11, 2025 •

edited

Loading

gdalle commented Mar 11, 2025

Maximum Likelihood Demonstration #26

Are you sure you want to change the base?

Maximum Likelihood Demonstration #26

Conversation

charlesknipp commented Jan 8, 2025 • edited Loading

Overview

A Note on Automatic Differentiation

Second Order Differentiation

Requests

charlesknipp commented Jan 8, 2025

Takeaways

THargreaves commented Jan 8, 2025

THargreaves commented Jan 8, 2025

shravanngoswamii commented Mar 8, 2025

gdalle commented Mar 9, 2025

charlesknipp commented Mar 10, 2025 • edited Loading

gdalle commented Mar 10, 2025

yebai commented Mar 10, 2025

charlesknipp commented Mar 11, 2025 • edited Loading

gdalle commented Mar 11, 2025

charlesknipp commented Jan 8, 2025 •

edited

Loading

charlesknipp commented Mar 10, 2025 •

edited

Loading

charlesknipp commented Mar 11, 2025 •

edited

Loading