Markov Switching State-Space Models for Uncovering Musical Interpretation

<div class="my-footer"><span><a href="https://dajmcdon.github.io/chopin-2022-slides/" style="color:white">dajmcdon.github.io/chopin-2022-slides</a></span></div>

---
background-image: url("gfx/cover-1.svg")
background-size: contain
background-position: top

---
background-image: url("gfx/State-cello-0299.jpg")
background-size: contain

---
background-image: url("gfx/jacobs.jpg")
background-size: contain

???

* Dabbled in Economics
* PhD is Statistics
* IU happened to have an opening
* Met Prof. Chris Raphael at my interview
* Accepted the Job

---
background-image: url("https://img.freepik.com/free-vector/musical-pentagram-sound-waves-notes-background_1017-33911.jpg?w=1800&t=st=1666393969~exp=1666394569~hmac=2bd5438306b82462af3c31fb52c733010cd2f8ff069fcba5324d16a25638af2f")
background-size: contain

???

* Occasionally dabbled in Music on the Side
* First invited talk at Laval on Music
* Finally published the paper in 2021
* First in-person talk post-pandemic
* Seemed more fun than talking about all the COVID modelling or riveting stat theory

---

## Musical taste

.pull-left[
* Easy to describe music you like:
  - "Jazzy sound"
  
  - "Strong beat"
  
  - "good lyrics"
  
  - "anything by Taylor Swift"
]

<div class="tenor-gif-embed" data-postid="8009181" data-share-method="host" data-aspect-ratio="1.30612" data-width="80%"><a href="https://tenor.com/view/head-phones-music-recording-studio-listening-to-music-bart-gif-8009181">Listening To Music GIF</a>from <a href="https://tenor.com/search/head+phones-gifs">Head Phones GIFs</a></div> <script type="text/javascript" async src="https://tenor.com/embed.js"></script>

]
  
--
 
 .pull-left[
  
* Harder to describe a .tertiary[performance]

* Classical music is mainly about performances of the .tertiary[same] music

* How do we think about which ones we like?

]

---

## Primer on "classical" music

* Written between 6th century and today

* Includes music written during the Classical period (1750&ndash;1820)

<blockquote cite="Leonard Bernstein">The real difference is that when a composer writes a piece of what’s usually called classical music, he puts down the exact notes that he wants, the exact instruments or voices that he wants to play or sing those notes—even the exact number of instruments or voices; and he also writes down as many directions as he can think of. </blockquote>

* Generally much more "musically" complicated

<br>

???

* Musically complicated = wider range of chords, keys, instrumentation, contrasts

* Hey Jude: 3 chords (2 others briefly) in 7 minutes. Same key the whole time.

* For today, Chopin is running example

* Chopin: 6 unique chords in first 10 seconds. Two key areas in 1.5 minutes of music.

---

<br>

<br>

???

Which one do you like better?

---

## What's different?

2. Extraneous noise

3. Recording quality

4. Articulation / Legato / Bowing / Breathing

5. Dynamics

6. Tempo / Rubato
]

![:scale 90%](https://media.wbur.org/wp/2019/04/gustavo-dudamel.jpg)

???

The first three are uninteresting.

The others are about .red[.bold[interpretation]]

We like performances with "better" interpretations

---
class: inverse, middle, center
background-image: url("https://www.worldpianonews.com/wp-content/uploads/2020/05/Bosendorfer-280VC-custom4_Post-scaled.jpg")
background-size: cover

???

Piano music

* Simplifies the problem
  - No bowing, fingering, breathing, glissando
* Focus on __tempo__ 
* professional pianists would cringe as I say this, but contrast with strings / winds / singers

---

## Musical tempo

* Notes change "speed"

* Sometimes purposeful

* Speed is important for .tertiary[.bold[interpretation]]

???

* Orange is Rubinstein, as recorded
* Dashed is "mine", mechanically forced Rubinstein to be constant speed
* Mention axes
* Note the "slow down" at the end (phrase boundary)

---
class: inverse, center, middle

# What is this "music"?

---

## Important musical terms

### Beat

### Measure

### Time signature

### Tempo

### Dynamics
]

Strongly felt impetus

Collections of notes delimited by vertical "barlines"

Number of beats / measure; type of note that gets the beat

The prevailing speed, measured in bpm

Loudness of the note
]

???

* notes indicate pitch and relative duration
* (relative to...) beat is the baseline
* accents / pedal markings

---

## Data

* CHARM Mazurka Project

* Focus on timing only (dynamics also available)

* 50 recordings: Chopin Mazurka Op. 68 No. 3

* Recorded between 1931 and 2006

* 45 different performers

---
class: middle

???

* Note the shaded region

---

## Chopin & Mazurkas

* Born in Poland

* Moved to Paris at 21

* Virtuoso pianist

* Wrote mainly piano music
]

* A Polish dance

* Chopin composed at least 58 for Piano

* Repetition is very important

* Certain rhythmic figures

]

???

Everything he wrote includes piano

---

background-image: url("gfx/entire-mazurka.jpg")
background-position: center
background-size: contain

???

* Recording by Christoph Zbinden, available from the IMSLP under CC By 4.0.
* Tempo markings, importantly, only 2 + rit and fermata
* Dotted eighth sixteenth
* ABA structure
* Minor phrases
* Repetition
* Chord progression

---

## Previous analysis

Nicholas Cook, Craig Sapp, and Andrew Earis at CHARM

Among other things, examines correlations between tempo curves

.secondary[Richter 1976]
.center[
![:scale 66%](http://www.mazurka.org.uk/ana/pcor/mazurka68-3-s50/Richter1976-tn.png)
]]

.pull-right[
.secondary[Cortot 1951]
.center[
![:scale 66%](http://www.mazurka.org.uk/ana/pcor/mazurka68-3-s50/Cortot1951-tn.png)
]]

---

## Tempos and smoothing

* Most statistical methods for estimating functions assume "smoothness"

* Splines, wavelets, trend filtering

---
class: center, middle

---
class: center, middle, inverse

# Switching Kalman Filter

---

## Thinking about tempo

<br/>

<br/>

### 2. Accelerando (speed up)

<br/>

### 3. Allargando (slow down)

<br/>

### 4. Tenuto (emphasis)
]

.pull-right[.center[
![](https://upload.wikimedia.org/wikipedia/commons/thumb/4/43/Metronome_Mälzel_1.jpg/291px-Metronome_Mälzel_1.jpg)
]]

---

## Transition diagram

.pull-left-wide[
<img src="gfx/markov-trans.svg" width="80%" style="display: block; margin: auto;" />
]

<h3 class=const>1. Constant tempo</h3>

<br/>

<h3 class=accel>2. Speeding up</h3>

<br/>

<h3 class=decel>3. Slowing down</h3>

<br/>

<h3 class=stress>4. Emphasis</h3>

]

---

## Intentions vs. observations

???

* Musicians aren't perfect.
* Imagine a speed that they'll maintain in CT state.
* Accelerate/Decel from there.
* Need to track tempo/accel in `$X_k$`.
* These depend on `$S_k$` and `$S_{k-1}$`
* Observe noisy realization

---

## Kalman filter

$$
`\begin{aligned}
X_{k+1} &= d_k + T_k X_k + \eta_{k+1} & \eta_{k+1} &\sim \textrm{N}(0, Q_{k+1})\\
Y_k &= c_k + Z_k X_k + \epsilon_{k}&\epsilon_k & \sim \textrm{N}(0, G_k)\\
\end{aligned}`
$$

* Just track mean and variance of `$X_k\ |\ \{Y_i\}_{i=1}^k$`

* Does this iteratively for each `$k$`

* Gives "filter" estimate of `$\{X_k\}_{k=1}^n$` and likelihood

* Matrices can depend on parameters `$\theta$`

* or on previous `$X$` / `$Y$`
]

---

## Inference

.emphasis[
 We want to estimate 
 
1. parameters `$\theta$` &mdash; transition probabilities, mean tempos, variances
2. hidden states `$S_k$` 
3. hidden states `$X_k$`
]

If you know `$\{S_k\}_{k=1}^n$` and `$\theta$`  <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#8BC1FA;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#6495ed;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> .tertiary[Kalman filter] gives `$\{\widehat{X}_k\}_{k=1}^n$` 
* this is `$O(n)$` for each likelihood evaluation

If you know `$\{X_k\}_{k=1}^n$` and `$\theta$` <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#8BC1FA;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#6495ed;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> .tertiary[Viterbi algorithm] gives `$\{\widehat{S}_k\}_{k=1}^n$` 
* this is `$O(n)$` for each likelihood evaluation

But how do we estimate all of it? &mdash; both unknown is `$O(b^n)$` for each likelihood evaluation.

]

???

* Evaluating the likelihood for one `$\theta$` is exponential in \# of notes.
* Non-convex to optimize over both.

---

## Switching Kalman filter (for our model)

.pull-left[
$$
`\begin{aligned}
X_{k} &= d(s_k,\ s_{k-1}) + T(s_k,\ s_{k-1}) X_{k-1} + \eta_{k}\\\\
Y_k &= c(s_k) + Z(s_k) X_k + \epsilon_{k}\\\\
\eta_{k} &\sim \textrm{N}(0, Q(s_k,\ s_{k-1}))\\\\
\epsilon_k & \sim \textrm{N}(0, G(s_k))
\end{aligned}`
$$
]

---
## Examples

`\begin{align}
  1\rightarrow 1 && 1\rightarrow 2\\
  x_{2} &= 
  \begin{pmatrix}1&0\\0&0\end{pmatrix} x_{1} 
        &   x_{3}
                    &= \begin{pmatrix} l_i\mu_{\textrm{acc}}\\ \mu_{\textrm{acc}}\end{pmatrix} +
  \begin{pmatrix}1&0\\0&0\end{pmatrix} x_{1} +
                         \mbox{N}\left(0,\ \sigma_{\textrm{acc}}^2\begin{pmatrix} l_i^2 & l_i\\ l_i & 1 \end{pmatrix}\right)\\\\
  y_2 &= (1\quad  0)  x_2 + \mbox{N}(0,\
                                 \sigma_\epsilon^2) &
y_3 &= (1\quad  0) x_3 + \mbox{N}(0,\
                                 \sigma_\epsilon^2).
\end{align}`

`\begin{align}
  1\rightarrow 4 && 4\rightarrow 1\\
  x_{2} &= \begin{pmatrix}0 \\ \mu_{\textrm{stress}} \end{pmatrix} +
  \begin{pmatrix}1&0\\0&0\end{pmatrix} x_{1} + 
  \textrm{N}\left(0,\ \begin{pmatrix}0&0\\0&\sigma^2_{\textrm{stress}}\end{pmatrix}\right)
        &  x_{3} &= 
  \begin{pmatrix}1&0\\0&0\end{pmatrix} x_{2} \\\\
  y_2 &= (1\quad  1)  x_2 + \mbox{N}(0,\
                                 \sigma_\epsilon^2) &
y_3 &= (1\quad  0) x_3 + \mbox{N}(0,\
                                 \sigma_\epsilon^2).
\end{align}`

???

x is dim-2 (speed, acceleration)

What is `$li$`?
---

## We don't know the discrete states

<div id="htmlwidget-d72f528ec5d616703041" style="width:100%;height:576px;" class="collapsibleTree html-widget"></div>
<script type="application/json" data-for="htmlwidget-d72f528ec5d616703041">{"x":{"data":{"name":"Constant tempo","fill":"#5E3C99","children":[{"name":"Speed up","fill":"#B2ABD2","children":[{"name":"Speed up","fill":"#B2ABD2","children":[{"name":"Speed up","fill":"#B2ABD2"},{"name":"Slow down","fill":"#FDB863"},{"name":"Constant tempo","fill":"#5E3C99"}]}]},{"name":"Slow down","fill":"#FDB863","children":[{"name":"Slow down","fill":"#FDB863","children":[{"name":"Slow down","fill":"#FDB863"},{"name":"Speed up","fill":"#B2ABD2"},{"name":"Constant tempo","fill":"#5E3C99"}]}]},{"name":"Constant tempo","fill":"#5E3C99","children":[{"name":"Constant tempo","fill":"#5E3C99","children":[{"name":"Stress","fill":"#E66101"},{"name":"Constant tempo","fill":"#5E3C99"},{"name":"Slow down","fill":"#FDB863"},{"name":"Speed up","fill":"#B2ABD2"}]},{"name":"Slow down","fill":"#FDB863","children":[{"name":"Slow down","fill":"#FDB863"}]},{"name":"Speed up","fill":"#B2ABD2","children":[{"name":"Speed up","fill":"#B2ABD2"}]},{"name":"Stress","fill":"#E66101","children":[{"name":"Constant tempo","fill":"#5E3C99"}]}]},{"name":"Stress","fill":"#E66101","children":[{"name":"Constant tempo","fill":"#5E3C99","children":[{"name":"Stress","fill":"#E66101"},{"name":"Constant tempo","fill":"#5E3C99"},{"name":"Speed up","fill":"#B2ABD2"},{"name":"Slow down","fill":"#FDB863"}]}]}]},"options":{"hierarchy":[1,2,3,4],"input":null,"attribute":"leafCount","linkLength":null,"fontSize":16,"tooltip":false,"collapsed":true,"zoomable":true,"margin":{"top":20,"bottom":20,"left":137,"right":137}}},"evals":[],"jsHooks":[]}</script>

???

I have 4 states

2nd order Markov

Leads to 11 states in 1-Markov

Piece has 231 notes

---

## Discrete particle filter &mdash; .secondary[dpf()]

1. Track at most `$J$` paths through the `$M^n$` tree

2. At time `$k$`, given `$J$` paths, propogate each one forward

3. Sample the `$JM$` possibilities to get only `$J$`

4. iterate forward through time until done

<div id="htmlwidget-af67efb36b54be3208ac" style="width:100%;height:288px;" class="collapsibleTree html-widget"></div>
<script type="application/json" data-for="htmlwidget-af67efb36b54be3208ac">{"x":{"data":{"name":"Constant tempo","fill":"#5E3C99","children":[{"name":"Speed up","fill":"#B2ABD2","children":[{"name":"Speed up","fill":"#B2ABD2","children":[{"name":"Speed up","fill":"#B2ABD2"},{"name":"Slow down","fill":"#FDB863"},{"name":"Constant tempo","fill":"#5E3C99"}]}]},{"name":"Slow down","fill":"#FDB863","children":[{"name":"Slow down","fill":"#FDB863","children":[{"name":"Slow down","fill":"#FDB863"},{"name":"Speed up","fill":"#B2ABD2"},{"name":"Constant tempo","fill":"#5E3C99"}]}]},{"name":"Constant tempo","fill":"#5E3C99","children":[{"name":"Constant tempo","fill":"#5E3C99","children":[{"name":"Stress","fill":"#E66101"},{"name":"Constant tempo","fill":"#5E3C99"},{"name":"Slow down","fill":"#FDB863"},{"name":"Speed up","fill":"#B2ABD2"}]},{"name":"Slow down","fill":"#FDB863","children":[{"name":"Slow down","fill":"#FDB863"}]},{"name":"Speed up","fill":"#B2ABD2","children":[{"name":"Speed up","fill":"#B2ABD2"}]},{"name":"Stress","fill":"#E66101","children":[{"name":"Constant tempo","fill":"#5E3C99"}]}]},{"name":"Stress","fill":"#E66101","children":[{"name":"Constant tempo","fill":"#5E3C99","children":[{"name":"Stress","fill":"#E66101"},{"name":"Constant tempo","fill":"#5E3C99"},{"name":"Speed up","fill":"#B2ABD2"},{"name":"Slow down","fill":"#FDB863"}]}]}]},"options":{"hierarchy":[1,2,3,4],"input":null,"attribute":"leafCount","linkLength":null,"fontSize":16,"tooltip":false,"collapsed":true,"zoomable":true,"margin":{"top":20,"bottom":20,"left":137,"right":137}}},"evals":[],"jsHooks":[]}</script>

???

* This is a greedy approximation
* AKA "Beam Search"
* The sampling step is important
* Probability of sampling is proportional to current weight times likelihood times trans prob
* Example supposing J = 5

---

## The complete procedure

For each performance:

* Iterate 1&mdash;3 to maximize for `$\theta \in \Theta$`, produce `$\widehat\theta$`

1. Guess a parameter vector `$\theta$` (in some sensible way)

2. .secondary[dpf()] gives greedy state sequence `$\{\widehat{S}_k\}_{k=1}^n$`

3. It gives the likelihood as a side effect via .secondary[kfilter()]

* Rerun .secondary[dpf()] and .secondary[ksmoother()] at `$\widehat{\theta}$` to get `$\{\widehat{S}_k\}_{k=1}^n$` and `$\{\widehat{X}_k\}_{k=1}^n$`

???

* 1-step kfilter() step appears in dpf()
* ksmoother() is conditional on all the data

---

<br/> 
 
![:scale 50%](https://alliancecan.ca/themes/custom/site_theme/logo.svg)

???

* Takes a few minutes per performance when 4 is done intelligently
* I used 6hr walltime on Cedar with 1 perf / core and 10 restarts

---

---

# Similar performances

---

## The estimated parameters

For each performance, we estimate `$\theta$` by penalized maximum likelihood.

The parameters are things like:

- average speed in different states
  
  - some variance parameters
  
  - transition probabilities
  
We have strong prior information.

???

Examples of strong priors

---

## Distance matrix on parameters

.pull-left[
* Use Mahalanobis
`$$d(\theta,\theta') = \sqrt{(\theta-\theta')^\top V^{-1}(\theta-\theta')}$$`

<br/>

* `$V$` is prior covariance matrix

<br/>

* Incorporates correlations correctly on probability vectors

<br/>

* Some performances have no "close" neighbors
]

]

---

---

---

## Probability of "stress"

---

---
<img src="gfx/clust-2-1.svg" width="100%" style="display: block; margin: auto;" />

---
<img src="gfx/similar-perfs-1.svg" width="100%" style="display: block; margin: auto;" />

---
<img src="gfx/rubinstein-perfs-1.svg" width="100%" style="display: block; margin: auto;" />

---
class: middle

???

* Not actually Cortot
* Likely Hatto or perhaps even an engineered recording
* This was a major scandal for the British Concert Artist record label in about 2006/7
* ~100 other fake recordings 
* Discovered by uploading to Gracenote database

---

## In summary

* We develop a switching model for tempo decisions

* We give an algorithm for performing likelihood inference

* We estimate our model using a large collection of recordings of the same composition

* We demonstrate how the model is able to recover performer intentions

* We use the learned representations to compare and contrast recordings

### Future work

* Similar model for dynamics

* Can we do this fast and "online"?

* Use it for teaching?

---

## Collaborators, funding, etc.

.pull-left[
<p style="text-align:center;">
<img src="gfx/craphael.jpg" height="200px">
<img src="gfx/mmcbride.jpg" height="200px">
</p>
<p style="text-align:center;">
<img src="gfx/rob_granger.jpg" height="200px"></p>
]

.pull-right[
<iframe width="460" height="250" src="https://www.youtube.com/embed/W8RTpOe-AqA?start=68" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen align="center"></iframe>

<img src="gfx/nsf-logo.png" height="100px" align="left" hspace=5%>
<img src="https://www.nserc-crsng.gc.ca/img/logos/img-logo2-en.png" height="100px" align="right" hspace=5%>
]

---
class: inverse, middle, center

# Appendix

---

## 1-step Kalman filter &mdash; .secondary[kalman()]

Get estimates of `$X_{k}$` given a new observation `$y_k$`

Input: 
  * New data &mdash; `$y_k$`, 
  * Parameter matrices &mdash; `$d_k$`, `$c_k$`, `$T_k$`, `$Z_k$`, `$Q_k$`, `$G_k$`, 
  * Previous state mean and variance &mdash; `$x_{k-1}$`, `$P_{k-1}$`

Predict new state <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#8BC1FA;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#6495ed;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> `$\hat{x}_k = d + Tx_{k-1}$` &emsp; `$\hat{P}_k = Q+TP_{k-1}T^\top$`

Predict current data <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#8BC1FA;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#6495ed;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> `$\hat{y}_k = c + Z\hat{x}_k$` &emsp; `$F=G + Z\hat{P}_kZ^\top$`

Calculate error <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#8BC1FA;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#6495ed;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> `$v = y_k - \hat{y}_k$` &emsp; `$K = \hat{P}_kZ^\top F^{-1}$`

Update <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#8BC1FA;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#6495ed;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> `$x_k = \hat{x}_k + Kv$` &emsp; `$P_k = \hat{P}_k(I - Z^\top K)$`

Log Likelihood <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#8BC1FA;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#6495ed;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M504.3 273.6l-112.1 104c-6.992 6.484-17.18 8.218-25.94 4.406c-8.758-3.812-14.42-12.45-14.42-21.1L351.9 288H32C14.33 288 .0002 273.7 .0002 255.1S14.33 224 32 224h319.9l0-72c0-9.547 5.66-18.19 14.42-22c8.754-3.809 18.95-2.075 25.94 4.41l112.1 104C514.6 247.9 514.6 264.1 504.3 273.6z"/></svg> `$\ell_k(\theta) \propto \ell_{k-1}(\theta) - v^\top F^{-1} v - \log(|F|)$`

???

If know S, then that pins down all the parameter matrices

Loop this over 1 ... n

Maximize over theta