The Kalman Gain

The final equation is the Kalman Gain Equation.

The Kalman Gain in matrix notation is given by:

\[ \boldsymbol{K}_{n} = \boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}\left(\boldsymbol{HP}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n} \right)^{-1} \]
Where:
\( \boldsymbol{K}_{n} \) is the Kalman Gain
\( \boldsymbol{P}_{n,n-1} \) is the prior estimate covariance matrix of the current state (predicted at the previous step)
\( \boldsymbol{H} \) is the observation matrix
\( \boldsymbol{R}_{n} \) is the measurement noise covariance matrix
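
As a minimal numeric sketch (not the book's code; the system and all matrix values are illustrative assumptions), the Kalman Gain can be computed with NumPy for a hypothetical two-state system, position and velocity, observed by a position-only sensor:

```python
import numpy as np

# Illustrative (assumed) values: state = [position, velocity],
# measurement = position only.
P = np.array([[4.0, 1.0],    # prior estimate covariance P(n,n-1)
              [1.0, 2.0]])
H = np.array([[1.0, 0.0]])   # observation matrix (extracts position)
R = np.array([[0.25]])       # measurement noise covariance R(n)

# K(n) = P H^T (H P H^T + R)^(-1)
S = H @ P @ H.T + R          # innovation covariance
K = P @ H.T @ np.linalg.inv(S)
print(K)                     # 2x1 gain: correction weights for position and velocity
```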

Intuitive Explanation of the Kalman Gain

Let's start by examining the 1D state update equation:

\[ \hat{x}_{n,n} = \hat{x}_{n,n-1} + {k}_{n}({z}_{n} - \hat{x}_{n,n-1}) \]

This equation shows how the updated state estimate \( \hat{x}_{n,n} \) is obtained by adjusting the prior estimate \( \hat{x}_{n,n-1} \) using the innovation (the difference between the measurement \( {z}_{n} \) and the prior estimate). The Kalman Gain \( {k}_{n} \) determines how much we trust the measurement versus the prior estimate.
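
For instance, with an assumed prior estimate \( \hat{x}_{n,n-1} = 10 \), a measurement \( {z}_{n} = 12 \), and a gain \( {k}_{n} = 0.8 \) (illustrative values):

\[ \hat{x}_{n,n} = 10 + 0.8(12 - 10) = 11.6 \]

A high gain pulls the updated estimate toward the measurement, while \( {k}_{n} = 0 \) would leave the prior estimate unchanged.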

Extending to the Multivariate Case

In the multivariate case, things are a bit different. The measurement \( \boldsymbol{{z}}_{n} \) and the system state \( \boldsymbol{\hat{x}}_{n,n-1} \) often exist in different domains. Thus, we cannot directly subtract the two to compute the innovation. Instead, we must transform the system state into the measurement domain using the observation matrix \( \boldsymbol{H} \):

\[ \text{innovation} = (\boldsymbol{{z}}_{n} - \boldsymbol{H}\boldsymbol{\hat{x}}_{n,n-1}) \]
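
For example, in a hypothetical system whose state vector contains position and velocity but whose sensor reports only position, the observation matrix \( \boldsymbol{H} = \begin{bmatrix} 1 & 0 \end{bmatrix} \) extracts the position component from the predicted state, so the innovation compares like with like:

\[ \boldsymbol{H}\boldsymbol{\hat{x}}_{n,n-1} = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} \hat{p}_{n,n-1} \\ \hat{v}_{n,n-1} \end{bmatrix} = \hat{p}_{n,n-1} \]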

Understanding the Kalman Gain

In the 1D case, the Kalman Gain is given by:

\[ k_{n} = \frac{p_{n,n-1}}{p_{n,n-1} + r_{n}} \]

which can also be rewritten as:

\[ k_{n} = p_{n,n-1}(p_{n,n-1} + r_{n})^{-1} \]
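
As an illustrative example (values assumed), if the prior estimate variance is \( p_{n,n-1} = 4 \) and the measurement variance is \( r_{n} = 1 \), then:

\[ k_{n} = \frac{4}{4 + 1} = 0.8 \]

The prior estimate is much less certain than the measurement, so the gain weights the innovation heavily, pulling the updated estimate toward the measurement.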

For the multivariate case, instead of a single variance term, we use the prior estimate covariance \( \boldsymbol{P}_{n,n-1} \). Since \( \boldsymbol{P}_{n,n-1} \) is a covariance matrix and variance is a squared quantity, the observation matrix must be applied on both sides: for a linear transformation \( \boldsymbol{y} = \boldsymbol{H}\boldsymbol{x} \), the covariance transforms as \( \boldsymbol{H}Cov(\boldsymbol{x})\boldsymbol{H}^{T} \). Thus, we project the prior estimate uncertainty into the measurement domain using the transformation:

\[ \boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} \]

(See the proof in Chapter 7.5 of the book.)
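
Continuing the hypothetical position-and-velocity example with \( \boldsymbol{H} = \begin{bmatrix} 1 & 0 \end{bmatrix} \), the projection extracts exactly the measurement-domain variance, here the position variance:

\[ \boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = p_{11} \]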

After projecting the estimate uncertainty into the measurement domain, the measurement-domain counterpart of the 1D Kalman Gain takes the form:

\[ \color{green}{\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}(\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n})^{-1}} \]

Multiplying this term by the innovation gives the correction in the measurement domain:

\[ \color{green}{\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}(\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n})^{-1}}(\boldsymbol{{z}}_{n} - \boldsymbol{H}\boldsymbol{\hat{x}}_{n,n-1}) \]

The above expression is in the measurement domain, but the State Update Equation corrects the system state \( \boldsymbol{\hat{x}}_{n,n-1} \), so the correction term must be converted back to the state domain. Conceptually, this is done by multiplying the term by \( \boldsymbol{H}^{-1} \):

\[ \boldsymbol{\hat{x}}_{n,n} = \boldsymbol{\hat{x}}_{n,n-1} + \color{red}{\boldsymbol{H}^{-1}\boldsymbol{H}}\color{blue}{\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}(\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n})^{-1}}(\boldsymbol{{z}}_{n} - \boldsymbol{H}\boldsymbol{\hat{x}}_{n,n-1}) \]

Since \( \boldsymbol{H}^{-1}\boldsymbol{H} = \boldsymbol{I} \):

\[ \boldsymbol{\hat{x}}_{n,n} = \boldsymbol{\hat{x}}_{n,n-1} + \color{blue}{\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}(\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n})^{-1}}(\boldsymbol{{z}}_{n} - \boldsymbol{H}\boldsymbol{\hat{x}}_{n,n-1}) \]

The term \( \color{blue}{\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}(\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n})^{-1}} \) is the multivariate Kalman Gain:

\[ \boldsymbol{\hat{x}}_{n,n} = \boldsymbol{\hat{x}}_{n,n-1} + \color{blue}{\boldsymbol{K}_{n}}(\boldsymbol{{z}}_{n} - \boldsymbol{H}\boldsymbol{\hat{x}}_{n,n-1}) \]

Note that \( \boldsymbol{H} \) is not necessarily invertible (in general, it is not even square), so the explanation above is not a formal derivation of the Kalman Gain. Instead, it provides an intuitive understanding of how the Kalman Gain works: it projects uncertainties into the measurement domain and weights the correction term accordingly.
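
To make the update step concrete, here is a minimal NumPy sketch (a hypothetical helper, not the book's code; the numeric values are assumptions reused from the earlier example):

```python
import numpy as np

def update(x_prior, P_prior, z, H, R):
    """One Kalman update step: returns the posterior state and covariance."""
    S = H @ P_prior @ H.T + R                 # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)      # Kalman Gain K(n)
    x_post = x_prior + K @ (z - H @ x_prior)  # State Update Equation
    I = np.eye(P_prior.shape[0])
    # Covariance Update Equation in the form used in the derivation below
    P_post = (I - K @ H) @ P_prior @ (I - K @ H).T + K @ R @ K.T
    return x_post, P_post

# Assumed two-state example: position-velocity state, position-only sensor.
x = np.array([[10.0], [1.0]])
P = np.array([[4.0, 1.0], [1.0, 2.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
z = np.array([[12.0]])
x, P = update(x, P, z, H, R)
print(x)  # the estimate is pulled toward the measurement z
```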

Kalman Gain Equation Derivation

This chapter includes the derivation of the Kalman Gain Equation. Feel free to jump to the next topic if you are not interested in the derivation.

First, let's rearrange the Covariance Update Equation:

\[ \boldsymbol{P}_{n,n} = \left(\boldsymbol{I} - \boldsymbol{K}_{n}\boldsymbol{H} \right) \boldsymbol{P}_{n,n-1} \color{blue}{\left(\boldsymbol{I} - \boldsymbol{K}_{n}\boldsymbol{H} \right)^{T}} + \boldsymbol{K}_{n} \boldsymbol{R}_{n}\boldsymbol{K}_{n}^{T} \]

(the Covariance Update Equation)

\[ \boldsymbol{P}_{n,n} = \left(\boldsymbol{I} - \boldsymbol{K}_{n}\boldsymbol{H} \right) \boldsymbol{P}_{n,n-1} \color{blue}{\left(\boldsymbol{I} - \left(\boldsymbol{K}_{n}\boldsymbol{H}\right)^{T}\right)} + \boldsymbol{K}_{n} \boldsymbol{R}_{n} \boldsymbol{K}_{n}^{T} \]

(since \( \boldsymbol{I}^{T} = \boldsymbol{I} \))

\[ \boldsymbol{P}_{n,n} = \color{green}{\left(\boldsymbol{I} - \boldsymbol{K}_{n}\boldsymbol{H} \right) \boldsymbol{P}_{n,n-1}} \color{blue}{\left(\boldsymbol{I} - \boldsymbol{H}^{T}\boldsymbol{K}_{n}^{T}\right)} + \boldsymbol{K}_{n} \boldsymbol{R}_{n} \boldsymbol{K}_{n}^{T} \]

(apply the matrix transpose property \( (\boldsymbol{AB})^{T} = \boldsymbol{B}^{T}\boldsymbol{A}^{T} \))

\[ \boldsymbol{P}_{n,n} = \color{green}{\left(\boldsymbol{P}_{n,n-1} - \boldsymbol{K}_{n}\boldsymbol{H}\boldsymbol{P}_{n,n-1} \right)} \left(\boldsymbol{I} - \boldsymbol{H}^{T}\boldsymbol{K}_{n}^{T}\right) + \boldsymbol{K}_{n} \boldsymbol{R}_{n} \boldsymbol{K}_{n}^{T} \]

(distribute \( \boldsymbol{P}_{n,n-1} \))

\[ \boldsymbol{P}_{n,n} = \boldsymbol{P}_{n,n-1} - \boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}\boldsymbol{K}_{n}^{T} - \boldsymbol{K}_{n}\boldsymbol{H}\boldsymbol{P}_{n,n-1} + \color{#7030A0}{\boldsymbol{K}_{n}\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}\boldsymbol{K}_{n}^{T} + \boldsymbol{K}_{n} \boldsymbol{R}_{n} \boldsymbol{K}_{n}^{T}} \]

(expand)

\[ \boldsymbol{P}_{n,n} = \boldsymbol{P}_{n,n-1} - \boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}\boldsymbol{K}_{n}^{T} - \boldsymbol{K}_{n}\boldsymbol{H}\boldsymbol{P}_{n,n-1} + \color{#7030A0}{\boldsymbol{K}_{n} \left( \boldsymbol{H} \boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n} \right) \boldsymbol{K}_{n}^{T}} \]

(group the last two terms)
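
The rearrangement can be verified mechanically. The following SymPy sketch (dimensions and symbol names are assumptions for illustration) checks that the grouped form equals the original Covariance Update Equation:

```python
import sympy as sp

# Assumed sizes for the check: 2 states, 1 measurement.
P = sp.Matrix(2, 2, sp.symbols('p11 p12 p21 p22'))  # prior covariance
H = sp.Matrix(1, 2, sp.symbols('h1 h2'))            # observation matrix
K = sp.Matrix(2, 1, sp.symbols('k1 k2'))            # Kalman Gain
R = sp.Matrix([[sp.symbols('r')]])                  # measurement noise
I = sp.eye(2)

lhs = (I - K*H) * P * (I - K*H).T + K*R*K.T        # Covariance Update Equation
rhs = P - P*H.T*K.T - K*H*P + K*(H*P*H.T + R)*K.T  # grouped form above
print((lhs - rhs).expand())  # Matrix([[0, 0], [0, 0]])
```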

The Kalman Filter is an optimal filter. Thus, we seek a Kalman Gain that minimizes the estimate variance.

To minimize the estimate variance, we need to minimize the sum of the elements on the main diagonal (from the upper left to the lower right) of the covariance matrix \( \boldsymbol{P}_{n,n} \), since these diagonal elements are the variances of the individual state estimates.

The sum of the elements on the main diagonal of a square matrix is called the trace of the matrix. Thus, we need to minimize \( tr(\boldsymbol{P}_{n,n}) \). To find the conditions that produce a minimum, we differentiate the trace of \( \boldsymbol{P}_{n,n} \) with respect to \( \boldsymbol{K}_{n} \) and set the result to zero.
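
The differentiation is sketched below; it relies on the standard matrix-calculus identities \( \frac{d}{d\boldsymbol{A}} tr(\boldsymbol{AB}) = \boldsymbol{B}^{T} \) and \( \frac{d}{d\boldsymbol{A}} tr(\boldsymbol{ABA}^{T}) = 2\boldsymbol{AB} \) (for symmetric \( \boldsymbol{B} \)), together with the symmetry of \( \boldsymbol{P}_{n,n-1} \):

\[ \frac{d \left( tr\left(\boldsymbol{P}_{n,n}\right) \right)}{d\boldsymbol{K}_{n}} = -2\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + 2\boldsymbol{K}_{n}\left(\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n}\right) = 0 \]

Solving for \( \boldsymbol{K}_{n} \):

\[ \boldsymbol{K}_{n} = \boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T}\left(\boldsymbol{H}\boldsymbol{P}_{n,n-1}\boldsymbol{H}^{T} + \boldsymbol{R}_{n}\right)^{-1} \]

which recovers the Kalman Gain Equation stated at the beginning of this section.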
