Kalman Filter vs Particle Filter for Real-Time Bearing State Estimation: A Practical Comparison

Updated Feb 6, 2026

The Problem That Started This

A few months back, I was working on a bearing health monitoring system for a small manufacturing line — nothing fancy, just a set of accelerometers on spindle bearings feeding data into a state estimation pipeline. The goal was to track degradation in real time and flag when a bearing was drifting toward failure.

I started with a Kalman filter because, well, that’s what everyone starts with. It worked fine for a while. Then I ran into a case where a bearing developed an outer race defect that progressed in a distinctly nonlinear way — the vibration signature would stay flat, then spike, then plateau at a new level. The Kalman filter just couldn’t keep up. It kept smoothing over the transitions, and by the time the estimate caught up to reality, the bearing was already in rough shape.

That’s when I pulled in a particle filter for comparison. What followed was a few weeks of really educational frustration.

Quick Background (Without the Textbook Treatment)

Both filters solve the same fundamental problem: given noisy sensor measurements, estimate the true hidden state of a system over time. In our case, the “state” is something like a degradation index — a scalar or small vector that represents how far gone a bearing is.

The Kalman filter assumes everything is linear and Gaussian. The state transition model is linear, the measurement model is linear, and all the noise is nicely bell-curved. Under those assumptions, it gives you the mathematically optimal estimate. The classic predict-update cycle looks like:

$$\hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1} + B_k u_k$$

$$P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k$$

Then when a measurement $z_k$ comes in:

$$K_k = P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k)^{-1}$$

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k (z_k - H_k \hat{x}_{k|k-1})$$

Where $F_k$ is the state transition matrix, $H_k$ is the observation matrix, and $Q_k$ and $R_k$ are the process and measurement noise covariances. Clean, closed-form, fast.

The particle filter makes no such assumptions. It represents the probability distribution of the state using a cloud of weighted samples (particles). Each particle is a hypothesis about the current state. You propagate them forward through whatever nonlinear model you want, weight them by how well they match the incoming measurement, then resample. No linearity required, no Gaussian requirement. The cost? Computation. Lots of it.

Setting Up the Experiment

I used vibration data from a set of bearings on our test rig — sampling at 10 kHz, with RMS vibration features extracted every 5 seconds. The degradation state $x_k$ was modeled as a single scalar representing a normalized health index from 0 (healthy) to 1 (failure).
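The feature-extraction step itself isn't shown in the post; a minimal sketch of how raw 10 kHz samples could be collapsed into one RMS value per 5-second window (function name and trimming behavior are my own choices, not the actual pipeline) looks like:

```python
import numpy as np

def rms_features(signal, fs=10_000, window_s=5.0):
    """Collapse a raw vibration trace into one RMS value per window.

    Assumes a contiguous 1-D signal; any partial trailing window is dropped.
    """
    win = int(fs * window_s)
    n_windows = len(signal) // win
    trimmed = np.asarray(signal[:n_windows * win]).reshape(n_windows, win)
    # root-mean-square over each window
    return np.sqrt(np.mean(trimmed**2, axis=1))
```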

For the Kalman filter, I used a simple linear degradation model:

$$x_k = x_{k-1} + \Delta_k + w_k$$

Where $\Delta_k$ is a small constant drift rate and $w_k \sim \mathcal{N}(0, Q)$. The observation model maps state to expected RMS vibration linearly.

For the particle filter, I used a nonlinear state transition that allows for acceleration in degradation:

$$x_k = x_{k-1} + \Delta_k \cdot (1 + \alpha \cdot x_{k-1}^2) + w_k$$

That $\alpha \cdot x_{k-1}^2$ term is the key — it means degradation speeds up as the bearing gets worse, which is what actually happens in practice. A healthy bearing degrades slowly. A damaged bearing degrades fast.

Here’s my implementation. Starting with the Kalman filter:

import numpy as np

class BearingKalmanFilter:
    def __init__(self, dt=5.0, process_noise=1e-4, meas_noise=0.05):
        self.x = 0.0  # state estimate
        self.P = 1.0  # estimate covariance
        self.Q = process_noise
        self.R = meas_noise
        self.drift = 2e-4  # constant degradation rate per step
        self.H = 1.0  # direct observation of state

    def predict(self):
        self.x = self.x + self.drift
        self.P = self.P + self.Q

    def update(self, z):
        K = self.P * self.H / (self.H * self.P * self.H + self.R)
        self.x = self.x + K * (z - self.H * self.x)
        self.P = (1 - K * self.H) * self.P
        return self.x

    def step(self, measurement):
        self.predict()
        return self.update(measurement)

And the particle filter:

class BearingParticleFilter:
    def __init__(self, n_particles=500, process_noise=1e-3, meas_noise=0.05):
        self.n = n_particles
        self.particles = np.zeros(n_particles)
        self.weights = np.ones(n_particles) / n_particles
        self.Q = process_noise
        self.R = meas_noise
        self.drift = 2e-4
        self.alpha = 5.0  # nonlinear acceleration factor

    def predict(self):
        noise = np.random.normal(0, np.sqrt(self.Q), self.n)
        # nonlinear transition: degradation accelerates with state
        self.particles = (self.particles
                         + self.drift * (1 + self.alpha * self.particles**2)
                         + noise)
        self.particles = np.clip(self.particles, 0, 1)

    def update(self, z):
        # Gaussian likelihood of the measurement given each particle
        likelihoods = np.exp(-0.5 * ((z - self.particles)**2) / self.R)
        likelihoods += 1e-300  # guard against an all-zero weight sum
        # new weight is proportional to old weight times likelihood;
        # overwriting the weights instead would defeat the n_eff check in step()
        self.weights *= likelihoods
        self.weights /= self.weights.sum()

    def resample(self):
        # systematic resampling
        positions = (np.arange(self.n) + np.random.uniform()) / self.n
        cumsum = np.cumsum(self.weights)
        indices = np.searchsorted(cumsum, positions)
        self.particles = self.particles[indices]
        self.weights = np.ones(self.n) / self.n

    def estimate(self):
        return np.average(self.particles, weights=self.weights)

    def step(self, measurement):
        self.predict()
        self.update(measurement)
        state = self.estimate()
        # resample when effective sample size drops
        n_eff = 1.0 / np.sum(self.weights**2)
        if n_eff < self.n / 2:
            self.resample()
        return state

Notice the n_eff check before resampling — this was something I learned the hard way. If you resample every single step, you lose particle diversity fast and the filter degenerates. Only resample when the effective sample size drops below half the total. This one change made a massive difference in tracking quality.

Running Both on Real Degradation Data

I ran both filters on data from a bearing that eventually failed due to inner race spalling. The degradation curve had that classic hockey-stick shape — slow linear increase for most of the life, then a sharp exponential ramp in the last ~15% of useful life.

import matplotlib.pyplot as plt

# rms_features: normalized RMS vibration values over bearing lifetime
rms_features = load_bearing_run_to_failure()  # our internal dataset

kf = BearingKalmanFilter(process_noise=1e-4, meas_noise=0.01)
pf = BearingParticleFilter(n_particles=1000, process_noise=5e-4, meas_noise=0.01)

kf_estimates = []
pf_estimates = []

for z in rms_features:
    kf_estimates.append(kf.step(z))
    pf_estimates.append(pf.step(z))

timesteps = np.arange(len(rms_features))
plt.figure(figsize=(12, 5))
plt.plot(timesteps, rms_features, 'k.', alpha=0.3, markersize=2, label='Raw RMS')
plt.plot(timesteps, kf_estimates, 'b-', linewidth=1.5, label='Kalman Filter')
plt.plot(timesteps, pf_estimates, 'r-', linewidth=1.5, label='Particle Filter')
plt.xlabel('Time Step (5s intervals)')
plt.ylabel('Degradation Index')
plt.legend()
plt.title('Bearing State Estimation Comparison')
plt.tight_layout()
plt.savefig('filter_comparison.png', dpi=150)

The results were pretty telling. During the initial slow degradation phase (roughly the first 80% of the data), both filters tracked nearly identically. The Kalman filter was actually slightly smoother, which you’d expect — it’s optimal for the linear regime.

But once the bearing hit the knee of the degradation curve, the difference was stark. The Kalman filter lagged behind by 30-50 time steps during the rapid transition. It kept pulling the estimate back toward the linear trend. The particle filter, with its nonlinear transition model, caught the acceleration almost immediately — maybe 5-10 steps of lag.

In real terms, with 5-second updates, that's the difference between a detection delay of under a minute and one of 2.5 to 4 minutes. For a bearing that went from “concerning” to “replace now” in about 45 minutes, that gap matters.
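To put numbers on lag like this, I compare when each estimate first crosses an alarm threshold against when the hindsight-labeled true index crosses it. A rough sketch (the helper name, threshold, and synthetic example are my own choices):

```python
import numpy as np

def detection_lag(estimates, truth, threshold=0.5, dt_seconds=5.0):
    """Steps (and seconds) between the truth and the estimate first
    crossing a threshold. Assumes both actually cross it; np.argmax
    returns 0 if a series never does, so check that in real use."""
    est = np.asarray(estimates)
    tru = np.asarray(truth)
    est_idx = np.argmax(est >= threshold)  # index of first crossing
    tru_idx = np.argmax(tru >= threshold)
    lag_steps = est_idx - tru_idx
    return lag_steps, lag_steps * dt_seconds

# synthetic example: an estimate that trails the truth by 3 steps
truth = np.linspace(0, 1, 20)
estimate = np.concatenate([np.zeros(3), truth[:-3]])
lag_steps, lag_sec = detection_lag(estimate, truth)  # 3 steps, 15 s
```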

What About the Extended Kalman Filter?

Fair question. I did try an EKF with the same nonlinear transition model. You linearize the transition function via its Jacobian:

$$F_k = \frac{\partial f}{\partial x}\bigg|_{x=\hat{x}_{k-1}} = 1 + 2\alpha \cdot \Delta_k \cdot \hat{x}_{k-1}$$

The EKF did better than the standard Kalman filter — it picked up the transition maybe 15-20 steps faster. But it still wasn’t as responsive as the particle filter during the sharpest part of the degradation curve. The problem is that linearization around the current estimate is a poor approximation when the state is changing rapidly and the posterior distribution becomes skewed (which it does during fault progression).
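For reference, here is roughly what that EKF variant looks like; this is my reconstruction from the Jacobian above, reusing the parameter values from the earlier classes, not the exact code I ran:

```python
import numpy as np

class BearingEKF:
    """Sketch of an EKF for the quadratic-acceleration degradation model."""
    def __init__(self, process_noise=1e-4, meas_noise=0.05,
                 drift=2e-4, alpha=5.0):
        self.x, self.P = 0.0, 1.0
        self.Q, self.R = process_noise, meas_noise
        self.drift, self.alpha = drift, alpha

    def step(self, z):
        # Jacobian of f evaluated at the previous estimate
        F = 1 + 2 * self.alpha * self.drift * self.x
        # nonlinear predict through f(x) = x + drift * (1 + alpha * x^2)
        self.x = self.x + self.drift * (1 + self.alpha * self.x**2)
        self.P = F * self.P * F + self.Q
        # measurement model is direct observation (H = 1)
        K = self.P / (self.P + self.R)
        self.x = self.x + K * (z - self.x)
        self.P = (1 - K) * self.P
        self.x = min(max(self.x, 0.0), 1.0)  # keep the health index in [0, 1]
        return self.x
```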

The Unscented Kalman Filter (UKF) got closer to the particle filter’s tracking, honestly. It uses sigma points instead of linearization and handles moderate nonlinearity well. For most bearing monitoring applications, a UKF might be the sweet spot — better than KF/EKF, cheaper than a particle filter.
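For a scalar state, a UKF is small enough to sketch in full. The sigma-point weights below follow the standard Merwe formulation; the tuning constants (a, b, kappa) are illustrative guesses, not the values from my experiment:

```python
import numpy as np

class BearingUKF:
    """Scalar UKF sketch for the same quadratic degradation model."""
    def __init__(self, process_noise=1e-4, meas_noise=0.05,
                 drift=2e-4, alpha_model=5.0, a=0.1, b=2.0, kappa=0.0):
        self.x, self.P = 0.0, 1.0
        self.Q, self.R = process_noise, meas_noise
        self.drift, self.alpha_model = drift, alpha_model
        n = 1  # state dimension
        self.lam = a**2 * (n + kappa) - n
        # mean and covariance weights for the 2n+1 = 3 sigma points
        self.wm = np.full(3, 1.0 / (2 * (n + self.lam)))
        self.wc = self.wm.copy()
        self.wm[0] = self.lam / (n + self.lam)
        self.wc[0] = self.wm[0] + (1 - a**2 + b)

    def _f(self, x):
        return x + self.drift * (1 + self.alpha_model * x**2)

    def step(self, z):
        # sigma points around the current estimate
        s = np.sqrt((1 + self.lam) * self.P)
        sigmas = np.array([self.x, self.x + s, self.x - s])
        # propagate through the nonlinear model, no linearization needed
        prop = self._f(sigmas)
        x_pred = np.dot(self.wm, prop)
        P_pred = np.dot(self.wc, (prop - x_pred)**2) + self.Q
        # measurement model is identity, so the update reduces to linear form
        S = P_pred + self.R
        K = P_pred / S
        self.x = x_pred + K * (z - x_pred)
        self.P = P_pred - K * S * K
        return self.x
```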

But when the degradation dynamics are strongly nonlinear and potentially multi-modal (say, a bearing that might have multiple competing failure modes), the particle filter is the only one that doesn’t require you to squint and pretend the math works.

The Computational Reality

Here’s where things get uncomfortable. I benchmarked both on the target hardware — an ARM Cortex-A53 running at 1.2 GHz (similar to what you’d find in an edge device).

Filter            Time per Step   Memory       Tracks Nonlinearity
Kalman            0.02 ms         ~100 bytes   No
EKF               0.05 ms         ~200 bytes   Moderate
UKF               0.3 ms          ~2 KB        Good
Particle (500)    1.8 ms          ~8 KB        Excellent
Particle (1000)   3.5 ms          ~16 KB       Excellent
Particle (5000)   18 ms           ~80 KB       Excellent

With a 5-second update interval, even the 5000-particle filter runs comfortably in real time. But if you’re doing this at higher rates — say updating at 100 Hz for some reason — then 1000 particles starts eating into your budget. The Kalman filter basically costs nothing.

I found 500 particles was the minimum for reliable tracking on our data. Below that, you start getting occasional tracking failures where the particle cloud collapses to the wrong region. 1000 particles gave consistently good results with reasonable compute.

One trick that helped a lot: I used a biased initial particle placement instead of just drawing from a uniform prior. At startup, I spread particles across the expected state range but concentrated more of them toward the healthy end (since most bearings start healthy). This reduced the initial convergence time from ~50 steps to ~10.

def initialize_biased(self, bias_toward_healthy=0.7):
    # more particles near 0 (healthy), fewer near 1 (failed)
    n_healthy = int(self.n * bias_toward_healthy)
    n_rest = self.n - n_healthy
    healthy_particles = np.random.beta(2, 8, n_healthy)  # skewed toward 0
    rest_particles = np.random.uniform(0, 1, n_rest)
    self.particles = np.concatenate([healthy_particles, rest_particles])
    np.random.shuffle(self.particles)
    self.weights = np.ones(self.n) / self.n

Tuning Nightmares and What I Learned

The Kalman filter has two knobs: $Q$ (process noise) and $R$ (measurement noise). Get the ratio roughly right and it works. I spent maybe 30 minutes tuning it.

The particle filter was a different story. Beyond the noise parameters, you’re also tuning the number of particles, the nonlinear model parameters ($\alpha$ in my case), the resampling threshold, and the initial distribution. These interact in non-obvious ways.

The worst issue I hit was particle depletion during the early phase. When the bearing is healthy and barely changing, all particles converge to nearly the same value. Then when the degradation suddenly accelerates, the particle cloud doesn’t have enough diversity to track the change. Adding a small amount of jitter after resampling (sometimes called roughening or regularization) fixed this:

def resample(self):
    positions = (np.arange(self.n) + np.random.uniform()) / self.n
    cumsum = np.cumsum(self.weights)
    indices = np.searchsorted(cumsum, positions)
    self.particles = self.particles[indices]
    self.weights = np.ones(self.n) / self.n
    # regularization: add small noise to prevent collapse
    self.particles += np.random.normal(0, 1e-3, self.n)
    self.particles = np.clip(self.particles, 0, 1)

That 1e-3 of regularization noise is small enough not to blur the estimate but large enough to keep diversity alive. Finding the right value took more trial and error than I’d like to admit.
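One common alternative to hand-tuning a fixed jitter is scaling it to the current particle spread, with a small floor so a fully collapsed cloud can still recover. This is a standard regularization idea, not something from my experiment, and the scale factor here is a guess:

```python
import numpy as np

def adaptive_jitter(particles, scale=0.2, floor=1e-4):
    """Roughening noise scaled to the particle cloud's own spread."""
    sigma = max(scale * np.std(particles), floor)  # floor prevents total collapse
    jittered = particles + np.random.normal(0, sigma, particles.size)
    return np.clip(jittered, 0, 1)  # keep the health index in [0, 1]
```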

Another gotcha: the choice of likelihood function in the particle filter weight update matters more than you’d think. I initially used a Gaussian likelihood, which works fine when the measurement noise is actually Gaussian. But bearing vibration data often has heavy tails — occasional spikes from resonance or external impacts. Switching to a Student-t likelihood with about 5 degrees of freedom made the filter much more robust to outliers:

from scipy.stats import t as t_dist

def update_robust(self, z, df=5):
    likelihoods = t_dist.pdf(z - self.particles, df=df, scale=np.sqrt(self.R))
    likelihoods += 1e-300
    self.weights = likelihoods / likelihoods.sum()

When to Use Which

After going through all of this, here’s my honest take:

Use a standard Kalman filter if your degradation is approximately linear and your sensors are well-behaved. For early-stage monitoring where you’re just tracking slow wear trends, it’s hard to beat. Simple, fast, well-understood. If you’re running on a microcontroller with barely any resources, this is your friend.

Use a particle filter when the degradation process has distinct phases — slow wear followed by rapid progression — or when you suspect the underlying dynamics are strongly nonlinear. Also reach for it when the noise characteristics are non-Gaussian (heavy tails, multimodal). The computational cost is real but manageable on any modern edge device for typical PHM update rates.

The UKF is the underrated middle ground. It handles moderate nonlinearity without the computational overhead of particles. If I were building a production system today and didn’t have strong evidence of multi-modal degradation, I’d probably start with a UKF and only upgrade to a particle filter if tracking performance wasn’t sufficient.

One thing I didn’t expect: the particle filter’s advantage was most pronounced in the critical transition zone — exactly where you need it most. During routine monitoring, the Kalman filter was perfectly adequate. It’s only when things start going wrong that the nonlinear filter earns its keep. There’s a philosophical point in there about over-engineering for the normal case vs. being prepared for the abnormal one.

What I’d Do Differently

If I were starting this project over, I’d implement a switching strategy: run a Kalman filter during normal operation (cheap, accurate for linear degradation), and automatically switch to a particle filter when the innovation sequence — the difference between predicted and actual measurements — exceeds a threshold for several consecutive steps. That way you get the efficiency of the Kalman filter 90% of the time and the nonlinear tracking capability when it actually matters.

The detection of when to switch isn’t trivial, but a simple CUSUM test on the innovation magnitude works reasonably well. I prototyped it and the combined approach caught the degradation transition within 8 time steps while using particle filtering for only about 12% of the total runtime. Best of both worlds.
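A minimal sketch of that switching logic, with a one-sided CUSUM on the innovation magnitude (class and parameter names are mine; kf and pf stand in for any two filters exposing the step() interface used above):

```python
class SwitchingEstimator:
    """Run the cheap filter until a CUSUM alarm, then hand off to the
    expensive one. The drift allowance and threshold are illustrative."""
    def __init__(self, kf, pf, drift_allowance=0.02, cusum_threshold=0.1):
        self.kf, self.pf = kf, pf
        self.k = drift_allowance   # slack subtracted each step
        self.h = cusum_threshold   # alarm level
        self.cusum = 0.0
        self.using_pf = False

    def step(self, z):
        if self.using_pf:
            return self.pf.step(z)
        # gap between the last estimate and the new measurement,
        # a cheap proxy for the innovation when H = 1
        innovation = z - self.kf.x
        estimate = self.kf.step(z)
        # one-sided CUSUM: accumulate excess innovation magnitude
        self.cusum = max(0.0, self.cusum + abs(innovation) - self.k)
        if self.cusum > self.h:
            self.using_pf = True  # switch permanently once triggered
        return estimate
```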

Bearing state estimation is one of those problems that seems simple on the surface — just filter some noisy data, right? — but the details of how degradation actually progresses make it genuinely challenging. The sensor data is noisy, the physics are nonlinear, and the failure modes are variable. Getting the estimation right is the foundation that everything else (RUL prediction, maintenance scheduling, alarm generation) builds on. Worth spending the time to get it right.
