Market Analysis using Kalman Pairs

From Noisy Songs to Stock Profits: How I Taught My Computer to “Hear” the Market
Turning signal processing math into a market-neutral trading bot using Cointegration, Kalman Filters, and Mean Reversion
Introduction
Ever tried listening to your favorite song while someone is vacuuming right next to you? You can still catch the melody. Your brain filters out the noise and focuses on what matters. That idea became the foundation of this project. In my previous post on MFCCs, I explored how computers extract human voice signals from noisy audio. But while looking at a stock price chart one night, something clicked in my mind.
The "Wait, What?" Moment
Price charts look exactly like sound waves.
Random spikes. Hidden patterns. Underlying rhythm buried in chaos.
So, what if we treated financial data like audio signals?
Filter the noise, extract the structure, and then trade the pattern.
This post walks through the full system I built, from theory to Python implementation.
Signal Processing: Audio vs Finance
At their core, both domains deal with signals over time.

| Audio Signal Processing | Financial Signal Processing |
| Raw sound waves | Raw price data |
| Noise filtering | Market noise filtering |
| Feature extraction | Spread extraction |
| Voice detection | Trading signals |
Same mathematics but different application.
Step 1 — Cointegration: The Drunk Guy and His Dog

Imagine:
A drunk guy stumbling home
His dog running around
Both connected by a leash
Individually → random movement
Together → constrained distance
That leash is cointegration.
In markets, some assets move together long-term because of economic linkage.
Examples:
Gold ↔ Silver
Coke ↔ Pepsi
Bitcoin ↔ Ethereum
We don't bet on where Gold is going.
We trade the distance between the two assets.
When the leash stretches too far → we bet on it snapping back.
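Here is one rough way to look for a leash in data. It is a simplified, NumPy-only sketch, not the test used in this project: the function name `leash_check` and the synthetic series are assumptions made up for the example. It fits a static hedge ratio by ordinary least squares, then checks whether the leftover spread tends to pull back toward zero. A formal check would use a proper cointegration test such as Engle-Granger (available as `coint` in `statsmodels.tsa.stattools`).

```python
import numpy as np

def leash_check(x, y):
    """Fit y ~ beta * x + alpha by OLS and measure the pull-back of the residual.

    Returns (beta, alpha, pull), where pull is the slope from regressing the
    change in the residual on its previous value. A clearly negative pull
    suggests the spread is mean-reverting; it is not a formal test statistic.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # OLS for the static hedge ratio (beta) and intercept (alpha)
    A = np.column_stack([x, np.ones_like(x)])
    (beta, alpha), *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - (beta * x + alpha)
    # Regress the change in the residual on its lagged value (no intercept)
    lagged = resid[:-1]
    change = np.diff(resid)
    pull = float(lagged @ change / (lagged @ lagged))
    return float(beta), float(alpha), pull

# Synthetic example (hypothetical data): x is a random walk, y follows it on a leash
rng = np.random.default_rng(0)
n = 2000
x = np.cumsum(rng.normal(size=n))
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.9 * noise[t - 1] + rng.normal(scale=0.5)  # stationary AR(1)
y = 1.5 * x + 2.0 + noise

beta, alpha, pull = leash_check(x, y)
print(f"beta={beta:.3f}, alpha={alpha:.3f}, pull={pull:.3f}")
```

With a leash in place, `beta` should land near the true ratio and `pull` should come out negative. Two unrelated random walks would give a pull close to zero.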
Step 2 — Mean Reversion: The Rubber Band Model
Normal stock prices behave like balloons — they drift.
But spreads between cointegrated assets behave like rubber bands.
They revert to a long-term average.
This behavior is modeled by the Ornstein-Uhlenbeck Process.
Key Parameters
| Dial | What It Means | Real-World Analogy |
| θ (theta) : Mean Reversion Speed | How fast the rubber band snaps back | Tight rubber band = fast snap. Loose = slow |
| μ (mu) : Long-Term Mean | Where the spread "wants" to be | The resting point. Home base |
| σ (sigma) : Volatility | How wildly it bounces around | Calm day = small bounces. Crazy day = wild shaking |
High θ? The spread snaps back quickly → More trading opportunities → More chances to profit.
High σ? Lots of bouncing → Noisier signal → Harder to trade.
This technique measures these dials in real time and decides: is this rubber band stretched enough to bet on?
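To make the three dials concrete, here is a minimal sketch of one common way to estimate them from a spread series. It assumes evenly spaced samples and relies on the fact that an OU process sampled at a fixed interval is an AR(1) process. The function name `fit_ou` and the simulated data are assumptions for illustration, not the estimator used in this project.

```python
import numpy as np

def fit_ou(spread, dt=1.0):
    """Estimate (theta, mu, sigma) of an OU process from evenly spaced samples.

    Uses the exact AR(1) form of the OU process:
        S[t+1] = a + b * S[t] + noise,  with b = exp(-theta * dt).
    """
    s = np.asarray(spread, dtype=float)
    prev, nxt = s[:-1], s[1:]
    # OLS fit of nxt = a + b * prev (polyfit returns slope first, then intercept)
    b, a = np.polyfit(prev, nxt, 1)
    if not 0.0 < b < 1.0:
        raise ValueError("no mean reversion detected (AR(1) slope outside (0, 1))")
    resid = nxt - (a + b * prev)
    theta = -np.log(b) / dt
    mu = a / (1.0 - b)
    sigma = resid.std(ddof=2) * np.sqrt(2.0 * theta / (1.0 - b**2))
    return theta, mu, sigma

# Synthetic check (hypothetical parameters): simulate an OU path, then recover the dials
rng = np.random.default_rng(42)
true_theta, true_mu, true_sigma = 0.1, 1.0, 0.2
decay = np.exp(-true_theta)
step_std = true_sigma * np.sqrt((1.0 - decay**2) / (2.0 * true_theta))
path = np.empty(5000)
path[0] = true_mu
for t in range(1, len(path)):
    path[t] = true_mu + (path[t - 1] - true_mu) * decay + step_std * rng.normal()

theta, mu, sigma = fit_ou(path)
print(f"theta={theta:.3f}, mu={mu:.3f}, sigma={sigma:.3f}")
```

The estimates should land close to the values used in the simulation. On short or noisy series, expect θ in particular to be unstable.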
Step 3 — Z-Score: The Stretch Meter
We need a way to measure how “abnormal” the spread is.
That’s where Z-Score comes in.
Interpretation:
THE Z-SCORE THERMOMETER
+3.0 ┃ 🔴 EXTREMELY EXPENSIVE (rare!)
+2.0 ┃ 🔴 Very expensive
+1.5 ┃ 🟡 ──── SELL SIGNAL ──── ←── Robot screams "SELL!"
+1.0 ┃
0.0 ┃ 🟢 Normal. Chill. Do nothing.
-1.0 ┃
-1.5 ┃ 🟡 ──── BUY SIGNAL ──── ←── Robot screams "BUY!"
-2.0 ┃ 🔵 Very cheap
-3.0 ┃ 🔵 EXTREMELY CHEAP (rare!)
Z-Score near 0: The leash is relaxed. Nothing interesting. Go get coffee.
Z-Score hits +1.5: Gold is WAY too expensive relative to Silver. SELL GOLD, BUY SILVER.
Z-Score hits -1.5: Gold is WAY too cheap relative to Silver. BUY GOLD, SELL SILVER.
Z-Score comes back to 0: Close the trade. Pocket the difference.
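The stretch meter can be sketched in a few lines. The Z-Score is the spread minus its average, divided by its standard deviation. Using a rolling window for both is an assumption here, as are the function names; the ±1.5 entry and 0 exit levels come from the thermometer above.

```python
import pandas as pd

def rolling_zscore(spread, window):
    """Z-score of the spread against its own rolling mean and standard deviation."""
    mean = spread.rolling(window).mean()
    std = spread.rolling(window).std()
    return (spread - mean) / std

def positions_from_z(z, entry=1.5, exit_level=0.0):
    """Turn a Z-score series into spread positions: -1 short, +1 long, 0 flat."""
    pos = 0
    out = []
    for value in z:
        if pos == 0:
            if value > entry:
                pos = -1          # spread too high: short the spread
            elif value < -entry:
                pos = 1           # spread too low: long the spread
        elif pos == -1 and value <= exit_level:
            pos = 0               # spread came back: close the short
        elif pos == 1 and value >= -exit_level:
            pos = 0               # spread came back: close the long
        out.append(pos)
    return pd.Series(out, index=z.index)

z = pd.Series([0.0, 1.6, 1.0, 0.2, -0.1, -1.7, -0.5, 0.1])
print(positions_from_z(z).tolist())  # → [0, -1, -1, -1, 0, 1, 1, 0]
```

Note that this sketch closes a trade and opens the opposite one on separate steps, and that the first `window - 1` Z-Scores are NaN, which leaves the position unchanged.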
Step 4 — The Robot's Brain That Learns (Kalman Filter)
Here's where most trading bots mess up.
Old-school bots calculate the relationship between Gold and Silver ONCE and assume it never changes. Like measuring your shoe size at age 5 and buying the same shoes forever.
Spoiler: your feet grow. Markets change.
Sometimes Gold and Silver move in lockstep. Sometimes a financial crisis hits and they stop caring about each other entirely. The "leash length" (called beta, or β) is constantly shifting.
The Kalman Filter tracks that shifting β. At every time step it:
Predicts current relationship
Compares with new data
Measures error
Updates belief
Step 5 — Full Trading Pipeline
Download prices
↓
Log transform
↓
Kalman Filter → dynamic β
↓
Compute spread
↓
Compute Z-Score
↓
Generate signals
↓
Execute trades
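The middle and last steps of the pipeline can be sketched as below. This is a toy illustration under loud assumptions: the function name `spread_and_pnl` is made up, β and the intercept are taken as already estimated (for example by a Kalman Filter), positions are decided elsewhere, and trading costs and slippage are ignored.

```python
import numpy as np

def spread_and_pnl(price_x, price_y, beta, alpha, positions):
    """Compute the spread and a cost-free cumulative P&L for a spread position.

    All inputs are 1-D arrays of equal length. positions[t] is the spread
    position (+1, 0, -1) decided at time t and held over the next step.
    """
    log_x = np.log(np.asarray(price_x, dtype=float))   # log transform
    log_y = np.log(np.asarray(price_y, dtype=float))
    beta = np.asarray(beta, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    positions = np.asarray(positions, dtype=float)

    # Spread: what is left of y after removing the part explained by x
    spread = log_y - (beta * log_x + alpha)

    # Hedged one-step return: long y, short beta units of x, using the beta
    # known at the start of the step
    hedged = np.diff(log_y) - beta[:-1] * np.diff(log_x)
    pnl = positions[:-1] * hedged
    cumulative = np.concatenate([[0.0], np.cumsum(pnl)])
    return spread, cumulative

# Tiny hypothetical example: three days, constant beta of 1 and zero intercept
price_x = np.exp([0.0, 0.1, 0.1])
price_y = np.exp([0.0, 0.3, 0.2])
spread, cumulative = spread_and_pnl(price_x, price_y, [1, 1, 1], [0, 0, 0], [1, -1, 0])
print(spread, cumulative)
```

Using the position and β from the previous step avoids look-ahead: the trade held over a step is sized with information available before that step starts.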

| Auto-Tune | Kalman Filter |
| Listens to the singer's voice | Listens to market prices |
| Detects if the pitch is off | Detects if β has changed |
| Adjusts pitch in real-time | Adjusts β in real-time |
| Singer stays in tune | Robot trades on an up-to-date β |
Kalman Filter
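import numpy as np

def run_kalman_filter(x, y):
    """Estimate a time-varying slope (beta) and intercept for y ~ beta * x + alpha.

    x and y are pandas Series of equal length (for example, log prices).
    """
    n = len(x)
    delta = 0.0001                        # controls how quickly the state may drift
    Vw = delta / (1 - delta) * np.eye(2)  # process (state) noise covariance
    Ve = 0.001                            # observation noise variance
    theta = np.zeros((2, n))              # row 0: beta, row 1: intercept
    P = np.zeros((2, 2, n))               # state covariance at each step
    theta[:, 0] = [0, 0]
    P[:, :, 0] = np.eye(2) * 1000         # large initial uncertainty
    for t in range(1, n):
        # Predict: the state is a random walk, so the prediction is the last estimate
        theta_pred = theta[:, t-1]
        R = P[:, :, t-1] + Vw
        # Compare with new data: y(t) = beta * x(t) + intercept
        H = np.array([x.iloc[t], 1.0])
        y_pred = np.dot(H, theta_pred)
        error = y.iloc[t] - y_pred
        # Update: innovation variance Q and Kalman gain K
        Q = np.dot(np.dot(H, R), H.T) + Ve
        K = np.dot(R, H.T) / Q
        theta[:, t] = theta_pred + K * error
        P[:, :, t] = R - np.outer(K * Q, K)
    return theta

# x and y are the two price series as pandas Series (log prices in the pipeline above)
state = run_kalman_filter(x, y)
dynamic_beta = state[0, :]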
Typical observations:
Assets move together → consistent with cointegration
β changes over time → Kalman adapts
Z-Score oscillates → mean reversion visible
Trades cluster at extremes
Equity curve trends upward in stable regimes
This is a market-neutral strategy.
Mathematical Foundation
Ornstein-Uhlenbeck SDE: a stochastic, continuous-time process that models mean-reverting behavior, where a variable tends to drift towards a long-term mean over time.
dS = θ(μ − S)dt + σdW
Where:
S → Spread
θ → Reversion speed
μ → Mean
σ → Volatility
dW → Brownian noise
Rubber band force vs random shocks.
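To see the rubber band and the shocks pulling against each other, the equation can be simulated step by step. The sketch below uses the Euler-Maruyama scheme, a standard first-order approximation. The function name and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, s0, dt, n_steps, seed=None):
    """Simulate dS = theta * (mu - S) dt + sigma dW with the Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    path = np.empty(n_steps + 1)
    path[0] = s0
    for t in range(n_steps):
        pull = theta * (mu - path[t]) * dt            # rubber band force
        shock = sigma * np.sqrt(dt) * rng.normal()    # random shock
        path[t + 1] = path[t] + pull + shock
    return path

path = simulate_ou(theta=1.0, mu=2.0, sigma=0.1, s0=0.0, dt=0.01, n_steps=5000, seed=1)
print(f"start={path[0]:.2f}, average of last 2000 steps={path[-2000:].mean():.2f}")
```

The path starts far from μ, gets pulled toward it, and then jitters around it. Raising θ shortens the trip home; raising σ widens the jitter.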
State-Space Model: A state-space model in finance is a powerful mathematical framework for analyzing complex systems with unobservable (latent) factors, using a state equation (how hidden factors evolve, e.g., market sentiment, volatility) and an observation equation (linking these states to actual data like prices).
OBSERVATION EQUATION:
y(t) = β(t) × x(t) + α(t) + ε(t)
Translation: "Silver price = β × Gold price + intercept + noise"
STATE TRANSITION:
β(t) = β(t-1) + η(t)
Translation: "Today's β is yesterday's β plus some small change"
KALMAN GAIN (K):
K = (Predicted Uncertainty) / (Predicted Uncertainty + Measurement Noise)
Translation:
- If we're very uncertain → K is large → Trust the new data more
- If we're very certain → K is small → Stick with our prediction
Kalman Gain balances prediction vs measurement trust.
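A one-dimensional toy version makes the trade-off easy to see. This sketch covers only the scalar case with made-up numbers; the function names are assumptions, and the two-state filter used for β and the intercept works with matrices instead.

```python
def kalman_gain(predicted_uncertainty, measurement_noise):
    """Scalar Kalman gain: the share of the prediction error used to correct the estimate."""
    return predicted_uncertainty / (predicted_uncertainty + measurement_noise)

def update(prediction, measurement, predicted_uncertainty, measurement_noise):
    """Blend a prediction with a measurement; returns (new_estimate, new_uncertainty)."""
    k = kalman_gain(predicted_uncertainty, measurement_noise)
    new_estimate = prediction + k * (measurement - prediction)
    new_uncertainty = (1.0 - k) * predicted_uncertainty
    return new_estimate, new_uncertainty

# Very uncertain prediction: the estimate jumps almost all the way to the measurement
print(update(prediction=1.0, measurement=2.0, predicted_uncertainty=100.0, measurement_noise=1.0))
# Very certain prediction: the estimate barely moves
print(update(prediction=1.0, measurement=2.0, predicted_uncertainty=0.01, measurement_noise=1.0))
```

In both calls the measurement is the same. Only the confidence in the prediction changes, and that alone decides how far the estimate moves.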


✨ The Takeaway
"The market isn't chaotic. It's a song with too much static. My job wasn't to predict the next note; it was to hear the rhythm underneath."
Next time it rains, listen to the drops hitting your window. There's a rhythm in there: fast, slow, fast again. That pattern? That's a mean-reverting process. You've been hearing them your whole life.
You're already a quant. You just didn't know it yet.