Undergraduate Research Project
Regime Interpretability
Can sparse autoencoders make learned market regime embeddings easier to interpret?
Andrew Stewart · UC San Diego · exploratory ML interpretability project
Motivation
Market regimes are useful, but often hard to explain.
Dense embeddings
Autoencoders can compress rolling market windows, but dense latent dimensions are not naturally human-readable.
Regime labels
Cluster or HMM states can be convenient, but a state number alone does not explain what changed in the market.
Research goal
Test whether sparse features give a cleaner inspection layer over learned regime embeddings.
Scope
This is interpretability research, not a claim of production trading performance.
Data Setup
Rolling multi-asset return windows
Universe
SPY QQQ IWM XLF XLE XLK TLT IEF HYG GLD SLV DBA UUP ^VIX EEM
Windowing
Daily prices are converted to log returns, then formed into standardized 20-day rolling windows.
Sample
Current config covers 2015-10-01 through 2025-09-30.
External checks
VIX, VIX change, yield spread proxy, SPY momentum, and cross-asset correlation.
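The windowing step described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `make_windows` and the synthetic price array are hypothetical, and the choice to z-score each asset over the whole sample is an assumption (the source does not say whether standardization is global or per window, and a real pipeline would likely fit the statistics on the training split only).

```python
import numpy as np

def make_windows(prices, window=20):
    """Turn a (T, n_assets) array of positive daily prices into flattened windows."""
    # Log returns between consecutive days: shape (T - 1, n_assets).
    log_ret = np.diff(np.log(prices), axis=0)
    # Assumption: per-asset z-score over the full sample.
    mu = log_ret.mean(axis=0)
    sd = log_ret.std(axis=0) + 1e-8
    z = (log_ret - mu) / sd
    # One window per valid start day, stacked to (n_windows, window, n_assets).
    n_windows = z.shape[0] - window + 1
    windows = np.stack([z[i:i + window] for i in range(n_windows)])
    # Flatten each window into a single vector of length window * n_assets.
    return windows.reshape(n_windows, -1)

# Synthetic random-walk prices for 15 hypothetical assets over 60 days.
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(60, 15)), axis=0))
X = make_windows(prices)
print(X.shape)
```

With 15 assets and a 20-day window, each row has 20 x 15 = 300 entries, which is the input size quoted for the dense autoencoder.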
Dense Autoencoder
A compact representation of each market window
Input
Flattened 20-day return windows across 15 tickers, matching the configured 300-dimensional input.
Encoder
MLP hidden layers compress each window into a 32-dimensional latent embedding.
Training signal
Reconstruct the original standardized window using mean squared error.
Output artifact
results/embeddings.pt stores full, train, validation, and test embeddings.
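As a shape-level illustration of the encoder, latent, and reconstruction loss described above, here is a minimal NumPy sketch of the forward pass. The hidden width of 128, the single hidden layer per side, the ReLU activations, and all names are assumptions made for illustration; the weights are random and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """He-style random weights and zero biases for one dense layer."""
    w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
    return w, np.zeros(n_out)

# INPUT_DIM and LATENT_DIM come from the text; HIDDEN_DIM is hypothetical.
INPUT_DIM, HIDDEN_DIM, LATENT_DIM = 300, 128, 32

enc1, enc2 = init_layer(INPUT_DIM, HIDDEN_DIM), init_layer(HIDDEN_DIM, LATENT_DIM)
dec1, dec2 = init_layer(LATENT_DIM, HIDDEN_DIM), init_layer(HIDDEN_DIM, INPUT_DIM)

def dense(x, layer):
    w, b = layer
    return x @ w + b

def encode(x):
    # 300 -> hidden (ReLU) -> 32-dimensional embedding.
    return dense(np.maximum(dense(x, enc1), 0.0), enc2)

def decode(z):
    # 32 -> hidden (ReLU) -> 300-dimensional reconstruction.
    return dense(np.maximum(dense(z, dec1), 0.0), dec2)

def reconstruction_mse(x):
    """Mean squared error between a batch of windows and its reconstruction."""
    return float(np.mean((decode(encode(x)) - x) ** 2))

batch = rng.normal(size=(4, INPUT_DIM))  # stand-in for standardized windows
embedding = encode(batch)
loss = reconstruction_mse(batch)
print(embedding.shape, loss)
```

Training would minimize `reconstruction_mse` over the training windows with a gradient-based optimizer; that part is omitted here.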
Sparse Autoencoder
TopK sparse features over dense embeddings
Input
Frozen 32-dimensional dense embeddings.
Dictionary
128 sparse features, an overcomplete representation.
Activation
TopK keeps 8 active features per sample.
dense embedding -> sparse code -> reconstructed embedding
TopK active features
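The flow above (dense embedding, sparse code, reconstructed embedding) can be sketched as a TopK forward pass. This is an illustrative NumPy version with random weights, not the trained model: the weight names, the subtraction of the decoder bias before encoding, and the final clamp at zero are assumptions borrowed from common TopK sparse autoencoder setups.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM, DICT_SIZE, K = 32, 128, 8  # sizes taken from the text

# Randomly initialized parameters (hypothetical names).
W_enc = rng.normal(0, 0.1, size=(EMBED_DIM, DICT_SIZE))
b_enc = np.zeros(DICT_SIZE)
W_dec = rng.normal(0, 0.1, size=(DICT_SIZE, EMBED_DIM))
b_dec = np.zeros(EMBED_DIM)

def topk(pre, k):
    """Keep the k largest entries in each row and zero out the rest."""
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    rows = np.arange(pre.shape[0])[:, None]
    out = np.zeros_like(pre)
    out[rows, idx] = pre[rows, idx]
    return out

def sae_forward(z):
    pre = (z - b_dec) @ W_enc + b_enc        # dictionary pre-activations
    code = np.maximum(topk(pre, K), 0.0)     # at most K non-negative features
    recon = code @ W_dec + b_dec             # reconstructed embedding
    return code, recon

z = rng.normal(size=(5, EMBED_DIM))          # stand-in for frozen dense embeddings
code, recon = sae_forward(z)
print(code.shape, recon.shape, (code != 0).sum(axis=1))
```

Because sparsity is enforced by the TopK selection itself, every sample activates at most 8 of the 128 features, which is what makes per-feature inspection tractable.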
Pipeline
End-to-end artifact flow
Interpretability Checks
Feature meaning is tested, not assumed
Indicator correlations
Compare feature activations against market indicators and assign heuristic labels only when correlation is strong enough.
Event heatmaps
Inspect average feature activity across selected market stress windows.
UMAP views
Visualize dense embeddings colored by VIX tercile, calendar year, or dominant sparse feature.
Backtest sanity check
Use regime-conditioned SPY timing only as a downstream plausibility check.
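The indicator-correlation check can be illustrated with a small sketch that assigns a label only when the strongest absolute Pearson correlation clears a threshold. The function names, the 0.5 threshold, the label format, and the synthetic data are all assumptions for illustration; the source does not state the actual cutoff.

```python
import numpy as np

def correlation_matrix(acts, indicators):
    """Pearson correlation of every feature column with every indicator column."""
    a = acts - acts.mean(axis=0)
    b = indicators - indicators.mean(axis=0)
    denom = np.outer(np.linalg.norm(a, axis=0), np.linalg.norm(b, axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        corr = (a.T @ b) / denom
    # Features that never activate have zero variance; report 0 instead of NaN.
    return np.nan_to_num(corr)

def heuristic_labels(acts, indicators, names, threshold=0.5):
    """Label each feature by its best-matching indicator, or None if too weak."""
    labels = []
    for row in correlation_matrix(acts, indicators):
        j = int(np.argmax(np.abs(row)))
        if abs(row[j]) >= threshold:
            sign = "+" if row[j] > 0 else "-"
            labels.append(f"{names[j]} ({sign})")
        else:
            labels.append(None)
    return labels

# Synthetic example: one VIX-like feature, one noise feature, one dead feature.
rng = np.random.default_rng(0)
n = 500
vix = rng.normal(size=n)
momentum = rng.normal(size=n)
indicators = np.column_stack([vix, momentum])
acts = np.column_stack([
    vix + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
    np.zeros(n),
])
labels = heuristic_labels(acts, indicators, ["vix", "spy_momentum"])
print(labels)
```

Leaving weakly correlated features unlabeled keeps the labels honest: a feature with no strong indicator match is reported as unexplained rather than forced into the nearest category.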
Baselines and Ablations
Keep the sparse model in context
HMM baseline
Gaussian hidden Markov models with several state counts.
K-means baseline
PCA reduction followed by K-means regime assignments.
Ablations
Latent dimension, TopK, dictionary size, window size, and sparsity penalty sweeps.
Transition timing
Compare detected regime changes around VIX spike events.
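One simple way to compare transition timing is to measure, for each event date, the signed offset to the nearest detected regime change. The sketch below is a hypothetical illustration: the function names, the `max_lag` cutoff, and the toy label sequence are assumptions, and the event indices stand in for VIX spike dates defined elsewhere.

```python
import numpy as np

def change_points(labels):
    """Indices where the regime label differs from the previous day."""
    labels = np.asarray(labels)
    return np.flatnonzero(labels[1:] != labels[:-1]) + 1

def timing_offsets(labels, event_idx, max_lag=5):
    """Signed offset (in days) from each event to the nearest regime change.

    Negative means the change came before the event. Events with no change
    within max_lag days get None.
    """
    cps = change_points(labels)
    offsets = []
    for e in event_idx:
        near = cps[np.abs(cps - e) <= max_lag]
        if near.size:
            offsets.append(int(near[np.argmin(np.abs(near - e))] - e))
        else:
            offsets.append(None)
    return offsets

# Toy sequence: regime 0 for 10 days, regime 1 for 10 days, regime 0 again.
labels = [0] * 10 + [1] * 10 + [0] * 10
offsets = timing_offsets(labels, event_idx=[12, 27])
print(offsets)
```

Running the same function on HMM, K-means, and sparse-feature regime assignments gives comparable offset lists for each method.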
Outputs
Figures appear after running the pipeline
Generated as results/umap_*.png
Generated as results/event_feature_heatmap.png
Generated as results/*_training_curves.png
Generated as results/equity_curve_*.png
Limitations and Future Work
Useful as a research probe, not a trading product
Limitations
- Yahoo Finance data dependence.
- Heuristic feature labels.
- Lightly tuned baselines.
- Backtest is a sanity check.
Future work
- Add pinned sample outputs.
- Improve experiment tracking.
- Test simpler PCA/factor baselines.
- Use calendar-aware event definitions.