Project Overview
This research project investigates the predictability of USD pip movements
in foreign exchange markets using machine learning. The study used a hybrid
architecture combining Temporal Convolutional Autoencoders (TCNAE) with LightGBM gradient boosting to analyze
3 years of hourly market data across 24 major currency pairs.
Key Finding: Despite sophisticated methodology and rigorous experimental design,
no statistically significant market edge was discovered. This negative result is
consistent with the efficient market hypothesis for hourly FX movements.
Python
PyTorch
LightGBM
OANDA v20 API
Docker
Pandas/NumPy
Temporal CNNs
Financial ML
Technical Architecture
Complete 3-stage ML pipeline: TCNAE autoencoder → LightGBM models → USD predictions
Core Components
- TCNAE (Temporal Convolutional Autoencoder): 537K-parameter model compressing 4-hour sequences into 120-dimensional latent representations
- LightGBM Models: 48 specialized gradient boosting models (2 per instrument) for pip magnitude and direction prediction
- Cross-Instrument Context: 24×5 feature tensor enabling information sharing across currency pairs
- USD Conversion Engine: Proper financial mathematics for actual trading value calculations
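The USD conversion step can be illustrated with a minimal sketch. It is not the project's actual engine: the function names, the `quote_to_usd` parameter, and the simplified pip-size rule (0.01 for JPY-quoted pairs, 0.0001 otherwise) are assumptions made for illustration.

```python
def pip_size(instrument: str) -> float:
    """Assumed convention: 0.01 for JPY-quoted pairs, 0.0001 for the rest."""
    quote = instrument.split("_")[1]
    return 0.01 if quote == "JPY" else 0.0001


def pip_value_usd(instrument: str, units: float, quote_to_usd: float) -> float:
    """USD value of a one-pip move for a position of `units` of base currency.

    quote_to_usd is the USD price of one unit of the quote currency
    (1.0 when the quote currency is USD, 1 / USD_JPY for JPY, and so on).
    """
    return pip_size(instrument) * units * quote_to_usd


def price_move_to_usd(instrument: str, price_change: float,
                      units: float, quote_to_usd: float) -> float:
    """Convert a raw price change into a USD profit or loss."""
    pips = price_change / pip_size(instrument)
    return pips * pip_value_usd(instrument, units, quote_to_usd)
```

Under these assumptions, one pip on 100,000 units of EUR_USD is worth roughly 10 USD, while the same position in USD_JPY needs the current JPY-to-USD rate before a pip can be expressed in dollars.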
Dataset & Methodology
Data Sources & Quality
- OANDA v20 API: Live trading environment data ensuring market realism
- Hourly Frequency: Chosen to balance signal against noise for technical analysis
- Rigorous Cleaning: 50% retention rate after aggressive quality filtering
- Temporal Validation: Strict chronological splits preventing lookahead bias
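A strict chronological split can be sketched as follows. The split fractions and the `(timestamp, value)` row layout are illustrative assumptions, not the project's actual settings.

```python
from typing import List, Sequence, Tuple

Row = Tuple[int, float]  # (timestamp, value); hypothetical row layout


def chronological_split(
    rows: Sequence[Row], train_frac: float = 0.7, val_frac: float = 0.15
) -> Tuple[List[Row], List[Row], List[Row]]:
    """Split rows into train/validation/test by time, never by shuffling."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    ordered = sorted(rows, key=lambda r: r[0])  # oldest first
    n = len(ordered)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    # Every training row precedes every validation row, which in turn
    # precedes every test row, so no future information reaches training.
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]
```

Because the rows are sorted before slicing, the latest training timestamp is always earlier than the earliest validation timestamp, which is the property that rules out lookahead bias at the split level.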
Results & Performance
Comprehensive performance analysis showing no statistically significant edges discovered
- Average Correlation: 0.02%
- Best Correlation (USD_CAD): 5.83%
Key Findings
- Direction accuracy clustered around 50% (random baseline) across all instruments
- Correlation coefficients remained below 6% for all currency pairs
- Both log returns and direct USD pip training approaches converged to identical conclusions
- Model uncertainty appropriately reflected market unpredictability
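The two headline metrics above, correlation and direction accuracy, can be computed as in the sketch below. The z-score against a 50% coin-flip baseline uses a normal approximation to the binomial. It is offered as one reasonable check, not necessarily the test used in the study.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between predictions and realized moves."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(vx * vy)


def direction_accuracy(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Share of samples where predicted and realized signs agree.

    Zero is treated as "not up" on both sides.
    """
    hits = sum((p > 0) == (a > 0) for p, a in zip(pred, actual))
    return hits / len(pred)


def accuracy_z_score(accuracy: float, n: int) -> float:
    """z-score of an accuracy against the 50% random baseline."""
    return (accuracy - 0.5) / math.sqrt(0.25 / n)
```

With this approximation, an accuracy near 50% yields a z-score near zero regardless of sample size, which matches the pattern reported across instruments.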
Technical Innovation
Novel Contributions
- Dual Model Architecture: Separate regression and classification models for pip magnitude and directional prediction
- Latent Caching System: Optimized training pipeline reducing computational overhead by 70%
- USD-Centric Design: Direct financial value calculation enabling economic interpretation
- Production-Grade Implementation: Docker containerization, comprehensive error handling, and OANDA live API integration
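The idea behind latent caching can be sketched as follows, assuming the encoder is frozen while the downstream models train. The class name, the `(instrument, window_end)` cache key, and the in-memory dictionary are hypothetical choices. The project's actual cache may be organized differently.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, int]  # (instrument, window end timestamp); assumed key layout


class LatentCache:
    """Memoize encoder outputs so each window is encoded only once."""

    def __init__(self, encode_fn: Callable[[str, int], List[float]]) -> None:
        self._encode_fn = encode_fn  # stand-in for the trained autoencoder
        self._store: Dict[Key, List[float]] = {}
        self.encoder_calls = 0

    def get(self, instrument: str, window_end: int) -> List[float]:
        key = (instrument, window_end)
        if key not in self._store:
            # Cache miss: run the (expensive) encoder and remember the result.
            self._store[key] = self._encode_fn(instrument, window_end)
            self.encoder_calls += 1
        return self._store[key]
```

Since the same latent vector can feed both the magnitude and the direction model for an instrument, reusing it avoids a second encoder pass. How much time this saves depends on the workload, so the sketch makes no claim about the size of the reduction.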
Methodological Rigor
- Temporal validation preventing data leakage
- Cross-instrument feature engineering with causal constraints
- Statistical significance testing across multiple timeframes
- Comprehensive ablation studies validating architectural choices
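A causal constraint on cross-instrument features means that a feature for hour t may use bars up to and including t, but nothing later. The sketch below illustrates this with a simple lookback return. The function names, the 4-bar lookback, and the aligned price lists are assumptions for illustration.

```python
from typing import Dict, List


def causal_cross_features(
    closes: Dict[str, List[float]], t: int, lookback: int = 4
) -> Dict[str, float]:
    """Return each instrument's return over the `lookback` bars ending at t.

    Only indices <= t are read, so later bars cannot influence the result.
    """
    if t < lookback:
        raise ValueError("not enough history for the requested lookback")
    return {
        inst: series[t] / series[t - lookback] - 1.0
        for inst, series in closes.items()
    }


def next_bar_target(series: List[float], t: int) -> float:
    """Prediction target: price change from bar t to bar t + 1."""
    return series[t + 1] - series[t]
```

A quick leakage check is to change a price after index t and confirm that the features for t are unchanged, while the target for t does change.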
Scientific Value
Research Contribution
This study provides negative evidence obtained with modern ML techniques
and can serve as a methodological template for rigorous financial prediction research.
Honest reporting of an unsuccessful search for an edge contributes to the field
by showing that, in this setting, sophisticated technical approaches
did not overcome market efficiency at the hourly timeframe.
Industry Implications
- Market Efficiency Validation: Evidence consistent with the efficient market hypothesis at hourly timeframes
- Methodological Framework: Reusable architecture for systematic market prediction research
- Technical Benchmarking: Establishes baseline performance expectations for FX prediction models
- Risk Management: Demonstrates importance of skeptical approach to technical trading strategies
Technical Skills Demonstrated
Deep Learning Architecture
Financial Data Engineering
Production ML Pipelines
Statistical Validation
Time Series Analysis
API Integration
Docker Containerization
Scientific Computing
Research Methodology
Code Quality & Testing
Repository Access
The complete codebase, experimental results, and documentation are available on GitHub.
The repository includes trained models, comprehensive analysis reports, and visualization
tools for full reproducibility of the research findings.
Repository Features: Production-ready code, comprehensive documentation,
experimental results, trained models, visualization suite, and Docker deployment configuration.