Engineering rigour, applied to machine learning.
We've replaced the slow, document-heavy CRISP-DM lifecycle with a modern, production-first delivery model. Every project is wired end-to-end from sprint one — ingestion, feature engineering, modelling, deployment, and continuous monitoring — so capability accumulates and never decays.
Ingest
APIs, files, streams, MariaDB, FRED, Finage, OANDA, MetaTrader
Engineer
Cleaning, alignment, resampling, feature stores, lookback windows
Model
MRN ensembles, LightGBM, deep nets, LLMs/RAG, agentic workflows
Deploy
CI/CD via GitHub Actions, AWS EC2/EBS, Cloudflare, containerisation
Monitor
Drift detection, walk-forward validation, performance dashboards
We do not deliver a notebook — we deliver a continuously running system you can trust on Monday morning.
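The Engineer stage above — resampling and lookback windows — can be sketched in a few lines of pandas. The column names and window lengths here are illustrative, not a fixed recipe:

```python
import numpy as np
import pandas as pd

# Hypothetical daily price series standing in for an ingested feed.
idx = pd.date_range("2024-01-01", periods=120, freq="D")
rng = np.random.default_rng(0)
prices = pd.DataFrame({"close": 100 + np.cumsum(rng.normal(0, 1, 120))}, index=idx)

# Resample to weekly bars, then build lookback-window features.
weekly = prices["close"].resample("W").last()
features = pd.DataFrame({
    "ret_1w": weekly.pct_change(1),                   # 1-week return
    "ret_4w": weekly.pct_change(4),                   # 4-week return
    "vol_8w": weekly.pct_change().rolling(8).std(),   # 8-week realised volatility
    "ma_ratio": weekly / weekly.rolling(12).mean(),   # distance from 12-week mean
})

# Shift every feature by one bar so each row uses only information
# that was available at decision time — the leak-free discipline
# the validation principle below depends on.
features = features.shift(1).dropna()
```

The final `shift(1)` is the cheap insurance policy: without it, same-bar features silently leak the label into training.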
Four non-negotiables.
Start from the decision, not the data
Every engagement begins by characterising the decision the model will inform: cadence, cost of error, stakeholder, and downstream system. Modelling choices flow from there — not the other way around.
Honest validation or it does not ship
Time-series problems demand walk-forward validation, leak-free splits, and out-of-sample stress tests. We never report metrics we wouldn't stake our own money on.
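A minimal sketch of that walk-forward discipline, using scikit-learn's `TimeSeriesSplit` on synthetic data (the model and the five-fold, five-bar-gap configuration are illustrative):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=500)

# Expanding-window walk-forward: each fold trains strictly on the past
# and tests strictly on the future. The gap leaves a buffer between
# train and test to guard against leakage from overlapping labels.
scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5, gap=5).split(X):
    model = Ridge().fit(X[train_idx], y[train_idx])
    scores.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

print([round(s, 3) for s in scores])  # one out-of-sample MAE per fold
```

Reporting the per-fold scores, rather than one pooled number, is what makes the metric honest: a model that only works in one regime shows up immediately.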
Production from day one
Pipelines are versioned, containerised, and CI/CD-deployed from the first sprint. There is no "throw it over the wall" handover — there's never a wall.
Monitor, retrain, defend
Drift detection, performance dashboards, and scheduled retraining are part of the deliverable. Models age — our systems are designed to know it before you do.
GenAI, LLMs, RAG, and agentic AI — used where they earn it.
Retrieval-Augmented Generation
Grounded LLM systems that answer from your documents, warehouses, and knowledge bases — with citation, evaluation, and guardrails wired in from the start.
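The retrieval half of RAG can be sketched without any external service. This toy example uses bag-of-words vectors and cosine similarity purely for illustration — a production system would use a trained embedding model and a vector index, and the retrieved passage would then be placed in the LLM prompt with its source cited:

```python
import numpy as np

# Toy corpus standing in for your document store.
docs = [
    "Q3 revenue grew 12% driven by the EMEA segment.",
    "The retraining pipeline runs every Sunday at 02:00 UTC.",
    "Drift alerts page the on-call engineer via the dashboard.",
]

vocab = sorted({w for d in docs for w in d.lower().split()})

def embed(text: str) -> np.ndarray:
    """Unit-normalised bag-of-words vector (illustration only)."""
    counts = np.array([text.lower().split().count(w) for w in vocab], dtype=float)
    norm = np.linalg.norm(counts)
    return counts / norm if norm else counts

doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 1) -> list[str]:
    # Cosine similarity against every document, highest first.
    sims = doc_vecs @ embed(query)
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

context = retrieve("when does retraining run?")
```

Grounding the answer in `context` — instead of the model's parametric memory — is what makes citation and evaluation possible.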
Agentic workflows
Multi-step LLM agents that orchestrate tools, APIs, and classical ML models — with explicit cost, latency, and reliability budgets, not magical promises.
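The "explicit budgets, not magical promises" point can be made concrete with a minimal tool-dispatch loop. Everything here — the tool names, costs, and the scripted plan — is a hypothetical stand-in for LLM-driven tool selection:

```python
import time

# Illustrative tool registry: one external API call, one classical ML model.
TOOLS = {
    "lookup_rate": lambda q: {"rate": 4.25},
    "run_forecast": lambda q: {"forecast": 4.1},
}
COST_PER_CALL = {"lookup_rate": 0.001, "run_forecast": 0.01}  # dollars, illustrative

def run_agent(plan, cost_budget=0.05, latency_budget_s=2.0):
    """Execute tool calls while enforcing cost and latency budgets."""
    spent, start, trace = 0.0, time.monotonic(), []
    for tool_name, query in plan:
        # Enforce budgets *before* each call, not after the bill arrives.
        if spent + COST_PER_CALL[tool_name] > cost_budget:
            trace.append(("aborted", "cost budget exhausted"))
            break
        if time.monotonic() - start > latency_budget_s:
            trace.append(("aborted", "latency budget exhausted"))
            break
        spent += COST_PER_CALL[tool_name]
        trace.append((tool_name, TOOLS[tool_name](query)))
    return trace, spent

trace, spent = run_agent([("lookup_rate", "policy rate"),
                          ("run_forecast", "next quarter")])
```

The design choice is that the budget guard sits outside the agent's reasoning loop: the agent can propose anything, but only the dispatcher decides what actually runs.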
LLMs alongside time-series ML
We use language models to augment — not replace — the rigorous time-series and classification systems that drive measurable business outcomes.
A modern, opinionated stack.
Modelling
- MRN ensembles
- LightGBM / XGBoost
- PyTorch / TensorFlow
- scikit-learn
- Hugging Face Transformers
GenAI & LLMs
- OpenAI / Anthropic / open-weight LLMs
- RAG pipelines
- LangChain / LlamaIndex
- Agentic workflows
- Evaluation harnesses
Data & storage
- MariaDB / PostgreSQL
- Pandas / NumPy / Polars
- Feature stores
- Time-series DBs
- S3 / object storage
Sources & APIs
- FRED / OECD macro data
- Finage / OANDA / MetaTrader
- Bloomberg / Refinitiv
- Custom scrapers
- Internal warehouses
MLOps & deploy
- GitHub Actions CI/CD
- Docker / containerisation
- AWS EC2 / EBS / RDS
- Cloudflare Pages / Workers
- MLflow tracking
Validation & monitoring
- Walk-forward backtesting
- Drift detection
- Performance dashboards
- Statistical significance tests
- A/B and shadow deploys
Have a forecasting or classification problem that needs to work in production?
Tell us about it. First conversations are confidential, no-obligation, and usually end with a clear view of feasibility, data needs, and time-to-value.