Engineering rigour, applied to machine learning.
We've replaced the slow, document-heavy CRISP-DM lifecycle with a modern, production-first delivery model. Every project is wired end-to-end from sprint one — ingestion, feature engineering, modelling, deployment, and continuous monitoring — so capability accumulates and never decays.
Ingest
APIs, files, streams, MariaDB, FRED, Finage, OANDA, MetaTrader
Engineer
Cleaning, alignment, resampling, feature stores, lookback windows
Model
MRN ensembles, LightGBM, deep nets, LLMs/RAG, agentic workflows
Deploy
CI/CD via GitHub Actions, AWS EC2/EBS, Cloudflare, containerisation
Monitor
Drift detection, walk-forward validation, performance dashboards
We do not deliver a notebook — we deliver a continuously running system you can trust on Monday morning.
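The Engineer stage above — resampling and lookback windows — can be sketched in a few lines of pandas. The column names and window lengths here are illustrative, not a fixed recipe:

```python
import numpy as np
import pandas as pd

# Hypothetical daily price series standing in for an ingested feed.
idx = pd.date_range("2024-01-01", periods=120, freq="D")
rng = np.random.default_rng(0)
prices = pd.DataFrame({"close": 100 + np.cumsum(rng.normal(0, 1, 120))}, index=idx)

# Resample to weekly bars, then build lookback-window features.
weekly = prices["close"].resample("W").last()
features = pd.DataFrame({
    "ret_1w": weekly.pct_change(1),                   # 1-week return
    "ret_4w": weekly.pct_change(4),                   # 4-week return
    "vol_8w": weekly.pct_change().rolling(8).std(),   # 8-week realised volatility
    "ma_ratio": weekly / weekly.rolling(12).mean(),   # distance from 12-week mean
})

# Shift every feature by one bar so each row uses only information
# that was available at decision time — the leak-free discipline
# the validation principle below depends on.
features = features.shift(1).dropna()
```

The final `shift(1)` is the cheap insurance policy: without it, same-bar features silently leak the label into training.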
Four non-negotiables.
Start from the decision, not the data
Every engagement begins by characterising the decision the model will inform: cadence, cost of error, stakeholder, and downstream system. Modelling choices flow from there — not the other way around.
Honest validation or it does not ship
Time-series problems demand walk-forward validation, leak-free splits, and out-of-sample stress tests. We never report metrics we wouldn't stake our own money on.
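A minimal sketch of that walk-forward discipline, using scikit-learn's `TimeSeriesSplit` on synthetic data (the model and the five-fold, five-bar-gap configuration are illustrative):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=500)

# Expanding-window walk-forward: each fold trains strictly on the past
# and tests strictly on the future. The gap leaves a buffer between
# train and test to guard against leakage from overlapping labels.
scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5, gap=5).split(X):
    model = Ridge().fit(X[train_idx], y[train_idx])
    scores.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

print([round(s, 3) for s in scores])  # one out-of-sample MAE per fold
```

Reporting the per-fold scores, rather than one pooled number, is what makes the metric honest: a model that only works in one regime shows up immediately.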
Production from day one
Pipelines are versioned, containerised, and CI/CD-deployed from the first sprint. There is no "throw it over the wall" handover — there's never a wall.
Monitor, retrain, defend
Drift detection, performance dashboards, and scheduled retraining are part of the deliverable. Models age — our systems are designed to know it before you do.
GenAI, LLMs, RAG, and agentic AI — used where they earn it.
Retrieval-Augmented Generation
Grounded LLM systems that answer from your documents, warehouses, and knowledge bases — with citation, evaluation, and guardrails wired in from the start.
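The retrieval half of RAG can be sketched without any external service. This toy example uses bag-of-words vectors and cosine similarity purely for illustration — a production system would use a trained embedding model and a vector index, and the retrieved passage would then be placed in the LLM prompt with its source cited:

```python
import numpy as np

# Toy corpus standing in for your document store.
docs = [
    "Q3 revenue grew 12% driven by the EMEA segment.",
    "The retraining pipeline runs every Sunday at 02:00 UTC.",
    "Drift alerts page the on-call engineer via the dashboard.",
]

vocab = sorted({w for d in docs for w in d.lower().split()})

def embed(text: str) -> np.ndarray:
    """Unit-normalised bag-of-words vector (illustration only)."""
    counts = np.array([text.lower().split().count(w) for w in vocab], dtype=float)
    norm = np.linalg.norm(counts)
    return counts / norm if norm else counts

doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 1) -> list[str]:
    # Cosine similarity against every document, highest first.
    sims = doc_vecs @ embed(query)
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

context = retrieve("when does retraining run?")
```

Grounding the answer in `context` — instead of the model's parametric memory — is what makes citation and evaluation possible.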
Agentic workflows
Multi-step LLM agents that orchestrate tools, APIs, and classical ML models — with explicit cost, latency, and reliability budgets, not magical promises.
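The "explicit budgets, not magical promises" point can be made concrete with a minimal tool-dispatch loop. Everything here — the tool names, costs, and the scripted plan — is a hypothetical stand-in for LLM-driven tool selection:

```python
import time

# Illustrative tool registry: one external API call, one classical ML model.
TOOLS = {
    "lookup_rate": lambda q: {"rate": 4.25},
    "run_forecast": lambda q: {"forecast": 4.1},
}
COST_PER_CALL = {"lookup_rate": 0.001, "run_forecast": 0.01}  # dollars, illustrative

def run_agent(plan, cost_budget=0.05, latency_budget_s=2.0):
    """Execute tool calls while enforcing cost and latency budgets."""
    spent, start, trace = 0.0, time.monotonic(), []
    for tool_name, query in plan:
        # Enforce budgets *before* each call, not after the bill arrives.
        if spent + COST_PER_CALL[tool_name] > cost_budget:
            trace.append(("aborted", "cost budget exhausted"))
            break
        if time.monotonic() - start > latency_budget_s:
            trace.append(("aborted", "latency budget exhausted"))
            break
        spent += COST_PER_CALL[tool_name]
        trace.append((tool_name, TOOLS[tool_name](query)))
    return trace, spent

trace, spent = run_agent([("lookup_rate", "policy rate"),
                          ("run_forecast", "next quarter")])
```

The design choice is that the budget guard sits outside the agent's reasoning loop: the agent can propose anything, but only the dispatcher decides what actually runs.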
LLMs alongside time-series ML
We use language models to augment — not replace — the rigorous time-series and classification systems that drive measurable business outcomes.
A modern, opinionated stack.
Modelling
- MRN ensembles
- LightGBM / XGBoost
- PyTorch / TensorFlow
- scikit-learn
- Hugging Face Transformers
GenAI & LLMs
- OpenAI / Anthropic / open-weight LLMs
- RAG pipelines
- LangChain / LlamaIndex
- Agentic workflows
- Evaluation harnesses
Data & storage
- MariaDB / PostgreSQL
- Pandas / NumPy / Polars
- Feature stores
- Time-series DBs
- S3 / object storage
Sources & APIs
- FRED / OECD macro data
- Finage / OANDA / MetaTrader
- Bloomberg / Refinitiv
- Custom scrapers
- Internal warehouses
MLOps & deploy
- GitHub Actions CI/CD
- Docker / containerisation
- AWS EC2 / EBS / RDS
- Cloudflare Pages / Workers
- MLflow tracking
Validation & monitoring
- Walk-forward backtesting
- Drift detection
- Performance dashboards
- Statistical significance tests
- A/B and shadow deploys
Have a forecasting or classification problem that needs to work in production?
Tell us about it. First conversations are confidential, no-obligation, and usually end with a clear view of feasibility, data needs, and time-to-value.