Building Tickstream: Real-Time Crypto Analytics with an AI Assist
I built Tickstream over a few weekends. It’s a real-time crypto analytics pipeline: BTC and ETH prices stream in from Coinbase, anomalies get detected against a rolling baseline, a local LLM generates narrative for significant moves, and news events get correlated with price reactions after the fact. Everything is observable through a Grafana dashboard embedded in a landing page.
This post is a walkthrough of how the system works and an honest account of building it with AI assistance.
The Architecture⌗
The system is a set of independent services, each responsible for one concern, communicating through an in-process event bus. This keeps the design modular: each piece can fail, retry, or be replaced without cascading through the rest.
```
Coinbase WebSocket
        │
        ▼
Price Consumer Verticle ─────────────────────┐
        │                                    │
        ▼                                    ▼
Anomaly Detection Verticle           InfluxDB Verticle
        │                                    ▲
        ├──► Push Notification (ntfy)        │
        │                                    │
        └──► InfluxDB (anomaly regions) ─────┘

Finnhub News ──► Sentiment (Ollama) ──► Price Impact ──► InfluxDB (events)
                                              │
                                              ▼
                                       SQLite storage
                                              │
                                              ▼
                                     Daily Briefing (ntfy)
```
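Concretely, the verticles in the top flow communicate over Vert.x's in-process event bus. Here is a minimal sketch of that wiring; the addresses and payload fields are my placeholders, not the project's actual ones.

```kotlin
import io.vertx.core.AbstractVerticle
import io.vertx.core.Vertx
import io.vertx.core.json.JsonObject

// Fans each incoming tick out in two directions, mirroring the diagram above.
class PriceConsumerVerticle : AbstractVerticle() {
    override fun start() {
        // In the real service this is fed by the Coinbase WebSocket; here the
        // tick is assumed to arrive on the bus as a JsonObject message.
        vertx.eventBus().consumer<JsonObject>("ticks.raw") { msg ->
            vertx.eventBus().publish("ticks.price", msg.body()) // to anomaly detection
            vertx.eventBus().send("influx.write", msg.body())   // to storage
        }
    }
}

class AnomalyDetectionVerticle : AbstractVerticle() {
    override fun start() {
        vertx.eventBus().consumer<JsonObject>("ticks.price") { msg ->
            val price = msg.body().getDouble("price")
            // ...rolling-baseline check, notifications, anomaly regions...
            println("tick: $price")
        }
    }
}

fun main() {
    val vertx = Vertx.vertx()
    vertx.deployVerticle(PriceConsumerVerticle())
    vertx.deployVerticle(AnomalyDetectionVerticle())
}
```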
Prices stream in from Coinbase’s WebSocket feed and flow in two directions: straight to storage, and through an anomaly detector that watches for significant moves against a rolling baseline. When something crosses the threshold, a push notification goes out and the anomaly is recorded alongside the price data.
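To make the rolling-baseline idea concrete, here is roughly the shape such a detector could take. The window size, threshold, and entry/exit bookkeeping are illustrative choices, not Tickstream's actual parameters.

```kotlin
import kotlin.math.abs

// Illustrative rolling-baseline detector. The baseline is the mean of the
// last `windowSize` ticks; a tick deviating more than `thresholdPct` from
// it opens an anomaly region, and returning inside the band closes it.
class RollingBaselineDetector(
    private val windowSize: Int = 300,      // e.g. ~5 minutes of 1s ticks
    private val thresholdPct: Double = 0.5  // deviation from baseline, in percent
) {
    private val window = ArrayDeque<Double>()
    private var sum = 0.0
    var inAnomaly = false
        private set

    /** Returns "entry" or "exit" when the anomaly state flips, null otherwise. */
    fun onTick(price: Double): String? {
        if (window.size < windowSize) {     // not enough history for a baseline yet
            window.addLast(price); sum += price
            return null
        }
        val baseline = sum / window.size
        val deviationPct = abs(price - baseline) / baseline * 100
        val anomalous = deviationPct >= thresholdPct

        // Anchor the baseline: while inside an anomaly the window is frozen,
        // so the spike itself is never absorbed into "normal".
        if (!anomalous) {
            sum -= window.removeFirst()
            window.addLast(price); sum += price
        }
        return when {
            anomalous && !inAnomaly -> { inAnomaly = true; "entry" }
            !anomalous && inAnomaly -> { inAnomaly = false; "exit" }
            else -> null
        }
    }
}
```

Freezing the window while inside an anomaly matters: otherwise the spike gets averaged into the baseline and the detector never signals an exit.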
News events are handled separately. A scheduler polls Finnhub at regular intervals, and each new item gets run through a local LLM for sentiment analysis before being matched against the price data in the surrounding window. The result is a record of what happened in the market alongside what was being said about it at the time — the correlation is more interesting than either signal alone.
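The matching step could look something like this: take the last price before the news item and the first price once a fixed window has elapsed, and record the move between them. The window length and function shape are assumptions for illustration.

```kotlin
import java.time.Duration
import java.time.Instant

// A hypothetical price-impact calculation: percent change from just before
// the news item to one window-length after it.
data class Tick(val time: Instant, val price: Double)

fun priceImpactPct(
    ticks: List<Tick>,    // assumed sorted by time
    newsAt: Instant,
    window: Duration = Duration.ofMinutes(15)
): Double? {
    val before = ticks.lastOrNull { it.time <= newsAt } ?: return null
    val after = ticks.firstOrNull { it.time >= newsAt.plus(window) } ?: return null
    return (after.price - before.price) / before.price * 100
}
```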
Everything is observable through Grafana: price history, detected anomalies, annotated news events, and application metrics and logs. The whole thing runs in Docker Compose on a Hetzner CX32, with Caddy as the reverse proxy handling the landing page and Grafana panel embedding.
What I Learned⌗
Vert.x is well suited to event-driven processing — the verticle model keeps concerns clean and the event loop handles high-frequency ticks without drama. The harder part was making it all stable: getting flows to restart on failure, wiring up retries with the right backoff, and making the observability good enough to actually know the app was working as it should. Health checks, Prometheus metrics, and structured logs cover most of that.
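By "the right backoff" I mean capped exponential backoff. A minimal sketch of the pattern, where the helper and its defaults are mine rather than the project's code:

```kotlin
import io.vertx.core.Vertx

// Retries an asynchronous action with capped exponential backoff:
// 1s, 2s, 4s, ... up to maxDelayMs.
fun retryWithBackoff(
    vertx: Vertx,
    attempt: Int = 0,
    maxDelayMs: Long = 60_000,
    action: (retry: () -> Unit) -> Unit
) {
    action {
        // The action invokes this callback on failure to schedule the next try.
        val delay = (1000L shl attempt.coerceAtMost(16)).coerceAtMost(maxDelayMs)
        vertx.setTimer(delay) {
            retryWithBackoff(vertx, attempt + 1, maxDelayMs, action)
        }
    }
}

// Usage sketch, with a hypothetical connect function:
// retryWithBackoff(vertx) { retry -> connectCoinbaseFeed(onClose = retry) }
```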
InfluxDB 3 was a deliberate choice: I wanted to work with a time-series database outside my usual stack. Grafana was the highlight. The flexibility of the dashboard builder — composable layouts, annotation overlays, variable-driven queries, Loki integration alongside metrics — is something I hadn't fully appreciated before. The fact that it's open source and self-hosted makes it even better.
AI-Assisted Development: What It Actually Looked Like⌗
I used Claude Code throughout — it runs in the terminal, integrates naturally with Vim and my Omarchy setup, and keeps context on the full codebase across the session.
The best use was kick-starting unfamiliar territory. InfluxDB 3 and the Grafana dashboard format were both fairly new to me, and having a working starting point — even an imperfect one — was faster than reading the full API surface from scratch. The generated code for InfluxDB queries gave me something to run against and iterate on. Same with Grafana: the dashboard JSON was based on older schema versions and needed corrections, but it was a better starting point than a blank file.

The problem is that the model's training data skews toward older documentation and Stack Overflow answers. For anything touching a recently changed API or a niche tool, going straight to the official docs is faster than iterating through confident wrong answers. The one frustrating limitation: Claude won't give you direct links. Being pointed to the right documentation page would have saved real time on multiple occasions.
The bigger lesson was about architecture. I generated a first version quickly and then realised most of the structural decisions were wrong. Vibe coding doesn’t work — not because the generated code is bad line by line, but because you can produce plausible-looking architecture that solves the wrong problem. The anomaly detector was the clearest example: the first generated version looked reasonable, but when I read it carefully it didn’t actually match what I needed. The issue was that I hadn’t thought through what anomaly detection should mean for this feed — what constitutes an entry, an exit, where to anchor the baseline. Only after working that out myself was I able to describe the problem clearly enough to get good code. The model can implement a solution well; it can’t figure out what the solution should be.
This also puts new pressure on code review. It’s surprisingly easy to generate nonsense that looks correct, compiles, and passes tests — especially when the tests were also written by AI. The test suite validates the implementation’s own assumptions, not yours. Reading AI-generated code requires more attention than reading code a colleague wrote, not less — because a colleague usually has a mental model of the problem that leaks into the code in legible ways. Generated code can be locally coherent but globally wrong, and the only way to catch that is to actually understand what it’s doing.
Working with Local LLMs⌗
Running Ollama locally is useful for a side project — no API costs, no data leaving the machine, and baking the model into the Docker image at build time means the container starts with no runtime dependency on a model registry.

The harder problem is getting consistent, meaningful output. Out of the box, llama3.2:3b was pretty good at generating the daily briefing from the day's news, but for sentiment analysis it skewed bearish on almost everything — I suspect the base model has an inherent bias toward negative framing in financial contexts. I fine-tuned it to correct that, and overcorrected: it started reading bullish into news that clearly wasn't. After another round it has drifted bearish again, and the scores are still inconsistent in a way that's hard to reason about.

Fine-tuning a small model for a specific analytical task is not as straightforward as it sounds, and I'm not convinced I've found the right approach yet. It's something I want to dig into more seriously — how to reliably adapt a general-purpose model to a narrow domain is genuinely interesting, and I've only scratched the surface.
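For what it's worth, the call itself is the easy part; the model is what's hard. A sketch of a constrained scoring request against Ollama's /api/generate endpoint, where the prompt wording, score range, and use of the base model tag are my own choices:

```kotlin
import io.vertx.core.json.JsonObject
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Asks the model for a bounded JSON score; "format": "json" and
// temperature 0 cut down (but don't eliminate) output drift.
fun sentimentScore(headline: String): Double {
    val prompt = "Rate the sentiment of this crypto news headline from " +
        "-1.0 (bearish) to 1.0 (bullish). Reply only with JSON like " +
        "{\"score\": 0.0}. Headline: $headline"
    val body = JsonObject()
        .put("model", "llama3.2:3b")
        .put("prompt", prompt)
        .put("stream", false)
        .put("format", "json")
        .put("options", JsonObject().put("temperature", 0))
        .encode()
    val request = HttpRequest.newBuilder(URI.create("http://localhost:11434/api/generate"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()
    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
    // Ollama wraps the model's reply in a "response" field.
    return JsonObject(JsonObject(response.body()).getString("response"))
        .getDouble("score")
}
```

Pinning temperature to 0 and requesting JSON output helps with format drift; as described above, it does nothing for the bias in the scores themselves.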
The Project as a Portfolio Artifact⌗
The goal from the start was to demonstrate Kotlin stream processing in a real setting. The AI assistance made it faster to build, and occasionally introduced errors that I had to catch and fix. That’s about the right way to describe it. The judgment about what to build, whether it was working correctly, and where the design was going wrong — that stayed with me. Which is, I think, exactly where it should be.
The project page is at tickstream.lzag.dev.