How I Built ClawCost in a Week (And Why the API Bill Scared Me Into It)

I was burning through Claude API tokens faster than I expected, with no good way to see it coming. So I built ClawCost — a real-time cost tracker for the Claude API — in about a week.

If you’ve ever built anything serious on top of an LLM API, you’ve probably had a moment where you opened your usage dashboard and felt your stomach drop.

I had several of those moments.

The actual problem

When you’re building products with the Claude API, your token usage can escalate fast. It’s not that any single call is expensive — it’s the volume. You’re testing a feature, iterating on a prompt, running an evaluation loop, and suddenly you’ve burned through a lot more than you intended. And because the feedback is asynchronous — the bill comes later, not in the moment — you often don’t realize how fast it’s happening until after the fact.

I was hitting my usage limits quickly while building. Not because I was being reckless, but because that’s what active development looks like. I couldn’t afford to run models locally — a proper GPU setup is $5,000+ and I wasn’t at that point. So I was entirely dependent on the API, which meant I was entirely dependent on not losing track of what I was spending.

I started reading about other developers having the same experience. Horror stories of unexpected bills. Teams that had a runaway process blow through their budget overnight. People setting calendar reminders to check their dashboards because there was no automatic alerting. The Claude API in particular — because the models are capable enough to be genuinely useful in production workflows — tends to attract the kind of usage patterns that can get expensive fast.

How the idea took shape

I was working with OpenClaw, and I used it to do some research — pulling threads from Reddit and Hacker News about pain points around Claude API usage. Cost visibility and control came up repeatedly. The “I had no idea how much I was spending until the invoice hit” problem was real and shared across a lot of developers.

The specific thing I wanted was simple: know what I’m spending in real time, at the session and project level, and have something that would stop me before I went too far. Not a dashboard that told me what happened last week — something that told me what was happening right now.
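The "stop me before I went too far" part can be sketched as a session-level budget guard — a minimal illustration, not ClawCost's actual implementation, with hypothetical names and an illustrative cap:

```python
# Hypothetical sketch: a session-level budget guard that refuses to
# let spending continue past a hard cap. Class and method names are
# illustrative, not taken from ClawCost.

class BudgetExceeded(RuntimeError):
    """Raised when a session's cumulative spend blows past its cap."""
    pass


class BudgetGuard:
    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd   # hard ceiling for this session
        self.spent_usd = 0.0     # running total

    def charge(self, cost_usd: float) -> None:
        """Record one call's cost; raise once the cap is exceeded."""
        self.spent_usd += cost_usd
        if self.spent_usd > self.cap_usd:
            raise BudgetExceeded(
                f"session spend ${self.spent_usd:.2f} "
                f"exceeds cap ${self.cap_usd:.2f}"
            )
```

The design choice worth noting is that the guard raises *during* the session rather than reporting afterward — that is the difference between an alert and an invoice.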

That’s what I built.

How the week went

From validation to live product took about a week. That sounds fast, and it was — but it was fast in the way that solo building with AI assistance can be fast. Claude did a lot of the implementation heavy lifting. I was the one who knew what I actually wanted, kept the design coherent, and made the decisions about what to cut.

The technical decisions weren’t the hard part. The architecture is straightforward: intercept API calls, track token counts per session and project, calculate cost in real time against current Anthropic pricing, surface that clearly. The WebSocket connection that pushes updates to the dashboard is the interesting technical piece, but it worked without drama.
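The tracking core described above — token counts in, per-session dollars out — can be sketched in a few lines. This is a simplified illustration, not ClawCost's code: the price table uses made-up model names and rates (USD per million tokens), and the class is hypothetical.

```python
# Minimal sketch of the tracking core: convert a call's token usage
# into dollars and accumulate it per session. Prices and model names
# below are illustrative placeholders, not Anthropic's actual rates.

from collections import defaultdict

# Assumed price table: (input, output) in USD per 1M tokens.
PRICES = {
    "example-model-large": (3.00, 15.00),
    "example-model-small": (0.80, 4.00),
}


class CostTracker:
    def __init__(self):
        self.by_session = defaultdict(float)  # session name -> USD spent

    def record(self, session: str, model: str,
               input_tokens: int, output_tokens: int) -> float:
        """Price one call and add it to the session's running total."""
        in_price, out_price = PRICES[model]
        cost = (input_tokens * in_price
                + output_tokens * out_price) / 1_000_000
        self.by_session[session] += cost
        return cost
```

In practice the token counts come back on every API response, so the interception layer just feeds them into something like `record()` as calls complete — no polling, no waiting for the invoice.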

The hard part was the design and feel of the thing.

That’s the part nobody talks about enough. You can build functional software in a week. Building software that feels good to use is harder. I went through several iterations of the dashboard UI before landing on something I was satisfied with. How do you display cost information without creating anxiety? How do you make the alerting useful without making it annoying? What’s the right level of granularity for usage data?

These aren’t engineering problems. They’re product problems, and they take longer than the code.

What I shipped

ClawCost is live at getclawcost.com. It tracks token spend per model, session, and project in real time. There’s a free tier for solo developers and a Pro tier for teams that need to track usage across multiple people. It’s self-hostable if you’d rather keep the data entirely on your own infrastructure.

The thing I’m most satisfied with is that it does what it promised: it tells you what you’re spending before you regret it. That’s the whole job.

The broader lesson

I’ve started to think of ClawCost as a proof of concept for how I want to work. Find a pain I’m personally experiencing. Validate that others have the same pain. Build the minimum version that actually solves it. Ship it, then learn what to add.

The week-to-launch timeline is possible because I’m not trying to build everything upfront. I’m trying to build the thing that solves the specific problem first, and let the product grow from there based on what users actually need.

That applies to every product I’m working on. KerfOS started the same way. Gaphunter will too. The mistake most builders make is trying to get to feature-complete before launch. Feature-complete is a moving target. Solving the core problem isn’t.

Build that part first. Ship it. Then figure out what else needs to exist.