Building an ML Engine in Rust as a Solo Developer

The technical journey of building Brainlet — why Rust, why local-first, why specialized models instead of bigger LLMs.

Brainlet started as a small tool for one developer.

The problem was simple: every coding assistant could read files, but none of them understood the project. They could find matching text. They could pull nearby code into context. They could summarize what was already in front of them. But they kept missing the architecture.

The answer was not a larger context window. It was a different kind of context.

That became the design constraint for Brainlet: build a small engine that learns the shape of a codebase before an LLM is asked to act on it. The goal is not to replace the model. The goal is to stop wasting the model’s reasoning on reconstructing project context from scattered files.

Why Rust

The engine has to index large repositories, parse many languages, train lightweight models, and answer queries quickly on a local machine. Rust fits those constraints well. It gives the engine predictable performance, low memory overhead, and enough control to keep the pipeline local-first.

Brainlet is built to run on a developer laptop or a small company server. No cloud dependency. No code leaving the machine.

That choice affects every part of the architecture. Parsing has to be incremental. Intermediate representations have to be compact. The query layer has to respond fast enough to be useful inside normal developer workflows, not as an offline research job.
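To make that concrete, here is a minimal sketch of what incremental indexing can look like: fingerprint each file, re-parse only when the bytes change, and keep a compact per-file summary instead of full syntax trees. Everything here is illustrative; the names and types are hypothetical, not Brainlet's actual internals.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Compact per-file record: just enough structure to answer queries,
/// small enough to keep the whole index in memory on a laptop.
#[derive(Clone)]
struct FileSummary {
    content_hash: u64,    // fingerprint of the file's bytes
    symbols: Vec<String>, // top-level definitions in this file
    imports: Vec<String>, // modules this file depends on
}

#[derive(Default)]
struct Index {
    files: HashMap<String, FileSummary>, // keyed by relative path
}

fn fingerprint(bytes: &[u8]) -> u64 {
    let mut h = std::collections::hash_map::DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

impl Index {
    /// Incremental update: re-parse a file only when its contents changed.
    fn update(&mut self, path: &str, bytes: &[u8], parse: impl Fn(&[u8]) -> FileSummary) {
        let hash = fingerprint(bytes);
        match self.files.get(path) {
            // Unchanged file: keep the existing summary, skip the parse.
            Some(existing) if existing.content_hash == hash => {}
            _ => {
                let mut summary = parse(bytes);
                summary.content_hash = hash;
                self.files.insert(path.to_string(), summary);
            }
        }
    }
}
```

The sketch encodes the same design choice the paragraph describes: most edits touch a handful of files, so an index keyed by content hash keeps reindexing cost proportional to the change, not to the repository.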

Why local-first

Code is sensitive. For many teams, sending an entire codebase to a hosted service is not acceptable. For regulated companies, it is often impossible.

Brainlet keeps the project where it already lives. The intelligence layer is computed locally. The LLM connected at inference time can be anything the team chooses, including open-source models running on their own hardware.
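One way to picture that boundary (a hypothetical shape, not Brainlet's real API) is a backend trait: the engine computes the context locally, and only that computed context crosses into whichever model the team has wired up.

```rust
/// Hypothetical backend abstraction: the engine does its work locally,
/// then hands the computed context to whatever model is configured.
trait LlmBackend {
    fn complete(&self, context: &str, prompt: &str) -> String;
}

/// An open-source model served on the team's own hardware.
struct LocalModel;

impl LlmBackend for LocalModel {
    fn complete(&self, context: &str, prompt: &str) -> String {
        // A real backend would call a local inference server here.
        // The key property: `context` was computed on this machine,
        // and the repository itself never left it.
        format!("[local model] {} bytes of context, prompt: {prompt}", context.len())
    }
}

/// The caller never knows or cares which model sits behind the trait.
fn answer(backend: &dyn LlmBackend, context: &str, prompt: &str) -> String {
    backend.complete(context, prompt)
}
```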

This matters most for teams with private repositories, regulated data, or strict vendor review. A code intelligence engine that requires uploading the full repository creates a security conversation before it creates value. Brainlet is built to avoid that tradeoff.

Why specialized models

Brainlet does not train a general-purpose LLM. It trains multiple lightweight models on the structure of the specific project in front of it.

Those models learn task-specific signals: how components relate, which files behave similarly, where changes propagate, what patterns are normal in this codebase, and which constraints matter.
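One of those signals, change propagation, is easy to illustrate. A simplified and deliberately hypothetical version: count how often pairs of files change in the same commit. Files that repeatedly co-change are coupled, whether or not their text looks similar. This shows the kind of signal involved, not Brainlet's training pipeline.

```rust
use std::collections::HashMap;

/// Count how often each pair of files appears in the same commit.
/// Each commit is represented as the list of file paths it touched.
fn co_change_counts(commits: &[Vec<String>]) -> HashMap<(String, String), u32> {
    let mut counts = HashMap::new();
    for files in commits {
        for i in 0..files.len() {
            for j in (i + 1)..files.len() {
                // Order the pair so (a, b) and (b, a) share one entry.
                let key = if files[i] <= files[j] {
                    (files[i].clone(), files[j].clone())
                } else {
                    (files[j].clone(), files[i].clone())
                };
                *counts.entry(key).or_insert(0) += 1;
            }
        }
    }
    counts
}

fn main() {
    let history = vec![
        vec!["src/auth.rs".into(), "src/session.rs".into()],
        vec!["src/auth.rs".into(), "src/session.rs".into(), "src/db.rs".into()],
    ];
    let counts = co_change_counts(&history);
    // auth.rs and session.rs have co-changed twice: a propagation hint.
    println!("{:?}", counts.get(&("src/auth.rs".into(), "src/session.rs".into())));
}
```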

The result is not a pile of retrieved files. It is computed project understanding.

That distinction is the heart of Cognitive Augmented Generation. Retrieval asks, “which chunks look similar to this prompt?” Brainlet asks, “what does this model need to know about the project before it answers?” Those are different problems, and they produce different systems.
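A deliberately toy contrast makes the difference visible. Retrieval ranks chunks by how much they resemble the prompt; computed context starts from what the task touches and pulls in its structural neighborhood, similar-looking or not. Neither function below is Brainlet's implementation; they exist only to show the two shapes side by side.

```rust
use std::collections::{HashMap, HashSet};

/// Retrieval-shaped: score each (path, text) chunk by lexical overlap
/// with the prompt, and return paths ordered by that score.
fn retrieve(chunks: &[(String, String)], prompt: &str) -> Vec<String> {
    let words: HashSet<&str> = prompt.split_whitespace().collect();
    let mut scored: Vec<(usize, &String)> = chunks
        .iter()
        .map(|(path, text)| {
            let overlap = text.split_whitespace().filter(|w| words.contains(w)).count();
            (overlap, path)
        })
        .collect();
    scored.sort_by(|a, b| b.0.cmp(&a.0));
    scored.into_iter().map(|(_, path)| path.clone()).collect()
}

/// Context-shaped: ignore lexical similarity entirely; return the files
/// structurally related to the one the task touches, from a precomputed
/// dependency map. A dependency matters even if the prompt never names it.
fn compute_context(deps: &HashMap<String, Vec<String>>, touched_file: &str) -> Vec<String> {
    deps.get(touched_file).cloned().unwrap_or_default()
}
```

The first function can only ever return what resembles the question. The second can return what the question needs but never mentioned.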

The small brain bet

The industry keeps assuming bigger is better. Bigger models. Bigger context windows. Bigger bills.

Brainlet takes the opposite bet. A smaller engine that understands the project can make any model more useful. The point is not to replace the LLM. The point is to give it the context it should have had from the start.

Public benchmark methodology and results will be published in June 2026. Until then, the honest claim is architectural: better project context should reduce the amount of reasoning the LLM has to spend rediscovering the codebase.