Run the whole thing on your own GPU (yes, with the Wi-Fi off)

If your code can't leave the building, your test generator shouldn't phone home either. TFactory runs fully on local models.

A fun question to ask any “AI for your codebase” tool: where does my code go? For a lot of teams — banks, hospitals, anyone with a compliance officer who owns a red pen — the only acceptable answer is nowhere. Their code does not leave the network. Full stop.

That rules out most AI test tools, which are thin wrappers around a single hosted API. TFactory isn’t.

Bring your own brain

TFactory picks its LLM provider from the model string, and the list is deliberately long: Claude, Codex, Gemini, GitHub Copilot — and, crucially, the local crowd: Ollama, vLLM, LM Studio, LocalAI, and anything that speaks the OpenAI-compatible protocol. Point it at a model running on the box under your desk and the entire pipeline — Planner, Gen-Functional, Evaluator, Triager — runs against that. No token leaves the room.

# the whole pipeline, on a local Qwen, no internet required
TF_MODEL=ollama:qwen2.5-coder:14b   tfactory handover --spec 001

“Air-gapped” as a feature, not an apology

This isn’t a degraded mode you tolerate. It’s a first-class path:

So you can prove it to the person with the red pen, not just claim it.

The boring truth about local models

They’re not as sharp as the frontier hosted ones — a 14B model plans a feature more conservatively than Claude does. But “runs entirely on our hardware, verifiably, with no egress” is worth a lot more than a few extra IQ points to the teams who need it. And the gap shrinks every month.

If that’s you, the BYO-LLM guide has the provider config, and the architecture page explains how the model string routes to a provider.


← All posts