How to Use North Mini Code: The Free Agentic Coding Model for Hermes Agent

Verdict: North Mini Code is the first specialized "agentic" coding model that balances a small active footprint (3B) with high reasoning performance (30B MoE). For developers and small businesses using Hermes Agent, it provides a cost-free, high-speed alternative to frontier models for terminal tasks and software engineering.

Last verified: 2026-06-20
Best for: Agentic coding, terminal automation, and low-cost 24/7 research bots.
Key Specs: 30B Sparse MoE (3B active), 256K Context, Apache 2.0 License.

What is North Mini Code?

Released on June 9, 2026, North Mini Code is Cohere Labs’ first open-weight model purpose-built for the developer community. It belongs to the new "North" family of models, designed to move beyond simple chat completion and into the realm of agentic software engineering.

Unlike generalist models, North Mini Code is a sparse Mixture-of-Experts (MoE) architecture. It has 30 billion total parameters, but only activates 3 billion parameters per token. This design allows for the reasoning depth of a mid-sized model with the speed and low latency of a small model, making it ideal for the multi-turn "think-act-verify" loops required by AI agents.

How to Get the North Mini Code API for Free

The most accessible way to use North Mini Code today is through the OpenRouter Free API. OpenRouter provides a hosted version of the model that costs $0.00 per million tokens (as of June 2026), making it a perfect "free tier" for autonomous workflows.

OpenRouter: Search for cohere/north-mini-code:free.
Cohere Model Vault: Managed inference with production-grade rate limits.
Local Deployment: Available via Ollama or Hugging Face (requires ~1x H100 GPU for FP8/FP4 precision).

Setting Up North Mini Code in Hermes Agent

To wire North Mini Code into Hermes Agent, you should create a dedicated agent profile. This allows you to route specific "grind" tasks to the free model while keeping your frontier models (like Claude 3.7) for complex architectural decisions.

1. Configure the Provider

Set up OpenRouter as a provider in your Hermes configuration:

hermes model set OpenRouter

2. Create the North Mini Profile

Run this command to spin up a specialized profile:

hermes profile create north-mini --model "cohere/north-mini-code:free"

Once created, you can delegate tasks specifically to this profile. Because the API is free, you can let North Mini run 24/7 on Kanban tasks without worrying about your token budget.

Why North Mini Code Wins for Agentic Tasks

Most "Mini" models struggle with complex tool use or multi-step reasoning. North Mini Code solves this by being trained against multiple agent harnesses rather than just static datasets. It was optimized using Reinforcement Learning with Verifiable Rewards (RLVR) against frameworks like SWE-Agent and OpenCode.

Metric	North Mini Code (30B-A3B)	Gemma 4 (26B-A4B)	Qwen 3.5 (35B-A3B)
Artificial Analysis Coding Index	33.4	31.2	30.1
SWE-bench Verified	61.0	58.5	56.2
HumanEval (Pass@1)	78.4%	76.1%	75.8%
Context Window	256K	128K	128K

Source: Cohere Labs Internal Benchmarks (June 2026).

What this means for you

In 2026, the winning strategy for AI-driven business is Model Routing. Don't waste your expensive frontier model tokens on repetitive terminal tasks, file searching, or basic refactoring.

Route the "manual labor" of software engineering to North Mini Code. It can run in the background 24/7, managing your local AI assistant infrastructure, while you save your premium models for the high-level strategy and complex debugging that actually requires a "Frontier" brain.

North Mini Code vs Gemma 4 showdown

FAQ

Q: Is North Mini Code truly free?
A: Yes, the model weights are Apache 2.0, meaning you can run it locally for free forever. It is also currently offered as a free API on OpenRouter and through Cohere's trial keys.

Q: Does it support tool use and JSON?
A: Yes. North Mini Code is natively optimized for interleaved reasoning and tool use via JSON schema, allowing it to "think" before it calls a tool.

Q: Can it handle large codebases?
A: With a 256K context window, it can ingest significant portions of a repository at once, which is a major advantage over other small coding models.

Q: How do I run it locally?
A: Use ollama run north-mini-code. You will need at least 24GB of VRAM (like an RTX 4090 or Mac Studio) for a quantized version, or a dedicated H100 for full FP8 performance.

Sources

Updates & Corrections

2026-06-20: Article published. Fact-checked architecture (30B MoE) and benchmarks vs Gemma 4.

Last verified: 2026-06-20
Best for: Agentic coding, terminal automation, and low-cost 24/7 research bots.
Key Specs: 30B Sparse MoE (3B active), 256K Context, Apache 2.0 License.

What is North Mini Code?

How to Get the North Mini Code API for Free

OpenRouter: Search for cohere/north-mini-code:free.
Cohere Model Vault: Managed inference with production-grade rate limits.
Local Deployment: Available via Ollama or Hugging Face (requires ~1x H100 GPU for FP8/FP4 precision).

Setting Up North Mini Code in Hermes Agent

1. Configure the Provider

Set up OpenRouter as a provider in your Hermes configuration:

hermes model set OpenRouter

2. Create the North Mini Profile

Run this command to spin up a specialized profile:

hermes profile create north-mini --model "cohere/north-mini-code:free"

Once created, you can delegate tasks specifically to this profile. Because the API is free, you can let North Mini run 24/7 on Kanban tasks without worrying about your token budget.

Why North Mini Code Wins for Agentic Tasks

Metric	North Mini Code (30B-A3B)	Gemma 4 (26B-A4B)	Qwen 3.5 (35B-A3B)
Artificial Analysis Coding Index	33.4	31.2	30.1
SWE-bench Verified	61.0	58.5	56.2
HumanEval (Pass@1)	78.4%	76.1%	75.8%
Context Window	256K	128K	128K

Source: Cohere Labs Internal Benchmarks (June 2026).

What this means for you

In 2026, the winning strategy for AI-driven business is Model Routing. Don't waste your expensive frontier model tokens on repetitive terminal tasks, file searching, or basic refactoring.

North Mini Code vs Gemma 4 showdown

FAQ

Q: Does it support tool use and JSON?
A: Yes. North Mini Code is natively optimized for interleaved reasoning and tool use via JSON schema, allowing it to "think" before it calls a tool.

Q: Can it handle large codebases?
A: With a 256K context window, it can ingest significant portions of a repository at once, which is a major advantage over other small coding models.

Sources

Updates & Corrections

2026-06-20: Article published. Fact-checked architecture (30B MoE) and benchmarks vs Gemma 4.

How to Use North Mini Code: The Free Agentic Coding Model for Hermes Agent

What is North Mini Code?

How to Get the North Mini Code API for Free

Setting Up North Mini Code in Hermes Agent

1. Configure the Provider

2. Create the North Mini Profile

Why North Mini Code Wins for Agentic Tasks

What this means for you

FAQ

Get the practical AI brief

Discussion

How to Use North Mini Code: The Free Agentic Coding Model for Hermes Agent

What is North Mini Code?

How to Get the North Mini Code API for Free

Setting Up North Mini Code in Hermes Agent

1. Configure the Provider

2. Create the North Mini Profile

Why North Mini Code Wins for Agentic Tasks

What this means for you

FAQ

Get the practical AI brief

Discussion

What is North Mini Code?

How to Get the North Mini Code API for Free

Setting Up North Mini Code in Hermes Agent

1. Configure the Provider

2. Create the North Mini Profile

Why North Mini Code Wins for Agentic Tasks

What this means for you

Related reading

FAQ

Get the practical AI brief

Discussion

What is North Mini Code?

How to Get the North Mini Code API for Free

Setting Up North Mini Code in Hermes Agent

1. Configure the Provider

2. Create the North Mini Profile

Why North Mini Code Wins for Agentic Tasks

What this means for you

Related reading

FAQ

Get the practical AI brief

Discussion