Meta · Pro · $9/month · 10 tokens / message · Slow

Llama 4 Maverick · Meta

Llama 3.1 405B from Meta. 405 billion parameters. Quality close to GPT-4 but open-weights. Excellent for tasks that require depth.

405B open · Flagship quality · Depth

Llama 4 Maverick requires the Pro plan

Create a free account to start, or subscribe to a plan for unlimited use of premium models.

No card to start · Cancel anytime

About Llama 4 Maverick

Where it shines: 405B open · Flagship quality · Depth.

How to use Llama 4 Maverick

  1. Type or upload

    Type what you want in the box above, or upload the file if the tool asks for one.

  2. Generate

    Click the main button. Wait 2-30 seconds depending on the model and input size.

  3. Download or share

    Download the result or share the direct link. No watermark, ready to use.

Frequently asked questions

How much does it cost to use Llama 3.1 405B?

Llama 3.1 405B is a Pro model: it costs 20 tokens per use (about $0.10 in real cost to us). You need a Pro plan ($9/month for 15,000 tokens) or a one-shot pack. If you already have tokens in your free account, you can also spend them directly.

How many uses of Llama 3.1 405B are included in the Pro plan?

Pro ($9/month) gives you 15,000 recurring tokens. At 20 tokens per use of Llama 3.1 405B, that's 750 full uses per cycle. If you run out, the one-shot packs (5,000 / 25,000 / 80,000 tokens) are added to your balance and do not expire for at least one year.
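
The arithmetic in this answer can be checked with a short sketch. The figures (15,000 tokens per cycle, 20 tokens per use, and the three pack sizes) are taken from the answers above; the function name `full_uses` is only illustrative and not part of any ia.gratis tooling.

```python
# Token figures quoted in the FAQ answers above.
PRO_TOKENS_PER_CYCLE = 15_000
TOKENS_PER_USE = 20
ONE_SHOT_PACKS = (5_000, 25_000, 80_000)


def full_uses(balance: int, cost_per_use: int = TOKENS_PER_USE) -> int:
    """Return how many complete uses a token balance covers."""
    if cost_per_use <= 0:
        raise ValueError("cost_per_use must be positive")
    # Integer division: a partial use does not count.
    return balance // cost_per_use


pro_uses = full_uses(PRO_TOKENS_PER_CYCLE)
pack_uses = {pack: full_uses(pack) for pack in ONE_SHOT_PACKS}

print(pro_uses)   # 750
print(pack_uses)  # {5000: 250, 25000: 1250, 80000: 4000}
```

If the per-use price differs for your account, pass it as `cost_per_use` to see how the number of uses changes.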

What makes Llama 3.1 405B special?

Meta publishes the full weights of the Llama family, which is well tested in general chat, code, and multilingual tasks. This model's specific strengths are its open 405B weights, flagship quality, and depth.

How fast does Llama 3.1 405B respond?

Llama 3.1 405B is a "thinker" model: it reasons explicitly before responding, so it takes longer; expect waits of 15-45 seconds. The actual time also depends on the length of the prompt and the load on the datacenter: models with a huge context take longer when you enter very long texts.

How do I use Llama 3.1 405B in ia.gratis?

You can use Llama 3.1 405B from /chat/ by selecting Llama 3.1 405B in the picker, or via the REST API with `model=llama-405b` in the POST body. Quick summary: the largest Llama, with open flagship quality. The internal model identifier is `llama-405b`, which is the value to pass when integrating via the API.
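
As a rough sketch of the API route, the snippet below builds (but does not send) a POST request carrying the model identifier. Only `llama-405b` comes from the answer above; the endpoint URL, the `message` field, the JSON encoding of the body, and the bearer-token header are assumptions made for illustration, so check the ia.gratis API documentation for the real values.

```python
import json
import urllib.request

MODEL_ID = "llama-405b"  # identifier quoted in the answer above

# Hypothetical endpoint: the real URL is not given on this page.
API_URL = "https://example.invalid/api/chat"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a POST request with the model identifier in the body."""
    # Field name "message" and JSON encoding are assumptions.
    body = json.dumps({"model": MODEL_ID, "message": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


req = build_request("Summarize this paragraph.", api_key="YOUR_API_KEY")
print(req.get_method())               # POST
print(json.loads(req.data)["model"])  # llama-405b
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` with a generous `timeout`, since responses from this model can take tens of seconds.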