BiTree

© 2026 BiTree. All rights reserved.
35 min · Beginner

The LLM Landscape: Models and Companies

After this lesson, you will be able to: name the major models and the companies behind them, and evaluate which one to use when.

The LLM landscape changes every quarter. This lesson maps the players (OpenAI, Anthropic, Google, Meta, Mistral), their models' strengths, and how to think about evaluation.

Prerequisites: What are Large Language Models?

The major players (as of 2026)

  • Anthropic. Claude family (Opus, Sonnet, Haiku). Strong reasoning, long context, safety focus.
  • OpenAI. GPT-4, GPT-4o, o1 (reasoning). Broadest ecosystem.
  • Google. Gemini family. Multimodal, longest context.
  • Meta. Llama family. Open weights, runnable locally.
  • Mistral. Open and commercial models. Strong European alternative.

ℹ️ Open vs closed weights

  • Closed (Claude, GPT-4o, Gemini): API only, pay-per-token, best frontier capability.
  • Open (Llama 3.x, Mistral, DeepSeek, Qwen): download the weights and run them yourself or through an inference provider. You gain privacy, cost predictability, and customization, but open models often sit 6-12 months behind the frontier.

Hugging Face is the canonical hub for open-source models, datasets, and inference endpoints. Browse hf.co/models, search by task and license, and you can run many of the smaller models in a Colab notebook within minutes.

Multimodal capability matters more than benchmarks

Modern frontier models read images, parse PDFs, transcribe audio, and in some cases watch video. Claude 3.5+ and Claude 4 handle images and PDFs natively. GPT-4o is multimodal on both input and output (text, image, audio). Gemini 2.x handles the longest video inputs. When picking a model, ask "what input shapes will my users send?" before you look at benchmark scores.

Context window changes the design space

The context window is how much input the model can read at once. Modern frontier models offer 128K-1M tokens (Claude Sonnet 4.x ~200K, GPT-4o 128K, Gemini 2 Pro 1M+). A large context lets you skip RAG for medium-sized documents by placing the whole document inline in one long prompt. There are two trade-offs: long contexts cost more per request, and models attend less reliably to material in the middle of a long prompt (the 'lost in the middle' effect). Always test retrieval quality at the size you'll deploy.
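A quick way to apply this is to check whether your worst-case input fits before you commit to a model. The sketch below is only a rough estimate: the dictionary keys are illustrative labels (not official API model IDs), the limits are the approximate figures quoted above, and the 4-characters-per-token ratio is a common rule of thumb for English text rather than a real tokenizer.

```python
# Approximate context limits quoted in this lesson; confirm against provider docs.
# The keys are illustrative labels, not official API model identifiers.
CONTEXT_LIMITS = {
    "claude-sonnet-4.x": 200_000,
    "gpt-4o": 128_000,
    "gemini-2-pro": 1_000_000,
}

CHARS_PER_TOKEN = 4  # assumption: rough average for English prose


def estimate_tokens(text: str) -> int:
    """Rough token estimate; use the provider's own tokenizer for real numbers."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_context(text: str, model: str, reserved_for_output: int = 4_000) -> bool:
    """True if the estimated prompt plus an output budget fits the model's window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_LIMITS[model]


if __name__ == "__main__":
    document = "word " * 150_000  # about 750K characters of filler text
    for model in CONTEXT_LIMITS:
        print(model, fits_in_context(document, model))
```

Fitting in the window is a necessary check, not a sufficient one: a document that fits can still suffer from the 'lost in the middle' effect, so test answer quality at that size too.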

Picking a model

  1. Capability tier (frontier vs. cheap and fast).
  2. Latency requirements (fastest to slowest: Haiku, Sonnet, Opus on Anthropic; gpt-4o-mini, gpt-4o on OpenAI).
  3. Cost per token (compare via OpenRouter, llmprices.com).
  4. Modality requirements (text only? image input? long PDFs?).
  5. Context window (does your worst-case input fit?).
  6. Privacy (data residency, training opt-out, BYOK).
  7. Tools/features (function calling, vision, computer use, structured output).
  8. Always benchmark on YOUR data, not generic leaderboards.
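The hard requirements in this checklist (context window, modality) and the cost-per-token comparison can be sketched as a filter followed by a sort. Everything in the catalogue below is hypothetical: the model names and prices are placeholders made up for illustration, so substitute real figures from each provider's pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_tokens: int
    accepts_images: bool
    usd_per_million_input: float   # placeholder price, not a real quote
    usd_per_million_output: float  # placeholder price, not a real quote


# Hypothetical catalogue: names and numbers are invented for this sketch.
CATALOGUE = [
    ModelSpec("small-fast", 128_000, False, 0.20, 0.80),
    ModelSpec("mid-tier", 200_000, True, 3.00, 15.00),
    ModelSpec("frontier", 1_000_000, True, 15.00, 75.00),
]


def request_cost(spec: ModelSpec, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the spec's per-million-token prices."""
    total = (input_tokens * spec.usd_per_million_input
             + output_tokens * spec.usd_per_million_output)
    return total / 1_000_000


def shortlist(catalogue, worst_case_input: int, needs_images: bool):
    """Drop models that fail a hard requirement, then sort cheapest first."""
    candidates = [
        m for m in catalogue
        if m.context_tokens >= worst_case_input
        and (m.accepts_images or not needs_images)
    ]
    # Rank by the cost of a worst-case request with a 1,000-token reply.
    return sorted(candidates, key=lambda m: request_cost(m, worst_case_input, 1_000))


if __name__ == "__main__":
    for spec in shortlist(CATALOGUE, worst_case_input=150_000, needs_images=True):
        print(spec.name, round(request_cost(spec, 150_000, 1_000), 3))
```

The shortlist only narrows the field. Latency, privacy terms, and quality on your own data still need to be checked for whichever models survive the filter.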

Benchmarks (with skepticism)

MMLU (broad knowledge), HumanEval (code), GPQA (graduate-level science), SWE-bench (real GitHub issues), LMSYS Arena (human preference). Use them as filters to build a shortlist; make the final choice from your own evals on your own task.
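Running "your own evals" can start very small: a handful of labelled examples from your product and a function that scores each candidate model on them. The sketch below assumes a ticket-classification task; the examples are hypothetical, and the two "models" are stand-in functions where a real setup would wrap an API call.

```python
from typing import Callable, List, Tuple

# Hypothetical labelled examples; replace with real inputs from your product.
EVAL_CASES: List[Tuple[str, str]] = [
    ("Refund my order #123", "billing"),
    ("The app crashes on login", "bug"),
    ("How do I export my data?", "how-to"),
]


def accuracy(model_fn: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases where the model's answer matches the expected label."""
    hits = sum(
        1 for prompt, expected in cases
        if model_fn(prompt).strip().lower() == expected
    )
    return hits / len(cases)


# Stand-ins for real models; a real version would call a provider's API here.
def keyword_model(prompt: str) -> str:
    text = prompt.lower()
    if "refund" in text:
        return "billing"
    if "crash" in text:
        return "bug"
    return "how-to"


def constant_model(prompt: str) -> str:
    return "bug"


if __name__ == "__main__":
    for name, fn in [("keyword", keyword_model), ("constant", constant_model)]:
        print(name, accuracy(fn, EVAL_CASES))
```

Exact-match scoring works for classification-style tasks like this one. Free-form outputs usually need a rubric or human review instead.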

Common mistakes only experienced AI engineers catch

  • Picking by leaderboard, not by task. Arena rankings shift weekly, and the task that matters for your product probably isn't a benchmark.
  • Defaulting to the most expensive model. Sonnet or GPT-4o-mini handle roughly 80% of production tasks at a fraction of the cost.
  • Ignoring multimodal options when the input includes screenshots, scanned forms, or audio.
  • Locking in a single provider. Build a thin abstraction (or use OpenRouter or LiteLLM) so that swapping providers takes a config change, not a rewrite.
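One possible shape for such a thin abstraction is a registry that maps a provider name from config to a function with a shared signature. This is a minimal sketch under stated assumptions: the provider names and the `complete` helper are hypothetical, and the provider functions return canned strings where a real version would call the vendor's SDK.

```python
from typing import Callable, Dict

# A provider is any function that turns a prompt into completion text.
Provider = Callable[[str], str]

_PROVIDERS: Dict[str, Provider] = {}


def register(name: str) -> Callable[[Provider], Provider]:
    """Decorator that records a provider function under a config-friendly name."""
    def decorator(fn: Provider) -> Provider:
        _PROVIDERS[name] = fn
        return fn
    return decorator


@register("provider_a")
def _call_provider_a(prompt: str) -> str:
    # Placeholder: a real version would call provider A's SDK here.
    return f"[provider_a] {prompt}"


@register("provider_b")
def _call_provider_b(prompt: str) -> str:
    # Placeholder: a real version would call provider B's SDK here.
    return f"[provider_b] {prompt}"


def complete(prompt: str, config: dict) -> str:
    """Route the prompt to whichever provider the config names."""
    try:
        provider = _PROVIDERS[config["provider"]]
    except KeyError:
        raise ValueError(f"unknown provider: {config.get('provider')!r}") from None
    return provider(prompt)


if __name__ == "__main__":
    # Swapping providers is a one-line config change; calling code stays the same.
    print(complete("Summarize this ticket", {"provider": "provider_a"}))
    print(complete("Summarize this ticket", {"provider": "provider_b"}))
```

Application code calls only `complete`, so provider-specific details such as authentication, retries, and message formats stay inside the registered functions.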
