© 2026 BiTree. All rights reserved.
60 min · Intermediate

Building with AI APIs

After this lesson, you will be able to use AI APIs in code: send requests, stream responses, and manage tokens and costs.

Prompting in a chat UI is only the start; building products means calling the API directly. This lesson covers the Anthropic and OpenAI APIs end to end.

Prerequisites: Prompt Engineering

API basics

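Get an API key from the Anthropic Console or the OpenAI platform. Authenticate with a request header: OpenAI expects `Authorization: Bearer <key>`, while Anthropic expects the key in an `x-api-key` header alongside an `anthropic-version` header. Send a POST request with a JSON body and you get a JSON response back. The shape is mostly the same for both providers.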
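To make the request shape concrete, here is a minimal sketch that builds (but does not send) a single-turn request for each provider. The helper name `build_request` is hypothetical, and the model name is whatever you pass in. Note that the two providers carry the API key in different headers.

```python
import json

ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"
OPENAI_URL = "https://api.openai.com/v1/chat/completions"


def build_request(provider, api_key, model, prompt, max_tokens=1024):
    """Return (url, headers, json_body) for a single-turn chat request.

    Hypothetical helper for illustration; it only assembles the pieces.
    """
    messages = [{"role": "user", "content": prompt}]
    if provider == "anthropic":
        headers = {
            "x-api-key": api_key,               # Anthropic: key in its own header
            "anthropic-version": "2023-06-01",  # required API version header
            "content-type": "application/json",
        }
        # max_tokens is a required field for Anthropic's Messages API.
        body = {"model": model, "max_tokens": max_tokens, "messages": messages}
        return ANTHROPIC_URL, headers, json.dumps(body)
    if provider == "openai":
        headers = {
            "Authorization": f"Bearer {api_key}",  # OpenAI: Bearer token
            "Content-Type": "application/json",
        }
        body = {"model": model, "messages": messages}
        return OPENAI_URL, headers, json.dumps(body)
    raise ValueError(f"unknown provider: {provider}")
```

The returned triple can be handed to any HTTP client, for example `requests.post(url, headers=headers, data=body)`. In practice the official SDKs do this assembly for you.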

Anthropic SDK example (Python)

Install with `pip install anthropic`:

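python
import os
from anthropic import Anthropic

# Read the key from the environment rather than hard-coding it.
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,  # upper bound on generated tokens
    messages=[
        {"role": "user", "content": "Explain RAG in 3 sentences."}
    ],
)
# content is a list of blocks; the first one holds the text reply.
print(response.content[0].text)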
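Beyond the text, the response object also reports token usage and why generation stopped, which is what you need for managing tokens and costs. The sketch below pulls those fields out; `summarize_response` is a hypothetical helper, and the `fake` object is a stand-in shaped like an SDK response so the code runs without an API key.

```python
from types import SimpleNamespace


def summarize_response(response):
    """Collect text, token counts, and a truncation flag from a Messages response."""
    # Join every text block rather than assuming there is exactly one.
    text = "".join(
        block.text for block in response.content if block.type == "text"
    )
    return {
        "text": text,
        "input_tokens": response.usage.input_tokens,
        "output_tokens": response.usage.output_tokens,
        # stop_reason "max_tokens" means the reply was cut off at the limit.
        "truncated": response.stop_reason == "max_tokens",
    }


# Stand-in for a real response (assumption: same attribute names as the SDK object).
fake = SimpleNamespace(
    content=[SimpleNamespace(type="text", text="RAG retrieves documents first.")],
    usage=SimpleNamespace(input_tokens=14, output_tokens=7),
    stop_reason="end_turn",
)
summary = summarize_response(fake)
print(summary["output_tokens"], summary["truncated"])
```

With a real client you would call `summarize_response(response)` on the object returned by `client.messages.create(...)`.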

💡 Streaming responses

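Don't make users wait several seconds for a full paragraph; stream tokens as they are generated. Both Anthropic and OpenAI support server-sent events (SSE) streams: pass `stream=True` to the create call and iterate over the returned events, or use the Anthropic SDK's `messages.stream()` helper.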

Streaming example

Stream text to the user as it arrives:

python
# Assumes `client` from the previous example.
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story."}],
) as stream:
    for text in stream.text_stream:
        # Print each chunk immediately, without a trailing newline.
        print(text, end="", flush=True)
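Under the hood, the SDK is reading an SSE stream: lines of `event:` and `data:` fields, where each `data:` line carries a JSON payload. As a rough sketch of what the helper does for you, the function below extracts text fragments from such lines. `iter_text_deltas` is a hypothetical name, and the sample events are simplified compared with what the API really sends.

```python
import json


def iter_text_deltas(lines):
    """Yield text fragments from SSE lines shaped like Anthropic stream events."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        payload = json.loads(line[len("data:"):])
        if payload.get("type") != "content_block_delta":
            continue  # ignore start/stop and other bookkeeping events
        delta = payload.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta["text"]


# Simplified sample stream (assumption: real events carry more fields).
sample = [
    "event: message_start",
    'data: {"type": "message_start"}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "Once "}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "upon a time"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
]
story = "".join(iter_text_deltas(sample))
print(story)
```

You rarely need to parse the stream by hand, but knowing the shape helps when debugging a proxy or a client in a language without an SDK.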

Production gotchas

  1. Token counting: set `max_tokens` conservatively.

  2. Rate limits: implement exponential backoff.

  3. Caching: Anthropic prompt caching cuts cost on long system prompts.

  4. System prompt vs. user message: the system prompt is for instructions, the user message for input.

  5. Logging: store prompts and responses for debugging and evals.

  6. Never put API keys in client code.
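For the rate-limit item, exponential backoff can be sketched as a small retry wrapper. This is a minimal sketch, not a production implementation: `RateLimited` is a hypothetical stand-in for whatever exception your client raises on HTTP 429, and the delay parameters are illustrative defaults.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimited with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # The cap doubles each attempt: 1s, 2s, 4s, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, cap))
```

Usage would look like `call_with_backoff(lambda: client.messages.create(...))`, after mapping the SDK's rate-limit exception onto the one caught here. The official SDKs may already retry some failures on their own, so check their retry settings before layering another loop on top.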

Cost management

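Sonnet is several times cheaper per token than Opus (roughly 5x at the time of writing; check current pricing, since the ratio changes between model generations). Use the smallest model that does the job. Use prompt caching for static system content. Set `max_tokens` aggressively low so a runaway response cannot inflate the bill. Streaming improves perceived latency; it does not reduce cost.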
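As an illustration of caching static system content, the sketch below builds the keyword arguments for a Messages call with the system prompt marked as cacheable. `build_cached_request` is a hypothetical helper, and the default `max_tokens` is an arbitrary example value. The static instructions go in `system`, and only the per-request input goes in `messages`.

```python
def build_cached_request(system_prompt, user_input,
                         model="claude-sonnet-4-6", max_tokens=512):
    """Build kwargs for client.messages.create with a cacheable system prompt."""
    return {
        "model": model,
        "max_tokens": max_tokens,  # keep this tight to bound output cost
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Marks the prefix up to this block as cacheable.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        # Only the part that changes per request goes in the user message.
        "messages": [{"role": "user", "content": user_input}],
    }


kwargs = build_cached_request(
    "You are a support assistant. Follow the policy below...",
    "Where is my order?",
)
```

You would then call `client.messages.create(**kwargs)`. Caching only applies once the prompt exceeds a minimum length, so very short system prompts will not benefit; see the provider documentation for current thresholds and pricing.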
