OpenAI API Integration with Cursor: Real Apps, Not Demos
Wire chatbots, agents, and automations with auth, RAG, rate limits, and cost controls — production patterns for small teams.
Published May 21, 2026 · 11 min read
Introduction
Many tutorial repos call OpenAI directly from the client, which exposes the API key to anyone who opens the browser's network tab. Production apps do not. You need server-side routes, authentication, rate limits, logging, and cost caps. Cursor can scaffold all of these in an afternoon if you specify the architecture upfront.
This guide walks through integrating the OpenAI API for real apps (chat, embeddings, RAG, and agents) that are deployed on DigitalOcean rather than run as localhost demos.
Demo vs production
| Concern | Demo shortcut | Production pattern |
|---|---|---|
| API key | NEXT_PUBLIC_* env | Server-only OPENAI_API_KEY |
| Auth | None | Session or JWT on /api routes |
| Abuse | Unlimited | Rate limit per user + IP |
| Cost | Ignored | Token logging + daily budget cap |
| Models | Hard-coded | Env-driven model router |
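The "rate limit per user + IP" row can be sketched as a fixed-window counter. This is a minimal in-memory illustration, assuming a single server process; `FixedWindowLimiter` and `checkRequest` are hypothetical names, and a multi-instance deployment would likely need a shared store instead of a `Map`.

```typescript
// Fixed-window rate limiter (in-memory sketch, single process assumed).
type RateLimitResult = { allowed: boolean; remaining: number; retryAfterMs: number };

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `now` is injectable so the limiter can be tested without a real clock.
  check(key: string, now: number = Date.now()): RateLimitResult {
    const entry = this.windows.get(key);
    // Start a fresh window if none exists or the old one has expired.
    if (entry === undefined || now - entry.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return { allowed: true, remaining: this.limit - 1, retryAfterMs: 0 };
    }
    if (entry.count >= this.limit) {
      return { allowed: false, remaining: 0, retryAfterMs: entry.start + this.windowMs - now };
    }
    entry.count += 1;
    return { allowed: true, remaining: this.limit - entry.count, retryAfterMs: 0 };
  }
}

// A request must pass the per-user limit first, then the per-IP limit.
// Note: a request rejected by the IP limit still counts against the user.
function checkRequest(
  userLimiter: FixedWindowLimiter,
  ipLimiter: FixedWindowLimiter,
  userId: string,
  ip: string,
  now: number = Date.now(),
): RateLimitResult {
  const byUser = userLimiter.check(`user:${userId}`, now);
  if (!byUser.allowed) return byUser;
  return ipLimiter.check(`ip:${ip}`, now);
}
```

In a route handler, a rejected result would typically map to an HTTP 429 response with a `Retry-After` header derived from `retryAfterMs`.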
Request lifecycle
Put OpenAI calls in /services/openai or your API layer — not in React components. Validate input with Zod, truncate prompts server-side, and return structured errors the UI can display without exposing stack traces.
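The lifecycle above might look like the following sketch. To stay dependency-free it uses hand-rolled checks where a Zod schema would normally sit, and the OpenAI call is injected as a `complete` function. `parseChatInput`, `handleChat`, the error codes, and the 4000-character limit are all assumptions for illustration.

```typescript
// Validate, truncate, call the model, and return structured errors.
// Hand-rolled validation stands in for a Zod schema here.
type ChatInput = { message: string };
type ApiError = { code: "invalid_input" | "upstream_error"; message: string };
type Result<T> = { ok: true; data: T } | { ok: false; status: number; error: ApiError };

const MAX_PROMPT_CHARS = 4000; // assumed limit; tune to your model and budget

function invalid(message: string): Result<never> {
  return { ok: false, status: 400, error: { code: "invalid_input", message } };
}

function parseChatInput(body: unknown): Result<ChatInput> {
  if (typeof body !== "object" || body === null) {
    return invalid("Body must be a JSON object.");
  }
  const message = (body as Record<string, unknown>).message;
  if (typeof message !== "string" || message.trim().length === 0) {
    return invalid("Field 'message' must be a non-empty string.");
  }
  // Truncate on the server so the client cannot control prompt size.
  return { ok: true, data: { message: message.trim().slice(0, MAX_PROMPT_CHARS) } };
}

async function handleChat(
  body: unknown,
  complete: (prompt: string) => Promise<string>, // wraps the real OpenAI call
): Promise<Result<{ reply: string }>> {
  const parsed = parseChatInput(body);
  if (parsed.ok === false) return parsed;
  try {
    const reply = await complete(parsed.data.message);
    return { ok: true, data: { reply } };
  } catch (err) {
    // Log details on the server; send only a generic message to the UI.
    console.error("chat completion failed", err);
    return {
      ok: false,
      status: 502,
      error: { code: "upstream_error", message: "The assistant is unavailable. Try again shortly." },
    };
  }
}
```

Injecting `complete` keeps the handler testable without network access and gives you one place to swap providers later.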
RAG with pgvector
Retrieval-augmented generation keeps answers grounded in your docs. Chunk markdown, embed with text-embedding-3-small, store vectors in Postgres with pgvector, and retrieve top-k chunks before each chat completion.
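The chunk-and-retrieve steps can be sketched in memory as follows. This is an illustration only: the embeddings are assumed to have been computed already, the function names are hypothetical, and in the setup described above the similarity ranking would run inside Postgres rather than in application code.

```typescript
// In-memory sketch of chunking and top-k retrieval.
type Chunk = { id: number; text: string; embedding: number[] };

// Split markdown on blank lines, then pack paragraphs into chunks of roughly
// maxChars. A single paragraph longer than maxChars stays whole.
function chunkMarkdown(markdown: string, maxChars: number = 800): string[] {
  const paragraphs = markdown
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  const chunks: string[] = [];
  let current = "";
  for (const p of paragraphs) {
    if (current.length > 0 && current.length + p.length + 2 > maxChars) {
      chunks.push(current);
      current = "";
    }
    current = current.length > 0 ? `${current}\n\n${p}` : p;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored chunks against the query embedding and keep the best k.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Number the retrieved chunks so the model can cite them in its answer.
function buildContext(chunks: Chunk[]): string {
  return chunks.map((c, i) => `[${i + 1}] ${c.text}`).join("\n\n");
}
```

With pgvector, the same ranking is usually expressed in SQL by ordering on the `<=>` cosine-distance operator with a `LIMIT`, so only the top-k rows leave the database.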
Cost and model routing
Route simple tasks to smaller models; reserve GPT-4-class models for complex reasoning. Cache answers to repeated FAQ questions so identical requests never reach the API twice. Log prompt and completion tokens per user for billing and debugging.
| Use case | Model tier | Cost note |
|---|---|---|
| Classification | Small / mini | Sub-cent per call |
| Support bot | Mid + RAG | Cache repeated questions |
| Agent planning | Large | Cap steps + timeout |
| Embeddings | Embedding model | Batch on ingest, not per chat |
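An env-driven router and an FAQ cache could look roughly like this. The task names, env variable names, and default model strings are placeholders invented for the sketch, not real model identifiers; the env object is passed in explicitly so the function stays easy to test.

```typescript
// Env-driven model router plus a small answer cache (illustrative sketch).
type Task = "classification" | "support" | "agent_planning";
type Env = Record<string, string | undefined>;

// Placeholder defaults; real model names would come from your env or config.
const DEFAULT_MODELS: Record<Task, string> = {
  classification: "small-model-placeholder",
  support: "mid-model-placeholder",
  agent_planning: "large-model-placeholder",
};

// Hypothetical env variable names, one per task.
const ENV_KEYS: Record<Task, string> = {
  classification: "MODEL_CLASSIFICATION",
  support: "MODEL_SUPPORT",
  agent_planning: "MODEL_AGENT_PLANNING",
};

// Prefer the env override; fall back to the default tier for the task.
function pickModel(task: Task, env: Env): string {
  const override = env[ENV_KEYS[task]];
  if (override !== undefined && override.trim().length > 0) return override.trim();
  return DEFAULT_MODELS[task];
}

// ASCII-only normalization (an assumption) so trivially different phrasings
// of the same question share one cache entry.
function normalizeQuestion(question: string): string {
  return question
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

class AnswerCache {
  private readonly entries = new Map<string, string>();

  get(question: string): string | undefined {
    return this.entries.get(normalizeQuestion(question));
  }

  set(question: string, answer: string): void {
    this.entries.set(normalizeQuestion(question), answer);
  }
}
```

In a Node server you would typically call `pickModel(task, process.env)` and check the cache before making a completion request.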
Deploy on DigitalOcean
Run the API and Postgres on one Droplet for MVPs; move to a managed database when backups matter. Set OPENAI_API_KEY in the Docker environment or in DigitalOcean App Platform secrets, never in git. See Full-stack deployment on DigitalOcean for the full pipeline.
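One way to catch a missing secret at deploy time is to validate the environment once at startup. A minimal sketch, assuming a hypothetical `loadConfig` helper and a `DATABASE_URL` variable that the article does not name:

```typescript
// Fail fast at boot if required secrets are missing (illustrative sketch).
type EnvVars = Record<string, string | undefined>;
type AppConfig = { openaiApiKey: string; databaseUrl: string };

// DATABASE_URL is an assumed name for the Postgres connection string.
const REQUIRED_VARS = ["OPENAI_API_KEY", "DATABASE_URL"];

function loadConfig(env: EnvVars): AppConfig {
  const missing = REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim().length === 0;
  });
  if (missing.length > 0) {
    // Report variable names only; never echo secret values into logs.
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    openaiApiKey: env["OPENAI_API_KEY"] as string,
    databaseUrl: env["DATABASE_URL"] as string,
  };
}
```

Calling `loadConfig(process.env)` before the server starts listening turns a misconfigured container into an immediate, visible crash instead of a failure on the first chat request.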
Common mistakes
- Streaming responses without abort controllers — runaway token burn
- No max_tokens limit on user-facing chat
- Pasting user content into system prompts without sanitization
- Skipping idempotency keys on webhook-triggered agent runs
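Two of the mistakes above, missing abort controllers and missing idempotency keys, can be guarded against with a few lines each. The sketch below is illustrative: `createStreamGuard` and `IdempotencyStore` are hypothetical names, and the in-memory store assumes a single process.

```typescript
// Guard 1: an AbortController that fires on timeout or on manual cancel.
type StreamGuard = { signal: AbortSignal; cancel: () => void };

function createStreamGuard(timeoutMs: number): StreamGuard {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  return {
    signal: controller.signal,
    // Call when the client disconnects or the stream finishes.
    cancel: () => {
      clearTimeout(timer);
      controller.abort();
    },
  };
}

// Guard 2: remember webhook idempotency keys so a retried delivery does not
// trigger a second agent run. In-memory only; single process assumed.
class IdempotencyStore {
  private readonly ttlMs: number;
  private readonly seen = new Map<string, number>(); // key -> expiry (ms)

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns true the first time a key is claimed within the TTL,
  // false for duplicates. `now` is injectable for testing.
  claim(key: string, now: number = Date.now()): boolean {
    const expiry = this.seen.get(key);
    if (expiry !== undefined && expiry > now) return false;
    this.seen.set(key, now + this.ttlMs);
    return true;
  }
}
```

The `signal` can be passed to `fetch` or any HTTP client that accepts an `AbortSignal`, so that a closed browser tab stops the upstream stream instead of letting it run to completion.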
FAQ
Can Cursor generate the whole integration?
Yes — routes, services, Zod schemas, and pgvector migrations. You still review auth boundaries, rate limits, and env handling before production.
OpenAI or Anthropic?
Abstract the provider behind one interface in your service layer. Pick OpenAI for v1 if you need embeddings and chat from one vendor; swap via env later.
How do I estimate monthly API cost?
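Log tokens for 50 test sessions and compute the average cost per session at your model's per-token prices. Multiply that by the sessions you expect per day (daily users times sessions per user), scale to 30 days, and add a 30% buffer. Set hard daily caps in code until you trust the math.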
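The estimate and the daily cap can be sketched as below. The prices passed in are hypothetical inputs, not real OpenAI rates, and `sessionsPerDay` is an assumption: it equals daily users only if each user runs one session per day.

```typescript
// Monthly cost estimate from logged sessions, plus a hard daily cap (sketch).
type SessionLog = { promptTokens: number; completionTokens: number };
type Pricing = { promptUsdPerMillion: number; completionUsdPerMillion: number };

function sessionCostUsd(log: SessionLog, pricing: Pricing): number {
  return (
    (log.promptTokens * pricing.promptUsdPerMillion) / 1_000_000 +
    (log.completionTokens * pricing.completionUsdPerMillion) / 1_000_000
  );
}

// Average cost per logged session x sessions per day x days x (1 + buffer).
function estimateMonthlyCostUsd(
  logs: SessionLog[],
  sessionsPerDay: number,
  pricing: Pricing,
  buffer: number = 0.3,
  days: number = 30,
): number {
  if (logs.length === 0) return 0;
  const total = logs.reduce((sum, log) => sum + sessionCostUsd(log, pricing), 0);
  const averagePerSession = total / logs.length;
  return averagePerSession * sessionsPerDay * days * (1 + buffer);
}

// Hard daily cap: refuse new spend once the day's budget is used up.
class DailyBudget {
  private readonly capUsd: number;
  private spentUsd = 0;
  private day = "";

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  // Returns false when the charge would exceed today's cap (UTC days).
  tryCharge(costUsd: number, now: Date = new Date()): boolean {
    const today = now.toISOString().slice(0, 10);
    if (today !== this.day) {
      this.day = today;
      this.spentUsd = 0;
    }
    if (this.spentUsd + costUsd > this.capUsd) return false;
    this.spentUsd += costUsd;
    return true;
  }
}
```

For example, with an average session of 1,000 prompt tokens and 500 completion tokens, hypothetical prices of $1 and $2 per million tokens, and 100 sessions per day, the estimate works out to about $7.80 per month including the buffer.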
Next steps
Scaffold one authenticated POST /api/chat route with rate limiting, deploy to a Droplet, and load-test with 100 parallel requests. Read Build AI business tools with Cursor for product patterns beyond the API layer.