OpenAI API Integration with Cursor: Real Apps, Not Demos

Wire chatbots, agents, and automations with auth, RAG, rate limits, and cost controls — production patterns for small teams.

Published May 21, 2026 · 11 min read

Production OpenAI apps route all model calls through your server — never the browser.

Introduction

Many tutorial repos call OpenAI straight from the client, which exposes the API key to anyone who opens the browser's network tab. Production apps do not. You need server-side routes, authentication, rate limits, logging, and cost caps. Cursor can scaffold these patterns in an afternoon if you specify the architecture upfront.

This guide walks through integrating the OpenAI API into real apps: chat, embeddings, RAG, and agents, deployed on DigitalOcean rather than left as localhost demos.

Demo vs production

The gap between a hackathon demo and a shippable OpenAI integration.

Concern	Demo shortcut	Production pattern
API key	NEXT_PUBLIC_* env	Server-only OPENAI_API_KEY
Auth	None	Session or JWT on /api routes
Abuse	Unlimited	Rate limit per user + IP
Cost	Ignored	Token logging + daily budget cap
Models	Hard-coded	Env-driven model router

Patterns to implement before you share a public URL.
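
As one illustration of the "rate limit per user + IP" row, here is a minimal in-memory sketch of a fixed-window limiter. The class and function names are hypothetical, and the fixed-window algorithm is just one simple choice. A multi-instance deployment would need a shared store instead of a process-local Map.

```typescript
// Fixed-window rate limiter (hypothetical helper, in-memory only).
type WindowState = { start: number; count: number };

class FixedWindowLimiter {
  private windows = new Map<string, WindowState>();
  private limit: number;    // max requests per window
  private windowMs: number; // window length in milliseconds

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (!current || now - current.start >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// A request must pass both the per-IP and the per-user limit.
// Both counters are updated even when one of them rejects.
function allowRequest(
  userLimiter: FixedWindowLimiter,
  ipLimiter: FixedWindowLimiter,
  userId: string,
  ip: string,
  now: number = Date.now(),
): boolean {
  const ipOk = ipLimiter.allow(`ip:${ip}`, now);
  const userOk = userLimiter.allow(`user:${userId}`, now);
  return ipOk && userOk;
}
```

In a route handler, a rejected request would typically get an HTTP 429 response before any model call is made.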

Request lifecycle

Every chat request passes through auth, rate limits, and optional RAG retrieval before it reaches OpenAI, with logging on the way out.

Put OpenAI calls in /services/openai or your API layer — not in React components. Validate input with Zod, truncate prompts server-side, and return structured errors the UI can display without exposing stack traces.
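
The validation, truncation, and error-shaping steps above can be sketched as follows. To stay dependency-free this uses a hand-rolled check as a stand-in for a Zod schema. The 4000-character limit, the error codes, and all function names are assumptions for illustration.

```typescript
// Hypothetical limit; tune it for your model and budget.
const MAX_MESSAGE_CHARS = 4000;

type ChatRequest = { message: string };
type ApiError = { error: { code: string; message: string } };
type Result<T> =
  | { ok: true; value: T }
  | { ok: false; status: number; body: ApiError };

// Builds a structured error the UI can display directly.
function fail(status: number, code: string, message: string): Result<never> {
  return { ok: false, status, body: { error: { code, message } } };
}

// Validates an untrusted request body (hand-rolled stand-in for a Zod schema).
function parseChatRequest(body: unknown): Result<ChatRequest> {
  if (typeof body !== "object" || body === null) {
    return fail(400, "invalid_body", "Request body must be a JSON object.");
  }
  const message = (body as Record<string, unknown>).message;
  if (typeof message !== "string" || message.trim().length === 0) {
    return fail(400, "invalid_message", "Field 'message' must be a non-empty string.");
  }
  // Truncate server-side so a client cannot send an arbitrarily long prompt.
  return { ok: true, value: { message: message.trim().slice(0, MAX_MESSAGE_CHARS) } };
}

// Maps an unexpected exception to a safe response; the stack trace stays in server logs.
function toSafeError(_err: unknown): { status: number; body: ApiError } {
  return {
    status: 500,
    body: { error: { code: "internal_error", message: "Something went wrong. Please try again." } },
  };
}
```

Character-based truncation is only a rough proxy for token count. A tokenizer-based limit would be more precise if you need one.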

RAG with pgvector

Retrieval-augmented generation keeps answers grounded in your docs. Chunk markdown, embed with text-embedding-3-small, store vectors in Postgres with pgvector, and retrieve top-k chunks before each chat completion.

Support bot RAG flow — ingest, embed, retrieve, generate.
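
To make the ingest and retrieve steps concrete, here is a small in-memory sketch of paragraph-based chunking and top-k retrieval by cosine similarity. It stands in for the pgvector query and makes no API calls: embeddings are passed in as plain number arrays, and the function names and chunking strategy are assumptions.

```typescript
type Chunk = { id: string; text: string; embedding: number[] };

// Splits markdown on blank lines, then packs paragraphs into chunks of up to maxChars.
// A single paragraph longer than maxChars is kept whole as one oversize chunk.
function chunkMarkdown(markdown: string, maxChars: number): string[] {
  const paragraphs = markdown
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  const chunks: string[] = [];
  let current = "";
  for (const p of paragraphs) {
    // +2 accounts for the blank-line separator added when joining.
    if (current.length > 0 && current.length + p.length + 2 > maxChars) {
      chunks.push(current);
      current = "";
    }
    current = current.length > 0 ? `${current}\n\n${p}` : p;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Cosine similarity between two vectors of equal dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k chunks most similar to the query embedding, best first.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

With pgvector, the same ranking would typically be done in SQL with an ORDER BY on a distance operator such as `<=>` (cosine distance) and a LIMIT, so the database does the sorting instead of application code.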

Cost and model routing

Route simple tasks to smaller models and reserve GPT-4-class models for complex reasoning. Cache answers to identical FAQ questions so repeats never reach the API. Log prompt and completion tokens per user for billing and debugging.
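
A minimal sketch of an env-driven model router plus an FAQ answer cache might look like this. The env variable names (MODEL_SMALL, MODEL_MID, MODEL_LARGE), the fallback model IDs, and the class name are placeholders chosen for illustration, not a recommendation.

```typescript
type Tier = "small" | "mid" | "large";
type Env = Record<string, string | undefined>;

// Resolves a model ID for a tier from env, with placeholder fallbacks.
// Pass process.env in a Node server; swap the fallbacks for models your account offers.
function resolveModel(tier: Tier, env: Env): string {
  const fallbacks: Record<Tier, string> = {
    small: "gpt-4o-mini",
    mid: "gpt-4o-mini",
    large: "gpt-4o",
  };
  const fromEnv: Record<Tier, string | undefined> = {
    small: env.MODEL_SMALL,
    mid: env.MODEL_MID,
    large: env.MODEL_LARGE,
  };
  return fromEnv[tier] ?? fallbacks[tier];
}

// In-memory cache for answers to repeated questions, with a time-to-live.
class FaqCache {
  private entries = new Map<string, { answer: string; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Case and whitespace differences should not cause a cache miss.
  static normalize(question: string): string {
    return question.trim().toLowerCase().replace(/\s+/g, " ");
  }

  get(question: string, now: number = Date.now()): string | undefined {
    const key = FaqCache.normalize(question);
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(key);
      return undefined;
    }
    return hit.answer;
  }

  set(question: string, answer: string, now: number = Date.now()): void {
    this.entries.set(FaqCache.normalize(question), { answer, expiresAt: now + this.ttlMs });
  }
}
```

Checking the cache before the rate limiter's model call means a repeated question costs no tokens at all.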

Use case	Model tier	Cost note
Classification	Small / mini	Sub-cent per call
Support bot	Mid + RAG	Cache repeated questions
Agent planning	Large	Cap steps + timeout
Embeddings	Embedding model	Batch on ingest, not per chat
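
The "cap steps + timeout" note for agents can be sketched as a bounded loop. This is a simplified synchronous version with a hypothetical step function; a real agent step would await a model call, and this deadline check only runs between steps, so it does not interrupt a call already in flight.

```typescript
type StepResult = { done: boolean; output: string };
type AgentOutcome =
  | { status: "completed"; steps: number; output: string }
  | { status: "step_limit" | "timeout"; steps: number };

// Runs a step function until it reports done, hits maxSteps, or passes the deadline.
// `now` is injectable so the timeout can be tested without real waiting.
function runAgent(
  step: (stepIndex: number) => StepResult,
  maxSteps: number,
  timeoutMs: number,
  now: () => number = Date.now,
): AgentOutcome {
  const deadline = now() + timeoutMs;
  for (let i = 0; i < maxSteps; i++) {
    // Check the clock before starting each step.
    if (now() >= deadline) return { status: "timeout", steps: i };
    const result = step(i);
    if (result.done) return { status: "completed", steps: i + 1, output: result.output };
  }
  return { status: "step_limit", steps: maxSteps };
}
```

Returning a status instead of throwing lets the caller log how many steps ran and decide whether to surface a partial result.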

Deploy on DigitalOcean

Run the API and Postgres on one Droplet for MVPs; move to a managed database once backups matter. Set OPENAI_API_KEY in the Docker environment or in DigitalOcean App Platform secrets, never in git. See Full-stack deployment on DigitalOcean for the full pipeline.
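
Whichever way the secret is injected, it helps to fail fast at startup when it is missing. The following is a small sketch of a config loader; DAILY_BUDGET_USD and its default of 5 are hypothetical, and the function takes an env-like object so it can be tested without touching the real environment.

```typescript
type AppConfig = { openaiApiKey: string; dailyBudgetUsd: number };

// Reads config from an env-like object (pass process.env in a Node server).
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const openaiApiKey = env.OPENAI_API_KEY;
  if (!openaiApiKey) throw new Error("OPENAI_API_KEY is not set");

  // Guard against the demo shortcut of copying the key into a client-visible variable.
  for (const name of Object.keys(env)) {
    if (name.startsWith("NEXT_PUBLIC_") && env[name] === openaiApiKey) {
      throw new Error(`${name} exposes the OpenAI key to the browser`);
    }
  }

  // DAILY_BUDGET_USD is a hypothetical setting with an arbitrary default.
  const dailyBudgetUsd = Number(env.DAILY_BUDGET_USD ?? "5");
  if (!Number.isFinite(dailyBudgetUsd) || dailyBudgetUsd <= 0) {
    throw new Error("DAILY_BUDGET_USD must be a positive number");
  }
  return { openaiApiKey, dailyBudgetUsd };
}
```

Calling this once at boot means a misconfigured container crashes immediately instead of failing on the first user request.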

Common mistakes

Streaming responses without abort controllers — runaway token burn
No max_tokens limit (max_completion_tokens on newer models) on user-facing chat
Pasting user content into system prompts without sanitization
Skipping idempotency keys on webhook-triggered agent runs
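
For the last item in the list, a rough sketch of idempotent handling for webhook-triggered runs is shown below. The store is an in-memory Map with a hypothetical class name; in production the key would more likely live in Postgres or Redis behind a unique constraint so that it survives restarts and works across instances.

```typescript
// In-memory idempotency store (hypothetical; not durable across restarts).
class IdempotencyStore<T> {
  private results = new Map<string, T>();

  // Runs `work` once per key. A repeated delivery of the same webhook
  // gets the stored result instead of triggering a second agent run.
  run(key: string, work: () => T): { result: T; replayed: boolean } {
    if (this.results.has(key)) {
      return { result: this.results.get(key) as T, replayed: true };
    }
    const result = work();
    this.results.set(key, result);
    return { result, replayed: false };
  }
}
```

The key is usually the event ID supplied by the webhook sender, so a retry of the same event maps to the same entry.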

FAQ

Can Cursor generate the whole integration?

Yes — routes, services, Zod schemas, and pgvector migrations. You still review auth boundaries, rate limits, and env handling before production.

OpenAI or Anthropic?

Abstract the provider behind one interface in your service layer. Pick OpenAI for v1 if you need embeddings and chat from one vendor; swap providers via an env setting later.

How do I estimate monthly API cost?

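Log tokens for 50 test sessions and compute the average cost per session at your model's per-token prices. Multiply by expected sessions per day (daily users times sessions per user) and by 30 days, then add a 30% buffer. Set hard daily caps in code until you trust the math.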
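
The estimate and the daily cap above can be sketched as a few lines of arithmetic. All type and function names are hypothetical, and the prices are inputs you supply from your provider's current price list; nothing here encodes real pricing.

```typescript
type SessionUsage = { promptTokens: number; completionTokens: number };
type Pricing = { inputPerMillionUsd: number; outputPerMillionUsd: number };

// Cost of one logged session at the given per-million-token prices.
function sessionCostUsd(usage: SessionUsage, pricing: Pricing): number {
  return (
    (usage.promptTokens * pricing.inputPerMillionUsd +
      usage.completionTokens * pricing.outputPerMillionUsd) / 1e6
  );
}

// Average session cost x sessions per day x days per month, plus a safety buffer.
function estimateMonthlyCostUsd(
  sessions: SessionUsage[],
  pricing: Pricing,
  sessionsPerDay: number,
  bufferRatio: number = 0.3,
  daysPerMonth: number = 30,
): number {
  if (sessions.length === 0) throw new Error("Need at least one logged session");
  const total = sessions.reduce((sum, s) => sum + sessionCostUsd(s, pricing), 0);
  const average = total / sessions.length;
  return average * sessionsPerDay * daysPerMonth * (1 + bufferRatio);
}

// Hard daily cap: record spend per calendar day and refuse calls once the cap is reached.
class DailyBudget {
  private spentByDay = new Map<string, number>();
  private capUsd: number;

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  record(day: string, costUsd: number): void {
    this.spentByDay.set(day, (this.spentByDay.get(day) ?? 0) + costUsd);
  }

  canSpend(day: string): boolean {
    return (this.spentByDay.get(day) ?? 0) < this.capUsd;
  }
}
```

As a worked example with made-up prices of 1 USD per million input tokens and 2 USD per million output tokens, a session of 2,000 prompt and 500 completion tokens costs 0.003 USD. At 1,000 sessions per day that is 90 USD per month, or 117 USD with the 30% buffer.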

Next steps

Scaffold one authenticated POST /api/chat route with rate limiting, deploy to a Droplet, and load-test with 100 parallel requests. Read Build AI business tools with Cursor for product patterns beyond the API layer.