DeepSeek V4 API: The Complete Developer Guide

June 25, 2026


DeepSeek V3 marks a notable shift in the open-weight LLM market, delivering competitive reasoning and code generation results at a fraction of GPT-4o’s per-token price (see the DeepSeek pricing page for current rates). For developers building JavaScript applications, the DeepSeek API offers a direct path to high-quality reasoning, code generation, and multi-turn conversation capabilities without the price tag associated with GPT-4o or Claude 3.5 Sonnet.

This guide walks through everything needed to go from zero to a working application: environment setup, core API features, streaming, structured output, a complete code review tool, and migration from other providers.

How to Integrate the DeepSeek V3 API in a JavaScript Project

Create a DeepSeek account at platform.deepseek.com and generate an API key.
Store the key in a .env file and add it to .gitignore.
Install the OpenAI SDK and dotenv: npm install openai@4 dotenv@16.
Configure the SDK with DeepSeek’s base URL and your API key.
Send a chat completion request using the deepseek-chat model.
Enable streaming for lower time-to-first-token in user-facing apps.
Implement retry logic with exponential backoff for 429 and 5xx errors.
Monitor token usage via the response usage object for cost control.


By the end, readers will have built a functional AI-powered code reviewer CLI tool and will understand the key surface area of the DeepSeek API through the OpenAI-compatible SDK.

You will need Node.js 18 or later, npm, a DeepSeek API key (covered below), and working familiarity with JavaScript and async/await patterns.

What’s New in DeepSeek V3

Architecture and Performance Improvements

DeepSeek V3 uses a Mixture-of-Experts (MoE) architecture with refined expert routing and a 64K-token context window. The model posts competitive scores on code generation benchmarks like HumanEval and MBPP, and performs well on mathematical reasoning tasks (GSM8K, MATH) and instruction following. Against frontier competitors, DeepSeek V3 positions itself in the same tier as GPT-4o and Gemini 2.5 Pro on these benchmarks while maintaining the open-weight availability that distinguishes the DeepSeek family.

API Changes and Compatibility

The API surface remains OpenAI-compatible, meaning any application built against the OpenAI REST API specification can switch to DeepSeek by changing the base URL and model identifier. The primary API model identifier is deepseek-chat, which currently resolves to the DeepSeek V3 model. DeepSeek’s pricing structure continues to undercut major competitors significantly on both input and output token costs, making it particularly attractive for high-volume applications. New parameters and structured output modes are available, detailed in the sections that follow.
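To make the compatibility claim concrete, here is a minimal sketch of what an OpenAI-style chat completion request looks like at the HTTP level. The helper name buildChatRequest is hypothetical, and the sketch assumes the endpoint path follows the OpenAI convention (/chat/completions under the base URL); check the DeepSeek API reference before relying on it. The helper only builds the request and does not send it.

```javascript
// Hypothetical helper: builds (but does not send) an OpenAI-style chat completion request.
// Assumption: the endpoint lives at <baseURL>/chat/completions, as in the OpenAI REST convention.
function buildChatRequest({ baseURL, apiKey, model, messages }) {
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseURL.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const request = buildChatRequest({
  baseURL: "https://api.deepseek.com",
  apiKey: "your_api_key_here", // placeholder; load the real key from the environment
  model: "deepseek-chat",
  messages: [{ role: "user", content: "Hello" }],
});

console.log(request.url);
// On Node.js 18+, the request could then be sent with the built-in fetch:
//   const res = await fetch(request.url, request.init);
```

Switching providers in this picture means changing only baseURL, apiKey, and model; the request body keeps the same shape.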


Getting Started: API Key and Environment Setup

Creating Your DeepSeek Account and API Key

Registration begins at platform.deepseek.com. After creating an account, navigate to the API Keys section in the dashboard and generate a new key. Copy the key immediately; it will not be shown again.

Store the key in an environment variable. Never hardcode API keys in source files, commit them to version control, or expose them in client-side code.

Project Initialization

Set up a new Node.js project and install the required dependencies:

mkdir deepseek-v3-demo
cd deepseek-v3-demo
npm init -y
npm install openai@4 dotenv@16

This guide was tested with openai@4.x and dotenv@16.x. Pin versions to avoid breaking changes.

All example files use the .mjs extension to enable ES module syntax (including top-level await). Alternatively, add "type": "module" to package.json to use import in .js files.

Create a .env file in the project root:

DEEPSEEK_API_KEY=your_api_key_here
DEEPSEEK_BASE_URL=https://api.deepseek.com

Add .env to .gitignore to prevent accidental exposure:

echo ".env" >> .gitignore

Your First DeepSeek V3 API Call

Configuring the OpenAI SDK for DeepSeek

The OpenAI Node.js SDK accepts a baseURL constructor parameter. Pointing it at https://api.deepseek.com routes all requests to DeepSeek’s servers while preserving the exact same method signatures, request formats, and response shapes. You do not need a wrapper library or adapter.

Create a file named basic.mjs:

import "dotenv/config";
import OpenAI from "openai";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

// Fail fast if the required environment variables are missing.
if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}
if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

const client = new OpenAI({
  baseURL,
  apiKey,
  timeout: 60_000, // 60-second request timeout
  maxRetries: 0, // disable the SDK's built-in retries
});

const response = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [
    { role: "system", content: "You are a helpful programming assistant." },
    { role: "user", content: "Explain the difference between map and flatMap in JavaScript." },
  ],
});

// Guard against an empty or filtered response before reading the content.
const choice = response.choices?.[0];
if (!choice || choice.finish_reason === "content_filter") {
  console.error(
    "No valid completion returned. finish_reason:",
    choice?.finish_reason ?? "no choices"
  );
  process.exit(1);
}
const content = choice.message?.content;
if (typeof content !== "string") {
  console.error("Unexpected response shape: missing message content.");
  process.exit(1);
}

console.log(content);
console.log("Token usage:", response.usage);

Run with node basic.mjs.

Understanding the Response Object

The response follows the OpenAI chat completion schema. response.choices is an array where each entry contains a message object with role and content fields. The finish_reason field indicates why generation stopped: "stop" for natural completion, "length" if the response hit the max_tokens cap, "tool_calls" if the model invoked a function, or "content_filter" if content filtering blocked the response. The usage object reports prompt_tokens, completion_tokens, and total_tokens, which map directly to billing. Monitoring these values is essential for cost tracking in production.
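As an illustration of how these fields might be consumed, the sketch below condenses a response into a small summary object. summarizeCompletion is a hypothetical helper, not part of the SDK, and the sample object is hand-written to match the schema described above rather than captured from a real API call.

```javascript
// Human-readable explanations for the finish_reason values described above.
const FINISH_REASONS = new Map([
  ["stop", "completed normally"],
  ["length", "truncated at the max_tokens cap"],
  ["tool_calls", "model requested a tool call"],
  ["content_filter", "blocked by content filtering"],
]);

// Hypothetical helper: condenses a chat completion response into a summary.
function summarizeCompletion(response) {
  const choice = response?.choices?.[0];
  const finishReason = choice?.finish_reason ?? "unknown";
  return {
    finishReason,
    explanation: FINISH_REASONS.get(finishReason) ?? "unrecognized finish_reason",
    truncated: finishReason === "length",
    promptTokens: response?.usage?.prompt_tokens ?? 0,
    completionTokens: response?.usage?.completion_tokens ?? 0,
    totalTokens: response?.usage?.total_tokens ?? 0,
  };
}

// Hand-written sample shaped like a chat completion response (not real API output).
const sampleResponse = {
  choices: [{ message: { role: "assistant", content: "Hi" }, finish_reason: "length" }],
  usage: { prompt_tokens: 12, completion_tokens: 5, total_tokens: 17 },
};

console.log(summarizeCompletion(sampleResponse));
```

A truncated flag like this is a convenient trigger for raising max_tokens or asking the model to continue.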

Core API Options and Parameters

System Prompts and Multi-Flip Conversations

The messages array supports three roles: system (sets behavior and constraints), user (end-user input), and assistant (model responses from earlier turns). Multi-turn conversations require the developer to maintain and append to this array across interactions.

Create multiturn.mjs:

import "dotenv/config";
import OpenAI from "openai";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}
if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

const client = new OpenAI({
  baseURL,
  apiKey,
  timeout: 60_000,
  maxRetries: 0,
});

const conversationHistory = [
  { role: "system", content: "You are a senior JavaScript developer. Be concise and precise." },
];

const MAX_HISTORY_TURNS = 10; // keep at most 10 user/assistant exchanges

function appendAndTrim(history, role, content) {
  history.push({ role, content });

  // Preserve the system prompt (if present) and keep only the most recent turns.
  const systemPrompt = history[0].role === "system" ? [history[0]] : [];
  const turns = history.slice(systemPrompt.length);
  const maxTurnMessages = MAX_HISTORY_TURNS * 2; // one user and one assistant message per turn
  const trimmed = turns.slice(Math.max(0, turns.length - maxTurnMessages));
  history.length = 0;
  history.push(...systemPrompt, ...trimmed);
}

async function chat(userMessage) {
  appendAndTrim(conversationHistory, "user", userMessage);

  const response = await client.chat.completions.create({
    model: "deepseek-chat",
    messages: conversationHistory,
  });

  const choice = response.choices?.[0];
  if (!choice || choice.finish_reason === "content_filter") {
    throw new Error(
      `No valid completion returned. finish_reason: ${choice?.finish_reason ?? "no choices"}`
    );
  }
  const assistantMessage = choice.message?.content;
  if (typeof assistantMessage !== "string") {
    throw new Error("Unexpected response shape: missing message content.");
  }
  appendAndTrim(conversationHistory, "assistant", assistantMessage);

  return assistantMessage;
}

console.log(await chat("What is a closure in JavaScript?"));
console.log(await chat("Can you give me a practical example of one?"));
console.log(await chat("How does that relate to the module pattern?"));

Each call sends the accumulated history (trimmed to a sliding window), allowing the model to reference earlier turns without unbounded memory growth.

Key Parameters for Controlling Output

The API accepts several parameters for shaping generation behavior. temperature (0 to 2) controls randomness; lower values produce more deterministic output. Check the DeepSeek API docs for the current default. Use top_p (0 to 1) for nucleus sampling, and max_tokens to cap response length. frequency_penalty and presence_penalty (both -2 to 2 per the OpenAI-compatible spec; verify these parameters are honored by the DeepSeek endpoint, as behavior may differ from OpenAI) discourage repetition and encourage topic diversity respectively. If you need to halt generation at specific delimiter strings, pass them via the stop parameter as an array.
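One way to keep these parameters within the ranges quoted above is to validate them before building a request. The following is a minimal sketch; buildSamplingOptions is a hypothetical helper, and the ranges it enforces are the ones stated in this guide, not values confirmed against the live endpoint.

```javascript
// Hypothetical helper: validates sampling parameters against the ranges quoted above
// and returns an object that can be spread into a chat completion request.
function buildSamplingOptions({ temperature, topP, maxTokens, stop } = {}) {
  const inRange = (value, min, max, name) => {
    if (typeof value !== "number" || Number.isNaN(value) || value < min || value > max) {
      throw new RangeError(`${name} must be a number between ${min} and ${max}`);
    }
    return value;
  };

  const options = {};
  if (temperature !== undefined) options.temperature = inRange(temperature, 0, 2, "temperature");
  if (topP !== undefined) options.top_p = inRange(topP, 0, 1, "top_p");
  if (maxTokens !== undefined) {
    if (!Number.isInteger(maxTokens) || maxTokens < 1) {
      throw new RangeError("max_tokens must be a positive integer");
    }
    options.max_tokens = maxTokens;
  }
  // Accept a single string for convenience, but always send an array.
  if (stop !== undefined) options.stop = Array.isArray(stop) ? stop : [stop];
  return options;
}

const samplingOptions = buildSamplingOptions({
  temperature: 0.2,
  topP: 0.9,
  maxTokens: 256,
  stop: "END",
});
console.log(samplingOptions);
// Usage in a request: { model: "deepseek-chat", messages, ...samplingOptions }
```

Rejecting out-of-range values locally gives a clearer error than waiting for the API to respond with a 400.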

For structured output, set response_format: { type: "json_object" } and instruct the model in the system or user prompt to produce JSON. Where the endpoint supports it, this mode increases the likelihood of valid JSON output. Verify support in the DeepSeek API docs and always wrap JSON.parse() in a try/catch block.

Create jsonmode.mjs:

import "dotenv/config";
import OpenAI from "openai";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}
if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

const client = new OpenAI({
  baseURL,
  apiKey,
  timeout: 60_000,
  maxRetries: 0,
});

const response = await client.chat.completions.create({
  model: "deepseek-chat",
  response_format: { type: "json_object" },
  messages: [
    {
      role: "system",
      content: "You are an API that returns JSON. Always respond with a valid JSON object.",
    },
    {
      role: "user",
      content: "List three common JavaScript array methods with their descriptions and return types.",
    },
  ],
});

const choice = response.choices?.[0];
if (!choice || choice.finish_reason === "content_filter") {
  console.error(
    "No valid completion returned. finish_reason:",
    choice?.finish_reason ?? "no choices"
  );
  process.exit(1);
}
const rawContent = choice.message?.content;
if (typeof rawContent !== "string") {
  console.error("Unexpected response shape: missing message content.");
  process.exit(1);
}

// JSON mode makes valid JSON more likely, not guaranteed, so parse defensively.
let parsed;
try {
  parsed = JSON.parse(rawContent);
} catch (e) {
  console.error("Failed to parse model response as JSON:", e.message);
  console.error("Raw response:", rawContent);
  process.exit(1);
}

const isValidObject = parsed !== null && typeof parsed === "object" && !Array.isArray(parsed);
console.log(isValidObject ? "Valid JSON object received" : "Unexpected format");
console.log(JSON.stringify(parsed, null, 2));

Streaming Responses

Streaming reduces perceived latency by delivering tokens as they’re generated, which is essential for user-facing applications where time-to-first-token (TTFT, the elapsed time between sending a request and receiving the first token of the response) matters more than total generation time.

Create streaming.mjs:

import "dotenv/config";
import OpenAI from "openai";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}
if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

const client = new OpenAI({
  baseURL,
  apiKey,
  timeout: 60_000,
  maxRetries: 0,
});

const stream = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [
    { role: "user", content: "Write a brief explanation of event-driven architecture." },
  ],
  stream: true,
});

let fullResponse = "";

// Print each token as it arrives while accumulating the full text.
for await (const chunk of stream) {
  const content = chunk.choices?.[0]?.delta?.content ?? "";
  process.stdout.write(content);
  fullResponse += content;
}

console.log("\n\nFull response length:", fullResponse.length, "characters");

Each chunk contains a delta object with incremental content. The loop assembles the complete response while simultaneously writing to stdout.
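To show the chunk shape without calling the API, the sketch below runs the same assembly logic over a hand-written array of chunks. The mock chunks follow the OpenAI streaming schema, in which finish_reason stays null until the final chunk; that DeepSeek mirrors this detail exactly is an assumption worth checking. assembleStream is a hypothetical helper.

```javascript
// Hypothetical helper: concatenates delta content from stream chunks and records
// the finish_reason carried by the final chunk.
function assembleStream(chunks) {
  let text = "";
  let finishReason = null;
  for (const chunk of chunks) {
    const choice = chunk?.choices?.[0];
    text += choice?.delta?.content ?? "";
    if (choice?.finish_reason) finishReason = choice.finish_reason;
  }
  return { text, finishReason };
}

// Hand-written chunks in the OpenAI streaming shape (not real API output).
const mockChunks = [
  { choices: [{ delta: { role: "assistant", content: "" }, finish_reason: null }] },
  { choices: [{ delta: { content: "Event-driven " }, finish_reason: null }] },
  { choices: [{ delta: { content: "systems react to events." }, finish_reason: null }] },
  { choices: [{ delta: {}, finish_reason: "stop" }] },
];

console.log(assembleStream(mockChunks));
```

Tracking finishReason while streaming makes it possible to detect a "length" truncation even though no single response object is returned.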

Building a Complete Application: AI-Powered Code Reviewer

Application Architecture

This CLI tool reads a JavaScript file from disk, sends its contents to DeepSeek with a detailed code review system prompt, and requests structured JSON feedback. The application exercises DeepSeek V3’s code understanding, reasoning, and structured output capabilities in a single cohesive workflow.


Create review.mjs:

import "dotenv/config";
import OpenAI from "openai";
import { readFile } from "fs/promises";
import { resolve, extname, sep } from "path";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}
if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

const client = new OpenAI({
  baseURL,
  apiKey,
  timeout: 60_000,
  maxRetries: 0,
});

const filePath = process.argv[2];
if (!filePath) {
  console.error("Usage: node review.mjs <path-to-js-file>");
  process.exit(1);
}

// Only allow files inside the current working directory.
const resolvedPath = resolve(filePath);
const allowedBase = resolve(process.cwd());
if (!resolvedPath.startsWith(allowedBase + sep) && resolvedPath !== allowedBase) {
  console.error("Error: File must be within the current working directory.");
  process.exit(1);
}

const allowedExtensions = [".js", ".mjs", ".cjs", ".ts"];
if (!allowedExtensions.includes(extname(resolvedPath))) {
  console.error("Error: Only JavaScript/TypeScript files are supported.");
  process.exit(1);
}

// Reject very large files before spending tokens on them.
const code = await readFile(resolvedPath, "utf-8");
if (Buffer.byteLength(code, "utf-8") > 100_000) {
  console.error("File too large (>100KB). Truncate or split before reviewing.");
  process.exit(1);
}

const systemPrompt = `You are a senior code reviewer. Analyze the provided JavaScript code and return a JSON object with the following structure:
{
  "summary": "Brief overall assessment",
  "issues": [
    { "severity": "low" | "medium" | "high", "line": <line number or null>, "description": "What is wrong", "suggestion": "How to fix it" }
  ],
  "strengths": ["List of things done well"],
  "score": <1-10 overall quality score>
}
Only return valid JSON. No markdown fences.`;

const response = await client.chat.completions.create({
  model: "deepseek-chat",
  response_format: { type: "json_object" },
  temperature: 0.3,
  max_tokens: 2048,
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: `Review this code:\n\n${code}` },
  ],
});

const choice = response.choices?.[0];
if (!choice || choice.finish_reason === "content_filter") {
  console.error(
    "No valid completion returned. finish_reason:",
    choice?.finish_reason ?? "no choices"
  );
  process.exit(1);
}
const responseContent = choice.message?.content;
if (typeof responseContent !== "string") {
  console.error("Unexpected response shape: missing message content.");
  process.exit(1);
}

let review;
try {
  review = JSON.parse(responseContent);
} catch (e) {
  console.error("Failed to parse model response as JSON:", e.message);
  console.error("Raw response:", responseContent);
  process.exit(1);
}

console.log(`\n📋 Code Review: ${filePath}`);
console.log(`Score: ${review.score}/10`);
console.log(`Summary: ${review.summary}\n`);

if (review.issues?.length) {
  console.log("Issues:");
  review.issues.forEach((issue, i) => {
    // The model may omit fields, so fall back to placeholders.
    const severityRaw = issue.severity;
    const severity =
      typeof severityRaw === "string" ? severityRaw.toUpperCase() : "UNKNOWN";
    const line = issue.line != null ? ` (line ~${issue.line})` : "";
    const description =
      typeof issue.description === "string" ? issue.description : "[no description]";
    const suggestion =
      typeof issue.suggestion === "string" ? issue.suggestion : "[no suggestion]";
    console.log(`  ${i + 1}. [${severity}]${line} ${description}`);
    console.log(`     Fix: ${suggestion}`);
  });
}

if (review.strengths?.length) {
  console.log("\nStrengths:");
  review.strengths.forEach((s) => console.log(`  ✓ ${s}`));
}

console.log(`\nTokens used: ${response.usage?.total_tokens ?? "unknown"}`);

Run it against any JavaScript file in the project directory, for example node review.mjs ./review.mjs.
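Because the model’s JSON is not guaranteed to match the requested structure, a small normalization step before printing can make the tool more robust. The sketch below is one possible approach; normalizeReview is a hypothetical helper, and the field names are the ones the reviewer script reads (summary, score, issues, strengths).

```javascript
// Hypothetical helper: coerces the model's parsed JSON into a predictable shape
// so downstream printing code never has to deal with missing or mistyped fields.
function normalizeReview(raw) {
  const source = raw !== null && typeof raw === "object" && !Array.isArray(raw) ? raw : {};
  // Accept numeric strings such as "7", but treat anything else as missing.
  const score =
    typeof source.score === "number" ? source.score : Number.parseFloat(source.score);
  return {
    summary: typeof source.summary === "string" ? source.summary : "[no summary]",
    // Clamp to the 1-10 scale requested in the system prompt.
    score: Number.isFinite(score) ? Math.min(10, Math.max(1, Math.round(score))) : null,
    issues: Array.isArray(source.issues)
      ? source.issues.filter((issue) => issue !== null && typeof issue === "object")
      : [],
    strengths: Array.isArray(source.strengths)
      ? source.strengths.filter((s) => typeof s === "string")
      : [],
  };
}

// Example with deliberately messy input.
const messyReview = normalizeReview({
  score: "12",
  issues: [null, { severity: "low", description: "Unused variable" }],
  strengths: ["Clear naming", 3],
});
console.log(messyReview);
```

With this in place, the printing code could read from the normalized object instead of the raw parse result.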

Enhancing with Error Handling and Retries

Production API calls must account for rate limits (HTTP 429) and transient server errors (5xx). DeepSeek returns standard rate limit headers. A retry wrapper with exponential backoff handles both cases gracefully.

Create retry.mjs:

export async function withRetry(fn, { maxRetries = 3, baseDelay = 1000 } = {}) {
  if (maxRetries < 1) throw new RangeError("maxRetries must be >= 1");

  let lastError;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      const status = error?.status ?? error?.response?.status;

      // Retry only on rate limits (429) and server errors (5xx).
      // Errors without a numeric HTTP status (network, DNS) are not retried.
      const isRetryable =
        typeof status === "number" &&
        (status === 429 || (status >= 500 && status < 600));

      if (!isRetryable || attempt === maxRetries - 1) {
        throw error;
      }

      // Exponential backoff plus up to 500 ms of random jitter.
      const delay = baseDelay * Math.pow(2, attempt) + Math.random() * 500;
      console.warn(
        `Retryable error (HTTP ${status}): ${error.message}. ` +
          `Attempt ${attempt + 1}/${maxRetries}. Waiting ${Math.round(delay)}ms...`
      );
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  // Not reached in practice; kept as a safeguard.
  throw lastError;
}

Example usage in another file:

import { withRetry } from "./retry.mjs";

// Assumes `client` is configured as in the earlier examples.
const response = await withRetry(() => client.chat.completions.create({ /* request parameters */ }));

The function applies exponential backoff with jitter, retries only on 429 and 5xx HTTP status codes, and throws immediately on non-retryable errors (including non-HTTP errors like network failures or DNS resolution errors).
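To make the backoff schedule easier to reason about, the delay calculation and the retry decision can be pulled out into pure functions. This is a sketch under the same assumptions as the retry wrapper (base delay of 1000 ms, up to 500 ms of jitter); backoffDelay and isRetryableStatus are hypothetical names, and the random source is injectable so the schedule can be inspected deterministically.

```javascript
// Delay before the next retry: baseDelay * 2^attempt, plus random jitter.
// `random` is injectable so the schedule can be checked without randomness.
function backoffDelay(attempt, { baseDelay = 1000, maxJitter = 500, random = Math.random } = {}) {
  return baseDelay * 2 ** attempt + random() * maxJitter;
}

// Same rule as the retry wrapper: only 429 and 5xx responses are retryable.
function isRetryableStatus(status) {
  return typeof status === "number" && (status === 429 || (status >= 500 && status < 600));
}

// With jitter disabled, the delays double on each attempt.
const noJitter = { random: () => 0 };
console.log([0, 1, 2].map((attempt) => backoffDelay(attempt, noJitter)));
// prints [ 1000, 2000, 4000 ]

console.log(isRetryableStatus(429), isRetryableStatus(404), isRetryableStatus(undefined));
// prints true false false
```

The jitter term spreads out retries from many clients that were rate-limited at the same moment, which reduces the chance of them colliding again.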

Migration Guide: Switching from OpenAI or Other Providers

The Two-Line Migration

For applications already using the OpenAI Node.js SDK, switching to DeepSeek requires changing at minimum two values: the base URL and the model identifier. Review the behavioral differences section below before assuming a drop-in swap.


// Before: OpenAI
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// After: DeepSeek (also change the model identifier to "deepseek-chat" in each request)
const client = new OpenAI({
  baseURL: "https://api.deepseek.com",
  apiKey: process.env.DEEPSEEK_API_KEY,
});

The rest of the application code, including message formatting, parameter passing, and response parsing, remains identical.
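For teams that want to switch providers without editing call sites, the two values can live in a small lookup table. The sketch below is one possible arrangement: PROVIDERS and resolveProvider are hypothetical names, and "gpt-4o" is used only as an example OpenAI model identifier.

```javascript
// Hypothetical provider table: the only per-provider differences are the
// API key variable, the model identifier, and (for DeepSeek) the base URL.
const PROVIDERS = {
  openai: { apiKeyEnv: "OPENAI_API_KEY", model: "gpt-4o" },
  deepseek: {
    apiKeyEnv: "DEEPSEEK_API_KEY",
    model: "deepseek-chat",
    baseURL: "https://api.deepseek.com",
  },
};

// Returns the constructor options for `new OpenAI(...)` and the model to request.
function resolveProvider(name, env = process.env) {
  if (!Object.hasOwn(PROVIDERS, name)) {
    throw new Error(`Unknown provider: ${name}`);
  }
  const { apiKeyEnv, model, baseURL } = PROVIDERS[name];
  const apiKey = env[apiKeyEnv];
  if (!apiKey) {
    throw new Error(`${apiKeyEnv} is not set`);
  }
  // Omit baseURL for OpenAI so the SDK falls back to its default endpoint.
  const clientOptions = baseURL ? { apiKey, baseURL } : { apiKey };
  return { clientOptions, model };
}

// Example with an explicit env object (placeholder key).
const deepseekConfig = resolveProvider("deepseek", { DEEPSEEK_API_KEY: "your_api_key_here" });
console.log(deepseekConfig.clientOptions.baseURL, deepseekConfig.model);
// In application code: const client = new OpenAI(deepseekConfig.clientOptions);
```

Keeping the provider choice in one place also makes it straightforward to run the same prompt suite against both providers during migration testing.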

Behavioral Differences to Watch For

Despite API compatibility, the models differ in subtle ways. Default temperature behavior, token limit defaults, and system prompt sensitivity vary between providers. DeepSeek V3 may produce noticeably different output for the same prompt; for example, it tends to generate shorter, more direct answers to open-ended questions than GPT-4o, and it can interpret ambiguous instructions more literally. DeepSeek V3 does support function calling (tool use) using the same OpenAI tool schema; however, DeepSeek has not confirmed parity for all OpenAI features (e.g., fine-tuning, certain response format modes), so verify the latest supported capabilities in the DeepSeek API documentation.

The recommended testing strategy: run existing prompt suites through both providers, compare output quality and structure, and adjust system prompts where DeepSeek V3’s behavior diverges.

Best Practices and Optimization

Prompt Engineering Tips for DeepSeek V3

DeepSeek V3 responds well to structured instructions with explicit output format specifications. Chain-of-thought prompting (asking the model to reason step by step before answering) improves accuracy on math and multi-step reasoning tasks in DeepSeek’s published evaluations. Vague prompts and very long context windows without clear focus tend to degrade output quality; benchmark your specific use case, but treat quality degradation past roughly 40K tokens of context as a reasonable default assumption to validate.
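As a rough illustration of combining explicit format instructions with a step-by-step request, the sketch below assembles a messages array from a task description. buildReasoningMessages is a hypothetical helper and the instruction wording is only an example; it should be tuned against real outputs.

```javascript
// Hypothetical helper: builds a messages array with an explicit output format
// and an optional request to reason step by step before answering.
function buildReasoningMessages(task, { outputFormat, stepByStep = true } = {}) {
  const instructions = ["You are a careful assistant."];
  if (stepByStep) {
    instructions.push("Reason through the problem step by step before giving the final answer.");
  }
  if (outputFormat) {
    instructions.push(`Present the final answer in this format: ${outputFormat}`);
  }
  return [
    { role: "system", content: instructions.join(" ") },
    { role: "user", content: task },
  ];
}

const reasoningMessages = buildReasoningMessages(
  "A train travels 180 km in 2.5 hours. What is its average speed?",
  { outputFormat: "Answer: <number> km/h" }
);
console.log(reasoningMessages);
```

The resulting array can be passed as the messages field of a chat completion request.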

Reducing Token Costs

Estimate token counts before sending requests using tokenizer libraries to avoid unexpected costs. Note that OpenAI’s tiktoken library uses a different tokenizer than DeepSeek V3, so token counts will be approximate; check DeepSeek’s documentation for a compatible tokenizer if precise estimates are needed. Cache responses for repeated or identical queries. Set max_tokens to the minimum necessary for each use case rather than relying on defaults. For simpler tasks like classification or short-form extraction, consider whether a lighter, cheaper model suffices before routing everything through DeepSeek V3.
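Two of these ideas, cost estimation and response caching, can be sketched in a few lines. The prices below are placeholders chosen for easy arithmetic, not DeepSeek’s actual rates, and estimateCostUSD and createResponseCache are hypothetical helpers. The cache key covers only the model and messages, which is a simplification: requests that differ in other parameters would share an entry.

```javascript
// PLACEHOLDER prices in USD per million tokens. These are NOT DeepSeek's real rates;
// substitute the current values from the pricing page.
const PLACEHOLDER_PRICING = { inputPerMillion: 1.0, outputPerMillion: 2.0 };

// Estimates the cost of one call from the usage object returned by the API.
function estimateCostUSD(usage, pricing = PLACEHOLDER_PRICING) {
  const inputCost = ((usage?.prompt_tokens ?? 0) / 1_000_000) * pricing.inputPerMillion;
  const outputCost = ((usage?.completion_tokens ?? 0) / 1_000_000) * pricing.outputPerMillion;
  return inputCost + outputCost;
}

// In-memory cache for identical requests, keyed on model and messages only.
function createResponseCache() {
  const store = new Map();
  const keyOf = (request) => JSON.stringify([request.model, request.messages]);
  return {
    get: (request) => store.get(keyOf(request)),
    set: (request, response) => {
      store.set(keyOf(request), response);
    },
    size: () => store.size,
  };
}

const usageExample = { prompt_tokens: 500_000, completion_tokens: 250_000 };
console.log(estimateCostUSD(usageExample)); // 1 (with the placeholder prices)

const cache = createResponseCache();
const cachedRequest = { model: "deepseek-chat", messages: [{ role: "user", content: "Hi" }] };
cache.set(cachedRequest, { text: "Hello!" });
console.log(cache.get(cachedRequest)); // { text: 'Hello!' }
```

Caching is most useful at low temperature, where repeated requests are expected to produce similar answers anyway.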


Implementation Checklist

Setup

DeepSeek account created and API key generated
API key stored in environment variable (never hardcoded)
OpenAI SDK installed and configured with DeepSeek base URL
Basic chat completion working

Testing

Error handling and retry logic implemented
Streaming implemented for user-facing features
JSON mode tested for structured outputs
Rate limit handling verified
Existing prompts tested and adapted for DeepSeek V3 behavioral differences

Production

Token usage monitoring in place
Cost estimates validated against pricing tiers
Production logging and monitoring configured

What Comes Next

Start with the code samples above, then explore the official DeepSeek API documentation for the latest on function calling support, fine-tuning capabilities, and rate limit specifications. DeepSeek V3 pairs low per-token cost with competitive benchmark results through a familiar API surface, so the migration cost is low for teams already on the OpenAI SDK.

The open-weight nature of DeepSeek models enables self-hosting, subject to the terms of the DeepSeek License Agreement, which includes restrictions on commercial use. Review the license before self-hosting for production or commercial purposes.
