LLM with Anthropic
QStash integrates smoothly with Anthropic's API, allowing you to send LLM requests and leverage QStash features like retries, callbacks, and batching. This is especially useful in serverless environments, where LLM response times vary and traditional timeouts may be limiting. QStash provides an HTTP timeout of up to 2 hours, which is ideal for most LLM use cases.
Example: Publishing and Enqueueing Requests
Specify the `api` as `llm` with the provider set to `anthropic()` when publishing requests. Use the `Upstash-Callback` header to handle responses asynchronously, as streaming completions aren't supported for this integration.
Publishing a Request
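A minimal sketch using the TypeScript SDK (`@upstash/qstash`); the token placeholders, model name, and callback URL are assumptions to adapt to your setup:

```typescript
import { anthropic, Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

// Publish a chat completion request. QStash forwards the body to
// Anthropic and POSTs the completion to the callback URL when ready.
await client.publishJSON({
  api: {
    name: "llm",
    provider: anthropic({ token: "<ANTHROPIC_TOKEN>" }),
  },
  body: {
    model: "claude-3-5-sonnet-20241022", // assumed model name
    max_tokens: 1024,
    messages: [{ role: "user", content: "Summarize recent tech trends." }],
  },
  callback: "https://example.com/anthropic-callback", // assumed endpoint
});
```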
Enqueueing a Chat Completion Request
Use `enqueueJSON` with Anthropic as the provider to enqueue requests for asynchronous processing.
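A sketch of the same request routed through a queue; the queue name here is a hypothetical example:

```typescript
import { anthropic, Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

// Enqueue onto a named queue so requests are processed in order.
await client
  .queue({ queueName: "anthropic-queue" }) // assumed queue name
  .enqueueJSON({
    api: {
      name: "llm",
      provider: anthropic({ token: "<ANTHROPIC_TOKEN>" }),
    },
    body: {
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      messages: [{ role: "user", content: "Write a haiku about queues." }],
    },
    callback: "https://example.com/anthropic-callback",
  });
```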
Sending Chat Completion Requests in Batches
Use `batchJSON` to send multiple requests at once. Each request in the batch specifies the same Anthropic provider and includes a callback URL.
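A sketch of a batch of completion requests, assuming the same placeholder tokens and callback URL as above:

```typescript
import { anthropic, Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

// Build one batch entry per prompt; every entry carries the same
// provider configuration and its own callback URL.
const prompts = ["Describe quantum computing.", "Explain the CAP theorem."];

await client.batchJSON(
  prompts.map((content) => ({
    api: {
      name: "llm",
      provider: anthropic({ token: "<ANTHROPIC_TOKEN>" }),
    },
    body: {
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      messages: [{ role: "user", content }],
    },
    callback: "https://example.com/anthropic-callback",
  })),
);
```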
Analytics with Helicone
To monitor usage, include Helicone analytics by passing your Helicone API key under `analytics`:
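A sketch of a published request with Helicone analytics enabled; the API key placeholder is an assumption:

```typescript
import { anthropic, Client } from "@upstash/qstash";

const client = new Client({ token: "<QSTASH_TOKEN>" });

await client.publishJSON({
  api: {
    name: "llm",
    provider: anthropic({ token: "<ANTHROPIC_TOKEN>" }),
    // Route the request through Helicone so usage is recorded there.
    analytics: { name: "helicone", token: "<HELICONE_API_KEY>" },
  },
  body: {
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello!" }],
  },
  callback: "https://example.com/anthropic-callback",
});
```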
With this setup, Anthropic can be used seamlessly in any LLM workflow in QStash.