Getting started
@axflow/models is an SDK for building natural language powered applications. It includes basic functionality for invoking models as well as richer functionality like augmented response streaming, hooks for building client applications, and more.
npm i @axflow/models
Basic example
import { OpenAIChat } from '@axflow/models/openai/chat';
import { StreamingJsonResponse } from '@axflow/models/shared';

// @axflow/models is available in edge environments
export const runtime = 'edge';

// POST /api/chat
export async function POST(request: Request) {
  const { query } = await request.json();

  // Create a stream of tokens from OpenAI
  const stream = await OpenAIChat.streamTokens(
    {
      model: 'gpt-4',
      messages: [{ role: 'user', content: query }],
    },
    {
      apiKey: process.env.OPENAI_API_KEY,
    },
  );

  // Stream the tokens from OpenAI back to the client
  // as newline-delimited JSON, which is easy to extend,
  // parse, and reason about.
  return new StreamingJsonResponse(stream);
}
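On the client, the newline-delimited JSON response can be consumed with a plain fetch and a stream reader. The following is a minimal sketch assuming the route above is mounted at /api/chat; the exact shape of each parsed line is determined by StreamingJsonResponse, so it is simply logged here.

// Hypothetical client-side consumer for the /api/chat route above.
// Each line of the response body is one JSON value (newline-delimited JSON).
async function readChat(query: string) {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ query }),
  });

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // Split on newlines, keeping any trailing partial line in the buffer
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? '';

    for (const line of lines) {
      if (line.trim() === '') continue;
      console.log(JSON.parse(line)); // Shape is defined by StreamingJsonResponse
    }
  }
}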
Why this package?
Through building LLM-powered applications, we have found ourselves repeatedly writing the same streaming utilities, client hooks, model SDK wrappers and connectors, and so on. We occasionally had to bypass lagging JS client SDKs because they did not yet implement functionality supported by the REST API (like the openai package before it surfaced streaming support). Some libraries have Node.js dependencies and are therefore not usable in browsers, VSCode extensions, or edge environments.
All this to say: the TypeScript ecosystem for building AI apps is lagging behind. And yet, we believe it is the future. We're committed to making TypeScript the best environment for building AI applications. For this to be true, we need better tooling.
This package is a foundational layer for building AI applications. It offers clean, consistent interfaces, a modular design, and is built solely on web standards to ensure compatibility in all modern environments.
Use cases
This package is useful when any of the following are true:
- You want to build against one or more LLM or embedding model APIs
- You want to proxy the model APIs
- You want basic and advanced streaming capabilities, including streaming custom data alongside the LLM response
- You want client side utilities or React hooks that abstract away the underlying streaming and state-management complexities
- You want to work with LLMs, embedding models, and streaming in client or edge environments (no Node.js dependencies)
- You want to simplify your stack by using one package with the simplest and cleanest code to interact with model APIs, stream data, and build client applications
What's in this guide
This guide covers the basics of working with models; the rest of the documentation covers more advanced topics.
Working with models
Axflow provides wrappers around the most popular LLM APIs for convenience. Each API is wrapped using only web standards (e.g., fetch), which enables this library to be used in any environment that supports these standards, including edge environments like the Next.js Edge Runtime. Similarly, there are no server-side dependencies (e.g., the Node stdlib), meaning this library also works in client environments like browsers or VSCode extensions.
While Axflow wraps popular LLM APIs with a consistent set of methods, the inputs and outputs of these APIs are defined by the underlying API, not by Axflow. Therefore, OpenAICompletion.run(), AnthropicCompletion.run(), etc. take different arguments and return different responses. Axflow's TypeScript types reflect the expected inputs and outputs, but for more information see the underlying model provider's documentation.
Axflow does not support every LLM or embedding model, but you can bring your own.
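Since everything here is built on web standards, bringing your own model typically just means wrapping its HTTP API with fetch. Below is a minimal sketch assuming a hypothetical provider with a JSON /v1/generate endpoint and bearer-token auth; none of these names come from a real API.

// A sketch of wrapping an unsupported model API using only web standards.
// The endpoint, request body, and response shape below are hypothetical.
type MyModelOptions = { apiKey: string; apiUrl?: string };

async function runMyModel(prompt: string, options: MyModelOptions) {
  const response = await fetch(options.apiUrl ?? 'https://api.example.com/v1/generate', {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      authorization: `Bearer ${options.apiKey}`,
    },
    body: JSON.stringify({ prompt }),
  });

  if (!response.ok) {
    throw new Error(`Model API request failed with status ${response.status}`);
  }

  return response.json();
}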
Running a model
All models define a run method that invokes the model and returns its response.
import { OpenAIEmbedding } from '@axflow/models/openai/embedding';
import { AnthropicCompletion } from '@axflow/models/anthropic/completion';

// Embedding model example
const embedding = await OpenAIEmbedding.run(
  {
    model: 'text-embedding-ada-002',
    input: 'How can I deploy on Vercel?',
  },
  { apiKey: process.env.OPENAI_API_KEY },
);

console.log(embedding);
// {
//   object: 'list',
//   data: [ { object: 'embedding', index: 0, embedding: [ ... ] } ],
//   model: 'text-embedding-ada-002-v2',
//   usage: { prompt_tokens: 9, total_tokens: 9 }
// }

// Completion model example
const completion = await AnthropicCompletion.run(
  {
    model: 'claude-2',
    prompt: '\n\nHuman: How can I deploy on Vercel?\n\nAssistant:',
    max_tokens_to_sample: 80,
  },
  { apiKey: process.env.ANTHROPIC_API_KEY },
);

console.log(completion);
// {
//   completion: ' Here are a few steps to deploy on Vercel:\n\n1. Sign up ...',
//   stop_reason: 'max_tokens',
//   model: 'claude-2.0',
//   stop: null,
//   log_id: '70f8c2a98800729d26a9435c6f5745107ed8e570f29149de5baac2e155b5e7f0'
// }
Streaming a model
For models that support streaming, there are a few streaming methods: streamBytes, stream, and streamTokens. Each of these takes the same arguments as run unless noted otherwise, and returns a promise that resolves to a ReadableStream.
streamBytes
streamBytes is a low-level method for streaming the raw response bytes. This is useful if you want to forward the response bytes without the overhead of parsing them into objects.
For example, creating an OpenAI proxy on the edge using Next.js is only a couple lines of code:
import { OpenAIChat } from '@axflow/models/openai/chat';

export const runtime = 'edge';

export async function POST(request: Request) {
  const chatRequest = await request.json();

  const stream = await OpenAIChat.streamBytes(chatRequest, {
    apiKey: process.env.OPENAI_API_KEY,
  });

  return new Response(stream);
}
stream
stream builds upon streamBytes by parsing the response bytes into JavaScript objects. For example,
import { CohereGeneration } from '@axflow/models/cohere/generation';

const stream = await CohereGeneration.stream(
  {
    max_tokens: 80,
    prompt: 'Please explain how LLMs work',
  },
  { apiKey: process.env.COHERE_API_KEY },
);

for await (const chunk of stream) {
  console.log(chunk); // Example chunk: { text: ' They', is_finished: false }
}
streamTokens
streamTokens builds upon stream and returns a stream consisting only of the tokens. This is useful if all you care about is the user-facing LLM response.
import { CohereGeneration } from '@axflow/models/cohere/generation';

const stream = await CohereGeneration.streamTokens(
  {
    max_tokens: 80,
    prompt: 'Please explain how LLMs work',
  },
  { apiKey: process.env.COHERE_API_KEY },
);

for await (const chunk of stream) {
  console.log(chunk); // Example chunk: " They"
}
Asynchronous iteration
The streaming methods all return a ReadableStream. In recent Node.js versions, ReadableStreams implement the async iterable protocol, which allows these objects to be iterated over using the for await...of syntax. However, browsers, older Node.js versions, and the TypeScript type system do not currently support the async iterable protocol for ReadableStreams. To get around this limitation, Axflow provides a StreamToIterable utility function that takes a ReadableStream<T> and returns an AsyncIterable<T>.
import { OpenAIChat } from '@axflow/models/openai/chat';
import { StreamToIterable } from '@axflow/models/shared';

const stream = await OpenAIChat.stream(
  {
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Please explain how LLMs work' }],
  },
  { apiKey: process.env.OPENAI_API_KEY },
);

// Use StreamToIterable to convert a stream to an async iterable
for await (const chunk of StreamToIterable(stream)) {
  // Do something with chunk
}
Overriding behavior
The run and stream functions support overriding or customizing some behavior. For example, we support:
- Supplying a custom apiUrl to point to a different API implementation, e.g., a proxy or an otherwise compatible API.
- Supplying custom request headers.
- Supplying an abort signal, which enables aborting the underlying fetch request.
- Supplying a custom fetch implementation, which could act as a client-side proxy, add logging, or otherwise customize request behavior (a sketch follows the example below).
For example:
import { CohereGeneration } from '@axflow/models/cohere/generation';

const response = await CohereGeneration.stream({ /* ... */ }, {
  apiUrl: 'https://api.example.com/v1/generate',
  headers: {
    'x-my-custom-header': 'custom-value',
  },
  signal: AbortSignal.timeout(5000),
});
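As for the custom fetch implementation mentioned in the list above, here is a minimal sketch, assuming the option is named fetch, that logs each request before delegating to the global fetch:

import { CohereGeneration } from '@axflow/models/cohere/generation';

// A sketch of a custom fetch implementation that logs each request
// before delegating to the global fetch.
const loggingFetch: typeof fetch = (input, init) => {
  console.log('Requesting', input);
  return fetch(input, init);
};

const stream = await CohereGeneration.stream(
  { max_tokens: 80, prompt: 'Please explain how LLMs work' },
  { apiKey: process.env.COHERE_API_KEY, fetch: loggingFetch },
);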
Additionally, this enables easy client-side usage (e.g., in browsers), because you can run an API proxy that simply adds the API key and forwards the request to the underlying provider. For example:
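The sketch below assumes such a proxy (like the streamBytes example above) is mounted at a hypothetical /api/openai/chat route and injects the API key server-side, so the browser never sees it.

import { OpenAIChat } from '@axflow/models/openai/chat';

// Hypothetical browser usage: the request goes to your own proxy route,
// which attaches the API key before forwarding it to OpenAI.
const stream = await OpenAIChat.streamTokens(
  {
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Please explain how LLMs work' }],
  },
  // apiKey omitted here; the proxy is assumed to add it server-side
  { apiUrl: '/api/openai/chat' },
);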