npm install ollama
Supply Chain: 99.5
Quality: 99.6
Maintenance: 88.4
Vulnerability: 100
License: 100
Languages: TypeScript (98.11%), JavaScript (1.89%)
Total Downloads: 6,182,726
Last Day: 24,389
Last Week: 245,505
Last Month: 957,678
Last Year: 5,743,602
MIT License
3,501 Stars
186 Commits
315 Forks
28 Watchers
37 Branches
34 Contributors
Updated on Jul 02, 2025
Latest Version: 0.5.16
Package Id: ollama@0.5.16
Unpacked Size: 107.78 kB
Size: 21.32 kB
File Count: 22
NPM Version: 11.3.0
Node Version: 24.1.0
Published on: May 30, 2025
Downloads compared to the previous period:
Last Day: 24,389 (18.6% vs. the previous day)
Last Week: 245,505 (1% vs. the previous week)
Last Month: 957,678 (12.6% vs. the previous month)
Last Year: 5,743,602 (1,208% vs. the previous year)
The Ollama JavaScript library provides the easiest way to integrate your JavaScript project with Ollama.
```shell
npm i ollama
```
```javascript
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)
```
To use the library without Node.js, import the browser module.
```javascript
import ollama from 'ollama/browser'
```
Response streaming can be enabled by setting `stream: true`, modifying function calls to return an `AsyncGenerator` where each part is an object in the stream.
```javascript
import ollama from 'ollama'

const message = { role: 'user', content: 'Why is the sky blue?' }
const response = await ollama.chat({ model: 'llama3.1', messages: [message], stream: true })
for await (const part of response) {
  process.stdout.write(part.message.content)
}
```
The Ollama JavaScript library's API is designed around the Ollama REST API.
```javascript
ollama.chat(request)
```

- `request` `<Object>`: The request object containing chat parameters.
  - `model` `<string>`: The name of the model to use for the chat.
  - `messages` `<Message[]>`: Array of message objects representing the chat history.
    - `role` `<string>`: The role of the message sender ('user', 'system', or 'assistant').
    - `content` `<string>`: The content of the message.
    - `images` `<Uint8Array[] | string[]>`: (Optional) Images to be included in the message, either as Uint8Array or base64 encoded strings.
  - `format` `<string>`: (Optional) Set the expected format of the response (`json`).
  - `stream` `<boolean>`: (Optional) When true an `AsyncGenerator` is returned.
  - `think` `<boolean>`: (Optional) When true, the model will think about the response before responding. Requires thinking support from the model.
  - `keep_alive` `<string | number>`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.)
  - `tools` `<Tool[]>`: (Optional) A list of tool calls the model may make.
  - `options` `<Options>`: (Optional) Options to configure the runtime.
- Returns: `<ChatResponse>`
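For example, JSON-mode output can be requested and parsed as in the sketch below (the model name is a placeholder, and well-formed JSON still depends on the prompt and the model):

```javascript
import ollama from 'ollama'

// Ask for a JSON-formatted reply and parse it.
// format: 'json' constrains the output format, but the schema of the
// parsed object depends entirely on the prompt and the model.
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'List three primary colors as JSON with a "colors" array.' }],
  format: 'json',
})
console.log(JSON.parse(response.message.content))
```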
```javascript
ollama.generate(request)
```

- `request` `<Object>`: The request object containing generate parameters.
  - `model` `<string>`: The name of the model to use for generation.
  - `prompt` `<string>`: The prompt to send to the model.
  - `suffix` `<string>`: (Optional) Suffix is the text that comes after the inserted text.
  - `system` `<string>`: (Optional) Override the model system prompt.
  - `template` `<string>`: (Optional) Override the model template.
  - `raw` `<boolean>`: (Optional) Bypass the prompt template and pass the prompt directly to the model.
  - `images` `<Uint8Array[] | string[]>`: (Optional) Images to be included, either as Uint8Array or base64 encoded strings.
  - `format` `<string>`: (Optional) Set the expected format of the response (`json`).
  - `stream` `<boolean>`: (Optional) When true an `AsyncGenerator` is returned.
  - `think` `<boolean>`: (Optional) When true, the model will think about the response before responding. Requires thinking support from the model.
  - `keep_alive` `<string | number>`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.)
  - `options` `<Options>`: (Optional) Options to configure the runtime.
- Returns: `<GenerateResponse>`
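As a minimal illustration (not from the upstream docs), a completion request could look like this; it assumes the generated text is returned on a `response` field, mirroring the REST `/api/generate` endpoint:

```javascript
import ollama from 'ollama'

// Single-shot text completion; the output text is assumed to live on
// `output.response`, as in the REST /api/generate response.
const output = await ollama.generate({
  model: 'llama3.1',
  prompt: 'Write a haiku about the ocean.',
})
console.log(output.response)
```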
```javascript
ollama.pull(request)
```

- `request` `<Object>`: The request object containing pull parameters.
  - `model` `<string>`: The name of the model to pull.
  - `insecure` `<boolean>`: (Optional) Pull from servers whose identity cannot be verified.
  - `stream` `<boolean>`: (Optional) When true an `AsyncGenerator` is returned.
- Returns: `<ProgressResponse>`
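A sketch of a streamed pull with a simple progress readout; the per-part fields (`status`, `completed`, `total`) are assumptions based on the REST `/api/pull` progress messages:

```javascript
import ollama from 'ollama'

// Pull a model and report download progress as it streams in.
const progress = await ollama.pull({ model: 'llama3.1', stream: true })
for await (const part of progress) {
  if (part.total && part.completed) {
    // Percentage of the current layer downloaded so far (assumed fields).
    process.stdout.write(`\r${part.status}: ${Math.round((part.completed / part.total) * 100)}%`)
  } else {
    console.log(part.status)
  }
}
```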
```javascript
ollama.push(request)
```

- `request` `<Object>`: The request object containing push parameters.
  - `model` `<string>`: The name of the model to push.
  - `insecure` `<boolean>`: (Optional) Push to servers whose identity cannot be verified.
  - `stream` `<boolean>`: (Optional) When true an `AsyncGenerator` is returned.
- Returns: `<ProgressResponse>`
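Pushing follows the same shape; a minimal non-streaming sketch (the namespaced model name is a placeholder and must already exist locally):

```javascript
import ollama from 'ollama'

// Push a local model to a registry; 'myuser/my-model:latest' is a placeholder.
const result = await ollama.push({ model: 'myuser/my-model:latest' })
console.log(result.status)
```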
```javascript
ollama.create(request)
```

- `request` `<Object>`: The request object containing create parameters.
  - `model` `<string>`: The name of the model to create.
  - `from` `<string>`: The base model to derive from.
  - `stream` `<boolean>`: (Optional) When true an `AsyncGenerator` is returned.
  - `quantize` `<string>`: Quantization precision level (`q8_0`, `q4_K_M`, etc.).
  - `template` `<string>`: (Optional) The prompt template to use with the model.
  - `license` `<string|string[]>`: (Optional) The license(s) associated with the model.
  - `system` `<string>`: (Optional) The system prompt for the model.
  - `parameters` `<Record<string, unknown>>`: (Optional) Additional model parameters as key-value pairs.
  - `messages` `<Message[]>`: (Optional) Initial chat messages for the model.
  - `adapters` `<Record<string, string>>`: (Optional) A key-value map of LoRA adapter configurations.
- Returns: `<ProgressResponse>`

Note: The `files` parameter is not currently supported in `ollama-js`.
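For instance, deriving a small persona model from an existing base could look like this (names and prompt are illustrative):

```javascript
import ollama from 'ollama'

// Create a new model from llama3.1 with a custom system prompt.
await ollama.create({
  model: 'mario',
  from: 'llama3.1',
  system: 'You are Mario from Super Mario Bros.',
})
```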
```javascript
ollama.delete(request)
```

- `request` `<Object>`: The request object containing delete parameters.
  - `model` `<string>`: The name of the model to delete.
- Returns: `<StatusResponse>`
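Usage is a single call, for example:

```javascript
import ollama from 'ollama'

// Delete a local model by name ('mario' is a placeholder).
await ollama.delete({ model: 'mario' })
```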
```javascript
ollama.copy(request)
```

- `request` `<Object>`: The request object containing copy parameters.
  - `source` `<string>`: The name of the model to copy from.
  - `destination` `<string>`: The name of the model to copy to.
- Returns: `<StatusResponse>`
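For example, to keep a backup copy before editing a model:

```javascript
import ollama from 'ollama'

// Copy an existing model to a new name (names are placeholders).
await ollama.copy({ source: 'llama3.1', destination: 'llama3.1-backup' })
```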
```javascript
ollama.list()
```

- Returns: `<ListResponse>`
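A small sketch that prints the locally available models; it assumes the response exposes a `models` array whose entries carry a `name`, as in the REST `/api/tags` response:

```javascript
import ollama from 'ollama'

// Print the name of every locally installed model (field names assumed).
const { models } = await ollama.list()
for (const model of models) {
  console.log(model.name)
}
```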
```javascript
ollama.show(request)
```

- `request` `<Object>`: The request object containing show parameters.
  - `model` `<string>`: The name of the model to show.
  - `system` `<string>`: (Optional) Override the model system prompt returned.
  - `template` `<string>`: (Optional) Override the model template returned.
  - `options` `<Options>`: (Optional) Options to configure the runtime.
- Returns: `<ShowResponse>`
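An illustrative call; the printed fields are assumptions based on the REST `/api/show` response and may differ:

```javascript
import ollama from 'ollama'

// Inspect a model's metadata; `details` and `template` are assumed field names.
const info = await ollama.show({ model: 'llama3.1' })
console.log(info.details)
console.log(info.template)
```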
```javascript
ollama.embed(request)
```

- `request` `<Object>`: The request object containing embedding parameters.
  - `model` `<string>`: The name of the model used to generate the embeddings.
  - `input` `<string> | <string[]>`: The input used to generate the embeddings.
  - `truncate` `<boolean>`: (Optional) Truncate the input to fit the maximum context length supported by the model.
  - `keep_alive` `<string | number>`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.)
  - `options` `<Options>`: (Optional) Options to configure the runtime.
- Returns: `<EmbedResponse>`
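A sketch that embeds several inputs in one call; it assumes the result carries an `embeddings` array with one vector per input, as in the REST `/api/embed` endpoint (the model name is a placeholder):

```javascript
import ollama from 'ollama'

// Embed two strings at once; `embeddings` is assumed to be number[][].
const { embeddings } = await ollama.embed({
  model: 'all-minilm',
  input: ['Why is the sky blue?', 'Why is the grass green?'],
})
console.log(embeddings.length, embeddings[0].length) // vectors, dimensions
```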
```javascript
ollama.ps()
```

- Returns: `<ListResponse>`
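For example (assuming the same `models` array shape as `ollama.list()`):

```javascript
import ollama from 'ollama'

// List the models currently loaded in memory (field names assumed).
const running = await ollama.ps()
for (const model of running.models) {
  console.log(model.name)
}
```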
```javascript
ollama.abort()
```

This method will abort all streamed generations currently running with the client instance. If there is a need to manage streams with timeouts, it is recommended to have one Ollama client per stream.

All asynchronous threads listening to streams (typically `for await (const part of response)`) will throw an `AbortError` exception. See examples/abort/abort-all-requests.ts for an example.
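A condensed sketch of the pattern (not the referenced example file verbatim): start a streamed chat, abort shortly afterwards, and handle the resulting `AbortError`:

```javascript
import ollama from 'ollama'

// Abort whatever is streaming on the default client after one second.
setTimeout(() => ollama.abort(), 1000)

try {
  const stream = await ollama.chat({
    model: 'llama3.1',
    messages: [{ role: 'user', content: 'Write a very long story.' }],
    stream: true,
  })
  for await (const part of stream) {
    process.stdout.write(part.message.content)
  }
} catch (error) {
  // The aborted stream is assumed to surface as an error named 'AbortError'.
  if (error.name === 'AbortError') {
    console.log('\nGeneration aborted')
  } else {
    throw error
  }
}
```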
A custom client can be created with the following fields:
- `host` `<string>`: (Optional) The Ollama host address. Default: `"http://127.0.0.1:11434"`.
- `fetch` `<Object>`: (Optional) The fetch library used to make requests to the Ollama host.

```javascript
import { Ollama } from 'ollama'

const ollama = new Ollama({ host: 'http://127.0.0.1:11434' })
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
```
To build the project files run:
```shell
npm run build
```
No vulnerabilities found.