@huggingface/inference
Utilities to use the Hugging Face Hub API
npm install @huggingface/inference
Languages: TypeScript (71.24%), Svelte (16.4%), JavaScript (11.59%), CSS (0.38%), Python (0.25%), Shell (0.12%), HTML (0.03%)
1,455 Stars · 1,111 Commits · 254 Forks · 48 Watching · 54 Branches · 272 Contributors
Latest Version: 2.8.1
Package Id: @huggingface/inference@2.8.1
Unpacked Size: 229.64 kB
Size: 44.79 kB
File Count: 160
NPM Version: 10.8.2
Node Version: 20.17.0
Published On: 29 Sept 2024
Cumulative downloads: 3,060,380

| Period | Downloads | Change vs. previous period |
| --- | --- | --- |
| Last day | 8,704 | -10.6% |
| Last week | 69,945 | -30.1% |
| Last month | 394,306 | +14.1% |
| Last year | 2,487,666 | +334.4% |
A TypeScript-powered wrapper for the Hugging Face Inference Endpoints API. Learn more about Inference Endpoints at Hugging Face. It works with both the Inference API (serverless) and Inference Endpoints (dedicated).
Check out the full documentation.
You can also try out a live interactive notebook, see some demos on hf.co/huggingfacejs, or watch a Scrimba tutorial that explains how Inference Endpoints works.
```bash
npm install @huggingface/inference

pnpm add @huggingface/inference

yarn add @huggingface/inference
```
```ts
// esm.sh
import { HfInference } from "https://esm.sh/@huggingface/inference"
// or npm:
import { HfInference } from "npm:@huggingface/inference"
```
```ts
import { HfInference } from '@huggingface/inference'

const hf = new HfInference('your access token')
```
❗Important note: Using an access token is optional to get started, but you will eventually be rate limited. Join Hugging Face and then visit access tokens to generate your access token for free.
Your access token should be kept private. If you need to protect it in front-end applications, we suggest setting up a proxy server that stores the access token.
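For example, here is a minimal sketch of such a proxy, assuming a Node.js 18+ runtime with the built-in `fetch`; the port and route handling are illustrative, not part of the library:

```ts
import { createServer } from "node:http";

const HF_TOKEN = process.env.HF_TOKEN; // the secret token stays on the server

createServer(async (req, res) => {
  // Buffer the incoming request body from the front-end.
  const chunks: Buffer[] = [];
  for await (const chunk of req) chunks.push(chunk as Buffer);

  // Forward the call to the Inference API, attaching the token server-side.
  const upstream = await fetch(`https://api-inference.huggingface.co${req.url}`, {
    method: req.method,
    headers: {
      Authorization: `Bearer ${HF_TOKEN}`,
      "Content-Type": req.headers["content-type"] ?? "application/json",
    },
    body: chunks.length > 0 ? Buffer.concat(chunks) : undefined,
  });

  res.writeHead(upstream.status);
  res.end(Buffer.from(await upstream.arrayBuffer()));
}).listen(3000);
```

A browser client could then call `new HfInference().endpoint("http://localhost:3000/models/gpt2")` (URL illustrative) so that no token ever reaches the front-end.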
You can import the functions you need directly from the module instead of using the HfInference class.
```ts
import { textGeneration } from "@huggingface/inference";

await textGeneration({
  accessToken: "hf_...",
  model: "model_or_endpoint",
  inputs: ...,
  parameters: ...
})
```
This will enable tree-shaking by your bundler.
Generates text from an input prompt.
```ts
await hf.textGeneration({
  model: 'gpt2',
  inputs: 'The answer to the universe is'
})

for await (const output of hf.textGenerationStream({
  model: "google/flan-t5-xxl",
  inputs: 'repeat "one two three four"',
  parameters: { max_new_tokens: 250 }
})) {
  console.log(output.token.text, output.generated_text);
}
```
Using the chatCompletion method, you can generate text with models compatible with the OpenAI Chat Completion API. All models served by TGI on Hugging Face support the Messages API.
```ts
// Non-streaming API
const completion = await hf.chatCompletion({
  model: "mistralai/Mistral-7B-Instruct-v0.2",
  messages: [{ role: "user", content: "Complete this sentence with words, one plus one is equal " }],
  max_tokens: 500,
  temperature: 0.1,
  seed: 0,
});

// Streaming API
let out = "";
for await (const chunk of hf.chatCompletionStream({
  model: "mistralai/Mistral-7B-Instruct-v0.2",
  messages: [
    { role: "user", content: "Complete the equation 1+1=, just the answer" },
  ],
  max_tokens: 500,
  temperature: 0.1,
  seed: 0,
})) {
  if (chunk.choices && chunk.choices.length > 0) {
    out += chunk.choices[0].delta.content;
  }
}
```
It's also possible to call Mistral or OpenAI endpoints directly:
```ts
const openai = new HfInference(OPENAI_TOKEN).endpoint("https://api.openai.com");

let out = "";
for await (const chunk of openai.chatCompletionStream({
  model: "gpt-3.5-turbo",
  messages: [
    { role: "user", content: "Complete the equation 1+1=, just the answer" },
  ],
  max_tokens: 500,
  temperature: 0.1,
  seed: 0,
})) {
  if (chunk.choices && chunk.choices.length > 0) {
    out += chunk.choices[0].delta.content;
  }
}

// For Mistral AI:
// endpointUrl: "https://api.mistral.ai"
// model: "mistral-tiny"
```
Tries to fill in a blank with a missing word (a token, to be precise).
```ts
await hf.fillMask({
  model: 'bert-base-uncased',
  inputs: '[MASK] world!'
})
```
Summarizes longer text into shorter text. Be careful: some models have a maximum input length.
```ts
await hf.summarization({
  model: 'facebook/bart-large-cnn',
  inputs:
    'The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930.',
  parameters: {
    max_length: 100
  }
})
```
Answers questions based on the context you provide.
```ts
await hf.questionAnswering({
  model: 'deepset/roberta-base-squad2',
  inputs: {
    question: 'What is the capital of France?',
    context: 'The capital of France is Paris.'
  }
})
```
Answers questions about the contents of a table.

```ts
await hf.tableQuestionAnswering({
  model: 'google/tapas-base-finetuned-wtq',
  inputs: {
    query: 'How many stars does the transformers repository have?',
    table: {
      Repository: ['Transformers', 'Datasets', 'Tokenizers'],
      Stars: ['36542', '4512', '3934'],
      Contributors: ['651', '77', '34'],
      'Programming language': ['Python', 'Python', 'Rust, Python and NodeJS']
    }
  }
})
```
Often used for sentiment analysis, this method assigns labels to the given text along with a probability score for each label.
```ts
await hf.textClassification({
  model: 'distilbert-base-uncased-finetuned-sst-2-english',
  inputs: 'I like you. I love you.'
})
```
Used for sentence parsing, either grammatical or Named Entity Recognition (NER), to understand keywords contained within text.
```ts
await hf.tokenClassification({
  model: 'dbmdz/bert-large-cased-finetuned-conll03-english',
  inputs: 'My name is Sarah Jessica Parker but you can call me Jessica'
})
```
Converts text from one language to another.
```ts
await hf.translation({
  model: 't5-base',
  inputs: 'My name is Wolfgang and I live in Berlin'
})

await hf.translation({
  model: 'facebook/mbart-large-50-many-to-many-mmt',
  inputs: textToTranslate,
  parameters: {
    "src_lang": "en_XX",
    "tgt_lang": "fr_XX"
  }
})
```
Checks how well an input text fits into a set of labels you provide.
```ts
await hf.zeroShotClassification({
  model: 'facebook/bart-large-mnli',
  inputs: [
    'Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!'
  ],
  parameters: { candidate_labels: ['refund', 'legal', 'faq'] }
})
```
This task corresponds to any chatbot-like structure. Models tend to have a short max_length, so check carefully whether a given model fits your needs if you require long-range context.
```ts
await hf.conversational({
  model: 'microsoft/DialoGPT-large',
  inputs: {
    past_user_inputs: ['Which movie is the best ?'],
    generated_responses: ['It is Die Hard for sure.'],
    text: 'Can you explain why ?'
  }
})
```
Calculates the semantic similarity between one text and a list of other sentences.
```ts
await hf.sentenceSimilarity({
  model: 'sentence-transformers/paraphrase-xlm-r-multilingual-v1',
  inputs: {
    source_sentence: 'That is a happy person',
    sentences: [
      'That is a happy dog',
      'That is a very happy person',
      'Today is a sunny day'
    ]
  }
})
```
Transcribes speech from an audio file.
```ts
import { readFileSync } from "node:fs";

await hf.automaticSpeechRecognition({
  model: 'facebook/wav2vec2-large-960h-lv60-self',
  data: readFileSync('test/sample1.flac')
})
```
Assigns labels to the given audio along with a probability score for each label.
```ts
await hf.audioClassification({
  model: 'superb/hubert-large-superb-er',
  data: readFileSync('test/sample1.flac')
})
```
Generates natural-sounding speech from text input.
```ts
await hf.textToSpeech({
  model: 'espnet/kan-bayashi_ljspeech_vits',
  inputs: 'Hello world!'
})
```
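The call resolves to a Blob of audio data. A minimal sketch of saving it to disk, assuming a Node.js runtime (the file name is illustrative; the actual format depends on the model):

```ts
import { writeFileSync } from "node:fs";

const audio = await hf.textToSpeech({
  model: 'espnet/kan-bayashi_ljspeech_vits',
  inputs: 'Hello world!'
})

// Convert the returned Blob to a Buffer and persist it.
writeFileSync('hello.flac', Buffer.from(await audio.arrayBuffer()));
```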
Outputs one or multiple generated audios from an input audio, commonly used for speech enhancement and source separation.
```ts
await hf.audioToAudio({
  model: 'speechbrain/sepformer-wham',
  data: readFileSync('test/sample1.flac')
})
```
Assigns labels to a given image along with a probability score for each label.
```ts
await hf.imageClassification({
  data: readFileSync('test/cheetah.png'),
  model: 'google/vit-base-patch16-224'
})
```
Detects objects within an image and returns labels with corresponding bounding boxes and probability scores.
```ts
await hf.objectDetection({
  data: readFileSync('test/cats.png'),
  model: 'facebook/detr-resnet-50'
})
```
Detects segments within an image and returns labels with corresponding masks and probability scores.
```ts
await hf.imageSegmentation({
  data: readFileSync('test/cats.png'),
  model: 'facebook/detr-resnet-50-panoptic'
})
```
Outputs text from a given image, commonly used for captioning or optical character recognition.
```ts
await hf.imageToText({
  data: readFileSync('test/cats.png'),
  model: 'nlpconnect/vit-gpt2-image-captioning'
})
```
Creates an image from a text prompt.
```ts
await hf.textToImage({
  inputs: 'award winning high resolution photo of a giant tortoise/((ladybird)) hybrid, [trending on artstation]',
  model: 'stabilityai/stable-diffusion-2',
  parameters: {
    negative_prompt: 'blurry',
  }
})
```
Image-to-image is the task of transforming a source image to match the characteristics of a target image or a target image domain.
```ts
await hf.imageToImage({
  inputs: new Blob([readFileSync("test/stormtrooper_depth.png")]),
  parameters: {
    prompt: "elmo's lecture",
  },
  model: "lllyasviel/sd-controlnet-depth",
});
```
Checks how well an input image fits into a set of labels you provide.
```ts
await hf.zeroShotImageClassification({
  model: 'openai/clip-vit-large-patch14-336',
  inputs: {
    image: await (await fetch('https://placekitten.com/300/300')).blob()
  },
  parameters: {
    candidate_labels: ['cat', 'dog']
  }
})
```
This task reads some text and outputs raw float values (embeddings) that are usually consumed as part of a semantic database or semantic search.
```ts
await hf.featureExtraction({
  model: "sentence-transformers/distilbert-base-nli-mean-tokens",
  inputs: "That is a happy person",
});
```
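For instance, here is a minimal sketch of comparing two embeddings with cosine similarity; the helper is illustrative and not part of the library, and we assume this model returns one flat number[] per input:

```ts
import { featureExtraction } from "@huggingface/inference";

// Illustrative helper: cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const model = "sentence-transformers/distilbert-base-nli-mean-tokens";
const [a, b] = await Promise.all([
  featureExtraction({ accessToken: "hf_...", model, inputs: "That is a happy person" }),
  featureExtraction({ accessToken: "hf_...", model, inputs: "That is a very happy person" }),
]);

console.log(cosineSimilarity(a as number[], b as number[])); // close to 1 for similar sentences
```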
Visual Question Answering is the task of answering open-ended questions based on an image. These models output natural language responses to natural language questions.
```ts
await hf.visualQuestionAnswering({
  model: 'dandelin/vilt-b32-finetuned-vqa',
  inputs: {
    question: 'How many cats are lying down?',
    image: await (await fetch('https://placekitten.com/300/300')).blob()
  }
})
```
Document question answering models take a (document, question) pair as input and return an answer in natural language.
```ts
await hf.documentQuestionAnswering({
  model: 'impira/layoutlm-document-qa',
  inputs: {
    question: 'Invoice number?',
    image: await (await fetch('https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png')).blob(),
  }
})
```
Tabular regression is the task of predicting a numerical value given a set of attributes.
```ts
await hf.tabularRegression({
  model: "scikit-learn/Fish-Weight",
  inputs: {
    data: {
      "Height": ["11.52", "12.48", "12.3778"],
      "Length1": ["23.2", "24", "23.9"],
      "Length2": ["25.4", "26.3", "26.5"],
      "Length3": ["30", "31.2", "31.1"],
      "Species": ["Bream", "Bream", "Bream"],
      "Width": ["4.02", "4.3056", "4.6961"]
    },
  },
})
```
Tabular classification is the task of classifying a target category (a group) based on a set of attributes.
```ts
await hf.tabularClassification({
  model: "vvmnnnkv/wine-quality",
  inputs: {
    data: {
      "fixed_acidity": ["7.4", "7.8", "10.3"],
      "volatile_acidity": ["0.7", "0.88", "0.32"],
      "citric_acid": ["0", "0", "0.45"],
      "residual_sugar": ["1.9", "2.6", "6.4"],
      "chlorides": ["0.076", "0.098", "0.073"],
      "free_sulfur_dioxide": ["11", "25", "5"],
      "total_sulfur_dioxide": ["34", "67", "13"],
      "density": ["0.9978", "0.9968", "0.9976"],
      "pH": ["3.51", "3.2", "3.23"],
      "sulphates": ["0.56", "0.68", "0.82"],
      "alcohol": ["9.4", "9.8", "12.6"]
    },
  },
})
```
For models with custom parameters / outputs.
```ts
await hf.request({
  model: 'my-custom-model',
  inputs: 'hello world',
  parameters: {
    custom_param: 'some magic',
  }
})

// Custom streaming call, for models with custom parameters / outputs
for await (const output of hf.streamingRequest({
  model: 'my-custom-model',
  inputs: 'hello world',
  parameters: {
    custom_param: 'some magic',
  }
})) {
  // ...
}
```
You can use any Chat Completion API-compatible provider with the chatCompletion method.
```ts
// Chat Completion Example
const MISTRAL_KEY = process.env.MISTRAL_KEY;
const hf = new HfInference(MISTRAL_KEY);
const ep = hf.endpoint("https://api.mistral.ai");
const stream = ep.chatCompletionStream({
  model: "mistral-tiny",
  messages: [{ role: "user", content: "Complete the equation one + one =, just the answer" }],
});
let out = "";
for await (const chunk of stream) {
  if (chunk.choices && chunk.choices.length > 0) {
    out += chunk.choices[0].delta.content;
    console.log(out);
  }
}
```
Learn more about using your own inference endpoints here.
```ts
const gpt2 = hf.endpoint('https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/gpt2');
const { generated_text } = await gpt2.textGeneration({ inputs: 'The answer to the universe is' });

// Chat Completion Example
const ep = hf.endpoint(
  "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.2"
);
const stream = ep.chatCompletionStream({
  model: "tgi",
  messages: [{ role: "user", content: "Complete the equation 1+1=, just the answer" }],
  max_tokens: 500,
  temperature: 0.1,
  seed: 0,
});
let out = "";
for await (const chunk of stream) {
  if (chunk.choices && chunk.choices.length > 0) {
    out += chunk.choices[0].delta.content;
    console.log(out);
  }
}
```
By default, all calls to the inference endpoint will wait until the model is loaded. When scaling to 0 is enabled on the endpoint, this can result in non-trivial waiting time. If you'd rather disable this behavior and handle the endpoint's returned 500 HTTP errors yourself, you can do so as follows:
```ts
const gpt2 = hf.endpoint('https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/gpt2');
const { generated_text } = await gpt2.textGeneration(
  { inputs: 'The answer to the universe is' },
  { retry_on_error: false },
);
```
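With retries disabled, a call made while the endpoint is still scaling up simply rejects. A minimal sketch of retrying yourself, reusing the gpt2 endpoint from above (the attempt count and back-off timings are illustrative):

```ts
async function generateWithRetry(prompt: string, attempts = 5): Promise<string> {
  for (let i = 0; i < attempts; i++) {
    try {
      const { generated_text } = await gpt2.textGeneration(
        { inputs: prompt },
        { retry_on_error: false },
      );
      return generated_text;
    } catch {
      // The endpoint errored (e.g. a 500 while scaled to zero): back off and retry.
      await new Promise((resolve) => setTimeout(resolve, 2000 * (i + 1)));
    }
  }
  throw new Error("Endpoint did not respond successfully in time");
}
```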
```bash
HF_TOKEN="your access token" pnpm run test
```
We have an informative documentation project called Tasks to list available models for each task and explain how each task works in detail.
It also contains demos, example outputs, and other resources should you want to dig deeper into the ML side of things.
Dependencies: @huggingface/tasks (typings only). No security vulnerabilities found.