@tscircuit/prompt-benchmarks
Benchmarks for tscircuit system prompts across different coding tasks
npm install @tscircuit/prompt-benchmarks
Languages: TypeScript (100%)
License: MIT
Repository: 2 stars · 6 forks · 1 watcher · 8 branches · 168 commits · 7 contributors
Updated on Feb 28, 2025

Latest Version: 0.0.44
Package Id: @tscircuit/prompt-benchmarks@0.0.44
Unpacked Size: 626.38 kB
Size: 166.27 kB
File Count: 59
NPM Version: 10.8.2
Node Version: 20.18.3
Published on: Feb 28, 2025
Total Downloads: 0
Docs · Website · Twitter · Discord · Quickstart · Online Playground
This repository contains benchmarks for evaluating and improving the quality of system prompts used to generate tscircuit code. It includes components for:
- Code Runner (`lib/code-runner`): Safely transpiles, evaluates, and renders TSX code for circuit generation.
- AI (`lib/ai`): Interfaces with OpenAI's models for prompt completions and error correction.
- Utilities (`lib/utils`): Provide logging, snapshot management, and type-checking of generated circuits.
- Prompt Templates (`lib/prompt-templates`): Define various prompt structures for generating different circuit types.
- Scorers (`benchmarks/scorers`): Run multiple tests to ensure circuit validity and quality.

You can install this package from npm using Bun:
```
bun add @tscircuit/prompt-benchmarks
```
Below is the TscircuitCoder interface:
```typescript
export interface TscircuitCoder {
  onStreamedChunk: (chunk: string) => void
  onVfsChanged: () => void
  vfs: { [filepath: string]: string }
  availableOptions: { name: string; options: string[] }[]
  submitPrompt: (
    prompt: string,
    options?: { selectedMicrocontroller?: string },
  ) => Promise<void>
}
```
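For illustration, here is a minimal in-memory implementation of this interface — a hypothetical mock, not the library's real coder — showing how the streaming callback, the VFS, and the change notification fit together:

```typescript
// Shape copied from the TscircuitCoder interface above
interface TscircuitCoder {
  onStreamedChunk: (chunk: string) => void
  onVfsChanged: () => void
  vfs: { [filepath: string]: string }
  availableOptions: { name: string; options: string[] }[]
  submitPrompt: (
    prompt: string,
    options?: { selectedMicrocontroller?: string },
  ) => Promise<void>
}

// A mock coder that "streams" a canned response and writes it to the VFS
function createMockCoder(
  onStreamedChunk: (chunk: string) => void,
  onVfsChanged: () => void,
): TscircuitCoder {
  const coder: TscircuitCoder = {
    onStreamedChunk,
    onVfsChanged,
    vfs: {},
    availableOptions: [{ name: "microcontroller", options: ["pico", "esp32"] }],
    async submitPrompt(prompt) {
      const code = `// circuit for: ${prompt}`
      // Stream the response one "chunk" at a time
      for (const chunk of code.split(" ")) coder.onStreamedChunk(chunk + " ")
      // Write the finished file and notify listeners
      coder.vfs["index.tsx"] = code
      coder.onVfsChanged()
    },
  }
  return coder
}
```

The real coder calls an AI model instead of producing a canned string, but the contract is the same: chunks arrive through `onStreamedChunk`, and `onVfsChanged` fires once the generated file lands in `vfs`.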
*Note: The `createTscircuitCoder` function now accepts an optional `openaiClient` parameter to override the default OpenAI client. This allows you to provide a custom client.*
The AI Coder supports streaming AI responses and notifying you when the virtual file system (VFS) is updated. To achieve this, pass two callback functions when creating a TscircuitCoder instance:
Example Usage:
```typescript
import { createTscircuitCoder } from "@tscircuit/prompt-benchmarks/lib/ai/tscircuitCoder"

// Define a callback for handling streamed chunks
const handleStream = (chunk: string) => {
  console.log("Streaming update:", chunk)
}

// Define a callback for when the VFS is updated
const handleVfsUpdate = () => {
  console.log("The virtual file system has been updated.")
}

// Create an instance of TscircuitCoder with your callbacks
const tscircuitCoder = createTscircuitCoder(handleStream, handleVfsUpdate)

// Submit a prompt to generate a circuit.
// The onStream callback logs streaming updates and onVfsChanged notifies when a new file is added to the VFS.
tscircuitCoder.submitPrompt("create a circuit that blinks an LED")
```
To run the benchmarks using evalite, use:
```
bun start
```
Each prompt is processed multiple times during evaluation.

After modifying prompts or system components, evalite reruns automatically; skip any benchmarks you don't want to run.
This project uses TOML files to define problem sets for circuit generation. Each problem is defined using a TOML array of tables with the following format:
```toml
[[problems]]
prompt = """
Your circuit prompt description goes here.
"""
title = "Sample Problem Title"
questions = [
  { text = "Question text", answer = true },
  { text = "Another question text", answer = false }
]
```
In each problem:

- The `prompt` field must contain the circuit description that instructs the AI.
- The `title` gives a short title for the problem.
- The `questions` array contains objects with a `text` property (the question) and an `answer` property (a boolean) used to validate the generated circuit.

To add a new problem set, create a new TOML file in the `problem-sets` directory following this format. Each new file can contain one or more problems defined with the `[[problems]]` header.
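As a sketch (the types and helper below are our own, not part of the package), a problem loaded from one of these TOML files can be validated against the expected shape before it is used:

```typescript
// Mirrors the [[problems]] table format described above
interface ProblemQuestion { text: string; answer: boolean }
interface Problem { prompt: string; title: string; questions: ProblemQuestion[] }

// Type guard: check that a value parsed from a problem-set file has the expected shape
function isProblem(value: unknown): value is Problem {
  if (typeof value !== "object" || value === null) return false
  const p = value as Record<string, unknown>
  return (
    typeof p.prompt === "string" &&
    typeof p.title === "string" &&
    Array.isArray(p.questions) &&
    p.questions.every(
      (q) =>
        typeof q === "object" && q !== null &&
        typeof (q as { text?: unknown }).text === "string" &&
        typeof (q as { answer?: unknown }).answer === "boolean",
    )
  )
}
```

A guard like this catches malformed problem sets (a missing `questions` array, a non-boolean `answer`) at load time rather than mid-benchmark.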
```
bun run build
bun run test
bun start
```
The benchmarks directory contains various files to help evaluate and score circuit-generating prompts:
• benchmarks/prompt-logs/
These are text files (e.g., prompt-2025-02-05T14-07-18-242Z.txt, prompt-2025-02-05T14-10-53-144Z.txt, etc.) that log each prompt attempt and its output. They serve as a history of interactions.
• benchmarks/benchmark-local-circuit-error-correction.eval.ts
Runs local circuit evaluation with an error correction workflow. It repeatedly calls the AI (up to a set maximum) until the circuit output meets expectations, logging each attempt.
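The error-correction loop can be sketched roughly as follows (a simplification with hypothetical names; the real eval file drives an actual AI call and circuit evaluation):

```typescript
// Retry a generation step until it succeeds or a maximum number of attempts is reached,
// feeding the previous error back in so the model can correct it.
async function runWithErrorCorrection(
  generate: (previousError?: string) => Promise<{ ok: boolean; error?: string }>,
  maxAttempts = 3,
): Promise<{ ok: boolean; attempts: number }> {
  let lastError: string | undefined
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await generate(lastError)
    if (result.ok) return { ok: true, attempts: attempt }
    // Remember the failure so the next attempt can include it in the prompt
    lastError = result.error
  }
  return { ok: false, attempts: maxAttempts }
}
```

Logging each attempt (as the eval file does) makes it easy to see how many correction rounds a given prompt typically needs.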
• benchmarks/benchmark-local-circuit.eval.ts
Evaluates a local circuit by running a specific user prompt and checking that the generated circuit compiles and meets expected behaviors.
• benchmarks/benchmark-local-circuit-random.eval.ts
Generates random prompts using an AI-powered prompt generator and evaluates their corresponding circuit outputs. This file is useful for stress-testing and assessing the robustness of circuit generation.
• benchmarks/scorers/ai-circuit-scorer.ts
Uses an AI model to assign a score (from 0 to 1) based on correctness, appropriate use of components, circuit complexity, and code quality.
• benchmarks/scorers/circuit-scorer.ts
A basic scorer that checks each generated circuit against predefined questions and answers from problem sets.
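Conceptually (the helper name here is illustrative, not the scorer's actual API), this style of scorer reduces to comparing the expected boolean answers from the problem set against what was observed for the generated circuit:

```typescript
// Fraction of problem-set questions whose expected boolean answer matches
// what was observed for the generated circuit (0 to 1)
function scoreAgainstQuestions(expected: boolean[], observed: boolean[]): number {
  if (expected.length === 0) return 0
  const correct = expected.filter((answer, i) => answer === observed[i]).length
  return correct / expected.length
}
```

A circuit that satisfies every question scores 1; one that satisfies half of them scores 0.5.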
MIT License
No vulnerabilities found.