Gathering detailed insights and metrics for @context-labs/node-warc
Parse And Create Web ARChive (WARC) files with node.js
npm install @context-labs/node-warc
JavaScript (100%)
Total Downloads: 0
Last Day: 0
Last Week: 0
Last Month: 0
Last Year: 0
MIT License
99 Stars
116 Commits
22 Forks
7 Watchers
20 Branches
6 Contributors
Updated on Jul 04, 2025
Latest Version: 3.3.3
Package Id: @context-labs/node-warc@3.3.3
Unpacked Size: 143.63 kB
Size: 30.34 kB
File Count: 47
NPM Version: 9.6.7
Node Version: 20.3.1
Published on: Jul 20, 2023
Parse Web Archive (WARC) files or create WARC files using
Run npm install node-warc or yarn add node-warc to get started
Full documentation available at n0tan3rd.github.io/node-warc
Requires node 10 or greater
```js
const fs = require('fs')
const zlib = require('zlib')
// recordIterator only exported if async iteration on readable streams is available
const { recordIterator } = require('node-warc')

async function iterateRecords (warcStream) {
  for await (const record of recordIterator(warcStream)) {
    console.log(record)
  }
}

iterateRecords(
  fs.createReadStream('<path-to-gzipd-warcfile>').pipe(zlib.createGunzip())
).then(() => {
  console.log('done')
})
```
Or using one of the parsers
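```js
const { AutoWARCParser } = require('node-warc')

// for await is only valid inside an async function in CommonJS modules
;(async () => {
  for await (const record of new AutoWARCParser('<path-to-warcfile>')) {
    console.log(record)
  }
})()
```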
```js
const fs = require('fs')
const { WARCStreamTransform } = require('node-warc')

fs
  .createReadStream('<path-to-warcfile>')
  .pipe(new WARCStreamTransform())
  .on('data', record => {
    console.log(record)
  })
```
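The stream-based examples above gunzip the input by hand when the file is compressed. A minimal sketch of making that choice automatically, using only Node's built-in fs and zlib modules, might look like the following. The helper names isGzipFile and openWarcStream are hypothetical and not part of node-warc; the check relies on the standard gzip magic bytes (0x1f 0x8b).

```js
const fs = require('fs')
const zlib = require('zlib')

// Hypothetical helper (not part of node-warc): true when the file
// starts with the gzip magic bytes 0x1f 0x8b.
function isGzipFile (path) {
  const fd = fs.openSync(path, 'r')
  try {
    const magic = Buffer.alloc(2)
    const bytesRead = fs.readSync(fd, magic, 0, 2, 0)
    return bytesRead === 2 && magic[0] === 0x1f && magic[1] === 0x8b
  } finally {
    fs.closeSync(fd)
  }
}

// Hypothetical helper: returns a readable stream of uncompressed bytes,
// piping through gunzip only when the file is gzipped.
function openWarcStream (path) {
  const raw = fs.createReadStream(path)
  return isGzipFile(path) ? raw.pipe(zlib.createGunzip()) : raw
}
```

The returned stream could then be piped into new WARCStreamTransform() or passed to recordIterator, as in the examples above. Note that errors on the raw file stream are not forwarded through pipe, so real code would want an 'error' handler on both streams.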
.warc
and .warc.gz
```js
const { AutoWARCParser } = require('node-warc')

const parser = new AutoWARCParser('<path-to-warcfile>')
parser.on('record', record => { console.log(record) })
parser.on('done', () => { console.log('finished') })
parser.on('error', error => { console.error(error) })
parser.start()
```
```js
const { WARCGzParser } = require('node-warc')

// WARCGzParser reads gzipped WARC files
const parser = new WARCGzParser('<path-to-gzipd-warcfile>')
parser.on('record', record => { console.log(record) })
parser.on('done', () => { console.log('finished') })
parser.on('error', error => { console.error(error) })
parser.start()
```
```js
const { WARCParser } = require('node-warc')

// WARCParser reads uncompressed WARC files
const parser = new WARCParser('<path-to-warcfile>')
parser.on('record', record => { console.log(record) })
parser.on('done', () => { console.log('finished') })
parser.on('error', error => { console.error(error) })
parser.start()
```
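The event-based parsers above report progress through 'record', 'done', and 'error' events. When a Promise is more convenient, those events can be wrapped as in the sketch below. collectRecords is a hypothetical helper, not part of node-warc, and FakeParser is a stand-in that only mimics the event interface shown in the examples so the snippet runs without a WARC file.

```js
const { EventEmitter } = require('events')

// Hypothetical helper (not part of node-warc): resolves with every
// record once the parser emits 'done', rejects on 'error'.
function collectRecords (parser) {
  return new Promise((resolve, reject) => {
    const records = []
    parser.on('record', record => { records.push(record) })
    parser.on('done', () => { resolve(records) })
    parser.on('error', reject)
    parser.start()
  })
}

// Stand-in parser used only to demonstrate the helper; a real call
// would pass e.g. new AutoWARCParser('<path-to-warcfile>') instead.
class FakeParser extends EventEmitter {
  start () {
    this.emit('record', { id: 1 })
    this.emit('record', { id: 2 })
    this.emit('done')
  }
}

collectRecords(new FakeParser()).then(records => {
  console.log(records.length) // prints 2
})
```

This keeps every record in memory, so it suits small archives; for large WARC files the streaming or async-iteration approaches are likely a better fit.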
- NODEWARC_WRITE_GZIPPED - enable writing gzipped records to WARC outputs.

```js
const CRI = require('chrome-remote-interface')
const { RemoteChromeWARCWriter, RemoteChromeCapturer } = require('node-warc')

;(async () => {
  const client = await CRI()
  await Promise.all([
    client.Page.enable(),
    client.Network.enable(),
  ])
  const cap = new RemoteChromeCapturer(client.Network)
  cap.startCapturing()
  await client.Page.navigate({ url: 'http://example.com' });
  // actual code should wait for a better stopping condition, eg. network idle
  await client.Page.loadEventFired()
  const warcGen = new RemoteChromeWARCWriter()
  await warcGen.generateWARC(cap, client.Network, {
    warcOpts: {
      warcPath: 'myWARC.warc'
    },
    winfo: {
      description: 'I created a warc!',
      isPartOf: 'My awesome pywb collection'
    }
  })
  await client.close()
})()
```
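As a rough sketch of using the NODEWARC_WRITE_GZIPPED environment variable described above, the variable can be set in the shell (for example NODEWARC_WRITE_GZIPPED=1 node script.js) or from within the process before any WARC is written. The README does not state which values are accepted, so treating any non-empty value as "enabled" is an assumption here, and warcFileName is a hypothetical helper for keeping the output file extension in step with the setting.

```js
// Assumption: any non-empty value enables gzipped output; the README
// does not list the accepted values.
process.env.NODEWARC_WRITE_GZIPPED = '1'

// Hypothetical helper (not part of node-warc): picks a file name whose
// extension matches whether gzipped output was requested.
function warcFileName (base) {
  return process.env.NODEWARC_WRITE_GZIPPED
    ? `${base}.warc.gz`
    : `${base}.warc`
}

console.log(warcFileName('myWARC'))
```

The resulting name could be passed as warcOpts.warcPath in the generateWARC examples.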
```js
const { CRIExtra, Events, Page } = require('chrome-remote-interface-extra')
const { CRIExtraWARCGenerator, CRIExtraCapturer } = require('node-warc')

;(async () => {
  let client
  try {
    // connect to endpoint
    client = await CRIExtra({ host: 'localhost', port: 9222 })
    const page = await Page.create(client)
    const cap = new CRIExtraCapturer(page, Events.Page.Request)
    cap.startCapturing()
    await page.goto('https://example.com', { waitUntil: 'networkIdle' })
    const warcGen = new CRIExtraWARCGenerator()
    await warcGen.generateWARC(cap, {
      warcOpts: {
        warcPath: 'myWARC.warc'
      },
      winfo: {
        description: 'I created a warc!',
        isPartOf: 'My awesome pywb collection'
      }
    })
  } catch (err) {
    console.error(err)
  } finally {
    if (client) {
      await client.close()
    }
  }
})()
```
```js
const puppeteer = require('puppeteer')
const { Events } = require('puppeteer')
const { PuppeteerWARCGenerator, PuppeteerCapturer } = require('node-warc')

;(async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  const cap = new PuppeteerCapturer(page, Events.Page.Request)
  cap.startCapturing()
  await page.goto('http://example.com', { waitUntil: 'networkidle0' })
  const warcGen = new PuppeteerWARCGenerator()
  await warcGen.generateWARC(cap, {
    warcOpts: {
      warcPath: 'myWARC.warc'
    },
    winfo: {
      description: 'I created a warc!',
      isPartOf: 'My awesome pywb collection'
    }
  })
  await page.close()
  await browser.close()
})()
```
The generateWARC method used in the preceding examples is a helper function that simplifies the WARC generation process. See its implementation for a full example of WARC generation using node-warc.
Or see one of the crawler implementations provided by Squidwarc.
No vulnerabilities found.
OpenSSF Scorecard results (last scanned on 2025-07-14):
- no binaries found in the repo
- license file detected
- Found 4/12 approved changesets -- score normalized to 3
- 0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
- no effort to earn an OpenSSF best practices badge detected
- security policy file not detected
- project is not fuzzed
- branch protection not enabled on development/release branches
- SAST tool is not run on all commits -- score normalized to 0
- 71 existing vulnerabilities detected
The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.