node.js cross-platform unzip using streams
```
$ npm install unzipper
```
The open methods allow random access to the underlying files of a zip archive, from disk, the web, S3, or a custom source.

The open methods return a promise on the contents of the central directory of a zip file, with the individual files listed in an array.

Each file record has the following methods, providing random access to the underlying files:

- stream([password]) - returns a stream of the unzipped content, which can be piped to any destination
- buffer([password]) - returns a promise on the buffered content of the file

If a file is encrypted, you will have to supply a password to decrypt it; otherwise the password can be left blank.
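For instance, here is a minimal sketch of reading a single entry both ways; the archive path and password are placeholders:

```js
const fs = require('fs');
const unzipper = require('unzipper');

async function readFirstEntry() {
  const directory = await unzipper.Open.file('path/to/encrypted.zip');
  const entry = directory.files[0];

  // Buffer the (decrypted) contents into memory
  const content = await entry.buffer('secret-password');
  console.log(content.toString());

  // Or stream the contents to disk
  entry.stream('secret-password').pipe(fs.createWriteStream('firstFile'));
}

readFirstEntry();
```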
Unlike adm-zip, the Open methods never read the entire zipfile into a buffer.

The last argument to the Open methods is an optional options object, where you can specify tailSize (default 80 bytes), i.e. how many bytes should be read at the end of the zipfile to locate the end-of-central-directory record. This location can vary depending on the size of the zip64 extensible data sector. Additionally, you can supply the option crx: true, which will check for a CRX header and parse the file accordingly by shifting all file offsets by the length of the CRX header.
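As an illustration, a sketch of passing the options object to Open.file; the filename and values here are arbitrary:

```js
const unzipper = require('unzipper');

async function openWithOptions() {
  const directory = await unzipper.Open.file('path/to/extension.crx', {
    tailSize: 1024, // search the last 1 KB for the end-of-central-directory record
    crx: true       // skip the CRX header and adjust all file offsets
  });
  console.log(directory.files.map(f => f.path));
}

openWithOptions();
```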
Open.file returns a Promise to the central directory information, with methods to extract individual files. The start and end options are used to avoid reading the whole file.
Here is a simple example of opening up a zip file, printing out the directory information and then extracting the first file inside the zipfile to disk:
```js
const fs = require('fs');
const unzipper = require('unzipper');

async function main() {
  const directory = await unzipper.Open.file('path/to/archive.zip');
  console.log('directory', directory);
  return new Promise((resolve, reject) => {
    directory.files[0]
      .stream()
      .pipe(fs.createWriteStream('firstFile'))
      .on('error', reject)
      .on('finish', resolve);
  });
}

main();
```
If you want to extract all files from the zip file, the directory object supplies an extract method. Here is a quick example:
```js
const unzipper = require('unzipper');

async function main() {
  const directory = await unzipper.Open.file('path/to/archive.zip');
  await directory.extract({ path: '/path/to/destination' });
}

main();
```
Open.url will return a Promise to the central directory information from a URL pointing to a zipfile. Range headers are used to avoid reading the whole file. Unzipper does not ship with a request library, so you will have to provide one as the first argument.
Live Example: (extracts a tiny xml file from the middle of a 500MB zipfile)
```js
const request = require('request');
const unzipper = require('unzipper');

async function main() {
  const directory = await unzipper.Open.url(request, 'http://www2.census.gov/geo/tiger/TIGER2015/ZCTA5/tl_2015_us_zcta510.zip');
  const file = directory.files.find(d => d.path === 'tl_2015_us_zcta510.shp.iso.xml');
  const content = await file.buffer();
  console.log(content.toString());
}

main();
```
This function takes a second parameter, which can be either a string containing the url to request or an options object to invoke the supplied request library with. This can be used when other request options are required, such as custom headers or authentication to a third-party service.
```js
const request = require('google-oauth-jwt').requestWithJWT();
const unzipper = require('unzipper');

const googleStorageOptions = {
  url: `https://www.googleapis.com/storage/v1/b/m-bucket-name/o/my-object-name`,
  qs: { alt: 'media' },
  jwt: {
    email: google.storage.credentials.client_email,
    key: google.storage.credentials.private_key,
    scopes: ['https://www.googleapis.com/auth/devstorage.read_only']
  }
};

async function getFile(req, res, next) {
  const directory = await unzipper.Open.url(request, googleStorageOptions);
  const file = directory.files.find((file) => file.path === 'my-filename');
  return file.stream().pipe(res);
}
```
Open.s3 will return a Promise to the central directory information from a zipfile on S3. Range headers are used to avoid reading the whole file. Unzipper does not ship with the aws-sdk, so you have to provide an instantiated client as the first argument. The params object requires Bucket and Key to fetch the correct file.
Example:
```js
const fs = require('fs');
const unzipper = require('unzipper');
const AWS = require('aws-sdk');
const s3Client = new AWS.S3(config);

async function main() {
  const directory = await unzipper.Open.s3(s3Client, { Bucket: 'unzipper', Key: 'archive.zip' });
  return new Promise((resolve, reject) => {
    directory.files[0]
      .stream()
      .pipe(fs.createWriteStream('firstFile'))
      .on('error', reject)
      .on('finish', resolve);
  });
}

main();
```
If you already have the zip file in-memory as a buffer, you can open the contents directly.
Example:
```js
const fs = require('fs');
const unzipper = require('unzipper');

// never use readFileSync - only used here to simplify the example
const buffer = fs.readFileSync('path/to/archive.zip');

async function main() {
  const directory = await unzipper.Open.buffer(buffer);
  console.log('directory', directory);
  // ...
}

main();
```
This function can be used to provide a custom source implementation. The source parameter expects a stream and a size function to be implemented. The size function should return a Promise that resolves to the total size of the file. The stream function should return a Readable stream according to the supplied offset and length parameters.
Example:
```js
// Custom source implementation for reading a zip file from Google Cloud Storage
const { Storage } = require('@google-cloud/storage');
const unzipper = require('unzipper');

async function main() {
  const storage = new Storage();
  const bucket = storage.bucket('my-bucket');
  const zipFile = bucket.file('my-zip-file.zip');

  const customSource = {
    stream: function(offset, length) {
      return zipFile.createReadStream({
        start: offset,
        end: length && offset + length
      });
    },
    size: async function() {
      const objMetadata = (await zipFile.getMetadata())[0];
      return objMetadata.size;
    }
  };

  const directory = await unzipper.Open.custom(customSource);
  console.log('directory', directory);
  // ...
}

main();
```
The directory object returned from Open.[method] provides an extract method, which extracts all the files to a specified path, with an optional concurrency (default: 1).
Example (with concurrency of 5):
```js
unzipper.Open.file('path/to/archive.zip')
  .then(d => d.extract({ path: '/extraction/path', concurrency: 5 }));
```
Please note: methods that use the central directory instead of parsing the entire file can be found under Open.

Chrome extension files (.crx) are zipfiles with an extra header at the start of the file. Unzipper will parse .crx files with the streaming methods (Parse and ParseOne).
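For example, a minimal sketch of listing the entries of a .crx file with the streaming parser; the filename is a placeholder:

```js
const fs = require('fs');
const unzipper = require('unzipper');

fs.createReadStream('path/to/extension.crx')
  .pipe(unzipper.Parse())
  .on('entry', entry => {
    console.log(entry.path);
    entry.autodrain(); // dispose of contents we don't consume
  });
```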
This library began as an active fork and drop-in replacement of node-unzip to address the following issues: originally, the only way to use the library was to stream the entire zip file. This method is inefficient if you are only interested in selected files from the zip file, and it can be error prone, since it relies on the local file headers, which could be wrong.
The structure of this fork is similar to the original, but it uses Promises and inherits the guarantees provided by node streams to ensure a low memory footprint, and it emits finish/close events at the end of processing. The new Parser will push any parsed entries downstream if you pipe from it, while still supporting the legacy entry event as well.
Breaking changes: the new Parser will not automatically drain entries if there are no listeners or pipes in place.
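In practice this means every entry must be consumed or disposed of, e.g. with the autodrain pattern shown in this minimal sketch (the archive path is a placeholder):

```js
const fs = require('fs');
const unzipper = require('unzipper');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  // Without this handler (or a downstream pipe) the parser would stall,
  // since entries are no longer drained automatically.
  .on('entry', entry => entry.autodrain());
```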
Unzipper provides simple APIs similar to node-tar for parsing and extracting zip files. There are no added compiled dependencies - inflation is handled by node.js's built-in zlib support.
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Extract({ path: 'output/path' }));
```
Extract emits the 'close' event once the zip's contents have been fully extracted to disk. Extract uses fstream.Writer and therefore needs an absolute path to the destination directory. This directory will be created automatically if it doesn't already exist.
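For example, a sketch of resolving the destination to an absolute path and waiting for 'close'; the paths are placeholders:

```js
const fs = require('fs');
const path = require('path');
const unzipper = require('unzipper');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Extract({ path: path.resolve('output/path') }))
  .on('close', () => console.log('extraction complete'))
  .on('error', err => console.error('extraction failed', err));
```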
Process each zip file entry, or pipe entries to another stream.

Important: if you do not intend to consume an entry stream's raw data, call autodrain() to dispose of the entry's contents; otherwise the stream will halt. .autodrain() returns an empty stream that provides error and finish events. Additionally, you can call .autodrain().promise() to get a promisified version of the success or failure of the autodrain.
```js
// If you want to handle autodrain errors you can either:
entry.autodrain().promise().catch(e => handleError(e));
// or
entry.autodrain().on('error', handleError);
```
Here is a quick example:
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    const fileName = entry.path;
    const type = entry.type; // 'Directory' or 'File'
    const size = entry.vars.uncompressedSize; // There is also compressedSize;
    if (fileName === "this IS the file I'm looking for") {
      entry.pipe(fs.createWriteStream('output/path'));
    } else {
      entry.autodrain();
    }
  });
```
and the same example using async iterators:
```js
const zip = fs.createReadStream('path/to/archive.zip').pipe(unzipper.Parse({ forceStream: true }));

for await (const entry of zip) {
  const fileName = entry.path;
  const type = entry.type; // 'Directory' or 'File'
  const size = entry.vars.uncompressedSize; // There is also compressedSize;
  if (fileName === "this IS the file I'm looking for") {
    entry.pipe(fs.createWriteStream('output/path'));
  } else {
    entry.autodrain();
  }
}
```
If you pipe from unzipper, the downstream components will receive each entry for further processing. This allows for clean pipelines transforming zipfiles into unzipped data.
Example using stream.Transform:
```js
const fs = require('fs');
const stream = require('stream');
const unzipper = require('unzipper');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(stream.Transform({
    objectMode: true,
    transform: function(entry, e, cb) {
      const fileName = entry.path;
      const type = entry.type; // 'Directory' or 'File'
      const size = entry.vars.uncompressedSize; // There is also compressedSize;
      if (fileName === "this IS the file I'm looking for") {
        entry.pipe(fs.createWriteStream('output/path'))
          .on('finish', cb);
      } else {
        entry.autodrain();
        cb();
      }
    }
  }));
```
Example using etl:
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(etl.map(entry => {
    if (entry.path == "this IS the file I'm looking for")
      return entry
        .pipe(etl.toFile('output/path'))
        .promise();
    else
      entry.autodrain();
  }));
```
unzipper.parseOne([regex]) is a convenience method that unzips only one file from the archive and pipes its contents downstream (not the entry itself). If no search criterion is specified, the first file in the archive is unzipped. Otherwise, each filename is compared to the criterion and the first one to match is unzipped and piped downstream. If no file matches, the stream ends without any content.
Example:
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.ParseOne())
  .pipe(fs.createWriteStream('firstFile.txt'));
```
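To select a specific file rather than the first one, a regular expression can be passed in; a sketch with an illustrative pattern:

```js
const fs = require('fs');
const unzipper = require('unzipper');

// Unzip the first entry whose filename ends in .txt
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.ParseOne(/\.txt$/))
  .pipe(fs.createWriteStream('firstTextFile.txt'));
```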
While the recommended strategy for consuming the unzipped contents is to use streams, it is sometimes convenient to get the full buffered contents of each file. Each entry provides a .buffer function that consumes the entry by buffering the contents into memory and returning a promise on the complete buffer.
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .pipe(etl.map(async entry => {
    if (entry.path == "this IS the file I'm looking for") {
      const content = await entry.buffer();
      await fs.promises.writeFile('output/path', content);
    } else {
      entry.autodrain();
    }
  }));
```
The parser emits finish and error events like any other stream. The parser additionally provides a promise wrapper around those two events to allow easy folding into existing Promise-based structures.
Example:
```js
fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', entry => entry.autodrain())
  .promise()
  .then(() => console.log('done'), e => console.log('error', e));
```
Archives created by legacy tools usually have filenames encoded with the IBM PC (Windows OEM) character set. You can decode such filenames with a preferred character set:
```js
const il = require('iconv-lite');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzipper.Parse())
  .on('entry', function (entry) {
    // if a legacy zip tool follows the ZIP spec then this flag will be set
    const isUnicode = entry.props.flags.isUnicode;
    // decode a "non-unicode" filename from the OEM Cyrillic character set
    const fileName = isUnicode ? entry.path : il.decode(entry.props.pathBuffer, 'cp866');
    const type = entry.type; // 'Directory' or 'File'
    const size = entry.vars.uncompressedSize; // There is also compressedSize;
    if (fileName === "Текстовый файл.txt") {
      entry.pipe(fs.createWriteStream(fileName));
    } else {
      entry.autodrain();
    }
  });
```
See LICENCE
Security advisory: versions of unzipper below 0.8.13 are affected by "Arbitrary File Write via Archive Extraction" (severity 5.5/10). The issue is patched in 0.8.13.
OpenSSF Scorecard (last scanned on 2024-11-25):

- no dangerous workflow patterns detected
- 0 existing vulnerabilities detected
- packaging workflow detected
- binaries present in source code
- license file detected
- found 3/11 approved changesets (score normalized to 2)
- 0 commits and 0 issue activity found in the last 90 days (score normalized to 0)
- detected GitHub workflow tokens with excessive permissions
- no effort to earn an OpenSSF best practices badge detected
- security policy file not detected
- dependency not pinned by hash detected (score normalized to 0)
- project is not fuzzed
- branch protection not enabled on development/release branches
- SAST tool is not run on all commits (score normalized to 0)

The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.