Installations
npm install chardet
Score
99.4
Supply Chain
100
Quality
76.2
Maintenance
100
Vulnerability
100
License
Developer
runk
Developer Guide
Module System
CommonJS, UMD
Min. Node Version
Typescript Support
No
Node Version
18.18.0
NPM Version
9.8.1
Statistics
283 Stars
199 Commits
73 Forks
8 Watching
17 Branches
11 Contributors
Updated on 25 Nov 2024
Languages
TypeScript (99.69%)
JavaScript (0.31%)
Total Downloads
- Cumulative downloads: 4,673,609,772
- Last day: 4,470,486 (-8.2% compared to previous day)
- Last week: 26,572,787 (2.1% compared to previous week)
- Last month: 106,611,228 (17.5% compared to previous month)
- Last year: 1,033,187,677 (11.2% compared to previous year)
chardet
Chardet is a character encoding detection module written in pure JavaScript (TypeScript). The module uses occurrence analysis to determine the most probable encoding.
- Packed size is only 22 KB
- Works in all environments: Node / Browser / Native
- Works on all platforms: Linux / Mac / Windows
- No dependencies
- No native code / bindings
- 100% written in TypeScript
- Extensive code coverage
Installation
npm i chardet
Usage
To return the encoding with the highest confidence:
    import chardet from 'chardet';

    // Detect from bytes already in memory
    const fromBuffer = chardet.detect(Buffer.from('hello there!'));
    // or detect from a file, asynchronously
    const fromFile = await chardet.detectFile('/path/to/file');
    // or detect from a file, synchronously
    const fromFileSync = chardet.detectFileSync('/path/to/file');
To return the full list of possible encodings, use the analyse method.

    import chardet from 'chardet';
    chardet.analyse(Buffer.from('hello there!'));
The returned value is an array of objects sorted by confidence value in descending order:

    [
      { confidence: 90, name: 'UTF-8' },
      { confidence: 20, name: 'windows-1252', lang: 'fr' },
    ];
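Because the array is sorted by confidence in descending order, the first entry is the best guess. The sketch below shows one way a caller might apply a minimum confidence threshold before trusting that guess. The `EncodingMatch` interface and the `pickEncoding` helper are hypothetical names introduced for illustration; they are not part of chardet, and the interface simply mirrors the fields shown in the example result.

```typescript
// Shape of one result entry, mirroring the fields in the example above.
// This is an illustrative type, not one exported by chardet.
interface EncodingMatch {
  confidence: number;
  name: string;
  lang?: string;
}

// Hypothetical helper: returns the top-ranked encoding name if it meets
// the threshold, otherwise the supplied fallback.
function pickEncoding(
  matches: EncodingMatch[],
  minConfidence: number,
  fallback: string | null = null,
): string | null {
  // The list is already sorted by confidence (descending),
  // so only the first entry needs to be inspected.
  const best = matches[0];
  return best !== undefined && best.confidence >= minConfidence
    ? best.name
    : fallback;
}

const exampleMatches: EncodingMatch[] = [
  { confidence: 90, name: 'UTF-8' },
  { confidence: 20, name: 'windows-1252', lang: 'fr' },
];

console.log(pickEncoding(exampleMatches, 50)); // 'UTF-8'
console.log(pickEncoding(exampleMatches, 95)); // null
```

A suitable threshold depends on the data; the values 50 and 95 here are arbitrary.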
In the browser, you can use a Uint8Array instead of a Buffer:

    import chardet from 'chardet';
    chardet.analyse(new Uint8Array([0x68, 0x65, 0x6c, 0x6c, 0x6f]));
Working with large data sets
When the data set is large and you want to trade some accuracy for performance, you can sample only the first N bytes of the buffer:

    const encoding = await chardet.detectFile('/path/to/file', { sampleSize: 32 });
You can also specify where to begin reading from in the buffer:
    const encoding = await chardet.detectFile('/path/to/file', {
      sampleSize: 32,
      offset: 128,
    });
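The examples above pass the sampling options to the file-based helper. For bytes that are already in memory, a similar window can be taken by hand before running detection. The following is a minimal sketch of that idea; `sampleBytes` and `SampleOptions` are hypothetical names, not chardet APIs, and the option names are borrowed from the examples above only for familiarity.

```typescript
// Illustrative option bag; the field names echo the detectFile examples.
interface SampleOptions {
  sampleSize?: number;
  offset?: number;
}

// Hypothetical helper: returns a view over at most `sampleSize` bytes
// starting at `offset`. subarray() shares memory with the input, so no
// bytes are copied.
function sampleBytes(
  bytes: Uint8Array,
  { sampleSize, offset = 0 }: SampleOptions = {},
): Uint8Array {
  // Clamp the start position into the valid range of the input.
  const start = Math.min(Math.max(offset, 0), bytes.length);
  // Without a sampleSize, read through to the end of the input.
  const end =
    sampleSize === undefined
      ? bytes.length
      : Math.min(start + sampleSize, bytes.length);
  return bytes.subarray(start, end);
}

// 1024 bytes whose values repeat 0..255, so the window is easy to check.
const payload = new Uint8Array(1024).map((_, i) => i % 256);
const sampled = sampleBytes(payload, { sampleSize: 32, offset: 128 });

console.log(sampled.length); // 32
console.log(sampled[0]); // 128
```

The resulting view can then be handed to chardet.detect() or chardet.analyse() in place of the full buffer. As with the file-based options, a smaller window means less evidence and therefore a less reliable guess.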
Working with strings
In both Node.js and browsers, all strings in memory are represented in UTF-16 encoding. This is a fundamental aspect of the JavaScript language specification. Therefore, you cannot use plain strings directly as input for chardet.analyse() or chardet.detect(). Instead, you need the original string data in the form of a Buffer or Uint8Array.
In other words, if you receive a piece of data over the network and want to detect its encoding, use the original data payload, not its string representation. By the time you convert data to a string, it will be in UTF-16 encoding.
Note on TextEncoder: It always returns a UTF-8 encoded buffer, which means the buffer will not be in the original encoding of the string.
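The point about strings and TextEncoder can be seen in a small round trip that uses only the standard TextDecoder and TextEncoder globals. In this sketch the sample bytes are an assumption chosen for illustration: the word "café" as it would be stored in windows-1252, where "é" is the single byte 0xE9.

```typescript
// "café" as windows-1252 bytes; 'é' is the single byte 0xE9.
const originalBytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);

// Decoding yields a JavaScript string. The original byte layout is no
// longer recoverable from the string alone.
const decodedText = new TextDecoder('windows-1252').decode(originalBytes);

// TextEncoder only produces UTF-8, so 'é' becomes two bytes (0xC3 0xA9).
const reencodedBytes = new TextEncoder().encode(decodedText);

console.log(Array.from(originalBytes)); // [99, 97, 102, 233]
console.log(Array.from(reencodedBytes)); // [99, 97, 102, 195, 169]
```

Running detection on the re-encoded bytes would describe the UTF-8 copy, not the windows-1252 payload that was originally received, which is why the original bytes should be kept for detection.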
Supported Encodings:
- UTF-8
- UTF-16 LE
- UTF-16 BE
- UTF-32 LE
- UTF-32 BE
- ISO-2022-JP
- ISO-2022-KR
- ISO-2022-CN
- Shift_JIS
- Big5
- EUC-JP
- EUC-KR
- GB18030
- ISO-8859-1
- ISO-8859-2
- ISO-8859-5
- ISO-8859-6
- ISO-8859-7
- ISO-8859-8
- ISO-8859-9
- windows-1250
- windows-1251
- windows-1252
- windows-1253
- windows-1254
- windows-1255
- windows-1256
- KOI8-R
Currently only these encodings are supported.
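A detected name is often used next to decode the bytes. The sketch below assumes the standard TextDecoder is used for that step, which is a choice made for illustration and not something chardet prescribes. TextDecoder follows the WHATWG Encoding Standard, which does not define every name in the list above (UTF-32, for example), so the hypothetical `tryDecode` helper returns null when the runtime has no decoder for the given name.

```typescript
// Hypothetical helper (not part of chardet): decodes bytes using the
// given encoding name, or returns null if no decoder is available.
function tryDecode(bytes: Uint8Array, encodingName: string): string | null {
  try {
    return new TextDecoder(encodingName).decode(bytes);
  } catch {
    // The TextDecoder constructor throws a RangeError for labels
    // it does not recognise.
    return null;
  }
}

// "café" as windows-1252 bytes, used here as sample input.
const cafeBytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);

console.log(tryDecode(cafeBytes, 'windows-1252')); // 'café'
console.log(tryDecode(cafeBytes, 'UTF-32 LE')); // null
```

Encodings without a built-in decoder need a separate decoding library; which one to use is outside the scope of this module.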
TypeScript?
Yes. Type definitions are included.
References
- ICU project http://site.icu-project.org/
No vulnerabilities found.
Reason
no dangerous workflow patterns detected
Reason
no binaries found in the repo
Reason
0 existing vulnerabilities detected
Reason
license file detected
Details
- Info: project has a license file: LICENSE:0
- Info: FSF or OSI recognized license: MIT License: LICENSE:0
Reason
packaging workflow detected
Details
- Info: Project packages its releases by way of GitHub Actions.: .github/workflows/release.yml:7
Reason
2 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 1
Reason
Found 2/25 approved changesets -- score normalized to 0
Reason
detected GitHub workflow tokens with excessive permissions
Details
- Warn: no topLevel permission defined: .github/workflows/build.yml:1
- Warn: no topLevel permission defined: .github/workflows/release.yml:1
- Info: no jobLevel write permissions found
Reason
dependency not pinned by hash detected -- score normalized to 0
Details
- Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build.yml:19: update your workflow using https://app.stepsecurity.io/secureworkflow/runk/node-chardet/build.yml/master?enable=pin
- Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/runk/node-chardet/build.yml/master?enable=pin
- Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/release.yml:12: update your workflow using https://app.stepsecurity.io/secureworkflow/runk/node-chardet/release.yml/master?enable=pin
- Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/release.yml:16: update your workflow using https://app.stepsecurity.io/secureworkflow/runk/node-chardet/release.yml/master?enable=pin
- Warn: npmCommand not pinned by hash: .github/workflows/build.yml:27
- Warn: npmCommand not pinned by hash: .github/workflows/release.yml:21
- Info: 0 out of 4 GitHub-owned GitHubAction dependencies pinned
- Info: 0 out of 2 npmCommand dependencies pinned
Reason
no effort to earn an OpenSSF best practices badge detected
Reason
project is not fuzzed
Details
- Warn: no fuzzer integrations found
Reason
security policy file not detected
Details
- Warn: no security policy file detected
- Warn: no security file to analyze
Reason
SAST tool is not run on all commits -- score normalized to 0
Details
- Warn: 0 commits out of 28 are checked with a SAST tool
Score
4.3/10
Last Scanned on 2024-11-25
The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.
Other packages similar to chardet
@types/chardet
TypeScript definitions for chardet
jschardet
Character encoding auto-detection in JavaScript (port of python's chardet)
@pypi/chardet
Chardet: The Universal Character Encoding Detector
bemuse-chardet
Fork of chardet for use in bemuse