Gathering detailed insights and metrics for node-html-markdown
npm install node-html-markdown
Typescript
Module System
Min. Node Version
56.6
Supply Chain
100
Quality
75.8
Maintenance
100
Vulnerability
100
License
HTML (99.08%)
TypeScript (0.81%)
JavaScript (0.11%)
Total Downloads
6,063,380
Last Day
12,950
Last Week
114,574
Last Month
488,062
Last Year
4,140,606
184 Stars
126 Commits
29 Forks
3 Watching
1 Branches
7 Contributors
Latest Version
1.3.0
Package Id
node-html-markdown@1.3.0
Unpacked Size
102.33 kB
Size
25.88 kB
File Count
29
Last day: 12,950 (-42.5% compared to previous day)
Last week: 114,574 (-8.7% compared to previous week)
Last month: 488,062 (9.8% compared to previous month)
Last year: 4,140,606 (164.4% compared to previous year)
NHM is a fast HTML to markdown converter, compatible with both node and the browser.
It was built with the following two goals in mind:
We had a need to convert gigabytes of HTML daily very quickly. All libraries we found were too slow with node. We considered using a low-level language but decided to attempt to write something that would squeeze every bit of performance out of the JIT that we could. The end result was fast enough to make the cut!
The other libraries we tested produced output that would break in numerous conditions and produced output with many repeating linefeeds, etc. Generally speaking, outside of a markdown viewer, the result was not easy to read.
We took the approach of producing a clean, concise result with consistent spacing rules.
<yarn|npm|pnpm> add node-html-markdown
-----------------------------------------------------------------------------
Estimated processing times (fastest to slowest):
[node-html-markdown (reused instance)]
100 kB: 17ms
1 MB: 176ms
50 MB: 8.80sec
1 GB: 3min, 0sec
50 GB: 2hr, 30min, 14sec
[turndown (reused instance)]
100 kB: 27ms
1 MB: 280ms
50 MB: 13.98sec
1 GB: 4min, 46sec
50 GB: 3hr, 58min, 35sec
-----------------------------------------------------------------------------
Speed comparison - node-html-markdown (reused instance) is:
1.02 times as fast as node-html-markdown
1.57 times as fast as turndown
1.59 times as fast as turndown (reused instance)
-----------------------------------------------------------------------------
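The ratios and larger-size estimates in the benchmark output can be cross-checked from the 1 MB timings. The sketch below is only an illustration of that arithmetic: the helper names (`speedRatio`, `estimateMs`, `formatMinSec`) are hypothetical and not part of the benchmark script, and it assumes processing time scales linearly with input size.

```typescript
// Timings copied from the benchmark output above (milliseconds per 1 MB, reused instances).
const msPerMb = {
  nodeHtmlMarkdown: 176,
  turndown: 280,
};

// How many times faster one converter is, given the time each takes for the same input.
function speedRatio(fasterMs: number, slowerMs: number): number {
  return slowerMs / fasterMs;
}

// Linear extrapolation (an assumption): time grows proportionally with input size.
function estimateMs(perMbMs: number, sizeMb: number): number {
  return perMbMs * sizeMb;
}

// Format milliseconds as whole minutes and seconds, e.g. "3min, 0sec".
function formatMinSec(ms: number): string {
  const totalSec = Math.floor(ms / 1000);
  return `${Math.floor(totalSec / 60)}min, ${totalSec % 60}sec`;
}

const ratio = speedRatio(msPerMb.nodeHtmlMarkdown, msPerMb.turndown);
console.log(ratio.toFixed(2)); // "1.59"
console.log(formatMinSec(estimateMs(msPerMb.nodeHtmlMarkdown, 1024))); // "3min, 0sec"
console.log(formatMinSec(estimateMs(msPerMb.turndown, 1024))); // "4min, 46sec"
```

Both 1 GB figures agree with the table, which suggests the listed estimates are themselves linear extrapolations from a measured rate.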
import { NodeHtmlMarkdown, NodeHtmlMarkdownOptions } from 'node-html-markdown'


/* ********************************************************* *
 * Single use
 * If using it once, you can use the static method
 * ********************************************************* */

// Single file
NodeHtmlMarkdown.translate(
  /* html */ `<b>hello</b>`,
  /* options (optional) */ {},
  /* customTranslators (optional) */ undefined,
  /* customCodeBlockTranslators (optional) */ undefined
);

// Multiple files
NodeHtmlMarkdown.translate(
  /* FileCollection */ {
    'file1.html': `<b>hello</b>`,
    'file2.html': `<b>goodbye</b>`
  },
  /* options (optional) */ {},
  /* customTranslators (optional) */ undefined,
  /* customCodeBlockTranslators (optional) */ undefined
);


/* ********************************************************* *
 * Re-use
 * If using it several times, creating an instance saves time
 * ********************************************************* */

const nhm = new NodeHtmlMarkdown(
  /* options (optional) */ {},
  /* customTranslators (optional) */ undefined,
  /* customCodeBlockTranslators (optional) */ undefined
);

// Single file
nhm.translate(/* html */ `<b>hello</b>`);

// Multiple files
nhm.translate(
  /* FileCollection */ {
    'file1.html': `<b>hello</b>`,
    'file2.html': `<b>goodbye</b>`
  },
);
export interface NodeHtmlMarkdownOptions {
  /**
   * Use native window DOMParser when available
   * @default false
   */
  preferNativeParser: boolean,

  /**
   * Code block fence
   * @default ```
   */
  codeFence: string,

  /**
   * Bullet marker
   * @default *
   */
  bulletMarker: string,

  /**
   * Style for code block
   * @default fenced
   */
  codeBlockStyle: 'indented' | 'fenced',

  /**
   * Emphasis delimiter
   * @default _
   */
  emDelimiter: string,

  /**
   * Strong delimiter
   * @default **
   */
  strongDelimiter: string,

  /**
   * Strikethrough delimiter
   * @default ~~
   */
  strikeDelimiter: string,

  /**
   * Supplied elements will be ignored (ignores inner text; does not parse children)
   */
  ignore?: string[],

  /**
   * Supplied elements will be treated as blocks (surrounded with blank lines)
   */
  blockElements?: string[],

  /**
   * Max consecutive new lines allowed
   * @default 3
   */
  maxConsecutiveNewlines: number,

  /**
   * Line Start Escape pattern
   * (Note: Setting this will override the default escape settings, you might want to use textReplace option instead)
   */
  lineStartEscape: [ pattern: RegExp, replacement: string ]

  /**
   * Global escape pattern
   * (Note: Setting this will override the default escape settings, you might want to use textReplace option instead)
   */
  globalEscape: [ pattern: RegExp, replacement: string ]

  /**
   * User-defined text replacement pattern (Replaces matching text retrieved from nodes)
   */
  textReplace?: [ pattern: RegExp, replacement: string ][]

  /**
   * Keep images with data: URI (Note: These can be up to 1MB each)
   * @example
   * <img src="data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSK......0o/">
   * @default false
   */
  keepDataImages?: boolean

  /**
   * Place URLs at the bottom and format links using link reference definitions
   *
   * @example
   * Click <a href="/url1">here</a>. Or <a href="/url2">here</a>. Or <a href="/url1">this link</a>.
   *
   * Becomes:
   * Click [here][1]. Or [here][2]. Or [this link][1].
   *
   * [1]: /url1
   * [2]: /url2
   */
  useLinkReferenceDefinitions?: boolean

  /**
   * Wrap URL text in < > instead of []() syntax.
   *
   * @example
   * The input <a href="https://google.com">https://google.com</a>
   * becomes <https://google.com>
   * instead of [https://google.com](https://google.com)
   *
   * @default true
   */
  useInlineLinks?: boolean
}
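To make the numbering in the useLinkReferenceDefinitions example concrete, here is a minimal standalone sketch of the idea: each distinct URL receives one number in first-seen order, and repeated URLs reuse it. The `Link` type and `toReferenceLinks` function are hypothetical names for illustration only and do not reflect how the library implements the option.

```typescript
// Hypothetical input shape: the link text and its target URL.
interface Link {
  text: string;
  href: string;
}

interface ReferenceOutput {
  inline: string[];      // e.g. "[here][1]"
  definitions: string[]; // e.g. "[1]: /url1"
}

// Assign one reference number per distinct URL, in first-seen order.
function toReferenceLinks(links: Link[]): ReferenceOutput {
  const hrefs: string[] = []; // unique URLs, index + 1 is the reference number
  const inline = links.map(({ text, href }) => {
    let index = hrefs.indexOf(href);
    if (index === -1) {
      index = hrefs.length;
      hrefs.push(href);
    }
    return `[${text}][${index + 1}]`;
  });
  const definitions = hrefs.map((href, i) => `[${i + 1}]: ${href}`);
  return { inline, definitions };
}

const refs = toReferenceLinks([
  { text: "here", href: "/url1" },
  { text: "here", href: "/url2" },
  { text: "this link", href: "/url1" },
]);
console.log(refs.inline.join(" "));      // [here][1] [here][2] [this link][1]
console.log(refs.definitions.join("\n")); // [1]: /url1 then [2]: /url2
```

Because `/url1` appears twice, both of its links share reference 1 and only two definitions are emitted.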
Custom translators are an advanced option to allow handling certain elements a specific way.
These can be modified via the NodeHtmlMarkdown#translators property, or added during creation.
For detail on how to use them, see TranslatorConfig and defaultTranslators.
The NodeHtmlMarkdown#codeBlockTranslators property is a collection of translators which handles elements within a <pre><code> block.
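As a rough mental model of a per-element translator table, the sketch below maps tag names to a prefix, a postfix, or an ignore flag, and lets a caller override individual entries. Everything here (`SketchNode`, `SketchTranslator`, `render`) is a hypothetical simplification for illustration; it is not the library's TranslatorConfig shape, so consult TranslatorConfig and defaultTranslators for the real fields.

```typescript
// Hypothetical, simplified shapes; NOT the library's TranslatorConfig.
interface SketchNode {
  tag?: string;            // undefined for text nodes
  text?: string;           // set only on text nodes
  children?: SketchNode[];
}

interface SketchTranslator {
  prefix?: string;
  postfix?: string;
  ignore?: boolean;        // drop the element and everything inside it
}

type SketchTable = Record<string, SketchTranslator>;

const sketchDefaults: SketchTable = {
  b: { prefix: "**", postfix: "**" },
  em: { prefix: "_", postfix: "_" },
  script: { ignore: true },
};

// Walk the tree, wrapping each element's rendered children per its table entry.
function render(node: SketchNode, table: SketchTable): string {
  if (node.tag === undefined) return node.text ?? "";
  const t = table[node.tag] ?? {}; // unknown tags pass their content through
  if (t.ignore) return "";
  const inner = (node.children ?? []).map((c) => render(c, table)).join("");
  return (t.prefix ?? "") + inner + (t.postfix ?? "");
}

const tree: SketchNode = {
  tag: "p",
  children: [
    { text: "say " },
    { tag: "b", children: [{ text: "hello" }] },
    { tag: "script", children: [{ text: "x()" }] },
  ],
};

console.log(render(tree, sketchDefaults)); // say **hello**

// A custom table overrides only the entries it names.
const custom: SketchTable = { ...sketchDefaults, b: { prefix: "__", postfix: "__" } };
console.log(render(tree, custom)); // say __hello__
```

A separate table applied only inside code blocks would follow the same pattern, which is the role the text above gives to codeBlockTranslators.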
Being a performance-centric library, we're always interested in further improvements. There are several probable routes by which we could gain substantial performance increases over the current model.
Such methods include:
These would be fun to implement; however, for the time being, the present library is fast enough for my purposes. That said, I welcome discussion and any PR toward the effort of further improving performance, and I may ultimately do more work in that capacity in the future!
Looking to contribute? Check out our help wanted list for a good place to start!
No vulnerabilities found.
no binaries found in the repo
no dangerous workflow patterns detected
Found 4/25 approved changesets -- score normalized to 1
0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
detected GitHub workflow tokens with excessive permissions
no effort to earn an OpenSSF best practices badge detected
dependency not pinned by hash detected -- score normalized to 0
license file not detected
project is not fuzzed
security policy file not detected
branch protection not enabled on development/release branches
SAST tool is not run on all commits -- score normalized to 0
19 existing vulnerabilities detected
Last Scanned on 2024-12-16
The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.