Gathering detailed insights and metrics for hyntax
npm install hyntax
Supply Chain: 86.4
Quality: 100
Maintenance: 76
Vulnerability: 100
License: 100
Languages: JavaScript (98.8%), HTML (1.14%), Shell (0.06%)
Total Downloads: 1,310,050
139 Stars · 191 Commits · 8 Forks · 8 Watching · 18 Branches · 2 Contributors
Latest Version: 1.1.9
Package Id: hyntax@1.1.9
Size: 735.93 kB
NPM Version: 6.14.5
Node Version: 12.18.2
Published On: 06 Sept 2020
Downloads (compared to the previous period):
Last day: 687 (-34.5%)
Last week: 4,167 (-20%)
Last month: 21,919 (-17.4%)
Last year: 271,957 (+34.8%)
Straightforward HTML parser for JavaScript. Live Demo.
npm install hyntax
const { tokenize, constructTree } = require('hyntax')
const util = require('util')

const inputHTML = `
<html>
  <body>
    <input type="text" placeholder="Don't type">
    <button>Don't press</button>
  </body>
</html>
`

const { tokens } = tokenize(inputHTML)
const { ast } = constructTree(tokens)

console.log(JSON.stringify(tokens, null, 2))
console.log(util.inspect(ast, { showHidden: false, depth: null }))
Hyntax is written in JavaScript but ships with integrated TypeScript typings to help you navigate its data structures. There is also a Types Reference which covers the most common types.
Use the StreamTokenizer and StreamTreeConstructor classes to parse HTML chunk by chunk while it's still being loaded from the network or read from disk.
const { StreamTokenizer, StreamTreeConstructor } = require('hyntax')
const http = require('http')
const util = require('util')

http.get('http://info.cern.ch', (res) => {
  const streamTokenizer = new StreamTokenizer()
  const streamTreeConstructor = new StreamTreeConstructor()

  let resultTokens = []
  let resultAst

  res.pipe(streamTokenizer).pipe(streamTreeConstructor)

  streamTokenizer
    .on('data', (tokens) => {
      resultTokens = resultTokens.concat(tokens)
    })
    .on('end', () => {
      console.log(JSON.stringify(resultTokens, null, 2))
    })

  streamTreeConstructor
    .on('data', (ast) => {
      resultAst = ast
    })
    .on('end', () => {
      console.log(util.inspect(resultAst, { showHidden: false, depth: null }))
    })
}).on('error', (err) => {
  throw err
})
Here are all the kinds of tokens Hyntax will extract from an HTML string.
Each token conforms to the Tokenizer.Token interface.
The resulting syntax tree will have at least one top-level Document node, with optional child nodes nested within.
{
  nodeType: TreeConstructor.NodeTypes.Document,
  content: {
    children: [
      {
        nodeType: TreeConstructor.NodeTypes.AnyNodeType,
        content: {…}
      },
      {
        nodeType: TreeConstructor.NodeTypes.AnyNodeType,
        content: {…}
      }
    ]
  }
}
The content of each node is specific to the node's type; all of them are described in the AST Node Types reference.
Hyntax has its tokenizer as a separate module. You can use the generated tokens on their own or pass them on to the tree constructor to build an AST.
tokenize(html: String): Tokenizer.Result
html
After you've got an array of tokens, you can pass them into the tree constructor to build an AST.
constructTree(tokens: Tokenizer.AnyToken[]): TreeConstructor.Result
tokens
interface Result {
  state: Tokenizer.State
  tokens: Tokenizer.AnyToken[]
}
state
tokens
interface Result {
  state: State
  ast: AST
}
state
The current state of the tree constructor. It can be persisted and passed to the next tree constructor call when tokens arrive in chunks.
ast
Resulting AST.
Type: TreeConstructor.AST
Generic Token; other interfaces use it to create specific Token types.
interface Token<T extends TokenTypes.AnyTokenType> {
  type: T
  content: string
  startPosition: number
  endPosition: number
}
type
One of the Token types.
content
Piece of the original HTML string which was recognized as a token.
startPosition
Index of a character in the input HTML string where the token starts.
endPosition
Index of a character in the input HTML string where the token ends.
Shortcut type covering all possible token types.
type AnyTokenType =
  | Text
  | OpenTagStart
  | AttributeKey
  | AttributeAssigment
  | AttributeValueWrapperStart
  | AttributeValue
  | AttributeValueWrapperEnd
  | OpenTagEnd
  | CloseTag
  | OpenTagStartScript
  | ScriptTagContent
  | OpenTagEndScript
  | CloseTagScript
  | OpenTagStartStyle
  | StyleTagContent
  | OpenTagEndStyle
  | CloseTagStyle
  | DoctypeStart
  | DoctypeEnd
  | DoctypeAttributeWrapperStart
  | DoctypeAttribute
  | DoctypeAttributeWrapperEnd
  | CommentStart
  | CommentContent
  | CommentEnd
Shortcut to reference any possible token.
type AnyToken = Token<TokenTypes.AnyTokenType>
Just an alias to DocumentNode. The AST always has one top-level DocumentNode. See AST Node Types.
type AST = TreeConstructor.DocumentNode
There are 7 possible types of Node, each with its own specific content.
type DocumentNode = Node<NodeTypes.Document, NodeContents.Document>
type DoctypeNode = Node<NodeTypes.Doctype, NodeContents.Doctype>
type TextNode = Node<NodeTypes.Text, NodeContents.Text>
type TagNode = Node<NodeTypes.Tag, NodeContents.Tag>
type CommentNode = Node<NodeTypes.Comment, NodeContents.Comment>
type ScriptNode = Node<NodeTypes.Script, NodeContents.Script>
type StyleNode = Node<NodeTypes.Style, NodeContents.Style>
Interfaces for each content type:
Generic Node; other interfaces use it to create specific Nodes by providing the type of the Node and the type of the content inside it.
interface Node<T extends NodeTypes.AnyNodeType, C extends NodeContents.AnyNodeContent> {
  nodeType: T
  content: C
}
Shortcut type of all possible Node types.
type AnyNodeType =
  | Document
  | Doctype
  | Tag
  | Text
  | Comment
  | Script
  | Style
Shortcut type of all possible types of content inside a Node.
type AnyNodeContent =
  | Document
  | Doctype
  | Text
  | Tag
  | Comment
  | Script
  | Style
interface Document {
  children: AnyNode[]
}

interface Doctype {
  start: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeStart>
  attributes?: DoctypeAttribute[]
  end: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeEnd>
}

interface Text {
  value: Tokenizer.Token<Tokenizer.TokenTypes.Text>
}

interface Tag {
  name: string
  selfClosing: boolean
  openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStart>
  attributes?: TagAttribute[]
  openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEnd>
  children?: AnyNode[]
  close?: Tokenizer.Token<Tokenizer.TokenTypes.CloseTag>
}

interface Comment {
  start: Tokenizer.Token<Tokenizer.TokenTypes.CommentStart>
  value: Tokenizer.Token<Tokenizer.TokenTypes.CommentContent>
  end: Tokenizer.Token<Tokenizer.TokenTypes.CommentEnd>
}

interface Script {
  openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartScript>
  attributes?: TagAttribute[]
  openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndScript>
  value: Tokenizer.Token<Tokenizer.TokenTypes.ScriptTagContent>
  close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagScript>
}

interface Style {
  openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartStyle>
  attributes?: TagAttribute[]
  openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndStyle>
  value: Tokenizer.Token<Tokenizer.TokenTypes.StyleTagContent>
  close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagStyle>
}

interface DoctypeAttribute {
  startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperStart>
  value: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttribute>
  endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperEnd>
}

interface TagAttribute {
  key?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeKey>
  startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperStart>
  value?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValue>
  endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperEnd>
}
No vulnerabilities found.
OpenSSF Scorecard results:
- no dangerous workflow patterns detected
- no binaries found in the repo
- license file detected
- dependency not pinned by hash detected -- score normalized to 3
- 0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
- Found 0/10 approved changesets -- score normalized to 0
- detected GitHub workflow tokens with excessive permissions
- no effort to earn an OpenSSF best practices badge detected
- security policy file not detected
- project is not fuzzed
- branch protection not enabled on development/release branches
- SAST tool is not run on all commits -- score normalized to 0
- 16 existing vulnerabilities detected
Last Scanned on 2025-01-27
The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.