Gathering detailed insights and metrics for hyntax
npm install hyntax
Supply Chain: 86.4
Quality: 100
Maintenance: 76
Vulnerability: 100
License: 100
Languages: JavaScript (98.8%), HTML (1.14%), Shell (0.06%)
Total Downloads: 1,310,050
139 Stars · 191 Commits · 8 Forks · 8 Watching · 18 Branches · 2 Contributors
Latest Version: 1.1.9
Package Id: hyntax@1.1.9
Size: 735.93 kB
NPM Version: 6.14.5
Node Version: 12.18.2
Published On: 06 Sept 2020
Downloads (compared to the previous period):
Last day: 687 (-34.5%)
Last week: 4,167 (-20%)
Last month: 21,919 (-17.4%)
Last year: 271,957 (+34.8%)
Straightforward HTML parser for JavaScript. Live Demo.
npm install hyntax
const { tokenize, constructTree } = require('hyntax')
const util = require('util')

const inputHTML = `
<html>
  <body>
    <input type="text" placeholder="Don't type">
    <button>Don't press</button>
  </body>
</html>
`

const { tokens } = tokenize(inputHTML)
const { ast } = constructTree(tokens)

console.log(JSON.stringify(tokens, null, 2))
console.log(util.inspect(ast, { showHidden: false, depth: null }))
Hyntax is written in JavaScript but ships with integrated TypeScript typings to help you navigate its data structures. There is also a Types Reference which covers the most common types.
Use the StreamTokenizer and StreamTreeConstructor classes to parse HTML chunk by chunk while it's still being loaded from the network or read from disk.
const { StreamTokenizer, StreamTreeConstructor } = require('hyntax')
const http = require('http')
const util = require('util')

http.get('http://info.cern.ch', (res) => {
  const streamTokenizer = new StreamTokenizer()
  const streamTreeConstructor = new StreamTreeConstructor()

  let resultTokens = []
  let resultAst

  res.pipe(streamTokenizer).pipe(streamTreeConstructor)

  streamTokenizer
    .on('data', (tokens) => {
      resultTokens = resultTokens.concat(tokens)
    })
    .on('end', () => {
      console.log(JSON.stringify(resultTokens, null, 2))
    })

  streamTreeConstructor
    .on('data', (ast) => {
      resultAst = ast
    })
    .on('end', () => {
      console.log(util.inspect(resultAst, { showHidden: false, depth: null }))
    })
}).on('error', (err) => {
  throw err
})
Here are all the kinds of tokens Hyntax will extract from an HTML string.
Each token conforms to the Tokenizer.Token interface.
The resulting syntax tree will have at least one top-level Document node, with optional child nodes nested within.
{
  nodeType: TreeConstructor.NodeTypes.Document,
  content: {
    children: [
      {
        nodeType: TreeConstructor.NodeTypes.AnyNodeType,
        content: {…}
      },
      {
        nodeType: TreeConstructor.NodeTypes.AnyNodeType,
        content: {…}
      }
    ]
  }
}
The content of each node is specific to the node's type; all of them are described in the AST Node Types reference.
Hyntax has its tokenizer as a separate module. You can use the generated tokens on their own or pass them on to the tree constructor to build an AST.
tokenize(html: String): Tokenizer.Result
html
After you've got an array of tokens, you can pass them into the tree constructor to build an AST.
constructTree(tokens: Tokenizer.AnyToken[]): TreeConstructor.Result
tokens
interface Result {
  state: Tokenizer.State
  tokens: Tokenizer.AnyToken[]
}
state
tokens
interface Result {
  state: State
  ast: AST
}
state
The current state of the tree constructor. It can be persisted and passed to the next tree constructor call when tokens arrive in chunks.
ast
Resulting AST.
Type: TreeConstructor.AST
Generic Token; other interfaces use it to create specific Token types.
interface Token<T extends TokenTypes.AnyTokenType> {
  type: T
  content: string
  startPosition: number
  endPosition: number
}
type
One of the Token types.
content
Piece of the original HTML string which was recognized as a token.
startPosition
Index of a character in the input HTML string where the token starts.
endPosition
Index of a character in the input HTML string where the token ends.
Shortcut type covering all possible token types.
type AnyTokenType =
  | Text
  | OpenTagStart
  | AttributeKey
  | AttributeAssigment
  | AttributeValueWrapperStart
  | AttributeValue
  | AttributeValueWrapperEnd
  | OpenTagEnd
  | CloseTag
  | OpenTagStartScript
  | ScriptTagContent
  | OpenTagEndScript
  | CloseTagScript
  | OpenTagStartStyle
  | StyleTagContent
  | OpenTagEndStyle
  | CloseTagStyle
  | DoctypeStart
  | DoctypeEnd
  | DoctypeAttributeWrapperStart
  | DoctypeAttribute
  | DoctypeAttributeWrapperEnd
  | CommentStart
  | CommentContent
  | CommentEnd
Shortcut to reference any possible token.
type AnyToken = Token<TokenTypes.AnyTokenType>
Just an alias to DocumentNode. The AST always has one top-level DocumentNode. See AST Node Types.
type AST = TreeConstructor.DocumentNode
There are 7 possible types of Node, each with its own specific content.
type DocumentNode = Node<NodeTypes.Document, NodeContents.Document>
type DoctypeNode = Node<NodeTypes.Doctype, NodeContents.Doctype>
type TextNode = Node<NodeTypes.Text, NodeContents.Text>
type TagNode = Node<NodeTypes.Tag, NodeContents.Tag>
type CommentNode = Node<NodeTypes.Comment, NodeContents.Comment>
type ScriptNode = Node<NodeTypes.Script, NodeContents.Script>
type StyleNode = Node<NodeTypes.Style, NodeContents.Style>
Interfaces for each content type:
Generic Node; other interfaces use it to create specific Nodes by providing the type of the Node and the type of the content inside it.
interface Node<T extends NodeTypes.AnyNodeType, C extends NodeContents.AnyNodeContent> {
  nodeType: T
  content: C
}
Shortcut type of all possible Node types.
type AnyNodeType =
  | Document
  | Doctype
  | Tag
  | Text
  | Comment
  | Script
  | Style
Shortcut type of all possible types of content inside a Node.
type AnyNodeContent =
  | Document
  | Doctype
  | Text
  | Tag
  | Comment
  | Script
  | Style
interface Document {
  children: AnyNode[]
}

interface Doctype {
  start: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeStart>
  attributes?: DoctypeAttribute[]
  end: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeEnd>
}

interface Text {
  value: Tokenizer.Token<Tokenizer.TokenTypes.Text>
}

interface Tag {
  name: string
  selfClosing: boolean
  openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStart>
  attributes?: TagAttribute[]
  openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEnd>
  children?: AnyNode[]
  close?: Tokenizer.Token<Tokenizer.TokenTypes.CloseTag>
}

interface Comment {
  start: Tokenizer.Token<Tokenizer.TokenTypes.CommentStart>
  value: Tokenizer.Token<Tokenizer.TokenTypes.CommentContent>
  end: Tokenizer.Token<Tokenizer.TokenTypes.CommentEnd>
}

interface Script {
  openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartScript>
  attributes?: TagAttribute[]
  openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndScript>
  value: Tokenizer.Token<Tokenizer.TokenTypes.ScriptTagContent>
  close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagScript>
}

interface Style {
  openStart: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagStartStyle>
  attributes?: TagAttribute[]
  openEnd: Tokenizer.Token<Tokenizer.TokenTypes.OpenTagEndStyle>
  value: Tokenizer.Token<Tokenizer.TokenTypes.StyleTagContent>
  close: Tokenizer.Token<Tokenizer.TokenTypes.CloseTagStyle>
}

interface DoctypeAttribute {
  startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperStart>
  value: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttribute>
  endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.DoctypeAttributeWrapperEnd>
}

interface TagAttribute {
  key?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeKey>
  startWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperStart>
  value?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValue>
  endWrapper?: Tokenizer.Token<Tokenizer.TokenTypes.AttributeValueWrapperEnd>
}
No vulnerabilities found.
OpenSSF Scorecard results:
- no dangerous workflow patterns detected
- no binaries found in the repo
- license file detected
- dependency not pinned by hash detected -- score normalized to 3
- 0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
- Found 0/10 approved changesets -- score normalized to 0
- detected GitHub workflow tokens with excessive permissions
- no effort to earn an OpenSSF best practices badge detected
- security policy file not detected
- project is not fuzzed
- branch protection not enabled on development/release branches
- SAST tool is not run on all commits -- score normalized to 0
- 16 existing vulnerabilities detected
Last Scanned on 2025-01-27
The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.