Installations

npm install @acmedinotech/docproc

Developer Guide

BETA

Typescript

Yes

Module System

CommonJS

Node Version

12.20.0

NPM Version

6.14.8

Pull Requests

Open

0

Total

20

Closed

0

Merged

20

Issues

Open

0

Total

0

Closed

0

Languages

3

TypeScript

JavaScript

Shell

TypeScript (98.89%)

JavaScript (0.71%)

Shell (0.4%)

Developer

kaisershahid

Download Statistics

Total Downloads

0

Last Day

0

Last Week

0

Last Month

0

Last Year

0

GitHub Statistics

GPL-3.0 License

29 Commits

1 Watchers

3 Branches

1 Contributors

Updated on Jan 15, 2021

Maintainers

1

Package Meta Information

Latest Version

0.8.14

Package Id

@acmedinotech/docproc@0.8.14

Unpacked Size

861.78 kB

Size

390.39 kB

File Count

75

Dependencies

5

@types/argparse @types/node argparse ts-node typescript

Dev Dependencies

12

@types/chai @types/cheerio @types/mocha awesome-typescript-loader chai cheerio fs mocha path prettier webpack webpack-cli

docproc

An extensible document processor, suitable for human-friendly markup. Take it for a drive with your Markdown document of choice:

docproc path/to/your/file

Architecture Overview

First, let's talk document structure. Human-readable docs are linear, and they're typically organized in groups (blocks). The blocks themselves contain inline data or sub-blocks.

## html blocks at different levels

<html>
    <div><b>bold</b></div>
</html>

## markdown

> blockquote **bold**

normal paragraph

The basic approach of all solid document processors is a lexer-parser pattern: break the doc down into its smallest parts, then sequentially put them back together (in our case, as blocks with inline text).

docproc isn't any different there. What docproc aims to do is create a pattern for configuring lexeme detection and block/inline handling. Once you get a sense of how these pieces fit together, writing your own processor should be easy.

High Level Architecture

docproc makes no assumption about what you're trying to process, but it does come with a Markdown (CommonMark) plugin and DinoMark plugin, which enhances CommonMark with more dynamic processing capabilities.

How it Works (High Level)

Let's use the following snippet of Markdown as our reference:

> **blockquote**

paragraph _**bold italic**_

To start, we need to specify the following lexemes:

>
(space)
**
_
\n

Any run of characters that isn't explicitly identified is grouped together and emitted as its own lexeme.
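To make the lexing step concrete, here is a minimal sketch of a lexer that detects the lexemes listed above and groups everything else. The `tokenize` function and `LEXEMES` constant are hypothetical names used for illustration only; they are not docproc's actual API.

```typescript
// Hypothetical sketch; not docproc's actual lexer API.
// The lexemes we want to detect explicitly, as listed above.
const LEXEMES: string[] = [">", " ", "**", "_", "\n"];

function tokenize(input: string, lexemes: string[] = LEXEMES): string[] {
  // Try longer lexemes first so a multi-character lexeme such as "**"
  // always wins over any shorter lexeme starting at the same position.
  const sorted = [...lexemes].sort((a, b) => b.length - a.length);
  const tokens: string[] = [];
  let pending = ""; // characters that match no known lexeme
  let i = 0;

  while (i < input.length) {
    const match = sorted.find((lex) => input.startsWith(lex, i));
    if (match !== undefined) {
      // Flush the unidentified run as its own lexeme before the match.
      if (pending !== "") {
        tokens.push(pending);
        pending = "";
      }
      tokens.push(match);
      i += match.length;
    } else {
      pending += input[i];
      i += 1;
    }
  }
  if (pending !== "") tokens.push(pending);
  return tokens;
}

const tokens = tokenize("> **blockquote**\n\nparagraph _**bold italic**_");
console.log(JSON.stringify(tokens));
// [">"," ","**","blockquote","**","\n","\n","paragraph"," ","_","**","bold"," ","italic","**","_"]
```

Joining the tokens back together reproduces the input exactly, which is a handy sanity check for any lexer of this kind.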

We'll also need to build two block handlers:

blockquoteHandler will only accept lines beginning with >. If there are 2 consecutive newlines, the blockquote handler is done.
paragraphHandler accepts anything. Like blockquote, it also terminates after 2 consecutive newlines.

Each instance of a block has its own handler instance.

Finally, we'll need to build two inline handlers:

boldHandler starts and stops on ** and allows embedded formatting
italicHandler starts and stops on _ and allows embedded formatting

Follow the Tokens

Let's trace how each token changes the state of the parser, starting at the block level:

>
- blockquoteHandler can accept and is set as current handler
(space), **, blockquote, **
- all accepted by blockquoteHandler
\n, \n
- blockquote done, no longer current handler
paragraph
- paragraphHandler can accept and is set as current handler
(space), _, **, bold, (space), italic, **, _
- all accepted by paragraphHandler
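The block-level trace above can be sketched as a small dispatch loop. This is a simplified illustration under stated assumptions: the `Block` type, the `starters` table, and `groupIntoBlocks` are hypothetical names, not docproc's API, and the blockquote rule here only checks the first token of the block rather than every line.

```typescript
// Hypothetical sketch; these types are illustrative, not docproc's API.
type BlockKind = "blockquote" | "paragraph";

interface Block {
  kind: BlockKind;
  tokens: string[];
}

// Each entry decides whether it can start a block with the given token.
// Order matters: paragraph accepts anything, so it must come last.
const starters: { kind: BlockKind; canStart: (token: string) => boolean }[] = [
  { kind: "blockquote", canStart: (token) => token === ">" },
  { kind: "paragraph", canStart: () => true },
];

function groupIntoBlocks(tokens: string[]): Block[] {
  const blocks: Block[] = [];
  let current: Block | null = null;
  let newlines = 0; // consecutive "\n" tokens seen in the current block

  for (const token of tokens) {
    if (current === null) {
      if (token === "\n") continue; // ignore blank lines between blocks
      const starter = starters.find((s) => s.canStart(token))!;
      current = { kind: starter.kind, tokens: [] }; // fresh state per block
      blocks.push(current);
      newlines = 0;
    }
    if (token === "\n") {
      newlines += 1;
      // Two consecutive newlines terminate the current block.
      if (newlines === 2) current = null;
      continue;
    }
    // A single newline inside a block is kept as part of the block.
    if (newlines === 1) current.tokens.push("\n");
    newlines = 0;
    current.tokens.push(token);
  }
  return blocks;
}

const blocks = groupIntoBlocks([
  ">", " ", "**", "blockquote", "**", "\n", "\n",
  "paragraph", " ", "_", "**", "bold", " ", "italic", "**", "_",
]);
console.log(blocks.map((b) => b.kind).join(", "));
// blockquote, paragraph
```

Each block gets its own state object, mirroring the rule that each instance of a block has its own handler instance.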

Pretty simple so far. Now let's look within the block and see what happens with the inline tokens. I'll use the paragraph handler:

_
- matches an inline handler. It'll take all tokens until another _, but since it allows embedding other formatting, it'll first defer the tokens to specific handlers if they exist
- stack: [italicHandler]
**
- matches an inline handler, which nests and defers
- stack: [italicHandler, boldHandler]
bold, (space), italic
- go into boldHandler
**
- boldHandler is popped
- stack: [italicHandler]
_
- italicHandler is popped
- stack: []
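The stack behavior traced above can be sketched as follows. This is an assumption-laden illustration, not docproc's implementation: `InlineRule`, `Frame`, and `renderInline` are hypothetical names, the mapping of ** to <b> and _ to <i> is chosen for the example, and no HTML escaping is performed.

```typescript
// Hypothetical sketch; not docproc's actual inline handler API.
interface InlineRule {
  delimiter: string; // token that both opens and closes the handler
  tag: string;       // HTML tag used when the handler is rendered
}

interface Frame {
  rule: InlineRule;
  parts: string[]; // rendered fragments collected while the handler is open
}

const rules: InlineRule[] = [
  { delimiter: "**", tag: "b" }, // plays the role of boldHandler
  { delimiter: "_", tag: "i" },  // plays the role of italicHandler
];

function renderInline(tokens: string[]): string {
  const root: string[] = [];
  const stack: Frame[] = [];
  // Output goes to the innermost open handler, or to the root if none is open.
  const emit = (text: string): void => {
    const target = stack.length > 0 ? stack[stack.length - 1].parts : root;
    target.push(text);
  };

  for (const token of tokens) {
    const top = stack.length > 0 ? stack[stack.length - 1] : undefined;
    if (top !== undefined && token === top.rule.delimiter) {
      // Closing delimiter: pop the handler and hand its HTML to the parent.
      stack.pop();
      emit(`<${top.rule.tag}>${top.parts.join("")}</${top.rule.tag}>`);
      continue;
    }
    const rule = rules.find((r) => r.delimiter === token);
    if (rule !== undefined) {
      // Opening delimiter: nest a new handler and defer tokens to it.
      stack.push({ rule, parts: [] });
      continue;
    }
    emit(token);
  }

  // Any handler left open is emitted as literal text.
  while (stack.length > 0) {
    const open = stack.pop()!;
    emit(open.rule.delimiter + open.parts.join(""));
  }
  return root.join("");
}

console.log(renderInline(["paragraph", " ", "_", "**", "bold", " ", "italic", "**", "_"]));
// paragraph <i><b>bold italic</b></i>
```

Because each popped handler passes its rendered fragment to whatever is now on top of the stack, nesting falls out of the data structure rather than needing special cases.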

When you turn the document into a string, you get all the pieces back, assembled from fragments of HTML returned from the different handlers.

That's basically it! You can see it all put together in readme.example.ts

Take a deeper dive:

No vulnerabilities found.

10

Binary-Artifacts

Determines if the project has generated executable (binary) artifacts in the source repository.

10

License

Determines if the project has defined a license.

0

Code-Review

Determines if the project requires human code review before pull requests (aka merge requests) are merged.

0

Maintained

Determines if the project is "actively maintained".

0

CII-Best-Practices

Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.

0

Security-Policy

Determines if the project has published a security policy.

0

Fuzzing

Determines if the project uses fuzzing.

0

Branch-Protection

Determines if the default and release branches are protected with GitHub's branch protection settings.

0

SAST

Determines if the project uses static code analysis.

0

Vulnerabilities

Determines if the project has open, known unfixed vulnerabilities.

Score

1.7

/10

Last Scanned on 2025-07-07

The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.
