Gathering detailed insights and metrics for @acmedinotech/docproc
npm install @acmedinotech/docproc
29 Commits
2 Watching
3 Branches
1 Contributors
Updated on 15 Jan 2021
TypeScript (98.89%)
JavaScript (0.71%)
Shell (0.4%)
Total Downloads
Last day: 3 (0% compared to previous day)
Last week: 7 (0% compared to previous week)
Last month: 10 (66.7% compared to previous month)
Last year: 119 (-59.2% compared to previous year)
An extensible document processor, suitable for human-friendly markup. Take it for a drive with your Markdown document of choice:
docproc path/to/your/file
First, let's talk document structure. Human-readable docs are linear, and they're typically organized in groups (blocks). The blocks themselves contain inline data or sub-blocks.
## html blocks at different levels
<html>
<div><b>bold</b></div>
</html>
## markdown
> blockquote **bold**
normal paragraph
The basic approach of all solid document processors is a lexer-parser pattern: break the doc down into its smallest parts, then sequentially put them back together (in our case, as blocks with inline text).
docproc isn't any different there. What docproc aims to do is create a pattern for configuring lexeme detection and block/inline handling. Once you get a sense of how these pieces fit together, writing your own processor should be easy.
docproc makes no assumption about what you're trying to process, but it does come with a Markdown (CommonMark) plugin and DinoMark plugin, which enhances CommonMark with more dynamic processing capabilities.
Let's use the following snippet of Markdown as our reference:
> **blockquote**

paragraph _**bold italic**_
To start, we need to specify the following lexemes:
>
(space)
**
_
\n
Any run of text that isn't explicitly identified is grouped together and emitted as its own lexeme.
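The lexing step described above can be sketched roughly as follows. This is a minimal illustration that assumes a greedy, longest-match scan; `LEXEMES` and `lex` are hypothetical names chosen for the example and are not docproc's actual API.

```typescript
// Hypothetical sketch: names are illustrative, not docproc's API.
const LEXEMES: string[] = [">", " ", "**", "_", "\n"];

function lex(input: string, lexemes: string[] = LEXEMES): string[] {
  // Try longer lexemes first so multi-character markers win over shorter ones.
  const ordered = [...lexemes].sort((a, b) => b.length - a.length);
  const tokens: string[] = [];
  let pending = ""; // run of characters that match no declared lexeme
  let i = 0;
  while (i < input.length) {
    const hit = ordered.find((lx) => input.startsWith(lx, i));
    if (hit !== undefined) {
      // Flush the unidentified run as its own lexeme before the match.
      if (pending !== "") {
        tokens.push(pending);
        pending = "";
      }
      tokens.push(hit);
      i += hit.length;
    } else {
      pending += input[i];
      i += 1;
    }
  }
  if (pending !== "") tokens.push(pending);
  return tokens;
}

const tokens = lex("> **blockquote**\n\nparagraph _**bold italic**_");
console.log(tokens);
```

Everything that is not a declared lexeme (`blockquote`, `paragraph`, `bold`, `italic`) comes out as a single grouped token, which is what the block and inline handlers consume next.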
We'll also need to build two block handlers:
blockquoteHandler will only accept lines beginning with >. If there are 2 consecutive newlines, the blockquote handler is done.
paragraphHandler accepts anything. Like blockquote, it also terminates after 2 consecutive newlines.
Each instance of a block has its own handler instance.
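To make the two block handlers concrete, here is a rough sketch of how they could be modelled. The `BlockHandler` interface, `makeHandler`, and `toBlocks` are assumptions made for this example and do not reflect docproc's real types; the sketch also simplifies the blockquote rule by only checking the first token of a block.

```typescript
// Hypothetical sketch: interface and names are illustrative, not docproc's API.
interface BlockHandler {
  name: string;
  tokens: string[];                  // tokens consumed by this block
  canAccept(token: string): boolean; // may this handler start a block here?
  push(token: string): void;         // consume one token
  isDone(): boolean;                 // true after 2 consecutive newlines
}

function makeHandler(name: string, canAccept: (token: string) => boolean): BlockHandler {
  const tokens: string[] = [];
  let newlines = 0;
  return {
    name,
    tokens,
    canAccept,
    push(token: string): void {
      if (token === "\n") {
        newlines += 1;
      } else {
        if (newlines === 1) tokens.push("\n"); // keep a single line break inside the block
        newlines = 0;
        tokens.push(token);
      }
    },
    isDone: () => newlines >= 2,
  };
}

// Factories, so that each block gets its own handler instance.
const factories: Array<() => BlockHandler> = [
  () => makeHandler("blockquoteHandler", (t) => t === ">"),
  () => makeHandler("paragraphHandler", () => true),
];

function toBlocks(input: string[]): BlockHandler[] {
  const blocks: BlockHandler[] = [];
  let current: BlockHandler | undefined;
  for (const token of input) {
    if (!current) {
      if (token === "\n") continue; // skip blank lines between blocks
      // The first handler that accepts the token becomes the current handler.
      current = factories.map((make) => make()).find((h) => h.canAccept(token));
      if (!current) continue;
      blocks.push(current);
    }
    current.push(token);
    if (current.isDone()) current = undefined;
  }
  return blocks;
}

const blocks = toBlocks([
  ">", " ", "**", "blockquote", "**", "\n", "\n",
  "paragraph", " ", "_", "**", "bold", " ", "italic", "**", "_",
]);
console.log(blocks.map((b) => `${b.name}: ${b.tokens.join("")}`));
```

Ordering the factories from most to least specific lets the catch-all paragraph handler act as the fallback.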
Finally, we'll need to build two inline handlers:
boldHandler starts and stops on ** and allows embedded formatting.
italicHandler starts and stops on _ and allows embedded formatting.
Let's trace how each token changes the state of the parser, starting at the block level:
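| Token(s) | Handler | Notes |
| --- | --- | --- |
| `>` | blockquoteHandler | can accept and is set as current handler |
| (space), `**`, `blockquote`, `**` | blockquoteHandler | |
| `\n`, `\n` | | |
| `paragraph` | paragraphHandler | can accept and is set as current handler |
| `_`, `**`, `bold`, (space), `italic`, `**`, `_` | paragraphHandler | |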
Pretty simple so far. Now let's look within the block and see what happens with the inline tokens. I'll use the paragraph handler:
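| Token(s) | What happens | Handler stack |
| --- | --- | --- |
| `_` | italicHandler starts on `_`, but since it allows embedding other formatting, it'll first defer the tokens to specific handlers if they exist | [italicHandler] |
| `**` | | [italicHandler, boldHandler] |
| `bold`, (space), `italic` | boldHandler | |
| `**` | boldHandler is popped | [italicHandler] |
| `_` | italicHandler is popped | [] |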
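The push/pop behaviour traced above can be sketched with an explicit handler stack. This is only an illustration under stated assumptions: `InlineHandler`, `inlineSpecs`, `renderInline`, and the `<b>`/`<i>` output tags are hypothetical and are not taken from docproc's source. HTML escaping is left out to keep the sketch short.

```typescript
// Hypothetical sketch: names and HTML tags are illustrative, not docproc's API.
interface InlineHandler {
  name: string;
  delimiter: string; // token that both starts and stops this handler
  tag: string;       // HTML element emitted when the handler is popped
  parts: string[];   // rendered fragments collected while active
}

const inlineSpecs = [
  { name: "boldHandler", delimiter: "**", tag: "b" },
  { name: "italicHandler", delimiter: "_", tag: "i" },
];

function renderInline(input: string[]): string {
  const root: string[] = []; // output of the enclosing block
  const stack: InlineHandler[] = [];
  // Fragments always go to the innermost active handler, or to the block itself.
  const sink = (): string[] => {
    const innermost: InlineHandler | undefined = stack[stack.length - 1];
    return innermost ? innermost.parts : root;
  };
  for (const token of input) {
    const top: InlineHandler | undefined = stack[stack.length - 1];
    if (top !== undefined && token === top.delimiter) {
      // Closing delimiter: pop and hand the wrapped fragment to the parent.
      stack.pop();
      sink().push(`<${top.tag}>${top.parts.join("")}</${top.tag}>`);
      continue;
    }
    const spec = inlineSpecs.find((s) => s.delimiter === token);
    if (spec !== undefined) {
      stack.push({ ...spec, parts: [] }); // opening delimiter: push a fresh handler
      continue;
    }
    sink().push(token); // plain text
  }
  // Handlers that were never closed are emitted literally.
  while (stack.length > 0) {
    const open = stack.pop() as InlineHandler;
    sink().push(open.delimiter + open.parts.join(""));
  }
  return root.join("");
}

console.log(renderInline(["paragraph", " ", "_", "**", "bold", " ", "italic", "**", "_"]));
```

Because each popped handler hands its rendered fragment to whatever sits below it on the stack, nesting such as bold inside italic falls out of the data structure without special cases.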
When you turn the document into a string, you get all the pieces back, assembled from fragments of HTML returned from the different handlers.
That's basically it! You can see it all put together in readme.example.ts
Take a deeper dive:
No vulnerabilities found.
OpenSSF Scorecard findings:
no binaries found in the repo
license file detected
0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Found 0/27 approved changesets -- score normalized to 0
no effort to earn an OpenSSF best practices badge detected
security policy file not detected
project is not fuzzed
branch protection not enabled on development/release branches
SAST tool is not run on all commits -- score normalized to 0
26 existing vulnerabilities detected
Last Scanned on 2024-11-18
The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.