microformats-parser
A JavaScript microformats v2 parser, with v1 back-compatibility. View the demo. Works in both the browser and Node.js.
Follows the microformats2 parsing specification.
Quick start
Installation
# yarn
yarn add microformats-parser
# npm
npm i microformats-parser
Simple use
const { mf2 } = require("microformats-parser");
const parsed = mf2('<a class="h-card" href="/" rel="me">Jimmy</a>', {
  baseUrl: "http://example.com/",
});
console.log(parsed);
Outputs:
{
  "items": [
    {
      "properties": {
        "name": ["Jimmy"],
        "url": ["http://example.com/"]
      },
      "type": ["h-card"]
    }
  ],
  "rel-urls": {
    "http://example.com/": {
      "rels": ["me"],
      "text": "Jimmy"
    }
  },
  "rels": {
    "me": ["http://example.com/"]
  }
}
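To show how a result like this might be consumed, here is a minimal sketch that reads values out of it. The `parsed` object is a trimmed copy of the sample output rather than the result of a live call, and `firstValue` is a hypothetical helper, not part of this package.

```javascript
// Trimmed copy of the sample output, so the sketch runs without the parser.
const parsed = {
  items: [
    {
      properties: { name: ["Jimmy"], url: ["http://example.com/"] },
      type: ["h-card"],
    },
  ],
  rels: { me: ["http://example.com/"] },
};

// Hypothetical helper: property values are arrays, so return the first
// entry, or undefined when the property is missing.
function firstValue(item, property) {
  const values = item.properties[property] || [];
  return values[0];
}

// Pick out every h-card and read its name and URL.
const cards = parsed.items.filter((item) => item.type.includes("h-card"));
for (const card of cards) {
  console.log(firstValue(card, "name"), firstValue(card, "url"));
}

// Each rel value maps to the list of URLs that carry it.
console.log(parsed.rels.me);
```

Every property is an array, even when it holds a single value, so code that reads the result usually needs a small accessor like this one.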
API
mf2()
Use: mf2(html: string, options: { baseUrl: string, experimental?: object })
html
(string, required) - the HTML string to be parsed
options
(object, required) - parsing options, with the following properties:
baseUrl
(string, required) - a base URL to resolve relative URLs
experimental
(object, optional) - experimental (non-standard) options
lang
(boolean, optional) - enable support for parsing lang attributes
textContent
(boolean, optional) - enable support for better collapsing whitespace in text content.
metaformats
(boolean, optional) - enable meta tag fallback.
Returns the parsed microformats from the HTML string, as an object with items, rels and rel-urls keys.
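As an illustration of the options shape described above, the following sketch builds an options object with selected experimental flags switched on. `buildOptions` and `EXPERIMENTAL_FLAGS` are hypothetical names introduced here, not part of the package; only the `baseUrl` and `experimental` keys and the three flag names come from this document.

```javascript
// Flag names taken from the options list above.
const EXPERIMENTAL_FLAGS = ["lang", "textContent", "metaformats"];

// Hypothetical helper (not part of the package): builds an options object
// with the requested experimental flags set to true.
function buildOptions(baseUrl, flags = []) {
  if (typeof baseUrl !== "string" || baseUrl === "") {
    throw new TypeError("baseUrl is required and must be a non-empty string");
  }
  const experimental = {};
  for (const flag of flags) {
    if (!EXPERIMENTAL_FLAGS.includes(flag)) {
      throw new RangeError(`Unknown experimental flag: ${flag}`);
    }
    experimental[flag] = true;
  }
  // Leave `experimental` out entirely when no flags were requested.
  return flags.length > 0 ? { baseUrl, experimental } : { baseUrl };
}

const options = buildOptions("http://example.com/", ["lang", "textContent"]);
console.log(JSON.stringify(options));
// {"baseUrl":"http://example.com/","experimental":{"lang":true,"textContent":true}}
```

With the package installed, the resulting object can be passed as the second argument, as in `mf2(html, options)`.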
Support
Microformats v1
This package parses microformats v1; however, support is limited to the cases covered by the v1 tests in the microformats test suite. Contributions that improve v1 support are still welcome.
Microformats v2
We support all microformats v2 parsing, as detailed in the microformats2 parsing specification. If you find a problem with v2 parsing, please create an issue.
Experimental options
There is also support for some experimental parsing options. These can be enabled with the experimental flags in the options API.
Note: Experimental options are subject to change at short notice and may change their behaviour without a major version update.
lang
Parse microformats for lang attributes. This will include lang on microformats and e-* properties where available.
These are sourced from the element itself, a parent microformat, the HTML document or a meta tag.
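The list of sources can be read as a fallback chain. The sketch below assumes the order given above is also the order of precedence, which this document does not state explicitly. `resolveLang` and its field names are hypothetical and are not the parser's internals.

```javascript
// Hypothetical inputs: each field holds the lang value found at that source,
// or is left undefined when the source has none.
function resolveLang({ element, parentMicroformat, htmlDocument, metaTag }) {
  // Assumed precedence: element, parent microformat, document, meta tag.
  const candidates = [element, parentMicroformat, htmlDocument, metaTag];
  // Return the first non-empty candidate, or undefined if none is set.
  return candidates.find((lang) => typeof lang === "string" && lang !== "");
}

console.log(resolveLang({ element: "sv", htmlDocument: "en" })); // sv
console.log(resolveLang({ htmlDocument: "en" })); // en
console.log(resolveLang({})); // undefined
```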
textContent
When parsing microformats for text content, consecutive whitespace is collapsed into a single space. <br/> and <p> tags are treated as line breaks.
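A rough sketch of the behaviour described above, working on an HTML string with regular expressions. This is a simplified illustration, not the parser's implementation: `collapseText` is a hypothetical name, the parser works on a parsed document rather than a raw string, and dropping empty lines is an assumption made here to keep the output tidy.

```javascript
// Hypothetical helper illustrating the whitespace rules on a raw HTML string.
function collapseText(html) {
  return html
    .replace(/\s+/g, " ") // collapse whitespace runs, including source newlines
    .replace(/<br\s*\/?>/gi, "\n") // <br> becomes a line break
    .replace(/<\/?p(\s[^>]*)?>/gi, "\n") // <p> boundaries become line breaks
    .replace(/<[^>]+>/g, "") // drop any remaining tags
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // assumption: skip empty lines
    .join("\n");
}

const text = collapseText("<p>Hello   \n  world</p><p>Second<br/>line</p>");
console.log(JSON.stringify(text));
// "Hello world\nSecond\nline"
```

Whitespace is collapsed before the tags are converted, so a newline in the HTML source becomes a space while only <br/> and <p> produce line breaks in the result.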
metaformats
Enables fallback to metaformats parsing, which looks at <meta> tags to infer content.
Contributing
See our contributing guidelines for more information.