Motivation
Extract the TLD, domain, and subdomain parts of a URL or hostname, using Mozilla's official TLD listing.
API
var parser = require('tld-extract');
console.log( parser("http://www.google.com") );
console.log( parser("http://google.co.uk") );
/**
* >> { tld: 'com', domain: 'google.com', sub: 'www' }
* >> { tld: 'co.uk', domain: 'google.co.uk', sub: '' }
*/
Private TLDs
Private TLDs are supported; see the Chromium source code for the specification. They are ignored by default and enabled with the allowPrivateTLD option.
console.log( parser("http://jeanlebon.cloudfront.net"));
/**
* >> { tld: 'net', domain: 'cloudfront.net', sub: 'jeanlebon' }
*/
console.log( parser("http://jeanlebon.cloudfront.net", {allowPrivateTLD : true}));
/**
* >> { tld: 'cloudfront.net', domain: 'jeanlebon.cloudfront.net', sub: '' }
*/
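The difference between the two results above can be pictured as a longest-suffix match against two tables. The following is a minimal sketch, not tld-extract's actual implementation: splitHost and the tiny SUFFIXES table are hypothetical names introduced here for illustration, and the real list is far larger.

```javascript
// Illustrative only: a tiny hand-written suffix table, not the real listing.
const SUFFIXES = {
  icann: new Set(['com', 'net', 'uk', 'co.uk']),
  private: new Set(['cloudfront.net']),
};

// Hypothetical helper (not part of tld-extract) that splits a URL's hostname
// on the longest known suffix.
function splitHost(url, { allowPrivateTLD = false } = {}) {
  const labels = new URL(url).hostname.split('.');
  // Walk from the longest candidate to the shortest, so 'co.uk' wins over 'uk'.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    const known = SUFFIXES.icann.has(candidate) ||
      (allowPrivateTLD && SUFFIXES.private.has(candidate));
    if (!known) continue;
    // The whole hostname is itself a suffix: nothing is left for the domain.
    if (i === 0) throw new Error('Invalid TLD');
    return {
      tld: candidate,
      domain: labels.slice(i - 1).join('.'),
      sub: labels.slice(0, i - 1).join('.'),
    };
  }
  // No label sequence matched any known suffix.
  throw new Error('Invalid TLD');
}

console.log(splitHost('http://jeanlebon.cloudfront.net'));
console.log(splitHost('http://jeanlebon.cloudfront.net', { allowPrivateTLD: true }));
```

With the private table switched on, cloudfront.net is matched before net, which is why the registrable domain moves one label to the left.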
Unknown TLDs (level0)
By default, an unknown TLD throws an exception. To allow unknown TLDs and use tld-extract as a plain parser, set the allowUnknownTLD option.
parser("http://nowhere.local")
>> throws /Invalid TLD/
parser("http://nowhere.local", {allowUnknownTLD : true})
>> { tld : 'local', domain : 'nowhere.local', sub : '' }
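When input comes from untrusted sources, it can be convenient to turn the exception into a null result instead of enabling allowUnknownTLD. This is a minimal sketch under that assumption: tryParse is a hypothetical wrapper, not part of tld-extract, and it takes the parser function as an argument so it makes no assumption beyond the "Invalid TLD" error message shown above.

```javascript
// Hypothetical wrapper (not part of tld-extract).
// `parser` is expected to be the function returned by require('tld-extract').
function tryParse(parser, url, options) {
  try {
    return parser(url, options);
  } catch (err) {
    // Only swallow the documented "Invalid TLD" failure; rethrow anything else.
    if (err instanceof Error && /Invalid TLD/.test(err.message)) {
      return null;
    }
    throw err;
  }
}

// Usage, assuming tld-extract is installed:
//   var parser = require('tld-extract');
//   var result = tryParse(parser, "http://nowhere.local");
//   if (result === null) { /* unknown TLD, handle it here */ }
```

Filtering on the message keeps unrelated failures, such as a malformed URL, visible to the caller.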
DotLess domain
Using a TLD directly as a domain name (a dotless domain) is strongly discouraged. ICANN and the IAB have spoken out against the practice, classifying it as a security risk among other concerns. ICANN's Security and Stability Advisory Committee (SSAC) additionally notes that SMTP "requires at least two labels in the FQDN of a mail address", so mail servers would reject mail sent to addresses with dotless domains. For these reasons tld-extract throws an error on such input. You can override this behavior using the allowDotlessTLD option.
parser("http://notaires.fr")
>> throws /Invalid TLD/
parser("http://notaires.fr", {allowDotlessTLD : true})
>> { tld : 'notaires.fr', domain : 'notaires.fr', sub : '' }
Why
- no dependencies
- really fast
- full code coverage
- easy to read (10 lines)
- easy to update from Mozilla's TLD source list
- TypeScript support
Maintenance
You can update the TLD hash table from the remote source using npm run update
Not Invented Here
- A port of a yks/PHP library
- tldextract => bad API (no need for async, "domain" property is wrong), no need for dependencies
- tld => (nothing bad, a bit outdated)
- tld.js => no sane way to prove/trust/update TLD listing
Credits