Gathering detailed insights and metrics for @a-s8h/liblevenshtein
Various utilities regarding Levenshtein transducers. (CoffeeScript / JavaScript / Node.js)
npm install @a-s8h/liblevenshtein
76.6
Supply Chain
99.5
Quality
75
Maintenance
100
Vulnerability
100
License
CoffeeScript (100%)
Total Downloads
20,181
Last Day
22
Last Week
80
Last Month
695
Last Year
7,051
28 Commits
1 Branches
1 Contributors
Latest Version
2.0.4
Package Id
@a-s8h/liblevenshtein@2.0.4
Unpacked Size
225.78 kB
Size
40.47 kB
File Count
36
NPM Version
8.5.2
Node Version
14.19.1
Published On
01 Feb 2023
Cumulative downloads
Last day: 22 (-35.3% compared to previous day)
Last week: 80 (-46.7% compared to previous week)
Last month: 695 (33.9% compared to previous month)
Last year: 7,051 (-46.3% compared to previous year)
Levenshtein transducers accept a query term and return all terms in a dictionary that are within n spelling errors of it. They constitute a highly efficient (in both space and time) class of spelling correctors that work very well when you do not require context while making suggestions. Forget about performing a linear scan over your dictionary to find all terms that are sufficiently close to the user's query, using a quadratic implementation of the Levenshtein distance or Damerau-Levenshtein distance. These babies find all the matching terms from your dictionary in time linear in the length of the query term (not in the size of the dictionary).
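For comparison, the naive approach that the transducer replaces can be sketched as follows. This is only an illustrative baseline, not part of the library: `levenshteinDistance` and `linearScan` are hypothetical helper names. Each distance computation is quadratic, and the scan repeats it once per dictionary term.

```javascript
// Classic dynamic-programming Levenshtein distance: O(a.length * b.length) time.
function levenshteinDistance(a, b) {
  // previous[j] is the distance between the first (i - 1) characters of a
  // and the first j characters of b.
  var previous = [];
  for (var j = 0; j <= b.length; j += 1) {
    previous.push(j);
  }
  for (var i = 1; i <= a.length; i += 1) {
    var current = [i];
    for (var k = 1; k <= b.length; k += 1) {
      var substitution = previous[k - 1] + (a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1);
      var deletion = previous[k] + 1;
      var insertion = current[k - 1] + 1;
      current.push(Math.min(substitution, deletion, insertion));
    }
    previous = current;
  }
  return previous[b.length];
}

// Naive baseline (hypothetical helper): compare the query against every term,
// so the cost grows with the size of the dictionary.
function linearScan(dictionary, term, maxDistance) {
  return dictionary.filter(function (candidate) {
    return levenshteinDistance(term, candidate) <= maxDistance;
  });
}
```

A call such as `linearScan(completion_list, term, 2)` touches every entry of the list, which is exactly the cost a transducer avoids.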
If you need context, then take the candidates generated by the transducer as a starting place, and plug them into whatever model you're using for context (such as by selecting the sequence of terms that have the greatest probability of appearing together).
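One very simple way to add context is to rerank the transducer's candidates by how often each one follows the previous word. The sketch below assumes a hypothetical `bigramCounts` table (previous word, then next word, then count) that you would build from your own corpus; neither the table nor `rerankByContext` is provided by the library.

```javascript
// Look up an own property without being fooled by inherited names
// such as "constructor".
function lookup(table, key) {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined;
}

// Hypothetical helper: order candidates by how often they follow previousWord.
// Candidates never seen after previousWord count as 0 and sink to the end.
function rerankByContext(candidates, previousWord, bigramCounts) {
  var following = lookup(bigramCounts, previousWord) || {};
  // Copy first so the caller's array is left untouched.
  return candidates.slice().sort(function (a, b) {
    return (lookup(following, b) || 0) - (lookup(following, a) || 0);
  });
}
```

A real context model would usually score whole sequences of terms rather than a single preceding word, but the shape is the same: the transducer proposes candidates, and the model orders them.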
For a quick demonstration, please visit the GitHub Page, here.
The library is currently written in Java, CoffeeScript, and JavaScript, but I will be porting it to other languages soon. If there is a specific language you would like to see it in, or a package-management system you would like it deployed to, let me know.
Install the module via npm:
% npm install liblevenshtein
info trying registry request attempt 1 at 12:59:16
http GET https://registry.npmjs.org/liblevenshtein
http 304 https://registry.npmjs.org/liblevenshtein
liblevenshtein@2.0.4 node_modules/liblevenshtein
Then, you may require it to do whatever you need:
var levenshtein = require('liblevenshtein');

// Assume "completion_list" is a list of terms you want to match against in
// fuzzy queries.
var builder = new levenshtein.Builder()
  .dictionary(completion_list, false)  // generate spelling candidates from unsorted completion_list
  .algorithm("transposition")          // use Levenshtein distance extended with transposition
  .sort_candidates(true)               // sort the spelling candidates before returning them
  .case_insensitive_sort(true)         // ignore character-casing while sorting terms
  .include_distance(false)             // just return the ordered terms (drop the distances)
  .maximum_candidates(10);             // only want the top-10 candidates

// Maximum number of spelling errors we will allow the spelling candidates to
// have, with regard to the query term.
var MAX_EDIT_DISTANCE = 2;

var transducer = builder.build();

// Assume "term" corresponds to some query term. Once invoking
// transducer.transduce(term, MAX_EDIT_DISTANCE), candidates will contain a list
// of all spelling candidates from the completion list that are within
// MAX_EDIT_DISTANCE units of error from the query term.
var candidates = transducer.transduce(term, MAX_EDIT_DISTANCE);
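To make the "transposition" option above more concrete, here is a minimal sketch of a distance function that counts a swap of two adjacent characters as a single error (the optimal string alignment variant). `transpositionDistance` is a hypothetical name used for illustration, and the assumption that the library's "transposition" algorithm agrees with this variant in every edge case has not been verified here.

```javascript
// Levenshtein distance extended with adjacent transpositions
// (optimal string alignment): "teh" -> "the" costs 1 instead of 2.
function transpositionDistance(a, b) {
  var i, j;
  var rows = [];
  // First column: deleting i characters from a.
  for (i = 0; i <= a.length; i += 1) {
    rows.push([i]);
  }
  // First row: inserting j characters of b.
  for (j = 1; j <= b.length; j += 1) {
    rows[0].push(j);
  }
  for (i = 1; i <= a.length; i += 1) {
    for (j = 1; j <= b.length; j += 1) {
      var cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      var best = Math.min(
        rows[i - 1][j] + 1,        // deletion
        rows[i][j - 1] + 1,        // insertion
        rows[i - 1][j - 1] + cost  // match or substitution
      );
      // Adjacent transposition: the last two characters are swapped.
      if (i > 1 && j > 1 &&
          a.charAt(i - 1) === b.charAt(j - 2) &&
          a.charAt(i - 2) === b.charAt(j - 1)) {
        best = Math.min(best, rows[i - 2][j - 2] + 1);
      }
      rows[i][j] = best;
    }
  }
  return rows[a.length][b.length];
}
```

With a maximum edit distance of 2, a query such as "teh" would therefore still have one error to spare after matching "the".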
To use the library on your website, reference the desired file from the <head/> of your document, like so:
<!DOCTYPE html>
<html>
  <head>
    <!-- stuff ... -->
    <script type="text/javascript"
      src="http://universal-automata.github.com/liblevenshtein/javascripts/2.0.4/levenshtein-transducer.min.js">
    </script>
    <!-- more stuff ... -->
  </head>
  <body>
    <!-- yet another fancy document ... -->
  </body>
</html>
Once the script loads, you should construct a transducer via the Builder API:
$(function ($) {
  "use strict";

  // Maximum number of spelling errors we will allow the spelling candidates to
  // have, with regard to the query term.
  var MAX_EDIT_DISTANCE = 2;

  var completion_list = getCompletionList(); // fictitious method

  var builder = new levenshtein.Builder()
    .dictionary(completion_list, false)  // generate spelling candidates from unsorted completion_list
    .algorithm("transposition")          // use Levenshtein distance extended with transposition
    .sort_candidates(true)               // sort the spelling candidates before returning them
    .case_insensitive_sort(true)         // ignore character-casing while sorting terms
    .include_distance(false)             // just return the ordered terms (drop the distances)
    .maximum_candidates(10);             // only want the top-10 candidates

  var transducer = builder.build();

  var $queryTerm = $('#query-term-input-field');
  $queryTerm.keyup(function (event) {
    var candidates, term = $.trim($queryTerm.val());

    if (term) {
      candidates = transducer.transduce(term, MAX_EDIT_DISTANCE);
      printAutoComplete(candidates); // print the list of completions
    } else {
      clearAutoComplete(); // user has cleared the search box
    }

    return true;
  });
});
This will give the user autocompletion hints as they type in the search box.
This library is based largely on the work of Stoyan Mihov, Klaus Schulz, and Petar Nikolaev Mitankin: "Fast String Correction with Levenshtein-Automata". For more details, please see the wiki.
No vulnerabilities found.