Gathering detailed insights and metrics for tsv-to-database
npm install tsv-to-database
Languages: TypeScript (95.53%), JavaScript (4.47%)
Total Downloads: 0 (last day: 0, last week: 0, last month: 0, last year: 0)
Repository: 5 stars, 15 commits, 1 fork, 1 branch, 1 contributor. Updated on Oct 15, 2021.
Latest Version: 0.1.5
Package Id: tsv-to-database@0.1.5
Unpacked Size: 26.88 kB
Size: 6.88 kB
File Count: 12
NPM Version: 6.4.1
Node Version: 10.14.1
This package helps you convert CSV/TSV files to JS objects and write them to a file or a database (sqlite3, mongodb). It can automatically extract column names from the first line and guess each column's type (string or number) from the second line. All operations use streams for low memory consumption.
npm install tsv-to-database --save
I created this package mostly to parse the movies database from IMDB, so the examples show how to use streams to download, unzip, filter, and write .tsv files. You can see the IMDB .tsv file structure here. You can use request.js to get a file from the internet.
This example shows how to read a stream from an input.tsv file and write it to an output.json file.
```js
const fs = require("fs");
const { TsvToObjectStream, ObjectToJsonStream } = require("tsv-to-database");

// read a file
fs.createReadStream("input.tsv")
  // transform bytes to objects
  .pipe(new TsvToObjectStream())
  // transform objects to a JSON string
  .pipe(new ObjectToJsonStream())
  // write the string to a file
  .pipe(fs.createWriteStream("output.json"));
```
If your tsv file doesn't have column names, you must pass them:
```js
new TsvToObjectStream({ columns: ["food", "calories", "fat", "protein"] });
```
If you want to use your own column names, pass them in the options and ignore the first line:
```js
new TsvToObjectStream({
  ignoreFirstLine: true,
  columns: ["food", "calories", "fat", "protein"]
});
```
This example reads a stream from a URL and writes it to a SQLite database.
```js
const zlib = require("zlib");
const request = require("request");
const { TsvToObjectStream, SqliteWriteStream } = require("tsv-to-database");

// get a stream from the internet
request("https://datasets.imdbws.com/title.basics.tsv.gz")
  // unzip
  .pipe(zlib.createGunzip())
  // transform bytes to objects
  .pipe(new TsvToObjectStream())
  // write the stream to sqlite
  .pipe(
    new SqliteWriteStream({
      databasePath: "imdb.db",
      tableName: "title_basics"
    })
  );
```
By default, SqliteWriteStream automatically creates the table and the insert statement. You can use your own create and insert templates; the stream replaces every {{something}} token with the correct value.
```js
new SqliteWriteStream({
  databasePath: "imdb.db",
  tableName: "title_basics",
  insertTemplate: "INSERT INTO {{table}} ({{columns}}) VALUES ({{values}});",
  createTemplate: "CREATE TABLE IF NOT EXISTS {{table}} ({{columnTypes}});"
});
```
This example reads a stream from a URL and writes it to a MongoDB database.
```js
const zlib = require("zlib");
const request = require("request");
const { TsvToObjectStream, MongoWriteStream } = require("tsv-to-database");

// get a stream from the internet
request("https://datasets.imdbws.com/title.basics.tsv.gz")
  // unzip
  .pipe(zlib.createGunzip())
  // transform bytes to objects
  .pipe(new TsvToObjectStream())
  // write the stream to mongodb
  .pipe(
    new MongoWriteStream({
      databaseUrl: "mongodb://localhost:27017",
      databaseName: "imdb",
      collectionName: "title.basics"
    })
  );
```
You can use filter and transform streams to filter and transform the objects coming from the input stream. For example, to keep only good movies (averageRating > 8):
```js
const zlib = require("zlib");
const request = require("request");

const {
  TsvToObjectStream,
  SqliteWriteStream,
  FilterStream
} = require("tsv-to-database");

const filter = data => data.averageRating > 8;

request("https://datasets.imdbws.com/title.ratings.tsv.gz")
  .pipe(zlib.createGunzip())
  .pipe(new TsvToObjectStream())
  .pipe(new FilterStream(filter))
  .pipe(
    new SqliteWriteStream({
      databasePath: "imdb.db",
      tableName: "title_ratings"
    })
  );
```
For example, if you don't need all the columns from the table and want to replace the "\N" placeholder in startYear with "unknown", you can use TransformStream:
```js
const zlib = require("zlib");
const request = require("request");
const {
  TsvToObjectStream,
  MongoWriteStream,
  TransformStream
} = require("tsv-to-database");

const transform = data => {
  const transformed = {
    title: data.originalTitle,
    // IMDB uses "\N" for missing values
    year: data.startYear !== "\\N" ? data.startYear : "unknown"
  };
  return transformed;
};

request("https://datasets.imdbws.com/title.basics.tsv.gz")
  .pipe(zlib.createGunzip())
  .pipe(new TsvToObjectStream())
  .pipe(new TransformStream(transform))
  .pipe(
    new MongoWriteStream({
      databaseUrl: "mongodb://localhost:27017",
      databaseName: "imdb",
      collectionName: "title.basics"
    })
  );
```
Sometimes it is useful to see how long you need to wait for a stream to finish. You can use ProgressStream to monitor elapsed time, percentage, and memory consumption. Just pass the file size to the constructor (for a file from the internet you can get the size from the response headers).
```js
const zlib = require("zlib");
const request = require("request");

const {
  TsvToObjectStream,
  SqliteWriteStream,
  FilterStream,
  ProgressStream
} = require("tsv-to-database");

const filter = data => data.averageRating > 8;

request("https://datasets.imdbws.com/title.ratings.tsv.gz").on(
  "response",
  response => {
    // take the stream size from the response headers
    const size = Number(response.headers["content-length"]);
    response
      .pipe(new ProgressStream(size))
      .pipe(zlib.createGunzip())
      .pipe(new TsvToObjectStream())
      .pipe(new FilterStream(filter))
      .pipe(
        new SqliteWriteStream({
          databasePath: "imdb.db",
          tableName: "title_ratings"
        })
      );
  }
);
```
You can also just subscribe to the stream's "data" event and work with the data yourself.
```js
const fs = require("fs");
const { TsvToObjectStream } = require("tsv-to-database");

fs.createReadStream("input.tsv")
  .pipe(new TsvToObjectStream())
  .on("data", data => {
    /* your code here */
  });
```
This class transforms a byte/string stream into objects. It extracts column names from the first line and types from the second. If a column has mixed types (number and string), its type falls back to "string".
```js
// default options
const options = {
  // encoding used for decoding the byte stream
  stringEncoding: "utf8",
  // skip the first line of the tsv file
  ignoreFirstLine: false,
  // used to split the text into lines
  lineSeparator: "\n",
  // used to split a line into columns
  rowSeparator: "\t",
  // set this to override the parsed column names; provide a string array with a name for every column
  columns,
  // set this to override the parsed types; provide a string array with a type ("number" | "string") for every column
  types
};
```
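For example, a minimal sketch of parsing a comma-separated file instead of a tab-separated one, assuming rowSeparator (from the defaults above) is the column separator:

```js
const fs = require("fs");
const { TsvToObjectStream, ObjectToJsonStream } = require("tsv-to-database");

// Parse a comma-separated file by overriding the separators.
// Assumption: rowSeparator is the column separator.
fs.createReadStream("input.csv")
  .pipe(
    new TsvToObjectStream({
      lineSeparator: "\n",
      rowSeparator: ","
    })
  )
  .pipe(new ObjectToJsonStream())
  .pipe(fs.createWriteStream("output.json"));
```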
This class writes objects to a sqlite database. It automatically creates the table if it does not exist and builds the insert statement. You can override the create and insert SQL templates. Possible template tokens:

- {{table}} - table name
- {{columns}} - parsed column names
- {{values}} - named parameters
- {{columnTypes}} - parsed column types
The class automatically detects the type of each value (string or number) and puts the corresponding type (TEXT or INTEGER) into the create table statement. It cannot set a PRIMARY KEY; if you need one, override the create SQL statement. Be careful with mixed data types in one column: the class builds the create table statement from the first object, so if it picks the number type and a later value turns out to be a string, sqlite throws an error. It is better to override the types in the TsvToObjectStream class or provide a create template with the correct types; a short sketch follows the default options below.
```js
// default options
const options = {
  databasePath: "output.db",
  tableName: "parsed_tsv",
  insertTemplate: "INSERT INTO {{table}} ({{columns}}) VALUES ({{values}});",
  createTemplate: "CREATE TABLE IF NOT EXISTS {{table}} ({{columnTypes}});",
  // set this if the table already exists in the database
  isTableCreated: false
};
```
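As a minimal sketch of the first approach, force every column to "string" up front so each column is created as TEXT (the four-entry type list below is illustrative; provide one entry per column in your file):

```js
const fs = require("fs");
const { TsvToObjectStream, SqliteWriteStream } = require("tsv-to-database");

// Force every column to "string" so the generated CREATE TABLE uses TEXT
// and a later switch from numeric to text values cannot break the inserts.
// The number of entries must match the number of columns in the file.
fs.createReadStream("input.tsv")
  .pipe(
    new TsvToObjectStream({
      types: ["string", "string", "string", "string"]
    })
  )
  .pipe(
    new SqliteWriteStream({
      databasePath: "output.db",
      tableName: "parsed_tsv"
    })
  );
```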
This class writes an object stream to mongodb.
```js
// default options
const options = {
  databaseName: "tsv_to_mongo",
  collectionName: "parsed_tsv",
  databaseUrl: "mongodb://localhost:27017"
};
```
Monitors elapsed time and the percentage of parsed data.
```js
// @param size - size of the stream in bytes
// @param logEvery - how often to log progress; default is every 10 percent

new ProgressStream(size, logEvery);
```
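For a local file, a small sketch that takes the size from the filesystem (assuming logEvery is a percentage step, as described above):

```js
const fs = require("fs");
const {
  TsvToObjectStream,
  ObjectToJsonStream,
  ProgressStream
} = require("tsv-to-database");

// For a local file the stream size can be read from the filesystem.
const size = fs.statSync("input.tsv").size;

fs.createReadStream("input.tsv")
  // log progress every 5 percent instead of the default 10
  .pipe(new ProgressStream(size, 5))
  .pipe(new TsvToObjectStream())
  .pipe(new ObjectToJsonStream())
  .pipe(fs.createWriteStream("output.json"));
```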
No vulnerabilities found.
OpenSSF Scorecard results (last scanned on 2025-07-07):

- no binaries found in the repo
- 0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
- Found 0/15 approved changesets -- score normalized to 0
- no SAST tool detected
- no effort to earn an OpenSSF best practices badge detected
- security policy file not detected
- license file not detected
- project is not fuzzed
- branch protection not enabled on development/release branches
- 66 existing vulnerabilities detected

The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects.