tktk

The TK toolkit. Utilities for dealing with data in Node.js.

The TK Toolkit

A work-in-progress collection of utilities to help with reading, transforming and writing data. It's a collection of the following libraries:

  • Indian Ocean for reading and writing data files (json, csv or a variety of delimited-value files). Converts spreadsheet-like data to json.
  • Tablespoon for creating sqlite and postgresql databases from json objects.
  • Joiner for doing left joins on two json or geojson objects.
  • Simple statistics for performing simple stats operations like the mean (average) and some less-simple things like variance, jenks clustering, t-tests and bayesian classification.

Installation

npm install tktk

Functions

See the indian-ocean documentation for the most up-to-date list.

Reading data

Uses the indian-ocean module. Reads a variety of data file formats in as json.

.readData(filepath, [delimiter], callback)

Reads in a data file given a path ending in the file format. Callback structure is function(err, data).

Supported formats:

  • .json Array of objects
  • .csv Comma-separated
  • .tsv Tab-separated
  • .psv Pipe-separated

Pass in a delimiter as the second argument to read in another format.

Note: Does not currently support .dbf files.
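The extension-to-delimiter lookup described above can be sketched in plain Node.js. This is an illustration only, not the actual tktk or indian-ocean implementation: delimiterFor and parseDelimited are hypothetical names, and the parser splits naively on the delimiter, so it does not handle quoted fields.

```javascript
var path = require('path');

// Hypothetical lookup table: file extension -> delimiter.
var DELIMITERS = { '.csv': ',', '.tsv': '\t', '.psv': '|' };

// An explicit delimiter wins; otherwise infer one from the file extension.
function delimiterFor(filepath, delimiter) {
  return delimiter || DELIMITERS[path.extname(filepath).toLowerCase()];
}

// Naive parser: the first line is the header, every other line becomes an object.
// Assumption: no quoted fields and no delimiters or newlines inside values.
function parseDelimited(text, delimiter) {
  var lines = text.split(/\r?\n/).filter(function (line) { return line !== ''; });
  if (lines.length === 0) { return []; }
  var header = lines[0].split(delimiter);
  return lines.slice(1).map(function (line) {
    var cells = line.split(delimiter);
    var row = {};
    header.forEach(function (name, i) { row[name] = cells[i]; });
    return row;
  });
}

var rows = parseDelimited('name|city\nAda|London\n', delimiterFor('people.psv'));
console.log(rows);
```

Passing a second argument, as in delimiterFor('data.txt', ';'), corresponds to supplying a custom delimiter for a format that is not in the list.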

.readDataSync(filepath, [delimiter])

Synchronous version of .readData().

.readJson(filepath, callback)

Read in a json file. Callback structure is function(err, data).

.readJsonSync(filepath)

Read json synchronously.
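To show how the function(err, data) callback form differs from the Sync form, here is a minimal sketch built only on Node's fs module. The names readJsonSketch and readJsonSyncSketch and the fixture file are hypothetical; the real tktk methods may behave differently, for example in how they report parse errors.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical async reader: failures arrive as the callback's first argument.
function readJsonSketch(filepath, callback) {
  fs.readFile(filepath, 'utf8', function (err, text) {
    if (err) { return callback(err); }
    var data;
    try {
      data = JSON.parse(text);
    } catch (parseErr) {
      return callback(parseErr);
    }
    callback(null, data);
  });
}

// Hypothetical sync reader: returns the data directly and throws on failure.
function readJsonSyncSketch(filepath) {
  return JSON.parse(fs.readFileSync(filepath, 'utf8'));
}

// Write a small fixture to the temp directory so the sketch runs end to end.
var fixture = path.join(os.tmpdir(), 'tktk-sketch.json');
fs.writeFileSync(fixture, JSON.stringify([{ id: 1 }]));

var syncData = readJsonSyncSketch(fixture);
console.log('sync:', syncData);

readJsonSketch(fixture, function (err, data) {
  if (err) { throw err; }
  console.log('async:', data);
});
```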

.readCsv(filepath, callback)

Read in a comma-separated value file. Callback structure is function(err, data).

.readCsvSync(filepath)

Read csv synchronously.

.readTsv(filepath, callback)

Read in a tab-separated value file. Callback structure is function(err, data).

.readTsvSync(filepath)

Read tsv synchronously.

.readPsv(filepath, callback)

Read in a pipe-separated value file. Callback structure is function(err, data).

.readPsvSync(filepath)

Read psv synchronously.

.readDbf(filepath, callback)

Read in a .dbf file. Callback structure is function(err, data).

Writing data

Uses the indian-ocean module. Writes json objects to the specified format.

.writeData(filepath, data, callback)

Write out the data object, inferring the file format from the file ending specified in filepath. Callback structure is function(err, data).

Supported formats:

  • .json Array of objects
  • .csv Comma-separated
  • .tsv Tab-separated
  • .psv Pipe-separated

Note: Does not currently support .dbf files.

.writeDataSync(filepath, data)

Synchronous version of .writeData(). As the signature shows, it takes no callback.

.writeDbfToData(inFilepath, outFilepath, callback)

Reads in a dbf file with .readDbf and writes it to a file using .writeData. Callback structure is function(err).

Joining data

Uses the joiner module. All methods return an object with the following structure:

data: [data object],
report: {
    diff: {
        a: [data in A],
        b: [data in B],
        a_and_b: [data in A and B],
        a_not_in_b: [data in A not in B],
        b_not_in_a: [data in B not in A]
    },
    prose: {
        summary: [summary description of join result, number of matches in A and B, A not in B, B not in A.],
        full:    [full list of which rows were joined in each of the above categories]
    }
}

.left(leftData, leftDataKey, rightData, rightDataKey, [nestedKeyName])

Perform a left join on the two array-of-object json datasets. Optionally, you can pass in a key name in case the left data's attribute dictionary is nested, such as in GeoJson where the attributes are under a properties object.
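As a rough illustration of what a left join with an optional nested key involves, here is a sketch in plain JavaScript. It is not joiner's implementation: leftJoinSketch is a hypothetical name, it fills in only two of the report fields listed earlier, and how the real module treats unmatched rows or duplicate keys is not covered here.

```javascript
// Hypothetical left join: every left row is kept, matching right rows are merged in.
function leftJoinSketch(leftData, leftKey, rightData, rightKey, nestedKeyName) {
  // Index the right rows by their join key for direct lookup.
  var rightByKey = Object.create(null);
  rightData.forEach(function (row) { rightByKey[row[rightKey]] = row; });

  var matched = [];
  var unmatched = [];
  var data = leftData.map(function (leftRow) {
    // With a nested key, the attributes live one level down (e.g. `properties`).
    var attrs = nestedKeyName ? leftRow[nestedKeyName] : leftRow;
    var match = rightByKey[attrs[leftKey]];
    (match ? matched : unmatched).push(attrs[leftKey]);
    var merged = Object.assign({}, attrs, match || {});
    if (!nestedKeyName) { return merged; }
    // Copy the outer object and swap in the merged attributes.
    var out = Object.assign({}, leftRow);
    out[nestedKeyName] = merged;
    return out;
  });

  return { data: data, report: { diff: { a_and_b: matched, a_not_in_b: unmatched } } };
}

var left = [{ id: 'NY', name: 'New York' }, { id: 'CA', name: 'California' }];
var right = [{ code: 'NY', pop: 19 }];
var result = leftJoinSketch(left, 'id', right, 'code');
console.log(JSON.stringify(result, null, 2));
```

Indexing the right-hand rows first keeps the join to one pass over each dataset instead of a nested loop.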

.geoJson(leftData, leftDataKey, rightData, rightDataKey)

Does the same thing as .left but navigates to the features array and passes in properties as the nested key name.
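The GeoJSON case can be sketched the same way: walk the features array and merge matching rows into each feature's properties. The function name geoJsonJoinSketch and the sample data are hypothetical, and this is a simplified stand-in rather than joiner's own code.

```javascript
// Hypothetical GeoJSON join: match properties[leftKey] against rightData[rightKey].
function geoJsonJoinSketch(featureCollection, leftKey, rightData, rightKey) {
  var rightByKey = Object.create(null);
  rightData.forEach(function (row) { rightByKey[row[rightKey]] = row; });

  var features = featureCollection.features.map(function (feature) {
    var match = rightByKey[feature.properties[leftKey]] || {};
    // Copy the feature so the input is not mutated; only `properties` changes.
    return Object.assign({}, feature, {
      properties: Object.assign({}, feature.properties, match)
    });
  });
  return Object.assign({}, featureCollection, { features: features });
}

var geo = {
  type: 'FeatureCollection',
  features: [
    { type: 'Feature', geometry: null, properties: { fips: '36' } }
  ]
};
var joined = geoJsonJoinSketch(geo, 'fips', [{ fips: '36', name: 'New York' }], 'fips');
console.log(joined.features[0].properties);
```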

Database operations

Uses the tablespoon module. Check out the wiki for the full documentation. All tablespoon methods are accessible under the tk.db namespace, e.g.

tk.db.createTableSync(data);

Statistics

Uses the simple-statistics module. All simple-statistics methods are accessible under the tk.stats namespace, e.g.

var mean = tk.stats.mean([1, 4, 19, 55]);

Helpers

.discernFormat(filepath)

Given a filepath, return its file extension. Used internally by .discernParser and .discernFileFormatter.

E.g. tk.discernFormat('path/to/data.csv') returns 'csv'
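A helper with this behavior can be approximated with Node's built-in path module. This is a sketch under the assumption that the format is simply the text after the last dot; discernFormatSketch is a hypothetical name, not the library's code.

```javascript
var path = require('path');

// Hypothetical equivalent: the file extension without its leading dot.
function discernFormatSketch(filepath) {
  return path.extname(filepath).slice(1);
}

console.log(discernFormatSketch('path/to/data.csv'));
```

Under this assumption a path with no extension yields an empty string, and only the last extension of a name like archive.tar.gz is returned.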

.discernParser(filepath, [delimiter])

Given a filepath and, optionally, a delimiter, return a parser that can read that file as json. Used internally by .readData and .readDataSync.

E.g.

var csvParser = tk.discernParser('path/to/data.csv');

var json = csvParser('path/to/data.csv');

.discernFileFormatter(filepath)

Returns a formatter that will format json data to the file type specified by the extension in filepath. Used internally by .writeData and .writeDataSync.

E.g.

var formatter = tk.discernFileFormatter('path/to/data.tsv');
var tsv = formatter(json);
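One way to picture a formatter lookup like this is a table mapping extensions to functions that turn an array of objects into a string. The sketch below is illustrative only: discernFileFormatterSketch, delimitedFormatter and FORMATTERS are hypothetical names, and the delimited formatter assumes every row shares the first row's keys and that no value needs quoting.

```javascript
var path = require('path');

// Build a formatter for delimited output.
// Assumption: values contain no delimiters, quotes, or newlines.
function delimitedFormatter(delimiter) {
  return function (rows) {
    if (rows.length === 0) { return ''; }
    var header = Object.keys(rows[0]);
    var lines = rows.map(function (row) {
      return header.map(function (name) { return row[name]; }).join(delimiter);
    });
    return [header.join(delimiter)].concat(lines).join('\n');
  };
}

// Hypothetical lookup: file extension -> formatter function.
var FORMATTERS = {
  '.json': function (rows) { return JSON.stringify(rows); },
  '.csv': delimitedFormatter(','),
  '.tsv': delimitedFormatter('\t'),
  '.psv': delimitedFormatter('|')
};

function discernFileFormatterSketch(filepath) {
  return FORMATTERS[path.extname(filepath).toLowerCase()];
}

var toTsv = discernFileFormatterSketch('path/to/data.tsv');
var tsvText = toTsv([{ name: 'Ada', city: 'London' }]);
console.log(tsvText);
```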

fs

Exposes the native File System module for convenience.

What's the name mean?

In news writing, TK is used as a placeholder for facts or sections you don't have yet. For example:

Mrs. Williamson arrived at the office at TK EXACT TIME to speak with the board members.

Depending on whom you ask, it either stands for TO COME if you like your acronyms phonetic or TO KNOW if you don't mind the silent 'K'.

What's that have to do with this?

This library is a work in progress so it's largely TO COME. You could also say you can use it TO KNOW things since it's a collection of data utilities. Or you could say it's a (T)ool(K)it of toolkits: a TK TK.

Homepage

https://github.com/mhkeller/tktk

Repository

https://github.com/mhkeller/tktk.git

