Tembo releases open-source tool for data mapping

Tembo’s engineering team has built a CSV-to-JSON conversion tool for handling large datasets, now available as an open-source command-line utility.

JSON (JavaScript Object Notation) is a data-interchange format that is easy for humans to read and write, and easy for machines to parse and generate. It also matches the way data is represented in JavaScript, the programming language that powers nearly every modern website.

Most of our clients, data analysts and data scientists, are much more comfortable working with tabular data, such as comma-separated values (CSV) files. CSV files have been in wide use for many years. However, converting flat tabular files into nested JSON object structures is a complicated process, and the tooling for it is fairly underdeveloped.

One way to solve the challenges of conversion is to “map” columns in a tabular data file to different “attributes” in JSON objects. To do this simply and flexibly, Tembo recommends using a JSON mapping document, which is itself written in JSON. This simple document links columns in a CSV tabular data file to the specific attributes of a JSON object in a way that is machine-readable and consistent.
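To make the mapping idea concrete, here is a minimal Python sketch. It is an illustration only and not the format or code used by Tembo's utility: the mapping layout (CSV column name to dot-separated attribute path), the column names, and the helper functions set_path and convert_rows are all assumptions made for this example.

```python
import csv
import io
import json

# Hypothetical mapping document (an assumption for this sketch, not
# Tembo's actual format): each key is a CSV column name and each value
# is a dot-separated path to an attribute in the output JSON object.
MAPPING_JSON = """
{
  "id": "id",
  "first_name": "name.first",
  "last_name": "name.last",
  "city": "address.city"
}
"""


def set_path(obj, path, value):
    """Set a nested attribute, creating intermediate objects as needed."""
    keys = path.split(".")
    for key in keys[:-1]:
        obj = obj.setdefault(key, {})
    obj[keys[-1]] = value


def convert_rows(csv_file, mapping):
    """Yield one JSON-ready dict per CSV row, reading rows one at a time."""
    for row in csv.DictReader(csv_file):
        record = {}
        for column, path in mapping.items():
            set_path(record, path, row[column])
        yield record


if __name__ == "__main__":
    mapping = json.loads(MAPPING_JSON)
    # A small in-memory CSV stands in for a real file here.
    sample = io.StringIO(
        "id,first_name,last_name,city\n"
        "1,Ada,Lovelace,London\n"
    )
    for record in convert_rows(sample, mapping):
        print(json.dumps(record))
```

Because convert_rows is a generator, rows are converted one at a time rather than loaded into memory together, which is one common way to cope with very large files. All values stay as strings in this sketch; a fuller mapping document would likely also declare data types.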

We recognized a need for CSV-to-JSON conversion that can handle very large datasets, and we built and released this command-line utility in response.

