NodeJS – reading and processing a delimiter separated file (csv)

Lucas Jellema

Frequently, there is a need to read data from a file, process it and route it onwards. In my case, the objective was to produce messages on a Kafka Topic. However, regardless of the objective, the basic steps of reading the file and processing its contents are required often. In this article I show the very basic steps with Node.js and the Node module csv-parse.

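1. npm init

Run this in a new, empty directory such as process-csv. Recent npm versions treat npm init process-csv as a request to run an initializer package named create-process-csv, which does not exist in the registry.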

Enter a small number of details in the command line dialog. Shown in blue:

image

2. npm install csv-parse --save

This installs the Node module csv-parse, which provides processing of delimiter-separated files.

image

This also extends the generated file package.json with a reference to csv-parse:

image

3. Implement file processFile.js

The logic to read records from a csv file and do something (write to console) with each record is very straightforward. In this example, I will read data from the file countries2.csv, a file with records for all countries in the world (courtesy of https://restcountries.eu/).

image

The fields are semicolon-separated, and each record is on its own line.

 

/*
This program reads and parses all lines from the csv file countries2.csv into an array (data) of arrays; each nested array represents a country.
The file is read asynchronously as a stream. The parser keeps all country records in memory and hands them to the callback once the whole file has been parsed.
*/

var fs = require('fs');
// csv-parse 5 and later export the parse function by name; older versions export the function itself
var csvParse = require('csv-parse');
var parse = csvParse.parse || csvParse;

var inputFile='countries2.csv';
console.log("Processing Countries file");

var parser = parse({delimiter: ';'}, function (err, data) {
    // report parse problems instead of failing on undefined data
    if (err) {
      console.error("Could not parse " + inputFile + ": " + err.message);
      return;
    }
    // when all countries are available, then process them
    // note: array element at index 0 contains the row of headers, so skip it
    data.slice(1).forEach(function(line) {
      // create country object out of parsed fields
      var country = { "name" : line[0]
                    , "code" : line[1]
                    , "continent" : line[2]
                    , "population" : line[4]
                    , "size" : line[5]
                    };
      console.log(JSON.stringify(country));
    });
});

// read the inputFile, feed the contents to the parser
fs.createReadStream(inputFile).pipe(parser);

 

4. Run file with node processFile.js:

image
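
The program above keeps all country records in memory until the whole file has been parsed. For larger files, the records can also be handled one at a time as they are read. The following is a minimal sketch of that idea, not the csv-parse approach used above: it relies on Node's built-in readline module and a plain split on the semicolon, so it assumes that no field contains a quoted semicolon or a line break. The names processCountries and toCountry are hypothetical; the field positions are taken from the program above.

```javascript
const fs = require('fs');
const readline = require('readline');

// toCountry is a hypothetical helper; field positions follow the article's example.
// Assumption: fields never contain a quoted semicolon.
function toCountry(line) {
  const fields = line.split(';');
  return {
    name: fields[0],
    code: fields[1],
    continent: fields[2],
    population: fields[4],
    size: fields[5]
  };
}

// Reads inputFile as a stream and calls onCountry once per data row.
// Returns a promise that resolves with the number of rows processed.
async function processCountries(inputFile, onCountry) {
  const rl = readline.createInterface({
    input: fs.createReadStream(inputFile),
    crlfDelay: Infinity // treat \r\n as a single line break
  });
  let lineNumber = 0;
  let count = 0;
  for await (const line of rl) {
    lineNumber++;
    // skip the header row (first line) and any blank lines
    if (lineNumber === 1 || line.trim() === '') continue;
    onCountry(toCountry(line));
    count++;
  }
  return count;
}
```

Calling processCountries('countries2.csv', function (country) { console.log(JSON.stringify(country)); }) would print one JSON object per country, as the program above does, without first collecting all records in an array.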

11 thoughts on “NodeJS – reading and processing a delimiter separated file (csv)”

  1. npm init process-csv:

    npm ERR! code E404
    npm ERR! 404 Not Found – GET https://registry.npmjs.org/create-process-csv – Not found
    npm ERR! 404
    npm ERR! 404 ‘create-process-csv@latest’ is not in the npm registry.
    npm ERR! 404 You should bug the author to publish it (or use the name yourself!)
    npm ERR! 404
    npm ERR! 404 Note that you can also install from a
    npm ERR! 404 tarball, folder, http url, or git url.

  2. Excellent post. Thank you.
    Question, please:

    Why does this only return the first character of the string?
    const searchkeywords = fs.readFileSync('kwords.csv','utf-8');
    for (let kword of searchkeywords) {
    console.log(`Search Keyword: ${kword}`);

    Text for the first 10 rows is:

    "1-800-FLOWERS.COM, INC.",
    "1ST SOURCE",
    "1ST SOURCE CORP",
    "3D SYSTEMS",
    "8X8, INC.",
    "A.H. BELO",
    "AAON",
    "AARON RENTS",
    "ABERCROMBIE & FITCH",
    "ABIOMED",

  3. Thank you for sharing your code. I am new to Node js. So, this is a big help. How do you make the function to be synchronous?

    1. Hi phoang, If you want synchronous file read you can google for “node synchronous file read” or something similar. kind regards, Lucas

  4. I receive this error : ReferenceError: parse is not defined. What is the problem?

    var csv = require('csv-parse');
    var fs = require('fs');
    var async = require('async');

    function readCSV (data, callback) {
      var inputFile='bulk_importData.csv';
      console.log("Processing Countries file");

      var parser = parse({delimiter: ';'}, function (err, data) {
        // when all countries are available,then process them
        // note: array element at index 0 contains the row of headers that we should skip
        data.forEach(function(line) {
          // create country object out of parsed fields
          var country = { "name" : line[0]
                        , "code" : line[1]
                        , "continent" : line[2]
                        , "population" : line[4]
                        , "size" : line[5]
                        };
          console.log(JSON.stringify(country));
        });
      });
      // read the inputFile, feed the contents to the parser
      fs.createReadStream(inputFile).pipe(parser);
    };
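
Regarding the question in the second comment: fs.readFileSync with an encoding returns the whole file as a single string, and for...of over a string steps through it one character at a time, which is why each iteration prints a single character. Splitting the string into lines first gives one keyword per iteration. A minimal sketch, assuming every line has the form "KEYWORD", as in the rows quoted in that comment; parseKeywords and readKeywords are hypothetical helper names. Because readFileSync blocks until the file has been read, this is also one way to do the synchronous read asked about in the third comment.

```javascript
const fs = require('fs');

// parseKeywords is a hypothetical helper: it turns the file contents
// (one string) into an array with one keyword per line.
function parseKeywords(contents) {
  return contents
    .split(/\r?\n/)                        // one entry per line
    .map(line => line.trim())
    .filter(line => line.length > 0)       // drop blank lines
    .map(line => line
      .replace(/,$/, '')                   // drop the trailing comma
      .replace(/^"|"$/g, ''));             // drop the surrounding quotes
}

// readKeywords reads the whole file synchronously, then parses it.
function readKeywords(file) {
  return parseKeywords(fs.readFileSync(file, 'utf-8'));
}

// Usage, with the file name from the comment:
// for (const kword of readKeywords('kwords.csv')) {
//   console.log(`Search Keyword: ${kword}`);
// }
```

Commas inside the quotes, as in "8X8, INC.", are left alone because the sketch only removes a comma at the very end of the line.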
