NodeJS – reading and processing a delimiter separated file (csv)

Frequently, there is a need to read data from a file, process it and route it onwards. In my case, the objective was to produce messages on a Kafka Topic. Regardless of the objective, though, the basic steps of reading the file and processing its contents come up often. In this article I show these basic steps with Node.js and the Node module csv-parse.

1. npm init process-csv

Enter a small number of details in the command line dialog. Shown in blue:

image

2. npm install csv-parse --save

This will install Node module csv-parse. This module provides processing of delimiter separated files.

image

This also extends the generated file package.json with a reference to csv-parse:

image

3. Implement file processFile.js

The logic to read records from a csv file and do something (write to console) with each record is very straightforward. In this example, I will read data from the file countries2.csv, a file with records for all countries in the world (courtesy of https://restcountries.eu/)

image

The fields are semicolon-separated; each record is on its own line.
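
To make the file layout concrete, here is a minimal sketch of how such text breaks down into records and fields. The sample rows and the column names (including `capital` for the fourth column) are made up for illustration and are not taken from countries2.csv. The hand-rolled split is only a stand-in: unlike csv-parse, it does not handle quoted fields.

```javascript
// Hypothetical sample in the same layout: semicolon-separated fields, one record per line.
// The values and the column names are invented for illustration only.
const sampleText = [
  'name;code;continent;capital;population;size',
  'Exampleland;EX;Europe;Sample City;1000000;50000',
  'Testonia;TS;Asia;Test Town;250000;12000'
].join('\n');

// Naive split: records on line breaks, fields on the delimiter.
// Assumption: no field contains a semicolon, a quote or a line break.
function splitRecords(text, delimiter) {
  return text
    .split(/\r?\n/)
    .filter(function (line) { return line.trim() !== ''; })
    .map(function (line) { return line.split(delimiter); });
}

const rows = splitRecords(sampleText, ';');
console.log(rows.length);   // number of records, header row included
console.log(rows[1]);       // fields of the first country record
```

The result is an array of arrays, which is also the shape that the csv-parse callback receives by default.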

 

/*
This program reads and parses all lines from the csv file countries2.csv into an array (data) of arrays; each nested array represents a country.
The file is read asynchronously as a stream that is piped into the parser. All country records are kept in memory until the callback runs.
*/

var fs = require('fs');
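// note: with csv-parse version 5 and later, the parser is a named export:
// var parse = require('csv-parse').parse;
var parse = require('csv-parse');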

var inputFile='countries2.csv';
console.log("Processing Countries file");

var parser = parse({delimiter: ';'}, function (err, data) {
    // stop if the file could not be parsed
    if (err) {
      console.error("Parsing failed: " + err.message);
      return;
    }
    // when all countries are available, then process them
    // note: array element at index 0 contains the row of headers, so skip it
    data.slice(1).forEach(function(line) {
      // create country object out of parsed fields
      var country = { "name" : line[0]
                    , "code" : line[1]
                    , "continent" : line[2]
                    , "population" : line[4]
                    , "size" : line[5]
                    };
      console.log(JSON.stringify(country));
    });
});

// read the inputFile, feed the contents to the parser
fs.createReadStream(inputFile).pipe(parser);
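
The mapping from a parsed row to a country object can also be pulled into a small function, which makes it easy to try out without a file. The sketch below is not part of the original program: the function names `toCountry` and `toCountries`, the numeric conversion of population and size, and the sample rows are all assumptions made for illustration.

```javascript
// Turn one parsed row (an array of strings) into a country object.
// Field positions follow the program above: 0 name, 1 code, 2 continent, 4 population, 5 size.
function toCountry(line) {
  return {
    name: line[0],
    code: line[1],
    continent: line[2],
    // assumption: population and size hold plain numbers;
    // Number('') yields 0 and non-numeric text yields NaN
    population: Number(line[4]),
    size: Number(line[5])
  };
}

// Skip the header row at index 0 and map the remaining rows.
function toCountries(data) {
  return data.slice(1).map(toCountry);
}

// Hypothetical rows in the shape the parser callback receives (values invented).
const exampleData = [
  ['name', 'code', 'continent', 'capital', 'population', 'size'],
  ['Exampleland', 'EX', 'Europe', 'Sample City', '1000000', '50000']
];

const countries = toCountries(exampleData);
console.log(JSON.stringify(countries[0]));
```

Inside the parser callback, the same function could be applied as toCountries(data).forEach(...), keeping the callback itself short.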

 

4. Run the file with node processFile.js:

image

About Author

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director and Oracle Developer Champion. Solution architect and developer on diverse areas including SQL, JavaScript, Kubernetes & Docker, Machine Learning, Java, SOA and microservices, events in various shapes and forms and many other things. Author of the Oracle Press book Oracle SOA Suite 12c Handbook. Frequent presenter on user groups and community events and conferences such as JavaOne, Oracle Code, CodeOne, NLJUG JFall and Oracle OpenWorld.

9 Comments

  1. Excellent post. Thank you.
    Question, please:

    Why does this only return the first character of the string?
    const searchkeywords = fs.readFileSync('kwords.csv','utf-8');
    for (let kword of searchkeywords) {
    console.log(`Search Keyword: ${kword}`);

    Text for the first 10 rows is:

    "1-800-FLOWERS.COM, INC.",
    "1ST SOURCE",
    "1ST SOURCE CORP",
    "3D SYSTEMS",
    "8X8, INC.",
    "A.H. BELO",
    "AAON",
    "AARON RENTS",
    "ABERCROMBIE & FITCH",
    "ABIOMED",

  2. Thank you for sharing your code. I am new to Node js. So, this is a big help. How do you make the function to be synchronous?

    • Lucas Jellema

      Hi phoang, if you want a synchronous file read, you can search for “node synchronous file read” or something similar. Kind regards, Lucas

  3. I receive this error : ReferenceError: parse is not defined. What is the problem?

    var csv = require('csv-parse');
    var fs = require('fs');
    var async = require('async');

    function readCSV (data, callback) {
    var inputFile='bulk_importData.csv';
    console.log("Processing Countries file");

    var parser = parse({delimiter: ';'}, function (err, data) {
    // when all countries are available,then process them
    // note: array element at index 0 contains the row of headers that we should skip
    data.forEach(function(line) {
    // create country object out of parsed fields
    var country = { "name" : line[0]
    , "code" : line[1]
    , "continent" : line[2]
    , "population" : line[4]
    , "size" : line[5]
    };
    console.log(JSON.stringify(country));
    });
    });
    // read the inputFile, feed the contents to the parser
    fs.createReadStream(inputFile).pipe(parser);
    };
