Tilo Sloboda edited this page Dec 28, 2023 · 26 revisions

SmarterCSV 2.x

SmarterCSV 2 is a Ruby Gem for smarter importing of CSV Files as Arrays of Hashes, suitable for parallel processing with Sidekiq or Resque, as well as direct processing of the resulting hashes with Rails, e.g. ActiveRecord or Mongoid.

SmarterCSV 2 is still an experimental pre-release, so if you want to use the stable 1.x version of this gem, please check the README for the main branch.

Why?

The API of Ruby's CSV library is pretty old, and its way of processing CSV files, returning Arrays of Arrays, feels 'very close to the metal'. The output is not easy to use, especially if you want to create database records from it. Another shortcoming is that Ruby's CSV library does not have good support for huge CSV files: for example, there is no support for 'chunking' or for parallel processing of the CSV content (e.g. with Sidekiq).
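To illustrate the 'Arrays of Arrays' point, here is a minimal sketch using only Ruby's standard CSV library. The sample data and the variable names (`csv_text`, `rows`, `records`) are made up for this example, and the manual conversion shown is just one way to get hashes. It is not how SmarterCSV works internally.

```ruby
require "csv"

# Hypothetical sample data, made up for this illustration.
csv_text = <<~TEXT
  first_name,last_name,age
  Ada,Lovelace,36
  Alan,Turing,41
TEXT

# Without options, CSV.parse returns an Array of Arrays.
# The header row is just the first inner array, and every value is a String.
rows = CSV.parse(csv_text)

# To get attribute hashes (e.g. for an ORM's create method),
# the header has to be split off and zipped with each data row by hand.
header, *data = rows
keys = header.map(&:to_sym)
records = data.map { |values| keys.zip(values).to_h }

records.each { |record| puts record.inspect }
```

Each element of `records` is now a hash keyed by the header names, which is the shape an ORM expects. The values are still Strings, so any type conversion is left to the caller.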

As the existing CSV libraries didn't fit my needs, I wrote my own CSV processing, specifically for use with Rails ORMs like ActiveRecord, Mongoid, or MongoMapper. In those ORMs you can easily pass a hash of attribute/value pairs to the create() method. The lower-level Mongo driver and Moped also accept arrays of such hashes, so you can create a large number of records quickly with just one call.

Contents