Parsing CSV files in Ruby

Working with CSV files is something every programmer has to do eventually. Fortunately, Ruby's built-in CSV library puts plenty of tools at our disposal.

Handling data in various formats is a common task in software, and CSV (Comma-Separated Values) files are among the most prevalent data formats you'll encounter. Whether for data migration, reporting, or simply importing and exporting data, processing CSV files efficiently is a necessary skill for any Ruby developer.

In this article, we'll learn the practical aspects of parsing and handling CSV data using Ruby. We'll cover various techniques, from parsing files line-by-line for memory efficiency to dealing with CSV data from form inputs and even integrating CSV parsing into a Rails application.

Parsing a file line-by-line

A simple way to parse a CSV file in Ruby is to read the entire file into an array using CSV.read. Because the array grows with the file, memory use scales with the size of the CSV, which makes this approach a poor fit for large files.
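
For comparison, here is a minimal sketch of the CSV.read approach. The load_all_rows helper and the sample data are made up for illustration, and a temporary file stands in for a real CSV:

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: loads every row of the file at `path` into memory.
def load_all_rows(path)
  CSV.read(path) # returns an Array of Arrays, one inner Array per row
end

# Write a small sample file so the example is self-contained.
Tempfile.create(["people", ".csv"]) do |file|
  file.write("Name,Email\nAda,ada@example.com\n")
  file.flush

  rows = load_all_rows(file.path)
  puts rows.length        # number of rows, header line included
  puts rows.first.inspect # the header row, as an Array of Strings
end
```

This is convenient for small files, since the whole data set is available as one array, but every row stays in memory at once.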

One of the most common methods of parsing a CSV file in Ruby is to read the file line-by-line, processing each row individually. Ruby's CSV library simplifies this task with its CSV.foreach method. This method is beneficial when dealing with large files, as it reads one line at a time, keeping memory usage low.

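As a prerequisite, make sure the CSV library is available. It ships with Ruby, so in most cases there is nothing to install. (As of Ruby 3.4, csv is a bundled gem rather than a default gem, so a project that uses Bundler needs gem "csv" in its Gemfile.) Start by requiring it at the top of your code: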

require "csv"

The CSV.foreach method is easy to use. Here's the basic syntax:

require "csv"

CSV.foreach("directory/filename.csv") do |row|
  # Process each row. In this example, print the row,
  # which is an Array of Strings (one String per column).
  puts row.inspect
end

Running the above code will print every row of the CSV as an array of strings, where each string is the value of one column. Inside the block, you can instead add logic to process each row in a way other than printing it. We'll explore that in a later section.
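
As a small illustration of processing rows instead of printing them, the sketch below sums one column while streaming the file. The column_total helper, the Item and Amount headers, and the sample data are assumptions made for this example:

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: streams the file row by row and sums one numeric column.
def column_total(path, header)
  total = 0
  # converters: :numeric turns fields that look like numbers into Integer or Float.
  CSV.foreach(path, headers: true, converters: :numeric) do |row|
    value = row[header]
    total += value if value.is_a?(Numeric) # skip blank or non-numeric fields
  end
  total
end

# Write a small sample file so the example is self-contained.
Tempfile.create(["orders", ".csv"]) do |file|
  file.write("Item,Amount\nBook,10\nPen,2.5\nGift,\n")
  file.flush

  puts column_total(file.path, "Amount")
end
```

Only the running total is kept between rows, so memory use stays low no matter how long the file is.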

Parsing input from a string

While CSV.foreach is excellent for file processing, sometimes you must parse CSV data from a form input or a string. This is where Ruby's CSV.parse method comes into play. It's ideal for smaller data sets or when you have your CSV data in a string format. The CSV.parse method converts a string of CSV data into an array of arrays, with each inner array representing a row of CSV data. Here's how you can use it:

require "csv"

# Make some fake CSV data in a string (this would likely come from form input)
csv_data = "Name,Email\nJeff,jeff@jeffmorhous.com\nHoneyBadger,support@honeybadger.com"
parsed_data = CSV.parse(csv_data)

parsed_data.each do |row|
  puts row.inspect
end

CSV files often include headers. You can tell the CSV.parse method to treat the first row as headers, then use that data to help with parsing:

require "csv"

csv_data = "Name,Email\nJeff,jeff@jeffmorhous.com\nHoneyBadger,support@honeybadger.com"
parsed_data = CSV.parse(csv_data, headers: true) # Note "headers: true"

parsed_data.each do |row|
  puts row["Name"] # Accesses only the column associated with the "Name" header
end
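
If plain hashes are easier to work with than CSV rows, one option is to convert each row with to_h. The sketch below also uses the header_converters: :symbol option so that headers become symbols; the rows_as_hashes helper is a name invented for this example:

```ruby
require "csv"

# Hypothetical helper: parses a CSV string and returns one Hash per data row,
# keyed by the symbolized header (for example "Name" becomes :name).
def rows_as_hashes(csv_string)
  table = CSV.parse(csv_string, headers: true, header_converters: :symbol)
  table.map(&:to_h)
end

csv_data = "Name,Email\nJeff,jeff@jeffmorhous.com\nHoneyBadger,support@honeybadger.com"

rows_as_hashes(csv_data).each do |person|
  puts "#{person[:name]} <#{person[:email]}>"
end
```

Hashes with symbol keys are convenient to pass to other code, such as a method that takes keyword-style attributes.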

Writing a parser to import a list of courses into a Rails app as Course records

When working with a Rails application, you might need to import data into your models. For instance, let's consider a scenario where you have a list of courses in CSV format that you want to import into your Rails application as Course models.

First, ensure that you have a Course model in your Rails application. It might look something like this:

class Course < ApplicationRecord
  # your code
end

You'll need to create a parser that reads the CSV file and creates Course records. You can do this in a Rake task, a service object, or directly in a controller action, depending on your needs and the application's overall architecture.

Here's a simple rake task that would accomplish this:

# Place in lib/tasks/
require "csv"

namespace :import do
  desc "Import courses from a CSV file"
  task courses: :environment do
    CSV.foreach("directory/filename.csv", headers: true) do |row|
      Course.create!(
        title: row["Title"],
        description: row["Description"],
        status: row["Status"]
      )
    end
  end
end

This Rake task assumes the file you'd like to import is at directory/filename.csv and that it contains Title, Description, and Status headers. For each row, it creates and persists a Course record with the data in those columns. However, this script is fragile: it does not handle malformed or missing data, and because create! raises an exception when a record fails validation, a single bad row stops the import partway through.
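
One way to make the import less fragile is to validate each row and collect problems instead of stopping at the first one. The following is a sketch in plain Ruby rather than a drop-in replacement for the task: parse_courses and REQUIRED_HEADERS are hypothetical names, and the returned hashes are what a Rails app could pass to Course.create!:

```ruby
require "csv"
require "tempfile"

REQUIRED_HEADERS = %w[Title Description Status].freeze

# Hypothetical helper: returns [courses, errors]. Each course is a Hash of
# attributes; each error is a message describing a skipped row or a parse failure.
def parse_courses(path)
  courses = []
  errors = []

  # Data starts on line 2 because line 1 holds the headers. The line number
  # is only approximate if a quoted field spans several lines.
  CSV.foreach(path, headers: true).with_index(2) do |row, line|
    missing = REQUIRED_HEADERS.select { |header| row[header].to_s.strip.empty? }

    if missing.empty?
      courses << { title: row["Title"], description: row["Description"], status: row["Status"] }
    else
      errors << "Line #{line}: missing #{missing.join(', ')}"
    end
  end

  [courses, errors]
rescue CSV::MalformedCSVError => e
  # Raised when the parser hits invalid CSV, such as an unclosed quote.
  # Rows collected before the failure are still returned.
  errors << "Malformed CSV: #{e.message}"
  [courses, errors]
end

# Write a small sample file so the example is self-contained.
Tempfile.create(["courses", ".csv"]) do |file|
  file.write("Title,Description,Status\nRuby 101,Intro to Ruby,active\n,No title here,draft\n")
  file.flush

  courses, errors = parse_courses(file.path)
  puts "#{courses.length} valid, #{errors.length} skipped"
  errors.each { |message| puts message }
end
```

Separating parsing from persistence also makes it possible to report every problem to the user before any record is written.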

You can run this rake task with:

rails import:courses

Exporting Course.all into a CSV

The Ruby CSV library can be used for more than just importing data into a Rails app. It can also provide export functionality—exporting data from your application's database into a CSV.

The first step is to decide where the export logic should live. It can go anywhere in the application; the choice is a design decision. For simplicity, let's assume you're doing it in a controller action.

Here's a controller export action that creates a CSV file from the Course model:

require "csv"

class CoursesController < ApplicationController
  def export
    courses = Course.all
    send_data courses_to_csv(courses), filename: "courses-#{Date.today}.csv"
  end

  private

  def courses_to_csv(courses)
    CSV.generate(headers: true) do |csv|
      # Use the same attributes the import task populated.
      csv << ["Title", "Description", "Status"]

      courses.each do |course|
        csv << [course.title, course.description, course.status]
      end
    end
  end
end

Exporting your application records to a CSV file can be done efficiently with Ruby's CSV library, and integrating it into a Rails application is straightforward. Users often love having the ability to export their data!
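
To see what CSV.generate produces without a running Rails app, here is a minimal sketch. CourseRecord is a hypothetical Struct standing in for the Course model, with the title, description, and status attributes used by the import task:

```ruby
require "csv"

# Hypothetical stand-in for the Course model so the sketch runs outside Rails.
CourseRecord = Struct.new(:title, :description, :status)

# Builds a CSV string with a header row followed by one row per course.
def courses_to_csv(courses)
  CSV.generate do |csv|
    csv << ["Title", "Description", "Status"]

    courses.each do |course|
      csv << [course.title, course.description, course.status]
    end
  end
end

courses = [
  CourseRecord.new("Ruby 101", "Intro, with examples", "active")
]

puts courses_to_csv(courses)
```

Note that the description contains a comma, so the library wraps that field in quotes when writing it. This is one reason to generate CSV with the library rather than joining strings by hand.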

Conclusion

Ruby's simplicity and power make it an excellent choice for data processing tasks. We've covered several aspects of CSV handling: parsing files line-by-line to handle large datasets efficiently, parsing form inputs for web applications, and importing and exporting CSV data in a Rails application. Now that you understand how to work with data in CSV files, you could try making the import script more robust, for example by adding error handling for malformed data!

    Jeffery Morhous

    Jeff is a Software Engineer working in healthcare technology using Ruby on Rails, React, and plenty more tools. He loves making things that make life more interesting and learning as much as he can on the way. In his spare time, he loves to play guitar, hike, and tinker with cars.