To really master the command line you have to master dozens — if not hundreds — of small utility programs. Each of these does things slightly differently. It can be pretty overwhelming.
Fortunately, it's possible to replace a lot of these single-purpose tools with a general-purpose programming language like Ruby. That way you can use the Ruby knowledge you already have to level up your command-line-fu.
This post will take you through the fundamentals of using Ruby as a command-line Swiss Army knife. I'm not going to bombard you with clever one-liners. Instead, we'll look at how things really work, so that hopefully you'll be able to apply these techniques to solve your own problems.
Using Ruby from the command line
I'm sure you know that you can run Ruby programs from the command line like so:
$ ruby myprogram.rb
But did you know that you can pipe in code to be executed by Ruby?
$ echo "puts 2+2" | ruby
4
Even more useful is the ability to pass in code as a command-line argument. This is what we are going to spend our time on today.
$ ruby -e 'puts 2+2'
4
Hey Newbies
Are the examples above kind of confusing? It may be because you're not familiar with pipes and redirection. Check out this post for a good intro. We're going to be using pipes a lot below.
Working with input
Most command-line tools take in some data, process it, and then spit it back out.
We've got two good options for getting input: command-line arguments and STDIN. Let's take a look at each of them.
Command-line arguments
You can send as many command-line arguments as you like to your script. Just put them after everything else:
$ ruby -e '<your code here>' arg1 arg2 arg3 etc
These arguments are stored in the ARGV array. In the example below I'm dumping the whole array so you can see what's in it.
$ ruby -e 'puts ARGV.inspect' apples bananas pears oranges
["apples", "bananas", "pears", "oranges"]
It's worth noting that this is exactly how Ruby always behaves. There is no magic happening just because we're using the command line. Check out this post about ARGV for more details.
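One consequence worth spelling out: every element of ARGV is a String, even when it looks like a number. Here's a minimal sketch of what that means in practice. The sum_args helper is a name I made up for illustration, and it takes a literal array so it runs without real arguments.

```ruby
# Hypothetical helper (not part of Ruby): add up numeric command-line
# arguments. ARGV elements are always Strings, so each one has to be
# converted before doing arithmetic.
def sum_args(args)
  args.sum { |arg| Float(arg) } # Float() raises ArgumentError on input like "abc"
end

# A literal array stands in for ARGV so the example runs on its own.
# In a real script you would call sum_args(ARGV).
puts sum_args(["1", "2.5", "3"])
```

Using Float() rather than to_f is a deliberate choice here: to_f quietly turns "abc" into 0.0, while Float() complains loudly.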
Dumb example
Imagine for a second that I am super egotistical. The most important thing to me is to know how many times my name is mentioned on the web. Using the techniques we've seen, I can easily write a one-liner to calculate this for any webpage.
$ ruby -e "require 'open-uri'; puts URI.open(ARGV.first).read.scan(/starr/i).count" <url here>
STDIN
Command-line arguments are great, but they're only good for short values. You wouldn't want to use them to input, say, the unabridged text of Moby Dick. For that we want to use STDIN.
If you're not familiar with STDIN, don't be intimidated. For our purposes here it behaves just like any other file open for reading.
Here's what I mean. In the example below we are piping some text into Ruby. Our Ruby script is reading it from STDIN and printing it to the screen.
echo "bananas!" | ruby -e "puts STDIN.read"
bananas!
We can easily input larger amounts of data by using cat. The example below uses the first method, which is available on any file, to grab the first few lines of the text:
cat moby.txt | ruby -e "puts STDIN.first(3)"
Call me Ishmael. Some years ago--never mind how long precisely--having
little or no money in my purse, and nothing particular to interest me on
shore, I thought I would sail about a little and see the watery part of
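To make the "just like any other file" point concrete, here's a small sketch that swaps STDIN for a StringIO, which responds to the same reading methods. The sample text is made up. Using StringIO is only so the example runs without a pipe.

```ruby
require "stringio"

# StringIO stands in for STDIN so this runs without piping anything in.
input = StringIO.new("line one\nline two\nline three\nline four\n")

first_two = input.first(2) # same call as STDIN.first(2); stops after two lines
input.rewind               # a StringIO can rewind; a pipe usually cannot
everything = input.read    # same call as STDIN.read

puts first_two.inspect
puts everything.lines.length
```

Note that each line keeps its trailing newline, which is why puts on the result doesn't print blank lines in between.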
Dumb example
Now that we know how to consume STDIN, let's rewrite the dumb example from above. Instead of using Ruby to fetch the webpage, we can use curl and only use Ruby for the pattern matching.
curl <MY URL> | ruby -e "puts STDIN.read.scan(/starr/i).size"
STDIN with syntactic sugar!
When you are working with STDIN, it's very common to have to loop over each line of input. Imagine that I want to get the file extension for every file in a directory. Here's how I might do that using a normal STDIN loop:
ls | ruby -e 'STDIN.each_line { |l| puts l.split(".").last }'
rb
rb
csv
Since the STDIN loop is so common, Ruby provides a shorthand. If we run our script with the -n flag, Ruby will automatically loop over each line in STDIN. The current line is in the global variable $_.
So we can rewrite the example above like so:
ls | ruby -n -e 'puts $_.split(".").last'
It's up to you whether or not you want to use the shorthand. While it definitely does mean you have to write less code, it also means that you have to remember more arbitrary facts like -n and $_.
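If -n feels too magical, it may help to see the loop it stands for. The Ruby man page describes -n as assuming a while gets ... end loop around your script. Here's a sketch of that loop written out by hand. The extensions method is a hypothetical name of mine, it uses a local variable where -n would use $_, and a StringIO stands in for the output of ls.

```ruby
require "stringio"

# Hypothetical method that spells out the kind of loop `ruby -n` writes
# for you. With -n the current line lives in $_; here it is an ordinary
# local variable called `line`.
def extensions(io)
  result = []
  while (line = io.gets) # gets returns nil at end of input, ending the loop
    result << line.chomp.split(".").last
  end
  result
end

# Made-up file names standing in for the output of `ls`.
listing = StringIO.new("app.rb\nspec_helper.rb\ndata.csv\n")
puts extensions(listing)
```

A name with no dot in it, such as Makefile, comes back unchanged. That's the sort of edge case a one-liner lets you ignore.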
Working with output
In situations like these you are usually going to want to write your output to STDOUT. This will give you the most flexibility because it will let you pipe the output into other programs or redirect it to disk as necessary.
The good news is that you get to use the same print commands that you are probably very familiar with. One thing to be aware of, though, is that puts adds a newline, which may or may not be what you want.
puts "hello world" # sends "hello world\n" to STDOUT
print "hello world" # doesn't add a newline
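Here's a quick way to see the difference for yourself. This sketch writes to a StringIO instead of STDOUT so that the exact characters can be inspected afterwards. The third call shows a detail that's easy to miss: puts leaves a string alone if it already ends in a newline.

```ruby
require "stringio"

out = StringIO.new           # stands in for STDOUT so we can inspect the result
out.puts "hello world"       # appends "\n"
out.print "hello world"      # appends nothing
out.puts "already ends\n"    # no second newline is added

p out.string
```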
Putting it all together
Honeybadger is based in Washington state in the US. That means we have the privilege of paying sales tax for every paying customer that happens to also live in Washington.
I simplified this quite a bit, but we basically have a CSV file with every transaction for the year. It looks something like this:
1/1/2015,100.00,WA
1/1/2015,50.00,NY
So to get a quick sum for all transactions in Washington I can use a one-liner like this:
$ cat cc.csv | ruby -e 'puts STDIN.inject(0) { |sum, x| date, amount, state = x.split(","); state.strip == "WA" ? sum + amount.to_f : sum }'
Keep in mind that this is a horribly sloppy way to parse CSV files. I would never send this code out into the wild. But one of the joys of writing tiny little programs to solve one-off problems is that you get to ignore all of the edge cases. That's what I'm going to do here.
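If you'd rather read the one-liner spread over several lines, here's roughly the same logic as a small script. Treat it as a sketch: washington_total is a hypothetical name, I've used sum where the one-liner uses inject, the third sample row is made up, and a StringIO stands in for the piped file. It's still the sloppy split-on-commas approach, not real CSV parsing.

```ruby
require "stringio"

# Hypothetical method name. Same split-on-commas logic as the one-liner,
# so it shares the same blind spots (quoted fields, embedded commas, etc.).
def washington_total(io)
  io.each_line.sum(0.0) do |line|
    _date, amount, state = line.split(",")
    # to_s guards against short or blank lines, where state would be nil
    state.to_s.strip == "WA" ? amount.to_f : 0.0
  end
end

# The first two rows come from the post; the third is invented sample data.
sample = StringIO.new("1/1/2015,100.00,WA\n1/1/2015,50.00,NY\n1/2/2015,25.50,WA\n")
puts washington_total(sample)
```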