A Short Explanation of ARGV

I’d come across ARGV or ARGV[0] a few times while looking at code examples on StackOverflow or the Ruby mailing lists. I recognized it as a kind of placeholder variable, but did not really understand its purpose or use, and I winced every time I saw it, worrying that I was missing something important. So now that I’m feeling more comfortable with the basics of Ruby, I decided to learn all about ARGV.

What is ARGV?

As a concept, ARGV is a convention in programming that goes back (at least) to the C language. It refers to the “argument vector,” which is basically a variable that contains the arguments passed to a program through the command line. It typically manifests as an array and can be manipulated in that way, meaning you can reference specific arguments by index or you can iterate through them in the standard ways. In languages other than Ruby, you will often run into ARGV’s companion, ARGC, which refers to the “argument count” and is a useful shortcut in languages that rely on counted for-loops to iterate. The argument vector is a crucial component of many command line utilities (you probably use one every day) and can simplify a utility’s user interface and make it much faster to use. (We’ll talk more about that towards the end of the post.)

ARGV in Ruby

In Ruby, ARGV is a constant defined in the Object class. It is an instance of the Array class and has access to all the array methods. Since it’s an array, even though it’s a constant, its elements can be modified and cleared with no trouble. When a Ruby program is invoked, Ruby captures all the command line arguments passed to it (already split on whitespace by the shell) and stores them as strings in the ARGV array.

Ruby’s ARGV is different from many other implementations in that it does not store the program name — ARGV[0] references the first argument passed to the program, and not the program’s name itself.
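
To see the difference for yourself, here is a minimal sketch (the file name is just a placeholder). Ruby keeps the script name in the global $PROGRAM_NAME, also available as $0, instead of in ARGV:

```ruby
# whoami.rb (hypothetical file name)
# Ruby stores the script name separately from the arguments.
puts "program name:   #{$PROGRAM_NAME}"    # same value as $0
puts "first argument: #{ARGV[0].inspect}"  # nil when no arguments were given
puts "argument class: #{ARGV.class}"       # ARGV is an ordinary Array
```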

Here’s an illustration.

$ ./awesome_ruby_cli_utility.rb first_arg second_arg third_arg

Here, ARGV is equal to ["first_arg", "second_arg", "third_arg"]. If we want to parse the arguments in our program, we can use ARGV[0], ARGV[1], and ARGV[2], or ARGV.first or even an iterator like ARGV.each. As I alluded to before, there is no ARGC in Ruby, but Rubyists have an easy substitute with Ruby’s many iterators as well as ARGV.length (or #size, or #count with no block; ♥ Ruby).
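
As a quick sketch of those access patterns, the snippet below uses a plain array named args as a stand-in for ARGV (an assumption, so that it runs without any real command line input):

```ruby
# A stand-in for ARGV so the sketch runs without real command-line input.
args = ["first_arg", "second_arg", "third_arg"]

args[0]      # => "first_arg"
args.first   # => "first_arg"
args.length  # => 3, the closest thing Ruby has to ARGC

# Iterate without managing a counter by hand.
args.each_with_index do |arg, index|
  puts "ARGV[#{index}] = #{arg}"
end

# An index past the end returns nil rather than raising an error.
args[3]      # => nil
```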

ARGV Usage

This is tremendously useful. Let’s take a look at a real world example, from the Rails source code:

ARGV << '--help' if ARGV.empty?

aliases = {
  "g"  => "generate",
  "d"  => "destroy",
  "c"  => "console",
  "s"  => "server",
  "db" => "dbconsole",
  "r"  => "runner"
}

command = ARGV.shift
command = aliases[command] || command

require 'rails/commands/commands_tasks'

Rails::CommandsTasks.new(ARGV).run_command!(command)

https://github.com/rails/rails/blob/master/railties/lib/rails/commands.rb

This example employs a couple of common tactics. First, if no argument is passed (meaning ARGV will be empty), then it pushes a --help argument which details how to use the rails binary, instead of throwing an error or booting into something. It’s like, Hey, if you don’t know how to use me, here are some things you can do. Second, the command is #shift-ed off the front of ARGV, and whatever arguments remain are handed to a new CommandsTasks object for processing. This way you can sequentially process arguments and even dispose of bad arguments with no issue if you want to set it up that way. (They’ve also included an awesome alias hash and a short-circuit || operator. Some beautiful code!)
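
The same two tactics can be tried out in isolation. This is only a sketch modeled on the Rails snippet, not Rails code: resolve_command and its two-entry alias table are made up for illustration, and it works on a copy of the array so nothing is mutated.

```ruby
# Hypothetical helper: returns [command, remaining_args].
# The alias table is a shortened, made-up version of the one above.
def resolve_command(argv, aliases = { "g" => "generate", "s" => "server" })
  args = argv.dup                     # leave the caller's array untouched
  args << "--help" if args.empty?     # tactic 1: default to help
  command = args.shift                # tactic 2: take the command off the front
  [aliases.fetch(command, command), args]
end

p resolve_command(%w[g model User])  # => ["generate", ["model", "User"]]
p resolve_command([])                # => ["--help", []]
```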

Working ARGV Into a Gem

Let’s see if we can employ this in some of my code. This week I wrote a (shoddy) gem that scrapes SpeakerDeck.com based on a given query and produces output sorted by presentation views, suitable for storage in a database or website. It optionally generates barebones HTML. As it’s written right now, it needs to be require-d in a Ruby script, like this:

# runner.rb
require 'spdeck-scrape'

jsc_pres = SpeakerdeckScraper.new("javascript", 10, "-v")
jsc_pres.scrape_all
jsc_pres.html_gen

Here I’ve initiated a scrape of presentations about javascript and included a verbose display option, then pushed the results to an HTML file. Easy enough, but if we just wanted to look at the data this is kind of cumbersome. I could write a command line interface, which might be kind of nice, but the end product is going to be an HTML file, so the terminal window is ultimately going to get in the way. However, we can make a binary, include some ARGV parsing to grab the query and options, and use the command line to directly launch the process (which, I might add, is time consuming, perfect as a background task). Let’s do it.

First thing, make a binary:

$ cd spdeck-scraper
$ mkdir bin
$ touch bin/spdeck-scrape

We’ll also want to specify this as an executable in the gemspec. Now we’ve got a binary that will be installed into our path when we install the gem. There are different ways to implement it, but for now, we’ll go with the simplest option and just write some code into the binary.
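
For the gemspec part, a fragment along these lines should do it. Treat it as a sketch: the version and summary are placeholders rather than the gem's real values, and only the bindir and executables lines matter here.

```ruby
require 'rubygems'

# spdeck-scrape.gemspec (fragment; version and summary are placeholders)
spec = Gem::Specification.new do |s|
  s.name        = "spdeck-scrape"
  s.version     = "0.0.0"                   # placeholder, not the real version
  s.summary     = "Scrapes SpeakerDeck.com" # placeholder
  s.bindir      = "bin"                     # directory that holds executables
  s.executables = ["spdeck-scrape"]         # file names inside bindir to install
end
```

With the executable declared, the next step is the contents of the binary itself.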

#!/usr/bin/env ruby
require 'spdeck-scrape'

query = ARGV[0]
range = ARGV[1].to_i

user = SpeakerdeckScraper.new(query, range)
user.query_results_scrape(range)
user.scrape_all

There we go. That first line is important for binaries. It lets the shell know to run the file using Ruby (more on this line: http://www.ruby-doc.org/docs/ProgrammingRuby/html/preface.html#UC). So now when we call spdeck-scrape from the command line and pass a query and a range, we’ll be initiating a new scrape with that query and range. I had to force the range into an integer with #to_i because ARGV takes the arguments as strings, and my SpeakerdeckScraper class is looking for a Fixnum. Let’s allow the user to specify the display option also.

#!/usr/bin/env ruby

query = ARGV[0]
range = ARGV[1].to_i
display = ARGV[2]
...
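
One thing worth knowing about the #to_i conversion used above is that it is very forgiving. The sketch below shows how it treats bad or missing input, next to the stricter Kernel#Integer. The strict_range helper is hypothetical and not part of the gem.

```ruby
"10".to_i   # => 10
"abc".to_i  # => 0, with no complaint
nil.to_i    # => 0, so a missing ARGV[1] turns into 0

# Integer() is the strict alternative: it raises on input it cannot convert.
# Hypothetical helper that turns those errors into nil.
def strict_range(value)
  Integer(value)
rescue ArgumentError, TypeError
  nil
end

strict_range("10")   # => 10
strict_range("abc")  # => nil
strict_range(nil)    # => nil
```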

What happens if the user doesn’t specify any arguments? Right now it will blow up, because query and display will be set to nil and range will be 0 (nil.to_i returns 0). Let’s add some instructions.

#!/usr/bin/env ruby
require 'spdeck-scrape'

if ARGV.empty?
    puts "\n\n------- spdeck-scrape: ERROR! --------"
    puts "      Usage:"
    puts "      Please specify a query, range, and display option (if desired):\n"
    puts "          spdeck-scrape my_query an_integer [options]"
    puts "      Options:"
    puts "          -v or -c       # verbose or concise display while running"
    puts "          -html          # include this flag after a display flag to print data to an HTML file\n"
    puts "      Example:"
    puts "          spdeck-scrape ruby 15 -v -html\n\n"
...

It’s not terribly robust, but it’s looking pretty good. I’d like to add one last thing for now: default options. I want a user to be able to just specify a query and nothing else. By default we’ll grab 5 pages of results and have concise display.

#!/usr/bin/env ruby
...
else
    query = ARGV[0]
    range = ARGV[1].nil? ? 5 : ARGV[1].to_i
    display = ARGV[2] || '-c'
...

And now we’ve got some default values. It will mostly work as intended. We can build the gem, install it, and run it from the command line.

$ spdeck-scrape ruby 5 -v -html

[Screenshot: HTML output]
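
About that "mostly": the sketch below wraps the positional parsing in a method (parse_args is a hypothetical name, not something in the gem) so the defaults and one rough edge are easy to see. Because the arguments are read strictly by position, a flag given right after the query ends up in the range slot.

```ruby
# Hypothetical helper wrapping the positional parsing from the binary.
def parse_args(argv)
  {
    query:   argv[0],
    range:   argv[1].nil? ? 5 : argv[1].to_i,
    display: argv[2] || "-c"
  }
end

p parse_args(%w[ruby])        # defaults apply: range 5, display "-c"
p parse_args(%w[ruby 15 -v])  # everything supplied
p parse_args(%w[ruby -v])     # "-v" lands in the range slot, and "-v".to_i is 0
```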

Next Steps

There’s plenty more we could do with this gem, including refactoring our parsing to support multi-word queries by calling #join on our ARGV, adding aliases like in the Rails code, or taking advantage of the tools available in Ruby’s ARGF object or the OptionParser class, which could streamline and automate a lot of the parsing that I did manually in my gem. I’ve covered only the bare minimum about ARGV here.
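
As a rough idea of where that could go, here is a sketch using OptionParser from the standard library. It is not what the gem does today: parse_cli and the option names are assumptions, the range becomes a -r flag instead of a positional argument, and the HTML flag is spelled --html because OptionParser would read -html as a bundle of short options.

```ruby
require 'optparse'

# Hypothetical replacement for the manual ARGV parsing.
def parse_cli(argv)
  options = { display: :concise, html: false, range: 5 }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: spdeck-scrape QUERY [options]"
    opts.on("-v", "--verbose", "Verbose display") { options[:display] = :verbose }
    opts.on("-c", "--concise", "Concise display") { options[:display] = :concise }
    opts.on("-r", "--range N", Integer, "Pages of results") { |n| options[:range] = n }
    opts.on("--html", "Print data to an HTML file") { options[:html] = true }
  end

  # parse (without the bang) works on a copy and returns the non-option words.
  words = parser.parse(argv)
  options[:query] = words.join(" ")  # multi-word queries via #join
  options
end

p parse_cli(%w[ruby on rails -v --html -r 3])
```

Calling parser.parse!(ARGV) instead would strip the recognized options out of ARGV itself, much like the #shift approach in the Rails code.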

Some additional references and resources:

You can install my gem with RubyGems:

$ gem install spdeck-scrape

Or you can browse the source code on GitHub: https://github.com/jnoconor/spdeck-scrape/.