A Short Explanation of ARGV
I’d come across ARGV a few times while looking at code examples on Stack Overflow or the Ruby mailing lists. I recognized it as a kind of placeholder variable, but I didn’t really understand its purpose or use, and I winced every time I saw it, worried that I was missing something important. Now that I’m feeling more comfortable with the basics of Ruby, I decided to learn all about ARGV.
What is ARGV?
As a concept, ARGV is a convention in programming that goes back (at least) to the C language. It refers to the “argument vector,” a variable that contains the arguments passed to a program through the command line. It typically manifests as an array and can be manipulated that way: you can reference specific arguments by index or iterate through them in the standard ways. In languages other than Ruby, you will often run into ARGV’s companion, ARGC, the “argument count,” which is a useful shortcut in languages that rely on for-loops to iterate. The argument vector is a crucial component of many command line utilities (you probably rely on it every day), and it can simplify a utility’s user interface and make it much faster to use. (We’ll talk more about that towards the end of the post.)
ARGV in Ruby
In Ruby, ARGV is a constant defined in the Object class. It is an instance of the Array class and has access to all the array methods. Since it’s an array, even though it’s a constant, its elements can be modified and cleared with no trouble. By default, Ruby captures all the command line arguments passed to a Ruby program (split by spaces) when the command-line binary is invoked and stores them as strings in the ARGV array.
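Here is a small sketch of those properties. The `args` array is a hypothetical stand-in for ARGV so the snippet behaves the same no matter how it is invoked; in a real script you would work with ARGV directly.

```ruby
# ARGV is an ordinary, unfrozen Array held in a top-level constant.
puts ARGV.class                      # Array
puts Object.const_defined?(:ARGV)    # true
puts ARGV.frozen?                    # false

# Hypothetical stand-in for what ARGV would hold after running:
#   $ ruby script.rb 42 hello
args = ["42", "hello"]

# Every element arrives as a String, even ones that look numeric.
all_strings = args.all? { |arg| arg.is_a?(String) }

# Being a constant only guards the name against reassignment; the
# array it points to can still be changed in place.
args << "extra"
after_push = args.dup
args.clear
```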
Ruby’s ARGV is different from many other implementations in that it does not store the program name: ARGV[0] references the first argument passed to the program, not the program’s name itself.
Here’s an illustration.
$ ./awesome_ruby_cli_utility.rb first_arg second_arg third_arg
Here, ARGV is equal to ["first_arg", "second_arg", "third_arg"]. If we want to parse the arguments in our program, we can use ARGV.first or even an iterator like ARGV.each. As I alluded to before, there is no ARGC in Ruby, but Rubyists have an easy substitute in Ruby’s many iterators as well as #length (the same as #count with no block; ♥ Ruby).
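To make that concrete, here is a minimal sketch using a stand-in array (`args` is hypothetical and simply mirrors what ARGV would contain for the command above):

```ruby
# Stand-in for ARGV after running:
#   $ ./awesome_ruby_cli_utility.rb first_arg second_arg third_arg
args = %w[first_arg second_arg third_arg]

first = args.first     # same as args[0]; the script name is not included
argc  = args.length    # Ruby's answer to ARGC

# The script name lives in a separate global rather than in ARGV.
script_name = $PROGRAM_NAME   # also available as $0

# Iterate without a counter-driven for-loop.
lines = args.each_with_index.map { |arg, i| "#{i}: #{arg}" }
puts lines
```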
This is tremendously useful. Let’s take a look at a real world example, from the Rails source code:
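What follows is a minimal sketch in the spirit of that code, not a verbatim copy of the Rails source. The alias table is abbreviated for illustration, and `extract_command` is a hypothetical wrapper added only so the snippet can run on its own against any argument array.

```ruby
# Illustrative alias table; the real one in Rails may differ.
ALIASES = {
  "g" => "generate",
  "c" => "console",
  "s" => "server"
}.freeze

# Takes an argument array (ARGV in a real binary) and returns the
# command name plus whatever arguments are left over for that command.
def extract_command(argv)
  argv << "--help" if argv.empty?         # nothing passed: fall back to help
  command = argv.shift                    # pull the first argument off the front
  command = ALIASES[command] || command   # expand an alias, else keep it as typed
  # Rails would now hand the command to a CommandsTasks object
  # (not reproduced here).
  [command, argv]
end

p extract_command(%w[s -p 3000])   # ["server", ["-p", "3000"]]
p extract_command([])              # ["--help", []]
```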
This example employs a couple of common tactics. First, if no argument is passed (meaning ARGV will be empty), it pushes a --help argument, which details how to use the rails binary, instead of throwing an error or booting into something. It’s like, Hey, if you don’t know how to use me, here are some things you can do. Second, the first argument (the command) is #shift-ed off the front of ARGV for processing by a new CommandsTasks object. This way you can process arguments sequentially and even dispose of bad arguments with no issue if you want to set it up that way. (They’ve also included an awesome alias hash and a short-circuit || operator. Some beautiful code!)
Working ARGV Into a Gem
Let’s see if we can employ this in some of my code. This week I wrote a (shoddy) gem that scrapes SpeakerDeck.com based on a given query and produces output sorted by presentation views, suitable for storage in a database or website. It optionally generates barebones HTML. As it’s written right now, it needs to be require-d in a Ruby script, like this:
First thing, make a binary:
$ cd spdeck-scraper
$ mkdir bin
$ touch bin/spdeck-scrape
We’ll also want to specify this as an executable in the gemspec. Now we’ve got a binary that will be installed into our path when we install the gem. There are different ways to implement it, but for now, we’ll go with the simplest option and just write some code into the binary.
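For the gemspec step, a sketch along these lines should do. The file name, version, summary, and author are placeholders rather than the gem's real metadata; `bindir` and `executables` are the standard RubyGems fields for this.

```ruby
# spdeck-scrape.gemspec (sketch; everything except the executable
# name is a placeholder)
require "rubygems"

spec = Gem::Specification.new do |s|
  s.name        = "spdeck-scrape"
  s.version     = "0.0.1"                       # placeholder
  s.summary     = "Scrapes SpeakerDeck for a query"   # placeholder
  s.authors     = ["Your Name"]                 # placeholder
  s.files       = ["bin/spdeck-scrape"]
  s.bindir      = "bin"                 # where executables live ("bin" is the default)
  s.executables = ["spdeck-scrape"]     # put on the PATH when the gem is installed
end
```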
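As for the binary itself, here is a rough sketch of what it might contain. `parse_args` is a hypothetical helper, and the hand-off to the scraper class is left as a comment because the gem's actual interface isn't shown here.

```ruby
#!/usr/bin/env ruby
# bin/spdeck-scrape (sketch)

# Hypothetical helper: turns raw arguments into a query and a page range.
def parse_args(argv)
  query = argv[0]
  range = argv[1].to_i   # arguments arrive as strings, so "5" must become 5
  [query, range]
end

# A real binary would call parse_args(ARGV) and pass the results on
# to the scraper class; a literal array stands in for ARGV here.
query, range = parse_args(%w[ruby 5])
puts "query=#{query.inspect} range=#{range.inspect}"   # query="ruby" range=5
```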
There we go. That first line is important for binaries. It lets the shell know to run the file using Ruby (more on this line: http://www.ruby-doc.org/docs/ProgrammingRuby/html/preface.html#UC). So now when we call
spdeck-scrape from the command line and pass a query and a range, we’ll be initiating a new scrape with that query and range. I had to force the range into an integer with #to_i because ARGV takes the arguments as strings, and my SpeakerdeckScraper class is looking for a Fixnum. Let’s allow the user to specify the display option also.
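One way to pick up a third, optional argument is multiple assignment. This is only a sketch: `args` stands in for ARGV, and treating "-v" as the verbose flag is an assumption based on the sample invocation later in the post.

```ruby
# Stand-in for ARGV after running: $ spdeck-scrape ruby 5 -v
args = %w[ruby 5 -v]

# Destructure by position; any trailing argument that was not
# supplied simply becomes nil.
query, range, display = args
range = range.to_i

verbose = (display == "-v")   # assumption: "-v" asks for verbose output
```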
What happens if the user doesn’t specify any arguments? Right now it will blow up, because query, range, and display will be set to nil. Let’s add some instructions.
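A minimal sketch of that idea follows. The usage text and the `run` method are hypothetical; `run` takes the argument array and an output stream so it can be exercised without touching the real ARGV.

```ruby
# Hypothetical usage text; the wording and argument names are illustrative.
USAGE = <<~TEXT
  Usage: spdeck-scrape QUERY [PAGES] [DISPLAY]
    QUERY    search term to scrape for
    PAGES    number of result pages to fetch
    DISPLAY  display option, e.g. -v
TEXT

# Returns an exit status: 1 when instructions were shown, 0 otherwise.
def run(argv, out = $stdout)
  if argv.empty?
    out.puts USAGE
    return 1   # non-zero status signals incorrect usage
  end

  query, range, _display = argv
  out.puts "Scraping #{range.to_i} page(s) for #{query.inspect}..."
  0
end

# A real binary would finish with: exit run(ARGV)
status = run([])
```

Returning a status instead of calling exit inside the method keeps the parsing logic easy to test.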
It’s not terribly robust, but it’s looking pretty good. I’d like to add one last thing for now: default options. I want a user to be able to just specify a query and nothing else. By default we’ll grab 5 pages of results and have concise display.
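Defaults like these can be sketched as below. The constant names, the `with_defaults` helper, and the `:concise` marker are all hypothetical; only the default of 5 pages and concise display come from the description above.

```ruby
DEFAULT_RANGE   = 5          # pages of results
DEFAULT_DISPLAY = :concise   # hypothetical marker for concise display

# Fills in whatever the user left off the end of the command line.
def with_defaults(argv)
  query, range, display = argv
  {
    query:   query,
    range:   range ? range.to_i : DEFAULT_RANGE,
    display: display || DEFAULT_DISPLAY
  }
end

settings = with_defaults(%w[ruby])
puts settings[:range]     # 5
puts settings[:display]   # concise
```

Like any purely positional scheme, this one would misread `spdeck-scrape ruby -v`: the -v lands in the range slot, and "-v".to_i is 0.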
And now we’ve got some default values. It will mostly work as intended. We can build the gem, install it, and run it from the command line.
$ spdeck-scrape ruby 5 -v -html
There’s plenty more we could do with this gem, including refactoring our parsing to support multi-word queries by calling #join on our ARGV, adding aliases like in the Rails code, or taking advantage of the tools available in Ruby’s ARGF object or the OptionParser class, which could streamline and automate a lot of the parsing that I did manually in my gem. I’ve covered only the bare minimum about ARGV here.
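As a rough idea of where that refactoring could go, here is a sketch using OptionParser from Ruby's standard library (required as "optparse"). The flag names, the defaults, and the `parse` helper are assumptions made for illustration, not the gem's actual interface.

```ruby
require "optparse"

# Hypothetical parser: options first, then every remaining word
# becomes part of the query.
def parse(argv)
  options = { range: 5, verbose: false, html: false }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: spdeck-scrape [options] QUERY..."
    opts.on("-r", "--range PAGES", Integer, "Pages of results") { |n| options[:range] = n }
    opts.on("-v", "--verbose", "Verbose display") { options[:verbose] = true }
    opts.on("--html", "Generate barebones HTML") { options[:html] = true }
  end

  # #parse returns the non-option arguments and leaves argv untouched;
  # #parse! would strip the options out of argv in place.
  words = parser.parse(argv)
  options[:query] = words.join(" ")   # multi-word queries via #join
  options
end

result = parse(%w[-r 3 --html ruby on rails])
puts result[:query]   # ruby on rails
puts result[:range]   # 3
```

OptionParser also generates --help output from the banner and descriptions, which would replace hand-written usage text.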
Some additional references and resources:
You can install my gem with RubyGems:
$ gem install spdeck-scrape
Or you can browse the source code on GitHub: https://github.com/jnoconor/spdeck-scrape/.