by Zed Shaw

You Used Ruby to Write WHAT?!

Mar 01, 2008, 18 mins
Developer, Java, Open Source

Deciding when to use any language--including Ruby--depends on the appropriateness to task and the amount of yak shaving necessary. Zed Shaw explains when Ruby's MRI or JRuby is the best language for the job, and when it really isn't.

Ask any programmer what his favorite language is good for and he’ll yell, “Everything!” At least until his next favorite language comes along, which is also good for everything. The truth is: Any language that’s Turing Complete and supports enough language features can solve any problem. The difference between languages and their usefulness is a matter of degrees of “yak shaving.”

“Yak shaving” is a programmer’s slang term for the distance between a task’s start and completion and the tangential tasks between you and the solution. If you ever wanted to mail a letter, but couldn’t find a stamp, and had to drive your car to get the stamp, but also needed to refill the tank with gas, which then let you get to the post office where you could buy a stamp to mail your letter—then you’ve done some yak shaving.


Some programming languages and some platforms (I’ll make the distinction in a bit) minimize yak shaving. To create a simple language in Ruby, I have to find multiple undocumented libraries, install byzantine dependencies, track down build errors, find conflicts in things called “hoe” (which needs rubyforge which needs rake which needs Ruby), verify which version of Ruby I’m using, or try the undocumented lex and yacc integration, find out about Coco/R or other projects, install multiple non-Ruby packages, have the right libraries (again), talk with various people in IRC who don’t want to talk to me; and then, after a few weeks, I might have just the harness done. Conversely, using ANTLR, without much fuss I can prototype an entire new language in a weekend and deploy it to just about any computer. It’s not completely effortless, but I can usually create a small parser for a protocol or mini-language in about a day or two using ANTLR and one book by Terence Parr.

It’s this distance between problem and solution that makes one language more suitable than another for a given task.

Aside: Sure, for the ultrageek in us, these yak-shaving expeditions can be fun, depending on the platform. I personally wouldn’t do anything more complex than ant build on Java, but on Factor, Lua, Python or Ruby I’d gladly spend a weekend hacking at something just to prove I could do it (or to use it for something especially nerd powerful).

However, the platform is not the language. A programming language can be divorced from the physical computer it runs on. You can run multiple languages on virtual machines, such as the Sun Java Virtual Machine (JVM), Microsoft Common Language Runtime (CLR), NekoVM and LLVM; you can even host Forth in JavaScript in a browser. The separation of the language used to describe computation from the actual computational device means that programmers can choose various languages for a task, yet IT management can keep deployment platforms standardized and cohesive.

Why is this relevant to Ruby? Recently, there’s been a huge push to take Ruby onto both the JVM and the CLR systems. Ruby’s popularity means that companies like Sun, IBM and Microsoft want to add the language (and others, such as Python) to their platforms. This keeps programmers writing code for the vendors’ platforms and keeps computers in the data center running their (expensive) software. Based on the adoption of Ruby on Rails I’ve seen for internal projects in 2008, this is a smart move.

In Ruby’s case, the ability to host on multiple platforms means that whatever C, Java or C# can do, Ruby can do as well. The same advantage applies to Groovy, Boo, Python, Scala and JavaScript. What is left is a matter of style and reducing the number of yaks in the field you have to shave to get the job done.

What Ruby Makes Easy…and Hard

The original Ruby is called Matz’s Ruby Interpreter (MRI). It is a classic interpreter, similar to Perl, and unlike Lua, Python and Java it doesn’t use any byte code. MRI is written mostly in C (too much C for some) and is the gold standard in the Ruby world. With the MRI, some tasks are incredibly easy to do, and some are insanely difficult.

Systems scripting and automation. Ruby has the same capabilities as does Perl for regular expression processing, file crunching and system automation. Ruby has the added advantage in that its syntax is more impervious to obfuscation. That’s not to say a rogue sysadmin couldn’t craft something heinous, but you have to try to do something evil in Ruby. Perl developers have to try to write good, readable code. MRI also starts up quickly and uses fewer resources than JRuby so it’s more suitable for short-lived cron jobs, long-running monitoring and other system-management tasks.
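To give a feel for the kind of file crunching described above, here is a minimal sketch of a cron-style script that tallies HTTP status codes. The log format, the sample lines and the `count_statuses` helper are all assumptions made up for illustration; a real job would read lines with something like `File.foreach`.

```ruby
# Count HTTP status codes in access-log style lines.
# The log layout matched here is an assumed, simplified format.
def count_statuses(lines)
  # Captures the request method, the path and the three-digit status code.
  pattern = /"(\w+) (\S+) HTTP\/[\d.]+" (\d{3})/
  counts = Hash.new(0)
  lines.each do |line|
    match = pattern.match(line)
    counts[match[3]] += 1 if match # lines that don't match are skipped
  end
  counts
end

# Hypothetical sample data standing in for a real log file.
sample_lines = [
  '127.0.0.1 - - [01/Mar/2008:10:00:00] "GET /index.html HTTP/1.1" 200 512',
  '127.0.0.1 - - [01/Mar/2008:10:00:01] "GET /missing HTTP/1.1" 404 0',
  '10.0.0.5 - - [01/Mar/2008:10:00:02] "POST /login HTTP/1.1" 200 128',
  'this line is noise and is ignored'
]

count_statuses(sample_lines).sort.each do |status, count|
  puts "#{status}: #{count}"
end
```

The whole job is a regular expression, a hash with a default value and a loop, which is roughly the Perl feature set with less punctuation.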

Web programming, sometimes. Ruby on Rails (RoR) is the latest silver bullet for Web programming. Hopefully, you know about RoR and are evaluating it for a future project. RoR is potentially a huge cost-saving framework for developing Web applications. Complex applications are measured in thousands of lines of code rather than 10- or 100-thousands of lines. Advanced JavaScript libraries and hundreds of active developers building cohesive components also make it much easier to build low- to medium-complexity applications.

The caveat on Ruby for Web programming is that Rails is better suited for building Web applications than for other kinds of Web software. I’ve seen many projects attempt to create a WebDAV server or a content management system (CMS) with Rails and fail miserably. While you can do a CMS in Rails, there are much more efficient technologies for the task, such as Drupal and Django. (In fact, I’d say if you’re looking at a Java Portal development effort, you should evaluate Drupal and Django for the task instead.)

Simplified APIs for nonexperts. Ruby’s main strength is metaprogramming. You can make little languages (domain-specific languages, or DSLs) that don’t look like Ruby. In the hands of an expert, this can turn future problems into simple implementations. (However, in the hands of an idiot, this can kill a project dead, so read my caveat against this later.) Ruby on Rails is in fact built like this, as more a DSL for doing Web applications.
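As a rough illustration of the metaprogramming described above, the following sketch builds a tiny report-definition DSL with `instance_eval`. `ReportBuilder`, `report`, `title` and `column` are hypothetical names invented for this example; they don’t come from Rails or any other library.

```ruby
# Hypothetical mini-DSL for describing a report layout.
class ReportBuilder
  attr_reader :title_text, :columns

  def initialize
    @title_text = nil
    @columns = []
  end

  # DSL word: sets the report title.
  def title(text)
    @title_text = text
  end

  # DSL word: adds a column; width falls back to 10 when not given.
  def column(name, options = {})
    @columns << { name: name, width: options.fetch(:width, 10) }
  end
end

# Runs the block with a ReportBuilder as self, so bare words like
# `title` and `column` resolve to the builder's methods.
def report(&block)
  builder = ReportBuilder.new
  builder.instance_eval(&block)
  builder
end

quarterly = report do
  title "Quarterly Sales"
  column :region, width: 20
  column :revenue
end

puts quarterly.title_text
quarterly.columns.each { |c| puts "#{c[:name]} (width #{c[:width]})" }
```

The block reads like a small configuration language, yet every word in it is an ordinary Ruby method call. That is the strength and, as discussed later, also the trap.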

Gluing C APIs together. MRI has a very good C binding API, and newer Ruby VMs like Rubinius will make this even easier. I’ve developed full wrappers to many complex C libraries in a few days. These C bindings have an object-oriented design and handle garbage collection with little fuss. It’s trivial to take parts of a Ruby application that don’t perform well and move them to a small C library, and using a system like Ruby2C or Ruby Inline, you can do this almost on the fly.

Prototyping network protocols. Ruby is a “good enough” language for writing simple servers and clients because of its simple threads, multiple I/O event libraries and protocol libraries for network protocols, including HTTP, SSH, SMTP and many others. It’s also relatively easy—thanks to Ruby’s very nice string handling—to create complete protocols in a short amount of time. When combined with advanced parsers like Ragel, you can make the protocols very robust for the majority of use cases, and then switch to C or another language for the heavy lifting.
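Here is a minimal sketch of what such a prototype might look like, assuming an invented line-based protocol with `SET`, `GET` and `QUIT` commands. `TinyStore` and its responses are made up for illustration, and the socket handling is deliberately left out so the protocol logic can be tested on plain strings.

```ruby
# Hypothetical line-based protocol:
#   SET <key> <value>  -> OK
#   GET <key>          -> VALUE <value> | NOT_FOUND
#   QUIT               -> BYE
class TinyStore
  def initialize
    @data = {}
  end

  # Handles one request line and returns the response line.
  def handle(line)
    # Limit of 3 keeps any spaces inside the value intact.
    command, key, value = line.strip.split(" ", 3)
    case command && command.upcase
    when "SET"
      return "ERR usage: SET <key> <value>" if key.nil? || value.nil?
      @data[key] = value
      "OK"
    when "GET"
      return "ERR usage: GET <key>" if key.nil?
      @data.key?(key) ? "VALUE #{@data[key]}" : "NOT_FOUND"
    when "QUIT"
      "BYE"
    else
      "ERR unknown command"
    end
  end
end

store = TinyStore.new
["SET lang Ruby on the JVM", "GET lang", "GET missing", "FROB"].each do |request|
  puts "#{request} => #{store.handle(request)}"
end
```

Wiring `handle` to a socket is a short loop over `gets` and `puts` using the standard `socket` library. The hard part is making that loop scale.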

However, Ruby’s I/O, garbage collection (GC) and threads suffer from serious design flaws and performance problems.

Deploying Ruby in any IT department is also a painful experience because of its open-source nature, but see below for why that’s more of a social problem.

Web application testing. Among Ruby’s strengths are the tools that help a development team make sure the software runs as designed. Two Ruby libraries, Watir and Selenium, automate testing a site’s user interaction. Combined with a test automation system like RSpec, Watir can produce complete reports of each test case and failure output, and write the tests in a nice readable format.

Telephony applications. Jay Phillips has a framework called Adhearsion that gives you an amazingly slick syntax for the Asterisk telephony platform. Using Adhearsion, you can write very complex yet clean telephone-based applications; you also can connect them to Web applications, databases and anything else Ruby can talk to. If you’re doing any telephone work, it’s worth the few days’ time to try out Asterisk and Adhearsion.

JRuby: First, the Pluses

JRuby is another open-source version of Ruby that is sponsored by Sun Microsystems and runs on the JVM. If you are an all-Java shop but want to try out the new hotness in languages, then JRuby is perfect for you. It allows programmers to write code in Ruby, to dip seamlessly into Java where needed, to write the hard parts in Java for speed and then deploy to the same infrastructure as a Struts or JSP deployment would. It even supports Swing. Several libraries make Swing even easier (one written by me). In its current state, I’d use JRuby for the following:

Breathing new life into tired old Java APIs. JRuby’s tight integration with Java libraries is so seamless that you can code against a Java API in a way that looks and feels like Ruby. It’s amazing how many times the same exact code—but done in Ruby with “less syntax”—seems easier to write. Syntactically, a Java implementation would be almost identical, but the Ruby code is looser and more fun. The Java purists will toss this as a “loser’s argument,” but sometimes letting tired, beaten Java programmers use a simple fun language to write their code can give the project a morale boost.

Gluing together Java libraries. Java programmers love to combine various elements of a “stack” to create the perfect framework. This usually results in a “Frankenstack” consisting of mostly off-the-shelf components, open-source libraries, lots of XML and some strange special sauce the star developer created (just before she left).

Ruby can work as a better glue between all the stack components since its more friendly syntax and easier unit testing frameworks promote readability and ease of use. However, even the worst Architecture Astronaut can turn Ruby into something horrible, as I’ve witnessed on many “professional” Ruby on Rails projects.

Rapid prototyping and experimentation. JRuby lets programmers prototype and interact with Java libraries dynamically, the same way they would in a scripting language. It dynamically converts types and gives them the feel of “just talking to the computer.” To find out if a solution is possible (something we call a “spike”), JRuby can turn the test into a single-day job rather than a two-week trip to Frankenstack City. It also means you can try out new APIs quickly to evaluate whether they’re worth using in a project; that can help when you make purchasing decisions.

Enterprise application integration. EAI is that constant dirty job somebody has to do, and it’s never been pretty, ever. When you do most EAI tasks in a strict language like Java, you end up with monumental crystal palaces nobody can maintain. If you do them in a dirty, loose language like Perl, you end up with giant balls of duct-taped twine nobody can maintain. With JRuby, the parts that are best done with strict processing can be done in Java (or even just Java style), and the parts that are dirty and require regex wizardry can be done in Ruby. You get the best of both platforms’ network libraries, string processing, file handling and design philosophies when you try to make multiple different systems talk.

Web programming but with the Java platform. You can host Ruby on Rails applications in standard Java application servers like Tomcat, Jetty, Glassfish or Resin. Rails applications simply need to be packaged into Web application archives (WARs) and hooked in, using a simple set of libraries from the Goldspike or Warbler projects. Since Ruby has no legacy notion of complex (and useless) Java technologies like JNDI, JMX, JTA, servlets, application servers or session migration, you don’t need to purchase insanely expensive products that don’t meet your needs. Simply start with Tomcat. And, when you absolutely need it, buy (or use open-source versions of) the components you need.

Swing or SWT GUI development. Traditionally, developing a Swing or SWT application was a painful experience, involving gigantic books detailing every arbitrary bit of trivia about library specifics that were mostly legacy cruft. With JRuby’s metaprogramming and dynamic nature, it’s possible to build complete, professional user interfaces using SWT or Swing in a very short time. The results are usually the same in performance, and any parts that aren’t fast enough can be redone in Java where needed. There are also several libraries, like Profligacy (mine) and Cheri, that make this easier.

But that’s just the advantages…

And on the Downside…

I don’t recommend you use either JRuby or Ruby for:

Large data crunching. It’s sad to say, but Ruby (all versions worth using in production, and even Ruby 1.9) suffers from huge problems with large-scale garbage collection, I/O processing and thread operations. Ask anyone running a Rails application that accepts large file uploads; they’ll tell you it chews their CPU just to process the MIME boundaries in the request body. Ruby has had (and still has) frequent bugs and misfeatures in its garbage collector (GC) implementation that make handling large chunks of data difficult. Yes, you can do it; and yes, being clever helps; but why bother, when you can just use a language without such problems?

Image manipulation. I don’t really think many languages are great at image manipulation, but Ruby is particularly bad at it. The primary libraries available are based on Image Magick, which is slow, bloated and takes forever to install on many systems. Ruby talks to Image Magick through RMagick, which suffers from memory leaks, spawns external processes silently for many operations and is difficult to install (unless your computer is nearly exactly the same as the author’s). Other libraries are no better. Either they don’t support many operations, or they also require a myriad of dependencies, build tools and other requirements just to do limited image manipulation.

Heavy math or computation. Ruby’s computation performance is limited compared to that of other languages. Many times a programmer has to either rewrite math-related sections in C or Java, or ask a Ruby guru to help him optimize the initial code several times. Usually these “deep voodoo” optimizations don’t survive the next minor version of the Ruby interpreter and need to be rewritten (as earlier conversions of for loops to each demonstrate). While the vast majority of applications don’t need any such computation, if you’re doing scientific, statistical or financial calculations, you’re better off with C++, C, Fortran or Java.

New language development. While Ruby can build very nice DSLs, any errors in the DSL source aren’t specified in that language, but instead, as Ruby errors. That’s confusing. If your intention is to provide a language that a financial analyst can use to do her day job, then the error "undefined local variable 'var' for main:Object" is not helpful. In this case, you’d need a real parser with actual error checking and a better syntax that’s not dependent on Ruby’s—and this is where Ruby falls flat in practice.
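To see the problem concretely, here is a small sketch of a DSL whose user makes a typo. `PricingRules`, `rate` and `load_rules` are hypothetical names for illustration only, and the exact wording of the error varies between Ruby versions, so the sketch prints it rather than assuming it.

```ruby
# Hypothetical pricing DSL an analyst might edit by hand.
class PricingRules
  attr_reader :rates

  def initialize
    @rates = {}
  end

  # DSL word: records a named rate.
  def rate(name, value)
    @rates[name] = value
  end
end

# Evaluates DSL source text with a PricingRules instance as self.
def load_rules(source)
  rules = PricingRules.new
  rules.instance_eval(source)
  rules
end

begin
  # The analyst meant to type a number but typed a misspelled word.
  load_rules("rate :gold, discuont")
rescue NameError => e
  # What surfaces is Ruby's own complaint about an undefined name,
  # phrased in terms of Ruby objects rather than pricing rules.
  puts "#{e.class}: #{e.message}"
end
```

Nothing in that message mentions rates or pricing. The DSL author can rescue and reword errors like this one, but syntax errors and subtler mistakes still leak Ruby through, which is why a real parser ends up being necessary.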

There are libraries available for Ruby that can generate lexers and parsers, and plenty of templating engines for code generation, but most of these tools are poorly documented. You can find books for many of the Java, C or C++ based tools, and a trained language implementer can crank out simple DSLs with them in a few days to a week.

E-mail processing. Just about every other language has better e-mail processing capabilities than does Ruby. The libraries available to Ruby are half-implemented, have odd bugs and incompatibilities, are slow, bloated, and generally just don’t compare to what is available for Python, Perl, C#, C++, Java or C. Because milter support for sendmail (and postfix) exists in Python, Java, Perl and C, you can leverage an existing well-written mail server and then write your application code in nearly any language…except Ruby. Yes, a few files are floating around for doing Ruby milter processing, but they’re not nearly as high quality as what’s available for other languages.

Server protocols. Ruby’s problems with I/O processing mean that writing truly scalable servers is a waste of time. Most of the people I know doing server work in Ruby eventually give up and either use a mix of Ruby and C or switch to another language entirely.

With Ruby’s garbage collection problems, you’ll have issues with long-running processes, especially if they have to process large data streams and you’re not careful about how you deal with them. With Ruby’s Threads you’ll find many problems when you try to scale your application beyond the 1,024 open files Ruby can handle. Ruby’s Thread contains several technical flaws that make it unsuitable for really huge-scale operations. Ruby can handle large network traffic, but it takes more careful planning and programming than with many other languages.

Given all that, I would prototype a new protocol with Ruby for its implementation speed; but if I had to make that prototype handle more than a few hundred users doing a few operations per second, I’d switch languages immediately. There are libraries that attempt to solve this problem in Ruby, but most of them are still constrained by Ruby’s IO event loop. They’re also rather complex, with little advantage over systems like Twisted for Python or Apache’s MINA for Java.

Enterprise deployments. The “enterprise” hates Ruby, and this is mostly a social problem. The majority of corporate systems are based on either Java or C#, with no room for a rogue language like Ruby.

In fact, take a good hard look at many of the companies running large-scale systems. You’ll see Python at Google, PHP at Yahoo, Perl at Amazon, Java at eBay—but very little Ruby. This will change as Ruby on Rails becomes popular. However, in two years of trying to deploy Ruby into any managed IT environment, my experience has been troubling. The only way to get Ruby into the enterprise is if a strong champion threatens firings unless it happens, actually fires the first suspected saboteur and builds the first version of the new application outside the company in secret. It may sound extreme, but take it from a guy who’s done six such applications at four organizations.

Ruby does have its problems, but I attribute most of the technical deficiencies to the MRI platform and not to the language itself. People generally love the Ruby language, which is why there are so many enthusiastic adopters when it’s compared with other languages. When I recently spoke at the CUSEC conference I discovered that most of the college students either were working with Ruby or planned to adopt it. Either that, or they used Java because their university required it. Very few were interested in languages like Python, Perl, Java or C, except when one was required or could get them a job.

This has probably been the most overlooked “feature” of Ruby by most people advocating it: Young kids actually want to work for you if you build software in Ruby or Python, Erlang, Haskell or Lisp. To this new crop of programmers, languages like Java and C++ were written by stodgy old blowhards who would much rather build hardware than languages. The young programmers were forced to grunt through poorly taught classes in Java or C++ that had more to do with learning each language’s esoteric syntax than with the actual material in question. Ruby and friends are the languages they learned for fun: to play around with interesting ideas and to use when they hang out with friends online.

When considering in what language to write your next project, remember that your primary goal is not arbitrary technical concerns such as performance or scalability, but rather whether you can maintain a solid stable of smart developers long enough to make the project successful. Ruby’s popularity and lovable ways can attract smart, eager people to your firm, even if the project is something boring on which they normally wouldn’t want to work.

Zed A. Shaw is currently a vice president at an investment bank leading a gang of smarties building a cutting-edge document management system, using Ruby on Rails. More important, he writes tons of free software that many people doing Ruby on Rails use all day long. After almost two decades of programming in many languages, he finds writing and music to be more fun. He is now an essayist and antipundit with a very vulgar blog that you should avoid.