GFX::Monk Home

Posts tagged: "ruby"

Running a child process in Ruby (properly)

(cross-posted on the Zendesk Engineering blog)

We use Ruby a lot at Zendesk, and mostly it works pretty well. But one thing that sucks is when it makes the wrong solution easy, and the right solution not just hard, but hard to even find.

Spawning a process is one such scenario. Want to spawn a child process to run some system command? Easy! Just pick the method that’s right for you:

  • `backticks`
  • %x[different backticks]
  • Kernel.system()
  • Kernel.spawn()
  • IO.popen()
  • Open3.capture2, Open3.capture2e, Open3.capture3, Open3.popen2, Open3.popen2e, Open3.popen3

… and that’s ignoring the more involved options, like pairing a Kernel#fork with a Kernel#exec, as well as the many different Open3.pipeline_* functions.

What are we doing here?

Often enough, you want to run a system command (i.e. something you might normally run from a terminal) from your Ruby code. You might be running a command just for its side effects (e.g. chmod a file), or you might want to use the output of the command in your code (e.g. tar -tf to list the contents of a tarball). Most of the above functions will work, but some of them are better than others.
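
The two use cases above can be sketched roughly as follows. This is one possible approach rather than the method the post goes on to recommend: the helper names `run!` and `capture!` are made up for illustration, and picking Kernel#system and Open3.capture2 is an assumption. Both helpers pass the command as separate arguments so that no shell gets involved, and both raise on a non-zero exit status instead of leaving that check to the caller.

```ruby
require "open3"

# Hypothetical helper: run a command purely for its side effects.
# Passing the program as a [path, argv0] pair means no shell is used,
# even when there are no further arguments.
def run!(cmd, *args)
  ok = system([cmd, cmd], *args)
  raise "#{cmd} failed (#{$?.inspect})" unless ok
  true
end

# Hypothetical helper: run a command and return its standard output.
def capture!(cmd, *args)
  out, status = Open3.capture2([cmd, cmd], *args)
  raise "#{cmd} failed (#{status.inspect})" unless status.success?
  out
end

# Usage, with hypothetical file names:
#   run!("chmod", "+x", "script.sh")
#   names = capture!("tar", "-tf", "archive.tar").lines.map(&:chomp)
```

Because the arguments never pass through a shell, something like `capture!("echo", "$HOME")` returns the literal text `$HOME` rather than an expanded path.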

Ruby's split() function makes me feel special (in a bad way)

Quick hand count: who knows what String.split() does?

Most developers probably do. Python? Easy. JavaScript? Probably. But if you’re a ruby developer, chances are close to nil. I’m not trying to imply anything about the intelligence or skill of ruby developers; it’s just that the odds are stacked against you.


So, what does String.split() do?

In the simple case, it takes a separator string. It returns an array of substrings, split on the given string. Like so:

py> "one|two|three".split("|")
["one", "two", "three"]

Simple enough. As an extension, some languages allow you to pass in a num_splits option. In python, it splits only this many times, like so:

py> "one|two|three".split("|", 1)
["one", "two|three"]

Ruby is similar, although you have to add one to the second argument (it specifies the number of returned components, rather than the number of splits performed).
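
For comparison with the python example, here is a quick sketch of the ruby equivalent. A limit of 2 asks for at most two components, which works out to a single split:

```ruby
parts = "one|two|three".split("|", 2)
p parts  # prints ["one", "two|three"]
```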

Javascript is a bit odd, in that it will ignore the rest of the string if you limit it:

js> "one|two|three".split("|", 2)
["one", "two"]

I don’t like the javascript way, but these are all valid interpretations of split. So far. And that’s pretty much all you have to know for python and javascript. But ruby? Pull up a seat.

Ruby's unicode treatment

I recently came across this enlightening post on the changes to strings and encodings in ruby 1.9. As a python lover who has only used ruby 1.8 so far, it’s interesting to see the different approaches to very similar problems in python 3 and ruby 1.9.

I may be biased, but ruby’s implementation sounds like it will lead to a lot of pain and bugs, while python’s implementation will lead to a little more pain as you are forced to learn about encodings, and a lot fewer bugs (as you are forced to learn about encodings). Let me explain why:

Four Common (and Broken) Ruby Operations

All of these lines, in ruby, should fail. All of them instead return nil:

@nonexistent_var
{}[:nonexistent_key]
[].first
{}.shift

I encountered all of these in the course of yesterday’s programming. None of them in a good way. And the last two were in published libraries, not even code under development.

All of these, of course, raise errors in python. I refer you to lines 10 and 11 of the zen of python:

Errors should never pass silently.
Unless explicitly silenced.

(an Option or Maybe type would be acceptable also, but that’s pretty uncommon to find in a dynamic language)
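
To make the contrast concrete, here is a small ruby sketch. The first part checks that the four expressions really do evaluate to nil. The second shows that, for hashes and arrays at least, the standard library does have raising variants (Hash#fetch and Array#fetch), but you have to opt in to them. The `raises?` helper is made up for this illustration.

```ruby
# The four expressions from the list above, collected into an array.
silent = [
  @nonexistent_var,
  {}[:nonexistent_key],
  [].first,
  {}.shift,
]
p silent.all?(&:nil?)

# Hypothetical helper: true if the block raises the given error class.
def raises?(error_class)
  yield
  false
rescue error_class
  true
end

empty_hash  = {}
empty_array = []

# The opt-in, strict spellings raise instead of returning nil.
hash_raises  = raises?(KeyError)   { empty_hash.fetch(:nonexistent_key) }
array_raises = raises?(IndexError) { empty_array.fetch(0) }
```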

Also inviting my fury: every single language, tool or function, ever, that makes you check the return code of a system (shell) command to see whether it was nonzero.

How I Replaced Cucumber With 65 Lines of Python

Update:

I’ve since cleaned up the code here and published it as a tiny library: pea on github

Aside: why cucumber doesn’t work as well as everyone thinks it should

I’ve used cucumber at work for a reasonably large project, and I wasn’t impressed. Having one canonical language for stories sounds great, until you have enough arguments about how things should be phrased that you eventually come to the realisation that BAs don’t want to write their specifications as tests, and you don’t write your tests as specifications.

This is a test style assertion step:

Then the total of the items should be 42

..and this is the same step in a requirements-style of language:

Then the total of the items should equal the sum of the number of items in each category

To a BA, the first example is a lie. The sum shouldn’t be 42, it should be the correct number! And to an automated program, the second statement is nigh on useless. Saying what something is supposed to be made from is just doing the same calculation twice - there’s nothing stopping you from doing it wrong both times! If you want to check that it’s getting the right answer, you need to tell it what the right answer is, not just tell it (again) how to make it.

So I’m not a huge fan of having cucumber scenarios be the single source of truth for requirements. If the programmers have their way it’s just a series of examples (also known as “tests”), and if the BAs have their way it’s just a series of feeble assertions that don’t necessarily check what they say they’re checking.

But it’s not all bad…

But on programmer-oriented projects, I can see them working quite well. For example, I’ve recently upgraded a large suite of specs to rspec 2, and made heavy use of the browsable cucumber scenarios on relishapp.com as actual, useful documentation.

So I decided to try cucumber on one of my own projects. Since I am obviously a python fiend outside of work, I wasn’t going to use cucumber. So out came the (very young) python port of cucumber, called lettuce (where did this salad theme even come from? o_O). I gave it a go, and of course it’s naturally a bit more awkward than ruby because python doesn’t have blocks. It’s also more than a little buggy, and lacking some useful features that cucumber has (which is to be expected of such a young project).

I started hacking on it to add or improve features, and then got sick of it. It really does seem a little ridiculous. We’re actually inventing a (trivial) language, and parsing it, and using little regex parsers in each of our steps, and mapping each of those regexes to little chunks of code. And all this makes it hard to find usages, hard to track duplication and dead code, and generally just awful to navigate and manage.

The punchline

So, you know what? I just transformed all my steps into valid python code instead. Each regex replaced with a function name, and each matching group an argument (python’s keyword-arguments help here). 65 lines of code later, I have a very similar result using plain-old python.

Here is a comparison. The old feature:

Feature: running indicate-task
	Running a basic, blocking process that
	consumes and produces output.

Scenario: running and cancelling a program
	When I run indicate-task -- cat
	And I enter "input"
	And I press ctrl-c
	And I wait for the task to complete

	Then there should be a "cat" indicator
	And it should have a menu description of "cat: running..."
	And the output should be: input
	And the error output should be empty
	And the return code should not be 0
	And it should display the task's output to the user
	And it should notify the user of the task's completion

And the new, normal, actual-python-code-that-works-just-fine-with-ctags-and-isn’t-built-with-dirty-regexes version:

from makeshift_cucumber import *
from base_test import BaseTest

class TestRunning(BaseTest):
	"""
	Feature: Running a basic, blocking process that
	consumes and produces output.
	"""

	def test_running_and_cancelling_a_program(self):
		When.I_run_indicate_task('--', 'cat')
		And.I_enter("input")
		And.I_press_ctrl_c()
		And.I_wait_for_the_task_to_complete()
		Then.there_should_be_an_indicator_named("cat")
		And.it_should_have_a_menu_description_of("cat: running...")
		And.the_output_should_be('input')
		And.the_error_output_should_be_empty()
		And.the_return_code_should_not_be(0)
		And.it_should_display_the_tasks_output_to_the_user()
		And.it_should_notify_the_user_of_the_tasks_completion()

No, you probably wouldn’t be able to get a businessman to write a scenario. But has that ever actually worked with cucumber either? I find it doubtful. The results are just as readable, and insanely simpler in terms of the complexity of the testing infrastructure. Plus, it’s just a normal test, the functions are just normal functions, and the arguments are just normal arguments.

And if you don’t want to give that to a BA, just show them the test output instead:

[Image: terminal output]

I’ll try to clean this up into a proper library & formatter sometime, because I think the mess of code you end up with in cucumber is just too ridiculous for the benefits you get, and this sort of thing is much more developer-friendly while maintaining most of the readability benefits.

The -rubygems Flag

I was always slightly confused that despite rubygems not being part of the ruby language or interpreter, there is nonetheless a -rubygems option you can give to ruby to enable rubygems.

Today when I was delving through some stack traces, I noticed an odd looking filename at the root of it all. As I’m sure many before me have realised (a bunch of my workmates already knew about this), the -rubygems flag is not a real flag at all. It’s just a perverted case of the -r module syntax, which tells ruby to require a file by name. Because when you install rubygems, it conveniently installs a file called ubygems.rb whose contents are simply require "rubygems". Very sneaky…