GFX::Monk Home

Posts tagged: "ruby" - page 2

rvm: Manage Your Rubies With an Ill-Managed Manager

rvm is a tool for maintaining multiple versions of ruby, as well as maintaining project-specific sets of gem dependencies. When I first learnt about it this week it sounded like a very useful tool, although it’s unfortunate that gems are so awkward to manage that it should be necessary in the first place.

Yesterday my first task was to update rspec. Which in turn required an update to rubygems before it would install. But who manages rubygems? It could be rvm, or rubygems itself, or apt, or even maybe bundler.

I looked through the documentation, and the most appropriate answer seemed to be that rvm should manage rubygems. I quote from the documentation:

rvm action [interpreter] [flags] [options]

where update is an action, and one of the flags is --rubygems:

--rubygems    - with update, updates rubygems for selected ruby

So I diligently typed

rvm update --rubygems

And what did rvm do? It proceeded to attempt to update itself, instead of rubygems. If you want to upgrade rubygems, you’re supposed to type:

rvm --rubygems update

(note that this is incorrect according to the above documentation, but is how I eventually coerced it into upgrading rubygems (this bug has since been fixed))

The accidental upgrade might have been okay, if its upgrade process were anything but Completely Insane. It goes thusly:

  • download a file from an unsecured HTTP location
  • without verifying any sort of checksum, signature or even HTTP status code, pipe the output directly into a bash shell
  • this script clones a github repository, and proceeds to install the absolute latest revision, whatever that might be

Hilarity ensues. I got a bash syntax error, but evidently not early enough in the process to stop rvm from destroying itself, requiring me to delete everything related to it and install from scratch.

Security? ignored.

Sanity checking? skipped.

Dependencies? get them yourself.

Update management? The website says “make sure you run this command frequently”.

I don’t know that I want such a tool trying to manage my dependencies, thank you very much…

The most painful thing, of course, is that it’s yet another buggy, language-specific implementation of the principals that zero-install does so much better (and simpler). If you don’t have global state, suddenly it’s really not that hard to keep things from interfering with each other.

Oh, and did I mention how rvm integrates with your shell, so that when you cd into a project directory, it automatically sets up your ruby version and gems? Except that when you open a new shell in the same location, you have to cd out of your project directory and then back in or else you’ll see the system version of ruby and your gems, and things will be broken in very odd ways. Splendid.

group_sequential

So the other day I had a list of (html) elements, and I wanted to get an array representing lines of text. The only problem being that some of the elements are displayed inline - so I needed to join those together. But only when they appeared next to each other.

I would call the generic way of doing so group_sequential, where an array is chunked into sub-arrays, and sequential elements satisfying some predicate are included in the same sub-array. That way, my predicate could be :inline?, and I could join the text of each grouped element together to get the lines out.

For example, using even numbers for simplicity:

[1,2,3,4,6,8,5,4,4].group_sequential(&:even?)
=> [[1],[2],[3],[4,6,8],[5],[4,4]]

Here’s the ruby code I came up with:

	class Array
		def group_sequential
			result = []
			group = []
			finish_group = lambda do
				unless group.empty?
					result << group
					group = []
				end
			end

			self.each do |elem|
				if yield elem
					group << elem
				else
					finish_group.call
					result << [elem]
				end
			end
			finish_group.call
			result
		end
	end
	

Things are slightly less noisy, but assignment is subtly awkward in python without using nonlocal scope keyword (only available in python3):

	def group_sequential(predicate, sequence):
		result = []
		group = []
		def finish_group():
			if group:
				result.append(group)
			return []

		for item in sequence:
			if predicate(item):
				group.append(item)
			else:
				group = finish_group()
				result.append([item])
		group = finish_group()
		return result
	

This feels like something that should be doable in a much more concise way than I came up with above. Any ideas? (In either ruby or python)


Update:

My friend Iain has posted a number of interesting solutions over yonder, which got me thinking differently about it (specifically, reminding me of dropwhile and takewhile). I applied python’s itertools to the problem to get this rather satisfactory result in python:

	from itertools import takewhile, tee, dropwhile
	def group_sequential(pred, sequence):
		taker, dropper = tee(iter(sequence))
		while True:
			group = list(takewhile(pred, taker))
			if group: yield group
			yield [dropwhile(pred, dropper).next()]
	

(note: this is a generator which is fine for my purposes - you can always wrap it in a call to list() to force it into an actual list).

The same approach is acceptable when done in ruby, but a bit more verbose because of the need to explicitly check for the end of the sequence, and to collect the results array:

	class Array
		def group_sequential(&pred)
			sequence = self
			results = []
			while true
				group = sequence.take_while &pred
				results << group if group.size > 0
				sequence = sequence.drop_while &pred
				return results if sequence.empty?
				results << [sequence.shift]
			end
		end
	end
	

Update (the second):

While poking around itertools, I managed to miss groupby. I assumed it did the same thing as ruby’s Enumerable#group_by, which is to say not at all what I want (though it’s surely useful at other times). So here is presumably the most concise version I’ll find, for the sake of closure:

	from itertools import groupby
	def group_sequential(pred, sequence):
		return [list(group) for key, group in groupby(sequence, pred)]
	

rspec immediate feedback formatter

I had cause to repurpose this immediate feedback formatter to work with the SpecDoc rspec formatter. So here’s a minimal version that just monkey-patches the SpecdocFormatter class to provide immediate feedback. As long as it’s included somewhere before rspec starts running, it should do its thing…

(view link)

Recursively Default Dictionaries

Today I was asked if I knew how to make a recursively default dictionary (although not in so many words). What that means is that it’s a dictionary (or hash) which is defaulted to an empty version of itself for every item access. That way, you can throw data into a multi-dimensional dictionary without regard for whether keys already exist, like so:

h["a"]["b"]["c"] = 5

Without having to first initialise h[“a”] and h[“a”][“b”].

A dictionary with a default value of an empty hash sprang to mind, but after trying it out I realised that this only works for one level. Recursion was evidently required.

So, here’s the python solution:

from collections import defaultdict
new_dict = lambda: defaultdict(new_dict)
h = defaultdict(new_dict)

And the ruby, which seems overly noisy:

new_hash = lambda { |hash, key| hash[key] = Hash.new &new_hash }
h = Hash.new(&new_hash)

minor metaprogramming

Can anyone tell me why ruby’s instance_variable_set would possibly require the name of a variable to start with an “@”, rather than simply assuming it? It’s a ruddy instance variable, after all…

I can find no decent alternative to python’s setattr in ruby, which surprises me.

Entirely too much investigation into ruby's match operator

(how could a title like that not excite you! ;P)

So yesterday I had this weird regex issue in ruby. I wanted to get a regular expression containing a given string, but didn’t want to have to manually escape all the special characters. Regexp.escape to the rescue! It escapes all regex metacharacters in any given string, and returns it as a regex. In fact, the docs assure me:

For any string, Regexp.escape(str)=~str will be true.

But not so much in practice?

>> str = "123"            
=> "123"
>> Regexp.escape(str)=~str
TypeError: type mismatch: String given


So, problem one: Regexp.escape is broken. It returns not a Regexp, but a string. Oddly enough it also seems to escape spaces and other innocuous characters, but at least you get the right result if you pump it through Regexp.compile(). However, that wasn’t my only discovery.

I mentioned this to Matt, and he couldn’t make much sense of it either. He noticed that the type error is specific to strings - if you use a number it just returns false:

>> 123 =~ 'foo'
=> false

Seems a bit odd, really. Fixnum doesn’t implement =~, nor does Integer.

So I went doc spelunking. I found implementations for =~ in the following three important classes:

Regexp: regexp =~ str: do a regex match, as you might expect

String: str =~ obj: Call obj =~ str (i.e swap the order of your operands). Not mentioned in the docs but clearly apparent in the source (and experimentation): raise a TypeError if both arguments are strings. Without this, matching one string to another would very quickly run out of stack space.

Object: obj =~ other_obj: return false

So the Regexp implementation is fine. The String implementation is a little odd. I guess it’s there to allow people to write matching statements either way, but it seems like a dangerous (and confusing) habit to condone.

But the Object implementation? Why??? What possible reason could one have for doing a match operation against two objects, neither of which implement any matching behaviour? This has the painful side effect of giving every single object I inspect a “=~” method which does nothing. No wonder Object.new() has over 120 methods on it *.

For comparison, python’s object only has 12 methods / attributes. And they’re all special names, so there’s no pollution of regular names going on.

So there you go, two spoonfuls of broken in the one discovery!

(this is ruby 1.8.6, if that matters)

* I exaggerate, over 120 methods on object are just what you get in a rails app. Vanilla ruby only has 41 by my count. but it’s still completely unnecessary, and adds noise to inflate that number