GFX::Monk Home

nix-wrangle: Manage nix sources and dependencies with ease

My last post was more than a year ago, in which I described my long journey towards better package management with nix.

Well, it turns out that journey wasn’t over. Since then, I’ve been using the tools I created and found them unfortunately lacking.

So, I’ve built just one more tool, to replace those described in the previous post.

I’m writing this assuming you’ve read the previous post, otherwise I’d be repeating myself a lot. If you haven’t, just check out nix-wrangle and don’t worry about my previous attempts ;)

Step 7: nix-wrangle for development, dependency management and releases

After a lot of mulling over the interconnected problems described in that previous post (and discussions with the creators of similar tools), I came at it from a new direction. Nix wrangle is the result of that approach. I’ve been using it fairly successfully for a while now, and I’m ready to announce it to the world.

How does it differ from my previous state of using nix-update-source and nix-pin?

  • One JSON file for all dependencies, allowing bulk operations like show and update, as well as automating dependency injection (since all dependencies are known implicitly).
  • A splice command which takes a nix derivation file, injects a synthetic src attribute baking in the current source from the nix-wrangle JSON file. This allows you to keep an idiomatic nix file for local development (using a local source), and automatically derive an expression using the latest public sources for inclusion in nixpkgs proper.
  • Project-local development overrides. The global heuristic of nix-pin caused issues and some confusion, nix-wrangle supports local sources for exactly the same purpose, but with an explicit, project level scope.

Please check it out if you use nix! And if you don’t use nix, check that out first :)

A journey towards better nix package development

Update:

I’ve continued my journey, and described my next steps (and implementation) in a new post.


This post is targeted at users of nix who write / maintain package derivations for software they contribute to. I’ve spent a lot of time doing (and thinking about) this, although it’s probably quite a niche audience ;)

tl;dr: you should check out nix-pin and nix-update-source if you want to have a happy life developing and updating nix expressions for projects you work on.

I believe nix is a technically excellent mechanism for software distribution and packaging.

But it’s also flexible enough that I want to use it for many different use cases, especially during development. Unfortunately, there are a few rough edges which make a good development setup difficult, especially when you’re trying to build expressions which serve multiple purposes. Each of these purpose has quite a few constraints:

Bash arrays and `set -u`

Often you need to progressively build up a set of commandline arguments in bash, like so:

FLAGS=""
if [ -n "$LOGFILE" ]; then
  FLAGS="$FLAGS --log $LOGFILE"
fi
someprogram $FLAGS ...

This usually works, but is a bit rubbish:

  • this will break if $LOGFILE has a space in it, because bash will split it into multiple arguments
  • adding a flag is kind of tedious with the FLAGS="$FLAGS ..." boilerplate
  • $FLAGS ends up with a leading space, which is entirely fine but still feels ugly

Arrays solve these issues nicely. They can store elements with spaces, and there’s a nice append syntax:

FLAGS=()
if [ -n "$LOGFILE" ]; then
  FLAGS+=(--log "$LOGFILE")
fi
someprogram "${FLAGS[@]}" ...

You need to remember the weird "${VAR[@]}" syntax, but you get used to that (writing "$@" to pass along “all of this scripts arguments” is actually shorthand for "${@[@]}", which may help you remember).

Problem: “there’s no such thing as an empty array”

The problem is that in bash, an empty array is considered to be unset. I can’t imagine any reason why this should be true, but that’s bash for you. My problem is that I always use set -u in scripts I write, so that a command will fail if I reference a variable which doesn’t exist (just like a real programming language). But in bash, this will fail:

$ set -u
$ FLAGS=()
$ echo "${FLAGS[@]}"
bash: FLAGS[@]: unbound variable

Ugh.

The solution is even more weird bash syntax:

$ echo ${FLAGS[@]+"${FLAGS[@]}"}

(thanks, Stack Overflow)

Which roughly translates to “if FLAGS[@] is set, then insert the value of FLAGS[@], otherwise expand to nothing”.

Note the placement of the quotes - quoting the first instance of ${FLAGS[@]} will lead to an empty string argument (instead of no argument) if $FLAGS is empty. And failing to quote the second instance of ${FLAGS[@]} will mean it breaks arguments on spaces, which was the whole reason we used an array in the first place.

One more trick in your bag of weird bash tricks

Depending on your outlook, this is either another useful trick to help you write more robust bash, or yet another example of how bash actively discourages decent programming practices, highlighting how you really really really shouldn’t use bash for anything nontrivial.

Running a child process in Ruby (properly)

(cross-posted on the Zendesk Engineering blog)

We use Ruby a lot at Zendesk, and mostly it works pretty well. But one thing that sucks is when it makes the wrong solution easy, and the right solution not just hard, but hard to even find.

Spawning a process is one such scenario. Want to spawn a child process to run some system command? Easy! Just pick the method that’s right for you:

  • `backticks`
  • %x[different backticks]
  • Kernel.system()
  • Kernel.spawn()
  • IO.popen()
  • Open3.capture2
  • Open3.capture2, Open3.capture2e, Open3.capture3, Open3.popen2, Open3.popen2e, Open3.popen3

… and that’s ignoring the more involved options, like pairing a Kernel#fork with a Kernel#exec, as well as the many different Open3.pipeline_* functions.

What are we doing here?

Often enough, you want to run a system command (i.e. something you might normally run from a terminal) from your Ruby code. You might be running a command just for its side effects (e.g. chmod a file), or you might want to use the output of the command in your code (e.g. tar -tf to list the contents of a tarball). Most of the above functions will work, but some of them are better than others.

Software Maintenance and Author Intent

or, “I’ve written a lot of software, and now I have regrets”

As time goes on, people write more software. Well, at least I do. And these days, it’s pretty easy to put up everything you’ve created on GitHub or somewhere similar.

But of course, not all software is created equal. That 100-line JS library I created in one day back in 2011 which has seen 3 commits since is probably not going to be as important to me as the primary build tool I use in my own projects, which has implementations in 2 languages, an extensive automated test suite, and which has steadily seen improvements and fixes over the past 2 years with more than 300 commits.

And people usually realise this. Based on project activity, date of recent commits, total number of commits, amount of documentation etc, you can often get a good idea of how healthy a project is. But is that enough?

I’ve had people report bugs in a project where my immediate thought has been “well, this is pretty old and I haven’t used it for years - I’m not surprised it doesn’t work”. Meanwhile I see comments about another project where someone will wonder whether it still works, since it hasn’t been updated in ages. To which my first thought is “of course it still works! It doesn’t need updating because nothing’s wrong with it”.

I’ll try and communicate this less bluntly, but clearly there’s information I (as the author) know that other’s can’t without asking me - from what others can see, the projects probably look just as healthy as each other.

Why are you publishing it if you don’t care about it?

I don’t want to maintain all the software I’ve ever written. I’ve written plenty of software for platforms or tools I no longer use. I’ve written software to scratch an itch I no longer have, or which I just can’t be bothered keeping up to date with breaking API changes.

I could just abruptly delete each project as I decide it’s not worth maintaining, but that’s both drastic and rude. Maybe it works fine, but I no longer use it. Maybe others still depend on it. Maybe someone else would like to step up and take it over, rather than see it die. Maybe it doesn’t work as-is, but people can learn from reading parts of the code that are still useful. I publish Open Source software because it might be useful to others - deleting it when I no longer have a use for it doesn’t fit with that spirit at all.

Stillmaintained

A while ago, there was this project called “stillmaintained”. It aimed to address the issue of communicating project health directly, by answering the simple question “Is this still maintained?”. Ironically (but perhaps inevitably), stillmaintained itself is no longer maintained, and even the domain registration has lapsed. But I think the problem is an important one.

My solution

I think the constraints are:

  • It must be dirt easy for the author to manage. If it takes too much effort to update a project’s status, I’ll be too lazy to do it.
  • The infrastructure itself must be super low maintenance. I don’t want to spend all my time maintaining the thing that tells you if my projects are maintainted!

So to solve the issue for my projects, I did the simplest dumbest thing:

  1. I created a few static images with Inkscape.
  2. In a folder that gets synced to this website, I made a bunch of files named <projectname>.png, each of which is a symlink to a status (e.g. ../maintained.png, ../abandoned.png, etc).
  3. I embed that <projectname>.png into the project’s README, documentation, etc.
  4. When I decide that a project’s status has changed, I modify the appropriate symlink.

Now the status for all my projects is managed in one directory, and I can generate a list of active projects with a simple python script. I don’t need to go and edit that project’s README, docs and packaging metadata - it all just points to the same place.

Here’s an example badge, for abandoned projects:

It’s not fancy. There are no RSS feeds or email notifications when the project status changes. Showing an image containing text is not very accessible, nor very flexible. But it’s the easiest way for me to tell visitors to my projects what my assessment of that project’s health is, which is something I’ve never had the ability to do very well before. And since it’s so low maintenance, I’m hopeful that I’ll actually keep these up to date in the future.

In open source software, the author is under no obligation to maintain or fix anything - it’s there, take it or leave it. That doesn’t tell the full story. I want people to use my code, so just ignoring users and possible contributors because I have no obligation to them is a great way to get a reputation as a terrible project maintainer. At the same time, there’s no way I can fully maintain all the software I’ve ever written, especially as time goes on and that set gets larger. So the best I can do is to try and honestly communicate my intent as part of each project’s public documentation.

Midori Blog: The Error Model

For the past few months, Joe Duffy has been blogging about the most interesting aspects of the design and implementation of Midori, a now-abandoned research OS from Microsoft Research, which has been incredibly interesting to follow. I particularly enjoyed the latest article about the error model, but the whole series is worth a read (and a subscribe, since there are more on the way).

(view link)