NixOS on prgmr and Failing to Learn Nix

Published: 2018-07-04
Category: Code
Tags: docs nix prgmr

This is a writeup of my notes on how to get NixOS running on a VPS at prgmr, followed by more general notes on this experiment in learning nix.

Provision

I went with the lowest tier, currently 1.25 GiB RAM, 15 GiB Disk for $5/month. I’m only running weechat for irc/twitter/fediverse/slack and some miscellaneous small things. For “pre-installed distribution” I chose “None (HVM)”.

Netboot to start install

I ssh’d into the management console with `ssh [hostname]@[hostname].console.xen.prgmr.com`{lang="bash"} and picked these menu options:

6 bootloader
4 netboot installer, pick nixos
0 twice for main menu
4 to power off
2 to start (see “Booting” below)
1 to log in as root with no password (relax, ssh is off)

Partition

Surprisingly, the included 1.25 GiB of RAM was not enough to run some nix commands. I had to back up and recreate the box with some swap space. I didn’t think too hard about it, just guessed at 2 GiB and it worked OK. (Update 2018-07-09: Vaibhav Sagar suggested this is probably a known bug.)

`gdisk /dev/xvda`{lang="bash"}

o to create gpt

n three times to create the BIOS boot, root, and swap partitions:

    Command (? for help): n
    Partition number (1-128, default 1): 1
    First sector (34-31457246, default = 2048) or {+-}size{KMGTP}:
    Last sector (2048-31457246, default = 31457246) or {+-}size{KMGTP}: +32M
    Current type is 'Linux filesystem'
    Hex code or GUID (L to show codes, Enter = 8300): EF02
    Changed type of partition to 'BIOS boot partition'

    Command (? for help): n
    Partition number (2-128, default 2):
    First sector (34-31457246, default = 67584) or {+-}size{KMGTP}:
    Last sector (67584-31457246, default = 31457246) or {+-}size{KMGTP}: -2G
    Current type is 'Linux filesystem'
    Hex code or GUID (L to show codes, Enter = 8300):
    Changed type of partition to 'Linux filesystem'

    Command (? for help): n
    Partition number (3-128, default 3):
    First sector (34-31457246, default = 27262976) or {+-}size{KMGTP}:
    Last sector (27262976-31457246, default = 31457246) or {+-}size{KMGTP}:
    Current type is 'Linux filesystem'
    Hex code or GUID (L to show codes, Enter = 8300): 8200
    Changed type of partition to 'Linux swap'

w to write and exit

`mkswap -L swap /dev/xvda3`{lang="bash"}

`swapon /dev/xvda3`{lang="bash"}

`mkfs.ext4 -L root /dev/xvda2`{lang="bash"}

`mount /dev/xvda2 /mnt`{lang="bash"}
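The interactive session above could probably also be scripted. What follows is a minimal sketch, assuming `sgdisk` (the non-interactive sibling of `gdisk`) is available in the installer and that `/dev/xvda` may be wiped. `DRY_RUN`, `run`, and `partition_disk` are names made up for this example; by default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: non-interactive version of the gdisk steps above.
# ASSUMPTIONS: sgdisk is installed and DISK may be wiped. DRY_RUN, run and
# partition_disk are hypothetical helpers, not part of any tool.
set -eu

DISK="${DISK:-/dev/xvda}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, anything else = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

partition_disk() {
  run sgdisk -o "$DISK"                      # new empty GPT (gdisk: o)
  run sgdisk -n 1:0:+32M -t 1:EF02 "$DISK"   # 32 MiB BIOS boot partition
  run sgdisk -n 2:0:-2G -t 2:8300 "$DISK"    # root, leaving 2 GiB free at the end
  run sgdisk -n 3:0:0 -t 3:8200 "$DISK"      # swap in the remaining space
  run mkswap -L swap "${DISK}3"
  run swapon "${DISK}3"
  run mkfs.ext4 -L root "${DISK}2"
  run mount "${DISK}2" /mnt
}

partition_disk
```

Running it with `DRY_RUN=0` as root would execute the commands for real, which destroys whatever is on the disk.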

Configuring nix

I generated the initial config and added a few prgmr-specific tweaks:

`nixos-generate-config --root /mnt`{lang="bash"}

`cd /mnt/etc/nixos`{lang="bash"}

`vi configuration.nix`{lang="bash"}

Here’s my tweaks:

`boot.loader.grub.device = "/dev/xvda";`{lang="nix"}

prgmr console config:

`boot.loader.grub.extraConfig = "serial --unit=0 --speed=115200 ; terminal_input serial console ; terminal_output serial console";`{lang="nix"}

`boot.kernelParams = [ "console=ttyS0" ];`{lang="nix"}

`environment.systemPackages = with pkgs; [ bitlbee tmux weechat wget vim ];`{lang="nix"}

`services.openssh.enable = true;`{lang="nix"}

`networking.firewall.allowedTCPPorts = [ 22 ];`{lang="nix"}

`sound.enable = false;`{lang="nix"}

`services.xserver.enable = false;`{lang="nix"}

`users.extraUsers.pushcx = { name = "pushcx"; isNormalUser = true; extraGroups = [ "wheel" "disk" "systemd-journal" ]; uid = 1000; openssh.authorizedKeys.keys = [ "[ssh public key here]" ]; };`{lang="nix"}

Then I ran nixos-install to install the system.

Booting

The NixOS manual says you should be able to run reboot to boot to the new system, but something in xen doesn’t reload the new boot code and I got the netboot again rather than the new system. After talking to prgmr I found it worked if I pulled up the management console and did:

6 -> 1 boot from disk
then 4 to fully poweroff
then 2 to create/start

After this I had a running system that I could ssh into as a regular user.

Prgmr donates hosting to Lobsters, but because Alan configured the hosting, this was my first time really using the system. It was painless and getting support in #prgmr on Freenode was comfortable for me as a longtime IRC user. I liked them before, and now I’m happy to recommend them for no-nonsense VPS hosting.

Nix/NixOS

I did this setup because I’ve been meaning to learn nix (the package manager) and NixOS (the Linux distribution built on nix) for a while. As I commented on Lobsters, they look like they didn’t start from manual configuration and automate that, they started from thinking hard about what system configuration is and encoded that. (The final impetus was that I ran out of stored credit at Digital Ocean, hit several billing bugs trying to pay them, and couldn’t contact support - six tries in four mediums only got roboresponses.)

The NixOS manual is solid and I had little trouble installing the OS. It did a great job of working through a practical installation while explaining the underlying concepts.

I then turned to the Nix manual to learn more about working with and creating packages and failed, even with help from the nixos IRC channel and issue tracker. I think the fundamental cause is that it wasn’t written for newbies to learn nix from; there’s a man-page like approach where it only makes sense if you already understand it.

Ultimately I was stopped because I needed to create packages for bitlbee-mastodon and weeslack. As is normal for a small distro, it hasn’t packaged these kinds of uncommon things (or complex desktop stuff like Chrome), but I got the impression the selection grows daily. I didn’t want to install them manually (which I doubt would really work on NixOS), I wanted an exercise to learn packaging so I could package my own software and run NixOS on servers (the recent issues/PRs/commits on lobsters-ansible tell the tale of my escalating frustration at its design limitations). (Update 2018-07-16: I’ve learned that Nix does have a package for Chrome, but it doesn’t appear in nix-env searches or the official package list because it’s hidden by an option that is not referenced in system config files, user config files, the NixOS Manual, the Nix Manual, the man page for nix-env, the package search site, or the documentation of any other tool it hides packages from.)

The manual’s instructions to build and package GNU’s “hello world” binary don’t actually work (gory details there). I got the strong impression that no one has ever sat down to watch a newbie work through this doc and see where they get confused; not only do fundamentals go unexplained and the samples not work, there’s no discussion of common errors. Frustratingly, it also conflates building a package with contributing to nixpkgs, the official NixOS package repository.

Either this is a fundamental confusion in nix documentation or there’s some undocumented assumption about what tools go where that I never understood. As an example, I tried to run nix-shell (which I think is the standard tool for debugging builds but it has expert-only docs) and it was described over in the Nixpkgs Manual even though it’s for all packaging issues. To use the shell I have to understand “phases”, but some of the ones listed simply don’t exist in the shell environment. I can’t guess if this is a bug, outdated docs, or incomplete docs. And that’s before I got to confusing “you just have to know it” issues like the src attribute becoming unpackPhase rather than srcPhase, or “learn from bitter experience” issues like nix-shell polluting the working directory and carrying state between build attempts. (This is where I gave up.)

I don’t know how the NixOS Manual turned out so well; the rest of the docs have this fractal issue where, at every level of detail, every part of the system is incompletely or incorrectly described somewhere other than expected. I backed up and reread the homepages and about pages to make sure I didn’t miss a tutorial or other introduction that might have helped make sense of this, but found nothing besides these manuals. If I sound bewildered and frustrated, then I’ve accurately conveyed the experience. I gave up trying to learn nix, even though it still looks like the only packaging/deployment system with the right perspective on the problems.

I’d chalk it up to nix being young, but there’s some oddities that look like legacy issues. For example, command styles vary: it’s nix-env -i to install a package, but nix-channel only has long options like --add, and nixos-rebuild switch uses the more modern “subcommand” style. With no coherent style, you have to memorize which commands use which syntax - again, one of those things newbies stumble on but experts don’t notice and may not even recognize as a problem.

Finally, there’s two closely-related issues in nix that look like misdesigns, or at least badly-missed opportunities. I don’t have a lot of confidence in these because, as recounted, I was unable to learn to use nix. Mostly these are based on my 20 years of administrating Linux systems, especially the provisioning and devops work I’ve done with Chef, Puppet, Ansible, Capistrano, and scripting that I’ve done in the last 10. Experience has led me to think that the hard parts of deployment and provisioning boil down to a running system being like a running program making heavy use of mutable global variables (eg. the filesystem): the pain comes from unmanaged changes and surprisingly complex moving parts.

The first issue is that Nix templatizes config files. There’s an example in my configuration.nix notes above: rather than editing the grub config file, the system lifts values from this config file to paste into a template of grub’s config file that must be hidden away somewhere. So now instead of just knowing grub’s config, you have to know it plus what interface the packager decided to design on top of it by reading the package source (and I had to google to find that). There’s warts like extraConfig that throw up their hands at the inevitable uncaptured complexity and offer an interface to inject arbitrary text into the config. I hope “inject” puts you in a better frame of mind than “interface”: this is stringly-typed text interpolation and a typo in the value means an error from grub rather than nix. This whole thing must be a ton of extra work for packagers, and if there’s a benefit over vi /etc/default/grub it’s not apparent (maybe in provisioning, though I never got to nixops).
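To make the “stringly-typed” complaint concrete, here is a toy shell sketch of pasting a value into a config template. It is not how NixOS actually generates grub’s config; `render_grub_cfg` and the template text are invented for illustration. The point is only that the generator has no idea what the pasted text means.

```shell
#!/bin/sh
# Toy model of text interpolation into a config file.
# ASSUMPTION: render_grub_cfg and this template are made up for the example;
# they do not reflect the real NixOS grub module.
set -eu

render_grub_cfg() {
  device="$1"
  extra_config="$2"
  cat <<EOF
# generated file, do not edit
# install device: $device
$extra_config
EOF
}

# A valid line and a typo ("termnal_output") are treated identically here.
# Nothing fails until grub itself reads the result.
render_grub_cfg /dev/xvda "terminal_output serial console"
render_grub_cfg /dev/xvda "termnal_output serial console"
```

Both calls succeed, so the only place the typo can surface is in the program that finally parses the generated file.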

This whole system is both complex and incomplete, and it would evaporate if nix configured packages by providing a default config file in a package with a command to pull it into /etc/nix or /etc/nixos for you to edit and nix to copy back into the running system when you upgrade or switch. This would lend itself very well to keeping the system config under version control, which is never suggested in the manual and doesn’t seem to be integrated at any level of the tooling - itself a puzzling omission, given the emphasis on repeatability.
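As a rough idea of what the missing version-control habit could look like when done by hand, here is a minimal sketch assuming git is installed. `snapshot_config` is a hypothetical helper, and the committer name and email are placeholders; nothing here is part of the nix tooling.

```shell
#!/bin/sh
# Sketch: commit the config directory before each rebuild.
# ASSUMPTIONS: git is available; snapshot_config and the root@localhost
# identity are placeholders invented for this example.
set -eu

# snapshot_config DIR MESSAGE: commit the current state of DIR,
# creating the repository on first use.
snapshot_config() {
  dir="$1"
  msg="$2"
  [ -d "$dir/.git" ] || git -C "$dir" init -q
  git -C "$dir" add -A
  # Commit only when something is staged, so repeated calls are harmless.
  if ! git -C "$dir" diff --cached --quiet; then
    git -C "$dir" -c user.name="root" -c user.email="root@localhost" \
      commit -q -m "$msg"
  fi
}

# Intended use on the real system, as root:
#   snapshot_config /etc/nixos "before rebuild" && nixos-rebuild switch
```

This only records what the files said at each switch. It does nothing about the templating layer described above.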

Second, to support this complexity, they developed their own programming language. (My best guess - I don’t actually know which is the chicken and which is the egg.) A nix config file isn’t data, it’s a Turing-complete language with conditionals, loops, closures, scoping, etc. Again, this must have been a ton of work to implement and a young, small-team programming language has all the obvious issues like no debugger, confusing un-googleable error messages that don’t list filenames and line numbers, etc.; and then there’s the learning costs to users. Weirdly for a system inspired by functional programming, it’s dynamically typed, so it feels very much like the featureset and limited tooling/community of JavaScript circa 1998. In contrast to JavaScript, the nix programming language is only used by one project, so it’s unlikely to see anything like the improvements in JS in the last 20 years. And while JavaScript would be an improvement over inventing a language, using Racket or Haskell to create a DSL would be a big improvement.

These are two apparent missed opportunities, not fatal flaws. Again, I wasn’t able to learn nix to the level that I understand how and why it was designed this way, so I’m not setting forth a strongly-held opinion. They’re really strange, expensive decisions that I don’t see a compelling reason for, and they look like they’d be difficult to change. Probably they have already been beaten to death on a mailing list somewhere, but I’m too frustrated by how much time I’ve wasted to go looking.

I’ve scheduled a calendar reminder for a year from now to see if the manual improves or if Luc Perkins’s book is out.

2018-08-09: I wasted another two days trying Nix from the other direction. Rather than build up from the basics I tried to start from the top down and create a “Hello World” Rails app. It’s hard to tell around the bugs and docs, but I’m pretty sure it’s not possible to run a Rails app on NixOS.

2019-12-01: New attempt. Got an incorrect error message that /sbin/bash didn’t exist. Paved and reinstalled, then tried to build a Rails demo but I got a useless error message when it failed to install 1password (!?). The two errors came while pasting directly from the install docs and I was told “lmk when you figure out which command you were just calling wrong”. The documentation’s first example is still broken and following install steps invariably leads to errors. I’m sick of being told it’s my fault nix doesn’t work and I’m giving up on it.