Hard Lessons

Having worked on email-related code before, I've been morbidly fascinated to watch one of the founders of handmade.network write an email client. Handmade Network is trying to reinvigorate programming by emphasizing small teams and performant from-scratch code. It's a great way to write small, self-contained projects (games, libraries, utilities) that can actually be finished, but it fell out of favor two decades ago for complex user-facing software.

This update included a few sentences I’ve been waiting for:

The biggest lesson is that not everyone is RFC-compliant. It was a shock seeing some companies accept ill-formed e-mail addresses, developers showing their best-but-still-inaccurate regular expressions for compliance, and security agents from company’s mail servers trumping simple IMAP requests that should have yielded a proper response, but didn’t. Look, I always knew commercial software packages don’t fully adhere to a spec—not even language compilers achieve 100% accuracy—but seeing violations led to unfortunate wrinkles and hard-coding in specific recovery points when I try to talk to some servers.

From what he lists, he's only seen the tip of the iceberg. For example, he hasn't mentioned some of the fun problems of IMAP or the woes of email encoding and attachments. Specifically, this strategy of "seeing violations led to unfortunate wrinkles and hard-coding in specific recovery points when I try to talk to some servers" is really, really not going to scale. C is a tough language for the tower of abstractions he's going to build and rebuild in the face of unexpected inputs and dusty corners of the spec.
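To give a taste of why those "best-but-still-inaccurate regular expressions" fail, here's a minimal Ruby sketch. The pattern is my own hypothetical, not his code: RFC 5322 permits quoted local parts that a naive regex will reject.

```ruby
# A naive address pattern of the sort people trade on forums; it handles
# everyday addresses but nowhere near the full RFC 5322 grammar.
NAIVE_ADDRESS = /\A[\w.+-]+@[\w-]+(\.[\w-]+)+\z/

# Accepts the common case...
p "user@example.org".match?(NAIVE_ADDRESS)               # => true
# ...but rejects a quoted local part the RFC allows.
p '"strange address"@example.org'.match?(NAIVE_ADDRESS)  # => false
```

Real validation means either a full grammar or (more practically) just trying to deliver the mail.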

And email is a particularly hard domain because it's old and *looks* simple, so there's an incredible number of errors you have to cope with from version 0.1. Users will never accept "Yeah, you just can't read email from people using Outlook, it's Microsoft's bug." And then on top of all that, many emails are shifting to HTML-only with increasing expectations of CSS support, so you're implementing or talking to a huge browser engine. Email was a big factor in ending my support for Postel's maxim.

It might be another 5 months before I reach a working prototype for that [GUI], and probably another two months of polish before I consider the possibility of releasing some build publicly.

I wish him a lot of luck and there’s a tiny, windmill-tilting bit of me that hopes he’ll succeed, but I’m watching this race for the crash.

POP3 and SMTP via SSH Tunnels

I use Fetchmail to retrieve my email. I have an account that still doesn't support SSL, but at least I also have an SSH account on the same network. Here's the fetchmailrc config to tear down any stale tunnel, then build and use a fresh SSH tunnel:

# poll the local end of the SSH tunnel instead of the server directly
poll "mail.insecure.example.org"
    via localhost port 6301
    proto pop3
    user "username@example.org"
    pass "foo"
    # kill any stale tunnel holding port 6301, then open a new one;
    # "sleep 20" keeps the tunnel alive long enough for fetchmail to connect
    preconnect "kill `lsof -t -b -i @localhost:6301` > /dev/null 2>/dev/null; ssh -q -f -C -L 6301:mail.insecure.example.org:110 username@example.org sleep 20 < /dev/null > /dev/null"

It took quite a bit of tinkering over a long time to get that working reliably, so I hope it’s of some use to someone.

Along the same lines, I’d prefer my SMTP server not leak my home computer’s IP address in emails, so I tunnel to the SMTP server’s network to send email. This script replaces sendmail -t:

#!/bin/bash
# Open a short-lived tunnel to the SMTP server, then hand the message
# (read from stdin) to msmtp; msmtp is assumed to be configured to use
# localhost port 8587 as its SMTP host.
/usr/bin/ssh -f -q -L 8587:mail.example.com:587 username@example.com 'sleep 5' && msmtp -t --read-envelope-from
exit $?

Deleting Spam From sup Maildirs

A quirk of the sup email client is that it doesn’t sync back changes like deletes to mail sources. “Deleted” messages are only flagged and hidden from the user.

After a few hundred thousand spam messages, sup slows down a bit and I have to actually delete them. I forget how every time, so this is mostly a note to myself, in the hope it also helps anyone else who needs to delete mail from maildirs managed by sup.

First, type L to search for spam. Use T to tag all messages and =d to delete the tagged messages, perhaps after using \ to filter your view to obvious spam (and !! to load everything that matches the query). Then quit sup and run:

# update the email filenames in the maildir so deleted ("trashed") messages get the "T" flag
$ sup-sync-back-maildir -nu
# get rid of them
$ mkdir /tmp/deleted-spam
$ find ~/path/to/mail -type f -regex ".*,.*T" -exec mv {} /tmp/deleted-spam \;
# sync sup so it knows they're gone
$ sup-sync --all-sources -o
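The find pattern works because maildir filenames encode message flags after a ":2," marker, with "T" meaning trashed. Here's a small Ruby sketch of the same check (my own helper for illustration, not part of sup):

```ruby
# Maildir filenames end in ":2," followed by single-letter flags in
# alphabetical order; "T" marks a trashed message.
def trashed?(filename)
  flags = filename[/:2,([A-Z]*)\z/, 1]
  !flags.nil? && flags.include?("T")
end

p trashed?("1283367910.M123.host:2,ST")  # => true (seen and trashed)
p trashed?("1283367910.M123.host:2,S")   # => false (seen only)
```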

After checking that everything’s fine, delete /tmp/deleted-spam (or ignore it; the contents of /tmp are wiped every reboot).

Extracting Immutable Objects

In the last few weeks I’ve been rehabilitating some of the first object-oriented code I wrote in the hopes of bringing my mailing list archive back online. Lately I’ve been refactoring some of the earliest, core code: the Message class. It manages the individual emails in the system and, because I didn’t understand how to extract functionality, had turned into something of a God Class.

Yesterday I tweeted about a really satisfying cleanup:

[embedded tweet]


Inbox Zero

A few minutes ago, for the first time in around a decade, I emptied my email inbox. I’ve been steadily whittling it down (or at least holding the line) for the last few months: catching up on mailing lists, responding to outstanding emails, admitting there’s some things that are so old I’m not going to respond to them, and moving work items onto a proper to-do list. So I have an impressively boring screenshot:

[screenshot: my empty inbox in sup]

Watching Merlin Mann’s Inbox Zero video and reading the preceding articles was the impetus for the final push to zero. It’s not that there’s anything in there I didn’t know. But there was the sense that it was not only possible but achievable.

So I immediately did what Mann suggested not to do and spent, oh, a full day changing email clients from mutt to sup. It sounds like an awesome amount of time-wasting, but I read the sup philosophical statement a while ago and it resonated:

The problem with traditional clients like Mutt is that they deal with individual pieces of email. This places a high mental cost on the user for each incoming email, by forcing them to ask: Should I keep this email, or delete it? If I keep it, where should I file it? I’ve spent the last 10 years of my life laboriously hand-filing every email message I received and feeling a mild sense of panic every time an email was both “from Mom” and “about school”.

The flip side of this is that once you’ve set up automatic filters you have to remember to go check those folders, which is a habit I’ve never been able to form. And once I’ve ignored a folder for two weeks, hell, I’ll leave it another day or two, why hurry to find out if I missed out on something interesting or if I let someone down? Or three days. Or…

And so I’ve poured all of my email into sup’s index and started mercilessly hacking away at that last couple hundred messages I hadn’t yet dealt with. Read and delete, or archive, or note on my to-do list, or suck it up and start another email with “I’m sorry it took so long to get back to you…” And now it’s cleaned out, to my pleasant amazement.

There is, of course, the terrible chance I’ve missed something important, but I couldn’t let that risk of something getting lost in the upheaval continue to paralyze me. Perfect is the enemy of good. If you’ve been waiting on a reply from me about anything and didn’t get it in the last few minutes, I’m sorry, please let me know. And if you’ve thought about contacting me but haven’t because I did so poorly with the last few emails, I’m sorry, I’m going to keep trying to do better.

Command/Query Separation

Objects contain both state (data) and methods, and methods should be classifiable into commands that change state and queries that introspect state. The principle of Command/Query Separation (CQS) expresses a design principle I’ve intuitively used as a rule of thumb. With the conscious consideration that comes from hearing it, I knew how to improve some of my own code.
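A minimal Ruby sketch of the distinction, using a hypothetical class rather than anything from my codebase:

```ruby
class Counter
  def initialize
    @count = 0
  end

  # Command: changes state; callers shouldn't rely on its return value.
  def increment
    @count += 1
    nil
  end

  # Query: reports state without changing it, so it's safe to call
  # any number of times, in any order.
  def count
    @count
  end
end

c = Counter.new
c.increment
p c.count  # => 1
p c.count  # => 1; asking again changes nothing
```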

I violated CQS in part of the design of ListLibrary.net (mentioned here). It was both beneficial and harmful (especially in debugging).

[diagram: example thread]

Emails are threaded together to show the flow of responses. In this example Chris, Mike, and Jon all responded directly to Tiago and there was a series of responses to Mike. It’s difficult to put these threads together well because email clients often don’t (accurately) include the metadata to specify the message an email is responding to. You sometimes just have to guess what goes where, and it’s further complicated by messages that are lost or (in date-organized archives) split off into other directories.

I started with the threading heuristic by JWZ. It’s a well-described, often-implemented bit of code. I implemented it in two steps: first, as messages are added, their references are examined to establish some basic relationships between them (JWZ’s step 1). The rest of the steps require more computation and are more likely to be invalidated as messages are added, so I do them lazily: the threading is calculated (and cached) when it’s requested… violating CQS. The query to list the threads also acts as a command to generate them.

This is both a performance win (yes, I tested) and leads to better-threaded messages: using all available messages reduces the number of ambiguous situations where it has to guess, and a wrong guess can persist even after other messages arrive that would have resolved it. But it’s also a great way to make my life difficult in testing: when I insert debugging queries they can trigger commands that change the way messages are threaded. It creates heisenbugs, bugs that appear, disappear, or change when they are being debugged.

I was really frustrated by heisenbugs for a while: I need to violate CQS for performance, but I’m creating all this extra work for myself in the most complex piece of the codebase. The solution is to create referentially transparent queries specifically for debugging. They introspect the internal data structures and present them in their raw forms without triggering threading (making them useless for non-debugging work, but that’s OK), and their tests can confirm this.
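Here's a Ruby sketch of the shape of the problem and the fix. The class and method names are hypothetical stand-ins, and the "threading" is reduced to grouping by subject; the point is the caching pattern, not the algorithm:

```ruby
class ThreadSet
  def initialize
    @messages = []
    @threads = nil
  end

  # Command: adding a message invalidates the cached threading.
  def add(message)
    @messages << message
    @threads = nil
  end

  # Query that secretly acts as a command: the first call computes and
  # caches the threading, mutating internal state -- the CQS violation.
  def threads
    @threads ||= compute_threads
  end

  # Referentially transparent debugging query: reports the raw internal
  # state without ever triggering the threading computation.
  def raw_messages
    @messages.dup
  end

  private

  # Stand-in for the expensive JWZ-style threading pass.
  def compute_threads
    @messages.group_by { |m| m[:subject] }.values
  end
end
```

Calling `threads` from a debugger can change what state later operations see; `raw_messages` can't, which is exactly what makes it safe to sprinkle into debugging sessions.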

I find the CQS principle to be a really useful tool. Its application leads to better-designed and easier-to-test code, and its violation can indicate bad code (a Code Smell).

