Washington Post Update
Code: Django, politics, Washington Post, work
It’s been a long time since I’ve written
about what I’ve been up to at the Washington Post, so let me run down the apps
I’ve worked on since September 2007 in roughly chronological order. Many of these apps link to other places,
so if you don’t see projects.washingtonpost.com at the start of the
URL, you’ve probably wandered off something I directly worked on.
- Campaign Tracker
- I made it possible to browse events by type. I’ve also gotten data to academic researchers who’ve requested copies of this (I think) unique data.
- Voters Guide
- The 2006-8 Metro Voters Guide is probably the clunkiest app I worked on. Lots of confusion between producing a one-off site and a maintained app, as well as between primaries and generals. 2009 will be better because we’ve settled those issues and we’ve built a nice database schema for dealing with elections as part of the 2008 Presidential race.
- Congress Votes
- I loaded the new info for the current and previous sessions of Congress, added the ability to view just current members of Congress (as some join or leave during the term), and added JSON feeds that power the Apture displays. (Yes, I know about the NYT Congress API; I’m getting the signoffs to publicize the machine-readable pages.)
- Staff E-Mail Form
- Mostly just maintenance, but some generalization for shared mailboxes.
- Challenge Index
- This was formerly the clunkiest app. We fixed it by refocusing it on local schools (dropping national), and I rebuilt it in Dec 2008 to clean up the code. Then we imported data back to 1996, and it’s a really nice little app. Also the source of my 3rd rule of database apps (watch for the post).
- Presidential Endorsements
- During the primaries, we tracked endorsements from public figures, mostly politicians. The app is interesting today: the front page only mentions the last few candidates like Clinton, Obama, and McCain because we stopped advertising the candidates as they dropped, but their data is all still in the db. We’ve been meaning to resurface them now that the election is settled, but it’s hard to remember to find the time to be an archive instead of just news.
- Sellout Songs
- This went along with a cute piece in the Sunday Style section. Fun fact: Python has one constant for pi, the graphic used another, and the example songs used a third. If you play with the app, you’ll see it matches the examples (because that’s the only discrepancy folks might notice). This project is where I started dreading the Sunday paper: because there’s a Saturday edition (the “bulldog” edition), things have to be ready late Friday evening. And print folks are under a daily pressure that has driven them all a bit mad, so they tend to finish projects right up against the deadline. Since this project we’ve managed projects better to avoid the “OK, now that it’s 7 PM on Friday, here’s everything you need to build the web app” problem, and I’ve noticed the web has gotten more consideration in the print newsroom over time.
- Campaign Finance Search
- Candidates are required to report contributions over $200 to the FEC and we pull that data from them. We were showing state, city, and zip code totals, then I built a search engine to expose the individual records. (And we also track noteworthy donors.) Take a good look at the search page and results table so you know it as you continue reading this list.
- Dog Parks
- A small app showing off Dog Parks in the DC Metro area. We’ve talked about adding more features, but it’s never climbed high enough up the priorities ladder.
- Recipes of the Day widget
- The only (technically) interesting thing about this project was preloading a month’s worth of season-appropriate recipes at a time, because it’d grind down a producer to have to update this by hand every day. It’s a nice toy.
- Silent Injustice
- The FBI used to analyze the content of lead in bullets to match a bullet in a victim to bullets owned by a defendant. Trouble is, the science didn’t really work, and they quietly dropped it without notifying people convicted in whole or in part on this unreliable evidence. I helped put all the case files we could find online. This was a really rewarding project: it may have only mattered to a few hundred defendants and their loved ones, but it meant a lot to them.
- Primary Pages
- They showed live news and election results on election nights, and our database server struggled a little to keep up that first Iowa night. PostgreSQL doesn’t have good replication options the way MySQL does, so we beefed up the hardware before Super Tuesday and have been running great since. On some of the election nights, traffic to the projects server accounted for more than 10% of all WashingtonPost.com traffic.
- Primary Exit Polls
- Lots of fun breakdowns of data by demographic, location, response, etc. One thing I noticed on this project is that it’s hard for humans to talk about breaking down data in more than three dimensions (and three itself can be tricky). Funny how our visual and spatial skills affect the rest of our cognition.
- Post 200
- The Post 200 showcases 200 prominent local businesses; apparently it’s been a yearly staple for a few decades. In 2008 we built a site we could continuously update, rather than just drop a little data once a year. I’d still like to grow this app to encompass all local companies, or all publicly traded companies (because more data makes everything better, right?). The Post doesn’t do a whole lot of business coverage, though, so we may not be able to add enough value to make that all worthwhile.
- Clinton’s Schedule
- In the 2008 primaries, Hillary Clinton supported her claim of being more experienced with her time as First Lady, and eventually acquiesced to the call to release her schedule so people could judge her time. I found out the day before (while chatting at the water cooler) that her campaign would be dropping it off at the National Archives. Seemed like it might be fun to do something with the data, but we had no idea what format it would take. The next morning a courier picked up a CD from the Archives and we saw it in all its glory: a 17,481-page PDF that looked like it had been scanned from the fax of a photocopied inkjet printout. If you look at the news stories written that day (2008-03-19) you’ll notice they all take the form of “What was Clinton doing on this important date?” because reporters couldn’t search, just page through it chronologically. I used 200 Amazon EC2 instances to break up and OCR the entire PDF in a few hours and we put the searchable database online. It also became an AWS Case Study (note: PR people wrote that thing). The project was an absolute blast: making interesting data available by applying cutting-edge tools on a tight deadline.
- Young Lives At Risk
- I pulled nutritional data on 11k foods from the USDA (in tilde-delimited format, weird) and it powered an interactive shopping cart. Over the few weeks of this project, the USDA called me nearly daily to chat about nutritional data. I respect their commitment to nutritional data, but by the end of it I just wanted to buy them all puppies so my phone would stop ringing.
- Sacred Ground
- I didn’t work on the (very nice) Flash app; I updated the Faces of the Fallen app to add user-submitted tributes and the “Sacred Ground” templates for Pentagon victims. This looks really nice if you’re surfing from this story on the opening of the Pentagon Memorial, but it’s hugely confusing if you’re visiting from the main Fallen app. It’s one of those last-minute compromises that sometimes happen, and I plan to repair the damage.
- Business Glossary
- When the CDO market finally had to face reality and cratered, the business dept. asked if it would be possible to build a resource for all the (unfortunately) new terms readers were confused by. Working from the Politics Glossary (gee, eerily similar), we were able to respond with an entire application instead of just a promise to build one.
- Thrift Savings Plan data
- The iframe near the bottom of the page with TSP market information (sort of a 401k for government employees) is a small app that pulls data from the TSP.
- Vote Monitor
- Our team kept some empty space in our schedule because we knew the Politics team would surely come up with a last-minute idea for an election app, and they did. After some debate about whether to stay local or open up to national reports, we built a great app to allow people to report their voting experiences. I wish we’d had the idea sooner to better advertise it and work closer with other organizations that monitored the elections. Note for 2010 to-do list…
- Obama Transition Donations
- Hey, remember that table of donations to presidential candidates from back when? Obama, big on transparency, has released information on the donations to his presidential transition and we’re presenting a fairly familiar view of it. I believe they’ll release the data on January donors in about two weeks, so we’ll be updating then.
- Clinton Foundation Donations
- Hey, remember that table with donations… yeah, you remember. As part of Hillary Clinton’s confirmation process as Secretary of State, the Obama team required the Clinton Foundation to release previously-private donation information. They grudgingly posted it to their website, which crashed under the load. We scraped what we could as the site was intermittently up and presented it.
- Bush Legacy Timeline
- Finally, reaching a project that went live two days ago, the Bush Legacy Timeline graphs key indicators against important events. I created the database to store events and helped out with the Flash coding.
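A footnote on the Clinton’s Schedule item above, for the curious: I haven’t described how the 17,481 pages were divided among the 200 EC2 instances, so take this as a minimal sketch of one plausible approach (an even split into contiguous page ranges), not the code that actually ran. The function name `split_pages` is made up for illustration.

```python
def split_pages(total_pages, workers):
    """Divide pages 1..total_pages into contiguous, near-equal (first, last) ranges.

    Hypothetical helper: the first `extra` workers each take one additional
    page so that every page is assigned exactly once.
    """
    base, extra = divmod(total_pages, workers)
    ranges = []
    start = 1
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more workers than pages; leave the rest idle
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Page and instance counts taken from the item above.
chunks = split_pages(17481, 200)
print(len(chunks), chunks[0], chunks[-1])
```

Each range could then be handed to one machine to extract and OCR on its own, which is what makes a job like this finish in hours instead of days.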
Some other projects where I haven’t done much more than fix bugs are Local Blog Directory, Recipes, DC Schools, Political Ads, Faces of the Fallen, High School Recruiting, and Inauguration Alerts. I’ve also built a few internal tools to automate or streamline the editorial process, but they don’t make for compelling reading and this post is already plenty long.
In case you’ve wondered why I’ve called the Post the best job I’ve ever had, scroll back up and read it again. Tons of interesting, meaningful projects with more talented professionals than I could manage to list, but let me try so this post doesn’t make me look like a braggart:
The foundation was laid by Adrian Holovaty and Derek Willis. It’s the nicest codebase I’ve ever inherited. Almost every one of the apps I listed is well-designed because of Alyson Hurt. Several of these data sets came from Sarah Cohen. I’ve worked a bit with the developers Dan Berko and Ed Holzinger. Ryan O’Neil is the other developer on the Tools Team who builds all these fun apps with me, and he occasionally saves me from my bugs and design decisions. There are also all the news producers who created and managed apps: Jason Manning, Sarah Lovenheim, Paul Volpe, Andrea Caumont, Amy Kovac, Tanya Ballard, Amanda McGrath, Jacqueline Dupree, Alicia Cypress, and Amanda Zamora. Last, it’s been possible because of great bosses Bob Greiner, Karl Eisenhower, Jim Brady, and Dave Baker.