How side effects make ActiveRecord models unreliable
The underlying faults in familar code that breaks in surprising ways
Let’s look at some code from an app that processes Orders. It’s not a simple online cart system, an Order for tens or hundreds of thousands of dollars is built up over weeks by customers and sales agents interacting. The LineItems might have many Adjustments that nudge up or down the price of an Order as the customers negotiate or prices from suppliers fluctuate.
Deep in the Order model I found:
class Order < ActiveRecord::Base
has_many :line_items
has_many :adjustments, through: :line_items
# ...500 lines of code...
def adjustments_subtotal
sum = adjustments.sum('amount_cents')
Money.new(sum)
end
end
A simple two-line method, right? The first line finds the adjustments and fetches the overall sum. The second line wraps it up in Money object.
(If you don’t recognize that call to Money.new, the app uses Money-Rails, your best choice for handling amounts of money. It prevents subtle bugs caused by storing money in floats and, if your app needs it, keeps track of multiple currencies.)
This method looks totally boring, but I was looking at it because the accounting team was far from bored: they were freaked out.
Customers were sometimes paying the wrong amounts for their Orders. Or the accounting department was trying to bill the wrong amount. Wait a minute: the invoices the app gave to customers and accounting didn’t agree! Who was even right?
Soon enough the customers were freaked out because they’d finished negotiating an order but the salesmen were calling back to “double-check the numbers” and sometimes increasing the total. C’mon, you’re pulling this shady sales stunt even after I’ve got a printed invoice?
And accounting was getting even more freaked out and throwing around terms like “prior period adjustment” and “restatement” and “lawyers”. It turns out the government cares quite a bit that your books balance so they get paid the right amount in taxes. (Who knew?)
What went so wrong with this simple two-line method?
Well, I opened by saying I wanted to talk about why I want to write Adapters for ActiveRecord models, so you know the high-level answer is “side effects”. Let’s dig in.
Starting from the biggest symptom: the invoices that went to customers and accounting didn’t agree. Customers had theirs displayed after they finalized their Orders and tended to immediately print them out so their accounts payable can sit on them until the “net 60” period ends and it’s time to pay. The accounting team generally looks at the invoices when the checks come in.
So someone changed numbers in the month between those two, right? There’s a bunch of code to prevent someone changing an Order after it’s finalized, but a month is a long time for things to go wrong, it’s the likeliest place for a bug.
Except that as incorrect invoices got reconciled, accounting’s copy of the invoice was always right. The customer’s was always incorrect (and wow was that fun to explain to the customer). That’s the opposite of what we’d see if Orders were changed after they’re finalized.
Reconciliation also revealed that the invoices only differed in their Adjustments. And never in adjustments negotiated by sales, only the automatic ones that were applied based on supplier inventory, season, or favorite-customer discounts. An error that only occurs in an automatic process is very likely to be a bug rather than human error.
So I went over the code applying adjustments and their tests, but both were solid. I couldn’t even reproduce this bug manually.
I investigated from the outside in. The Order view displaying the invoice had two places where it dealt with adjustments:
# when displaying all the line items:
<% @order.line_items.each do |lineitem| %>
<tr class="lineitem">
<td><%= lineitem.description %></td>
<td><%= lineitem.price %></td>
<td><%= lineitem.quanity %></td>
<td><%= lineitem.total %></td>
</tr>
<% lineitem.adjustments.each do |adjustment| %>
<tr class="adjustment">
<td><%= adjustment.description %></td>
<td></td>
<td></td>
<td><%= adjustment.amount %></td>
</tr>
<% end %>
<% end %>
# and later, when displaying subtotals:
<tr class="subtotal-lineitems">
<td>Line Items subtotal:</td>
<td></td><td></td>
<td><%= @order.lineitems_subtotal %></td>
</tr>
<tr class="subtotal-adjustments">
<td>Adjustments subtotal:</td>
<td></td><td></td>
<td><%= @order.adjustments_subtotal %></td>
</tr>
<tr class="total">
<td>Final total:</td>
<td></td><td></td>
<td><%= @order.total %></td>
</tr>
This looks tame. There’s no complex business logic that might conditionally hide anything. There’s also no crazy code updating models as they get displayed to give me flashbacks to my PHP days.
As long as they’re mentioned, here’s those two other Order methods that work with adjustments_subtotal:
class Order < ActiveRecord::Base
def lineitems_subtotal
sum = line_items.sum('total_cents')
Money.new(sum)
end
def total
lineitems_subtotal + adjustments_subtotal
end
end
Let’s move from the view down into the controller:
class OrdersController < ActionController::Base
def finalize
@order = Order.find(params[:id])
@order.finalize!
CustomerMailer.order_finalized(@order).deliver
render :invoice
end
end
We could debate all day about whether that should be wrapped up in a service object (which is a big part of why I don’t like the term), but this is also boring. Let’s look at Order.finalize!:
class Order < ActiveRecord::Base
def finalize!
Order.transaction do
adjustments.each(&:finalize!)
line_items.each(&:finalize!)
line_items.each(&:create_automatic_adjustments)
self.status = :final
save!
end
end
end
A bit more work, but there’s no conditionals and a transaction tells us that Orders weren’t saved halfway through a change and left in a weird state.
Adjustments are our big suspect and they’re up first:
class Adjustment < ActiveRecord::Base
def finalize!
self.status = :final
save!
end
end
OK, that’s boring, there must be something happening in those LineItem methods?
class LineItem < ActiveRecord::Base
belongs_to :product
def finalize!
self.total = price * quantity
self.status = :final
save!
end
def create_automatic_adjustments
if SupplierInventory.count_for(product) == 0
Adjustment.create!({
status: :final,
description: "Restocking fee",
amount: Adjustment::RESTOCK_FEE
})
end
standing_dicount = order.customer.standing_discount_for(product.type)
if standing_discount.present?
Adjustment.create!({
status: :final,
description: "Standing discount",
amount: standing_discount.amount
})
end
# ...and more similar business rules with the same struture...
end
end
Let me admit I simplified create_automatic_adjustments for this example: those conditionals were encapsulated in other methods and the Adjustment creation actually happened in a method on Adjustment. I wanted to save the distaction of following this code another layer or two down the stack.
And really, however it’s broken down, all this code is all obvious and boring. There’s no smoking gun. But believe it or not, you’ve seen the bug.
I’ve told you it’s caused by side effects and I’ve cleaned away a lot of distractions, but the bug is still subtle. If you want the practice, take a minute to scroll up and down the code before you go on to the answer.
Last chance to solve it for yourself.
The bug is happening between calls. The first call to adjustments in Order#finalize! triggered an SQL query, and cached it. Later, only for some customers and some Orders, LineItem#create_automatic_adjustments created Adjustments.
When the view got around to calling #adjustments_subtotal (directly and via #total) it used the cached list of Adjustments instead of find them all again. If the customer printed the invoice right then and there, they didn’t see those automatic Adjustments.
Five things kept this bug subtle and alive long enough to cause a lot of stress:
- Customers had to print the very first view of an finalized invoice
- Few Orders got automatic Adjustments
- Customers were used to them and taxes, so they weren’t bothered to see an invoice change a small bit
- Customers paid net 60, so accounting didn’t see it for two months
- It struck at the end of a quarter when Order volume spiked up as sales cut deals to make their bonuses
The integrated tests failed for a few reasons: they trigged an extra page load, they they had their own copy of the Order model that had an empty cache, or they fetched the adjustments from the database to see that they were correct.
From the description of the bug, you might think the problem is caching and go to the one-line band-aid:
class Order < ActiveRecord::Base
def adjustments_subtotal
sum = adjustments(true).sum('amount_cents')
Money.new(sum)
end
end
All you have to do is add (true) to the end of the association call, which flushes the cache. Poof, bug gone. Right?
The bug is not gone, it’s lurking. Any other piece of code that looks through adjustments might need to flush the cache. We could add some more (true)s to our associatoins, or call .reload on the model. More explicitly, Order#finalize! could call to clear_association_cache after it called create_automatic_adjustments.
Oh, but there’s a subtle race condition remaining: if outside code had looked through an Order#adjustments before #finalize! and created one (or, less likely, even another user to the site), another user created one), that new Adjustment wouldn’t have its #finalize! method called. So we need another clear_association_cache.
I’m going on about what other pieces of code might do because that’s the real problem here: it’s possible to write totally innocent code elsewhere in the codebase and bring back this bug. What we need is a way to write software that is more predictable and reliable, that doesn’t need more and more integrated tests. Bugs like this aren’t in associations or caches, they’re lurking every place that any model has the side effects that look normal and familiar to us. I want to isolate them to Adapters because they’re always doing surprising things in caching, in updating the database, in tweaking other objects. When we have side effects we lose the ability to reason about what’s happening when in our programs.
This is the source of that fear you feel when you don’t know what other code might do to your models, or whether you’ll dig into a simple-looking call and find slow calls to an external network API four methods down, or wonder whether an ActiveRecord callback is a good idea but you can’t see any other way to make sure code runs every time a model is saved. All these things look innocuous and might be introduced months from now by someone who can’t possibly know all the consequences of tweaking a side effect.
What’s the real fix? Look back to the previous post about how to extract side effects to Adapters.
When ActiveRecord models don’t have side effects they become Entities. Those side effects all live in Adapters and Shells, which corral and minimize the parts of our code that are hardest to reason about. You can use and test your models without them having access to the database (or even knowing if they’re persisted in the first place), which makes their tests far easier: no saving to the database, no mocks, few or no stubs, and simple assertions that check the right value is returned from a method or that an Entity’s attributes have the correct values. That’s the value of Adapters.
Pop quiz, hotshot. Open a terminal to your Rails app and run:
grep -r '(true)|breloadb|clear_association_cache' app
This will find most of these band-aid fixes for side effect bugs. It will miss roundabout patches like order = Order.find(order.id)) and .save calls that are carefully planned to leave records in the database before another method happens to search for them. It will also miss tests that are “fixed” by changing FactoryGirl.build to FactoryGirl.create, but it still gives us a rough idea of how much pain there is.
Do you have a handful or were you scrolling the terminal back up to see them all?
I haven’t gathered statistics on these band-aids, but it when I look at lines of non-test code (run ‘rake stat’ and look at the leftmost column of numbers) in apps that feel reliable I’ll see 1 or 2 hits per 10,000 lines of app code. In the apps where I’m constantly firefighting I’ll see 5-20, and they tend to multiply. In the apps I want to restart from scratch I see even more. The highest I’ve seen so far is 39, and that app was only half-finished. I think the ideal number is probably zero.