We fixed seven out of the four bugs
Streamed
- More #1318 fixing comment confidence/sorting bugs - found float precision issues causing negative confidence values. Switched to BigDecimal; dithered about clamp.
- Switched calculated_confidence to use BigDecimal instead of Float to avoid precision errors
- Fixed admin-only ban enforcement in controllers to match view restrictions
- Migration to fix cached comment scores/flags
scratch
topics
https://github.com/lobsters/lobsters/issues/1318
layers and layers of memoization and mutation
comment 123 has multiple versions: score conf c_o
comment 123 row in the 'comments' table database 3 .3 correct
comment 123 object in the ruby vm 3 .3 correct
- call update_score_and_recalculate! 0,0
comment 123 object in the ruby vm 3 .3 correct
comment 123 row in the 'comments' table database 2 <- correct! correct
comment 123 object in the ruby vm 3 .3 correct
comment 123 row in the 'comments' table database 2 <- yes .3 < old! wrong
now the db row is in an inconsistent, impossible state (though mostly converges in practice)
do we have confidence in the new calculated_confidence?
enforce only admin can ban in controller
Transcripts are generated with whisperx, so they mistranscribe basically every username and technical term. They're OK but not great, advice appreciated.
Recording
17Okay.
Not as much pre-announcement tweets on this.
Some morning stuff ran long, so I'm just going to jump right into it.
gsora_ hello!
I suppose I better jump over to the blog and get set up.
You get to see a little bit of plumbing here.
...47There we go. Oh, hey, good morning, Jisora. I recognize you.
...58gsora_ :)
So today, I have a couple of things going on.
There was the fixes for the 1318 issue that
DZWDZ, that's a mouthful of a username, had raised.
And on the last stream, I spent a bunch of time going through all of the fixes.
And we hit a really nice spot where we had all the fixes.
We fixed seven out of the four bugs.
And eagle-eyed readers here, or viewers, I guess you all are,
will notice that I didn't end up pushing the two commits from the last stream on Monday.
What's going on there is this last one ended with needing to make a migration to clean up the old comments.
And I started running it, but it was clear it was going to take at least a couple of minutes.
And it did eventually finish off stream
Well, actually it crashed off stream and then I played with it some more and I found some more issues that I thought were worth talking on stream.
And the issues with calculated confidence and comment sorting are generally pretty minor.
The whole thing where negative comments could get score sorted between positive scored comments.
It's not the end of the world because they're fairly uncommon, but
I wanted to fix it correctly before I put it live.
So let me show you what I mean.
02:59What did I call this migration? I called it recalculate.
03:09Oh yeah, and then I had some kind of weird ass kernel crash again. I'm wondering if my box has dodgy hardware. When we're having these hot, humid days in Chicago twice now, I've had hard crashes. That's no fun. Hopefully it won't come up on stream today. So yeah, I wrote half of this. So let's just scroll that off so I can kind of focus on this migration. The migration was that We want it to find all of the comments and update each score. Basically clear out the cache because the schema, the way comments work is that they memoize. Let's find them. They memoize score and flags. And flags is a count of all of the times a user has flagged that comment. Pretty straightforward. These are fairly low numbers. We looked at some of the distribution on the last stream. So you can dig into the stream archive there and skim the transcript if you want to catch up. And then the other one is score, which is the number of upvotes minus the number of flags. So let's look at what this update score and recalculate method does. Because I caught actually two issues with it after writing what I thought was a very straightforward migration. Actually, did I catch two issues or three? No, just two. It's kind of funny. Even the borders between bugs are pretty, fuzzy where it's like, is this one bug? Is this two bugs? So this update score and recalculate, it takes in the score delta and the flag delta, which is another level of memoization. There are so many layers of cache and validation because voting is the most common user action on the site and happens many thousands of times per day. And so I cared a lot that this ran in a very fast, hopefully just one query method rather than the Rails callback soup where you end up with firing off 10 or 20 queries. We have a couple of places in the code base we have that and we have the deadlocks to prove it. So I've spent a bunch of time ironing it out of the most popular action.
06:03forifone love your font
so there were two issues and the first one is a little subtle i realized if we call this on all comments it's not going to give all comments the correct confidence value and you might look at it and say okay well if we're calling it with zero zero we're not going to change the score flags which are the
records local values the and in fact we don't use those deltas oh thanks hi for a phone the font is in consolada and the italics are operator sans if you follow down under the stream there's a link to the blog archive there's links to those fonts and such the
Calculated confidence is this complicated function we spent a bunch of time debugging on Monday.
And we ignore these score deltas and tell the database, well, set them to what they should be.
Let's do the math for what's the score of this comment.
And let's do the math for how many times has it been flagged.
But then the bug sneaks in here.
Actually, two bugs sneak in here.
yes there are three yeah i didn't remember so the first one is this call out to calculated confidence which is the ruby method that generates a zero to one score for how should we rank this comment one is very good zero is bad well the hassle here is calculated confidence
Because it's a Ruby method, it has to run on these memoized values, and not just the memoized values, on this Ruby object's copy of the memoized values.
Let's make this concrete.
Imagine we have a comment.
We're gonna call this layers of memoization.
You'll see where I'm going with that.
We have
Comment 1, 2, 3.
And there are kind of two versions of comment 1, 2, 3.
08:37Let's say multiple because I can see where I'm going already. So the first one is comment 1, 2, 3 in the database. Let's say in the comments table, right? We'll say it's a... And then the migration runs and it loads the comment out and that becomes comment one, two, three object in the Ruby VM. And it has its own copies of, let's do ASCII art. We can make a little table here. So we have score and I, yeah I can express all of this with score so we're just going to ignore flags all of the same issues exist but it's not going to add anything for me to to duplicate the work here so let's say a comment has a score of three in the database well when we load it it's going to have a score of three and then we have we call update score and recalculate And what that does is it updates both the object and the database table. And we know this, right? attributes so okay well i'm passing zero zero let's explain what we're doing here we call update score and calculate so this is a little bit of like temporal order right top to bottom so if we call with zero zero we know the score on the vm doesn't change However, we don't then reuse that value. We say, let's update the comments and set the score to, we're deliberately busting the cache here and making sure that the score is set to the correct value based on other data in the database. Well, that's okay. It should be three, but what if we had another bug? What if At some point in the last couple of years on this comment, score and the number of votes in the table got out of sync. You can imagine a lot of bugs. All of this clever stuff with Delta is very fertile ground for bugs. On previous streams, I have covered that there is pretty exhaustive testing on this. And when I've edited this code, I've been really careful to build out truth tables and stuff. But it's possible there's an error. So now we say, okay, we'll set it from the votes. That could be the correct value. But then let's look at what this method is doing again. Now it says the confidence equals the calculated confidence. Calculated confidence works entirely on the Ruby attributes. It's working on this object.
11:59So we're going to just give kind of a realistic value here, probably something in the neighborhood of 0.3 based on the other values we saw. Well, it ran on the wrong values, so it's going to stay 0.3. And then it's going to update the database record, and now the database record is in an impossible state. It has a score of two, but a confidence of 0.3. Right? There's kind of a...
12:49I like ASCII art. Just say yes. These have gotten out of sync here, where now the confidence value, which is itself a kind of cached value, is calculated out of inconsistent data.
13:18The second issue, and this one is really subtle, is the way update queries run confidence order is going to use the confidence column but it uses the value of the confidence column from before this so the update is not a imperative set this column set this column set this column set the fourth one it takes a snapshot of the comment row before the query. That is the value that gets used in the conference. So I said we had multiple versions. We don't have two versions. We have the row before the query and we have the row after the query. And so then the third issue is and I'm just going to call this confidence order. I'll have to space it out. So even if this one is correct, and this one is correct, and this one, and this one, and this one, wrong, because this one is, it's sort of twice the problem of confidence. So calculated confidence is using outdated data. then confidence order is using the previous run of update score and calculate wrong data so once again this is why cache invalidation is the hardest thing in computer science is there are like six different instances we say comment and we say update And we have to know that there are all of these different copies of comment happening, even when I wrote this query in a really paranoid kind of go tell the database to use its own data because we know it'll be correct. And given that bugs in these deltas are pretty rare, I say that having checked. This eventually converges and it's out of date, but it's not really wrong. And so then this hung out for years and didn't get noticed. Ouch. So this isn't layers, it's layers and layers of memoization and mutation.
16:10Now the DB row. is in an inconsistent possible state though mostly converges so what do i mean by mostly converges in practice let's look in the database here i have a copy of the lobsters production database and the really interesting question I lost the query. So I'm going to have to rewrite these queries offhand because of the machine crash I mentioned. So let's select a couple of fields off of comments. Short ID, score, flags, from comments. OK, we can kind of imagine what that table looked like. I'm not going to select all 550,000 comments. We need a filter and actually i'm going to offhand i'm going to just order it because I know what that I want for that. Well, what i'm curious is, are there any places, the score and flags memorization has failed.
17:29So what I want to know is where score does not equal. This. It shouldn't be possible, right? It's what I've always set it to. Or. Yeah, I don't just want flat score, I want flags.
18:00So flags.
...10And now we're going to deal with an annoyance of SQL, which is I want to see what these values are. I would have to write a common table expression or some other kind of hoop jumping for this. I'm just going to repeat the subqueries. But I have to repeat them because if I name them here in the columns, I can't just refer to those in the where clause. And I understand why that is. It's just a frustrating limitation of SQL sometimes. SQL is one of those languages that's kind of accreted over the last 50 years. There have been a couple of examples in the last five years or so of people trying to make nicer syntax for SQL. I'm a little bit hopeful that one of them will catch on.
19:11So if we have 550,000 comments in the database, would anybody like to guess how many of these records will show up where the score is memorized wrong or the flags are memorized wrong? And I will give you a hint. There are kind of two correct answers to this question. And one of them is based on code we fixed last week. I will give you a big hint on it actually. Let's go look at the definition of calculated confidence. This is the version of calculated confidence that was written last week. All right, no takers.
20:05There used to be a line of code that I deleted last week that said, when deleting a comment, set the score to negative 10. And that was attempting to shove deleted comments down to the bottom of the order. But update score and recalculate didn't know that. So if anybody voted on a deleted comment, it would get changed. And it was a really crude way of busting the cache when really what we want to just do is express. We'll put it to the bottom once in the code as opposed to having more call sites know a lot about the code.
...54I have order by. So this query is going to take, I want to say around 15 seconds. The The hassle here is all of those records that were deleted had their scores set to negative 10. And now they look like update score and recalculate has a bunch of bugs. So this is, come here. So lots of comments where there is a score of negative 10. and a score actual of a very different value. Well, that's not great. There are also these three that pop out at me. They have a score of one and a score actual of zero. All right, so let's filter it down. We don't want to see... And I am going to... I always add parentheses, because I can never remember the precedence for and and or so we care about either of those conditions either score flags are wrong and we care. We say and not score is 10 and the comment is deleted. or is moderated. And in this context, the. The semantics of is deleted is the user deleted it, and the semantics of is moderated is a moderator deleted it, which took me like a year of working on the code base to learn it. But those semantics are slightly different from the exactly the same named comments on stories. One of these years, I'll clean that up.
22:54And we are seeing a little of what happens when I've actually coded off stream. Because I look, you know, kind of clever for knocking that out on the first try. Oh, guess not. Did I want to not score? I don't want to DeMorgan this. Yeah, so I want to say, show me the records where the score isn't 10 with it seems much cleaner to express it this way i don't want the records oh not positive 10 negative 10 simple typo this took a couple of iterations to actually write i'm retyping it from memory from i don't know monday night so those three records that popped out at me There are the only three records in the database that have their score or flags memoized wrong, except for the ones where the code had kind of been deliberately, I hesitate to say poisoning, that's a little strong, but deliberately putting a wrong value in there to make sorting work, which is not a great way to couple it, which is why we fixed it. The other thing that pops out at me, well, there's a couple of things I noticed. Number one, all of them have a score of one and a score actual of zero. Oh, I just realized how they're produced. The other thing that came out is they all have really close, they happened around the same time. So there smells a lot like there was a code change that went out in August of 2020, and then it was just a little bit buggy, and it got fixed in September, October, somewhere in there, and it didn't recur. Or there was something odd in the setup. One of the things that could happen here, because we saw the seam, Depending on what the transaction around this looks like, I almost wonder if when the voting code was running, it inserted or removed the vote record from the database and then crashed before this ran. It's never had a syntax error here. I would remember that bug, but we have occasionally had deadlocks. So it is possible that these are three instances where voting code hit a deadlock and so it managed to delete the vote but not that so in any case i i wouldn't guess that it is worth spelunking four-year-old code to fix exactly three records if it was more than three if it was ongoing if it was recent i would spelunk but this feels like There was something very odd and very rare and it's gone now so I don't have to worry about it. Yeah, it I always get a little nervous like the hair goes up on the back of my neck when I see weird old data and I don't know what because like there was a bug clearly at some point a bug happened and I don't understand it. But it's always a investment of time to decide. I'm going to try and figure out this bug. In this case, probably the bug is gone. I'm going to leave it. The other conspicuous thing that pops out at me is the score is always one. Comments by default, a brand new comment has a score of one because the submitter automatically upvotes it. then if the score is actual that means the submitter deleted their vote or perhaps you know a bug could have deleted their vote so that implies to me that's a really uncommon scenario that somebody unvotes their own comment it can happen hmm
27:26hejihyuuga Good morning pushcx. Good morning chat. Hope you're well
so those were the three issues that came up running the migration oh hey hedgy nice to see you yeah i'm actually doing real well so i can fix this second one we'll say the new confidence equals calculated confidence and then also
We'll use that value here.
28:03So there's one of the bugs gone. And realistically, we now have really high confidence that maintaining these deltas and running this update are happening properly. So I feel safe to say that the first bug is gone as well. If I saw a lot of comments in the live database where these score memoizations after this delta match math weren't matching up, I would say, oh, we have a bigger problem. I'll put it up here before the big comment. Then I would be concerned. But I think we got it.
29:01hejihyuuga !faq
So we saw that there were 893 rows with this old cache-based approach to re-sorting deleted comments.
pushcx https://push.cx/stream
I don't actually have a bot in here, Hedgie.
hejihyuuga Thank you
And I don't have a separate page.
The FAQ is just over on the stream archive, so scroll down a little and there you go.
This data, if I were to run the recalculate all, this minus 10 is going to get involved here, and it'll give nonsense values for confidence.
That's not what I want.
So I started updating this migration to say we want to refresh the memoization of scoring flags because three comments, and I just noted which ones they were,
case i ever want to dig back and explore in the database had the wrong memorized score and then second delete for user used to set flag moment score now handled by calculated confidence and it's funny because i know exactly why i paused here it's i'd never remember the
Active record syntax for I want to run these two.
I want to say basically update columns.
And I don't know if I can just turn this into a hash and run it.
Let's see.
Sometimes one of the things I really enjoy about working in Ruby and Rails is
Other people with similar design sensibilities have worked on the code base.
And so sometimes you just kind of like guess at an API or you guess at the name of function and it's there.
It's kind of delightful.
So I'm going to grab this code.
And then I'm going to run it in the console.
I don't want to run the whole migration.
I don't want to do the the
old dance of running and rolling back.
hejihyuuga At this point, is ruby the language you've worked with longest in your career?
Oh, I can't paste.
Can I do this?
There we go.
Linux clipboard.
At this point, is Ruby the language I've worked with longest in my career?
Yeah, I think it must be.
I started out professionally.
I started out doing PHP.
And then after
A couple of years of that, I moved into doing Python.
And then I moved into, well, PHP that I was supposed to port to Rails, but that boss was a lying liar.
So he reneged on that.
And I think I wrote PHP for the remaining two months before I found another job and departed that one.
Because when your boss tells you big fundamental lies, run.
I don't want to.
Let's see if that works.
Did I mean?
No.
Update columns.
And then after that, kind of alternated Ruby and Python, but there's been so much more Ruby.
As a full stack developer, it's primarily been Ruby.
32:51But I'm thinking about it. The other thing is, I guess you could also say I've worked with JavaScript the longest, because regardless of what backend language I'm using, until very recently, you had to write JavaScript to write for the browser. So yeah, I would say it's mostly Ruby.
33:30So I'm puzzled why update columns didn't run on that.
hejihyuuga Noted the remark about running from lying bosses haha
I think it maybe is only on a active record instance.
And then with the relation, it's update all.
And can I just pass?
Yeah, I think I can just pass like this.
...57I guess updateAll doesn't return a relation. I wanted to peek at the SQL to make sure it was going to run the right thing. That's not at all the right thing. Wow, it's a good thing these are memoized values, because that just nuked a lot of database data. It must have seen that score and flags are integer columns and called 2i on the string. And then I think if you have an arbitrary string with no numbers in it, yeah, you get 0. So update all. Not exactly what I wanted.
34:44I really wanted to pass a SQL fragment.
...52Can't do it with... Oh, I have to go straight to the...
35:03So it'll let me do what I want if I pass in a string instead of a hash. That's fine.
...18It's a fairly fast edit. even with read line not working so that's going to take a couple of seconds to run and in the meantime i am going to while that runs in the background i was mostly doing that to kind of shake out my knowledge of the api because i was not confident that this was correct so i actually don't really care that it finishes running i mean it did good but let's break this up a little for readability
36:21And if I get rid of that, there we go. So I'm solving that problem of the comp update score and recalculate running on the memoized values by just fixing all of them in the database. And then running all of this. So this migration takes, call it about 15 minutes to run, because for each of the half a million records in the database, it's going and hitting votes. And what is that?
37:13When count star takes a second to return. Okay, so it's 3.4 million. So at that point we're we're doing an n squared kind of amount of work, so I don't want to run that whole migration on the stream, I want to start it yeah I can start it at least.
...39But now i'm i'm happy with this migration now it's going to do the right thing. Although the one thing it's going to do is take a track transaction.
...55And since I'm fixing old data in the database. I don't actually want to lock comments and votes for that 15 minutes or so. So we're going to go ahead and say. there it is if i locked comment and vote for 15 minutes that's a lot of people at least yeah probably three or four people writing comments and 10 or 15 people who are voting and then They're either getting 500s or their data is getting thrown away. Either way, that's unpleasant.
39:03Incrementally.
...18long enough, it's worth another line.
...31I don't know DZ if you're here, but you had said on the last stream that you thought there were two migrations to run. And I wasn't sure if you were referring to what I just wrote of Busting these caches and then calculating all the confidences. If you were referring to those two steps, you thought they should be two different migrations. Or if there was something else, if you are on the stream, please pipe up. I know I asked last time, but I think we got distracted and never came back to it. So I wanted to follow up if you happen to be here. This one. The fixing old data is the easy part. Standard, why are you mad at me? It's a breakman warning that there might be SQL injection. I'll have to run breakman-i because I changed the signature of this. So breakman is this really handy gem for catching things that like, hey, you just directly put a method call or an attribute into a string and threw that at the database. That's pretty hinky. That's a pretty good foot gun. And break man is very, very clever so it also I think it takes like an md five some of what this string is, and if we touch that it says hey it changed you better look at it again and so i'm going ahead and writing the note to say that this is safe.
41:31And then this fingerprint is removed. That'll be the old version of the query.
...41So I appreciate BreakMan quite a bit. When I added it, you know, it had a bunch of these kind of false positives. You know, there are now 22 ignored warnings. I want to say one or two of these I looked at and I, as paranoid as I try to be about security stuff in Rails, And it's easier to be paranoid about security stuff in Rails, having worked in Rails for 20 years, because I've seen all the security stuff get added one at a time, as opposed to having to learn it all at once. I was still caught by one or two things where I was like, maybe that was a potential problem. So I'm really glad to have added that. Let's run the whole suite here real quick. So I will amend that commit one more time. At least I don't have to force push.
42:41Oh, yeah. Well, I can't run the suite with migrations pending. I can at least run the linter. That was the one I didn't run. That looks like a Debugging statement got left in. Let's go find that. 206. Yeah. Oh, that's... At the end of the stream, as I was writing that migration to update all score and recalculate, that's one more topic.
43:29A running topic in the last stream was I had treated calculated confidence as a trusted black box and written code that was based on the range of values I saw in production because I just trusted the math and I didn't want to dive into all of this complexity. And I missed a fairly obvious clue that the range of confidence should have been zero to one. and it wasn't in production. And what DZ discovered was this function has always, until they caught it and we fixed it on stream on Monday, has had bugs and calculated confidences incorrectly. And one of the things we talked about was, well, we could make a kind of test out of that. We could make an assertion out of that. That you know, this is not a full we don't have these nice post. it's nice way to say that we are going to assert on our output value, so we have to do it in this very verbose way. Because yeah one of my spice your takes is that most design patterns are. gaps in a language or misfeatures in a language, and then you have to have a design pattern to work around it. That's part of why... And maybe I don't even hold the spiciest version of it because there are still design patterns in Lisp, which is as close to the AST as you're going to realistically write. I don't know, maybe that is the design flaw in Lisp is that you have to implement the language as you go. So I wrote this test so that when the migration runs, which is going to touch every line in the database, I wasn't sure if I would keep that. The way I wrote that at the end of the last stream was that this was a raise. hassle with arrays i honestly expected it to pass i thought the fixed calculated confidence would always return something in that range and then i guess it's kind of a giveaway here well if i change it to puts you know it failed at least once and i wanted to see how many times it failed you have an error in your sql syntax when i pasted that back, we've picked up a space there. Give it, what was that? 10, 15 seconds to get past the first query and into the next.
46:36And I wanna say for, Three or four records.
...48This printed out a value and said, oh, the confidence for this particular comment.
Let's change that and start again.
Cause I don't want to always show raw IDs on stream.
The.
couple of confidences that it printed out were like negative 2.2 way out of spec and I saw that and I felt a dread of oh no we have this formula with 20 terms in it I'm gonna have to go back and forth between this and Evan Miller's blog post for another hour trying to figure out where the typo is
And I had a thought, which is, you know, anytime you have work, I think, can I get out of doing this work?
And then that's when this stuff occurred to me of, oh, update score and recalculate is going to be running on these old values.
Maybe this is a function of that score is negative 10 kind of stuff happening.
And so I thought, well, let me just punt on it and.
arh68 so is there like a formula version of that ? left lookin sus
arh68 also morning everybody
fix the bad data that's in the database and then see if this persists later so we'll have to wait on the output of this is there a formula version of that left is looking suspicious well left is the one we fixed before and if i pull up these sign in with google things are such a pervasive nuisance
Good morning to you too, ARH.
I've wondered about your username, ARH.
Is it ARH68 because you're a baby boomer or is it because you're almost nice?
49:05gsora_ ha!
arh68 arh67 was taken, presumably
So down here, Evan Miller's post, luckily Ruby was very popular in 2000, what eight, nine, early 2009 when he wrote this post.
So he gives his own version of it.
And I was, I noticed stuff like he names his variable P hat.
Come on, because he doesn't want to call it like new probability or.
think what it is is the number of positive interactions but i don't know why he puts the circumflex on it because he doesn't have another p variable this is anyways so if i put this in i am replacing one opaque formula with another opaque formula and i don't have a lot more reason to think this doesn't have any of those kind of order of operation
bugs because I don't know that he wrote tests on any of that or has comprehensively tested that by using it in production.
So I don't have more confidence in the code.
So I was like, all right, let's just default to the thing that is less churn.
And then he does also have a SQL version of this.
And I've said somewhere, probably
probably in a comment somewhere, that I wouldn't mind moving calculated confidence entirely into the database.
50:45But I've looked at this formula and this is an entirely different formula.
So
he talks about doing a 80% confidence, and here he makes it an argument, and elsewhere, up higher up, he talks about it as an 80% confidence, and here, this code is basically ported over from Reddit's code base, and it also used this value of 80% confidence, and then here he uses this value so that he can get 95% confidence, which is a fairly minor thing, but,
Is it this one?
I don't recall.
It must be this one, the 1.96.
But there's just so many fewer terms in here that this cannot be the same formula.
I can't immediately spot everything that's missing, but there's a bunch of taking things to different powers and squaring them that just simply isn't present.
arh68 it's like that scene from Lost in Translation lol
And so when I read this, this sequel, I didn't totally get what he was trying to say because they cannot be the same formula.
I guess it's this sentence that the Z score never changes because a big chunk of this, well, maybe that's what it is.
So it doesn't really name the components of this.
This is why I struggled to read it very clearly.
But I think what he's saying is that here, this section that gets called under in our code is a constant value.
If you take for given that you have a constant confidence, the only input here is Z. I'm not sure what's up with that alpha.
I'm up at the limits of my
reading math formulas and then also some of these terms here like all of this stuff related to z here here and up in this numerator they're going to produce constant values and so i think what he's saying is that he's simplified stuff out for this sql query
Yeah.
I don't mean to look a gift horse in the mouth, but it is not how I would have written this.
I would have written, like, here's the direct porting, and then here's the version of it where we are refactoring because we can figure Z is a constant value based on confidence.
And then here's us porting the final version over to other languages.
I understand how that significantly more work for him so that's no criticism of hey you didn't do enough free stuff to make it clear but.
That was a little rough.
And since we memorize confidence anyways.
I have a pretty strong aversion to putting something like this in an order clause.
54:17Since I'm going to memoize anyways, I might as well memoize in Ruby. There really hasn't. It would simplify update score and recalculate, and it would remove that prospect of not correctly maintaining the score deltas. But I'm not sure that gets us a lot of value. And it's a bunch of risk around something that is it's not hard to write a golden master test for, but given the complexity of the formula, it's hard to write a test where I'm confident about getting out the correct values. And so I have an inertia in my coding where I'm like, well, let's make at least, at least make a small change. Cause then in six months or six years, when I come back to this code and I'm confused and frustrated by it, at least I can follow the history as opposed to, You read back and, oh, we threw it all out. You have to start over.
55:27arh68 wait so in the Math.sqrt ... what is added ?
All right.
How's that migration going?
Yeah, still running.
So in the math square root, what is added?
I'm not sure what you're asking about here arh are you referring to like what are the terms here Z is a constant and is the number of positive and negative sentiments or up votes and flags.
arh68 I see, the /n is distributed. nvm all good
And I think what's happening in his query is he's basically factoring out the Z.
Because it's a.
constant factor yeah the n is distributed yeah it's it's more arithmetic than i care to do on stream because or i should say algebra because it's such a fiddly thing but i can see how he rearranges the
I can understand how it is possible to rearrange this formula to get something like what he did, but I would almost want to do it by hand to have more confidence in his code.
I don't know.
So all of this is related.
I'm amending the commit again.
Once this migration finishes here on my copy of production data at that point i'll feel enough confidence to push this commit to master on github and deploy until then it's just going to hang out for a minute.
57:29So on the. previous stream there were a couple of things actually this is actually so let me be clear i'm going to change gears away from this code unless anybody has any more follow-up questions i think it is complete i think it is going to run fully and then at some point after that migration runs without any errors i'll have pretty high confidence in this code and i'll just delete this assertion or i'll convert it into arrays But either way, it's not worth just sitting here on stream and waiting for 15 minutes or so for it to finish running locally.
58:19If I were running this on a huge production database where instead of half a million comments, I had 50 million comments, then I would be breaking this out into multiple threads. It would feel like it was worth the investment. But again, Lobster's is. much smaller and doesn't need that level of professionalism but this is it's one of those things that you see an example code as highly parallelizable or embarrassingly parallelizable man tough word i have to slow down on it all right so given no objections i was going to pull up the couple of odds and ends that we didn't get to last stream.
59:18Because this turned into something quite big. I wanted to enforce that only the admin can ban. So there's actually let me grab that language i'll bring it in here you're still running you're going to run for a while hopefully i'll come back to it before the end of the stream i'm not going to run terribly long today because i'm headed to a conference and i have to head out I don't know, about half an hour. So this will be a fairly short one. There's something I noticed. We were in the chat room discussing how bans happened, actually. It's always a fun meta conversation.
01:00:28There's this chunk of the user profile here that says, if the user is an admin, print them some admin information. And this section is the forms for banning and unbanning users.
...52That's all fine. very deliberate choice about the site that the admin can ban people and the moderators can't admins mods can delete comments delete stories or edit stories nobody can edit comments that's not a feature i i i'm speaking a little universally there but i mean as a moderator kind of action users are allowed to edit their own comments for a few minutes this check in the view of is the user an admin before we show them that is not reflected in the controller. The controller says you have to be a logged in moderator to do these actions. Which one's correct? I guess I answered this, that the view is correct. Only the admin is supposed to be doing these things. But it is not what is expressed here. And I'm not worried that the mods are going to go rogue and, you know, forge the HTTP requests or edit the form to send that. But it would be a nice little thing to fix up.
01:02:30I can't remember if I defined a helper. So there's, yeah, okay, there is one. Require logged in admin. So this is a very small fix. It's a very small fix for, oh, we saw something interesting over there. Let me finish the thought here. Require logged in admin. This one, would I make any tests for it? No, not really. It's never happened in production. Someone reading the mod log would have noticed.
01:03:32It's not totally impossible if I dug back that I would see that JCS implemented both of these at the same time. And so his intention is unclear from the code, but he's made comments elsewhere that makes me think that he was very deliberate of only the admin should be able to do it. It's worked out well for us. So I don't quite want to change that. So I'm going to go ahead and commit that, and then we're going to go look at the migration that's warning us of badness. I already committed. Great.
01:04:15It's kind of funny to see a query that runs for 10 or 15 minutes, and then one of literally half a million records is wrong. I want to say it's in the neighborhood of three or four. You know it's fairly rare if it takes minutes to get to it.
...49So I want to look at these values. I also want to literally go look at it. Yeah, that's really uncommon. That is a very strongly negative score. And those are rare as heck. And since I'm logged out, we probably can't even look at the comment. Wrong clipboard. Man, who gave Linux like nine clipboards? I'm increasingly tempted to unify all my various clipboards.
01:05:33Yeah, that's literally a little mean. Hmm.
...47Okay, so the primary thing that's interesting about this is the score is negative seven and the flags are seven, which means the submitter removed their own upvote. That smells a little bit like the other bad data we saw. How many comments have a negative 7 score? Probably very few. 50? Yeah, sort of. 1 in 10,000? Let's ask a different one. How many comments don't have a matching vote from the author? I'm going to bet this is it's going to be higher than this previous one, but it's going to be pretty darn low.
01:06:52So where the vote is there and.
All right, is it left out or join?
I get a matching row whether or not it found something to join.
Boy, does that not look like the right syntax.
Yeah, that must not be the right syntax because there's no way that the authors on voted on 518.
Did I reverse that?
arh68 my mind says where not exists but i'm still drinkin coffee
Is it right out or join that I wanted there?
No, I think it's left because I want all of the rows from comments
My mind says where not exists.
01:07:38Yeah, that's the easier way to say this.
...58Oh, that's...
01:08:04chamlis_ did you mean to join on user_id?
That's the issue with the other query.
Oh, read line.
I lost it.
I did mean to join on user ID.
I meant to join on comment ID.
All right.
Let's just put it in one of my nine clipboards.
Yeah.
Shamless, you got it.
...34I want both. I care that the author voted. Oh, it's singular. Yeah, let's rewrite that as the exists because nobody's guessing the... left out or joined syntax correctly.
01:09:31All right, that's running long enough. I think it's actually the correct query. So like I said, my guess is we're going to have more than 52, but I will say less than 500 rows just to kind of put some bounds on for my intuition. Authors, comment authors only rarely unvote their own stuff, even when they are very very in the wrong although we can see from that score seven flag seven that that comment author must have unvoted their comment 26 27 000 wow that's higher than i thought that's what yeah yeah i don't believe that number
01:10:34arh68 can you just select some, like from the past 3days
The query looks good.
We want to find all the comments where.
arh68 that doesn't makes ense, but I can't see why
Yeah, at that point, let's grab a couple.
01:11:17Hmm. Guess we're going to take a second to load to get the browser ready for. All right, so here's three recent ones.
...38Someone who edited their comment. This is a. This is a pretty anodyne comment. It's not immediately leaping out at me that anything is funny about it. I am going to, yeah. So limitation here is I want to just pull all the votes out of the table, but that includes a bunch of IDs. I think if we all agree not to make IDs a weird status game, that's fine. Because it's fairly limited.
01:12:20The thing that would have been interesting is if this was heavily flagged or something else was going on. But I just don't believe that the... I don't know anything about this author, but I really doubt that they unvoted their own comments.
...40What do I want off of votes?
...47want everything. Well, you know, I want low cap.
01:13:14There's Oh, not where ID. I was going to say, that doesn't make any sense. So there's a score of three. There are three votes on it. If we...
...36The user unvoted this comment. Wow, that really surprises me. I, ..
...50A possibility here is that lots of users are accidentally unvoting their own comments because they're tapping it on mobile, but we actually have an issue file that says it's hard to hit those arrows. I'm totally puzzled by this behavior. The other thing that catches my eye is it says 12 days ago. So I mentioned the possibility of that deadlock bug where maybe we inserted the comment, but not the author's vote. But I know I didn't get any deadlock emails in the last 12 days because we haven't seen it for, I don't know, six weeks in prod, maybe two months. Maybe this is just really odd user behavior.
01:14:47Do we have any more examples?
No, not yet.
arh68 what aboutt he other 2 ?
So we have one comment that has, what about the other two?
The other two what?
01:15:14arh68 other 2 short ids
The other two votes, there are three other votes.
The other two short IDs.
arh68 3 most recent comments unvoted
This is different from the three comments that had bad data in the other way.
Oh, the three most comments unvoted.
Yeah, we can look.
But this is so unexceptional that it kind of
knocked out my interest in that.
So we got what?
Yeah, it prompted a conversation.
01:16:16arh68 i don't see why anyone would unvote em
These really don't look like any bad paste.
...29Odd that this is disowned.
I wonder if this comment is Andy C. They're almost the only user who writes horizontal rules.
What's the topic?
Python.
rasheed018 Good day from South Africa SeemsGood ... it might be off topic but I kinda got a question ... is web dev or full stack dev still worth learning in this day and age in terms of AI and the current job market?
I mean, there's nothing like this one has something interesting because people disowning their comments to the inactive user is pretty uncommon.
But this one and the other one are so anodyne.
01:17:13Hi, Rashid.
Welcome.
arh68 i've never even seen the inactive user, huh.. HahaHide
I think web dev and whether that's full stack back end or front end is worth learning.
AI is not at the level of sophistication where it can write entire apps.
And we are still very much figuring out its use as a tool to augment programmer productivity rather than replace programmers.
I think we're at a really interesting time in
Changing tools, it feels a lot like the development of garbage collected languages or Rails, to use another web dev example, where there's a big jump in productivity.
We'll see.
Mystery data.
The other part is, let's look at calculated confidence.
I'm not even in the right model.
rasheed018 I needed to hear that 🙏 🙌
So calculated confidence only works on scoring flags.
So it should be trivial for me to reproduce this, right?
I should be able to say comment.new with the score of negative seven and the flags of seven.
01:18:47So we did get this warning that it came up with... Oh, you know what catches my eye here?
I didn't see it.
e to the negative 17.
This isn't negative 2.
This is like negative 0.0000 something 2, isn't it?
hejihyuuga that's basically 0
Right?
This is... What's the...
Yeah, that's basically zero.
And that makes sense for a score of negative seven with no upvotes.
Yeah, that's about the lowest you can see.
I'm not getting the float syntax right.
Yeah, I'm getting a number that very much rounds to 0.
I'm not getting the padding night right.
Do I want to say what, like 3.20?
Yeah, there we go.
hejihyuuga i don't think so
325 like are we hitting float min is that what this is boy doesn't that look like a suspiciously close number i mean it's to negative 308 but that's with a
arh68 e-308 is way smaller lol
I don't know offhand if this is a 32-bit float or a 64-bit float.
Using the native, so probably 64.
This is a 64-bit machine.
All right, so I'm going to leave this running in the background.
And I'm going to tweak this.
The migration that's running cannot immediately start using this code, but we can.
I want to get this value.
01:21:08All right. Now I'm going to put that back before I forget I did that thing. I pulled up the float docs. Yeah, and they also give that same value. Yes, much lower. So what's happening here? We're saying if 0 to 1 cover confidence. All right, not confidence. Obviously, I renamed the variable here. That seems grossly wrong. And floats are so much fun alright, so what if I said, this is zero dot zero I don't know if I can even make a float range and Ruby. guess I can't. Oh doesn't help if I typo.
01:22:22chamlis_ low is negative though, right?
wow.
So low.
is low is this real low number oh it's so i guess the concerning thing is that it's slightly negative yeah yeah shameless you got it
Okay, so it's not just close to zero.
It's just very, very slightly negative.
arh68 you could just take absolute value Kappa
arh68 wrap it back 'round
You know, DZ had cautioned maybe we might see the accumulated error of float rounding.
I could just take the absolute value.
I mean, at this point, yeah.
chamlis_ epsilon time!
I think maybe we are seeing the accumulated error of float rounding here.
Because as you look at it, we do a lot of multiplication and division.
Epsilon time.
Oh, yeah.
Just saying, are we within this much of zero?
I don't have much experience with float rounding bugs.
And so I am aware they are a thing.
And I have the, you know, I've seen them in the context of
You do a bunch of stuff on a web store that uses floats for prices, and eventually you get a cart that's one cent off from how many widgets the user is trying to buy.
But I've done very little graphics and 3D work that involve tons of float multiplication and division, so I don't have a lot of intuition developed around things that are likely to produce float error.
But boy, does all of this work in the range of zero to one feel like a breeding ground for very slight range errors.
hejihyuuga i'm pretty sure just normal arithmetic accumulates float error over time
I'm a little surprised by this number crossing over zero.
01:24:48Oh, I know for certain that normal arithmetic, even without multiplication and division, I guess what I should say is, because there's a couple of those famous ones, like 0.01 plus 0.02.
gsora_ work's calling, gotta drop, thanks pushcx!
There's like some really common low value I'm not remembering offhand where you can reproduce this kind of issue.
Yeah.
Is it 001 and 002?
I don't recall offhand.
I'm not guessing it off the top of my head.
I'll see you later, GSarrow.
I'll see you around the site for sure.
mjiig An entertaining example if you've not seen it before, sum a million random floats between 0 and 1, then sort them and sum them again
So I'm not shocked that we have an error.
arh68 it's 0.1 + 0.2
I'm just slightly puzzled by it crossing over zero.
That feels like a slightly different phenomena.
Sum a million random floats between zero and one, sort them and sum them again.
Oh, that's an interesting way.
0.1 plus 0.2.
01:25:54Yeah, thank you, ARH. That is the trivial go-to example. I didn't quite remember it. You got it.
01:26:07So does this math look like it has something? It's likely to cross the zero point. Because maybe that's all this is. Yeah, left minus right. So we have these two terms, left and right. If right goes slightly large and left goes slightly zero, and then we do our division, that's where we slip in a negative number. Okay, I feel not enormously confident, but slightly confident that this is just the accumulation of float error. So I have been trying not to replace all this code, but this is another value of factoring out z. If we factored z out of this in the way that Evan Miller described, we would do much fewer arithmetic steps for error to sneak in. And it's not that I'm worried the number is wrong when we're 22 digits after the decimal. It's that this very simple assertion trips. Yeah, that's still going. All right. Thanks for talking through it. Yeah, I'm, I'm pretty confident. This is just load error accumulating.
01:27:48That's a fun one.
arh68 ya when p = 0 it cancels a lot
That's one of those where I wonder if what's the
Is it big decimal?
Yeah.
arh68 it's like zz/2n - z Sqrt (z2/4n2) which should zero out
What if I said instead of this, we said big decimal.
01:28:16And then down here, there was one more. Ah. so with the way ruby kind of infers and carries type through that might be enough to shift this whole thing over to the back this all right where's that reload because if this goes away well i mean then we have really strong confidence that this is just float accum error accumulation hey look We got a very small positive number. Yeah, I got nervous because it changed the sign. But this is just... And then where's that? If we print it out to 30 digits, we get what we expected of. This is a very low number that is basically zero, but it is not zero. Yeah. So that was one thing I didn't catch when I first ran this migration is I fixated on the minus 2.2 and I didn't notice there was an exponent at the end. And I think we've worked out that this is float error. I don't think the difference in these two values of are we at minus 0.17 zeros or plus is worth it. I think at this point I am fine just clamping. Do we have a clamp method in Ruby? Surely we do, right? like I've looked for it before. All right. Yeah. All right. It works the way I'm expecting. Can I pass an expected one or two arguments? I wonder if I can just give a range. I can. That's nicer.
01:31:10I'm pretty happy with that.
It's clear what's happening.
We're returning a value.
We're avoiding the issue of accidentally returning weird values and shoving them in the database, which was a thing we saw before.
I didn't write that query before this migration started.
should have to think do we have things that are out of range out of the range zero to one in the database oh actually no we know there is we talked about how that was a sign i didn't pick up on and the whole point of writing this was to fix all those values so there we go i'm going to let this migration finish on its own time and assuming i only get more infinitesimally small numbers that are just barely out of range i'm going to call it good and commit the code
But I think that's my stopping point for today, because I've got to head out to that conference.
There we go.
All right, I don't need to commit all this big decimal stuff.
That was useful for figuring out what's happening, but that's not a level of, well, don't just leave it as big decimal.
thisusernamenotexist How can you implement a metaprogramming solution in Ruby that dynamically defines multiple nested classes while ensuring each one has unique instance variables and retains scope context?
thinking about performance without having tested and the difference for one query is pretty cheap that's fine username that does not exist how to implement a metaprogramming solution that dynamically defines multiple nested classes and retains scope contexts
You make sure that the multiple nested classes have different names.
If they have different names, they will have different...
I just did somebody's homework, didn't I?
thisusernamenotexist Oh thanks 🙏🏻
Yeah, I got played.
Please don't give me your homework.
It's a little rude.
01:33:20I got nerd sniped.
So, yeah, maybe I'll just leave this as big decimal.
The big decimal version of it didn't error on that particular row.
And then this comment is out of date, but...
I think I'm fine with that.
I mean, it should...
It's sort of weird.
Like I'm going back and forth of, I almost want to restore the raise version of this because the big decimal shouldn't have this failure mode.
And if it has another failure, yeah.
Yeah.
I'd rather just use big decimal.
arh68 you could just if-else when p = 0
Like this is not, this isn't the place to save cycles.
01:34:40thisusernamenotexist I don't even have rubies or any jwellery still I am here
Yeah, so we'll just say.
...48Put that cover back in. That can be an inline comment. Now we have arrays and then we just return confidence. It feels like there should be like a block idiom for this in Ruby so that I don't have to say the name confidence again at the end, but maybe that's a little too much syntax. Good. All right, let's go look at the diff. Change float to big decimal. We gain confidence over a problem. That's good. I'm gonna commit this, but this is gonna be a fix up. Fugitive, why are you mad at me? Why can't I commit? Oh, cause I don't actually stage that. All right, so we'll fix up confidence. So we had last stream, we fixed, Seven out of the four bugs that DZ recognized. And now we just caught an eighth. Programming. Somebody asked me once if I thought AI was going to get rid of programmer jobs. And I was like, no, we're so good at creating bugs that we'll never run out of work to do. And I think we got a little demonstration of that. All right. Well, thanks very much, folks, for hanging out and coding with me today. I'm going to roll out. Hopefully you will see this commit post up to prod later today. Might be late this evening because I'm going to be running in mostly offline. Yeah. Have a good one, folks. Take care.