You're allowed to be evil in performance code
Streamed
Bug #1313 prompted a deep dive into the most dangerously clever code on the site. We explored why the code works that way, figured out why the performance improvement in #1308 was putting comments in the wrong place, tried to fix it, realized there were more problems, and ultimately reverted it. (And then fixed a small bug right at the end for funsies.)
scratch
https://github.com/lobsters/lobsters/issues/1313
https://github.com/lobsters/lobsters/pull/1308/files
zeal offline docs: https://zealdocs.org/
sneak peek of https://recheck.dev
confidence_order = concat(lpad(char(65536 - floor(((confidence - -0.2) * 65535) / 1.2) using binary), 2, '0'), char(id & 0xff using binary))
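rough Ruby equivalent of that expression, as a sketch (glosses over the extremes of the confidence range):
def confidence_order_bytes(confidence, id)
  # confidence runs roughly -0.2..1.0; flip it so higher confidence packs to a
  # smaller 16-bit value (and so sorts first ascending), then append the low
  # byte of the id as a tiebreaker
  flipped = 65_536 - (((confidence + 0.2) * 65_535) / 1.2).floor
  [flipped, id & 0xff].pack("nC") # "n" = 16-bit big-endian, "C" = one byte
end
confidence_order_bytes(0.18288, 259).bytes # ≈ [174, 82, 3]; [174, 82] is the one-upvote value used as the placeholder below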
confidence_order_path in Comment#story_threads (and more)
a [174, 82, 255]
b [174, 82, 255]
c [174, 82, 255][0, 0, 255]
d [174, 82, 255][0, 0, 255] <- wrong place, reply to a
for arh68: concat in short_id?
c_o_p length is 3 bytes * 31 (depth of the deepest comment) = 93 bytes
if using short_id, 7 bytes * 31 = 217 bytes
eventually c_o_p becomes so wide the query perf falls off a cliff
q: can I avoid reverting #1308?
revert:
it's safe, it works, it's well-tested
don't revert:
it's faster, and might help with deadlock
more clever special-casing
invalid cached data on story
either: pull back id for final byte OR insert random number
random byte would get 1/256 misparenting
chamlis_'s random byte idea:
diff --git app/models/comment.rb app/models/comment.rb
index e2f1b8ce..cc14a1d0 100644
--- app/models/comment.rb
+++ app/models/comment.rb
@@ -24,6 +24,7 @@ class Comment < ApplicationRecord
dependent: :destroy
attr_accessor :current_vote, :previewing, :vote_summary
+ attr_reader :id_placeholder_byte
attribute :depth, :integer
attribute :reply_count, :integer
@@ -165,7 +166,8 @@ def assign_initial_confidence
self.confidence = calculated_confidence
# initial value for confidence_order because the submitter puts 1 upvote
# skips the overhead of calling update_score_and_recalculate!
- self.confidence_order = [174, 82, 255].pack("CCC")
+ @id_placeholder_byte = rand(256) # exposing for test
+ self.confidence_order = [174, 82, @id_placeholder_byte].pack("CCC")
end
def assign_short_id_and_score
@@ -518,6 +520,7 @@ def plaintext_comment
def record_initial_upvote
# not calling vote_thusly to save round-trips of validations
Vote.create! story: story, comment: self, user: user, vote: 1
+ # touch the comment's confidence_order to insert the byte for the comment id
unless score == 1 && flags == 0
# A real comment starts with 0 flags and score of 1.
diff --git spec/extras/bitpacking_spec.rb spec/extras/bitpacking_spec.rb
index 64239751..89a5d953 100644
--- spec/extras/bitpacking_spec.rb
+++ spec/extras/bitpacking_spec.rb
@@ -153,12 +153,12 @@ it "uses the low byte of the id in the last byte of confidence_order" do
c = create(:comment, id: 256 + 9, score: 1, flags: 0)
c.update_score_and_recalculate!(0, 0)
c.reload
- expect(c.confidence_order.bytes).to eq([174, 82, 9]) # id is the low byte
+ expect(c.confidence_order.bytes).to eq([174, 82, c.id_placeholder_byte]) # id is the low byte
end
it "increments correctly" do
c = create(:comment, id: 4, score: 1, flags: 0)
- expect(c.confidence_order.bytes).to eq([174, 82, 255]) # placeholder id before vote
+ expect(c.confidence_order.bytes).to eq([174, 82, c.id_placeholder_byte]) # placeholder id before vote
create(:vote, story: c.story, comment: c)
c.update_score_and_recalculate!(1, 0)
c.reload
diff --git spec/models/comment_spec.rb spec/models/comment_spec.rb
index 6e86599e..e1c35e25 100644
--- spec/models/comment_spec.rb
+++ spec/models/comment_spec.rb
@@ -251,4 +251,24 @@ it "doesn't limit slow responses" do
expect(c.breaks_speed_limit?).to be false
end
end
+
+ describe "confidence_order_path" do
+ it "doesn't sort comments under the wrong parents when they haven't been voted on" do
+ story = create(:story)
+ a = create(:comment, story: story, parent_comment: nil)
+ b = create(:comment, story: story, parent_comment: nil)
+ c = create(:comment, story: story, parent_comment: a)
+ sorted = Comment.story_threads(story)
+ # don't care if a or b is first, just care that c is immediately after a
+ # this uses each_cons to get each pair of records and ensures [a, c] appears
+ relationships = sorted.map(&:id).to_a.each_cons(2).to_a
+
+ # kludge, spec is necessarily flaky - 1 in 256 times, generated ids can be the same byte
+ if a.id_placeholder_byte == b.id_placeholder_byte
+ true # not a meaningful result
+ else
+ expect(relationships).to include([a.id, c.id])
+ end
+ end
+ end
end
"better" approach adding one query to update the comment id byte:
diff --git app/models/comment.rb app/models/comment.rb
index e2f1b8ce..7cc36ae1 100644
--- app/models/comment.rb
+++ app/models/comment.rb
@@ -518,6 +518,8 @@ def plaintext_comment
def record_initial_upvote
# not calling vote_thusly to save round-trips of validations
Vote.create! story: story, comment: self, user: user, vote: 1
+ # insert the low byte of the id to confidence_order
+ self.update_column(:confidence_order, [174, 82, self.id].pack("CCC"))
unless score == 1 && flags == 0
# A real comment starts with 0 flags and score of 1.
diff --git spec/models/comment_spec.rb spec/models/comment_spec.rb
index 6e86599e..a7fb7e24 100644
--- spec/models/comment_spec.rb
+++ spec/models/comment_spec.rb
@@ -251,4 +251,18 @@ it "doesn't limit slow responses" do
expect(c.breaks_speed_limit?).to be false
end
end
+
+ describe "confidence_order_path" do
+ it "doesn't sort comments under the wrong parents when they haven't been voted on" do
+ story = create(:story)
+ a = create(:comment, story: story, parent_comment: nil)
+ b = create(:comment, story: story, parent_comment: nil)
+ c = create(:comment, story: story, parent_comment: a)
+ sorted = Comment.story_threads(story)
+ # don't care if a or b is first, just care that c is immediately after a
+ # this uses each_cons to get each pair of records and ensures [a, c] appears
+ relationships = sorted.map(&:id).to_a.each_cons(2).to_a
+ expect(relationships).to include([a.id, c.id])
+ end
+ end
end
so, really, with 1308: is it ok that story's cached values are invalid?
comments_count, hotness (same again for merged stories)
open question: what is the median time between a new comment posted and any comment in that story getting a vote
dzwdz: that feels bad for first comments on stories
^ especially bad because people won't click in to stories with zero comments to vote + invalidate the cache
does comment sorting even benefit from the confidence calculation?
Lobsters doesn't use downvoting, and flags are rare
should we drop confidence entirely to use score (votes - flags)?
confidence_order -> score_order
one byte is score + 10 [FLAGGABLE_MIN_SCORE; cap at 245]
then other two bytes can be id, eliminates birthdays
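sketch of that packing (illustrative names; assumes FLAGGABLE_MIN_SCORE is -10 per the "+ 10" above, and flips like confidence_order so lower sorts first):
def score_order_bytes(score, id)
  # clamp so score + 10 fits in one byte, flip so higher scores sort first when
  # ordering ascending, then two bytes of id as a unique tiebreaker (no shared
  # placeholder, so no birthday collisions between unvoted siblings)
  flipped = 255 - (score.clamp(-10, 245) + 10)
  [flipped, id & 0xffff].pack("Cn")
end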
q: how much does confidence differ from score?
if we sort by score instead, how many comments change order?
https://github.com/lobsters/lobsters/issues/1298#issuecomment-2272179720
title: you're allowed to be evil in performance code
post-stream:
is OBS volume muting linked to scene? oddness at start of stream
what on earth did I do to alacritty to get it loop scrolling the buffer
vim: fix 'shell' hackery to not lose command
Transcripts are autogenerated. Warning: the system currently mistranscribes basically every username and technical term.
Recording
02:21Mhm.
…29Don’t go home. I know, sir. I know you gotta make some noise. He’s confused.
…44jameslittle230 Mic is hot, btw!
Ok. Hey, folks.
05:07jameslittle230 Hi Peter!
dzwdz 🦞🦞🦞
dr3ig hi there
dr3ig we can't hear yu
dr3ig hi, now we hear you
Oh, hello?
hm.
The, the stream muted for a minute. That was an odd one.
What was the last thing you heard? I didn’t,
I don’t know how long that stayed muted for,
dzwdz i joined at :01 and didn't hear a single word
I think OBS is being a little unreliable because I
dr3ig i heard you say hi, and then the cat a bit
had thought it was muted and then the mic was hot.
Maybe it’s a scene specific thing. I will have to play with that.
Hm.
nanerbag Good day
jameslittle230 Last thing I heard was some chatter in the background during the blueprint screen
Joined it on one and didn’t hear a single word.
Weird.
It is not nice to not be able to trust the
jameslittle230 and then silent when you switched scenes, maybe?
OBS software volume control. That’s no,
no bueno.
hm.
So, anyway,
silent when I switch scenes maybe. Oh, yeah. So let’s put that on the
host stream.
06:17Yeah, I’ll just investigate that later.
Anyways. Thank you.
It’s kind of,
you know, I know streams develop their own,
I guess lore is almost the right word,
their own running jokes
and unfortunately it looks like audio issues are going to be one of mine.
I guess that’s par for the course of the Linux desktop.
the actual thing.
So the, the story I told about the cat was that,
he gets fed at
9 a.m. Exactly.
And I say exactly because he has an alarm clock for it. So
he was in here hassling me because I walked into the room before feeding him
and he was a little confused by that, that
usually it’s me feeding in the morning, but not today. So,
jameslittle230 How does he do when the time changes for daylight savings?
it’s very funny to have a,
a smart cat who can learn things like alarm clocks and tricks.
How does he do when the time changes for daylight savings?
If we see it coming,
we will, adjust the alarm clock five minutes each day.
And then, you know, that just kind of smears the time change and it’s fine
if it’s not. Well, he’s a lot happier with,
the time change where food comes an hour early. That one he doesn’t get bothered by.
It’s only
the time change where food comes an hour late that he gets a little put out of shape.
Yeah, that’s the one we try to, I never remember it. Spring forward, fall back. Yeah.
So it’s in the fall here that we tried to
step it by five minutes each time.
So
there was,
oh,
and I don’t know if I said it when I was muted but,
I missed Monday because I was offline in the woods with some friends.
And you can see a picture of the literal stream
I was hanging out by on Mastodon or blue sky.
The
I guess it was 10 days ago stream, something like that.
I merged this pull request to precompute
comment scores on initial insertion. And since then,
there have been a whole bunch of bug reports.
Most especially
pushcx https://github.com/lobsters/lob
let’s go to
let’s grab this, yeah, rendering wrong post and response.
So I will share this in the chat and I will stick this in the
let’s grab this URL too. I will stick this in the scratch file.
So it goes into all the notes and all that kind of good stuff.
The
report was really clear which I super appreciated but
basically a
bunch of these comments have been appearing in the wrong place
and
this
bug is almost certainly connected to this pre compute comment scoring because
comment scores are how they get sorted and all of this,
I’m pausing because I can’t summarize it.
This stuff about confidence order is related to
sorting and it’s all a big performance improvement.
And then 1308 is a performance improvement on top of a performance improvement.
So it’s
kind of not surprising that it’s
hard to get reliable at first
and
confidence order was in a
SQL and trees don’t mix very well. Confidence order was an attempt to get
comments out of the database in a
single pass without having to say something like
select all of the top level replies,
select all the replies to any reply we
saw like that’s that would obviously murder performance.
And the way the code previously worked was it
would grab all of the comments on the story.
And then
on the ruby side, it would loop them
until it figured out what the nesting was.
And so it would do one pass through to find all of the thread roots and then it would do
more passes to find the
replies to those and so on and so on. And it would build up the tree in memory
and it was,
I’m sure every developer is thinking like what’s the big o complexity of that?
The big o complexity of that was something
I wanna say. It was pretty smart and it was based on
it built a hash in memory. God, I would have to dig back in this code.
I wanna say it built a
hash in memory
of
messages
to an array of their responses.
I don’t remember which direction the arrow pointed or
it was a hash of comments to their parents.
And I wanna say it looped the comments twice rather than like an
n-squared or the depth of it or anything.
So like performance wise, it wasn’t bad but it wasn’t great because
looping over comment records is the sort of thing the database should be doing.
So
I inserted this big confidence order thing to say, OK,
let’s have the database understand at least enough of the tree
because we can do that with a
recursive common table expression
and it’s reasonably performant.
We’ll have the database, understand that and give us all the comments in order.
So we don’t have to loop them and build up the tree and memory.
And then
this 1308
byroot contributed this to say, hey, these initial values,
what’s happening is you are inserting these dummy
values and then you call update score and recalculate
and update score and recalculate. Fires off.
I think he said eight queries. I don’t know that I quite counted to eight myself
but
dzwdz i wonder what's the story with the most comments
plenty.
And so
we also have a
dzwdz how much the big o complexity really matters
different bug that’s still open because I can’t quite lock it down.
I don’t know how to pronounce your name dzwdz.
If you wanna know the comments with the most, the story with the most comments
or the story with the most comment depth,
like figure out what’s interesting to you and we can run that query in a minute
we had.
So one of the reasons
byroot was looking at
that.
Oh my God, a typo in the in
the subject. I have to
pull up a personal browser off screen and fix that. That’s super irritating.
Yeah.
Time
balls.
There we go.
The
when there are
lots of comments getting posted at the same time as votes
infrequently
and I say infrequently maybe
two or three times a year,
the app throws an error because MariaDB
has deadlocked because the voting code and the,
excuse me, the voting code and the
comment insertion code both touch the comment and
vote tables.
But in different orders and when two queries lock tables,
the same tables in a different order, that’s called a deadlock.
And MariaDB goes, well, one of you queries gets to lose
and it’s always the vote that loses.
I don’t know if MariaDB is being really clever and realizing, well,
you have a smaller transaction or
it’s just
there are 10 times more votes than comments.
And so the odds are it’s going to be a comment that gets killed or I’m sorry,
a vote that gets killed rather than a comment.
But that’s why it hasn’t been a, a super hair on fire emergency because dropping
two or three votes a year is not a big deal.
But
byroot mentioned that they had done this in part to reduce the number of queries.
They didn’t expect it to solve the deadlock. But
there’s also the chance it accidentally solves
the deadlock because this touches a bunch of
the comment and vote tables in the same
transaction in the same neighborhood.
And
that is a lot of context here.
Yeah, let’s take a second to
answer your query.
15:01dzwdz yeah no i did not prepare it :p
So I’m gonna bring up the console
and we have the Yeah, let’s not get that on screen.
dzwdz wait no that's not what you meant i'm stupid
Oh, that’s
dzwdz nvm lol
all right. Let’s actually read these commands before I hit enter on them.
That’s old stuff from a previous stream. So here we go.
…27arh68 can you also query like Mean Average # Comments per story
We can just kinda, so I’m happy to write this
query because
I can do it off the top of my head rather than
arh68 idk how to do 1-liner medians in sql lol
yeah, I can calculate mean average comments per story. Let’s do this one
select stories dot short_id to start.
Gee, doesn’t stories have a
a comments count?
Hold on,
I believe this has a
yeah,
there’s a cache
column here to avoid exactly this join. So
this these are gonna be very easy queries. So we’ll say
select short_id, title
count.
Stories ordered by
comments.
Oh, I don’t have readline correct
comments,
count from stories
ordered by
that
count, desc limit
point.
So here are the
jameslittle230 Philosophically: do you only treat “mariadb has been deadlocking in prod” as a bug or would you also treat “mariadb has the potential to deadlock in this scenario that we haven’t hit” as a bug?
stories with the most comments on them.
Yeah, this,
this feels about right. We just had,
yeah, I saw this thread maybe
a week ago
dzwdz honestly less than i was expecting
and I was like, oh, this is a strong contender for a ton of comments
dzwdz i thought we had one with like 700
and I guess my, my intuition was right there.
Yeah.
Unsurprisingly the top threads are meta threads
philosophically.
Do I treat MariaDB has been deadlocking in prod as a bug, or would I treat it as
has the potential to deadlock
in the scenario?
Hm
I don’t think we’ve had a thread with 700 comments.
dzwdz i mean evidently we haven't
Like
I’m actually surprised I would have guessed that Microsoft buying github
was on here.
That was a huge thread but I guess it was
just huge for mergers rather than huge for comments.
Well,
I
see.
So dzwdz, it is also
does comments count.
Well,
let’s just check this, this backdoor one.
It’s possible it doesn’t count merged comments
and I think it does, but I don’t remember off the top of my head,
but we can check real fast because this story I remember had... ok, so that 306.
Oh,
resizing the terminal cut it off. That 306 does include merged stories. So you’re not
remembering a 700 story where we’re like summing all of the short IDs.
Yeah,
maybe I’ll do that on stream in a couple of weeks of
trying to clean up the story merge database model.
I’ve been thinking about it a long time
and griping about it because it’s an infinite bug factory, but at least in this case,
it’s not an issue.
So
James, I didn’t miss your comment.
I’m not sure what you’re asking though
because MariaDB always has the potential to deadlock
where
any time we’re doing a transaction with
hitting multiple tables. It is possible that another transaction
wants to lock in the opposite order and is going to
cause issues.
So I don’t really consider that a bug because
it’s always possible to deadlock
if you’re asking in the more broad sense because
I guess really where I’m going with that is the only alternative
to something like that would be an entirely different database engine,
something like
MongoDB, where they’re...
I don’t want to talk out of my ass. I don’t know MongoDB
well enough to say it, but I don’t think it has that kind of
table or like multi-table transactions.
Hm.
Maybe a MongoDB expert can answer but you’d have to,
I guess what I’m trying to say without a specific example is that we’d have to have a
very different database technology to avoid deadlocks. And I am ok with
every database is gonna have some amount of
corner cases and limitations.
These deadlocks,
I think this is like
the second one I have seen in a SQL database in prod
in,
I don’t know,
25 years of working professionally as a programmer.
jameslittle230 yeah, sorry, bad example, just wondering when something elevates in your mind from potential limitation to actual bug?
So that’s a pretty good failure rate to me.
I am sure there are different uses that could be
have deadlocks much, much easier, but
hasn’t been my problem.
When does something elevate from a potential limitation to an actual bug?
When I get an exception in prod, really, or I get a bug report
and then there’s that, you know,
little bit of investigation. Where
does the user understand what’s happening?
Are they correctly characterizing it as a bug?
Most of them,
especially with a tech-savvy audience like Lobsters has, are actual bugs,
but occasionally it’s the copy is unclear or the site
norms are unclear and somebody thinks there’s a bug.
one place where
one thing that happens fairly regularly, maybe
five or six times a year is somebody who really
wants to do marketing comes on the site and then they
post a link to whatever their start up is or whatever
their project is and they put like 20 tags on it
and you can tell that’s them thinking tags work like other sites
on lobsters, tags are mostly there to filter out stories.
And so by adding 20 tags, not only do they really stand out and look bad.
They
have hidden their story from the most number of people.
So it’s kind of funny to see that.
I don’t consider that one a bug because it’s a user misunderstanding
what,
jameslittle230 makes sense, thanks for indulging me :)
how the tool works.
But
I, I think it’s interesting that there is a gray area between, is there really a bug?
So Eric, you asked for mean average comments of stories. So let’s
where
at?
I think there’s
can I just say average comments count from stories?
I wanna say that’s a SQL function
and then it automatically,
arh68 HahaCat avg(..) ya i think
yeah, there isn’t a need for a GROUP BY here.
Yeah. Is it lobster? Is it AVG?
There you go.
I would bet the median is like zero or one
though because we get a whole bunch of stories where
yeah.
Is there a
anybody know offhand if MariaDB... I can bring,
we can
just grab
that. So here’s Zeal, my, my docset manager; we have MySQL. I,
yeah, it doesn’t look like it has a built-in median. Does it have a mode?
Yeah. Not that kind of mode. I want the mathematical mode
dzwdz wait what's that program again?
if I look for AVG. Are there anything,
is there anything else on this page about it?
Yeah. I could give you the standard deviation but I couldn’t give you the median.
arh68 i would order by, then select N/2th
And that kind of makes sense because median is
arh68 idk how else
a little more complicated to calculate
arh68, if you want to
take a second. Oh, what’s that program? The program is Zeal.
Let’s find it real quick.
pushcx https://zealdocs.org/
I think it’s just Zeal Docs.
Yeah, it’s an offline. I’ll share this here in the chat.
It’s an offline documentation browser
and it’s one of those where
somebody made a really nice commercial
Mac app for offline documentation browsing.
And then the open source folks were like, oh,
let’s knock that off and do the exact same thing.
And I teased them a little bit.
It’s a very, very open source thing where it’s like, oh,
there’s a commer successful commercial thing.
Let’s
make our own version of that
dzwdz this looks dope, thanks
Dash.
Dash is the thing. They...
that’s the macOS app. That’s a little more polished.
I do this, I think I started using this
like the day that Zeal came out, I saw the link and I was like, yes, because I had a,
a laptop at the time and I did a lot of coding on the laptop and especially
this project has been running for something like a decade, I wanna say, right.
Where’s your,
where’s your github?
But I want to say something like a decade
and at the time, coffee shops had
dr3ig it has docs for ruby, ruby2 and ruby3 :)
much less active,
much less reliable Wi-Fi. And so having an offline docset was super useful for me.
Is there a
Yeah, docs for Ruby 2 and 3?
I really, I
really think the world of zeal, it’s a super useful tool.
I’m aware a lot of people use it for things like plane coding.
I suppose I could have coded while I was on my,
you know, offline stream vacation, but
I was very much offline. So I didn’t write any code this weekend, which was unusual,
usually try and work on a personal project at least for a minute.
It doesn’t want to load.
That’s ok.
So we’ll just have to go with my,
my gut recollection that Zeal is something like a decade
old and I’ve been using it about that long.
So let’s grab that and put that in the,
arh68 2013-2024 indicates youre right
make sure that gets highlighted in the
stream archive.
0, 2013.
Hey
implies I’m right. Yeah. Occasionally my memory works.
arh68 it was at the bottom of the page
I don’t have the, the sharpest memory but I had that like rough feeling.
It’s been about a decade.
So
do we have a
oh, it was at the bottom of the page? Thank you. For catching it.
easeout seems some DBMSes do have a mode aggregate function. without it i suppose you just order by it
Let’s see, real quick. If we can just find someone who’s written a cross platform.
That’s
fine.
I’ve never seen this top 50%. Is that,
that must be a MySQL thing, right?
Or I’m sorry. MS SQL.
arh68 smellsl like T-SQL ya
Oh, and here’s somebody saying here’s a better one.
It also requires sequel. See,
it’s a frustrating thing that sometimes people,
they say sequel.
But whenever I see like sequel with a specific year,
unless that is the year of the sequel standard, I assume they’re talking about.
Yeah, new and sequel server.
frustrating that Microsoft named their database that
let’s
26:44Really? You, you just told me there wasn’t a median function, MariaDB. Then, a typo’d median? That would be a little embarrassing. Select media... no. Oh, median is a window function. OK? Well, we don’t have any partitions. So we can just say select median. Where’s mine? And then I think, OK, that wasn’t useful. What do I say here to group them all? I wanna say partition by ID. No, that’s... what I really wanna say is group, not partition. What if I just say partition by one?
27:46Anybody feel real confident in window functions? I know a little. Yeah, that’s not, that’s not at all what I intended. So in this example, we are partitioning by title and I would just wanna like partition by true cause I want the median for all. I haven’t done a ton with window functions, and so that’s why I’m flailing a bit here. Cause what I wanna express is that I want a window over all, and the default over is like rows preceding and no rows following.
28:44Yeah. Partition by and if I don’t, if I just say over empty.
29:00easeout Is there a more general percentile sample function?
Yeah, it’s just giving me
arh68 i guess the median is 1? select distinct
mm
it’s calculating like a running median I think.
arh68 if the mean was 4.2 median being 1 is believable
I don’t think the median is one. I think it’s... yeah, it’s so
I’m getting this implicit unbounded preceding, unbounded following
or unbounded preceding to,
…36can you say range all? Yeah. No, I can’t say that.
…46Yeah. So here’s the example of if you say nothing,
the sum is run over the entire data set.
Oh, ok.
If the mean was 4.2, the median being 1 is believable.
All right, then I guess it’s, it’s referring,
let’s say distinct
because if it’s just telling me one for everything that is plausible,
but then each row would have its
dr3ig try where comments_count > 1
have the same median value.
I’m just,
I’m really suspicious of this
0.0000. Like
dr3ig or some other where
I guess median is the most popular value and I’m thinking of
the other arithmetic means.
arh68 mode is most popular
All right. Where comments count,
comments, comment.
Oh yeah, the median is six.
All right.
Yeah, I guess that’s true.
So I had guessed that the median was probably gonna
be zero or one and it looks like it’s one.
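For reference, something like the query being reached for here, sketched as a Rails console one-liner (MEDIAN is a window function in MariaDB, so with no PARTITION BY it runs over the whole table and DISTINCT collapses the per-row repeats):
Story.connection.select_value(
  "SELECT DISTINCT MEDIAN(comments_count) OVER () FROM stories"
) # => 1, matching the gut-check counts that follow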
Do we also have
a,
a mode is a window function? I’ll just try it
no
shit.
Median
or wait is mode the one that’s also a
no, that’s mean
arh68 mode would have to be 1, no ?
taking me back to junior high.
dr3ig can you count(1) where comments_count <= 1
dr3ig to make sure
All right.
arh68 if median is 1, half are 1
So with that bit of insight,
can I count one where comments count
is less than, or equal to one
if the median is one, half are one?
31:47So that’s 6300 or 63,000 and then yeah, about the same on the other side. So cols count. Yeah. Yeah, that’s a good go check, good thinking there. I like that. That’s I really appreciate when I’m unsure and unconfident in the tools doing those kinds of checks. So thank you for thinking of that. It’s nice to get that confirmation that we’re, we’re generating roughly what we expected out of that or we are indeed seeing plausible answers is maybe the better way to put that. However, speaking of seeing implausible things, we have been seeing comments not rendered to their correct parent and it is almost certainly this 1308 because it was deployed like three hours before people started reporting the bug and it touches that exact code and no other code touches comment sorting. So I feel highly confident that this is the culprit, but I wanted to spend a minute playing with it and seeing if we can number one write a test that reproduces the bug because then we have a really high confidence, you know, we have certainty that it’s what caused the bug. But then also we can look at is there a way to rec to kind of rescue this work? Because I would like to save all of that selecting on comment insertion. So one thing that showed up in the rendering the wrong post and responses is if you scroll down. Oh yeah, I mentioned it in here like I suspect it is something about these comments being replies because I was just kind of thinking about it while looking at the stream. No, before and there was, I don’t need a terminal. I need a of them.
34:09Oh, that is the wrong. Oh, I’m on a branch.
Ah sneak peek.
All right. So eagle eyed people. Oh yeah. So
recheck is the gem. I’m working for
working on, for database integrity
and
I’ve been
dzwdz heh, ddate
hammering it out and it’s kind of fun. It’s a tool for,
for making sure you have
correct data in your database. It could almost be used on this bug because
ddate, yep, I’ve had it at the start of every
terminal forever since I’ve had a Linux desktop
in I think the first stream I talked a bit about discordianism.
So if you jump back in the stream archive on my blog, you’ll see
some mention of the book
Principia
Discordia.
So it could almost be useful in this scenario
of our comments appearing under their correct parents.
But the tool is it like,
you know, 0.0 0.0 0.1 rather than
something I can ship. So let’s, let’s get off of this branch.
Let’s,
oh,
I’m in the middle of,
yeah.
So I’ve been using lobsters because it, it’s a big real site with real complex data.
But you know why not?
Let’s go ahead and run it.
We’ll do a sneak peek.
So you write these
checks of your data
and I wanna say message,
oh, it’s gonna dump.
jangomandalorian 👋🏼 Hi everyone
But if I,
konradpetersberg yo
if I run this on stream it dumps the
whole object and there’s gonna be stuff like users,
konradpetersberg what up
emails and other things.
Where’s a good one?
Key store passes. Hey, Conrad, welcome.
Here’s an idea of what it does
konradpetersberg bless me coding god
where
a check looks at your database table and then each one of these
dots is 1000 records in your database and it checks that they’re correct.
If you have ever seen
the
bless you coding God.
konradpetersberg emmagr1Clap
May your bugs be few and obvious
the,
if you have ever run a Rails app in production or really any app,
you probably know that once you get past a couple 100,000 records,
you have records where like you load them out of the
database and they say that they’re invalid and it feels impossible.
But
it’s a real thing that happens. I gave a talk on this a couple of years ago. It’s
fundamental to the way active record and most other database libraries work
even setting aside the fact that you can
like revise your code validations but not think,
not remember to run a migration to fix all of your data in the database.
Even setting aside that like obvious check.
I’m trying to think which of these is not gonna include personal info,
but one of these has
one of these has bad data in it.
Oh, this table must not have a
column on it that’s running a lot slower and I don’t wanna run it.
All right, let’s go ahead and
blow it away. That’s all generated code. So I don’t need it.
So that aside, sneak peek aside,
I hope to have a a beta of that out soon. And that’s
yeah, I will put it.
Oh,
there’s a a mailing list if you’re curious about it. So with that aside
on to a dance.
But
yeah,
so this initial confidence is
the code that was touched by 1308.
38:03And when I think about what confidence is in this context, confidence is this formula down here. It’s based on a, a classic, famous blog post by Evan Miller and how Reddit sorts comments. I will not be explaining the math on stream because there’s a lot of it. But the basic idea is there’s a big difference between a comment with a score of one because only the submitter has voted it up, and a comment with a score of one where 10 people have voted it up and 10 people have flagged it. Like, numerically they sum the same, but they mean very different things. And so this is a, a function for getting at that idea that every vote is information. And rather than sorting by the naive score, we should use all of those votes. That becomes the comment’s confidence, which is basically its ranking at each sibling level. So if we have a parent, you know, parent comment A, and then it has the replies B, C, D, confidence is what order are these being sorted in? And so that’s part of why, when I see the bug that people are reporting, which is that, like, D is not... especially when... here, let’s just pull up that useful screenshot.
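For reference, the classic formula from that Evan Miller post, sketched in Ruby. The site's calculated_confidence may differ in its constants and in how flags feed in (its range evidently extends to about -0.2), so treat this as the shape of the idea, not the exact numbers:
def wilson_lower_bound(upvotes, downvotes, z = 1.96) # 1.96 = 95% interval; the real code may use a different constant
  n = upvotes + downvotes
  return 0.0 if n.zero?
  p_hat = upvotes.to_f / n
  (p_hat + z**2 / (2 * n) - z * Math.sqrt((p_hat * (1 - p_hat) + z**2 / (4 * n)) / n)) /
    (1 + z**2 / n)
end
wilson_lower_bound(1, 0)   # ~0.21: score of 1 from just the submitter's upvote
wilson_lower_bound(11, 10) # ~0.32: also a score of 1, but backed by 21 votes of information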
39:44So to adapt this, what we’re seeing is we have A, B; B has C and D. Yeah. However, D is actually a response to A, not B. And so when I see this thing of, oh, the, the comment is sorted under the wrong parent, that’s really conspicuous for, oh, the confidence order is wrong.
40:31
So
this this
big comment that’s like 100 words is me explaining the clever
thing that confidence order path does to allow the database to
use a recursive common table expression.
This is the kind of
I
I’m so conflicted on this code like on one level, I love it. It’s very clever. It does
bit packing which is always fun for old assembly coders like me.
On the other hand, it’s not
100% reliable. It is only like,
I don’t know,
four nines reliable,
which is maybe not good enough for
core site functionality, like showing comments in the right places
and
when things need these kinds of giant comments to explain what’s going on that.
arh68 "here be dragons"
A hair stands up on my back, on the back of my neck saying, oh,
that’s probably too complicated.
Maybe that’s too complicated.
Yeah.
Here be dragons. Yeah, that’s very much what this is
is
so confidence
order
is those three bytes that 1308 touched. So let’s jump up to
assign initial,
right?
So this says that the confidence order
is these three bytes
and then this comment explains what’s going on with these three bytes.
The first two bytes are
an unsigned integer
so that
it’s sort of the the comment order
just basically that
boy, how did this cause this bug?
Cause it only should show up when
the confidence should be the same value.
And this is all tiebreaking between the
depth.
But if there are
top level,
if the top level comment has the same.
So where this is going is
this idea of a confidence order path,
which is this big
fairly wild thing
where
we are doing a bunch of
string interpolation.
So what’s happening here? And I’m gonna do this over here
is you end up with
arh68 should the confidence depend on the parent comment?
me,
let’s just jump down and show the, the query.
Should the confidence depend on the parent comment?
No, no, it doesn’t depend on the parent comment.
So if we say where’s the simplest one of these,
there’s recent threads and then there’s story threads.
So this is the big workhorse query that grabs all of the comments for a
story.
And then there’s this idea of a recursive common expression that says,
OK, we wanna find
all of the comments. Let’s start with those
at the top level. So that’s where parent common ideas know.
That’s all of these top level replies like this one by emperor
and Perham,
I know his last name
and
44:36then union that to let’s find all of the comments that are replies to those, and so on. And doing this in the database is of course faster than pulling it all back to Ruby and doing it there. And to, I think it was arh68’s comment, of if there’s only a couple hundred, does it matter? Not hugely. One of the reasons I waffle on whether this whole confidence order thing is worthwhile is that n is not huge. It just feels very clever. So we take those confidence orders. So we have, you know, a comment tree like A, B, C, D, and this is in reply to A. So each of these is gonna have a confidence order that’s like three bytes. And I’m gonna do this in bytes rather than the bit-packed version because I want it to actually be readable. So if we assumed that, see, it’s going... I can’t remember the order. Is it highest first or lowest first, order by? Yes, lowest first. So the confidence gets flipped from high values to low. So let’s say a is a really upvoted comment. The maximally upvoted comment is gonna have 0 0 for its confidence. We’ll say like 0 1, and this, this is at its level. So it’s not any kind of, this is not a trace. This could also be 0 0, a very highly upvoted comment, and this could be the same thing. And in fact, I suspect what’s happening is the same thing. And then the comment ID, the low byte of the comment ID, is a tiebreaker. And so, oh yeah, that’s actually, that’s certainly what’s happening here. Now I understand the bug. I don’t know that I can write a test for it. I’m not sure I can explain it, but I, I understand the bug. So let’s keep talking until it, until I can actually say it. And then I’ll know I really understand it, as opposed to I have an intuition. So we’re pre-filling these values, and 1308 does a performance improvement by saying, well, let’s just, instead of selecting out, instead of putting in 0 0 0 and immediately hitting the database to recalculate that, let’s just grab that value and put that in. The problem is both A and B, when they haven’t been voted on, they are both gonna have the placeholder value that is the same. And so they are going to look identical. And if they look identical, these comments are not getting sorted under their parents by what is their parent ID. That’s, this is where that whole confidence order path thing comes in. So let’s talk about what that is. It’s, it’s this guy, which is some weird code. So again, what I’m trying to do is get out all of the comments in this tree order, with the restriction that SQL really is not well suited to trees. And so what I ended up with was this idea of a confidence order, which is what is your place relative to your siblings, and then a confidence order path, which is, well, let’s go ahead and say if we take the confidence order of all of your parents and we concatenate them together and then we put this comment on the end, now we know your order in the tree. And that is, that’s what becomes a confidence order path in, it’s the name of that function, story_threads.
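For reference, a schematic Ruby/SQL sketch of the shape being described (not the production story_threads scope; column handling and filtering are simplified):
def self.story_threads_sketch(story)
  # seed with the story's top-level comments, then repeatedly join in replies;
  # each level appends its own confidence_order to its parent's path, so
  # ordering by the path walks the tree depth-first with siblings in
  # confidence order
  find_by_sql([<<~SQL, story.id])
    WITH RECURSIVE discussion AS (
      SELECT comments.*, comments.confidence_order AS confidence_order_path
        FROM comments
        WHERE story_id = ? AND parent_comment_id IS NULL
      UNION ALL
      SELECT comments.*, CONCAT(discussion.confidence_order_path, comments.confidence_order)
        FROM comments
        JOIN discussion ON comments.parent_comment_id = discussion.id
    )
    SELECT * FROM discussion ORDER BY confidence_order_path
  SQL
end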
49:16So what’s happening with these comments getting parented in the wrong place is they are getting... comment D is getting posted before A and B get voted on. After A and B get voted on, they’re gonna have a value that is based on their actual voting, as opposed to this static value
…46and
then they look identical.
And so then the,
arh68 is it too much to concat the parent.short_id into the path?
the comment sorts into the wrong place.
Probably the other thing I should talk through here to justify is it
too much to concat the parent short_id into the path?
Yes.
Yes. Unfortunately, it is.
This three-byte thing was a really painful compromise
with,
between uniqueness
and width,
the
confidence order path. So here for d you can see it’s six bytes wide, right?
And so that tells you real easy
that
the c_o_p
width
or length is
three bytes times the depth of the comment.
And there’s gotta be a max somewhere, right?
Like what is the most replied comment and the maximum comment depth is
I made a constant for it.
I wanna say it’s
31
in production.
Yeah.
So at one,
dzwdz is twitch being fucky for anyone else? i have to keep refreshing the page because the chat keeps breaking
the longest reply chain in production was
31 comments.
And so that is the maximum length of
a c_o_p.
But when I saw that I was like, are you kidding me?
I can’t tell anybody if...
Twitch is being weird and the chat is breaking, that is beyond my purview,
I can’t even get twitch to fix.
Wow
bugs.
arh68 site's fine for me so far
So while I was investigating this
and figuring out the most nested comment reply chains,
I realized as I was looking at them that this one where people replied
back and forth 30 comments deep that
it was actually a super unpleasant conversation.
And if I looked at
the deepest comment threads, all of them were super unpleasant conversations.
Once you got past a depth of like 18 or something at that point.
They were pretty much all people who were
like, nitpicking each other back and forth
until one of them got frustrated
arh68 LUL ya i tend to scroll past those right-hand-side discussions
and left
and sometimes that was like, left, left the site rather than just left the thread.
And so I, I kept max depth
a little lower
right hand side discussions. Yeah, that’s a funny way to put it.
That’s also
the other reason I put in a lower depth is
as these comments in dent, they get real skinny and
also around 18
on any normal size display, it looked kind of ridiculous where you know,
each comment is
two words wide.
That was not great.
That’s gonna be a fun one for if and when
I ever rebuild the site with CSS grid,
I’ve used it a little bit like
the Newish
header that’s maybe
dr3ig are you saying assigning precomputed values doesn't differentiate between parent comments prior to any upvotes ?
two or three years old
that uses CSS grid. And I would like to start moving the page into CSS grid because
I came up and cut my teeth on all the
float things.
Yes, dr3ig,
I am saying that assigning precomputed values
doesn’t differentiate between parent comments prior to
upvotes. That is exactly the problem.
It is also the problem
dr3ig but then why did it work before
if we don’t pre compute
because A and B before they get upvoted would have the same values.
Why did it work before?
53:42That is a really good question
…49dpk0 morning/afternoon/evening all
because in practice,
arh68 morning dpk
the idea of what’s happening is we are inserting the same confidence order.
Hey DPK, welcome back.
We are inserting the same confidence order values that would have been calculated.
Oh, it’s the, it’s the 255.
It’s the 255. So
these two bytes,
these are the real production values for a
comment has exactly one vote from its submitter.
Well, exactly one vote that is an upvote and no other votes
in practice,
that is the one initial upvote from its submitter and nobody else has voted on it yet.
Fine.
That is a very common value we were talking about like the median,
the median comment also gets voted on like
once
I wanna say
now close that terminal, that’s fine.
I’m not gonna check it offhand.
So the tiebreaker when things have the
same confidence order because as you can imagine
the confidence order values for a comment with
just the one vote from its submitter or
jameslittle230 gm dpk
the one submitters upvote and one person up voted or
the one submitters vote and two
people up voted. Those are really, really common values. Those are, you know,
30% of the comments in the database. I’m just,
hejihyuuga hello Mr. Pushcx, hello chat
I’m pulling that number out of my butt, but it’s like directionally accurate.
And so this third byte
is,
hejihyuuga Hope you're all doing well
oh, welcome back.
He
this third byte is
the low order byte of the comments ID.
And that’s a tiebreaker to differentiate
comments with the same ID.
Oh, I didn’t finish my thought here of
for arh68 of
would say four
55:55and so arh68 here was kind of suggesting, well,
argh. Huh?
You can tell which word I type more.
What if instead this was the several bytes of the short_id, and we know offhand
the short_id is six characters.
And so this would become, you know,
five bytes wider
or four bytes wider
could do that. But then the c_o_p length,
the max for the c_o_p length,
which is 31
let’s say,
jameslittle230 Likewise Heji!
then this would become
two plus five; 7 times 31 is what
56:48I have to do that with a calculator. I don’t know that math off the top of my head. It comes to 217 bytes wide. And at some point, when the confidence order path is wide enough, and that really comes around because comments is already a fairly wide table, at some point it blows up the performance benefit of doing all of this confidence order stuff, because the row is so wide that the database has to allocate memory differently. Seems to be what’s happening. I haven’t tried to dig under the covers of the query planner or the explains. But yeah, that’s the, the verdict is unfortunately at some point c_o_p becomes so wide
57:46because the other thing to do would be to just put in the actual bytes of the comment ID. So we have, what is it?
58:00Yeah,
…11what is that? Two to the
12th? Offhand.
mjiig 2^19 ish
No
16th.
Yeah, it’s getting in the neighborhood.
Yeah, there we go. Two to the 19th.
I was like we’re right close to a value. So two to the 19th.
So like 19 bits. Thank you mjiig.
You are either better at remembering your powers of
two or faster with the calculator than I am.
The
being only 19 bits wide would be nicer than being
what is this?
byby42 id.to_s(2).size
Eight times five, 40 bits wide.
But at the same point, we still run into this,
we still hit this limitation that c_o_p becomes too wide. c_o_p really can’t be
ah, byby42 has a very smart version of it. Yes,
let’s do that.
Didn’t know you could say to_s with two. That’s gonna do a binary.
Oh no, I can’t do that.
byby42 Hum
Did an LLM write that code?
That would be nice if I could say
to_s to just throw it over to binary. But
byby42 Works on my machine
there is probably a way to do it, you know, with string formatting and such.
I don’t want a rabbit hole there, but
you are welcome to figure it out. Chat.
byby42 Oh, id is a String here?
Works on your machine.
What does Ruby dash dash version say on your machine?
dzwdz exit \n python3 \n bin(2)
Id is a string here? No, id is...
ID is the integer that comes back from the database.
Oh I called the wrong.
byby42 Integer#to_s(2) return the binary representation
I called it on the wrong thing.
Nobody caught the bug.
I said comment dot last, not comment dot last dot id.
Ruby does indeed have that. Very nice. And then
yeah,
ok. We rabbit-holed anyways because I typed out, I didn’t say last dot id. I just said
last dot to_s
that’s ok.
So,
01:00:33so my compromise for confidence order was
rather than being
five bytes wide or three bytes wide.
I just made, I just grabbed the low byte
of the
comments ID as the tiebreaker
and it’s kind of funny this
this isn’t perfect and you’re like, oh, like naively, there’s
a one in 256 chance that
two comments get the same
have the same low order byte except
the site gets right around 200 comments a day.
And so it basically
is significantly rarer than that
because it would have to be that
a comment is posted
and not voted on. And then 26 hours later
when the ID has rolled around that bottom byte,
someone comes back and leaves a sibling comment that also doesn’t get voted on.
So in practice, it is,
it does happen, but it’s very rare. It’s rare enough that I’m comfortable with it.
It’s on order of
maybe twice a year.
I think about once a year, somebody reports it, but I see it about
twice a year
and it’s automatically fixed by if I go and I
vote one of the comments, the bug disappears. So, you know,
it’s very cheap.
I don’t, I don’t love it.
Maybe somebody looking at this goes, oh,
there’s a better way to do this confidence ordering thing.
But this is what I came up with, especially given the limitations of SQL, where
SQL
doesn’t have,
doesn’t have a huge amount of
bit packing functions
dzwdz if you order by c_o_p, the replies to a comment wll always come after the parent comment, right?
that would do all of the clever stuff I want, especially not concatenating,
you have to kind of
hit byte alignment and then turn these into strings.
Yeah.
So yeah, if I order by c_o_p, the replies to a comment,
always come after the parent comment, right.
The problem here is 1308 says, well, we don’t know
the comment ID
because we can’t know that before the comment is inserted.
And so we are just gonna insert a placeholder and we
discussed this on stream of let’s insert a placeholder byte of 255
and we’ll just put these last.
And so like when E comes in as a top level reply, it’ll come in
last because presumably at some point A and B will
get voted on and they’ll have real values that are different
dzwdz i wonder if you could do a pass over the returned data to try and force stuff into the right order
if they don’t, let’s just tie break and we’ll put e last.
dzwdz since it's mostly in the proper order anyways
What’s happening though is now that all of these values are
the same replies are showing up in the wrong place.
01:03:41Yeah.
So
dzwdz,
you’re running into one of the constraints is the whole point of this big
performance hack with confidence order and path is wanting to on the ruby side,
not loop the comments.
Oh And someone was saying,
yeah, someone asked like how many comments are we talking about? What’s the big?
Oh, here there was a side point I wanted to make of,
we looked at what the highest comment counts are and
dzwdz that was also me btw
you know, the top one is
300 something and the,
the next 20 are all around 200. That’s not a big n
what I am trying to avoid is not
long loops of tens of thousands of rows. It is
allocations
Ruby as a garbage collected language
has to allocate stuff.
And we already did.
You know the whole guest star stream with byroot about...
that ended up with, boy, if we install jemalloc,
we’ll have less Ruby,
less memory fragmentation.
I was trying to reduce GC pressure because if we loop the comments once and we don’t,
that was also you excuse me.
Not great with names. The
if we loop the comments once and we don’t have to
resort them and we don’t have to build any data structure,
we get much less memory pressure on the
Puma workers. And
in a Ruby web app,
memory pressure is more significant than CPU pressure.
We just have a smaller budget there.
And some of that is
ruby objects are very wide.
You cannot build a low overhead data structure.
I, I say that and I’m sure someone is like, well, if you write a C extension,
yes,
we’re not going to write a C extension to sort 200 comments though.
01:05:45So the options here. Now I feel like I have a really good understanding of what is going wrong with 1308. And I would like to see if I can write a, a test for it, because if I can write a test, then we can think about do we have to revert? Because I would like to not revert. The, the benefit of 1308 is this: update score and recalculate runs this query, which is an update. And I did a lot of work to shove all of this work into the database, so it’s just one update. But then the story has to run, and story update, cached comments count, correct? Well, then that has to update the comments count, and then for all merged stories it has to update those, and then it has to update the hotness with the calculated hotness. Oh, and then calculated hotness: let’s go query the tags out of the database, let’s query all of the comments out of the database, let’s hit those merged stories. And so it’s like, OK, we thought we were doing one small thing and instead we set off this whole cascade of various cache levels. This is also, a couple of years ago, some folks asked for query information because they were writing their own database. And that ended up getting called Noria, and I think it ran for a couple of years and then got acqui-hired, if I remember correctly, someone would have to dig this out. But it was a fun bit of performance work somebody did with the Lobsters database because it has this kind of real world complexity. Cause if you kind of cross your eyes, what’s happening is we want to query out particular data, but it becomes expensive to query it out. And so we have layered on all of these ad hoc caches, where comments count, that’s a cache, and so we have to like hit and update that cache, and hotness itself is also a cache. And all of these things become several intermediate layers of caching, where the comment has a value and then that gets rolled up into other comments, and it gets rolled up into story and into merged stories. And all of that is so that we do less selecting, because the site is very, very read heavy like most sites. What if the database was smarter about incremental calculations, so that when you queried, it realized most of the time the data hasn’t changed, and it managed all of those caches? I wished the Noria work could have made it into a popular database like, say, MariaDB, where it could directly benefit us. I wasn’t quite willing to say, boy, the benefits are so great that I wanna switch to a database run by a small team. That’s a little scary. But the work they were doing, like, we can see all of this downstream stuff that, or not downstream, all of this, this fundamental motivating work of we’re maintaining caches at all these different layers and invalidating them and pre-filling them and ending up with duplicate values, and then nonsense happens. This is all just cache, cache, cache invalidation. You know, that’s why the, the saying is the two hard things in computer science are naming things, cache invalidation, and off by one errors. We’re seeing it right here. Yeah. Let’s write the spec: expect comments
01:09:41and just dropped down to the end.
…51The guy. Yeah, I might have created another describe block with that same name. So I wanna basically create this structure. I don’t need C I only need three comments to demonstrate the bug. It doesn’t sort comments under the wrong parents when they haven’t been voted on. That’s a very long name and I will probably shorten it but I just wanna like get the first draft out. So we had comment a
01:10:35and let’s think a second cause in the spec I always want to use these factories and the factory. I don’t think the factory is gonna create different data. Let’s look at it real quick. Yeah, it has these sequences and all this kind of stuff, but it’s not going to attempt to create the score or the confidence order path in a different way. So we will be able to exercise the bug this way and these all need to be replies to the same story.
01:11:16And I like to specify where I care about things. So we will say that C is a reply to a
…32and then, so this is a, this is just me expressing the bug a little. And also if I write five lines of a test, I want to see that it runs. So this is like the source of the bug. So I’m not gonna keep this expectation, but I wanna get set up. So if I run it green dot No, undefined method, confidence order path. Oh Because it’s it’s confidence, order confidence order path is a database specific thing where in the query that pulls out the threads, it generates the confidence order path. Yeah, good. I closed the wrong term.
01:12:35So that’s fine. I mean, it’s not fine. It’s the bug. But what I actually want to express is that if I say comment dot
…54and spammer, I say comment dot story. What was it called? Story threads? Let’s just look at it,
01:13:09story threads. Yeah, for this story,
…18look at this and then I am expecting,
…32what do I wanna do? I really wanna just check that the IDs are in the order. Wow, I figure I can just directly say this. I said sorted dot to_a to equal a, c, b. So this is what I want. But I think this spec is gonna fail. And let’s see if it, let’s make sure it fails because my expectation fails, as opposed to I have to hassle with the race. That’s very big. So we expected comment ID 1, 2, 3 to equal... that’s not a... yeah, so that’s correct, but it’s not a usable diff. So let’s just say sorted dot map to grab the IDs. We expect to get the IDs.
01:14:46Yeah, good. So now I have a spec for this book which given it was a transient bug because by the time I, so I was especially happy this person included a screenshot because by the time I loaded on it loaded it, people had voted on various comments and I did not see the bug anymore because of course, as soon as you vote on things, the bug disappears, which is part of why, excuse me, which is basically why the site hasn’t been on fire the last week that this code has been in production. If, comments were all showing up in the wrong places forever, that would be, that would be, I would have interrupted my vacation to drive to somewhere. There was Wi Fi, the, the place I was at was so rural that the descriptions like one of the directions was turn off the paved road and then there are like 45 more minutes of driving. So driving to get to Wi Fi would have taken a second. I had, actually, yeah, I’ve been to this place a few times and starting a couple of years ago, somebody built a cell tower and so you can kind of if you stand in exactly the right spot, you can kind of get one bar. It took literally, I just set my phone down. It took literally an hour to post the stream screenshot, stream photo, excuse me, the stream photo to blue sky and Mastodon like a hit post and just came back a lot later. So there’s our bug. All right.
01:16:35Right.
…41So this is happening
…49initially. This is happening because this doesn't know the actual comment's ID, and so now dupes are super common.
01:17:09What I wish,
…16I wish I could do an insert returning, or otherwise instruct MariaDB: hey, I want you to fill this field with this SQL expression. But I don't think I can express that in MariaDB, period, and I know I can't express that through the layers of ActiveRecord. Let's go ahead and put that away.
…57So the tempting thing is to say, well, the way 1308 worked was record_initial_upvote, which is the submitter's... who calls that, who calls record_initial_upvote? I wanna see that actually
01:18:28the after create fires that hook
…38and then it creates the vote for the user.
This
this is test code.
Occasionally we insert comments with different scores and flags.
This is just there for tests in practice. This never runs
but then we have to hit the
hash.
So really the
cheapest. Did I leave that edit?
Yeah, let’s go ahead and get rid of that.
The,
the cheapest thing I could do is say
right here,
touch the comments,
confidence,
order to insert the,
wait for the comment ID,
comment ID.
And then I get to keep
some of this benefit of 1308 where I don't make quite so many queries back
to the database
and I’ll have to touch,
it’ll break this spec because then that won’t be correct anymore.
But that’s a pretty easy spec to fix.
And I feel like
that might be the way
rather than
let’s think if I’m making a mistake.
So what I’m considering is,
can I avoid reverting
1308?
So there are two paths, revert,
it’s safe. It works
well tested.
Don’t revert,
it’s faster.
It might help with deadlock.
And so
I want
to keep it
if I don’t revert though,
the downside is more clever, special casing.
And I’m trying to think about
is this special casing worth it?
It really does save a whole bunch of round trips to the database.
Common insertion per performance doesn’t
get my hackles up. Even if it goes and does
six database queries,
the database is in the same data center.
It’s very well sized for the amount of data in it.
And
all those round trips are really just
even with the overhead of active record,
they’re really just a couple tens of milliseconds,
which is not great like
on one hand of, yeah,
comment insertion feels like it should be the kind of end point that can run in
50 or 100 milliseconds in practice with rails
like 80 milliseconds is kind of your floor.
So if we come in at like 150 to 250 that’s not
the end of the world. Like that’s a little slow, but it’s not,
it’s not popular website slow where
doing something like inserting a comment or placing an order takes 10 seconds.
So I realized I am kind of
using a different scale than I would use in a commercial production
code base where
arh68 so in the SQL, the sort by has 1 field? like could ya sort by parent, (then) c_o_p i wonder
but there aren’t dollars on the line. That’s just sort of pride on the line
and getting comment insertion
in 80 milliseconds instead of 150 is incredibly satisfying.
So in the SQL,
arh68 i think <1s is fine
the sort by has one field, like you could sort by parent then c_o_p, I wonder.
No, it,
it has to sort by
confidence order.
Where is,
that’s the
the point of building up the
confidence order path?
chamlis_ can you pick a random final byte, and get back to a 1/256 failure rate?
Yeah, I think under one second is, like,
fine for commercial work, but I care a little bit more about this.
And I like that.
"Can I pick a random final byte and get back to a one out of 256 failure rate,"
chamlis_?
That's
so that’s really tempting.
So I was briefly considering that
it’s
either
use,
you know,
pull back ID for final byte or
insert random
number.
The hassle with inserting a random number is it’s not
going to be
a one out of 256 failure rate. It’s gonna be more like a one out of two failure rate
because
the problem we’re seeing is,
01:23:56is really this simple that A and B have the same thing and C
which should be up here
is getting put under B
and so,
hey,
get back in the bin.
And so
if I inserted random bytes for A and B,
it is a 50/50 as to which one of them is higher or lower than the other. Oh, but then,
we don't actually care if a is sorted higher than b. No, it's not 50/50.
I'm thinking through it wrong. I'm thinking of sorting,
not correct parenting.
dzwdz wouldn't random make this test fail 1/256
Yeah, we get back to one out of 256.
That is reasonable. I was, I was dismissing that too early.
I’m glad I talked through it. Thank you.
So
01:24:52it’s called it Miss Parenting. So I remember
01:25:00dzwdz as opposed to using the ids as it was done pre-#1308
which is actually
this is the same failure rate as we already have
or no,
this is higher than we already have because of all of
those caveats I had about how slowly IDs roll over.
…21Yeah. So what would that look like? That's actually short enough to just write it. So if I said assign_initial..., I am gonna say... where is it, what's the attribute gonna have?
…46And then here I’m gonna say, what is it is it. I never remember the interfaces. It ran 256.
01:26:03dzwdz i assume in this test the ids are sequential which is a pretty decent approximation of what really happens
All right. And if I say, like, rand(2) a couple of times...
all right, it is up to but not including. So that's exactly what I want.
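(Quick check of the interface being verified in the console; the upper bound is exclusive:)

  rand(256) # => an Integer from 0 to 255; 256 itself is never returned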
So I am
in this test, yes, the IDs are sequential.
And in the test data we saw go by, they were literally 1, 2, 3, because it resets the
database between
tests
or
it resets the table. Basically, there are a few
spec strategies.
So here,
so then this stays the same
and then
dpk0 stochastic programming: it could just be that you never got a ‘2’ as a result from testing it out to find out if the argument to rand is inclusive or not, but you trust that the probability is low enough :P
the other spec from 1308, which was over in the bitpacking spec, I believe.
Yeah. This thing becomes
01:27:09C dot
…22chamlis_ I guess you could add one more byte from the ID/rand like you discussed earlier for a 1/65536 failure rate
Stochastic programming.
Yeah. DPK, I did like
eight of them.
I’m not gonna write a property test of the
random number generator.
I was more
confirming my intuition than I was
using it for the first time there. So
spec model
comment and spec model
bit packing.
Where did I put spec extras?
There we go.
…57Yeah. If I add one more byte though, I'm starting to lose the performance. And it's funny, I'm a little bit tempted. There are really only,
01:28:21chamlis_ ahh I must've missed we were right on the performance cliff
I don’t think there’s a depth on here. No.
So how do I want
if I said
I don’t know. Rate the whole union.
Yeah.
chamlis_...
"cham-lee"?
Yeah, is it a French name? Cham-
lee?
The
cliff really is right there at about a byte or two.
And so there are so few comments that go
to a depth of more than 20 that
I am kind of tempted to
one way to get more bytes
would be to say
01:29:12that the maximum c_o_p length is not 31 times three. It's like, well, what if that was 18 times three, the same as the current thing? And the way I would get there is: there really are only like 10 comment threads that go that deep. I could just go and edit that data, like split those threads in the middle and leave a comment explaining, hey, I edited this, this doesn't match. I'd reparent... I'd reparent a couple of these threads, literally five or 10. I guess I couldn't bring myself to edit the data just to get the depth down to be able to have another byte. I don't know, it's funny, it's totally intuitive, I can't justify that one besides the vibes were wrong: editing data to get the performance just felt a little too sleazy for me. So... these, I lost it in all the conversation, these specs did not run, because... oh, the spec I fixed, the spec for this specific bug, didn't run. I expected at least that one would run. Let's go back. That's on confidence_order_path.
01:30:46Ah It’s not dash N, it’s dash E
oh It ran this time. Really?
byby42 A bit late, but if I get some time in the future, I'd like to challenge the idea that sorting comments in Ruby would use noticeably more memory.
Is this like the it’s failing one in 256 and that very first run I did.
Ok. Well, now it’s failing two out of three.
How many times do we wanna run this spec to get an idea for it?
You’re,
Yeah, a bit late for
challenging the idea that sorting in Ruby
uses noticeably more memory. It's,
I think that’s a pretty
hold on. I’m gonna clear my throat. I gotta mute you all.
So you don’t have to put up with that.
01:31:36byby42 Nah, don't respond, I'll just try to benchmark it at some point.
I think that is a really fair criticism of all of this confidence order thing.
How much actual performance gets actually saved by it?
If you wanna benchmark it, I would
love that.
I have tried benchmarking it and it seemed like it was about
a 20% improvement. Combining
both the action ran faster and the GC ran less often
and I say 20%
what am I talking about? 20%
speed improvement to that end point.
byby42 I think I can probably repurpose the YJIT-bench suite.
But then, you know, the GC pauses are distributed among
every request that comes along rather than just necessarily
the ones that load a tree of comments.
So it’s sort of hard to
measure.
Yeah, you probably could repurpose the YJIT-bench suite.
That’s, that’s clever that did not exist when I was doing this code.
So,
01:32:46So chamlis_, somehow my intuition that it was like
we would be correct 50% of the time rather than one
out of 256 seems correct because this test is passing like
half the time. Not 255 out of 256.
So,
chamlis_ huh
we do not understand what’s happening here.
This does not match my intuition
at all.
Mm.
I’m gonna leave the coat up and I’m gonna run to the restroom. I’ll be back in
about a minute or so
in two minutes.
Yeah.
If you wanna puzzle out that bug,
I’ll see you in two minutes.
01:35:11espartapalma Hi, good morning y'all
All right. Hello again. Anybody solve it?
Oh, hey again,
espartapalma.
Yeah.
So
why does this test?
Oh, there’s a fun distraction.
I don’t know if you all get the cute little animation that just played on chat,
but
I set a powers of two follower goal and it just ticked over
nanerbag yep we did
espartapalma I got the animation!
the first couple of follower goals. Twitch had me set
kind of blew through them immediately. Which was really neat.
I,
but you know, doubling, they come much less often. So,
how do I manage that? Oh, you got the animation? That’s great. I think those are cute.
How do I
had? I
improve it?
All right. So let’s go ahead and say end goal.
You’ve ended your goal now, let’s say edit. All right.
So they recommend 270 but that is not a power of two.
I don’t know if I wanna go straight to 512 because then I’m not gonna see it for months.
What is a nice powers of two number bigger than 256?
Hm.
Let’s just add 32
2, 88.
That seems reasonable.
Then I’ll see it again at some point in the next
couple of months rather than,
you know,
two years from now.
All right.
So this test, this tweak of having an ID placeholder,
I would have expected that
c would start sorting into the correct place.
Oh, byby42, when you set up your benchmark,
I was doing very ad hoc benchmarks
where it’s just kind of me watching the
server logs and grabbing out the production timings.
If,
if you get a repeatable test rig going on that,
I would love to see that code regardless of what numbers you produce because
byby42 Sure.
I could experiment a lot more confidently with this confidence
order path stuff and try other approaches that just weren’t worth
10 hours of implementation and then
vibes based evaluation in production.
Hm
01:37:44Like, one thing I suspect is: if it's gonna be three bytes wide, where would I get if I did one byte for the confidence and then two bytes for the ID? Because comments really are clustered around, well, all the values that you would expect to see, where it's combinations of 1 to 5 votes, rather than using all of those two bytes.
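(A hedged sketch of that one-byte-confidence, two-byte-id split, reusing the scaling formula from the scratch notes above but compressed into a single byte; purely illustrative, not code from the repo:)

  confidence = 0.42
  id = 70_009
  # higher confidence maps to a smaller byte so it sorts first ascending
  conf_byte = (255 - (((confidence + 0.2) * 255) / 1.2).floor).clamp(0, 255)
  [conf_byte, (id >> 8) & 0xff, id & 0xff].pack("CCC") # => 3-byte binary string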
01:38:25Before I set that up though. We wanted to have some more test harnessing.
When I wrote that code,
I ended up pulling production data to my local and then
writing a script which is probably still in the pull
request or in the issue somewhere.
Maybe it’s even in the repo
that
sorted all of the comments on every story,
printing out the tree.
And then I took
the existing code and I dumped that out to a text file.
And then I wrote this new confidence order implementation
and then dumped out every,
you know, the tree for every comment and every story.
And then I just diffed those two to make sure I was getting the exact same results
on either side.
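(A hedged sketch of that dump-and-diff harness; story_threads and depth are the names discussed on stream, the exact signatures are assumed:)

  # dump_trees.rb -- run once against each implementation, then diff the two files
  File.open(ARGV.fetch(0, "trees.txt"), "w") do |f|
    Story.find_each do |story|
      f.puts "story #{story.id}"
      Comment.story_threads(story).each do |c|
        f.puts "#{"  " * c.depth.to_i}#{c.id}"   # indent each comment by its depth
      end
    end
  end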
chamlis_ maybe the final byte could be the current comment count mod 256, if that's available cheaply? would get determinism and slow wrapping within a story I think, though deletes might cause issues
So I
Did I edit this test wrong? Where did that come from? All right, 255... this becomes,
01:39:30but that’s not even the spec I’m looking at maybe
the final byte could be the current comment count mod
256, if that's available cheaply,
you get determinism and slow wrapping within a story.
Deletes are not an issue because we only ever soft delete comments.
However,
to,
to know the current comment count mod 256 I would have to
go and hit the database and ask it how many comments there are.
In which case, I might as well just
ask for the comment's ID.
The thing is,
Rails in production is multi-process,
they are called Puma workers, and then it's also multi-threaded.
So it’s not like I can just have a
global variable and increment it every time a comment is posted;
it,
it wouldn’t cross the process boundary. And that would also be
even more layers of cache invalidation.
And so at that point,
it just feels a little, a
little overkill.
So
before I move on from this ID placeholder byte,
I really do want to understand why is this test failing 50% of the time
instead of
one out of 256? Because I really did expect that.
Oh, it’s because there’s 22 parent comments. So it’s not,
it’s a, it’s feeling for a different reason.
It’s not
the question of where does C get parented?
I bet if I looked at this, c would be misparented one out of 256 times.
But if you ignore C
A and B are swapping positions.
So this
is the 50/50 thing that I was expecting. Let's go look at those diffs.
Yeah.
So I expected 2, 1, 3 but got 3, 1, 2. So one and three, a and b, are in the correct positions.
I'm sorry, a and c are in the correct positions, but b is swapping back and forth over a,
and that’s harmless.
The test needs to account for it.
So,
So chamlis_, how do we write, in the spec, that we expect,
that we don't care whether a and b are
sorted differently; what we care about is that c appears under a.
before I give up on this path, I would like to kinda
chamlis_ sorry I've never written any ruby
to drive it out and see. Do we get to a one in 256 failure rate
because then I feel confident I understand what’s happening.
So I guess really what I’m expecting is that
sorry? You’ve never written any ruby?
Oh, that’s ok.
What I’m trying to say is that I expect.
So
let’s grab this.
Yeah,
and then
I am expecting
that we have sort of a subset.
Can I do that in Ruby?
Can I just say, in an array...
because you can have arbitrarily nested arrays? I wanna say it
without having to get into the indexes. So if what I have is
1, 2,
what I expect is, you know, something like 1, 3, 2.
01:43:23Well, I can say... I expect either... Yeah, I don't wanna expect either.
…43Is there a... I don't want to do each_cons, I don't want to kind of search... can I find a slice? 'Cause I don't wanna make a slice, I wanna just search for: there is a slice of, you know, [1, 3], I don't care where it is. And so there is find_index, but there isn't a find-the-index-of-a-slice. And I am trying to keep this test reasonably expressive, which is why
01:44:30byby42 Can't think of anything like that
if I said
if I have that in my a
…40there’s a maybe it would be reasonable to say a dot Cycle now. Is it a cycle? Oh no, it’s gonna cycle the whole thing. Yeah. There is a way to say chunk chunk.
01:45:16I say
byby42 each_slice
there’s a way to say, give it to me in groups of two.
And at that point, I would get back an array that goes
13
and then 32,
which is fine
each slice.
…44And then here I have to say each slice too. Yeah, it’s not a sliding window. It’s giving me it in groups of two. So that’s not quite what I want. I want overlapping slices. Does that make sense?
01:46:09Where is Zeal? Let's bring that back.
…20Yeah. See, this is disjoint. I want overlapping tuples. Can I just search here for "tuple"?
…36sort_by, zip... zip? That would be awful to do with zip. each_cons, which... each, here we go: each_cons calls the block with each successive overlapping slice. All right, I don't know if this is actually gonna make sense to anyone but me. But so if I say each_cons
01:47:12byby42 oh TIL
and then I expect the relationships
to include,
you know,
[a.id, c.id].
That’s,
yeah.
So it’s funny because
Ruby at least allows me to express it
succinctly.
But reading this,
I know in like three months I'm gonna be like, what on earth is each_cons doing?
01:48:01And I’m trying to express that.
…27So I think now the test is gonna fail at a one in 256 rate.
…43Well, so that's an improvement. It looks like we're off the one-half rate, and now I feel like I'm getting immediately called back to my Bayes methods, of at what point am I building my own confidence about confidence_order. It's cute that this spec fails only one in 256 times, but I'm not actually gonna commit a spec that's flaky one in 256 times; it would just be nice to know. So if I do the mental math: if I run this, how many times do I have to run it to see a failure? I think if I run it 256 times I'll have a... at some point it approaches 1 - 1/e. Oh man, my college math professors are mad at me right now, especially because there is the mathematical answer, which is going to involve e, but then the practical answer is just run it 20,000 times and see that it fails at about 1 in 256. Mm.
01:49:59dzwdz 63% chance of hitting a single failure i think
I don’t actually want to sit here running this manually 256 times.
Obviously let alone 2560.
dzwdz in 256 runs
I don’t wanna write a script. So I am just going to
63% chance of hitting a single failure.
Yeah, I am sure there is
a proper
Bayesian...
dzwdz 1-pow(255/256, 256)
an answer and I don’t know it offhand.
I feel at this point
I am sure
oh, frequentist statistics.
You
probably also, right.
So I feel
comfortable enough that I understand what this code is doing
to say: yes, this spec is gonna fail 1 in 256 times,
01:50:57and I can do something evil to make this spec reliable. So right now the spec is flaky. And the evil thing I can do is
01:51:26chindiana_jones Vim is the best editor
let’s justify it.
Set
"Necessary" is one of those words I always struggle with.
Hi, chindiana_jones. That's
a
good title. Yes, this is
Vim.
So, if the spec is necessarily flaky,
the generated
IDs can be the same
byte.
dzwdz (also, had to go afk for a while - i see we're just going with a random byte?)
So what I can just say is if,
yeah, and if in a test who does that?
If a dot
id
placeholder byte equals
chindiana_jones What are you working on?
b.id_placeholder_byte,
hm
don’t need to change thoughts.
So I don’t know that we’re
DZ I don’t know that we’re going with a random bite. We are
going with a
spec
01:52:31I just wanted to
try the random byte and see if I could write a spec for it and just kind of
feel it out.
So Indiana Jones,
I am working on... we. This is very much a "we" thing.
We are working on Lobsters,
pushcx https://github.com/lobsters/lob
which is a web forum, and it has threaded comments, and SQL and threads don't mix.
Here’s our scratch file.
dzwdz oh wow that if is cursed
I will just throw the main bug in here and you’ll have to follow links,
but most of them are, are linked off.
We made a recent performance improvement in 1308,
"that if is cursed". Whoa. Oh my God... what are you doing?
I have,
nanerbag that was new
I have never seen Vim.
I pulled it off screen because God knows what it’s doing,
chindiana_jones LOL
but it is scrolling continuously.
That’s
hilarious and awful. I wonder if I did something to Alacritty.
Yeah, it seems to be looping the
Alacritty buffer. So I am just gonna blow that away and open a new Vim
dzwdz vim's trying to stop you
and,
dzwdz good.
let’s get back to Lobsters.
That was wild.
Where’s my scratch buffer?
So definitely something there. Like,
what on earth did I do to Alacritty
to get it to do that.
01:54:03nanerbag ive only seen vim go wild when i open it inside a cli debugger
That was a weird one. All right.
And I don’t need a terminal. I need a Vin.
…16chamlis_ could you loop the spec until you can do an expect?
You’ve only seen Vim go wild when you open it inside ac L A debugger. Yeah,
I don’t know if you saw this on the other
stream but since I use Vim as the terminal multiplexer,
dzwdz so this test would pass on the current codebase
Could I loop the spec until I can do an expect?
dzwdz i'm pretty sure
yes, that is a different kind of cursed.
I think dzwdz
is right.
This test would pass on the current code base.
I can check that.
I mean,
with that if... well, yes, with that if it would
pass, because they're both gonna be...
I mean, it wouldn't pass, because the id_placeholder_byte doesn't
exist. But if we grabbed it, yes, they would both be 255.
So yeah, I I had to
edit my Vim config to catch if I was opening a Vim inside of a Vim terminal
because
it acts
wacky when you do that,
it argues over key commands, or
grabbing various keys. So
01:55:28yeah,
I mean
dzwdz i'd maybe do a mix of the two approaches? run the test like 4 times, ensuring that if it fails it's for that reason
it is performance code. You’re allowed to be evil in performance code, right?
There is a stream title,
…48Do a mix of the two: run the spec four times, ensuring that if it fails, it's for that reason. That's...
…58that's fair. I don't wanna over-engineer this spec, and I am aware this is heinous, but
01:56:15but I am kind of comfortable; that is the amount of heinousness I can commit. I don't know. So now that we've implemented, I think it was chamlis_, yeah, chamlis_. Now that we've implemented their strategy of "just do a random byte" and kind of looked at it... I mean, the benefit is it avoids that round trip to the database, which is the thing, because the alternative I had was pull back the ID for the final byte.
01:57:07dzwdz i wonder how the odds would change if you just added a bunch more comments
So what do we think this
dzwdz wait no it'd just fail more
this quality here is
…16you wonder how the odds would change if I added a bunch more comments?
Yeah.
Yeah, it would just fail more
because these are just random bytes.
So this is now.
So let’s think about what this means for production.
Cause even with all the fun
bit bashing kind of performance hacky stuff, we have to care
are the comments coming out in the right order
and inserting a random byte means
the likelihood of this exact bug which has happened plenty.
dzwdz hm, if it would fail more with more comments
In addition to the
chamlis_ what's the birthday paradox number for 1/256 odds?
four people here who have thumbs up,
which I am taking to mean four people have seen this bug.
dzwdz it would fail more in production
I have gotten, I think three direct messages from people reporting the bug. To me.
What's the birthday paradox number for 1/256 odds...
It would fail more in production.
Yeah.
And it would fail more in production because comments can have multiple replies.
So it’s not one in 256.
It is.
Oh and it’s even worse than that because it is.
So here we have
A and B and C and C went to the wrong place
but there is also D
and so this goes wrong
as a birthday paradox with
the number of child comments:
when
the two parent comments
hit the same random byte,
every time there is a child comment,
it is just flipping a coin
as to which parent it's gonna go under.
And so we are not
getting it down to one in 256
and I don’t
immediately know what that formula is going to be, but it is definitely not
such a small number.
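(A rough answer to chamlis_'s birthday-paradox question, as a sketch: the probability that at least two of k sibling parent comments draw the same random byte out of 256:)

  def collision_prob(k, space = 256)
    1 - (0...k).inject(1.0) { |acc, i| acc * (space - i) / space }
  end
  collision_prob(2)  # => ~0.004, the plain 1-in-256 case
  collision_prob(20) # => ~0.53, so around 20 siblings is already a coin flip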
This is
a
oh, this feels like it’s gonna happen about
once a day
because you know,
just kind of rule-of-thumbing it: there is roughly one byte's worth of comments per day,
and there are roughly 256... let's
here.
01:59:58I think it’s date
the,
is the function to extract it
from?
Yeah.
So if I say
dzwdz i assume you're not using the last byte of the id because fetching it would be slow?
select date(
created_at),
count(1) from...
I
swear...
we're in September, so let's say three months ago or six months ago.
Yeah, let’s just do the last couple of months.
02:00:41Yeah.
Oh, it’s actually increased recently.
So we are running about one byte's worth of comments a day. Hey, look, here's exactly 255.
I mean, it’s not 256 but
that,
that counts as a round number, almost
roundish number.
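(Roughly the ad hoc query that was just run, written as ActiveRecord for the record; column names assumed to be the standard Rails ones:)

  Comment.where("created_at > ?", 2.months.ago)
         .group("date(created_at)")
         .count
  # => a hash of day => count, hovering around 255 comments per day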
So this really is gonna happen
way too frequently. So chamlis_, I think,
I think this was a solid idea
but now that we’ve seen it and kind of kicked the tires on it.
dzwdz what if: two random byes
I don’t think I wanna go with this.
This is
a,
it’s clever
but it’s just not quite reliable enough.
dzwdz i mean it'd still be shit
What if two random bytes?
Two random bytes in the confidence order...
'cause I could just override... oh, that's...
well, then I'm...
if I put a random byte,
if I make one of these two
a random byte,
dzwdz no, i meant making the confidence order 4 bytes long
we are definitely getting comments sorted wrong.
02:01:55dzwdz unless you can't
You meant making the confidence order four bytes long.
We are waiting on byby42
doing a performance test harness
before I’m willing to consider making that four bytes wide
byby42 Oh I didn't promise anything :)
because if it's four bytes wide,
At that point.
I almost just wanna make it five bytes wide, and
I know you didn’t promise I’m just putting you on the spot. I hope that
I was hoping that came across as teasing.
It was also a like, hey, are you still listening?
No, I know you didn’t promise. And
also it’s a volunteer project. So even if you promised,
I don’t think we can really yell at you if you
promise to give a gift and then don’t give a gift.
Like nobody gets yelled at for not giving a gift.
So,
yeah. All right.
Where am I
at?
So I’m gonna take this diff.
Yeah.
02:03:18Can I just do this to?
So there is something weird with my,
my vim set up where I tell it to run a
command and instead it just shells out instead of running the command
dzwdz possibly even worse idea: use the top byte of time()/60
and I know roughly where it is.
Oh, I know exactly where it is. All right. Post stream.
Yeah.
…51dzwdz it's basically sequential
Use the top byte of time divided by 60,
dzwdz?
dzwdz also please don't do that i was joking
What’s the value of that? It’s basically sequential.
dzwdz er yes
I think it’s sort of top bite. You mean low bite, right?
The, the rightmost digits
of the number.
Yeah.
Yeah.
So we’re, we’re big Indian versus low Indian here. But
hm
I see where that gets. Yeah, that’s not,
let’s just go ahead and do this, let’s run
02:04:40and then
I don’t wanna just totally throw away the code. So I’m just gonna dump the,
oh, that’s my
what’s the command to not use? No X diff.
I want a proper patch
and then typically I use
diff Tastic.
So that’s why it’s getting that nice side by side view.
But I wanted to generate,
you know, an actual patch.
So I’m not gonna just,
I am gonna revert this code because I don’t wanna keep
any of this,
dzwdz it wouldn't even make the test pass without the safety latch lol
but I did just wanna see it
all right.
02:05:25Yeah, it’s been
safety latch is,
oh, really,
dzwdz unless you threw in a sleep in there
really flattering way of describing this. Terrible if,
safety latch the,
unless I threw a sleep in there. Oh, it’s just getting worse.
It’s kind of fun to write that sort of evil code of like,
what’s the worst way I could get this spec to pass.
…58So I think the answer is going to have to be: let's insert the low byte of the ID into confidence_order. Oh, you know what, I think I actually do want this spec back, I shouldn't have thrown it away; that's the thing we're fixing. Good thing I saved that diff, huh. Every once in a while I do something clever and it's not too clever. And every once in a while I misindent Ruby. All right. So let's go ahead and drop all that, and we're gonna get rid of... yeah, we'll keep that part of it.
02:07:01So hang on to that spec for a minute.
…07So what I want to say is, let's split this... where is the...
…20What I wanna say is comment... no, self.update... and this is where, even having used Rails forever, I have to check this. So there are a bunch of methods on ActiveRecord models, and one of them does a direct update that, yeah, bypasses validations and skips callbacks, and I can never memorize: is it update_column? Is it update_attribute? Is it singular? Is it plural? There is not a consistent way to do this. So we'll just say update_column. Again, I am allowed to do evil things in performance code. And we want to say update_column, :confidence_order, and I'm going to update it to wrap that, and then,
02:08:32Then the Vote model... yeah, the Vote model shouldn't have any... yeah, there's no model hooks there, no callbacks, because that would be expensive; we get a lot of votes every day. Yeah. The other problem with cache invalidation, which is a lot of what's happening with all this story and vote stuff, is it mixes really poorly with Rails', or ActiveRecord's, fondness for model callbacks that happen before or after create. At some point it is very easy to have these cascade in a loop: vote updates comment, comment updates story, story updates vote, vote updates... and then you're off to the races, especially when those callbacks have ifs. That's been some very fun debugging for me at clients.
02:09:34So this, at the point that it calls record_initial_upvote... I think Rails automatically selects it back to have the ID, rather than still having the wrong thing. So if I said this, it wouldn't be nil,
…59byby42 Not with update_column I don't think so
it’s just,
yeah, run that spec
02:10:09Wrong number of arguments: given one, expected two. What was it, update_column(name)? Oh, name comma value. It's not a hash.
…38byby42 update_column(name, val) / update_columns(name => val)
can’t write
unknown attribute. Oh This,
this needs to be a symbol.
…57OK. So yes, Rails does auto-populate the ID; it didn't have to reload a to get its ID. And then the spec passed, because the relationship is there. That gets us down to the very low error rate we expect.
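(A hedged sketch of the change that just made the spec pass: once the initial vote is recorded and the id is known, overwrite the placeholder's last byte with update_column, which skips validations and callbacks. The packing mirrors the scratch formula above; the exact call site is assumed:)

  # inside Comment#record_initial_upvote, after Vote.create!
  update_column(:confidence_order, [174, 82, id & 0xff].pack("CCC"))
  # equivalently: update_columns(confidence_order: [174, 82, id & 0xff].pack("CCC"))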
02:11:18So while this is a round trip to the database, it is only one round trip to the database, whereas always calling update_score_and_recalculate! is gonna kick all this off,
…38byby42 Yeah, still saves 7 queries IIRC.
and it might actually be.
So the second thing I was thinking about
…49Still saves seven queries. Well... yeah, you're correct that it saves seven queries. However, it saves a query that I don't know we want to save. I've been thinking about this 1308, having traced it here at the start of the stream and then traced it again.
02:12:19This part is fine like this is correct.
This is the long version of what I’m doing
where
we know that with only the submitters initial up votes that this is one,
we know this is zero.
So we know what these things are.
All of these are the same. The only thing that is different is this ID at the end.
However, this calls update cached columns on story
and update cached columns
does actually matter.
So for example, we’ve been using that comments count
column in our database.
This immediately has the wrong number. So now when a comment is inserted,
a
story will not show that comment until it gets a vote or until
well, really until any
comment in that story gets a vote,
byby42 :facepalm:
it’s not
byby42 Totally missed that.
super.
And then
the other thing is each comment
slightly
contributes to a story score.
Yeah, bye bye. It’s ok.
byby42 missed it and I missed it when I merged the PR, but I've been thinking about
it
and, and looking at
it
and I had initially said, well,
yeah, so the cache is slightly out of date for a moment,
but people vote on stories all the time,
but it feels like a pretty
fundamental thing to get wrong. The comments count.
02:14:13So I’m wondering whether it’s worth it.
byby42 The comments count can be done atomically.
And one of these is gonna be the source of the deadlock like we’re
touching comment and then we’re touching story
and the story touches comments back.
Well, but there’s no votes in there.
The comments count can be done atomically.
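(byby42's atomic-bump suggestion, for reference; Rails' increment_counter issues a single UPDATE against the stories row rather than recounting in Ruby:)

  Story.increment_counter(:comments_count, comment.story_id)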
…37byby42 But yes, I was trying to reduce the queries on story to reduce deadlocks.
Yeah.
Yeah. One thing I don’t love about this is
it’s doing this on the rail side
where this could be a
a sequel command to tell
the database to count
rather than
pull the count back to the ruby. And then
hm,
yeah, the deadlocks,
the deadlocks are something about voting and comment creation.
It is very rare that it’s story creation,
which is kind of funny because story creation
is way more expensive.
So it would hold that lock longer. But creating stories only happens
byby42 Yes, but the issue is that to create a comment, you lock the story in the one transaction
10 to 30 times a day, where comment creation happens, you know, 256 times a day.
byby42 That I believe is the source of deadlock.
Hm.
02:15:36It’s possible that the comment locking. The story is the source of the deadlock. I have once or twice I have traced because you know, on the rail server output, you can see all of the comments that or you can see all of the sequel queries that get executed and I’ve kind of traced them for when I insert a comment versus when I change a comment, what’s the path of tables we hit and then the same thing for comment, creation and story. And it didn’t leap out at me that we were locking the same tables in different orders. It’s possible it’s there and I just missed it, but I didn’t spot it immediately.
02:16:27byby42 To be clear I don't have strong evidence.
So what this really comes down to is,
am I OK? With these caches
being invalid for a little while?
…42OK, I understand your clarification that you don't have strong evidence. Yeah, I also don't have super confidence around the deadlock; it's one of those where it's so rare and fiddly that I don't try to speak too confidently about it. So what I'm wondering is,
02:17:15is it OK? That
…28so it’s the comments count. It’s the hotness and then same again for other stories
…45dzwdz i assume comments_count is the one displayed on the front page?
and merged stories are like,
I don’t know 1% of stories. So that doesn’t really keep me up at night.
But it's in comments count and hotness.
Yeah, comments count is displayed in,
I believe, list detail.
Yeah.
02:18:13Yeah. So list detail. This is the partial that gets used for the home page. The list of stories that is used everywhere. So it is basically everywhere. The story title and that familiar row of links appears, the comments count would be wrong.
…39I guess what I’m asking is what is the median period between a comment is posted and any comment on that story receives a vote, right? Cause if that window is very small, this is no big deal. But if that window is minutes long or hours long, we’ve inserted a bug. The smaller version of it is. Yeah,
02:19:42dzwdz so the comment count only increments once someone updates the comment?
byby42 Story.increment_counter(:comments_count, comment.story_id)
dzwdz that feels bad for first comments on stories
that feels bad for first comments on stories. Oh,
that is a really good point.
Can I just
come here? Which which give me that
struggling with the mouse cursor?
I just wanted to just grab that straight into the log. Yeah, that’s
that is especially bad for the first comment on stories
because
oh, and it’s worse, the the fewer comments on the stories are. Yeah.
dzwdz, you got to a really good point here
that
this window where the comment count and the hotness are
out of date
is going to be significantly longer for
stories with few comments
when a comment, when a story has 200 comments, well, you know,
dzwdz and if you were going to calculate the median time you'd need to take that into account
a vote is gonna come in fairly soon
when it has zero or one.
dzwdz hm, how long has that been a thing for?
And it’s especially the worst for the first comment
because then the story will show zero comments.
And so people won’t wanna click in to read the first
comment because they won’t know it’s there.
That’s
brutal.
Yeah. So that’s,
that is a, I think a really compelling point.
So I was writing down this question because
like I can mentally start writing this query,
but I know it’s gonna take a couple of minutes to get right
and then I was gonna talk about hotness, but this is so bad because
we
don’t want to be clicked into
stories with zero comments to vote
and invalidate the cache,
and
you know, that points to,
oh, what if we had a background job that every 30 seconds ran and poked
all of the stories that are on the home page or on slash newest?
And when I, I see that it’s, it’s just epic cycles on epic cycles trying to
minimize the badness of the cash invalidation.
The other part is
it just bugs me that hotness will be wrong for some period of time
because
that affects story scoring
and it feels like it disincentivizes
posting a comment, because comments
contribute to story scores.
And so
I’m kinda down on it for any amount of time.
I want
stories to be able to move around.
I want the home page to be dynamic as people are commenting and
on stream.
A week ago,
I was even talking about starting to theorize weighting comment points more,
because I think comments are,
I think I called them the beating heart of the site.
Like they are the point of the site is to have discussions with other people.
And if we’re not
getting that very,
very right to reward people for doing it and encourage good conversations,
what are we doing?
So I think
having seen these two approaches to implementation
and thought about the consequences
and
this is really just
how I work on a typical basis of.
I kind of have to see the code to be able to reason about the consequences.
I think I have to revert 1308
because it’s not just more special casing, it’s
02:23:45it’s invalid cash data on story and it’s kind of and for an indefinite amount of time. And I think that’s really unpleasant.
02:24:01Yeah. So let’s look at this diff so this approach,
…14let’s look at the Yeah hatch. It’s shorter on code than the random ID thing, but that’s just a fluke.
…40So having done that, having done that, I think we have to revert 1308.
…53
arh68 revert seems reasonable, given all that's come to light
T I’m not sure what you’re asking about. How long is what been a thing for
1308 has been
merged for about
a week, 10 days. Oh, that's the pagination; I have two browsers open.
The other one was the pagination stuff. Assuming we would get to that.
This turned out to be a, a much deeper rabbit hole than I expected.
So let’s look at what I was in 1308 and see if there’s anything to
save
because it touched a whole bunch of code.
So this inserted placeholder values. Oh And it’s not just this,
this is
not the complete diff because I landed some code on top of this.
And I
did, I put it in the merge commit.
No, come here.
02:25:53Let’s look at the recent commits.
02:26:06dr3ig pushcx, for my own sanity check: in order to build the comment tree of a story (if you were to build it in ruby), you only use the following data: comment_id, parent_id, number of votes, number of upvotes ?
So here it is.
OK. So yes, I must have put it in the merge commit, so it's showing here but not on the PR.
In order to build the comment tree of a story,
you only use the following data: ID, parent, number of votes, number of upvotes?
No.
The the other part you’re missing is confidence. That’s why
arh68 # of flags ?
that term keeps coming up. It’s the
where is it?
…47Where’s that function? Maybe you came in a minute later on the stream. I showed it right at the beginning. There’s this comment hash number of calculate number of we’ve been running long enough that I’m I’m starting to run out of steam. So this calculated confidence is a formula for creating the difference between a comment that has a score of one because it has one up vote and no flags and a comment that has a score of one because it has 10 of votes and nine flags. One of those is a better comment than the other. This function is a little bit overkill because on Reddit, people upvote and downvote comments much more frequently because Reddit has down vote to me and I disagree or I dislike and lobsters in part because we’re a much smaller forum basically got rid of down votes. And there’s the implementation detail that flags are implemented as a down vote. Some of that is our historical path, but it really is confidence, not number of flags. Very few comments get flagged at all. It is slowly moved into being a mod only signal.
02:28:30It is not
given where we
go, where the site has gone over the last dozen years.
It’s actually totally possible that confidence
is not adding anything of value to the site anymore
because
arh68 i don't think i've ever flagged anything HahaHide
in part because we get fewer votes
and that’s just a popularity thing.
But in
part, because
02:29:05if people aren’t down voting, we’re not getting that signal anyways. And so we’re just doing a bunch of arithmetic to get to up votes minus flags. So Drake, you know, the narrow answer is no, we need the comment ID, the parent ID and the confidence. But the broader answer is, yeah, maybe it could just be the ideas and the score, the score is, you know, up votes minus flags. And if we’re getting the same amount of signal out of it as all of this confidence math but all of this confidence math is not actually winning us anything. Do we wanna drop that? Because then then confidence order path could become something like score, order path. And if it’s still three bites wide, it would have one bite. That is the score which yeah or maybe call it like the score plus 10 cause comments can hit negative scores. There is a a limit there of negative 10 which is why I came up with 10 as a number and it is theoretically possible up votes are unbounded. But for sorting, we could bound it, you know, 255 or 245 and effectively not lose any data and then two bites could be the ID.
02:30:54intrex111 hello hello
How do I feel about that?
02:31:02intrex111 how are you all?
Hi,
re
welcome.
There is we are like
pushcx https://github.com/lobsters/lob
intrex111 btw yelp filed a lawsuit against google
way deep into a coding stream. I am not going to try and catch you up. But
the short version is there’s this bug and in debating
whether to revert the performance code that introduced it,
do we wanna just totally redo comment scoring?
So, does comment scoring,
well, sorting,
even benefit
from the complicated...
I don't need to disparage it... from the confidence
calculation?
Lobsters doesn’t use down driving much.
…57intrex111 im still shit at all of this stuff, trying to get into uni rn to study this whole thing mainly cybersecurity
Well, let’s say it better doesn’t use down voting.
02:32:10dzwdz i wonder how hard it would be to make an userscript that sorts comments in this way
intrex111 what do you think about yelp and google though?
Mhm.
intrex,
that is neat, but we don't really talk much about
dzwdz to try it out
businessy stuff on this stream or on the site.
intrex111 okay okay
I
have a personal opinion about
intrex111 no worries
Yelp and Google, but
it’s just not relevant. This is a coding stream.
Cool.
intrex111 i meant the lawsuit btw
if you haven’t seen hacker news, they probably have a lot of responses about it.
And that site has open, sign up.
So that would be a great place to go to talk about the lawsuit.
I am sure they have a thread on it
and I will probably end up reading it this afternoon.
So.
02:33:03intrex111 where again sorry?
dzwdz news.ycombinator.com
intrex111 there are a lot of noise around
dzwdz if i wrote it from memory correctly
Oh, hacker news. It’s,
here, I’ll give you the link and chat.
Ah,
there we go.
dzwdz
got it,
…17intrex111 thank you both
looks right to me.
If it links to a very tan and orange site, you got it.
…31Does dropping confidence actually get us fewer round trips to the database?
Yes.
intrex111 have a good day you two also be careful of telegram it aint end to end encryption
Yes. Then
1308.
Well, no, it still has to.
…50Yeah, we’re very aware that telegram is not end to end encrypted.
dzwdz signal gang
It is
a
weird Russian social network
mostly used for
scams and crimes.
Yeah,
dzwdz i use signal to cmmit all my scams and crimes
man.
Signal
on a technical side. I love signal
there.
I don’t,
intrex111 there is a whole new case if you didnt know
I’m not a deep crypto guy. I know a little bit about
crypto. Like I can kind of read, I understand what the double ratchet is doing.
nanerbag Me who is switching between this stream and talking to my friend on telegram
Even if I couldn’t implement elliptic curve cryptography myself
or even
do any kind of cryptanalysis. But like I’ve read some of their stuff and it,
the parts I get are very inspiring.
The organization is
intrex111 ceo got arrested and encryption being weird and stupid and all that
Signal the organization is so weird, and
I'm still deeply skeptical after they introduced a pre-mined cryptocurrency.
arh68 i am so out of the loop on all that stuff LUL
I sort of understand how that
intrex111 i dont want to talk about it since you dont want to talk about that stuff
happened where they’re like, oh,
people really want to integrate payments and it would be useful.
But
the cost of that was so high where
they stopped updating the server code for
something like nine months or a year.
And everybody was like,
hey, has some lawsuit happened is something very bad happening.
We don’t know and they just refused to talk about it
and then
they introduced it. It went really poorly.
My understanding is it has seen little to no adoption.
dzwdz puts tinfoil hat on
There were weird price movements just before it was
dzwdz the coin was a coverup for a lawsuit
released and publicly announced that implied insider trading.
I don’t know. They never,
if signal did a retrospective about it,
I would like
them a lot more
dzwdz to explain the lack of activity
as it is now,
everybody makes mistakes.
But if they’re not willing to look at those mistakes
and learn from them and address some of the shortcomings,
intrex111 if you want arh you can go on spotify there is a podcast called "security now" by steve gibson and they talk about all of that and more important stuff, they also have a link to a pdf which are their notes
dzwdz also they're lizardpeople
I don’t know, the coin was cover up for a lawsuit.
Yeah, that’s getting tinfoil Hatty.
But feel free to send that to me offline.
Yeah,
Steve Gibson.
I can see that.
That would be interesting if you’re really interested in starting with security.
There are other podcasts for going a little deeper than Gibson does.
He does great intro stuff, but he is not a, a
deep expert and some of his opinions are
out of the mainstream of expert security analysis.
Hm
intrex111 who would you recommend?
So coming back to confidence calculations, if we don’t use downvoting,
would this let us merge 1308?
Would this let us keep 1308? Yeah, maybe.
Yeah.
02:37:06Negative 10, which is the... what is it called? Is it min score for flagging... min score?
…26What are the most upvoted comments on Lobsters, now that I say all this...
…36is it Subster, I’m never gonna get this right
…56dzwdz what about the comments_count?
on the comments. Do
story ID in the stories
order by
let’s grab the score too.
02:38:20Look at the top 10. Oh yeah, there are comments that go way up to 200. Not bad, not too surprising. They’re all gonna be, they’re almost all on meta threads and then two or three spicy threads. Yeah. All right.
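(The ad hoc top-comments query, roughly, as ActiveRecord; column names assumed:)

  Comment.order(score: :desc).limit(10).pluck(:id, :story_id, :score)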
…52Let’s move that to its online.
02:39:10Eliminates birthday collisions... I'm not remembering the right word. The reason I say eliminates is that stories are only open for comments for, I think, 90 days. And so if we only get 256 comments a day, a 65K ID is not gonna roll over before the story closes. Yeah, I think that might actually be a serious improvement. I guess really the open question is how much does confidence differ from score,
02:40:04or another way to put it is if we start a score instead,
how many comments
change order?
Right?
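(The back-of-the-envelope arithmetic behind the 65K point a moment ago:)

  65_536 / 256 # => 256 days for the low two id bytes to wrap at ~256 comments/day
  # stories close to comments after ~90 days, so no open story sees a wraparound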
dzwdz, what about
dzwdz yeah i sent that ages ago
the comments count... in what context?
…26the,
dzwdz no np
oh, you sent that ages ago? Yeah. Sorry.
It’s hard to
jump back and forth between chat. Especially when I’m trying to
think of everything I’ve said in the last 10 minutes to, to summarize.
So,
…50To answer this question, we could just dump out the comment trees again and diff them, and then just kinda get some stats on it. Yeah, that's pretty straightforward. That's an interesting one. It's a big enough change that, especially absent this, I kind of wanna sleep on it rather than YOLO it on the stream. It's tempting to YOLO it on the stream though, I will say that, because it's fun to refactor this stuff, especially with commentary. I've heard people talk about mob programming; I guess this is my first time doing it.
02:41:42dzwdz i'm impressed just how much you've misinterpreted my question lmao
All right.
So let’s
go back,
I think regardless I am gonna
revert
1308 here on stream rather than go deeper into.
Oh, should we refactor the way comment sorting and confidence works entirely?
02:42:15dzwdz i asked it in the context of you talking about keeping #1308 in if you change the comment ordering
So this, this is the other stuff I was thinking. Do we wanna keep any of this?
And the answer is no,
because this is just testing
new behavior.
This one would pass anyways in production,
dzwdz because there's still the issue of the caches being invalid
but it doesn’t really
does. It feel like it’s a missing test from
…55dzwdz, thank you for explaining. Yeah, there's the cache invalidation cause, and we've mostly focused on comments count because it's really noticeable: it's a very glaring issue if a story says it has zero comments and you click in and there's one, especially if the timestamp is, you know, "this comment was posted an hour ago". That feels really bad; it's very obviously bad. But the story hotness also... I didn't believe it, but it really gets to me, because I want the home page to be dynamic and reflect people's contributions, and that's a very deep motivation. So "increments correctly", that's all fine, we don't need that performance.
02:43:49I could almost keep this spec, and I say almost because it doesn't really get me anything... it does express an invariant, and leaves open the door for "can we improve this in the future?" But the thing that it's missing is we don't have two more layers of expects: we expect story hotness to change, we expect story comments count to change. And that's why this spec was insufficient. So, no, I don't think I wanna rescue this. OK. So I'm just gonna revert this entirely. What is the commit?
02:44:31All right,
I almost never run git revert. I usually end up keeping things.
arh68 tactical revision SeriousSloth
Oh I’m not allowed to avert a merge commit.
Now, I wanna actually see what dash M does because
you can’t ever heard to merge because you don’t
know which side should be the main line.
Oh I see.
Tactical revision.
Yeah,
there’s that question of
if the build fails or rather the build passes and
you deploy a site and it starts failing in production.
Do you try to roll back or do you try and roll forward?
And especially after working at Stripe
on a enormous code base with huge volume.
My answer is roll forwards
because
it’s not just about scale.
It’s a be honest with yourself. Now you have,
you have possibly invalid data in your database. You have
all the customers who have that experience. You are rolling forward period.
You cannot really
roll back all the things that happened.
arh68 can't roll back the clock
It is almost
never the correct thing to do to toss
the
the users work in that way.
So
56
can’t roll back the clock. Yeah,
I mean,
I almost don’t wanna say that because it’s a little bit reductive
of Yes, of course. No one is saying that we have to be pure in some mathematical sense.
So let’s
trace these little ASCII arts.
I want this side of the merge the 354 side.
Yeah,
I think this is literally the first time I've run git revert on a merge commit.
Oh, cool... doesn't have the parent? I didn't follow the ASCII art.
So four F six
has the parents
972
and 354. Am I reading this wrong? That’s
let’s take a look.
02:47:043 54 77, whatever
D
commit doesn’t have the parent?
Yeah. Yes, it does. You just told me the parents
I trusted you
who has run a git
revert on a merge because
dzwdz lol
I am puzzled.
Why can I not?
…41dzwdz guess we're keeping #1308 in
Because the
this 972 side, that’s,
that’s definitely
that last commit on the branch. Yeah.
Guess we’re keeping 1308 in. Yeah.
Yeah.
Git itself has an opinion on what I commit.
All right. Well, when all else fails, read the docs. Like, what are you
learning?
"Reverting a merge commit declares you will never want the tree changes brought in by the merge.
As a result, later merges will..." oh, actually, maybe this is bad.
"Later merges will only bring in tree changes introduced by
commits that are not ancestors of the previously reverted
merge.
This may or may not be what you want." I hate,
I hate this sentence. I have seen
versions of this sentence. I have written versions of this sentence and I always
revert it
or revise it because
it doesn’t express anything. People always want different things.
You need to give me the criteria for deciding.
And I realize some of that is in these first two sentences:
arh68 "it's just a DAG" lol this is the reality
reverting a merge commit declares you'll never want the tree changes.
Maybe I just want to revert the individual commits,
but I thought I landed a couple of changes.
It’s just a dag. Yeah. See the revert faulty merge. How to,
is that a link down here? Yeah. All right. Let’s,
02:49:14let’s grab kit. Oh Yeah, I can’t just,
…24you gave me a URL that didn’t work there. Oh my God. No. So this is this dock of how to revert a faulty merge is just like, well, you’re in trouble. Here is a mailing list message of Linus Pio and someone else just griffing.
02:50:06I am tempted to just grab the diff of the merge... would it still exist? Yes, that's fine, I wanna see it. Oh: "future merges will see that merge as the last shared state." So revert undoes the data, but it's very much not an undo in the sense that it doesn't undo the fact.
…34Wait, why are we reverting a revert?
…55So I think it’s warning that if I’m just doing a textual revert,
it will still think those commits are merged to master.
And if a later merge comes along and says,
OK, we’re going to
rescue those two commits off the branch and add one more git will go cool.
Oh I already merged those first two. I’m not gonna do it again.
So I think this is one of those. It is a problem in theory but not in practice
where
in theory, if I was going to reopen the 1308 branch
and add another commit or two on top to fix it in some way,
it would get confused about what to actually apply.
But in practice,
we’re not gonna reopen that branch, it can just be dead forever.
We can redo the changes in a new branch. And as long as byby42 knows that...
and given that
byby42
is,
I think, a Rails contributor,
they must have seen,
you must have seen stuff like this before.
So we’re just gonna go ahead and
revert via diff, I think.
Yeah.
byby42 It's definitely no worries.
So if I say,
let’s show this,
let’s
02:52:32diff this against its head, right?
…42Yeah, let’s say no ex diff.
…53And then,
02:53:00So if I ran this diff like this, this is the inverse of the merge, because you can see it's deleting the 255 placeholder, which was the thing I added at the last moment, and then inserting the old code. And I'm just kind of rereading to make very sure I'm about to do what I think I'm about to do. Oh, I am a little bit missing this .bytes that did make for a clearer test, but I would rather do that as a second commit. Because, yeah, so if I take this diff and then I apply it...
…42one or two hunks failed,
probably because we touched comments since then.
dpk0 uh, since (afaict) you didn’t push the merge, why not just reset your HEAD to the pre-merge commit?
Yeah.
So let’s go look at the file
dpk0 or maybe you did push and i just don’t see it
and
now, assign_initial_confidence.
That's fine. But the
record_initial...
dpk0, the problem is I did push the merge, like,
10 days ago.
dpk0 aaaah
So if we look at the git
tree,
I would scroll down a bit.
So 1308 was... who can spot it?
Was... no, page down.
02:54:41Yeah, so it was 10 days ago. Why, why is it not jumping out at me? 1310, precompute... it didn't end up with the name in the title. So it was merged 10 days ago and it was live for 10 days. And that's why there's all the bug reports, because we've had 10 days of comments being in the wrong place. So I believe that's what that was. Let's double check the diff. Yeah, we just always did it.
02:55:30What was the comment? I wanna put it back exactly. Trigger DB recalculation.
…42OK. And then how do I finish? Oh, it's just that the patch failed, so the repo isn't in a weird state, so I can just remove the failed patch stuff. And if I look at this diff, this is now the inverse of the merge. Yes.
02:56:10So let's add all that. Oh, and I can't commit because I'm in a... huh? I thought I allowed it. Nope. So here, let's just create it here. Let's revert... I want that hash. So 4567.
02:57:00Yeah. It feels like such a shame to lose that. I'm gonna grab the one thing out of here that I did really like, the bitpacking part, where down here I'll go ahead and say .bytes. That's bytes.
…39Yeah.
…50So if I grab these spec changes, yeah, it was just so much nicer to express, especially to be able to say, like, I'm expecting this to be the ID, not some random hex value.
02:58:10And in this one, it's gonna be c.id. It's... that needs parentheses.
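If I sketch the shape of that spec tweak from memory, it's roughly this. Not the committed spec: the factory call, the recalculate call, and the low-byte math are my assumptions.

# Hedged sketch: assert on the raw bytes of confidence_order and on the
# comment's id directly, instead of matching an opaque hex string.
c = create(:comment, score: 1, flags: 0)
c.update_score_and_recalculate!(0, 0) # assumed recalculation entry point
c.reload
expect(c.confidence_order.bytes.last).to eq(c.id % 256) # id's low byte, assumed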
…22And with that,
…34then let standardrb take care of the spacing. No? That's weird. All right. And then, can I run this full test, as long as everything is green? One of my... "this is the placeholder on creation." Yes, let's close that. Did I not revert correctly?
02:59:13Oh, it's because I am inserting the placeholder, but then the after_create to record_initial_upvote is firing before it comes back to this spec. So how did this spec work?
…41I would think that this spec was failing. All right. Hold on, losing track of where I am. Let's back everything out, and git shows no changes. Does the spec pass? No.
03:00:10All right. Let's put everything back to the nice way now. So why did this run? Did I change the behavior of how record_initial_upvote works, with code on top... So this happens, and then
…36this... this is left over. That's what's happening here. OK. Yeah, I'm gonna have to do a rebase here, because that was, that was dead code that shouldn't have been committed. Let's, let's see the whole suite pass.
So that implies that I accidentally left that in,
or I accidentally committed the update_column code.
I must not have cleaned up my repo before I ran that patch for revert,
which means the previous commit has
dzwdz rebase?
a line or two extra that I wanna commit. All right. So if the build is green,
standardrb.
Rebase, yeah, I am gonna do an interactive rebase to fix that.
I don’t know why standard is only now catching those things
and I’m not seeing the
test change.
…55I am in a weird state.
03:02:10Yeah. This one came back. I committed too much code. All right. Let’s go ahead and add this.
…24Let’s look at the head. So this did accidentally include that
…43and did accidentally have this new test, which I am gonna leave, because that's a good new test of the behavior. I don't want to leave it in the revert. Oh, that's tacky. So, so let's go ahead and say this is gonna be a fixup for the revert. And
03:03:20this is one of those times where I'm seriously tempted to spend a couple of days playing with Jujutsu. It has a very different model and user interface than Git does. And I feel like I would have tripped myself up a little bit less here with that.
…45So let's... this one. No, no. Oh, this is all standard. I don't know why standard didn't run on those. So this is the,
03:04:15we reference that, and then these are just standardrb firing. So,
…30so now I wanna jump back to fix this. So we will go ahead and say throw this code away, and the... I am sure that I could edit this, but the fastest way I know is to do a rebase to drop it and then make a new commit. So we're gonna do that
03:05:05and then I’m gonna put it back and,
…20and what was this bug? 1313. Yep. All right. So with that done, it’s good. That way I’m gonna need the width,
…48let's do some rebasing. So the rebase is happening in this bottom pane, if you can see that. So there's the revert and we're gonna fix it up. Which is... no, not actually fixup. Squash? Yeah. Or... fixup is the one that merges it in, merges the commits, and then drops the message. I don't know why that's in there, but that's fine. Oh, got a comment.
03:06:35Why do I have a conflict?
…42Yep.
03:07:04I’m just gonna, just gonna be slamming my hand in a door over and over by adding and removing this code because it’s gonna merge conflict every time I thought if I did it last, I wouldn’t. But,
…36all right. So now
…48where’s my spec?
…58I think I managed to screw up the rebase and dropped that spec on the floor. I think this is an empty commit. Oh, no. OK, it's there. "Don't sort when they haven't been voted on." What is this create in the middle doing? I must have, I must have... I didn't recognize it because I didn't go a, b, c. I must have accidentally edited it along the way. The one hassle with vim is that random typos can just do substantial edits to things.
03:08:39Why did I do that? Sort of just checking them in?
03:09:01arh68 does it complain like go if ya don't use b
arh68 idk otherwise
All right.
All right. So
here’s a whole pile of commits. We’re gonna look at them to make sure I’m actually
committing what I think I’m committing.
So this is the revert of the merge and there is not the extra line of code.
There is the tests,
changes back to what the specs were before the merge. That’s all fine.
Yep. That was a new spec. That’s all good. All right.
And then this one is just the improvement to the spec
and this one is just standardrb. That
should not even be firing, because I haven't touched these specs in ages.
Maybe they added a different rule.
All right.
…52So I expect to have a green build here. And one of the ways I am gonna be a little bit sloppy is I'm not gonna make sure that each of those different commits has a green build, especially because I assume that the standardrb commit is breaking the build on the previous ones. All right. So if that's all green, that's good enough for me. Let's push all that up. And then with that pushed,
03:10:28we’re gonna, I’m gonna bring in my personal browser, which is over here. There we go. So all of this was to investigate this bug
…50and here’s people calling it out. Yeah,
03:11:02let’s name that commit. It is.
…14No... OK. So
03:12:02yeah, and I’m just gonna predict because I know what URL works here even though it won’t have a title on it. And this is a broken link. But when I end this stream, this will be live in a couple of minutes.
…53dzwdz fyi i'm writing a quick and dirty python script to see how ordering the comments by score would change the order
dzwdz (although i think that even if the order changes it won't be a big deal)
dzwdz, for
dzwdz i mean i'm just curious
you: I don't think you have to write a quick and dirty Python script.
Give me a second to finish this commit message and
or if you dig around in the git history,
I’m pretty sure, I committed my script for
dumping all of those out
and so you could just
have a running start on that is what I’m saying.
03:14:07New paragraph.
…54I'm trying to think of the right way to say this. It's not that it... yeah, it's just a little harsh to say it caused a bunch of problems.
03:15:31Mm.
What's the, what's the old name for... it's not like the command pattern, but on a service object?
03:16:18Lifecycle callbacks.
…52Ok.
…59Yeah, I think that’s the best way I can describe that. So with that done, I’m gonna go ahead and start a deploy
03:17:20dr3ig you said at the start of stream you were planning to rework stories and comments models; do you know how you want to go about it, more or less?
and
and
dzwdz, let me see if I can find that code for you while the deploy runs.
"You said at the start of the stream you were planning to rework the story and comments models.
Do you know how you want to go about it, more or less?"
The...
what's different here?
Dang. It
did? I really...
right. I think I
pushed a broken build,
hang on,
or this git is just out of date.
Hold on. Let’s close that
you open it.
You still say there’s a change?
Oh, I’m in the,
how did I not
dzwdz i love how i only realize chat has broken for me once you start interacting with someone else
get that in there?
Oh, it’s
dzwdz i get dropped for a while, then i "reconnect"
all right.
All right. So the commit history was correct.
Well, I must have like rewritten a local buffer or something.
Oh boy.
Oh, dzwdz, I'm sorry that you're having that, that chat issue.
That sounds really frustrating.
dzwdz i should just connect to here from weechat
Yeah.
So,
so the specific
reworking I wanted to do, oh,
let's get Ansible running again, because it's not actually
broken
Connect from weechat.
Yeah.
03:19:06So dr3ig... dr3ig, perfect timing for this question, because as I'm wrapping up with a deploy of the work that got done here on stream, it is a great time for anybody else who has questions or comments or queries they would want to run on the production database. Now is a great, great time to ask them, because I can drop all of the code out of my head and we can go ahead and work on office hours kind of stuff. So as you look at the story model, the way merged stories work fundamentally is that the stories table has a column called merged into story id. So if a story has any stories merged into it, those are just other story records. And so the problem is that the idea of being a merged story is whether or not anything appears in this merged into story id column. And it is an infinite source of bugs, where almost always what we want to be printing is the title of the merge target and not the title of the story that the comment is on, but it's much more natural to write that, of course. And I think what needs to happen here is a refactor to separate out a story object that has a title, a short id, probably votes, and most of these associations, and everything else moves over to a something-else object. Call it... maybe, I just used Link for another table, but call it like a URL, because a story is a URL and/or a text. And then all of those have a foreign key to story, and a story always has at least one. And so then it's breaking the idea that a merged story is special structurally. And so every story that is just one link, that common case, you know, the 95% case, would still have a story record and then one of these individual objects. And I wanna say I left a comment talking about this a bit in the issues fairly recently.
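To sketch the shape I mean, hedged: these class layouts are stand-ins, not a migration plan, and Submission is just a placeholder name since Link is already taken.

class Story < ApplicationRecord
  # keeps the identity-ish stuff: title, short_id, votes, comments, tags
  has_many :submissions # every story has at least one
  has_many :comments
end

class Submission < ApplicationRecord # placeholder name, not a real model
  belongs_to :story # merging a story becomes re-pointing this foreign key
  # url, text, submitter, timestamps live here
end

Then being merged stops being a special column on stories and becomes "this link now belongs to that story," which is the structural change that should cut off the printing-the-wrong-title class of bugs.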
03:21:53Yeah, this is
so as you can see by the fact that there are immediately three things just
this database structure
pushcx https://github.com/lobsters/lob
infinitely
generates bugs.
Yeah.
So here’s the comment I was thinking of. So if I copy the link, I can drop that to you
and it is a slightly shorter way of saying what I said
slightly better.
But Drake, does that make sense of what I’m trying to get at is
the database model needs to, to better reflect
story merging, as opposed to it sort of being bolted on.
And this idea of does a story have multiple links or texts merged into it?
It’s kind of an endless source of bugs.
dr3ig yes, makes sense, thanks
The other thing that could happen is
that would prompt remodeling the story page a bit.
OK, good. I’m glad that makes sense. The other thing is
what was the title of the one that I went to that had all the merges.
It was something about xz.
There it is.
So we have this view of story merging
and it has
two issues with it.
Number one is
when you look at this on the home page
or anything that displays stories, you see this line
and you see this line and you see no indication
that
eight or 10 stories have been merged into it.
It would be nice to add, you know,
something that counts the number of links into it in that list detail.
So the other way to say it is
there is no way to look at this and know if any of
these stories currently on the home page have things merged into them.
Oh,
and I’m going to have to merge these two stories because that’s a dupe
topical
and another one of these Rust in Linux things.
Let's see if that's been a week...
works. Is this a Mastodon?
All right.
At least it's not a Mastodon post.
So
it would be good if this indicated that there were more stories merged in
and then in here
as comments get merged in,
they get an icon to show that they were merged.
Oh Boy. And it’s gonna be hard to find any.
Here we go.
So this one
was not on
the primary link.
I think it would be really good if we had a section heading in here, that was
the alternate link that this is specifically responding to.
And so a story would have multiple links, and then each of the alternate links,
each link would have
a heading in here very similar to its list detail.
So that that was clear, but then all of these appear on one page.
And so we keep our discussion together and we
cut down on the amount of rehashing we do.
There are also some other nice things like
/active could show
the latest link merged into things.
A frustration with this, aside from it being a big block, is
people who submit responses, especially when they are the author of the responses,
are frustrated by story merging, because they end up feeling that
their link was denied the attention
it deserved, or maybe even, to be more charitable,
was denied the chance at attention that it deserved
by being a top level story because again,
only this first one
appears for story merging.
And if the database structure
better reflects that a story is multiple links,
we can do things like
have a page that
shows the latest submitted links regardless of what story they’re merged into. Or
I hesitate to offer an option
and I definitely won’t offer an option to
never show story merging and split them out. Because that breaks
the value to discussions
of
not repeating stuff.
But it would be very nice
to maybe have an option that
lets you see the latest link on a story instead of just the newest.
Because then readers would see
a constantly rotating list of links on the home page
where you would see the last one down here.
I don’t know that might not be a good option. But
once the database model is cleaned up,
we can kind of kick the tires on it and design
a few different things because the database supports it much nicer.
So that’s something I’ve been chewing on for.
Oh God, maybe,
maybe about 15 minutes after I became the admin and started
having to fix all of the bugs related to story merging
and I tease a little but
it really does generate
a new bug. Hm. Maybe
two or three bugs a year. And I realize in a commercial setting
where someone’s getting paid to work on this 40 hours a week,
that’s not a hell of a lot,
but for us as all volunteer with a fairly low pace of change, that’s,
dr3ig would the comments belong_to a link then instead of a story ?
dzwdz unless i messed up something in the script, only 4 comments in the largest thread moved: https://pastebin.com/kdANS5eU
that’s a lot of bugs.
So that’s why that’s hanging around.
And
dzwdz (diff between old and new order)
if you are asking because you are thinking of doing it,
I would love to work with you on that.
03:27:59Unless you messed something up in your script, only four comments in the largest thread moved. Oh, I was gonna look around and see if I still had that script in the repo.
03:28:18Something about threading... no. Is it in extras? It's possible I just did it and threw the script away. But I thought I committed it. I may not have kept it. I would have to dig back and find the confidence_order commit. Only four comments in the largest thread moved.
03:29:05dzwdz for the record i'm assuming that the json output is in the same order as the comments on the site
dr3ig no, i was just curious; it sounds too scary to take on; i'd love to watch you stream it though :)
dzwdz which hopefully isn't an incorrect assumption
dzwdz yo free giftcards
Remember this stuff is
oh I see. So you wrote a script that went and hit production
and then you took the current order and
replaced it with just the score.
That’s clever.
dzwdz one :)
dzwdz, how many threads did you run this on?
And yes, the
JSON output is in the same order as the comments on the site.
Just the one thread.
Yeah.
…46So I see this moves like
dzwdz i'll see the xz one, i bet that one had more flags
three or four threads around.
Oh This is the homepage review thread, isn’t it?
Yeah.
Yeah, the XZ one probably had more flags.
Anytime something real spicy happens, there’s more flagging.
OK?
The other thing is
this is also a judgment call of
is the ordering better
because the ordering is different,
but it is not necessarily worse. Not every change is bad,
and just doing score directly
might do a better
dzwdz i don't think it matters on the scale of a small comment thread
job of shoving down the things that have been flagged.
It might not, I I honestly don’t know
dzwdz tbh
or maybe something simple like
like changing the weighting of flags, so they're minus two instead of minus one.
Yeah, I don’t know.
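Just to make that concrete, purely hypothetically; this is not how the site scores anything today:

# Hypothetical heavier weighting: each flag costs two points instead of one.
def weighted_score(upvotes, flags)
  upvotes - 2 * flags
end

weighted_score(5, 3) # => -1, where plain upvotes-minus-flags gives 2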
And
in the scale of a small comment thread,
It almost certainly doesn’t matter because if there’s three or four comments,
you just see them all. It doesn’t really
change anything. If they’re in a slightly different order,
dzwdz er, i did not mean to put the "small" in there
if you want to see a really big change,
there is one feature I would like to steal from hacker news.
It’s
a, a
dzwdz overall i meant that comment threads are small enough for this not to matter
big change
to
one of the things that they do
is on very new stories
and it might be
one hour, it might be
the first day or so.
"Overall, comment threads are small enough for this not to matter."
Probably. Roughly.
So we score comments where we just try and put the highest-voted thing first, and
a downside of that is a bunch of people click into the comments,
they read a couple of comments and then they leave,
but also they interact with a couple of comments.
So there’s a rich get richer effect where whatever
is upvoted early on gets all of the attention.
So hacker news does a clever thing where if you watch these time stamps,
you can’t see the scores,
but they show
threads where the top level comment,
especially this first one is actually a very recent comment
and then there is the most upvoted comment
dzwdz oh, cool
gsvolt i'm traditional - always prefer chronological comments - regardless of how many votes
and then there is the second newest comment
and then the second most upvoted comment, and they kind of weave them together.
So that
comments that
arh68 oh does lobsters not do that already ?
are brand new show up interspersed in the top
couple of comments so that they get some attention
and they’re doing,
they always have very low vote counts
and no, arh68, we do not do that already. That is,
I think
HN only started doing it
five or six years ago. At least that’s when it came to my
awareness.
I picked up on what it was doing.
But Lobsters has never done anything like that.
And that is a feature I would love to steal.
So if you wanna add one more complexity on to
doing something with confidence order,
that would be pretty neat.
And again,
it is possible that this is just me having an old attitude of,
I wanna shove all of this stuff into the database and I wanna get things
I want to make the database do the sorting and
I wanna loop it exactly once on the front end.
And it’s possible it’s actually totally fine to do
something like,
well, pull it in confidence order from the database,
but then walk it once to find all the top level comments sorted by time
and then maintain a second list.
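Something like this is the shape of that idea, very much hedged: the depth and score attributes and the thresholds are stand-ins, and it hand-waves where the subthreads land.

# Hypothetical front-end interleaving: the database still returns comments in
# confidence order; in Ruby we pick out the newest, barely-voted top-level
# comments and weave them between the best-ranked top-level comments.
def interleaved_top_levels(comments, fresh_count: 3)
  top_levels = comments.select { |c| c.depth.zero? } # already confidence-sorted
  fresh = top_levels.select { |c| c.score <= 1 }
                    .max_by(fresh_count, &:created_at) # newest few, barely voted
  ranked = top_levels - fresh

  ranked.zip(fresh).flatten.compact # best, newest, next best, next newest, ...
end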
Yeah, gsvolt. Well,
we actually
as a, a point of opinion,
don’t allow you to just sort comments by newest
and always present a tree.
I don’t know,
features, features everywhere.
There are lots and lots of things we can continue tinkering with.
So,
I hope
that was a,
arh68 can I ask a couple random questions
a fairly interesting stream of deep diving on
a performance improvement that unfortunately turned out to be
a little buggy
and
a lot of why does the site do some of the things the way it does and
most of it is
arh68 when you were lookin at that merged story, how do those avatars show? do i have to turn those on
what leads to good discussions and then a small amount of it is
let’s just
tune the hell out of performance because
you know, some amount of that is fun and some of that is very valuable.
How do those avatars show?
Oh, you mean these merge icons?
arh68 nah like USer avatars
I think there might be an option
on the settings page to
hide these on comments.
I don’t remember if that’s behind a flag.
Oh The user avatars. Yeah, that’s on the settings page.
You can turn those off. If you don’t like seeing those, you wanna save the data or
you are using Lynx.
arh68 oh i guess i never turned that on
It’s funny a couple of the options we have on the site
like hide avatars are
very, very specifically,
we have a bunch of graybeard folks with a bunch of very strong opinions of,
"I'm not sure about these images on my website."
I
get a kick out of that.
All right.
dzwdz wait does comments[].score include the flags
So I’m gonna go ahead and unless there are any last minute questions,
I’m gonna go ahead and end the stream because we’ve covered a lot of ground.
We wrote some code, we deployed some code
things are in a better space. Hopefully a bad bug is fixed.
dzwdz if so then my code is broken
Does the comment score include the flags? So score is
mm
excuse me, score is
defined as
uploads minus flags.
If so, then your code is broken.
arh68 when I pull up a never-seen-before thread, is it supposed to say 0 unread ? i'm ok with it, jsut strange
Hey, congrats, you wrote a bug too.
Everybody gets a bug. You get a bug. You get a bug
if you wanna play with that and revise, go ahead. When you
pull up a never-before-seen thread, is it supposed to say zero unread?
No,
it should say nothing.
That sounds like a bug.
It’s not nice to nerds.
dzwdz does that happen with comments on there?
Might snipe me with bugs that I wrote just before I end the stream. So on story show,
I put in an || 0.
Oh This is a bug.
Yeah, it should be a
let’s just
grab it. It
shouldn't be possible to see zero. You see, this if...
03:37:22this if is saying,
if there is an unread count
greater than zero include the unread at the top,
I believe you’re talking about the
I can’t see it because I’m not logged in.
But you’re talking about the display that appears like right
arh68 i see 28 comments, 0 unread on like m9bhxk. yes
here between the list detail and the threads where it says
there’s a little summary for logged in users and it says, you know,
306 comments, 14 unread.
Yeah.
arh68, that is a bug and I am not immediately spotting it.
We’re saying
dzwdz surprisingly more comments moved around once i fixed the flags counting twice
if the unread count is greater than zero, then we display
how many are unread.
Oh
It’s the s
03:38:13so I don't think it should say... I mean, at this point what we're displaying is zero unread.
…31So really what it’s saying is
all of the
dzwdz https://pastebin.com/1KvFs6Wv fixed on xz
so this is a database implementation detail.
So the way
that unread works is this read ribbon guy,
which is the time stamp of the last time you loaded the page.
And if
you haven't loaded the page before, you don't have a read ribbon.
And so it’s saying
there are none unread, which is not correct.
It just says you haven’t loaded the page before.
So this message is wrong.
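Roughly, in code terms; hedged, the model and column names here are my shorthand rather than the actual schema:

# A read ribbon records when this user last loaded this story. Comments newer
# than that are unread. No ribbon means they have never loaded the page, so
# printing "0 unread" is wrong: effectively everything is unread.
def unread_count(story, ribbon)
  return nil if ribbon.nil? # never visited: don't claim zero
  story.comments.where("created_at > ?", ribbon.updated_at).count
end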
So really, we wanna say
we wanna have one if... or: if there is a ribbon,
we’re gonna do something
and if there isn’t a ribbon,
we’ll just say all unread because
and it doesn’t have to
dzwdz actually i think it's still wrong. whatever
be red or highlighted, because none of... they're not gonna have the
little red links on the comments, because there isn't a read ribbon.
And then
03:39:50Yeah, dzwdz, it is really hard to live code and iterate on stuff and maintain a conversation. I end up writing significantly buggier code, and I think you are enjoying that same experience. That's OK. I appreciate that you're trying to hack this stuff out live. It's a lot of fun to do it interactively. So thank you. So, if there is a ribbon
03:40:26and then
…32if that ribbon... and we can't... this is where I need the || 0, because it's possible this is nil. No, it's not possible that this is nil anymore. So now, if it's greater than zero, print that; otherwise we print zero unread. Yeah, this is OK. And that's... now we refactor.
03:41:15so we'll just grab that, and we'll say if there isn't anything, you get all unread.
…33it doesn’t
just kind of flatten this out. I really don't wanna have a, a nested if,
and I mean, this,
does that look correct
where we’re saying,
arh68 missing trailing %>
gsvolt eh, elsif end tag needs a percent symbol I think
let’s try and grab their value if they don’t have one, they are all unread.
This is also gonna fire
if the user is logged in
or if there is no user.
arh68 looks good otherwise SeemsGood
Yeah, I wasn’t thinking about the syntax just yet but I see it.
So I think I wanna say have a separate one here where I say
if not
user
just do nothing.
Oh, I already have an if user up here. OK.
So we’re not gonna have to worry about that branch
03:42:35OK. And
…47dzwdz or no, i think it's the correct, i just got confused by the diff
all right. So
…54if we have a user and there are any comments,
then you get a summary with the number of comments.
And if you don’t have a ribbon,
we say that they’re all unread.
And if the number is greater than zero, we do the nice clickable red guy.
And otherwise we say zero
and the only difference between these two is the styling
and I would rather just express it like this.
Why do I have this raw?
I can get rid of that right
at this point. It's refactored to the point that I can just say
this.
Yeah, that is a weird use of raw. That was never necessary.
Let’s be a little safer, I guess.
No,
no XSS injection in your unread count.
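So the summary line ends up shaped roughly like this; paraphrased, not the committed ERB, and the instance variables and the anchor are illustrative:

<% if @user && @comments.any? %>
  <%= @comments.size %> comments,
  <% if @ribbon.nil? %>
    all unread
  <% elsif @unread_count > 0 %>
    <%= link_to "#{@unread_count} unread", "#unread" %>
  <% else %>
    0 unread
  <% end %>
<% end %>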
dzwdz the current way comments are sorted causes weird stuff in the xz thread - there's a -10 deleted comment that still manages to be above 8 normal comments
I
like that. Let’s see if the specs pass.
Either I have a typo and half the specs are gonna
blow up or I don’t and they’re all gonna be green
because I don’t think I have a spec for the
summary.
03:44:15The current way comments are sorted
causes weird stuff.
There's a minus 10 deleted comment that still manages to be above eight normal comments.
dzwdz, that sounds like a bug,
if, if you’re saying
dzwdz yup
is the minus 10 comment that's deleted.
Is that a top level reply? I think I remember deleting that one
if it’s
oh
it’s still above other comments. That’s,
that’s just straight up a bug.
That’s,
I mean, I’m
…56come here
since I think, since I’m logged out in this browser,
I may not be able to even see the little tombstone.
I don’t remember
"Comment removed by author,"
removed by author.
dzwdz yup
Are you referring to this
comment here?
Yeah, logged out. So I’m not,
I’m not pulling up
my logged in user view because
this stuff gets decorated with moderator only information like I
can see the names of people who have flagged comments.
So I am not pulling that up on stream
and now I’m seeing the logged out view that doesn’t include stuff like
dzwdz i don't see the flag count either but i got it in the json
that’s
a
dzwdz hmm
boy that really feels like a bug. There is no way that the confidence
dzwdz er, the score
on
a,
you don’t see the flag count either, but got it in the
JSON. That’s
probably supposed to be there in the
JSON.
Yeah, that’s fine.
So this, this feels very much like a bug.
Like we’ve caught two bugs here at the end of the stream and I’m ok with
fixing this one on the stream, but that other one is complicated.
Who is this...
did someone hack my box and keep reverting code,
or am I just
goofing up,
that bad at reverting?
Ok.
So that’s fine.
03:47:06There we go. Yeah, that’s a good commit message. Ok. I’m gonna get that deployment started. Yeah. Ok. That’s the thing I don’t want.
…34arh68 cool. HahaCat that was quick
Yeah.
And
lets see.
…45Ok,
…56I’m gonna,
yeah, I’ll update that guy in a second.
This part is fine.
Yeah.
All right.
arh68, thank you for catching the bug and reporting it. It makes it
very easy to make those kinds of improvements.
That’s,
we really did a,
a deep dive on rails and database performance and all that kind of stuff.
So it’s really satisfying to, to have the happy path of,
oh, yeah, and rails,
you can just knock out a feature in like three lines of code and improve things.
Kind of funny. But that's the dichotomy; at least the easy stuff is generally easy.
All right.
So,
so dzwdz,
you’re seeing weird sorting in
a
live thread that is just
that is straight up a bug.
If a comment with 10 flags on it is scored above... I can kind of imagine.
I would have to look at that in the database and see,
you know,
did 50 people upvote it and only 10 flag it? No? Then it would be...
I don’t know what the net score on that comment was.
arh68 is there like a local dev mode where you can dump [x, y, z] into the markup
I don’t remember what the specific wording was.
Occasionally somebody posts something mean
but
insightful in a mean way, and people upvote it before
I end up seeing it to remove. But I’m not sure.
Is there a local dev mode where you can dump x, y, z into the markup? Yes.
arh68 i'm still kinda foggy on those confidence score things
I could just say something like
and then
Debug
prints out whatever that value is inside of pre tags.
It’s just a handy thing.
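For reference, that's the built-in ActionView debug helper; something like this in a view dumps the value inside a pre tag, and the variable here is just an example:

<%# `comment` is illustrative; any object works %>
<%= debug comment.confidence_order.bytes %>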
You can also drop into a debugger. I almost never use one but
some people swear by them,
03:49:59dzwdz wow automod hates me
eh, if you’re
dzwdz is debug a builtin rails thing?
Oh, how funny.
So dzwdz just tweaked
automod.
Yeah. So
dzwdz asked, is debug a built-in Rails thing?
And I guess that
automod has a, a dirty mind, because
dzwdz lmao
it said that that was sexual content.
I can kind of see how you get there. But
arh68 PogChamp
so
debug is a built-in Rails feature,
and then byebug is a gem for
an interactive debugger that almost everybody adds.
But yes, it’s a, a built in Rails feature
and automod is
very prudish.
Ok.
I’m gonna go spend a minute looking at that
comment with the terrible score. Let’s copy that link.
It’s funny, I have this whole thing of knocking stuff off stream,
but it is physically on the same monitor.
So
I’ve made a bunch of changes to the window manager.
So I can’t drag windows onto the stream accidentally, but I guess menus can overlap.
Yeah. So if I open like
a calculator,
I have to hit a hot key to say like, oh yeah,
this calculator can come on stream this floating window.
And if I say it can’t,
the window manager immediately shoves it off. It never gets one frame to render,
which is kind of handy. But
physically the
the offscreen browsers that are not being streamed are below
the streaming area and so their menus can overlap.
dzwdz ruby rails us all
So you saw that for a second. That’s kind of funny.
All right,
Ruby rails us all. Yeah, that’s
please do not taunt happy fun automod.
Eventually it will,
will bite you.
All right.
Take care folks
have a good one. I will see you back on next Monday at 2 p.m. as usual
arh68 cheers y'all HahaGingercat
because I will not be out in the forest
dzwdz this was an interesting bug
chamlis_ thanks!
and maybe, then we will pick up pagination.
Maybe arh68 will catch some more bugs or dzwdz will
prove that comments should be ordered very differently and we’ll do that instead,
but
we’ll see how it goes.
dzwdz the c_o_p one
chamlis_, thank you again for your suggestion of
inserting a random number. It was a good idea.
I’m glad we worked through it to figure out what was going on there.
It was a shame. It didn’t work out.
And,
yeah, the confidence ordering path was a weird one because,
ah, the code is so clever.
I get more and more suspicious of clever code the older I get. But,
you know, performance and our, our big primary thing is loading a story,
reading the comments on it.
So
that is the one path where I will do ridiculous stuff
to improve that performance
because
we serve a lot of those pages.
I mean, if you want to see the kinds of
depths I will sink to, you can look up heinous inline partial.
It occasionally gets mentioned because I put a big
warning for it that happens every time rails loads.
But
that is the sort of awful thing I’ll do for performance on the one hot path for the app.
All right.
Maybe one of these days I will explain it on stream and
dzwdz do you have any idea off the top of your head how you'd do the hn comment interleaving thing with c_o_p?
we’ll actually investigate it and someone will point out that it’s redundant
or I will footgun myself with it on stream and we'll have to talk about it, but
not today. At least.
No, I have no idea how to do the comment interleaving thing with confidence
order path
easily. That's,
that's part of why it hasn't been done yet,
and
it may just fundamentally be something that cannot work
with confidence order path or it may require.
The only thing that came to mind was it’s possible
that
you could layer on a union query
and you have a union where
you say
give me all the comments in confidence order path, union the ones,
the newest, you know,
call it the 10 newest top level comments that don’t have replies that don’t have
three votes on them
and then
order by and then do something brilliant in that order by.
That’s, that’s kind of my best guess, but I can’t write it in my head.
So I don’t really know what that’s gonna look like.
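If I try to write down the vague shape anyway, heavily hedged: the column names are guesses, the fresh rows would appear twice as written, and the ORDER BY is exactly the part I said I can't write.

# Hypothetical union sketch, not working code.
Comment.find_by_sql(<<~SQL)
  (SELECT c.*, 0 AS fresh
     FROM comments c
    WHERE c.story_id = 123)
  UNION ALL
  (SELECT c.*, 1 AS fresh
     FROM comments c
    WHERE c.story_id = 123
      AND c.parent_comment_id IS NULL  -- top level
      AND c.score < 3                  -- barely voted on
    ORDER BY c.created_at DESC
    LIMIT 10)
  ORDER BY fresh DESC, id  -- placeholder; the clever interleaving goes here
SQL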
I don’t know. Maybe there’s something there, maybe it points to
confidence, order path while clever is just
too clever and is not gonna work out
and we should take a fundamentally different approach.
I don’t know. We’ll see how it goes.
Always learning.
Thanks for hanging out, take care folks.