Maybe I shouldn't be using vim as my terminal multiplexer
Streamed
Bot traffic on Lobsters. Mobile layout fix. PR review of build task. SQLite migration debugging and todo list.
scratch
topics
PRs
sqlite https://github.com/lobsters/lobsters/pull/1871
add span https://github.com/lobsters/lobsters/pull/1900
build task https://github.com/lobsters/lobsters/pull/1897
issues
sqlite migration checklist:
- merge sqlite3 pr
- lobsters-deploy
- set the site in read-only mode (config/application.rb on prod)
- bounce puma
- be rails dump_db
- scp db to my local dev
- edit prod config/database.yml:
    primary:
      # <<: *trilogy
      # database: lobsters
      <<: *sqlite3
      database: db/primary.sqlite3
- prod: be rails db:create:primary
- prod: be rails db:schema:load:primary
- be rails load_db
- unset the site from read-only mode (config/application.rb on prod)
- bounce puma
post-migration cleanup:
- remove trilogy from Gemfile
- poweroff db01
- decommission db01 from digitalocean panel
- remove db01 settings from config/database.yml
topics
post-stream
swap nvmes
Transcripts are generated with whisperx, so they mistranscribe basically every username and technical term. They're OK but not great, advice appreciated.
Recording
01:41graefchen Hello limesHi
All right.
Oh, hey, graefchen.
We are struggling today.
So this, my browser is just sitting and spinning because prod is swamped.
That's very sad.
Getting set up for the Lobsters office hours stream here.
graefchen Weird Mic thing going on?
Site is struggling because once again we are getting swamped by dumb bots. Oh, there we go. That was, what, a 30-second page load? That was painful. Oh, let's create that again. All right, so looks like, yeah, load average is five and a half. Thank you for catching that early.
MutableHuman 👋
Yeah, I manage to miss that step every time I'm just barely getting a stream started on time or starting only five minutes late. Yeah, I always forget my little checklist. Hey, MutableHuman. All right, so let's see where I'm at here. Get my everything set up.
The last stream ended after about 10 minutes abruptly because my desktop crashed.
I am suffering from some kind of hardware failure.
It's really painful.
It's been going on for a minute.
TD Thomas, who is a stream regular here, gave me some debugging suggestions, just reminding me of stuff I should have thought of, like running memtest86.
My RAM seems to be fine, which is good, given how RAM prices are.
The actual hard drive seems to be fine, which is good, because the biggest symptom is the whole desktop stalls for like, I don't know, 90 seconds to three minutes-ish.
I haven't timed it, and it's still interactive, but everything that's trying to read or write from the disk hangs, and, you know, everything is a file on Unix, so that's most everything.
And then the hard drive, the root drive, is remounted read-only, which is not very usable and doesn't leave me with a lot of logs, because, hey, the drive is read-only. But all the SMART stuff says zero errors, zero warnings, any of that, so I don't even think it's the drive.
MutableHuman is it a SATA drive or an NVME?
I think at this point it's got to be, like, the motherboard. NVMe.
It's got to be the motherboard or the processor, I guess, right?
There's just not that many moving parts.
04:47You know, I have two NVMe drives.
I could try swapping them.
That's an idea.
I'll put that on my to-do list.
Let's start my to-do list here.
So yeah, stream topics.
We've got to look at the PRs because there's been stuff.
Got to look at the issues.
MutableHuman motherboard would be my next guess but it would be weird for there to be something weird with PCI and only affect the drive
And then post-stream, I'll swap NVMes.
That's my debugging.
05:17Yeah.
I'm...
twitchtd hi! did you fix your computer?
kind of flailing with that one. All right, so let's see, where's... here we go. Just peeking at stuff off stream, the chat room got into talking about Gleam, and so I started two minutes late because it nerd sniped me.
TD, I was actually just talking about you, because I followed the debugging suggestions you gave, which I really appreciated, the reminder.
I ran memtest and it didn't find anything.
graefchen Gleam seems to be indeed a very very interesting language. limesNoted
I ran all the smart checks again, because that was the first thing I thought to try when this started.
But, you know, I want to look back.
Everything was green on all of those, which I think pushes it towards it's probably a motherboard thing. But it just occurred to me as I was talking through this that I have two NVMe drives and I could swap them, because if the problem suddenly goes away, maybe I've isolated it to one of the NVMe sockets, right?
The other drive, it's my old NVMe, and it has Windows on it so that I can dual boot to play games.
But there hasn't been a game on Windows that doesn't run on Linux for, I don't know, a year, something like that.
I have no files or anything.
It has Steam installed and those O&O ShutUp kind of programs.
Let's save this too.
Everybody knows those, right?
Oh, it hasn't come up on stream before.
07:21pushcx https://www.oo-software.com/en/…
It's a very funny thing.
They make two of these.
What's the other one?
Is it a product or a solution?
pushcx https://www.oo-software.com/en/…
This one.
This one's the other one I run.
MutableHuman product or solution lol
It is very funny to me that if you run Windows, you have to install random third-party stuff to yank out big parts of...
I pasted the same link twice.
pushcx https://www.oo-software.com/en/…
Did it not update the URL?
What just happened here?
That was a weird one.
All right, there we go.
MutableHuman there are a ton of Windows "decrapifier" scripts floating around
Yeah, handy little free utilities for...
you know, reclaiming a couple of gigabytes.
And yeah, there's a lot of those.
These O&O ones are the ones I've used.
Anyways, graefchen, yeah.
Gleam is neat.
The only thing I'm not sold on is the lack of any kind of traits or interfaces.
It's the one thing I felt is missing from Ruby.
And we've actually had, on Lobsters, a couple of really good discussions.
'Cause I want to say it's come up in two different threads that Gleam doesn't have interfaces or traits.
And the username is lpil.
And I don't remember his actual name.
Louis, maybe. The creator of the language has shown up in the comments to discuss why it doesn't have those things.
I'm putting words in his mouth, I'm trying to summarize, but I'm probably not going to get it right. It's something like: they see the amount of boilerplate that you have to write as fairly minimal, and the feature as just not worth the complexity and the damage it does to error messages. Which, I mean, I've done C++, I've done Haskell. I know what
those error messages look like when you have those features.
So I kind of get it.
I got to be sold on it is all I'm saying.
So more practice with gleam is coming.
All right.
So where are we?
Good.
Yeah.
And then how bad is prod?
Because do I need to?
All right.
Whatever that bot wave was.
So when I started the stream five minutes ago, prod had a load average of 6, and for prod's processor count that's bad news, right? But it's settled back down. It's at two and a half, which is, somebody is scraping us, but at least it's not so bad. There have been a lot of these bad bots the last couple of weeks. We had a big conversation about them in the chat room this morning, and
I am forgetting the username of who made the point, but it started with a T. We were talking about smart bots and dumb bots, and actually, it's only the dumb bots that are a problem, because the smart bots, they just do a better job of blending in with real traffic, so I don't notice them, and they tend to be...
They rate limit themselves, at least a little bit reasonably.
And so...
we don't notice the smart bots because they don't knock the site over. Meanwhile, the search bot... actually, here, that's a really useful description, so let me show you this. Over the weekend, Hunter and I worked on this Caddy filter for, I've been calling it the search bot or the search spam bot, something like that.
And it's dumb.
It thinks that it is hitting something where it can spam a URL to us.
And so if I look for, I gotta look at this off stream here real quick, just to make sure I don't put up personal info too fast.
So let's see if I, where's my grep?
Had this recently.
So what do we want to search for?
Do we have the parameters?
12:28Give me two of these. Oh, did the bot stop? Wow, that's weird. It's been going for weeks. So I thought I could pull up an instance of this in two seconds. So if I say this, and I say HTTPS. Yeah, OK. So I just want the URL field, jq -c URL. No. What are you called? I'm starting to get the hang of jq, which is a very terse language. So is it under a path or something here? URI, not URL. OK. No. Sorry for the . There we go. All right.
13:51So there's this bot that hits prod. Here's just like 20 of its hits. They're all real URLs off of a bunch of news sites. So I figure it's reading their RSS feeds. And then let me just take, I don't know, 10,000 of these. And if I grab the timestamp, which is... This is a Caddy log, so by default it's a Unix timestamp, just the number of seconds since 1970. And I sort, just because sometimes the log things get written slightly out of order at the second boundary. uniq -c, sort -rn, pipe to head -n 30. So we're just like, let's look at, actually, let's look at 100,000, right? We've got them. Oh, you're doing it wrong. I left out the grep. I've typed this command a few times lately. Zero to nine. So let's grab. I'm going to round all of these timestamps off to the nearest second. grep -o. So this is just the first 100,000 hits from it. But it's been hitting us, you know, typically 200 times a second. And this has been going on for a few weeks now. A very significant portion of all site traffic is this one bot that just fucking hammers the search engine with URLs. So like all of this nonsense, just over and over and over and over.
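For anyone following along without the shell pipeline on screen, here is a rough Ruby sketch of the same idea: filter the bot's search hits out of a Caddy JSON access log, truncate each timestamp to the whole second, and count hits per second. The sample lines, the `hits_per_second` name, and the search pattern are assumptions made up for illustration; real Caddy entries carry many more fields than `ts` and `request.uri`.

```ruby
require "json"

# Hypothetical sample of Caddy JSON access-log lines (heavily trimmed).
SAMPLE_LOG = <<~LOG
  {"ts":1770000000.12,"request":{"uri":"/search?q=https%3A%2F%2Fexample.com%2Fa"}}
  {"ts":1770000000.58,"request":{"uri":"/search?q=https%3A%2F%2Fexample.com%2Fb"}}
  {"ts":1770000001.03,"request":{"uri":"/search?q=https%3A%2F%2Fexample.org%2Fc"}}
  {"ts":1770000001.40,"request":{"uri":"/"}}
LOG

# Count hits whose URI matches `pattern`, grouped by whole second,
# busiest seconds first (ties broken by earlier timestamp).
def hits_per_second(lines, pattern)
  lines
    .map { |line| JSON.parse(line) }
    .select { |entry| entry.dig("request", "uri").to_s.match?(pattern) }
    .map { |entry| entry["ts"].floor } # drop the fractional second
    .tally
    .sort_by { |second, count| [-count, second] }
end

# Print the 30 busiest seconds as "count second", like `uniq -c | sort -rn`.
hits_per_second(SAMPLE_LOG.lines, %r{\A/search\?q=https}).first(30).each do |second, count|
  puts "#{count} #{second}"
end
```

Sorting before counting is not needed here because `tally` groups by value rather than by adjacency, which is the one place this differs from the `sort | uniq -c` approach.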
16:02twitchtd extreme idea, what's your thought of turning lobste.rs private and requiring login for viewing anything on the website?
graefchen 200x a second is very naughty and not nice. limesSweat
And you know it's a dumb bot because none of these URLs ever appear on the site.
...17And I have assumed that somewhere in these thousands of URLs it tries to spam us with is like, you know, the best plumber for Riga, Latvia, or
I mean, we've seen like garbage tier spam where it tends to be link building for local sketchy businesses.
MutableHuman huh weird
We've had travel agents or various adult services.
MutableHuman what would even be the point of this
'Cause once or twice we've had one of these garbage tier spammers pop an account on the site, and I've scrolled these, but like not actually seen, what are they spamming for?
I don't know.
I guess it's also possible, I don't know, it's too well resourced.
This thing has, part of the reason that it's blocked is, hold on, that's not how I want to say it.
17:40The thing that is especially weird here is this thing has at least tens of thousands of IPs that it hits from.
This would be absolutely trivial.
MutableHuman i wonder if it just scrapes the whole internet and stuffs anything with search box to pump search numbers
If this was coming from one or two IPs, I would say somebody had a weird-ass bot that was running wild.
But it has changed behavior a couple of times, like changed user agents.
MutableHuman with a search box*
Yeah, I don't know.
What's the search box for, right?
I think the bot is written to assume that if it stuffs things into a field, that might eventually show up on a report.
And I have assumed that somewhere in these URLs is something that a spammer would get paid money for.
You know, like every once in a while, instead of seeing CNN or Yahoo Finance, you see like data center dynamics.
I have never heard of data center dynamics.
Maybe that's what it's link building for.
I don't know.
So how many of these?
So let's see.
Let's grab caddy.log.
head -n 1 to jq -c .ts.
What is this timestamp?
What's the...
Anybody remember the command for date to convert from timestamp to human readable?
It's not date -d. Oh wait, I bet I have a...
I have a local alias for this, don't I?
No, I don't.
20:06MutableHuman date -d @timestamp
All right, let's grab a date -d with the ampersand, or the at sign.
That's what it is.
So date -d, where'd that go?
Because I want to say this is... I've been fighting with log rotation, so I don't remember off the top of my head if this is going to be one week.
All right, so this is, and date -u.
Okay, so this is eight minutes of caddy logs.
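The timestamp conversion from chat (date -d @timestamp, plus -u for UTC) has a close Ruby equivalent. This is only a small sketch; the `readable_utc` helper name and the output format are choices made for this example.

```ruby
# Convert a Unix timestamp (seconds since 1970-01-01 00:00:00 UTC) into a
# human-readable UTC string, roughly what `date -u -d @TIMESTAMP` shows.
def readable_utc(timestamp)
  Time.at(timestamp).utc.strftime("%Y-%m-%d %H:%M:%S UTC")
end

puts readable_utc(0)             # the epoch itself
puts readable_utc(1_000_000_000) # 2001-09-09 01:46:40 UTC
```

Caddy writes `ts` as a float with fractional seconds; `Time.at` accepts that as well, and the format string above simply drops the fraction.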
That doesn't seem right.
Did it really just rotate shortly before I started the stream?
I guess so.
This must have some kind of file limit.
February 1, February 8.
Okay, there are definitely...
I have been fighting with logrotate on this box.
The default logrotate config from Hatchbox is very aggressive, and it wants to throw away logs much faster than I actually use them.
And now there are two schedules running, clearly.
Fix.
All right, so if we look at caddy.log, how long are you?
31,000.
Okay, yeah, you really are only a few minutes old.
If we look at these, that is still not much traffic.
21:50All right, so if I said caddy.log, I typed... what was my grep? Let's bring it back. Yeah, so that's just this random chunk of log files: it's three quarters of all hits to the site.
22:32So if I said dot star up to the quote, no, that'll get me the last quote. If I said anything that's not a quote... Let me pull this off stream for one second, because that could conceivably kick out somebody's IP address, and it did. I wanted to say grep -o. Great. So there we go. If anybody wants tens of thousands of links to look through for what they're spamming, you can let me know.
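The "dot star up to the quote" problem is the usual greedy-match trap, and "anything that's not a quote" is the usual fix. A tiny Ruby illustration, using a made-up log fragment and hypothetical helper names:

```ruby
# A made-up fragment of a JSON log line with two quoted fields.
LINE = '{"uri":"/search?q=https://example.com/story","ua":"bot"}'

# Greedy: `.*"` runs all the way to the LAST double quote on the line,
# so it drags in the following fields too.
def greedy_url(line)
  line[/https:.*"/]
end

# Bounded: `[^"]*` stops at the first character that is a double quote,
# so only the URL itself is captured.
def bounded_url(line)
  line[/https:[^"]*/]
end

puts greedy_url(LINE)
puts bounded_url(LINE)
```

The over-matching in the greedy version is also how unrelated fields, such as an IP address, can end up in the output by accident.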
23:44All right.
So all that fun distraction aside, what the heck was I doing?
Oh, yeah.
So this was the thing we fixed over the weekend.
Let's go look at pull requests.
graefchen Sounds like a very fun weekend project indeed. limesD Checking what a bad bot is spamming.
There has been some activity on pull requests, and I'm woefully behind because I've been spending my time on spam bots and hardware crashes.
24:18Thomas, if you're still here, you want to give me the state of your pull request?
I see you have merge conflicts, but you touched it in the last week or so.
twitchtd the PR is ready to test
Oh, did you figure out the conversion?
Complete output.
Hex values, binaries, and blobs.
...57twitchtd I wrote a custom dump/load task
You know, looking at your... Oh, okay.
I was going to say one thing we could do is that is a calculated column.
We could just throw away that column and recreate it.
Your rake task generically dumps all the data and loads it into the... using these rake tasks.
Okay.
Populate the FTS tables.
25:46The whole prod database. Old non-streaming task left here. What's this one? GitHub... nice of GitHub to be up today for me. It was down for a chunk of the morning. That didn't end up affecting me, but I wondered.
26:15Wait, is this? Oh, you dump everything to YAML. That is not a format I think of when I want data integrity. I see why you did that. But the hair on the back of my neck just stood up of whether that's safe.
...48dzwdz i just opened the stream are we using yaml to move data between postgresql and sqlite
Hmm. All right, let me think about your PR for a second and run through these others, because they're all small. Let's update my topics list.
27:21pushcx https://github.com/lobsters/lob…
twitchtd dzwdz that was the idea, yes
DZ, you can take a look at the pull request here, but I think that's what the script does.
I'm going to, I'm chewing on what I just read.
So I'm going to jump into this other stuff.
dzwdz D:
Hello again, graefchen.
Adding a span around the status of the user.
So it's correctly displayed on both.
Oh, that sounds like a nice little mobile fix.
Sure.
Because otherwise it's loose and it gets, yeah.
28:01Great. Well, thanks. I'm not going to bother running the workflow. Let's just drop that in.
...19Thanks, graefchen.
twitchtd dzwdz do you have any reason not to?
So, Oh, this one, we've got a bug for this one.
Same way.
Yep.
Clear list toggle fed up has been doing a lot of, I don't want to lose the.
29:52I should be separated.
30:00graefchen I spotted it, it annoyed me, i fixed it. limesSit
Use an input checked P display box.
Well, thanks.
...16So if the input is checked, otherwise it contains load.
...35dzwdz not to go "D:"?
Why did this change?
Oh, missing semicolon.
...49All right. Is there anything else in the issue, or is this ready to go?
31:11Let's see, we'll check the island author.
...20dzwdz pretty much what peter said
Didn't think so.
32:23dzwdz mysql2sqlite was a dead end, right?
graefchen For data integrety you should have used XML limesSmart (at least that is what my prof would say, i guess)
dzwdz yes, of course
This is one of those like doing the small intervention before we do the big intervention, because the big intervention looks like something stricter.
Years ago, maybe, got to be five or six years ago, I tried to code up a self-promo feature where if one person had submitted, I think it was more than half of the links to a single domain, we kind of took that as a clue that they were doing self-promo there.
And I think that lasted maybe a week before the false positives became such a problem that I had to pull it out. But I've been thinking a lot about self-promo the last few months because it's been an increasing problem, and especially, I don't know if it's a cultural shift, but like when I talk to people about self-promo,
33:56fpsvogel 👋
Most of the time I get back a, oh, I'm sorry, I didn't realize the rule.
I'll follow it.
But increasingly I have gotten really entitled emails from people who are mad that they can't do self promo.
I got one just last night or this morning.
So it's in my mind, but it was someone who
34:26dzwdz i paid good money for this account!
Trying to think how to characterize this politely. They just really wanted to promote their blog, even though it's off topic, and
were very mad that they weren't able to continue posting their blog.
And it came out in a very odd way, but the only way the, like...
It's one of those emails that doesn't make sense, because it comes from such an odd headspace that it's like, why?
MutableHuman posting your blog is a human right!
I emailed about one thing and I got an email back about a different thing, and it was like, right, why are these related? And the answer is because they felt entitled to see Lobsters as a traffic source and they didn't like anything that got in the way of that, I guess. I don't know, it was a weird one.
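The old self-promo check described a few minutes earlier (one person submitting more than half of the links to a single domain) can be sketched roughly like this. It is not the code that was actually tried on the site; the `Submission` struct, the `self_promo_suspects` name, and the `min_stories` cutoff are all assumptions added for the example.

```ruby
# Hypothetical minimal record of who submitted a link to which domain.
Submission = Struct.new(:user, :domain)

# Return [domain, user] pairs where a single user accounts for more than
# half of a domain's submissions. `min_stories` is an assumed guard so a
# domain with one or two stories does not trip the check immediately.
def self_promo_suspects(submissions, min_stories: 4)
  submissions.group_by(&:domain).filter_map do |domain, subs|
    next if subs.size < min_stories

    user, count = subs.map(&:user).tally.max_by { |_, n| n }
    [domain, user] if count * 2 > subs.size
  end
end

sample = [
  Submission.new("alice", "blog.example"),
  Submission.new("alice", "blog.example"),
  Submission.new("alice", "blog.example"),
  Submission.new("bob",   "blog.example"),
  Submission.new("alice", "news.example"),
  Submission.new("bob",   "news.example"),
  Submission.new("carol", "news.example"),
  Submission.new("dave",  "news.example"),
]

p self_promo_suspects(sample)
```

A rule this simple flags anyone who is just the first fan of a niche site, which fits with the false positives mentioned above.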
35:25twitchtd dzwdz mysql2sqlite failed for pushcx and there's a bunch of open problems related to binary string handling
That's the build rake task.
Oh, FPS, hey, I'm just getting over to your pull request.
fpsvogel Oh hi
Bundler task is incompatible with newer versions of Bundler.
I have never heard of bundler-leak.
I have a little fruit fly.
...57graefchen Being proud of your work is okay. But not following the rules is ... weird. Even more if it is off-topic. limesHmm
Felipe, can you remind me what bundler-leak even does?
36:11fpsvogel It checks for memory leaks
Doesn't run specs and specs slow.
That seems reasonable.
fpsvogel against community reports
Specs slow is just there as a smoke test.
It fails so infrequently that I don't want
it in the developer workflow.
So this is a good call.
fpsvogel the problem is that it doesn't seem to be updated (the gem or the DB) in years
Oh, against community reports, yeah.
If that's stalled out, it seems like a good call to remove it.
37:22fpsvogel *memory leaks in gems
Oh, well, and it's nice to see TD in another pull request.
If bundler-leak is just dead, let's just delete it from the build.
That's not doing anything.
If it's been four years, so they're like an issue of no.
fpsvogel 👍 will do
Yeah.
All right.
This is.
This is an X parrot.
Why did the whole build fail on this?
Did I break stuff?
fpsvogel no it's because of the bundler downgrade
Any steps?
This kind of comment is great.
That's exactly what I was looking for.
38:22And that's fine.
There's the inverse comment, prevent Brakeman from exiting this entire task on Brakeman's completion.
Sure.
fpsvogel (I downgraded bundler in this PR in order to include bundler-leak in the bundle)
fpsvogel which breaks CI apparently
That's okay.
Include vulnerability.
Oh yeah.
Let's just, let's just get it out.
...58Great.
39:17Why did you delete my comment about RSpec or my description?
fpsvogel Just reordered
Did it move down?
...35Oh, did you put them in the order they run?
fpsvogel Yep
Standard, Brakeman, database consistency, and then specs.
Is this the order they are defined in check.yaml?
fpsvogel No, RSpec is first in CI I believe
As you're... yeah, could you put RSpec first? Because, yeah, the value of RSpec first is: does the code even work? I don't care if the code is well formatted if it doesn't work. And the same with, you know, database consistency and,
like, Brakeman: I would rather get its warning after the functionality works, because most of the time it's false alarms, especially when we're careful to use the ActiveRecord interfaces.
So like, I want RSpec first all the time.
41:00fpsvogel Gotcha. I put it last in local build because all the other steps run in a few seconds, while specs take a lot longer
fpsvogel But I can change it to match CI's order
Yeah, that makes a lot of sense, but specs are the valuable thing here.
Yeah, please.
All right, so if you want to do that and drop the leak stuff and the downgrade, if you have a minute to do that right now, I can come back and merge this PR in a couple of minutes.
But this is basically ready to go.
And I really appreciate you picking this up.
So let me leave a comment to that effect.
Did I push the start a review button?
I did.
Okay.
...47fpsvogel Cool, I'll push up in a few minutes
Let's get the right year.
See, I'm even typing the year correctly now.
Now that we're in February, I'm starting to get it.
Thanks, Felipe.
I appreciate it.
42:34So I'm going to just put this in my, I didn't put it in the notes.
Let's grab that.
veqqio heyy
Come here.
Build task.
I'm going to keep that tab open and come back in a minute.
Hey, veqq.
43:00I know that file exists. Oh, the swap file exists. Yeah. Cause the last stream crashed. All right. So do that again. Okay.
...19Temp files are no fun to clean up after these crashes. All right. So open graph.
...37Yes, we talked about this one a bit, maybe two streams ago. So Hugo has come back to add another feature. This is great. A sequence of Unicode characters, that's all right. Converts from Markdown to raw text. Honestly, I'd almost rather just give people Markdown, off the top of my head. Let's see how long the... 'cause Markdown is readable enough, especially given our user base. All right. So we have a description. We say raw about Markdown, or to raw. Oh, that's interesting. Ah, you take commonmarker. That's before first chunk.
45:09Yeah, so this throws away the URL of a link, which sure. Try to get the description of various lengths. Okay. Escape symbols cannot be removed. What is this?
...46complicated with lots of symbols. Aha.
46:30Is this duplicated? Yes. So if we're going to do this, we should lift these up to a constant.
...59Yeah, yeah, that's just.
47:22Then the rest of this is ready to go. Yeah. Yeah.
48:12fpsvogel Not quite ready
Yeah, the database consistency. This one was failing on main for a couple of days, but it's done now, so that'll be ready to go soon. Kind of come back to that. All right, Felipe, I'm coming back to your PR, or did I beat you to it? Okay, not super fast. Let's go look at...
I'm not trying to rush you.
Let me think about this.
...55Topologically sort by foreign key dependency. That makes sense. Records in groups of a thousand. Records in groups of a thousand.
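The two ideas read out here, loading tables in foreign-key order and moving records a thousand at a time, look roughly like this in Ruby. This is a sketch and not the task under review: the table names and the `DEPENDENCIES` map are invented, and it leans on Ruby's standard `tsort` library instead of whatever the rake task does internally.

```ruby
require "tsort"

# Hypothetical map of table => tables it references through foreign keys.
DEPENDENCIES = {
  "users"    => [],
  "stories"  => ["users"],
  "comments" => ["users", "stories"],
  "votes"    => ["users", "stories", "comments"],
}.freeze

# Topologically sort so every table comes after the tables it depends on.
def load_order(deps)
  each_node  = ->(&block) { deps.each_key(&block) }
  each_child = ->(table, &block) { deps.fetch(table).each(&block) }
  TSort.tsort(each_node, each_child)
end

# Split rows into fixed-size groups so they are never all handled at once.
def batches(rows, size = 1000)
  rows.each_slice(size).to_a
end

p load_order(DEPENDENCIES)
p batches((1..2500).to_a).map(&:size)
```

`TSort.tsort` raises `TSort::Cyclic` if two tables reference each other, so a schema with circular foreign keys would need those constraints deferred or handled separately.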
49:18twitchtd you're looking at the old task file
I suspect this is going to blow out my RAM. What do I have, 32? 64. Okay, well, I've got 35 free. Well, you know, the database is only... I'm looking at the old task file.
The database is only a few gigs.
And so even if there's a significant overhead and like two copies of this running around in memory, we should still fit.
twitchtd https://gist.github.com/thomasd…
Yeah.
So I'm thinking of, like, what is the checklist going to look like for the actual day of deploying it? And it's probably going to be: put the site in... yeah, we actually have to write that as a proper list. So let's write that, because I have to make these checklists when I do things in prod, because otherwise I never remember all the steps and I always...
It's usually Hunter double checking me on these, like every time we've had a migration in the last couple of years or something large.
And I highlight this because sometimes people think the other mods don't do stuff because they're not super present in the mod log.
twitchtd above link is the streaming migration rake task
But Hunter watching over my shoulder while I'm typing things into the root console is incredibly valuable and will not show up in the mod log.
So I try and call it out whenever I think of it.
But
Great.
I will pull it up in one second.
So it'll probably look like with the site in read only mode, which I'd like to make these specific.
How do I, I think that's config application, right?
Yeah.
51:32So set the site. And then bounce Puma. And then once we're in read-only mode, dump the DB. Create. We can do these out of order. We can create SQLite database. For the DB, this is me writing a first pass. We'll figure this out more. And then load to SQLite DB. Post-migration cleanup.
52:34CP, shutdown db01, decommission db01 from, yeah, so instead of shutdown, we'll say power off, from the DigitalOcean panel.
Remove db01 settings from config/database.yml, because I'm going to leave them just commented out, right? So edit prod config/database.yml to comment out MariaDB, add SQLite. Okay, so this one.
Prod, we will be rails db:schema:load:primary.
This one we can go up here.
OK. That's a reasonably complete first draft.
Now I can think about Thomas's gist.
But it's one of those where, as soon as I get
twitchtd could we do a test migration to a new VPS btw? or did you just want to do it without a test?
like five things in my head, it's distracting, and I'm juggling them and trying to hold on to them, so I've got to just dump them out, even if it's just a rough draft, so I don't feel like I'm going to get confused or lose track of what I'm chewing on. So this one: can we do a test migration? Yeah. Oh no, we are definitely doing a test.
54:31twitchtd ok, sounds good :)
I don't know how much of that is coming through, but if you hear cardboard clunking and bumping, the cat figured out there's a cardboard box in my office.
Yeah, so to a new VPS.
...47I was thinking I would do it once here in my local dev.
Are you just looking for a full sequence,
or are you looking for the script to get exercised? I guess I'm wondering, like,
do we get something extra from setting up a second copy of prod and doing it there, versus,
if I can run all the steps on my local, we'll probably be fine?
twitchtd I wanted to just run a test migration, could be on the same running vps / local
I'm kind of thinking through that.
Yeah, given that they're both Linux, I think that's close enough to prod that we would be fine, because if any of these steps fail, we stop, we take the site... oh, that's got to, you know,
56:01unset the site from read-only mode, bounce Puma, right? Because, like, worst case, well, I hesitate to say worst case, but the migration could fail, and that's fine, because we're not going to delete the prod database for, realistically, a couple of days.
Worst case scenario really is some kind of subtle Unicode corruption issue that we don't catch, or some kind of performance regression that doesn't show up for a couple of days. I mean, even hours is bad, because then it's like, what do we do? Do we, like,
manually copy stuff from SQLite back to MariaDB?
Do we just drop a day of activity?
Yeah.
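The "if any step fails, we stop, then take the site back out of read-only mode" shape of the checklist can be expressed as a small Ruby sketch. Nothing here runs real commands: the step names and lambdas are stand-ins, and `run_migration` is a hypothetical helper, not part of the Lobsters codebase.

```ruby
# Run named steps in order, stopping at the first one that reports failure.
# The cleanup callable always runs, whether the steps succeeded or not,
# mirroring "unset read-only mode, bounce puma" at the end of the checklist.
def run_migration(steps, cleanup)
  completed = []
  steps.each do |name, action|
    break unless action.call

    completed << name
  end
  completed
ensure
  cleanup.call
end

events = []
steps = {
  "dump MariaDB"  => -> { events << "dump"; true },
  "load SQLite"   => -> { events << "load"; false }, # simulated failure
  "verify counts" => -> { events << "verify"; true },
}

done = run_migration(steps, -> { events << "read-only off, bounce puma" })
puts "completed: #{done.inspect}"
puts "events: #{events.inspect}"
```

In this sketch a failed load leaves the original database untouched and the site writable again, which is the failure mode called fine above; it does nothing for the subtler cases like corruption or a slow query that only shows up later.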
And I'm reasonably confident that this will perform significantly better than MariaDB, but nothing is 100% and there is
Like the last time we moved databases, we ran into a really subtle optimization difference between MySQL and MariaDB on literally one query.
And that was busy enough to knock the site down into very unstable.
twitchtd could run a 2nd copy of a sqlite version of lobsters under sqlite.lobste.rs for a few days and ask for testers if anything
Because it was the query for counting how many replies you have, so the inbox counter.
57:50That's a good idea. But we're not going to get... Yeah, it's not going to be a realistic load. Which is funny, having just spent the start of this stream talking about how painful the bots can be. Because what I'm really curious about... The part that concerns me that we can't really replicate is just how much the site gets hammered. The Caddy behavior is not going to change, and there goes, you know, more than 90% of traffic right out of the gate. And then at the Rails level, yeah, it's just, it's that mix of votes and comments and stories and searches and reads and, yeah. Let me think about that for a minute. Load stream, okay. Added a backup stream for the SQLite job. Oh man, I'm already missing interdiff.
59:42Excuse me. I should have muted. I felt that one coming. All right. All right, so now we have a backup command, and you put it in the restic job, which is just where it belongs.
01:00:27We must have sqlite3 already installed by Hatchbox. Doesn't hurt to be explicit, but it's got to be on prod. Yeah, it is. Because, um, Solid Queue and Solid Cache are SQLite for us.
01:01:26Let me bring this up. Yeah, I still have a dump of the database on my local.
...57over there, and then grab this path.
01:03:09chamlis_ small thing, but dumping the backup sqlite database to sql text should mean restic can chunk/compress it better
twitchtd the task is in the gist
Oh, hey, chamlis.
It's good to see you.
chamlis_ great work on getting the whole thing working, that's about all I can contribute from having managed sqlite databases in the past
twitchtd did you want me to check it in?
I'm not on your branch.
Oh, yes.
Yeah, if we're going to run this in prod, you know, it's like a migration regardless of which machine it runs on. Yeah, and I would like it in our git history.
01:04:19Hmm. All right. So we'll take that. What's this one doing? All right. How are we doing, GitHub? I didn't even populate.
01:05:14I'm going to go look at issues for a minute, and it doesn't look like there's... this didn't close.
Oh, I haven't hit merge on fully based PR yet.
twitchtd alright, committed the migration task
This one, I think I left the last comment about search bot.
Yep.
Ah, thanks Thomas.
Doesn't look like there's any other issues for me to check in on.
Alright.
01:06:20Thought I had tab completion set up for that. Guess not.
...57I can't run it on the branch because you removed it from the Gemfile, right? Yeah. All right, let's put that back for now.
01:07:20We'll have to maybe split that out into a second PR, that kind of cleanup task. Uninitialized SQLite 3 adapter. That's implausible. Rails must not have loaded?
...47What does it say?
01:08:15twitchtd it's in sqlite_functions line 2
yeah you know
...53fpsvogel You might have to come back to my PR later. Running RSpec first is not as straightforward as I thought it would be, due to how RSpec exits the process on completion. So I'll need to figure out how to run RSpec differently.
Right back to this same error.
Oh no, everybody exits the process, huh?
That's... That's frustrating but kind of funny that both Brakeman and RSpec are doing that to you.
Yeah, oh, and RSpec is...
Doesn't it work by actually like hooking the exit handler for all of its summaries and everything?
fpsvogel Something like that, still puzzling it out.
It is the most indirect flow I've seen out of a Ruby program.
And that's saying quite a bit.
Yeah.
It's like an at_exit handler.
I almost never see this in Ruby code.
I forgot about that.
01:10:14I'm trying to think if there's a better way to do this, because the other way would be like make the task basically into a shell script, but then you pay the overhead of booting Rails four or five times, you know, once for each subtask.
...39I wonder if there's like an API to RSpec. So rather than letting it totally own the flow, you could call like, well,
01:11:23Yeah, I just don't know the RSpec API. I don't know if it offers an API.
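One way to keep a subtask that calls exit from ending the whole task, without booting Rails once per subtask, is to fork after boot and wait on the child. The sketch below is a swapped-in technique, not what the PR under review does. It assumes a platform with Process.fork (Linux, macOS), and the blocks are placeholders standing in for RSpec or Brakeman, neither of which is invoked here. run_isolated is a hypothetical helper name.

```ruby
# Run a block in a forked child so that code calling `exit` (or relying on
# at_exit hooks) only ends the child, never the parent task.
# Hypothetical helper; assumes Process.fork is available.
def run_isolated(label)
  pid = fork do
    yield # the subtask; it is free to call exit itself
  end
  _, status = Process.wait2(pid)
  puts "#{label}: exit status #{status.exitstatus}"
  status.exitstatus
end

# Placeholder subtasks standing in for tools that exit the process.
results = [
  run_isolated("subtask that exits 0") { exit 0 },
  run_isolated("subtask that exits 3") { exit 3 },
  run_isolated("subtask that just returns") { :done },
]

puts "parent is still running, statuses: #{results.inspect}"
```

The parent collects each exit status and decides for itself whether to continue, which is the control that an at_exit-driven tool otherwise takes away.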
...35So Thomas, do you know what's going on with this exception in the script?
01:12:02twitchtd not sure
twitchtd I think it works for me, what environment are you doing this under?
I mean, it would have caught, but I'm not sure what you're asking for, Thomas.
This is just my local Linux install that I've been running everything under.
And I did a bundle, so I should have all the same gems.
...37Guess I should delete this. That's not doing anything.
...47Where am I on time? Just to set expectations, I've probably got about 25 minutes left on this stream before I have to go. I have a... You know, I usually go for three hours, but there's a social obligation that can't work on another day, so... Where are we here, then?
01:13:21Why does the cat only sit so widely in my lap when I am distracted and frustrated by a bug?
...34twitchtd I'll create a dedicated branch for dumping the db, since it's going to depend on a gem that I got rid of in the sqlite gem
Sir, holding up your 800 pound chin while I am typing just puts my arm to sleep.
No, that's not what I wanted.
Whole page jumped around.
01:14:26fpsvogel I relented and put a cat bed on my desk
fpsvogel to keep the cats somewhat contained not on top of my arms
Yeah, he wants to sit on my warm lap.
And he seems to think that he is the boss of the house.
chamlis_ does it only load the adapter if you have sqlite in your config, or something along those lines?
I have tried to interest him in a bed on my desk or a bed on the file cabinet.
Neither was of interest.
01:15:00Yeah, this is an error out of an initializer, isn't it? Yeah.
...14So it's not something particular about SQLite, or somebody else would have this bug already. Hmm.
...38I don't even know where to start with this one. Where was that traceback? Yeah, that attempt to put the
01:16:09Come on, take this out of here. Oh, this is the error that... Wait, this is the Reddit error. This thing is in an initializer. That was the Reddit error. So what was... Oh, my God. I have an extension on my regular browser for fixing these. I'll have to copy that over sometime.
...42MartinJaniczek https://xkcd.com/979/
Case the initializer, I don't know. 979, oh yeah: "Who are you? What did you see?" Yeah, but this time, you know, both the question and the answer are there, so I feel like I'm ahead of the game.
I keep running into Reddit comments that people have either deleted or overwritten with random words, because you can edit a Reddit comment forever.
All right, so if this is the thing,
Let's try that.
But the odd thing about it is I don't understand why it would work for Thomas.
01:18:26twitchtd oh, that's why I put it in a gist, it worked for me on main + migrate.task
guess it's running.
So Thomas, if you caught that, I was able to add that require. It worked for you on main?
...50twitchtd main didn't have the sqlite work
Oh, and so on main, you wouldn't have that
twitchtd ya
function hooking the SQLite stuff.
All right, well, we're running.
So let's... Is it dumping to DB?
Where was the file name?
Dump.yaml.
Refuse to snapshot some files.
All right, it's one gig big, so we're getting there.
01:19:43Two gigs in the time it took me to refresh. All right, so we're ticking up.
01:20:11All right, did you finish? No. Just thinking about that one for a second. So let's see. I'm going to peek at that because I'm curious. So let's say tail dump.yaml. That's very wordy. tail -n 30 dump.yaml. Yeah. So nothing sensitive here. So I can bring this on. This must be a vote or something. id isFollowing isFollowing. What attribute is that?
...59So I want to say story, but that doesn't look like a story. The ribbon, it's dumping the read ribbons. OK, so it's marching along. It's already in the Rs. This is going to be a larger file if it's, you know, all of this text for each line or each column, each attribute. Yeah. It'll get there, up to three and a half. So yeah, if it is going alphabetically, and that's not certain, because ActiveRecord descendants is not necessarily sorted. Descendants? ApplicationRecord.descendants. All right. Whoa, whoa, Vim. Vim, what are you doing to me? Oh, why are you scrolling? Why am I being punished? Okay, that's weird and awful. All right, it stopped scrolling. I don't know what the hell that was. Something about the way I dragged the mouse. It must've thought... Okay, it must've thought I was, like, dragging off the top of the window? I don't know. Maybe I shouldn't be using vim as my terminal multiplexer. So ApplicationRecord.descendants. Nothing? I guess it finished, so I'll stop playing around.
01:23:12twitchtd you have to eagerload
twitchtd to use descendants
4.7, not bad. Ah, yeah, definitely didn't do that. Well, I was mostly just curious to see what order they came out in so I could get a feel for progress, but we seem to have finished. So, all right then.
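The point about eager loading can be shown without Rails: a class only shows up as a descendant once the code defining it has run. In the sketch below, BaseRecord and the three subclasses are hypothetical stand-ins for ApplicationRecord and the models, and loaded_descendants is a made-up helper. In a Rails console the rough equivalent would be calling Rails.application.eager_load! before ApplicationRecord.descendants, then sorting by name if a predictable order is wanted.

```ruby
# Plain-Ruby stand-in for ApplicationRecord (hypothetical name).
class BaseRecord; end

# Find every class currently loaded that inherits from `base`. A class whose
# file has not been loaded does not exist yet, so it cannot appear here.
def loaded_descendants(base)
  ObjectSpace.each_object(Class).select { |klass| klass < base }
end

before = loaded_descendants(BaseRecord).map(&:name)

# Defining the subclasses plays the role of eager loading the model files.
class Vote < BaseRecord; end
class Story < BaseRecord; end
class ReadRibbon < BaseRecord; end

# The discovery order is not guaranteed, so sort by name for a stable listing.
after = loaded_descendants(BaseRecord).map(&:name).sort

puts "before loading: #{before.inspect}"
puts "after loading:  #{after.inspect}"
```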
...46I guess now I would be editing config/database.yml, right?
...57twitchtd ya
Let's bring, let's pop that off hand.
So I'm going to pull it up off screen because I can't remember if my local has a password or what.
It does not.
This is totally fine to put on screen.
Good.
So.
Config database.
database.yml.
So here on development.
Have a primary, and so instead of saying that we will say
sqlite3.
And then we will do that in
test.
Oh, this is from an experiment.
Somebody made a Rails app that would log exceptions to a SQLite database.
And I spent a minute playing around with it.
It did not work out.
So grab that one.
Grab this.
I was just thinking of the other day.
I guess it must have been Saturday when I was fixing the bot stuff anyways.
I was thinking about how I wanted to split rack attack out into its own SQLite database, but I realized at the last minute I should hold off.
So I'm not dealing with even more merge conflict kind of stuff while this one's in flight.
All right.
01:25:34twitchtd you need to change the database to a path
Is it load underscore DB?
Need to change the database to a path.
Oh yeah.
All right, let's bring that out of there.
And then we will say db/development/primary, I guess, .sqlite3.
01:26:08Just bring this down here.
...15twitchtd you need rails db:schema:load:primary load_db
Yeah, I want that on the same thing.
Also on development.
twitchtd yup
Thomas, what we should be doing is debugging my little first draft of the to-do list because we're kind of working through those steps, right?
We just did the dump.
We can imagine I did the SCP.
Now I'm editing config database, commenting out MariaDB.
Actually, let's just use the prod one. I know it doesn't have errors. We will just grab
01:27:14And then in prod, it obviously doesn't need the environment because it only has prod. And I was tempted to copy that, kind of like have the environment name on prod, except it should never have the others. And if it starts up in dev or test, I want it to be very obvious and fail. So, all right. So there's that. And then... db:schema:load:primary. Sure. I bet I have to do create.
...54Yeah. So rails db:schema, no, db:create:primary.
01:28:12Okay, db:schema:load:primary. Promising. Let's peek at it. sqlite3 db, where did I just put it? db/development/primary, .schema. All right, that's promising. That's what it's supposed to be.
...50And then now I will run rails load_db.
OK, it's thinking about it.
twitchtd btw, i would maybe do it offscreen in case load_db crashes and spits out some yaml
So let's put that watch, not here, watch -n 3 ls -l db/development/primary.
Dash n 3.
Sure.
I can do that.
All right.
So it's going.
01:29:45Let me switch that from 3 to 1. 3 to 1. All right. So it's going up about two megs per second, a little under. But pretty close to two megs a second. And I don't know offhand what that final file size is going to be, but not too bad.
01:30:30twitchtd what's your memory looking like?
Yeah, about the only thing I would worry about it dumping is like a private message because everything else is on screen or pretty minimal.
Let's look.
Yeah, it's barely using anything.
Because we were at 32 gigs free earlier.
twitchtd ok, the streaming is working then :)
So, you know, four gigs has picked up modern computers.
I remember having four megs of RAM.
Yeah, that was why I was hitting up enter here.
It's like slightly ticking up.
Maybe it's keeping the whole SQLite database in RAM, but that's fine.
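For context on why memory stays flat when a dump is streamed: if each row is written as its own YAML document, the loader can handle one document at a time instead of parsing the whole file into one structure. This is a minimal sketch under assumptions, not the load_db task from the stream. The row layout (a "table" key plus attributes) and the helper names dump_rows and each_row are made up for illustration.

```ruby
require "yaml"
require "stringio"

# Write each row as a separate YAML document ("---" separated), so neither
# the writer nor the reader needs the full dump in memory at once.
def dump_rows(io, rows)
  rows.each { |row| YAML.dump(row, io) }
end

# Yield rows one at a time. With a block, YAML.load_stream hands over each
# document as it is parsed rather than returning one large array.
def each_row(io, &block)
  YAML.load_stream(io, &block)
end

# Hypothetical rows; the real dump format is not shown in the transcript.
rows = [
  { "table" => "read_ribbons", "id" => 1, "is_following" => true },
  { "table" => "read_ribbons", "id" => 2, "is_following" => false },
]

buffer = StringIO.new # stands in for a file such as dump.yaml
dump_rows(buffer, rows)
buffer.rewind

loaded = []
each_row(buffer) { |row| loaded << row } # a real loader would insert here
puts "loaded #{loaded.length} rows"
```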
All right.
Well, this is really promising.
Let's go look at that to-do list.
Yeah, db, oh yeah, db query.
So that does the load.
So this would be be rails load_db.
And this would look like be rails dump_db.
I do have my little bundle exec alias in prod.
So this would be merge sqlite3 PR, run lobsters-deploy to get it into prod.
This feels very tractable.
Post-migration cleanup: power off db01, oh yeah.
That'll be a separate PR to remove Trilogy from Gemfile.
01:32:53So this is ticking along.
01:33:06You know, when this finishes, you know what I'm going to do? I'm going to run that dump task again. And I'm going to dump the SQLite database to YAML, and then I'm going to compare the two. Because if we... That catches some errors, right?
...32It doesn't necessarily catch any errors in conversion from MariaDB to YAML. But it's a reasonable reason to think that they're in sync. What else can we check? Obviously there's stuff like the number of rows in each table. What else?
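The two checks mentioned here, row counts per table and comparing a second dump against the first, can be sketched as below. Row counts alone miss a changed value, so the sketch also computes an order-independent digest per table. Everything about the data shape is an assumption: rows are plain hashes with a "table" key, and row_counts, table_digests, and mismatched_tables are hypothetical helper names, not part of the migration task.

```ruby
require "digest"

# Count rows per table from a list of row hashes (assumed "table" key).
def row_counts(rows)
  rows.each_with_object(Hash.new(0)) { |row, counts| counts[row["table"]] += 1 }
end

# One digest per table that ignores row order: hash each row with its keys
# sorted, then hash the sorted list of row digests.
def table_digests(rows)
  rows.group_by { |row| row["table"] }.transform_values do |table_rows|
    row_hashes = table_rows.map { |row| Digest::SHA256.hexdigest(row.sort.inspect) }
    Digest::SHA256.hexdigest(row_hashes.sort.join)
  end
end

# Names of tables whose contents differ between two dumps.
def mismatched_tables(a, b)
  da = table_digests(a)
  db = table_digests(b)
  (da.keys | db.keys).reject { |table| da[table] == db[table] }.sort
end

# Made-up sample data standing in for the MariaDB and SQLite dumps.
source = [
  { "table" => "stories", "id" => 1, "title" => "a" },
  { "table" => "votes", "id" => 1, "vote" => 1 },
  { "table" => "votes", "id" => 2, "vote" => -1 },
]
reloaded_ok  = source.reverse
reloaded_bad = source.map { |row| row["id"] == 2 ? row.merge("vote" => 1) : row }

puts "counts match: #{row_counts(source) == row_counts(reloaded_bad)}"
puts "tables differing (ok dump):  #{mismatched_tables(source, reloaded_ok).inspect}"
puts "tables differing (bad dump): #{mismatched_tables(source, reloaded_bad).inspect}"
```

In the sample data the bad dump has identical row counts, and only the digest comparison flags the votes table.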
01:34:09That just stopped counting up. I'm going to look at it off screen to make sure it didn't crash. It crashed. What do we got?
...26I don't know what this is, actually.
...33Let me find the top of this. So we got a traceback. This looks like it's a comment, maybe. Betty, what's the, what in the? Are these, is this like a prod comment? Hold on.
01:35:06Any examples? It's just like someone's...
...14I don't know what I'm looking at. It looks like a very big comment of someone posting about an AI tool, but it doesn't appear on prod. I don't know what these IPs are. This is somebody's comment? It's not prod data.
...50Oh, it could be a story text, right?
01:36:00Ugh, dead link.
This looks like an ad anyways.
twitchtd could it be a private message?
I don't know what we're looking at here, Thomas.
...18I mean, it could be a private message. I hope it's not, but where's the rest of this? There was a hash in that output. I Haskell a Git. I think this is just the contents of the story_texts table. Because this thing isn't going to say shit about Betty, is it? Yeah. So if I say Betty and iTunes, there's this thing. So yeah, this is contents from the story_texts table.
01:37:06So this I think is part of vibe of cigars. Yeah. I Haskell a get. So if I say file does exist. Yeah, here we go. So get compresses these files get compresses these files and we'll need to commit to 48. So like right here. Vim is not a particularly robust terminal multiplexer. And if you resize it, I wasn't thinking, it lops things off. So we were on migrate line 27, which is where we knew we were.
01:38:08So whatever... Yeah, this is... Not what was in the middle of vibavs.
...32twitchtd what's the exception?
Hmm.
Well, it sort of is, actually.
Wait, so this x something something mj is this.
We don't see one.
We just see that there was one.
twitchtd is there anything up top?
Unless there's like 10,000 lines up, it's quoting something.
So let me, because I don't know what I'm going to scroll past if it's dumping tons of data.
Yeah, I'm going to go up top off stream here.
01:39:13Ah, here we go. This is, yeah, it must be like the exception is repeating the string twice and the string is some giant chunk of the story texts table. But I searched for the name of the file and hit this. All right, I'm going to pop it off stream a second, see if there's any more higher up. No, that wrapped around. So if I page up a few more times, this is just a lot of values from the Story Texts table, it looks like. It actually might have dumped the entire Story Texts table to my terminal.
01:40:13Yeah, I can't actually see in the scroll back the load_db task.
So if I search for load_db, there's that.
And that's the only instance which tells me it really did dump the entire table.
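One guard against an exception message flooding the terminal with row data is to shorten the message before printing it. This is a minimal sketch under assumptions: the actual exception class and message from the stream are not visible in the transcript, so an ArgumentError with a long made-up message stands in for it, and short_error and MAX_MESSAGE_CHARS are hypothetical names.

```ruby
# Arbitrary limit chosen for this sketch.
MAX_MESSAGE_CHARS = 200

# Build a one-line summary of an exception, cutting the message off after
# `limit` characters and noting how much was dropped.
def short_error(error, limit = MAX_MESSAGE_CHARS)
  message = error.message
  if message.length > limit
    message = "#{message[0, limit]}... (#{message.length - limit} more chars)"
  end
  "#{error.class}: #{message}"
end

begin
  # Stand-in for a failed insert whose message embeds a large value.
  raise ArgumentError, "bad row: " + ("x" * 10_000)
rescue => e
  summary = short_error(e)
  warn summary
end
```

A loader could rescue around each row insert, print this summary along with the table name and row id, and re-raise, which keeps the traceback useful without echoing the data.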
twitchtd probably a locale thing
But once again, it's probably like this, which it's funny we ended up with the actual values.
It must be here in this that's getting printed as a dot.
01:41:00I'm coming up on time, actually. I've got to stand up and start heading towards the door. I'm trying to think of what next step I could give you for debugging. And I don't immediately have any ideas. I don't want to give you just a whole dump of prod, even though that would be the preferred thing because it would allow you to actually run this whole thing on the whole thing, but
...59twitchtd could work on it next time, or work on it offline
twitchtd I'm in Chicago
veqqio hey push. i think you said 25 mins like 35 mins ago
Yeah, Thomas, if you want to think about it a minute and give me any ideas, or you and I could even schedule a pairing session, whether that's next stream or something that's more convenient for your schedule, because I don't even remember what time zone you're in. Like, I don't know what I would debug. My first thought would be... I think you said 24, yeah.
veqq, that's about right.
Actually, my little timer is at zero, but I gave myself some padding because I always run like five minutes longer than I want to.
Wait, you're here in Chicago?
How have we never gotten a beer?
Throw me an email.
I didn't realize you were here in Chicago.
Or if I knew it, I forgot it.
We could go by JCS's house.
Ages ago, I had lunch with him a couple of times.
This is well before, I think the first one was, I realized that he was not terribly far away.
And so we just had lunch.
twitchtd the stream died for me
I remember what job I left from, but I don't remember what year that was.
And then the other one was around the time I took over the site from him.
Well, what I was saying was, let's catch up here.
I guess chat is still live.
I don't see that.
Oh, yeah.