Mercury is in retrograde and it's high tide

Streamed

Reviewing PR #1672 for managing unknown cookies and PR #1673 for deployment paths. Checking CommonMarker is at 25% and off-stream for boringness (PR #1627). The UK OSA is killing sites, same as ever, six months later. Writing up Hatchbox deployment documentation for sister sites. Recheck beta demo, a tool for finding invalid data in production databases, that has found 3 real bugs in Lobsters data.

scratch


topics
  commonmarker - 25% complete https://github.com/lobsters/lobsters/pull/1627
  PRs
    unknown cookies https://github.com/lobsters/lobsters/pull/1672
    deployment paths https://github.com/lobsters/lobsters/pull/1673
  IDN URLs: https://github.com/lobsters/lobsters/issues/932
  OSA rehash https://news.ycombinator.com/item?id=44629134
    https://lobste.rs/s/ukosa1
  hatchbox
    * todo list
    * Hunter's email rep tool
    * hsts
    * logrotate
  recheck
    https://github.com/recheckdev/recheck/tree/main/ruby/recheck
    https://github.com/recheckdev/recheck/tree/main/ruby/recheck-rails
    https://recheck.dev



title


post-stream
  fix story 84270
    

Transcripts are generated with whisperx, so they mistranscribe basically every username and technical term. They're OK but not great, advice appreciated.

Recording



02:30brainwane Hi!
Last stream, we talked about CommonMarker. I'll do the PRs first. So anyways, Lobsters is a discussion site about programming and related stuff. And then we have these office hours for folks to drop in anytime to ask about the site. Oh, hey, brainwane. Nice to see you again. pushcx Welcome to Lobsters office hours, ask questions anytime!
So let's throw this in chat to... And when folks aren't asking questions, I will work on the site. So there we go, getting the noisy AC sorted. So let's see.

03:22So when folks don't have questions, this is also just a nice time box for me to be working on the site. And last time was pretty dry because I was working on this CommonMarker update, which is: let's look at every comment on the site to see if it is rendered in a semantically equivalent way. That is to say, the changes that are made are not meaningful. And so far, that was a lot of time. So the update here is that it is 25% complete. I want to say I got to around 10 or 15 on stream. And then, in one of those forcible reminders that you have to take breaks, you have to get up from the computer occasionally and move around: as soon as I got up from the computer and moved around, I realized how to make it work better. So I left a comment down there. I've tinkered with it a little over the weekend. I probably will not beat this to death, except to say that the strategy is in there. I've caught one or two small bugs in the new renderer where, like, if you have "www.", it thinks that's a link and highlights it. Or if you mention "http://", just the protocol and the slash slash, or same with "https://", it makes that a link. That's not useful. That's not a valid link. I mean, the www one, you could torture DNS into making that work, but there's no reason for the spec to support that. The one where it's just a protocol, that link can't do anything. It doesn't make sense. So that's moving along. The bugs are pretty minor. I would deploy with these bugs, but it gives me a lot of confidence touching the most important part of the site, which is everybody's comments. So that's moving along. I am not going to keep running that on stream. Other PRs, let's look at them. On the last stream I looked at this one about the story repo tag. That one is hanging out, no movement from the author yet. And then these two more came in, so let's grab them. We got one for cookies. We got one for deployment paths.
Both of these are from pretty new contributors who have made two or three small contributions each related to getting the site working in a development context. And so it is great to start seeing a couple of more meaty things that are gonna improve the site. So I peeked at this one off stream, because I wanted to make sure it wasn't gonna be a big thing. But the gist of it is, if we can get rid of cookies that we don't use, we have a better cache hit rate. So the site runs faster. And every site accumulates cookies that are no longer in use, because cookies are sort of a global variable that hangs around, not just through runs of the program, but in storage between runs of the program. So any part of your front end or your server that can touch headers can create cookies. And then you never have a reason to definitively say, yes, we're not using this cookie anymore, unless you have built some kind of functionality around cookies to, like, force them to go through some kind of gating mechanism, so there's a single place in the code where you register cookies. So while we haven't added any more than these two cookies over the years, all kinds of random stuff injects and creates cookies, and nothing ever gives you the confidence to delete cookies unless you write a function like Jacqueline H. Ma has written for us here that says, hey, if you're not in my list of known cookies, you are deleted. And yes, that has the possibility to break things, but you have to do that to have any kind of confidence that you don't have extra cookies hanging around. pushcx https://github.com/lobsters/lob…
So I asked her a couple of... oh, I'll share the link here because we're off the code. I asked a couple of questions. I suggested an alternate name. I don't quite get this test. I wouldn't expect this test to pass as written, so either the code doesn't work or the test doesn't test what I'm thinking it does. Does the build run? Okay, so the build did run because I merged her other commits, so we'll see on this one. And then this was discussed in the comments of one of Yuanji's previous PRs: that since the switch to using Hatchbox to deploy things, there's a bunch of hard-coded paths. Although I don't think he knows it's since the switch, because we did that, what, two months ago, and he showed up maybe two weeks ago, so for him it's always been like this. So let's fix a bug that's always been like this.
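
The "if you're not in my list of known cookies, you are deleted" idea from the cookie PR can be sketched roughly as below. This is a minimal illustration, not the code from PR #1672: the cookie names in KNOWN_COOKIES are hypothetical placeholders, and a plain Hash stands in for a request's cookie jar.

```ruby
# Hypothetical allowlist; the real cookie names live in the Lobsters code and PR #1672.
KNOWN_COOKIES = %w[session_id theme].freeze

# Returns the names of cookies in the jar that are not on the allowlist.
# `cookies` is a plain Hash of name => value, standing in for a request's cookie jar.
def unknown_cookie_names(cookies, known = KNOWN_COOKIES)
  cookies.keys.map(&:to_s) - known
end

# Returns a copy of the jar with every unknown cookie dropped.
def prune_unknown_cookies(cookies, known = KNOWN_COOKIES)
  cookies.select { |name, _value| known.include?(name.to_s) }
end

jar = { "session_id" => "abc123", "theme" => "dark", "_old_tracker" => "1" }
puts unknown_cookie_names(jar).inspect
puts prune_unknown_cookies(jar).keys.inspect
```

In a Rails controller the same list would presumably drive a call to cookies.delete for each unknown name, so there is a single place in the code where cookies are registered.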

09:53Yeah, that's fine. So the setup here with Hatchbox is that it creates a directory, and in this lobsters directory (lobsters being its internal name for the site, there's no dot) it creates a series of release folders, one for each git checkout. And then there is a symlink called current that the site runs out of as its working directory. Inside of current there are a couple of symlinks back out, so that current/storage and current/log link out to this stable directory. Because of course you wouldn't want your database to be deleted when you redeploy the site. All right. So this is all fine. So when I wrote this, I was doing the most straightforward thing that would get Lobsters running, because there were a hundred things to touch for Hatchbox deployment. And now we can just do the easy things here. Yeah. Local can be seen as the local disk, not local env. I don't know this config variable.

11:33Oh, so there. We.

...45That can't be it.

...54I think these are just two things that happen to be named local. So in YAML, you can like create a key and then kind of reuse and override it. But I don't see that happening in this sample.

12:20And I don't think that's quite... let's pull up seal. All right, now you can see I was doing some regular expressions for that CommonMarker. So Rails, it's called config.active... now, I hate searching for these configs. See, it's called active_storage.service, so if I just search for service

...58under the service key. Okay, so is this... okay, so it is mapping to the key. Okay, so it's not a different thing. This is... sometimes the line between config and Rails is very blurry. So let's... Okay, so I have config/storage.yml that looks like this. And this is just redundant. Like, these two actually evaluate to be the same thing. They're just different paths in. Yeah, all right. So now that we get what it's doing, can we see... to grab that quote, see if someone can link to here. So six, and we're on, what, like 8.0. Okay.

14:44So that's totally fine to delete that and use... yeah, great. Oh yeah, I forgot Rails.root has this join method, so that's even nicer. Yeah. Okay, this all... so. So he didn't mention it, but must have gotten a Brakeman wanting-to-be-updated error. So I'm going to go ahead and.
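
The join method mentioned here comes from Pathname, which is what Rails.root returns. A minimal sketch of building paths relative to the app root instead of hard-coding them, assuming a hypothetical deploy directory (the real Hatchbox path is not shown in this transcript):

```ruby
require "pathname"

# Hypothetical deploy root; in a Rails app this would be Rails.root, which is a Pathname.
app_root = Pathname.new("/home/deploy/lobsters/current")

# Build paths relative to the app root rather than writing out absolute ones.
storage_dir = app_root.join("storage")
log_file = app_root.join("log", "production.log")

puts storage_dir
puts log_file
```

Because storage and log under current are symlinks out to a stable directory, paths built this way keep working across releases without naming the deploy user's home directory anywhere in the app.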

15:31Checks are green. Yeah, so there's not.

...44This can just be Rails.root, right? No, Restic actually does care. Because we don't link the etc directory into the Rails root. We could, but no, there's stuff in there like keys that shouldn't be. Let those be owned by root. I mean, the deploy user can touch them, but yeah, let's just leave it. All right. So I'm going to merge this. Yeah, it's a little scary because it's touching key config. And, you know, it raises my red flag of, like, why are you touching stuff that's not broken? But this is a thing we want and it makes deployment or development easier. So, OK. So let's go ahead and say... Let me double check something. Did you get all of them? All right, so we don't want to touch the ones that are in Hatchbox. And then the scripts here are going away mostly. So this all seems fine.

17:24ComplexPlane thanks for lobste.rs! Have been a lot more active there this yera
I didn't see Puma in there, did I?

...33ComplexPlane thanks for lobste.rs! Have been a lot more active there this year
Oh, hey. ComplexPlane ah up does not edit, rip
Well, welcome, complex plane. Thanks.

...43Oh, no, he mentions Puma. That's fine. So this one can be just fine. Yeah, no, don't worry about it. We all have typos. We'll live. All right. Now, so. Make sure. So let's unmerge this. And fetch this down.

18:35Yeah, I did a fetch this morning over on the CommonMarker branch and Chanless had pushed to that one. And so I got some first-class conflicts because he'd rewritten commits, and I was able to just ignore that and hop over here to master to start the stream. It's very nice that jj handles commits so cleanly, or conflicts so cleanly. All right, so with that let's jump back over to.

19:09Great. Right, so I'm on master in case I need to fix anything. I'm thinking about, will a regular deploy touch everything? Yes, because all of this is in the Rails app. So yeah, I don't have to do, like, a restart of the server or anything. Just thinking, like, every time I touch code outside of the app directory, I have to stop and think about deployment. So if the site goes down here in two seconds, we'll fix it. I think we're pretty good.

20:00I'm going to grab... okay, I'm going to grab a terminal off screen so that I can watch, look at the production log live. Yeah, that looks fine. Site's loading fine. Everything is happy. Cool. Did not break prod. Always a nice deploy, right?

...38All right, so close that off-screen terminal. So there's the two PRs. Oh, I had one thing I wanted to talk about because we did a lot of this work on stream. It seemed. Yeah, that's everybody. And I don't think there were any issues. Oh, someone opened a bug. Thing I kind of don't want to double check to resort. Yeah. Oh, and you know, as long as I'm looking at the sorting, that reminds me, I mentioned off stream that there were some pull requests from someone who was trying to get a JSON API added to the site in kind of a misleading way, where they took a feature and they reported it as a bug and then slipped in 20 JSON endpoints. I was pretty unhappy about that, actually. And I don't know which one of these it was. I don't see it in the list. It's like 1663. Anyways, got a couple of responses. Did they delete stuff?

22:07Okay, so here was the issue. That's weird. I'm used to GitHub forwarding you. If you type in a PR number as an issue or an issue number as a PR, it used to redirect. Maybe that only went one direction. So anyways, if you are curious how that's going, not well.

...37It's done. So what is this?

...46Normalization. Right. Yeah, so not exactly the same error. Okay, so the homographic spoofing thing is the thing I care about. Yeah. Let's grab that link.

24:08Is this a good first issue? How much do you have to know about the site? Not a heck of a lot to fix this, right? Yeah, I think that's okay as is.

26:03So 932 is, that is a bug, right? It's not a feature request yet. It is a bug.

...35Yeah, if we're going to see... Yeah, actually, we don't permit... Okay, so...

27:07so in story mode it requires the encoded domains be used look like

28:20Actually, this kind of is the same bug, it's just. This one.

...43This one is just what do we render? Yeah.

29:13yeah this really is to do the more i talk through it there isn't a second part all right

32:54So kind of redundant feature request. That's fine.

33:13So let me grab that for the scratch notes.

...34Right, so then the other thing I was starting to say that we have handled on stream when I remembered I hadn't looked at the issues was this. twitchtd hi, @pushcx
So this website, Janitor AI, seems to be some kind of forum and LLM host. I didn't really click through, but I kind of skimmed these comments. Oh, hey, Thomas. Yes, yes, I will do the recheck demo. I'm kind of doing... my next-to-last thing before getting into it. Let me put it on the to-do list. Anyways, welcome back. So on this stream, I have done a lot of the work of working through the UK Online Safety Act. I made the URL memorable and then I didn't remember it. So if you compare, so the gist of this... where'd that go? What even just happened? Where's my link? Okay, you went off to a new window. One of my least favorite features. All right, so this site basically said, yeah, we looked at the UK Online Safety Act and wow, it was a lot of stuff. And they came to the same kind of conclusions that I did, including the big one: that it is written for tech giants and not small platforms, but it absolutely includes a ton of incredibly expensive, legally risky things for even tiny sites. So they hit a lot of the same points that I did, and honestly, if a site that's about LLMs is going to, like, pick up my post and hit all of the same points and not give credit, well, that would be on brand for LLMs, I suppose. I hope you can hear that there's a smile in my voice. I am not really severely put out if they kind of lifted my analysis. They hit a lot of the same topics, including things like: it is functionally impossible to comply below, I don't know, a scale where you have a lawyer on staff. I guess theoretically, especially if you could get someone to donate some legal time to you, you could comply, but you still have to have, like, full-time staff, period, if you're gonna even hope to. There's just so much to do. Realistically, small independent things cannot support it. So they wrote up that, you know, no, it's impossible for us to comply, we're tapping out. There was this petition to repeal the act. I don't know how that's going. And then.
I read this ending part where they're like clearing up some misinformation. So people misread the post as saying that maybe UK users could be jailed under the Online Safety Act. And that's not what it says. It only threatens people who run sites. And I don't know, just watching people start hashing this out again was kind of painful. And then I am not going to skim down the HN comments. pushcx https://news.ycombinator.com/it…
I saw this when it was like five hours old and had maybe 30 or 40 comments. And it was kind of... I'll throw this link in the chat. I should have. It was just... All of the comments were exactly the hashing everything out that we did on stream and in the Lobsters thread, and that I wrote up in that giant long... which one, this giant long comment of what the hell is the OSA and why does this matter, and tried to address all of these misunderstandings. Because if you come at the law assuming that it is generally well-intentioned and it's misunderstood and it's about just making sure that reasonable best practices are followed, you will assume all kinds of things of the law that aren't true. You will assume that it applies to paying sites, commercial services. You will assume that it applies to big services. You will assume that it applies to prevent abuse, when actually it prevents all kinds of things. So yeah, honestly it was just kind of frustrating to see, because the Online Safety Act has not meaningfully changed in the six months since I wrote this post, four months since I wrote this post, or I guess six months since we really made a project about figuring out what the hell it was and how it mattered. And the censorship office, Ofcom, has not made a serious effort to make the law understandable because they're only after Albynton Ji
The law is written for tech giants. They are a bunch of lawyers writing for tech giants. There is not a meaningful, effective attempt to help small independent sites even understand the law, let alone comply with it. And Ofcom is not in a position where they can even make the law reasonable and possible for small sites to comply with. And I guess I'm frustrated because, like, nothing improved. I don't know. Albynton I meant, "Hi"
It's a shame that the UK is doing this to themselves, but it's also a little bit of a shame that everyone is rehashing and doing this from scratch again. Hi, Albynton. Yeah. I wonder if anybody mentioned it. Oh, yeah. So somebody linked to my post. That wasn't there when I looked. Like I said, I looked when it was five hours old. I don't know. I hope they... Hold on, I got a cough.

40:26I had hoped that this long write-up would serve as a kind of harm reduction, so that people would stop making all of the mistakes, like surely it's not that bad. Let alone help people effectively advocate for improving their law. graefchen Hello limesHi
Although, I personally am staying out of UK politics. That seems like a reasonable thing to do. I don't know. So that's about the end of that topic. And howdy, graefchen. All right. So where was my to-do list? Because I misplaced it out of the stream notes. There we go. So before I got months past the actual Hatchbox deployment, I wanted to finish graefchen just "graef" is ok. limesLurk
writing up those notes, because I'm going to start forgetting stuff. graef, okay.

41:56veqqio Howdy
All right, so this, and if anybody's curious, you can follow along. It's in the repo. It's the Hatchbox to-do file. This was the to-do list I made for our process of moving over to Hatchbox. And I wanted to turn that into notes on how to deploy a production copy of the service. 12charonsobols how often do you stream
So I wanted to make it easier for sister sites to get started. So I'm going to grab that and... Okay, 12 Charon. 12 Charons, Obols. I thought Charon always charged two pennies, right? Not obols and not 12 of them. Oh man, inflation even for crossing the river Styx. It's hard out there, even in the afterlife. So anyways, I stream Monday afternoons and Thursday mornings. I'm in Chicago, so that's Chicago time. And the Twitch calendar is correct. Sometimes people see a weird time zone issue, but that's a bug in your calendar. It's in the FAQ that's linked under the stream. All right, so part of this was deployment setup. Our development setup and then production. So I started touching this a couple of weeks ago. So I had talked to Jacqueline, who is the author of that cookie PR I showed a few minutes ago, because some of the confusion she had with the readme was that the development setup kind of blended into the production setup, because I started talking about production setup to make it a little easier. So let's go up here.

44:10So I'm going to pull this whole thing about a production service down here.

...25So To.

...35So.

45:01The hard part of starting a sister site... it's not... a new social site has to solve a chicken and egg problem: nobody wants to participate on a new site until other people are participating. I've seen this so many times. I've even done this with Barnacles, where I never quite reached that critical mass. And I mean, veqqio I was curious what Barnacles was!
Not to toot my own horn, but the reason I'm the admin of Lobsters is because I helped solve... Yeah, so Barnacles, you can find it in the Internet Archive, but it was a sister site that was about bootstrapped entrepreneurship, technical entrepreneurship. So the kind of thing I do. And it never got to that critical mass to be a viable social community. Think first, before you start working with the code, to make a plan for how you'll reach potential community members and

47:02veqqio I saw that you posted 10+ stories a day to Lobsters back when. Reddit made stuff up iirc / posted as other users. Others scrape from others etc. Many such models. Real engagement though... harder!
and how you'll kind of start the discussion. Yeah. Oh yeah, I distracted myself mid-sentence. The big lift I did on Lobsters in the early days was posting tons of stories for, like, 18 months. And yes, Reddit, kind of infamously, the owners have said that they created themselves a bunch of sock puppet accounts and would have discussions amongst themselves so that it looked like there were live discussions. I wasn't quite there. By the time I met Aaron and finally signed up to Reddit, it was already viable. Yeah. But I do believe that they had done that in the early year or so of Reddit.

48:02And how you'll...

...22Trying to say this in a way that's not like fake it till you make it because I don't actually recommend that. I do think the best way to get started is like get 10 actual people and do it. But I did want to mention this.

49:02If you don't attract enough early users to reach a self-sustaining level of activity, the code doesn't matter. Okay, so then... That doesn't need an exclamation point, it's not that exciting down here.

...45All right. Did I mention? graefchen When people see that there is activity on a site, they are more likely to engage in it. I guess. So either way it will be a lot of work. limesNodders
Yeah, OK. Don't need the "recent", because people are going to read this for years and I'm not going to remember in a year to come back and remove that "recent". Yeah, graef, you're exactly right. That's why I call it a chicken and egg problem: nobody wants to come until people are already there. The other metaphor that's really useful here, kind of on a related thing, is inertia. Once you actually get that flywheel turning and the community is going, they become very resilient. So we've survived, you know, outages that are seven, eight hours long. If someone had a brand new site and it went down for eight hours, you know, one month in, no one has really formed a habit by that point, and so they may just wander off and break the daily habit of checking it, and then they're just gone. Yeah, this paragraph I could write in my head. I have been... not in my head, actually, I have been collecting them, but I've been writing notes about how to do effective moderation and run communities, and one of these decades I will write it. But there has to be a whole chapter about that early, like, how do you actually start a community, because there are so many ways to look at that problem of getting the flywheel turning, getting the egg to hatch a chicken.

51:33I don't know what it is with my writing. I've gotten really interested in or I've gotten really attracted to using the word so in writing. Shows up all the time now. I wonder if that comes from streaming because it really is the last year or so that I keep doing it.

52:08graefchen It reminds me of how I liker certain Streamers over others. Interaction is just such a big part in it that can influence people. limesNodders
So there's that. Yeah, so starting to stream is very similar, in that it is a brutally hard competitive slog where no one wants to watch you if no one is watching, very often. And for the streamer, there isn't a motivation to keep doing it if no one is watching. So what do you have to know to set things up? So let's say... oh, so my plan for the stream here, and folks can throw in questions about our site or codebase anytime, this is still Lobsters office hours even when I'm maintaining stuff when folks aren't asking questions. So I'm going to write up these deployment notes before they completely leave my head, and then I was going to do a demo of my software project, recheck, which I have been alpha testing against Lobsters. And it was very funny, like a week ago it got to... a week, yeah, a week, 10 days ago it got to this very funny, you know, we're talking about the chicken and egg problem, this point where I was alpha testing it against the Lobsters code base and a backup of the production database. And it hit that point where I started getting useful information from running the tool, and I started to switch from "I'm running the command so that I can see if the command works and throws exceptions" into "I'm running the command because I genuinely want to have this tool and use it", and that's the thing that's motivated me to make this tool. So anyways, I'll give you some of that demo and we'll probably see an exception regardless, but it hit a really fun self-sustaining point, and I... I say I. A friend, in a very friendly way that a trusted friend can do, gave me a kick in the ass to get the beta out. So, that's going well. We'll get there right after this Roku stuff. Roku. No, Hatchbox. Wrong H. Alright, so, what do we have to do? We got... logs, all this stuff is things that we had to do to production, right? So where are the Hatchbox create steps? I want to say I met Chris at RailsConf and I should have bent his ear about marketing. There we go.

55:54This is, that's not useful.

56:27Yeah.

57:32so you create yeah there you create a cluster server capitalize these looks for server

58:05Okay, I don't need that. And then I'm going to also pull up Hatchbox. The one thing I did slightly bend Chris Oliver's ear about at RailsConf was how I can't click around in the Hatchbox UI without it showing me a bunch of, like, production keys.

...42So I have that off screen. And i'm going to run through the settings that users of the code base will have to use. Alright, so. Frafabowa hey - how much of a headache is storage for lobsters? apparently your maria server has a 160gb disk - i know you don't host anything which isn't text and avatars which helps, but with backups and all and a modest user base i'm still curious how much it is
So actually, let me break this out to create an account.

59:13Oh, yeah, the production database is nothing like 160 gigs.

...27Yeah, so if DigitalOcean is now offering 160 gig disks, they updated their hosting since we started using them a couple years ago. So we have a 78 gig disk, of which 29 is in use, and I have never attempted to, like, cut down the Ubuntu LTS size or anything. The actual production database is, I mean, just took a... let me check this path off stream. Okay, so it's date first, 2025-07-17. Yeah, so a bzipped copy of the database is 538 megs. It's text, it compresses wonderfully. But this whole thing is a couple of gigs, and part of that is because we store all comment text rendered in the database as markup, or I'm sorry, as HTML. Lobsters has a couple of these quirks because it predates Rails having a really nice built-in cache.

01:00:56So anyways. The actual production database is only a couple of gigs. Happy to answer any other questions. Or if you want a more specific question, we can look at that. If you're curious, the number of rows in the database, off the top of my head, if you combined all tables, it's around 10 million rows, which blows my mind because I'm an older coder, but it's not a heck of a lot of data anymore. That is like a mid-sized database now.

01:02:26khronicz Proider is mispelling -> Provider
Proider. Yup, and it's red. This one's fine. I don't usually think too much about spelling while I'm writing these first drafts, but thank you. And welcome, khronicz. graefchen It sounds like a lot, but also Lobste.rs is not that young either. limesLurk
All right, khronicz. All right, so you create a cluster. pushcx https://lobste.rs/t/announce
khronicz I am also a rails dev.. I work on a fairly large rails app
Frafabowa nah that's about all I was wanting to know, and it's comforting! i've been thinking about starting a hobbyist project but storage concerns have been something which scared me for some reason
Yeah, I mean, Lobsters just turned 13, what, two weeks ago. So if you go to the announce tag you can see that we really only have, like, an announcement every three or four years, which is kind of funny. So, like, most of the announcements are "hey, we're one year older". Ah, welcome, khronicz. Well, you know, if you would like to work on a second, small Rails app, we are always looking for developers. pushcx https://github.com/lobsters/lob…
Good percent 20. Where is this? Here we go. We have lots of issues that are tagged. Good first issue. And I would love to have more contributors to the code base. So you came in after Chronix, but I... khronicz I try and tune most of the time
khronicz *in
pretty much start all these streams by reviewing pull requests and issues, and a lot of the time contributors drop in to talk about their stuff. So if you would like to practice your Rails, happy to have help. Oh great, Twitch tells me that you were a first-time chatter, so I am mostly reacting to that and giving you the intro, but thanks for listening. All right, so you create a cluster. Let's just call it web04. khronicz I assume Lobste.rs is not your day job
How many. Yes. Oh, actually, what matters more is. There's a. DO limitation that it has to be named for your host name, for your domain name if you want. Let me pull up the relation kit. So I mentioned this to Chris in... How do I get over to the support while logged in?

01:06:16I guess I just emailed him. God damn it. I can't link to anything. All right. The server name must match your domain name if you want them to create a, what was it? An SPF record or a reverse DNS record. Where's my email?

...55Yeah, I don't know where I had this conversation. Dang it. Lobsters is not my day job. Lobsters is not a business. And not only does it not have any commercial aspects, I don't take donations for it. And the reason I don't do that is because it kind of, Didn't I link it in here? It comes up often enough. I thought I put it in my bio. graefchen I must say that even as a more or less hobby developer I enjoy the streams, even if I have no idea how Ruby and/or Ruby on Rails works. limesComfy
You'd have to search the site for comments from me, but I don't take donations because it kind of automatically starts creating a second class of user. And you have to worry about, hey, did this person get any extra benefits like being allowed to bend the rules more because they donated? And I don't want that to even be a question. pushcx https://push.cx/stream
veqqio Terrible question, but I assume there's no bus-factor issue, with others e.g. having access to the domain registrar etc.?
Frankly, it's cheap enough to run, it's not worth that stress. Ah, well, graefchen, you might enjoy, and I'm going to be shameless about this one, but if you hop over to the stream archive, I just added to the FAQ how to get started with Ruby, how to get started with Rails, and then, hey look, it says you can practice by contributing to Lobsters. So if you are curious about how Ruby and/or Ruby on Rails work, I would love to have more contributors. There is a bus factor, but it is fairly low. So I don't know. I'm married, so if my spouse is on the bus... Or no, wait, in the bus factor you're getting hit by the bus. I guess we could be crossing the street together. Getting very literal here. There are some contingency plans, and the site is unlikely to fall over before that can go into effect, but there is not currently anyone else with broad access to, like, the domain name registry, because that's just my personal account, or anybody else with root on the production server. That is also in the... well, the seeds for that are in that Lucky 13 Lobsters post that I posted for our birthday two weeks ago, where I'm looking for more mods. I think I've gotten all of the replies I can get, and I'm going to start, with the other mods, engaging with the folks who have applied to be mods, and that is just getting more people involved in helping run the site so that we can reduce bus factor. None of that is a, like, that'll be done by next stream. That's all a weakening. Building the trust to give someone root on the server is a lot more than a few days. We'll get there.

01:10:26veqqio Fair! Also, I'd like to compliment how friendly you are to people who may contribute etc.!
Oh, it's a PTR record, isn't it?

...35veqqio I love the little presentation
Yeah, and they don't actually have a page. There's just this.

...59graefchen Thanks for the link. I am gonna give it a good look. Hopefully when I am less preoccupied with nor doing any University assignments. limesGiggle
You know what? I'm actually annoyed about this. I'm going to link the words limitation rather than the topic.

01:11:16Oh, yeah. Well, I will mention, if any of your university classes have an assignment like "contribute to an open source project": I have, like, written letters of reference to professors, which, I mean, it's a big fancy name, but I have written emails to professors confirming what people's contributions were and what they meant in the scope of the project. And thanks for doing it. So if that is ever part of your classes, I am happy to spend 30 seconds writing an email back that says, yes, graefchen fixed this bug for me, graefchen added this feature, we have X users and Y scale, so this is meaningful work. I definitely want people to benefit from their contributions. I don't know if anybody but me has put Lobsters on their resume. Maybe. All right. So let's do simple. Server.

01:12:26And then I think once the server is created, yeah.

...47graefchen I fully believe I don't have those. Mainly because I do not think that University's in Germany fo such a thing. They appear more or less way more Conservative. limesNodders
Server: create a firewall rule permitting SSH access, which... this has to be the default rule, right? Yeah, yeah. No, that's fine. I don't have to tell people to set up the defaults because they're on by default. All right, so there's not a lot of server config there. The SSH stuff happens automatically. You name your clusters. We didn't need... all right, yeah.

01:13:53And back: thank you for your compliment about my friendliness. I do try. You catch a lot more contributors with honey than vinegar, right?

01:14:17customer server.

...29We do have lots of app settings customized, so I'm kind of working my way around that. Don't create a... oh wait, yeah, you only need to create a database. We use MariaDB in prod but are working to migrate to SQLite, which is what, 830-something? 839? 539.

01:15:39And let's go through the settings.

01:16:17Processes, activity, repository, domains and SSL, environment, database... environment, databases, cron jobs, settings. Really, one of my subtabs in Settings is Settings? All right, Hatchbox. So processes.

01:17:23activity, this is just logs.

...38twitchtd in order to access settings, one must first access settings
Well, there is a big difference between, I'm assuming you're a computer science student, but there is a big difference between computer science and computer programming. Yeah. That's a good way to put it, Thomas. Repository settings. Connect to your GitHub repo.

01:18:11And then there aren't any other changes there. No. Oh, this is one of those pages that shows a bunch of keys just as soon as I look down, though. How funny. All right, so then the next one.

01:19:09graefchen Technically not. More Computer Linguistics and another way more non computing related field. While we do *programming stuff* it is way less then I assume CompSi students would do. limesLurk
veqqio @graefchen creatively, you could fufill a hochschule's practicum with a looot of commits xD
And, you know, other custom stuff. Environment: now this is the big one.

...43Is that like a German thing or is that a person who did something once? I don't know this name. We have BUNDLE_WITHOUT. How do I want to present these? Don't I just have a code block? It's the easiest thing to copy and paste out of.

01:21:02veqqio German thing, they have unis and "highschools" which give up to MA's but are more "applied" and have many required internships etc.
German thing: they have unis and "high schools" which give up to MAs but are more applied. graefchen @veqqio I do not know if the Institute would allow that. But I also would not be fairly surprised if they did. limesLurk
Would you say, if you're familiar with American systems, is that like a trade school? So, it is not an especially formalized system, but in the United States, around the high school level but definitely around the college level, so age veqqio Not at all! Like you can get a master's degree in engineering from one.
veqqio They have trade schools too, but those happen earlier/before
16 to 18, you can kind of split off from the standard academic college/university path to a trade school that teaches a practical skill. So it's things like construction, plumbing, car repair, all those kinds of useful professional skills. I'm not aware that there are trade schools for programming, but there probably should be. And then sometimes there's a little bit of hybridizing where the education for teenagers, like, puts you into one of those tracks at a younger age? I don't know that I can explain the whole thing. Oh, okay. veqqio It's more like North Western
Oh, so it's more like getting a master's at the same time as a bachelor? Interesting process.

01:22:26twitchtd trade schools for programming, wheren't those called coding bootcamps?
How did we generate the ingress password? twitchtd not sure if that's still a thing
I think that's just a random variable.

...59veqqio It's like normal college, but then they'll require a few semesters of internships where you have to write about the experience etc. to contextualize things
veqqio They are seen more highly than universities, for e.g. engineering
I think a lot of coding boot camps wanted to become trade schools for programming, but then they didn't want to actually do the certifications to become trade schools, and then some of them were required to because individual states looked at them and were like, you're obviously a trade school, you have to do the certification. It was a whole chaotic couple of years. I have a couple of folks I know who went to coding boot camps from 2012 to, what was the last one? 2020, yeah, during the pandemic, so 2020. But I don't really know anything about the regulatory side. ghost_user_1984 they were
I think they were effectively trade schools, but not legally trade schools is what I'm getting at. ghost_user_1984 big mess
And so I'm a little hesitant to say anything too much about them because I can't really speak confidently there. veqqio highschool or "Hochschule" is an etymological issue, where a language purist wanted to replace the german word for university. Eventually it came to mean this other grouping
Oh, hey, Hunter. ghost_user_1984 hey
I am writing up our work. ghost_user_1984 oh good, I finally wind up in git
So for anybody who doesn't know, ghost_user_1984 here in the chat is Hunter, who helped enormously with migrating the server over to the Hatchbox deployment tool. So actually, Hunter, I am finally finishing up these notes before I let them fall out of my head. pushcx https://lobste.rs/s/8we4dn/luck…
Oh yeah, we'll have to figure out how I do this in Jujutsu so that I can mark you as a contributor on this commit. So for anybody who doesn't know, I mentioned the birthday post already, but in our birthday post I highlighted that we have had a whole lot of contributors to the site in the last year or so. Maybe that's because of this stream. But Hunter here, ghost_user_1984, is Sir Not-Appearing-in-This-Git-Log, because he's always been pairing with me and, like, it's my hands on the keyboard, so it's my name in the Git commit. But if it's touching production server settings, he is almost certainly a co-author of PudottaPommin Good day bobeeeDobryden
every commit you can see in our Git log in the last year. So it would be good to actually make sure his name shows up. ghost_user_1984 I think I’ve made that list yearly at this point?
So I have mentioned, like, I write letters of reference for people who help with stuff. I don't think Hunter has needed that and is not likely to, but it's always nice to be able to point at your accomplishments, even when they're... This is... Well, you have made the list yearly, but that's a reductio point because that is the first time I've made a list of contributors because... Honestly, this was the first year where there were a bunch of people where I was like, oh, this person has contributed like six PRs. We've had people who have contributed substantial contributions, but people have tended to be one or two and done. ghost_user_1984 because I feel like it’s a once a year big prod ops thing
And I'm seeing rewards when I put time and attention into the people who contribute to the code base. So the more we do it, the better. Oh, you're saying once a year we fix something in prod? That's about right, yeah. ghost_user_1984 oh yeah totally, we did stickers at work for this
Yeah. I'm kind of like mentally thinking through the last couple of years. So I mentioned one of the things that was here in the edit was like we called our server web04. There has been a web01. There wasn't an 02 because I refreshed it without fixing the name. There was an 03. We're on 04. Hunter's been involved in all of those. ghost_user_1984 to encourage folks to help the SRE team
Stickers. Why can RAILS_ENV and RACK_ENV be two different things? That's dangerous. I don't like that.

01:27:33ghost_user_1984 because rails predates rack…
True. 10. Rails. I suppose. Yeah, I guess if Rack has a different notion of production versus... Yeah, I don't know.

01:28:03ghost_user_1984 I think it’s the same thing since 3?
What else do we have here? Okay, that's all our env vars.
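
The RAILS_ENV versus RACK_ENV question can be sketched as a small check to run before a deploy. This is a minimal sketch, not part of the Lobsters or Hatchbox setup: the helper names are hypothetical, and the precedence shown (RAILS_ENV first, then RACK_ENV, then "development") is an assumption about how Rails resolves its environment that is worth confirming against the Rails documentation.

```python
import os


def effective_env(environ):
    """Return the environment name, assuming RAILS_ENV wins over RACK_ENV.

    Assumption: blank values are ignored and the fallback is "development".
    """
    for key in ("RAILS_ENV", "RACK_ENV"):
        value = environ.get(key, "").strip()
        if value:
            return value
    return "development"


def env_mismatch(environ):
    """True when both variables are set and disagree with each other."""
    rails = environ.get("RAILS_ENV", "").strip()
    rack = environ.get("RACK_ENV", "").strip()
    return bool(rails and rack and rails != rack)


if __name__ == "__main__":
    # Report what the current shell would resolve to, and flag a disagreement.
    print(effective_env(os.environ), env_mismatch(os.environ))
```

Setting both variables to the same value in the app's environment settings sidesteps the question of which one a given library reads.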

...17I think I must have... Hunter, what is the same thing since three? And can you try to avoid pronouns, because I think we have a lot of delay. ghost_user_1984 RAILS env VS RACK ENV
And so I'm not able to jump back the right number of seconds in my head to understand what you're referring to. RAILS_ENV versus RACK_ENV. Yeah, I genuinely don't know that I've heard of RACK_ENV before. I don't really know. Cron jobs. OK. Wait, we... we should not have cron jobs. Where's my recurring?

01:29:20Development.

...29I guess that one's not moved over yet. So if there's still a cron job or two, I will document them.

01:30:30I'll be glad to see the last of this. Cron jobs. We've been moving everything over to recurring.yaml. That can go away pretty soon.

01:31:02It's only one of these. Oh, right. Expire page cache is a shell script. And then settings under settings is the big one, right? Yeah.

...25Want to manage.

...39We have made tweaks to our production config files. And we want those tracked in our Git repo. So that's... where's the Hatchbox path?

01:32:12Oh, not that. Remaster.

...30moussx_ Hello! Hatchbox is a service to deploy the lobsters stack?
Moose, hello. Welcome. Yes, you correctly understand what Hatchbox is. We shifted over to using it about two months ago, something like that, and I am finally writing up my notes so that other people can set up without figuring out... well, without having to reverse engineer my chicken scratch notes on the right here. So I'm kind of running through all the pages in the Hatchbox dashboard, which I can't show on stream because it shows production API keys all over the place. moussx_ Makes sense LUL
ghost_user_1984 we have also created the best hatchbox tutorial
And then writing up a like how to set things up.

01:33:14Yeah, I suppose we are actually creating a Hatchbox tutorial. Although if I was creating one properly, so this is one limitation is I am writing up what the settings are. There is like a wizard flow for creating Hatchbox. Because remember, Hunter, it leads you through creating the cluster, the server, the app. And I would have to run through those screens again if I wanted to match them. And I guess I don't care that much. Let me actually make a note of that. Yeah.

01:34:36moussx_ I'm reading the notes a bit, and I see that you mention migrating from MariaDB to SQLite; but the issue is about migrating to Postgres, which one do you want to do eventually ?
Well, it's not just if I don't want to run through it once. It's also if I run through it, then I have to do that on a regular basis to match it. moussx_ I was interested because I wondered what makes soneone migrate from Maria to SQLite, as they seem identical to me
How do I want to say this? moussx_, if you scroll way down in those very long comments, you will see that we have pretty much decided on SQLite and the... title is not updated, which I've been meaning to do, so why not do that now?

01:35:46moussx_ Thanks!
Is there a less lazy way to say that? I don't want to do this every six months or something to make sure it's up to date. Rather than try to stay current with Hatchbox's setup, this should be all the info you need, just in a slightly different order. OK. Vim's Markdown thing is showing me there's white space at the start of this line, but not the others. All right. You know what I just did? I put another "so" in.

01:36:45So there's that.

...54We have tweaks of our production config files. We want those tracked in our Git repo. And we've rigged up an unfortunately tuned twitchtd sqlite is a much simpler database to manage, no need for a separate server, it's just a single file
whatever script to make sure that's updated every deploy. Oh yeah, as for why MariaDB to SQLite: mostly it's right-sizing for our needs. moussx_ oh, I thought maria was also running on a file. Makes complete sense then
SQLite is probably more than enough power and comfort for us now, and it has all the functionality we need, so why run a second one? MariaDB's production... So what Thomas is saying is SQLite writes mostly to a single file. If you grab the one main file, you may miss a couple of things in the write-ahead log, but you will have a working database. With MariaDB, a database is a directory full of files, and it's actually even a little more spread out than that. And you cannot snapshot it with standard Unix tools like cp the way you can with SQLite.
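
The single-file point can be sketched with SQLite's online backup API, which copies a live database, including committed changes still sitting in the write-ahead log, into one destination file. This is a minimal sketch rather than the Lobsters backup procedure: the function name and paths are hypothetical, and it uses only Python's standard `sqlite3` module.

```python
import sqlite3


def snapshot(src_path, dest_path):
    """Copy the SQLite database at src_path into dest_path.

    Uses the online backup API, so the source may be in use by other
    connections while the copy runs.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Copies every page of the source, reading through the WAL if present.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

A call such as `snapshot("production.sqlite3", "backup.sqlite3")` (both file names are placeholders) leaves a standalone file that can then be moved around with ordinary Unix tools.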

01:39:02moussx_ I see
Thank you.

...49brainwane I was distracted - did you talk at all about SQLite's interesting open source approach?
No, I didn't talk at all about it because I was talking about it from the perspective of why one would want to move from MariaDB to SQLite. brainwane sure
That is a very interesting thing about open source if you want to talk about that while I'm pulling over settings. brainwane so, some open source projects are deliberately really open to contributions from nonmaintainers
So let's see, host deploy script. All right, so we got the, and then deploy script is blank.

01:40:36brainwane SQLite is NOT like that -- but, also, they are UPFRONT about it, which I find refreshing
TheYagich just gotta make sure your driver or orm or what have you auto-converts booleans to and from the number type. it fucks me up a lot in sqlite
Yeah, and for an example of what brainwane is saying, you can listen to the last 45 minutes where I shamelessly try and rope people into contributing to Lobsters. Let's see, this is

01:41:39brainwane SQLite's maintainers basically say: if you submit a bug report: awesome! if you submit a patch: we will not use it! Possibly we will treat that as pseudocode and rewrite it our own way
moussx_ I mean, those office hours are great for that, I'm just not sure I want to delve into ruby right now, I want to follow the Zig train and find a project there
moussx_ yeah
Oh yeah, Zig's a pretty popular language, especially if you're doing what it's aimed at, system software, right? And TheYagich, yeah, we use Active Record on the Rails side and it does handle that. That's also sort of a thing in MariaDB, where for booleans and enums... well, enums are, I guess, more complicated, but booleans are a zero and a one in MySQL as well. brainwane https://sqlite.org/copyright.ht… the "Contributed Code" section at the bottom
And MySQL and MariaDB are pretty interchangeable.
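
The boolean conversion TheYagich mentions can be sketched outside Rails. This is a minimal sketch using Python's standard `sqlite3` module, not what Active Record does internally; the `stories` table and `is_deleted` column are hypothetical names chosen for illustration.

```python
import sqlite3

# Store Python bools as 0/1 (sqlite3 already treats bool as an int; this makes it explicit).
sqlite3.register_adapter(bool, int)
# Turn columns declared BOOLEAN back into bools; converters receive raw bytes.
sqlite3.register_converter("BOOLEAN", lambda raw: raw != b"0")

# PARSE_DECLTYPES tells sqlite3 to pick a converter from the declared column type.
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE stories (id INTEGER PRIMARY KEY, is_deleted BOOLEAN)")
conn.execute("INSERT INTO stories (is_deleted) VALUES (?)", (True,))
conn.execute("INSERT INTO stories (is_deleted) VALUES (?)", (False,))

rows = conn.execute("SELECT is_deleted FROM stories ORDER BY id").fetchall()
print(rows)
```

Without the converter and `detect_types`, the same query returns the integers 1 and 0, which is the surprise being described.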

01:42:46So that's all the settings from the Settings sub-tab of the Settings page. Settings to Settings, Settings, Settings. Nothing else on the other tabs. Server, Scripts, Scripts. That's old dead code, isn't that? Yeah, that's dead code. Sorry, I'm poking around the production config and blowing away things. All right.

01:43:47All right, that's pretty reasonable, actually. All right, so that's the core of the Hatchbox config. And then three. So let's make this four.

01:44:23brainwane also the SQLite Code of Ethics https://www.sqlite.org/codeofet… 'This document continues to be used for its original purpose - providing a reference to fill in the "code of conduct" box on supplier registration forms.' is the Rule of St. Benedict, a Christian code of ethics
Why did I say on setup? That's not grammar. Did I get that wrong? No, it was already like that. All right. Oh, yeah. moussx_ @brainwane It makes sense, they "just" want to protect the license at all cost
Lobsters had a spicy thread or two about codes of conduct. Those used to be. So codes of conduct were brainwane the world is a dizzyingly varied tapestry and the SQLite approach is one of the threads in it
a matter of debate for a while, where people were like, why do you have to tell me to be polite? Or why is there a rule that I have to be polite on this project? Yeah. And then so the SQLite code of ethics dropped in at that same time. I remember we had a thread or two on the site about it. All right. So that's everything here, I think.

01:45:38Oh, Hunter, if you're still here, when we were deploying, you gave me a command to run that would email some like email testing service to see if your emails look like spam. If you can give that to me again, I don't think it's in the notes. Yeah, but I would like to put it in the notes.

01:46:20Oh, no, here we go. It was mail-tester.com. Here we go.

...36moussx_ I used that one too, at least
Yeah, there we go.

01:47:01Oh, why am I? Definitely got the wrong clipboard that time.

...12And.

...44So we got this. And then.

01:48:21Test dash whatever. SRE1.

...34This was echo. I must have missed a couple characters. Anyway, I wanted both to put this in the notes, because Hunter had a really useful tool, and also to actually do it. So I'm going to mail-tester. I'm going to paste. Why am I struggling to paste things? There we go. I am sending a test mail in production. It is bouncing out of the queue and we will see if there's any other DNS or DKIM or SPF or PTR records that we have to do. Okay. Look at this, Lobsters doesn't look like spam. My message could be improved. Well, I mean, my message is like a single link. Yeah, we're not on any block lists. That's valuable. Good. This actually, you know,

01:50:10Block lists. Email IP block list check. Don't want that one. Yeah, I know there's one of these I've used before. Yeah, so we missed a step, Hunter and I, when setting up the current production server, and we rolled the dice and got lucky. Yeah, this is the one I've seen. I remember this really irritating animation.

...59But when you set up a new VPS, if it's going to send email, you want to check that very early on.

01:51:14So let's see.

...57moussx_ Do they have an API so you could at least test that at server startup?
pushcx https://dnschecker.org/ip-black…
Create services. Give me one sec. Moose, probably they have an API. pushcx https://mxtoolbox.com/SuperTool…
Well, here, I will throw this link in here so you can see our results. There are a couple of these services. Like you saw, I clicked past mxtoolbox.com. I've used this one before. It doesn't check quite as many block lists. They don't all matter. Honestly, if you check the mxtoolbox one, those are probably just all the important ones. Spammers constantly try to set up servers on every hosting provider. The hosts ban them, but the address goes back into the pool and may have a terrible... may, let's be specific, may be on block lists when you want to start using it. So check your server's IP ASAP, and delete and recreate...

01:53:32brainwane heading off, wishing you well
It's easier to delete than to get off the block lists, especially because mere mortals don't have any insight into the internal block lists of big email providers like Google. Ah, see you later, brainwane. Thanks for dropping in again. Like Google, Apple.
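
The block list check can be sketched by hand, since DNS-based block lists are conventionally queried by reversing the IPv4 octets and looking the result up under the list's zone. This is a minimal sketch, not the method the linked checkers use: the zone `dnsbl.example.org` is a placeholder rather than a real list, the function names are hypothetical, and some public lists reportedly refuse queries from large shared resolvers, so a "not listed" answer is only a hint.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets, then the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the lookup resolves, which a block list uses to mean 'listed'."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:
        # No such name (or the lookup failed): treat as not listed.
        return False
    return True


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; substitute the new server's IP.
    print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

Running a check like this right after creating a VPS is cheap, and it is the point at which deleting and recreating the server is still painless.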

01:54:46What am I getting pinged about?

01:55:02There's an interesting discussion happening in the Lobsters IRC channel where, speaking of trade schools and boot camps, a new boot camp grad is asking some questions about getting a first job, which is hard, especially lately.

...45And then throw that. And that's already in the single back ticks. All right. So that's a twofer for my to-do list because that's, I wanted to run Hunter's email tool. And then also I wanted to document it. What is HSTS? Oh, I didn't even, I'm doing this all out of order. We'll go that direction because I'm kind of doing the to-do list. The thing was to write up the instructions. And like everything, this has taken me a lot longer than I expected on stream. That's fine.

01:56:29Oh, yeah.

...59moussx_ DNS checker has no API, and MXToolbox has a paying API :( so I guess it'll need to be manual (or annoying headless browser stuff)
You know, why don't I just email the Hatchbox guy about... I don't need to gripe in public. Yeah, I would assume that anything that runs... I mean, Lobsters has already seen a lot of hassle with bots and scrapers. And I think we see more than most sites because we are for programmers. And so when new developers are like, I want to write my first bot or first scraper, what's an interesting site I could scrape? Who has interesting data? Oh, hey. And then they think of that site they visit, right? And so I've had to do some rate limiting and other stuff. All right, so all of this stuff is all fine. Yeah, these cluster things, that's clean up, clean up. Archive the repo. Caddy's not serving avatars. We still have a bug for that. Logs. Just punting on that one. moussx_ Yeah, I thought that at least you would be able to get a free API key that would have rate-limiting + limit API key creations per IP or something
All right. Yeah, that's everything. Like, that's that done.

01:58:26Yeah, and I don't know... moussx_ Seems tough though, you probably have to fight AI scrapers too, "definitely human" content like that is going to be a target
You know, I was just describing it this morning and on stream earlier, but maintaining an API is a whole project. People have asked if Lobsters could have an API, or they want to use the JSON that we produce, sometimes inadvertently, because Rails really wants to produce JSON. moussx_ Makes sense
But we don't want to take on the responsibility of maintaining that because we're already constrained on developer time. Yeah, I would assume... Yeah, and it's not just AI scrapers. It's also other ones. For example, a particularly annoying example, the Brave browser, which doesn't identify itself consistently, has a whole scheme where they use users of the browser to scrape pages so that they avoid getting recognized as a scraper, and then they resell that scraping, including to AI trainers, which is the current annoyance, but it's the fact that they enlist individuals that's kind of irksome. And they also BS about it: they say they respect robots.txt. They don't follow robots.txt. They don't comply with robots.txt. They "respect" it.
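
What following robots.txt means for a crawler can be sketched with Python's standard `urllib.robotparser`. This is a minimal sketch: the robots.txt content, the bot names, and the URLs are made up for illustration and are not the Lobsters rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one named bot is banned outright, everyone else
# is only kept out of /search.
EXAMPLE_ROBOTS = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /search
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed(EXAMPLE_ROBOTS, "ExampleBot/1.0", "https://example.com/s/abc"))
    print(allowed(EXAMPLE_ROBOTS, "OtherBot", "https://example.com/s/abc"))
```

The check only works if the crawler sends a consistent user agent and actually runs it before each request, which is exactly the part a site operator cannot enforce.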

02:00:05TheYagich oh man i didnt know that about brave
All right, so there's that. This isn't a bug, really. It's a feature request, which I don't care about. Yeah, we have an issue here somewhere about them and the nonsense they get up to because this comes up. TheYagich add it to the pile of issues with that browser
I wanted to say, didn't I have a comment here on the site? No, it's not... yeah. Oh yeah, and there are like two issues where people are like, blah, but I don't want you to come here. Where is it? Let me find this one, because this one's worth mentioning.

02:01:08pushcx https://lobste.rs/s/iopw1d/what…
Here we go. So the gist of it is, for years we have blocked or tried to block the Brave browser, but it misrepresents itself. So the block mostly didn't work. And then their browser started, not universally, because it's kind of platform dependent and then also can be configured and then also there are plugins, but, like, whenever this was, seven months ago, so back in December, the browser started admitting what it was and the block started working again, because at the time it was just based on user agent. And then while trying to explain what that was in this thread and why we do that, and this links to a lot of the history... But then also, there was a whole ridiculous thing where, yeah, if you search for this comment here: I was reading a nerd backpack forum, because I like reading nerds talk about their passions. I mean, I am one, right? I ran into a claim that Brave sold user data. And that's how I learned that they use their users to scrape sites without respecting robots.txt. And then they made a bunch of misleading statements about it. And so it's this experience of, like, I don't want to care what browser someone is using. But every six months they do something either really sketchy and they're misleading about it, or inappropriate and they're misleading about it. They're never forthcoming. Anyways. I really don't want to care about Brave. I don't want to talk about Brave. It's just, I'm happy to explain why stuff happens. That is the point of the Office Hours stream. But I don't need to beat that one to death when I can just link to the thread. So let me not get distracted from this whole guy. So this is the Hatchbox setup notes. This is getting to be a fairly long README, but that's okay. You know, I was thinking about this header.
So if we look at it rendered, yeah, the README opens kind of in a self-deprecating way, because, I mean, people on the internet spend a lot of time deprecating things, and self-deprecation is a reasonable response to defusing those.

02:04:35But it is not a great intro to the code base.

...50Especially if I'm trying to attract contributors.

02:05:07So maybe it is time that this opening sentence moved down or went away.

...26Yeah, I don't know.

...40TheYagich the lede will be missed, i found it quite funny
So what if I just said. All right, let me maybe I could say, well, the code base.

02:06:13Despite popular descriptions of this site as a ghost town and the codebase as quite sad,

...43At least we have nothing, no relation to the self-help group. Then it can be its own little paragraph.

...58Isn't there a, yes, you know, this is on the about page.

02:07:09Yeah, this made it into the production notes. All right, so. Where am I? We're going to go ahead and just remove HatchBox to do because that is historical. I know I made edits. It's fine. None of them are meaningful if it's done. But what I want to see was app view about... There was something... So there's a thing I have said a bunch of times about transparency. Yeah, that is worth mentioning here. Is it in contributing? Maybe it's just something I've said so many times that I thought I had written it down.

02:08:25The code is open source as part of our.

...40To transparency.

02:09:05So, one of the.

...17Oh, this wiki is gone now. And we keep a list. Did I make a, yeah. So what's the link here on GitHub? Grab that.

02:10:07Yeah.

...36Yeah, yeah, I'm gonna tweak this back to make the language a little stronger, because it's funnier if it, like, accepts the insults in a self-deprecating way, as opposed to the "popularly described as". Despite being quite a ghost town running on a quite sad codebase, at least we have nothing to do with... no relation to the self... moussx_ Have to go sleep now, have a good day! (assuming you're US bassd)
That seems pretty good.

02:11:42twitchtd cya @moussx_
Oh, yeah, have a good one, Moose. Mouse? Moose? Yeah, it's only afternoon here in Chicago. All right. Contributing new bug fixes and features. Development setup. Production setup. Hard thing about the code base. Do I have a way to render this markdown? No, I'll just push it live and force push it. We'll do it live because I know somewhere in here I'll have goofed something.

02:13:22Thank you.

02:14:07So there's all of that. Good. All right. Move Hatchbox info into readme. Don't say reorganize tidy. Probably want to spell Hatchbox correctly, right?

...43All right, let's go look at that on the production GitHub.

...54chamlis_ evening
All right, so TheYagich, your lede is not missed. TheYagich o7
It has moved down just a little.

02:15:08All right.

...21twitchtd o/
So this this feels pretty good. I'm going to kind of like reread this. There's something about seeing it rendered in a different font where I see things I missed or I see typos or I see bad grammar. And writing on stream is harder than writing privately where I'm not trying to keep up a running stream of patter and also keep an eye on the IRC channel or or or.

02:16:00And let's see. Oh yeah, this lobsters-ansible, that's outdated. So, all right, let's grab that. See, this is where I was talking about how the development setup really shaded into production.

...46What is the way of... Yeah. Okay, so hang on. A Docker setup guide. graefchen Documentation is really really hard to get right. limesNodders
If you use that for development. But if you use... install... what do we say? Directly on your machine, not in Docker. YOLO mode. Old people mode. Oh man, it was really interesting at RailsConf: everyone just sort of assumed that Docker was being used for development mode.

02:17:52Obviously I didn't push, so that's fine. As long as I'm touching it, because I have to, there was something about the third party service almost eliminated, third party dependencies. What that really means is third party services, like queues, caches, databases, or SaaS services.

02:18:34Yeah, so I'm saying SaaS. So we were just talking about email. There's Mailgun, Postfix, SendGrid, all of those that we try and avoid.

02:19:16Then, so now there's the development setup. I didn't change anything here, so I'm going to hope that all of that is still valid. I kid, but it's not that old. Production.

...40Okay, so here's the new thing.

...57There we go. I feel good about that now. That's reasonable. Then there's Zulip again. Setup.

02:20:13Yeah, this could be more detailed, but on the other hand, I kind of deliberately want, like, literally step two to be a little bit of a pain in the butt where you have to be able to read Ruby and not be overwhelmed by it, because you really do have to know a bit about Rails to stand up the site. Or you are going to learn quite a bit about Rails in the process. And so I'd rather people have that experience very quickly. That's kind of also why I didn't say, let me give you the git remote add command, because if the first two steps just go right into the jargon, hopefully people will believe these instructions assume you know the basics.

02:21:05All right. I have seen one or two people who don't know any development try and get started with the code base, and that's... they were following, this was, god, a year, two years ago, so there were the older development setup instructions, and they could do some of it, and so they had done a bunch of things. But realistically, you have to have some development skill if you're going to run this code base in prod. It is not... I have joked about making Lobsters as a service to try to fund working on it more, but also, the people who want to start communities think $5 is a lot to pay for hosting. So that was not a viable business, I don't think. twitchtd see discord
All these links good? Good. Discord, yeah. Well, realistically, it's more see Facebook and see Reddit. One of the things I like about Lobsters, well, I mean, I would like it because they're mine, but it has a lot of very small design touches to try and improve the resulting community, where Reddit and Facebook encourage some bad behaviors kind of structurally. And without getting into a pissing match of "we're cooler than them", it's just, they're really aimed at different niches. And I would like people to have more options, especially if it's an open source code base that's hackable. You shouldn't have to run PHP in production anymore if you want to run your own forum for your hobby. And you can sort of run Discourse, but I am really not a fan of that forum software's UI.

02:23:16twitchtd I'm not a fan of discourse either, it always has weird UX decisions
TheYagich i don't like how discourse looks either. so i decided to make my own forum software lol
But making the code base reusable under adverse circumstances by non-technical people is such a huge project that it's realistically something I'm never going to get to. Yeah, I'm. twitchtd @TheYagich link?
Oh, well, TheYagich, I mean, you know, if you want to pull features from Lobsters, knock yourself out. Three-clause BSD is very permissive.

...50Oh, I actually got this markdown right. Yeah, I'm also curious about your forum software, if it's something that you can share a repo or running instance.

02:24:12It's a little bit clunky writing.

02:25:27TheYagich it's currently running just for friends, and it's geared for small uses, but you can explore a bit of it here: https://forum.poto.cafe/topics
TheYagich this instance is written in lua but i'm in the process of rewriting it in python and flask
Looking pretty good, Yagich.

...34Game ideas you'll never make. Oh, I need to be posting in this thread. I have a list, a long list. But yeah, pretty classic forum styling here. There are limitations to this limit offset pagination. We've got a bug about that. That's clever. TheYagich it's css
Is this JavaScript where the avatars move, or is it CSS? Ooh. Hmm. How did you do it? Where do I find a tall one? Come here, page two. Here we go. So part of the reason this catches my attention is we don't serve JavaScript to logged out users on lobsters. twitchtd is it the object-fit: contain?
So seeing clever interactive features in CSS, I'm always looking for good ideas to steal. Is it this object fit contain? I don't know this.

02:27:01should be resized to fit its container. OK, so no, this is about maintaining the aspect ratio, right?

...18TheYagich no, it's on the parent container
TheYagich it's just a sticky
I bet it's this position sticky on the, yep. But then how does it stop at the bottom? Oh, it's still contained. I'm so hung up on how floats work in CSS because I've spent so much time getting used to them that, of course, I would think, oh, well, it's going to be relative to the window or it's going to float outside of its container. But no, that's very tidy.

...53Oh, man, I wonder if chamlis is watching. chamlis, do you want to put a floating header on comments? Might be nice for a long comment. Sometimes they bug the heck out of me when I'm scrolling. Sometimes they're neat. I don't know. Anyways, good luck with your forum. I can see you got the core features there.

02:29:00TheYagich thank you!
chamlis_ I'll stick that on the ideas list! position sticky is really nice, great idea to use it for forum avs
What did I. I'm missing a step on deployment.

...17What is. Shoot. So there was something I did. On production.

...34to wire up the cron job or the root deploy job. And I'm trying to think of the minimal, you just deleted the file. How do I pull that back out?

02:30:09There's a way to show a specific file, and I never remember what it is in git, and I haven't learned it in Jujutsu, though I saw it go by when I ran the manual. jj file show, that's what it was.

...48Yes, this was the setup. I even put it in a code block, I just forgot to bring that in. OK, so let's... that on production is... all right, so.

02:31:20Thank you.

02:32:09So you don't have to be rude.

...18So let's let's just bring that up here. Actually, you have to do this.

...33So this becomes four. And this becomes five.

02:33:09so let's grab all of this stuff and then to test your mail config

...43And then what was the other thing I was going to say logs are pretty important, right? yeah, you don't eat this. You need,

02:34:19The admin and.

...43You.

02:35:12Five, six. User... so run the Rails console and create a user for yourself, set is_admin true. We also need to create a category and one tag for the site to run at all. I think that's it.

02:36:35It's shifted from many. I've moved a lot of stuff out of the Rails console. All right.

02:37:13All right, there's that. Let's push that up. Call that done.

...25Whoa, that's very big. It thinks the hash is a... I thought if I indented four spaces, it would take it as a code block, but I guess not. Because I don't even need those. I did say group above. I don't love seeing

02:38:11All right, that's a little calmer. We don't need to scream about making symlinks. All right, well, Hunter, if you're still here, there's our mostly complete setup instructions for Hatchbox, which, so where, I'm gonna grab my personal browser, because I have tweeted about, tweeted, skeeted, you know, made Bluesky microblog posts about how on office hours I'm going to do the Hatchbox stuff, and every time I mention Hatchbox, Chris Oliver clicks like on it. So I'm going to. GoofballGuinness delive31Fire
Yeah, so here's one. excid3, there we go. Because you click like every time I mention Hatchbox, I wanted to send you the link to my setup instructions, to the finished pushcx https://bsky.app/profile/push.c…
instructions. There we go. So I will post that, and rather than figure out how to get my personal browser into dark mode, I will just throw that link in there and y'all can see the conversation. Right, cool. Welcome, GoofballGuinness. There's a Chicago brewery called, what, Gumball? Gumball Head? I don't know. Reminds me of that because of Guinness.

02:40:27I'm really happy with that. As always, writing is slower than I wanted, but that's something really useful. I've seen a couple of sister sites kind of struggle through getting set up, so maybe that'll be more. Or alternately, maybe it's just me very gently and carefully easing my way into being willing to support sister sites some more. I'm still very reluctant to take on any kind of maintenance burden that they would have, but yeah, that feels good. So speaking of releasing code and it feeling good, let's play with recheck. Thomas, if you're listening, you have asked a couple of times, so we're getting there. Actually, Thomas, do you mind signing off if you're present still? I'll throw you a link here. pushcx https://github.com/recheckdev/r…
So I have pushed the code base live. And the readme is here. Yeah, let's in the scratch. Oh, I didn't do these two things. Yeah, let's throw them in issues.

02:42:17twitchtd recheck time! :)
chamlis_ yay!
So let's do that. Let's assign me. That is a bug. And then, fuck, I created a bug in the recheck repo. Trying to move too fast.

...42Can I? Because I admin both. Can I migrate this? Transfer issue. I did this with the lobsters repo. No, it's because it can't move between organizations, I guess. God damn it.

02:43:14Let me. So anyways, have you ever made a bug in production? Because I just did. Boy, what a relatable experience. Let's talk about that. Let me get this down, because if I don't write it down, I'm going to forget to do it. And then I'm going to need the logs.

...39I feel like I'm missing a sit and spin joke. Assign myself. And then the other one is Hatchbox: after a couple of months, let's re-enable HSTS, which is, can I just... this change at, there we go.

02:44:26Since this is a prod thing, it probably has to be done by me or Hunter, but I will still just, I'm not going to assign it because maybe somebody will explain all the steps and I won't have to think. All right, so. Let's close that because that's just a mistake. How is there already a... Oh, I know this person. I already have two pull requests. Well, I haven't set up the spammy DCO bot to force him to sign the, what is it? The developer's certificate of origin. dlamz CLA
Basically the, like, yeah, I pinky swear I have the rights to submit code. It's like the contributor license agreement. Yeah, so DCO is a simpler version of that. So a contributor license agreement really says, hey, Peter, you can do anything with this code. You can effectively take it as public domain. And if somebody signed a CLA to me, there is always that risk that I could pick it up and say, oh, I'm changing licenses, and now that open source code you contributed is closed source or is under my weird business source license. And I would like to make more of the public commitment that I won't try and pull that switch. But I would still like to at least have somebody come along and say something of, yes, I really do pinky swear that I have the rights to contribute under this license and that you can reuse it. So, I don't know. Maybe it is looking for trouble, but it feels responsible in both directions. Anyways, let's get back to what I wanted to do. The reason I go into it is it's literally the top item on my to-do list, and I figured it out yesterday. If you look on my personal GitHub, you see chamlis is already opening, and you'll see I have a repo where I'm sorting out the DCO bot and fighting GitHub Actions. So the important thing about recheck is this first sentence, that it is a very young beta. We will probably get some kind of exception out of it. It is a bit like test software, where it wants to rescue exceptions.
Yeah, anyway, bear with me, it's early. I am trying to do the, like, it's OK to be embarrassed by a very young thing in public, because it's much better to get it out there and start getting feedback and real-world use, even if it's just my personal real-world use. That 25... I'm not actually going to push this commit, but I want to name it. What is today? 21. Stream recheck demo, on the screen. Yeah, because I'm going to find this commit and be like, what in the heck is this commit? All right, so yeah, that was everything on the to-do list. Oh, I haven't even said anything funny.

02:47:54So demo time. The gist of it is, on a Rails project, we get there by adding a one-liner to the Gemfile. And I am going to slightly tweak it. Yeah, these are roughly by group. I guess this goes in the database section. Right? It's a database thing. It's also sort of a deployment thing. I don't know. Not worth overthinking. The one tweak I'm going to make is path. So you can install a gem from a local path without having to do the whole build step. And I am assuming here for the purposes of this demo that either I am going to break something and want to fix it live on stream, or we are going to think of a thing to tweak and improve, and I will do it live on stream. And I don't want to cut a gem release every single time. So push CXR public Ruby. recheck rails, I bet that's even correct. Let's find out. Look at that. It's like I've spent a ton of time in this repo. And then that's recheck-rails. Because recheck-rails depends on recheck, I'm going to do the exact same thing for it. Actually, let me do it in the other order. I don't think Bundler's path is order dependent, but we'll make it easy, right? So then I say bundle. And I even remembered that Gemfile config correctly. Okay, so that's the only thing that is off script from the readme. Yeah, speaking of writing, I tried to put a lot of voice... I fucked up that link. That was one of Dusty's commits, I guess. I've tried to put a lot of voice into this readme. And I hope it is recognizably the voice I am talking to you with on the stream all the time, of: we are all developers, we are figuring this stuff out together. I am explaining this to you like I would to any other colleague.

02:50:28So the setup command. Oh man, come on, no crashes. Bam, look at that, it actually didn't crash. My standards are so low. So you can see it booted the Rails app, because there's that heinous inline partial, and it kind of creates a skeleton of directories. I'm kind of assuming, well, actually, because it's Thomas and chamlis and dlamz, who I know have hung around the streams a bunch, I haven't really explained what recheck is, but it's like a test suite for your production database. Because when you run a site in production, you end up with weird stuff in the database. And in the readme, I give an example of: you might call Order.find(123).valid? and get false. Like, I have seen this as a real thing in Rails apps, and we will actually have exactly this bug in two seconds, because the default out-of-the-box check suite finds one of these in Lobsters. Actually, it finds a bunch of these, but I'm going to find one in two seconds. Yeah, I've got to expand this paragraph. Kind of a longer version of it I give is: you will see impossible invalid data in your production database, because everyone writes stuff like, I will insert this row into the table and I'll have a null in this column, and I will queue the background job, and the background job goes and hits the third-party service or does the complicated calculation or waits for 10 other rows to be inserted or talks to my other service, right? And then the background job fills in the null. And so realistically the null is only there for, like, a couple hundred milliseconds. And you would like to be able to say the column is not null. But you can't really say that the column is not null after a second or two. That's not
really a constraint you can express. You can sort of get there with SQL constraints, but that can't hook back in to something that is then going to, like, run back to the user's Rails session and insert an error for them that's meaningful, to say that the third-party service is down. Like, you know, I say this and I know I'm talking to developers, and so everyone is like, well, twitchtd most product dbs I've seen don't have checks for data quality and integrity, the only place I've seen something like this was with data warehouse where analysis and accurate numbers are more visible to the business
It would be a four-week project, but I could probably rig that up. Or like, come on, realistically, instead of gold-plating every single background job and SQL constraint and, and, and, yeah, we make these things work. And then, like, if we happen to be deploying the background job at the same time the third-party service is down, and the production database is under heavy load, and Mercury is in retrograde, and it's high tide, the job just fails and you're left with a null in that one-in-10-million record. And then six months later, the customer says, hey, I have this one page in the admin dashboard that crashes when I try and load it. And then you go look at your production database and you get this kind of nonsense. This is a real thing. This just happens at scale. These are limitations of the tools. It's OK. dlamz i do a bootleg version of this on a django app at work. all sorts of things you wish you could express as cross table check constraints.
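The "null for a few hundred milliseconds" rule described above is easy to state as a check even though it is awkward as a database constraint. A minimal plain-Ruby sketch, assuming a hypothetical Invoice record with a tax_total column that a background job fills in, and a made-up five-second grace period (none of these names come from the stream or from Recheck):

```ruby
# Hypothetical record shape; a real app would use its ORM model instead.
Invoice = Struct.new(:id, :tax_total, :created_at)

GRACE_PERIOD = 5 # seconds the background job is allowed to take (assumed)

# Returns records whose nullable column is still empty after the grace period.
def stuck_nulls(records, now:, grace: GRACE_PERIOD)
  records.select { |r| r.tax_total.nil? && (now - r.created_at) > grace }
end

now = Time.now
invoices = [
  Invoice.new(1, 12.50, now - 3600), # job finished long ago
  Invoice.new(2, nil, now - 1),      # job is probably still running
  Invoice.new(3, nil, now - 86_400)  # job failed a day ago and nobody noticed
]

stuck = stuck_nulls(invoices, now: now)
puts stuck.map(&:id).inspect
```

The freshly inserted row is left alone; only the row whose null has outlived the grace period is reported.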
I'm not going to pivot into the we just have to write perfect code without any bugs and without any possibility of data integrity errors. Because realistically, yeah, cross-table check constraints is another big one. So one of my other go-to examples is you'll have two records. that are both valid, but the combination is invalid. So I set an order here. So like an e-commerce, you might have an e-commerce order that is pending. That's totally valid. An order can be pending. You might have a shipment record that says it is delivered. That is totally valid. But if that shipment says I am for that pending order, like that's what your foreign key is, that is an impossible state of affairs, right? Either you checked out that t-shirt e-commerce order or you didn't. And if you didn't, it shouldn't be sitting in a box at your front door. But these things happen in production databases. Explaining that individual records are valid but the combination is invalid is a thing most ORMs cannot express well. twitchtd credit & debit tables are the only place I've seen where data integrity was built explicitly for it in a product
TheYagich tell that to aliexpress...
Like, they often can, but again, we're getting into it's a giant project to gold-plate that sort of thing. Don't do that. Yeah, AliExpress. Yeah, you know, I say the e-commerce order. I won't name a previous consulting client, because I won't talk about them having a bug, but that is a real thing I saw in a real e-commerce company. And I don't remember the frequency because it's been a long time. But these are real things that happen in production databases, and it's okay. The goal is not some kind of mathematical perfection. The goal is being really effective. And that includes cost-effective engineering, because it's not engineering if there aren't dollars in the equation. And programming time is very expensive, even with vibe coding. Oh, God, there's that sad story kicking around, gone viral, of some guy who had a vibe coding tool hooked up and it dropped the production database. That was sad. So jumping back to recheck here. The command I ran introspected all of the Lobsters models and started creating a check suite for them. And this regression directory is empty, because that is for, hey, you think you fixed this bug, but especially if you're not 100% certain what caused that bug, because it only happens once every nine months, that's a regression check. Write that check. You will thank yourself when it reoccurs in nine months and you have a check that has an explicit link to your runbook and link to your doc and link to your GitHub PR where you last thought you fixed it, and that explains all your context. You are doing yourself a solid in the future, in the exact same way that when you write tests, you are doing yourself a favor nine months in the future when that test fails and then you look at it and you have all this context. So let's look at one of these. Specifically, because this is a demo, we are going to look at the recheck validation for, yes, story. So this is the embarrassing part where it is not finished. You see all these coming soons?
So hang on, let's get the model up. Model story. No, models, oh man, I'm auto-completing the wrong one. All right, so on the left, we got the actual Lobsters Story object. And on the right, we have the generated validation checker. So Recheck has introspected on all of these, because if we have a validation that the title is in the length 3 to 130, we should be able to query for any title that is less than three characters or more than, oh, I'm sorry, 150, I misread, or more than 150, and any record we find is invalid. Similarly, there's presence. It's not allowed to be blank. And in Rails, presence is both not null and not a string that's all whitespace. So it generated a query that's a little futzy. I'm doing what I can to improve them, but we can look at the generator for it. It's a little wild. That if we trim, oh, the token, that's the include one. We're a couple lines down on title. That none of those are blank. twitchtd oh you auto generated recheck validations, based on rails validations
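To make the "query for the invalid rows" idea concrete, here is a plain-Ruby sketch of the two title rules just described (length 3 to 150, and presence meaning neither null nor all whitespace). The Story struct and both method names are stand-ins invented for this illustration; the real generated checker runs SQL against the database and is not shown here.

```ruby
Story = Struct.new(:id, :title) # hypothetical stand-in for the real model

TITLE_LENGTH = (3..150)

# Mirrors the Rails presence rule: nil, empty, and whitespace-only are blank.
def blank_title?(title)
  title.nil? || title.match?(/\A[[:space:]]*\z/)
end

# Each "query" returns only the records that break the rule.
def query_presence_title(stories)
  stories.select { |s| blank_title?(s.title) }
end

def query_length_title(stories)
  # nil titles are left to the presence query in this sketch
  stories.reject { |s| s.title.nil? || TITLE_LENGTH.cover?(s.title.length) }
end

stories = [
  Story.new(1, "A perfectly fine title"),
  Story.new(2, ""),
  Story.new(3, "   "),
  Story.new(4, nil),
  Story.new(5, "ab"),
  Story.new(6, "x" * 151)
]

puts query_presence_title(stories).map(&:id).inspect
puts query_length_title(stories).map(&:id).inspect
```

An empty result from either query is the passing case: nothing returned means nothing is known to be wrong.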
And there is the placeholder here for title. And my friends who kicked me in the ass to be like, look, this shows that it works. This is a checklist for your pre 1.0. Yep. So rather than call them recheck validations, each individual one of these, this is a query, and then in this whole file is called a checker, a collection. And every query runs against the checks. And for this check, if it finds an invalid record, you failed the check. The reason for this, I will show this over in the Ruby readme, is sometimes it is much easier to express what a bad data is in Ruby than in SQL. So an example I gave is like, you might have an initialize and then query for users with avatars because they should be updated in the LDAP. Well, your check wants to say, let's go hit the LDAP service and make sure that they have exactly one record. twitchtd yup, like a valid uri or something more complicated that doesn't have native db functions for it
And then similarly, you can say, let's check that the avatar actually exists as a file on S3. And those are things we can't express in SQL. Yeah. I mean, especially because these are cross system integrations. So the primary purpose of recheck is check your database, but it is kind of quietly a general purpose tool. Because this is checking, are these two different data sources in sync? Is this file up on the file hosting service? I'll show you some of the more generated ones because there's more general stuff. I don't want people to miss out on the general idea, but yeah. So on this one, the validation in the individual check is very simple of if we found anything, it's bad. So let's run it, right? Let's demo. So it is... We're going to say bundle exec, recheck, run. I am just going to run the ones in the validation directory, and I will explain why in a bit.
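The split between a query (which records to look at) and a check (is this record OK, decided in Ruby) can be sketched without any database. Everything below is hypothetical: the Account struct, the STORED_KEYS list standing in for a file host, and the two lambdas are illustrations of the idea, not Recheck's API.

```ruby
Account = Struct.new(:id, :avatar_key)

# Stand-in for a file host; a real check would ask the storage service instead.
STORED_KEYS = ["avatars/1.png", "avatars/3.png"].freeze

# Query: narrow down to the records worth checking (accounts with an avatar).
query = ->(accounts) { accounts.reject { |a| a.avatar_key.nil? } }

# Check: plain Ruby that returns true when the record is fine.
check = ->(account) { STORED_KEYS.include?(account.avatar_key) }

accounts = [
  Account.new(1, "avatars/1.png"),
  Account.new(2, "avatars/2.png"), # row exists, file is missing
  Account.new(3, "avatars/3.png"),
  Account.new(4, nil)              # no avatar, so the query skips it
]

missing = query.call(accounts).reject { |a| check.call(a) }
puts missing.map(&:id).inspect
```

Because the check is ordinary Ruby, it can call whatever second system holds the other half of the truth, which is what SQL alone cannot do.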

03:02:14It's like test output. This kind of thing is pretty familiar, right? So these names match all those guys that were in the validation directory. And it's telling you, oh, well, there were this many queries in the checker. Records failed, records passed. It is okay to see that there are zeros here, because if we're just querying for invalid data, nothing returned is a success. However, those are not successes. Very conspicuously, that is not a success. The story validation check called query presence title ran the check no invalid records found, and it failed on record 84270. How coincidental that the query I explained everything with as an example has failed. And I kind of say it's silly, but this is a real Lobsters production database backup from two days ago. This record is bad in prod right now. This is a real invalid record that I found. And then I realized I should feature it in the demo. But when it started working and producing useful output was when my friends started kicking me in the ass to get the beta out. So let's just look at that one first, right? And then I'll talk more about recheck. But this is a normal part of using it: you see some bad data. So if we say story equals Story.find, paste. What do we got here? story.valid? False. Right, bullshit. I haven't even started breaking this story and it's invalid coming right out of the database. That shouldn't be possible. This happens all the time. And I showed you this incomplete one. I guarantee you, when I fill the rest of these in over the next week or two, don't hold me to that promise, I'm working as fast as I can, but when I fill those in very soon, I guarantee they will find more invalid data, because it's not just that when people add and update validations, they forget to query databases or they write imperfect migrations. It is that weird stuff happens at scale. And we have 10 million records in the database. That is 10 million chances for something odd to happen.
Short. Why did I type that? Because I'm talking. twitchtd using stale data, not using transactions correctly
Story.errors. So if we look at this, This is the title validation failing. Probably because the title was blank. That's weird as shit. That shouldn't be possible, right?

03:05:33So if we look at the story attributes, we see a story that someone submitted that has a URL. And one conspicuous thing here is that is about the spammiest domain I've seen in like 10 minutes, fullremote.it. You know it's selling something, right? It's straight to the root of it. It's named after some commercial service. Somebody submitted something, and then I would bet I say story moderation. Oh, no, I got to go the other direction. Moderation where story is story and comment is null, nil. This didn't get deleted? I thought for sure I was going to see a moderation log entry deleting it. veqqio fullremote looks cute, one button outputs "quack"
So wait, what's the short ID? Let's go look at it on prod. If this 500s, story was removed. veqqio oh, the button is a duck emoji too!
If story was removed, why don't I have a moderation log?

03:06:48twitchtd maybe the comment is not nil
dlamz sounds like a new check is necessary
chamlis_ deleted but not moderated, deleted by user?
Did I get the moderation query wrong? No, the submitter removed it.

...59Yeah, it was deleted by the user. And the fact, yeah, so is_moderated is false, is_deleted is true. That combination of... so I hope this feels realistic because this is real data, but this process of, there's something weird in the database. I wonder if... so I haven't actually thought about what this bug is, but knowing that the user deleted it, I wonder if in the stories controller, def delete, destroy. Yeah, okay, so the user can edit it. We call update story attributes. Now that should run validation. Yeah, I have no idea how this user managed to delete the title off of their own story. They should not have been able to do that.

03:08:08So this bug is a mystery, right? Like that's a real, I legit don't know what caused this bug. But at least I do know there is a bug. And I have the same debugging situation I would have otherwise of, well, I could put in a placeholder title, right? And then this individual record is fine. We're probably never going to undelete this record. chamlis_ that validation wasn't added until about a year ago
That's probably the appropriate thing here. Maybe I could add a check for... the validation wasn't added until a year ago. Oh, chamlis, is that the answer? Ah, good catch. So this is a case where the validation was changed and nobody wrote a migration. A random contributor without access to the prod data probably could not have guessed that one. twitchtd nice find
Yeah. So, all right. So I'm deciding, do I want to leave that record on prod so I have it available as a test, or do I want to fix it right now? We'll put it on the post-stream: fix story. And it's not still in the clipboard. Fix story 89-something? 84270.

03:10:10So that's kind of the heart of recheck, right? That is the main workflow of we should catch if there's bad data in our databases. We want to know these things.

...31Recheck has a concept called reporters. So I am running the exact same suite of checkers. And instead of getting, there's an exception. Oh, I knew it was going to happen. Live coding. Undefined method for nil. Oh, I know what this bug is. This is, I was changing this API. Yeah. I want to fix it in the JSON reporter.

03:11:30What line was that? 226. There is a set of hooks that reporters work on. And a reporter is just when the runner runs, it calls your hooks. And so there is a hook for around the whole run. There is a hook for around each checker. There is a hook for around each individual check. And... Halt. Why did it fire a halt?
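The nested-hooks shape described here (one hook around the whole run, one around each checker, one around each individual check) can be sketched in plain Ruby. Every name below is hypothetical and chosen for the illustration; this is not Recheck's actual reporter API.

```ruby
# A reporter that just records what it saw, to make the nesting visible.
class TraceReporter
  attr_reader :events

  def initialize
    @events = []
  end

  def around_run
    @events << "run start"
    result = yield
    @events << "run end"
    result
  end

  def around_checker(name)
    @events << "checker #{name}"
    yield
  end

  def around_check(name, record)
    passed = yield
    @events << "#{name} #{record.inspect}: #{passed ? 'pass' : 'fail'}"
    passed
  end
end

# checkers maps a checker name to { records: [...], check: ->(record) { ... } }
def run_checkers(checkers, reporter)
  reporter.around_run do
    checkers.each do |name, checker|
      reporter.around_checker(name) do
        checker[:records].each do |record|
          reporter.around_check(name, record) { checker[:check].call(record) }
        end
      end
    end
  end
end

reporter = TraceReporter.new
checkers = {
  "title_present" => { records: ["Hello", "  "], check: ->(t) { !t.strip.empty? } }
}
run_checkers(checkers, reporter)
puts reporter.events
```

Since the runner only ever talks to the hooks, swapping in a reporter that prints, emits JSON, or stays silent does not touch the checkers.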

03:12:10Let's say can't run. Oh, so there's some safety stuff that I was touching. I bet I can run this on the individual. So what's happening is the runner tries to be a little bit clever and say, hey, if you told me to run a check and there weren't any queries to run, let me give you an error. And then the JSON reporter is not updated to handle that error. I will have to add a test for that. So here is an example of running just that story validator. So the default reporter gives that standard out that you saw before. The JSON reporter kicks out a bunch of stats in a JSON format. And then this is where the sales pitch begins. So Recheck is on the Sidekiq model of: it is a very handy piece of software that you can use in production. It's open source, have fun. If you are a business, you probably want to pay for Sidekiq Pro or Recheck Pro. One of the features of Recheck Pro is a reporter that sends JSON off to a nice dashboard that you can show business stakeholders. I will not be demoing the nice dashboard because it is nowhere near demoable, but that's the gist of Recheck as a product and sales pitch. The other thing you can do with reporters is just catch those failures and say, file JIRA tickets, or, I call it cron, right? Oh, don't need to pipe that to jq. So this is a reporter that also needs its API updated. All right, so anyways, you're seeing a real beta. Checker class, yes. These are just out of date. This one is a default reporter that I wrote, or an included reporter that I wrote, that is intended for cron jobs. The easiest way to run this in production is to just run this command in prod. And the cron reporter is like the default reporter, but it doesn't print anything except about failures. chamlis_ ahh cool
So it's useful in cron jobs, which presumably you have wired up to some kind of email or bug tracker or logging service. The cron reporter, if nothing fails, it doesn't print anything. Class Cron. So this one, like, you can see it's around run, and then, if there are any errors, OK, print a summary. But otherwise, don't print shit. Let's see what else. I think that's all the ones that come out of the box. Reporters, this... oh right, it's not run list reporters. No, the command is list... can't remember my own freaking... it's just reporters. Every command line tool is different, right? So we got cron, default, json I've shown you, silent. So it is intended as a high-ROI engineering tool, where I am not at all dogmatic that this thing can only have read access to your database. You can only do these things if you can write. Like, these will become queries that find invalid data. If there was a way to fix this up and say, well, you know, so let's strip this down, right? So checkers are intended to be very cheap, in the same way it's always cheap to make another unit test file, right? So if that was the only query: check fix titles, it gets a story, story.title equals placeholder, Date.current, you know. If this bug started happening every month, and it was only ever, like, deleted records, and it didn't matter, you can fix it directly from the checker. And if you remember, I showed in the README that example of, like, there's the LDAP service, there's the S3 service. If this can be, like, you know, kick the data to the S3 bucket again, restart the LDAP job queue, like, if you can do that, just do that. You don't actually have to bother a human. I won't tell anyone that you detected a problem and fixed it. And again, like, yes, theoretically you would never have problems and you would fix whatever was generating problems in the first place. But all of these systems we work on are these complicated, automatically distributed systems, and it's not just that we're distributed of
We have web workers and we have database servers, and then we have, oh, it's a cluster database and it has failover, and then we rely on a third party for email and S3 for serving and Cloudflare for this. Schlepping state around between different services that may or may not be down is an imperfect thing. There will never be... right, like, there's the CAP impossibility theorem. If you can write a checker that finds bad data and can just fix it, just fix it. It's okay.
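A checker that repairs what it finds might look roughly like the following plain-Ruby sketch. The StoryRecord struct, the method name, and the policy of only touching already-deleted records are all assumptions made for this example; the placeholder-title idea is the one floated on stream, and the date passed in below is arbitrary.

```ruby
require "date"

StoryRecord = Struct.new(:id, :title, :is_deleted) # hypothetical stand-in

# Finds blank titles and, when the record is already deleted (so nobody will
# see the placeholder), fills one in. Returns the ids that were repaired.
def fix_blank_titles(stories, today: Date.today)
  fixed = []
  stories.each do |story|
    next unless story.title.to_s.strip.empty?
    next unless story.is_deleted # assumed policy: leave live records for a human

    story.title = "placeholder #{today}"
    fixed << story.id
  end
  fixed
end

stories = [
  StoryRecord.new(1, "Fine", false),
  StoryRecord.new(2, "", true),   # deleted and blank: safe to patch
  StoryRecord.new(3, nil, false)  # live and blank: report, don't patch
]

fixed_ids = fix_blank_titles(stories, today: Date.new(2025, 1, 1))
puts fixed_ids.inspect
puts stories[1].title
```

The live record with the missing title is deliberately left untouched, so it still shows up as a failure for someone to look at.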

03:18:52So that's the heart of recheck. So these are the validation ones where it's introspecting. Let's not save that. Where it's introspecting. There is an even simpler check I can give you out of the box, right? Let's go with story as long as I'm here. That's a pretty simple check. Grab all the records in the database, ask them if they're valid. And it is a very simple check and it will not pass. And I don't say that just in the narrow sense of you've seen that we have a story that's missing a title in the production database. I will say in basically every app that has been in production for more than two years and has more than 100,000 records in the database, cannot run this check against all of the tables and pass. It just, it basically doesn't happen. So I can generate this out of the box and it's useful. The folks who've worked on very large systems are looking at this and thinking of the term IOPS or like bandwidth pressure. Yeah, this walks the whole database. Yeah, this retrieves the whole database because in a Rails app, the concept of being valid is expressed in Ruby, not SQL. At a certain scale, this is a bad checker to run. If we didn't have, well, we have 120,000 stories in the database. If we had 120 billion or 120 gigabytes of stories in the database, this would be a bad checker to run. What you would want to do, and this is me transitioning into the pro pitch again, let me be clear, is what you would want to do is kind of decorate this and say where created at is greater than now minus interval one hour. Well, let's check everything that was created in the last hour. 
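The "only look at the last hour" decoration can be sketched in plain Ruby as a cutoff applied before the expensive validity call. StoryRow, its toy valid? rule, and invalid_since are hypothetical names for this illustration; in a Rails app the cutoff would live in the database query so old rows are never loaded at all.

```ruby
StoryRow = Struct.new(:id, :title, :created_at) do
  # Toy validity rule for the sketch: the title must not be blank.
  def valid?
    !title.to_s.strip.empty?
  end
end

# Only records created after the cutoff are asked whether they are valid.
def invalid_since(records, cutoff)
  records.select { |r| r.created_at > cutoff }.reject(&:valid?)
end

now = Time.now
rows = [
  StoryRow.new(1, "Old and fine", now - 7 * 86_400),
  StoryRow.new(2, "", now - 7 * 86_400), # invalid, but outside the window
  StoryRow.new(3, "New and fine", now - 600),
  StoryRow.new(4, "  ", now - 60)        # invalid and recent
]

recent_invalid = invalid_since(rows, now - 3600)
puts recent_invalid.map(&:id).inspect
```

The old invalid row is missed by this window on purpose; catching it is the job of a wider window run less often.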
And really, you want to run that like, I don't know, every five minutes, because if data is going to go bad, it's going to go bad fast. But then you want to run this, I don't know, every hour, right? Check everything in the last day. And then once a day, let's check everything from the last week. And you know, if you get huge, like up into the billions of records, tens of gigabytes of data, well, more than tens, right? Then you want to say, not all, but story dot stochastically chamlis_ Story.some
time-weighted sample, find each. And so you can do very smart things like, okay, so for records that are years old, let's sample one percent of them, right? For stories that are a decade old, let's sample a tenth of a percent of them, and we'll make it a random tenth of a percent, and we'll run that once a day. And then on average, every record will get looked at every couple of months. And that sort of thing is fine for a huge scale. You don't have to be perfect. This is engineering. There are dollars in the equation. The goal is these are cheap ways to find these things before your customers report that parts of the dashboard can't load or the job queue backs up because one job is failing. And I keep going for those rare examples, but the other one, especially for regressions, like that title, if that's a one in a million thing, fine. It is very, very nice to know when a one in a million problem becomes a one in 10 problem, because I have seen systems do that, where they have a very nonlinear change in the rate of errors. And if you have a regression checker, one that goes for a known bug, you can find those very easily and promptly, right? So if we had a query for, oh, so I talked about like, there was that e-commerce, right? E-commerce, and we're gonna say query pending, shipped orders, right? So we're looking for any records where, Order where status is pending. And then I also want to say, I'm exercising the Rails API in my head here, right, I want to joins shipments. And the left join is fine. And we want to say where shipments. Yeah, we'll sort that as shipments as a multi. Shipments status. Actually, really, we want anywhere a shipment exists on a pending order, right?
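The age-weighted sampling idea can be sketched as follows. This is an illustration under assumptions, not a Recheck feature: `sample_rate`, `sample_for_check`, and the age cutoffs are hypothetical, with the 1 percent and 0.1 percent rates taken from the numbers mentioned above.

```ruby
# Hypothetical sampling rates by record age; not a Recheck API.
def sample_rate(age_days)
  if age_days >= 3650
    0.001 # a decade or older: sample 0.1%
  elsif age_days >= 365
    0.01  # years old: sample 1%
  else
    1.0   # recent: check everything
  end
end

AgedRecord = Struct.new(:id, :age_days)

# Keep each record with probability sample_rate(age). The rng is
# injectable so a run can be reproduced.
def sample_for_check(records, rng: Random.new)
  records.select { |r| rng.rand < sample_rate(r.age_days) }
end

# With one run per day at rate p, a record waits about 1/p days on average.
def expected_days_between_checks(age_days)
  (1.0 / sample_rate(age_days)).round
end

old = Array.new(10_000) { |i| AgedRecord.new(i, 400) }
puts sample_for_check(old, rng: Random.new(1)).size # roughly 100 of 10,000
puts expected_days_between_checks(400)              # 100
```

At a daily 1 percent sample the average wait is about 100 days, in the neighborhood of the "every couple of months" figure. The 0.1 percent tier stretches that to roughly 1,000 days, which is the cost side of the trade.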

03:25:20I should have thought of what this query was. It would be a better demo if I had thought of what this was. It's easier to write from the other side. Shipment.where.

...44Right. Exercising the Rails ORM off the top of my head is not the best part of this demo, clearly. So that was the example I gave earlier of, is there any shipment that exists for a pending order in the database, right? Now we could write, well, let's just check none exists. If we can query for it, fine. So do I not have the, right, I like these endless methods. Is it, there it is. So I didn't actually have the generator write endless methods, because they're a recent enough Ruby feature that I didn't want to include that, but obviously I like endless methods because I'm a big Ruby nerd. So should one purchase Recheck Pro for your business and it reports to the database, every once in a while this is going to throw a record into that database and say, like, hey, order 123 is invalid. And on the dashboard, oh, come on.
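The regression query being worked out here (does any shipment exist for an order that is still pending?) can be shown without Rails. The structs and the `pending_orders_with_shipments` helper below are hypothetical. With real models and a `has_many :shipments` association, something like `Order.where(status: "pending").joins(:shipments).distinct` would probably express the same thing, but that depends on a schema the stream does not show.

```ruby
# Hypothetical e-commerce rows; the names are illustrative only.
ShopOrder = Struct.new(:id, :status)
ShopShipment = Struct.new(:id, :order_id)

# A pending order should never have a shipment. Return the ones that do.
def pending_orders_with_shipments(orders, shipments)
  shipped_order_ids = shipments.map(&:order_id).uniq
  orders.select { |o| o.status == "pending" && shipped_order_ids.include?(o.id) }
end

ORDERS = [
  ShopOrder.new(1, "completed"),
  ShopOrder.new(2, "pending"),
  ShopOrder.new(3, "pending")
].freeze
SHIPMENTS = [ShopShipment.new(10, 1), ShopShipment.new(11, 2)].freeze

# Order 2 is pending yet has a shipment, so it is the record to report.
puts pending_orders_with_shipments(ORDERS, SHIPMENTS).map(&:id).inspect
```

The check passes when the result is empty, matching the "let's just check none exists" framing.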

03:27:22Dashboard has your typical issue work queue of new invalid records. You work them. You mark them as fixed. You mark the bug as fixed. You link to your playbook or anything. Or alternately, and especially if this is like an e-commerce, you don't really care if the order is older than a year. So we could add that here, right? We could add that to the where. Or you can also just mark invalid records as known and say don't alert on them and move on with your life. Sometimes that's okay. Especially if there is, like, a bug in data you don't want to delete because one process needs it, but it's invalid in a way that's going to trip some checkers, it's okay to ignore those. twitchtd I can see the dashboard being useful for spotting long term trends, like oh, story titles have been trickling in with empty strings, so our bug wasn't fixed yet, like your regression tests
What else? I'm trying to think of what other stuff I should show here. Yeah. Yeah. Oh, I got there because I was talking about error rates. If that whole idea of a pending order with a shipment attached If that shows up like three times in history and your dashboard shows like three little bumps years ago, and then all of a sudden that graph spikes up and says, oh yeah, now it's true for 10% of orders. You know, you go and hit your big red incident button and you summon the war room or whatever your team's jargon is, and you fix that. But you know that it changed from a one in a million problem very quickly instead of very slowly. Yeah, I've talked about, especially that e-commerce order, I've talked about these things are inevitable in complex systems, and everyone has the hubris of like, but I can write the perfect system. You know, if they just used functional programming, if they just used SQL constraints, if they just used Kotlin, they wouldn't have these bugs and it wouldn't have, come on, I know, I have been that guy, right? Everybody has been that guy at some point. However, bad data is still inevitable. because humans have access to production. TheYagich that's why the only real solution is to stop programming
So like if I go and clean up that order, right? If I go and clean up that title, right? So Story.find(84270). I could have typoed this ID. I could typo the new title in here. I could call save(validate: false) and skip it and put in new and different invalid data. And like that e-commerce order, the order that's pending, maybe I go to mark that as completed, or maybe I delete the shipment, right? So if I said Shipment.find(123).destroy, did I get that order number right? Especially when it's like this long, did I get that order number right? Maybe I just inserted more bad data. Maybe your customer support team has an admin interface into the data that doesn't run every single validation because part of what they're doing is fixing other invalid data.

03:31:30twitchtd I could see recheck being a great tool for analytics teams as well
As long as humans have access to production, you know, not to sound like Agent Smith, but as long as there are humans in the loop, there will be real bugs in your data. So even if your favorite programming language and favorite testing tool and favorite everything work perfectly, you will still have invalid data. Thomas, how do you mean for analytics teams?

03:32:10So I'm going to skim this. I'm trying to think of if there's anything else here I want to demo. Huh. Apparently I want to demo fixing Markdown just like I fixed it on the other one.

...33Why does it think these quotes are starting? All right, I will fight GitHub Markdown later. So I showed you making reporters. twitchtd they deal with a lot of high visisble data to the business, but they usually don't query the product db, they usually have a dedicated data warehouse where you can do full table operations (OLAP) really quickly
Oh yeah, there's a little demo one here in the readme of, well, maybe you just email the team responsible. So in the same way that your unit tests are just Ruby scripts. Ah, yes. And especially if you have any kind of data lake, data warehouse, because you don't want to do your big giant queries on your online transaction processing database, your OLTP, yeah, of course it can run against that. So yeah, maybe you can write a reporter that says, all right, well, when this around checker, around run goes, if there are any errors, we will just email the team and we can look that up on the checker. Where did I put that? Up in the example check. Yeah, so like here on a checker, Let's just add some metadata, like who is the team this is for? Security. Great. Then your email reporter can say, let's look up the team and email them. I've tried to make a pretty general purpose tool so that even the open source version is twitchtd some data quality issues due to ETLs being flaky or not implemented correctly, caused our analytics team to build checks into their queries so that they don't show bad data to c-levels
A couple of lines of code to customize and plug it into whatever your production system is because the important thing is. twitchtd former company, not working there anymore :)
You want to find this stuff, you want to get it to the right people, and what that looks like inside your company, that might be email, that might be Slack, that might be Jira, that might be GitHub issues, that might be the Recheck Pro dashboard, available now, you know.
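The reporter idea (tag each checker with a team, then route failures to that team) might look something like this sketch. It does not use Recheck's actual reporter interface, which is not shown in the transcript. `TeamReporter`, `CheckFailure`, the checker names, and the `deliver` callable are all hypothetical.

```ruby
# Hypothetical failure record: which checker failed, which team owns it,
# and which record tripped it.
CheckFailure = Struct.new(:checker, :team, :record_id)

# Collects failures during a run, then hands one message per team to a
# delivery callable (email, Slack, an issue tracker, and so on).
class TeamReporter
  def initialize(deliver)
    @deliver = deliver
    @failures = []
  end

  def record(failure)
    @failures << failure
  end

  def finish
    @failures.group_by(&:team).each do |team, failures|
      lines = failures.map { |f| "#{f.checker}: record #{f.record_id}" }
      @deliver.call(team, lines.join("\n"))
    end
  end
end

SENT = []
reporter = TeamReporter.new(->(team, body) { SENT << [team, body] })
reporter.record(CheckFailure.new("StoryTitleChecker", "content", 84270))
reporter.record(CheckFailure.new("TlsChecker", "security", "lobste.rs"))
reporter.record(CheckFailure.new("WhoisChecker", "security", "lobste.rs"))
reporter.finish

SENT.each { |team, body| puts "to #{team}:\n#{body}" }
```

Keeping delivery behind a callable is one way to get the "couple of lines to customize" property: swapping email for Slack changes the lambda, not the reporter.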

03:34:46Ah, well, I hope you left on good conditions and can pitch it to them. But yes, that is the kind of thing that this can help with, especially if you run Recheck in a way that it can do queries against both databases. You could write a query that says, because Recheck itself, oh, where'd it go? Recheck, story, valid, no, valid story. Recheck doesn't care that this method returns an ActiveRecord query. It cares that it returns records in some kind of enumerable. And that's one thing I could demo. If we go look at site, out of the box, three more of the checks that get generated. There's a DNS checker. So by querying, we are going to say, let's query for lobste.rs, our production, and lobsters.dev, our backup. And then standardrb formats it. Oh yeah, I didn't feature it, but when it generated those, just as a cute little thing, it detected that standardrb was installed and ran the code formatter on the checks, so that hopefully they look like you linted your project to look. So if we run this, recheck site, was that email, what did I, DNS. Oops, the name has slipped out of sync with the file name. Did I change commits?

03:36:48Recheck run. Yeah. Well, I mentioned, I promised there would be bugs. What did I do different between these? I typoed. I missed the letter C. Okay, I just couldn't see it because I've been going for three and a half hours. So this one said, okay, well, we had a query and we ran six checks because there were these two records. We checked these three things: do you have MX records, do you have an SOA record, do you have an SPF record. So you saw me run that mail testing tool maybe two hours ago on stream. Some of that stuff I can put in a check and know immediately if I have managed to goof up our DNS. It doesn't have to be the database. So let's see, the other out of the box is TLS. So for your domains, check that your certificate is not expiring in the next 30 days, and I picked that because Let's Encrypt renews at 30 days. So maybe this wants to be 29. I'll fiddle with it. You want to check that the certificate has the right domain name on it, especially in the days when you manually did certs. This was so easy to get wrong and so irritating. You want to check that it doesn't have real old insecure ciphers because they are weak. And this guy also has a little helper method for fetching the certificates. So this is kind of the shape of, a check doesn't have to be a single record in the database. Most of the time it is, but it can do more. And then the last one, WHOIS. Oh yeah, check that your domain isn't expiring soon. Check that it is registrar locked. You know, even though I've worked on a top level domain, I sometimes get registrar versus registry versus registrant wrong. At least it's not Latin. So this kind of stuff, like all of these are trivial, right? Like everybody has said, hey, my domain name isn't expired. Hey, it's locked at the registrar. Hey, I have two name servers, right? The value comes if this runs in the background once a day, you know that it didn't accidentally get broken along the way somewhere.
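A certificate-expiry check of the kind described can be sketched with Ruby's standard `openssl` and `socket` libraries. This is not the generated Recheck check, whose code is not shown in the transcript. `expiring_soon?` and `fetch_not_after` are hypothetical names, and the 30 versus 29 day window reflects the renewal timing mentioned on stream.

```ruby
require "openssl"
require "socket"

SECONDS_PER_DAY = 86_400

# Pure check: does the certificate expire inside the warning window?
def expiring_soon?(not_after, now: Time.now, window_days: 30)
  not_after - now < window_days * SECONDS_PER_DAY
end

# Helper that fetches the leaf certificate's expiry time over the network.
def fetch_not_after(host, port = 443)
  tcp = TCPSocket.new(host, port)
  ssl = OpenSSL::SSL::SSLSocket.new(tcp, OpenSSL::SSL::SSLContext.new)
  ssl.hostname = host # send SNI so the server picks the right certificate
  ssl.connect
  ssl.peer_cert.not_after
ensure
  ssl&.close
  tcp&.close
end

now = Time.utc(2025, 7, 21)
puts expiring_soon?(now + 45 * SECONDS_PER_DAY, now: now)                    # false
puts expiring_soon?(now + 10 * SECONDS_PER_DAY, now: now)                    # true
puts expiring_soon?(now + 29.5 * SECONDS_PER_DAY, now: now, window_days: 29) # false
```

Separating the pure comparison from the network fetch keeps the threshold logic testable offline. A daily run would call something like `expiring_soon?(fetch_not_after("lobste.rs"))`. If renewal really happens at 30 days remaining, a 29 day window only fires when renewal is at least a day late, which is the reasoning behind "maybe this wants to be 29".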
Because these things are easy to break and they are embarrassing to break. Lobsters had an outage because our domain name expired a couple of years ago. That was embarrassing. A check could have caught it. I mean, how many sites have we heard of that have had downtime because an SSL cert expired, right? You see one of these once a month, even with Let's Encrypt. So this isn't quite an observability tool, right? It doesn't quite fit that rubric. It's not, is the CPU graph reasonable? But it's sort of observability of the production database. It's sort of a unit test of your production database. twitchtd it's like a spec check?
It's also sort of a unit test of DNS. So it's kind of a new class of tool. Yeah, like it's in the neighborhood of all of these things, but it is its own thing. It's not quite a test. It's not quite observability. And I think in the same way basically every non-trivial programming project should have tests, I think every non-trivial production service should have checks. It's just a class of thing that we want to use. So that's why I'm writing this. Because as soon as I saw this one, you know, I say saw this one, so I was inspired by seeing the equivalent service inside of Stripe. None of this code is from them, none of this data is from them, because of course they have their own everything. But I've talked to developers about this idea for more than a year now, and I have run into at least, let me be deliberate, yeah, at least eight independent reinventions of checks. It's funny, like, I talk to developers and some people are like, oh, I get it, that could be kind of useful. And then one developer in 10 gets kind of a wild light in their eyes and it's like, we have a version of this, and it's this crappy service that runs on cron jobs and Slack notifications and it's tied together with duct tape and it's nobody's full-time job to maintain, but we can never turn it off. And we call it revalidators, we call it double checkers, we call it integratron. Everybody comes up with their own name too. But it's never anybody's full-time job, because we don't have a word for talking about it. And so I'm trying, you know, even if the whole product side of it fails, and I really hope it doesn't, I really want people to see this idea and go, oh, I want something like that. I would catch bugs faster with something like that. I have suspicions that my bugs are recurring, and even if I fix them, I don't have 100% confidence that I fixed them.

03:43:26So I would like people to understand the basic idea of Recheck and checkers. And then yes, buy licenses for Recheck Pro. And eventually Recheck Enterprise. I just learned, I was talking about the beta to somebody yesterday and they were like, oh yeah, if you do SOC 2 compliance, you're required to document your backup plan and then also demonstrate that you can restore from your backup and validate that the backup has something equivalent to the prod data in it. Document this. twitchtd "validate" = count(*)
I didn't realize that was part of SOC 2. That went straight onto the to-do list for Recheck Enterprise coming no time soon to theaters near you. Yeah, that could be one part of it. And you can get more sophisticated if you need. This is a general purpose tool in the same way this thing loops a Ruby array. You know?

03:44:46Did I call it check based or not? Yeah, no, I renamed that. I have fought these names so much. There's been so much polishing.

03:45:01Right. You could, like. We could say, I'm going to use e-commerce, right, so we're going to say: order equals prod dot Order, order by random, dot first, right? Like, give me a random order.

...45prod_order.id. Hmm, I'm actually, because it's so oriented, maybe the API is not expressive enough, because I can't really, eh, we could slop it. We can just say, prod is prod_order. In production, I would do this with like a data class in Ruby, because I love data classes.
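The backup-validation idea being sketched here (pick random production rows and confirm the restored backup holds the same values, instead of only comparing counts) could look like this in plain Ruby. `OrderRow` and `sample_mismatches` are hypothetical. A `Struct` stands in for the data class mentioned above so the sketch runs on older Rubies too (`Data.define` needs Ruby 3.2 or newer).

```ruby
# Hypothetical row shape shared by production and the restored backup.
OrderRow = Struct.new(:id, :status, :total_cents)

# Sample rows from prod and compare each to the backup row with the same id.
# Returns the sampled ids that are missing from the backup or differ in it.
def sample_mismatches(prod, backup, sample_size:, rng: Random.new)
  backup_by_id = backup.to_h { |row| [row.id, row] }
  prod.sample(sample_size, random: rng)
      .reject { |row| backup_by_id[row.id] == row }
      .map(&:id)
end

PROD_ROWS = [
  OrderRow.new(1, "completed", 1200),
  OrderRow.new(2, "pending", 800),
  OrderRow.new(3, "completed", 4500)
].freeze
# The backup has the same row count as prod but a different value in row 2,
# which a bare count(*) comparison would not notice.
BACKUP_ROWS = [
  OrderRow.new(1, "completed", 1200),
  OrderRow.new(2, "pending", 0),
  OrderRow.new(3, "completed", 4500)
].freeze

puts sample_mismatches(PROD_ROWS, BACKUP_ROWS, sample_size: 3).sort.inspect
```

A real run would sample a small fraction, so any single run can miss a bad row. The value comes from repeating it on a schedule.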

03:46:33So instead of just validating that there are the same number of records, just sample me a random record. And if you put up Recheck, so a very important thing to say that hasn't come up is, even as I've pitched it, Recheck is not a SaaS because, oh my God, do I not want production access to your database. Recheck is on-prem software. You figure out how you want to host it in your data center. Do not give me the production password. I don't need to paint that kind of target on my back. There are still, like, supply chain attacks to worry about, and being in this position makes you a more valuable target, but actually having production access is, you know, horrifically concerning. I don't want that. Recheck is on-prem. And so if you set it up, and you know, you could set up multiple instances, right? You could set up one instance that talks to prod, one that talks to the analytics, one that talks to two services. And this could be your SOC 2 Type II. It bugs the hell out of me that SOC uses two different numeral systems. SOC 2 Type II is SOC two, type two. Yeah. That's just a pet peeve. It's a tool. Use it in general ways. Your tools should find your problems for you. That's the whole idea. I never talked about the model checks. So if I said run... I meant to start one of these in the background somewhere along the way. So this is the very simple validator you saw before that just, for each model, calls all.find_each and then loops over them. So Lobsters has 12 categories in production. They're all valid. It is printing a dot or an X for each thousand records, I believe is what the default reporter does. The comments have had a number of changes to their validations, so there is not a run of 1000 comments in the database that doesn't have an error in it, or a validation that doesn't match. It's going to print out a giant list of comment IDs. There are roughly 600,000 comments in the database. So we are going to see 600 X's.
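The progress output described above (one character per thousand records) can be imitated with `each_slice`. This is a guess at the behavior based on the description, not the default reporter's source. `progress_line` and `CheckedRow` are hypothetical.

```ruby
# Hypothetical record that already knows whether it is valid.
CheckedRow = Struct.new(:id, :ok) do
  def valid?
    ok
  end
end

# One character per batch: "." if every record in the batch is valid,
# "X" if at least one is not.
def progress_line(records, batch_size: 1000)
  records.each_slice(batch_size)
         .map { |batch| batch.all?(&:valid?) ? "." : "X" }
         .join
end

one_bad = Array.new(3000) { |i| CheckedRow.new(i, i != 1500) }
puts progress_line(one_bad) # .X.

# One invalid record in every batch turns every character into an X.
spread = Array.new(3000) { |i| CheckedRow.new(i, i % 1000 != 7) }
puts progress_line(spread)  # XXX
```

This is why a failure rate of about 1 percent can still paint the whole line with X's: a batch of 1,000 only needs one bad record, and 600,000 comments make 600 batches.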
So that's going to take a second to run, which is why I meant to start that earlier, because then, you know, I could alt-tab back. We'll leave that chugging for a second. What else is there in the readme worth talking about? twitchtd so every batch of 1000 comments has valid? == false records? wow
Yeah, that is the weirdest markdown issue with highlighting. Oh, yeah, you can pass things. Yeah. Oh, here we go. So finally, a couple thousand records in, we got two batches of a thousand. Yeah. So this is not a, like, prod's on fire, right? Like prod is up. You can see comments right now. We're okay. This is almost certainly, we added a validation and we forgot to run a migration to fill in a column for basically everything. But it's not 100%. It's like one in a thousand. The runner does have a feature, there isn't a great way to demo it, but the runner does have a feature that if it starts running a checker and the first 20 checks all fail or all throw an exception, it calls that a blanket failure and it stops running that checker. Because the situation where you add a validation and there are a million old records is so common, or the situation where you write a checker for the first time and you have an exception in it, it's also common. So it just stops. There's an example of scripting the runner because, again, it's a tool. Use it. Some notes about how to run it in prod. Admin screenshot to come. And that's about that. Yeah. I would like to let this... All right. So several comments have errors. Where was that little summary, though? Yeah, so 588,098 passed, 5,871 failed. So that really is like a 1% failure rate. And when you see something that high, it's probably just a validation or something. We could just pick one of these and look it up, right? While this thing chugs in the background.
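The blanket-failure rule described above (stop a checker when its first 20 checks all fail or raise) can be sketched like this. It is a reading of the spoken description, not Recheck's runner. `run_checker` and the result hash are hypothetical, and only the threshold of 20 comes from the stream.

```ruby
BLANKET_THRESHOLD = 20

# Runs `check` (a callable returning true or false, possibly raising) over
# records. Stops early when the first `threshold` checks have all failed.
def run_checker(records, check, threshold: BLANKET_THRESHOLD)
  passed = 0
  failed = 0
  records.each do |record|
    ok = begin
      check.call(record)
    rescue StandardError
      false # an exception counts as a failed check
    end
    if ok
      passed += 1
    else
      failed += 1
    end
    if passed.zero? && failed == threshold
      return { status: :blanket_failure, passed: passed, failed: failed }
    end
  end
  { status: :completed, passed: passed, failed: failed }
end

always_raises = ->(_record) { raise ArgumentError, "bug in the checker" }
p run_checker(1..1_000_000, always_raises)

one_percent_bad = ->(n) { n % 100 != 0 }
p run_checker(1..10_000, one_percent_bad)
```

The first run stops after 20 records instead of walking a million. The second completes and reports 9,900 passed and 100 failed, the same shape as the roughly 1 percent comment failure rate (5,871 of 593,969).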

03:52:18I hope this comment isn't awful. What are the odds, right? Oh, so this is a validation that's supposed to prevent you from replying to a comment that gets deleted.

...49Now my autocomplete muscle memory has changed.

...57So the validation, yeah. So the way this validation is written is valid at create time, but is not valid later. So this isn't bad data is in the database, that's this validation is wrong and has a bug in it. Because regardless of the business rule, the business logic that you can't reply to deleted comments, I should be able to load the record out of the database into the console, edit something, and save it back. So in this case, recheck has found a bug in the validation. And that's usually the case when you see so many fail. The rest of these are looking pretty good, right? So like common state is going, invitations are going. But again, this is this idea of, can you even just call .valid on it? Lobsters is pretty good about validations. There are more things that could have validations for that it doesn't, I don't know. You know, another example of, I would probably, in production, I would probably delete the hidden story checker, because that's a join table. It doesn't have any state in it. I would probably delete, well, oh, well, maybe I wouldn't delete the link checker if a couple somewhere way in the middle have errors. I expected link was just going to run clean. It's sort of a generated, extracted database. twitchtd this seems like a good thing to run against lobsters for the sqlite migration
And the code treats them in a transient way. We store them in the database to save on parsing. But if the comment is edited, we just blow away the records and recreate them. Yes, this seems like an invaluable thing to run actually before the SQLite migration. So let's clean up the data and make it great. Let's clean up the validation, especially that comment one. I don't know what this link one is. So I have run this suite before. This is a very human thing I'm going to say. I have run this before. I started getting good stuff out of it. But alphabetically later, checkers failed enough. And so story is late. Story is important. I have to scroll up to see link. And so... I haven't checked on these link failures and don't know what they are. I'm going to throw up the note that I'm going to wind down the stream here, the little last call note. How do I toggle you on? There we go. Can I put an outline on that so it's a little more readable?

03:56:20There is a checkbox for outline that doesn't do anything. Oh, Twitch. So anyways, we're getting towards the end of the stream. I'm curious to see the link checker. Yeah, what is this?

...44Come here. It's also just kind of fun to use this because... You find all these little mysteries. I don't love having bugs, but I do like investigating them. And I do like fixing them. URL is not valid? What's your URL? Why is there a year at the end of your URL? With a space? Okay, so this is a bug in the link model. Oh, that's new. So I said links are automatically, come here, app model link, are automatically extracted from comment records. Yes, we say link where from comment, we call parse links, which is a method on comment.

03:57:49All of those files are newer and the autocomplete is tie-breaking on newer. So that's why they keep jumping in front of our links. So we scan the href. A bug in a regular expression. Who would have thought it? So yeah, Recheck just discovered another real bug in lobsters. I don't know what this is offhand, but it's grabbing some text after a link. So if I said l.comment.shortid, really? It doesn't have a... I'm forgetting the join model here. Oh, this one is...

03:58:46This one is a link to a story. That's an even sillier bug. I know what that must be then. If it's assigning a link from a story, is it just taking the title? Maybe it comes from the story model.

03:59:30there's there's a different regex with a bug in it yeah that's a little suspicious so anyways yes recheck has found another real bug in lobsters i think that's three oh yeah i didn't even write a regression check but let's hammer that out actually that'll be fast let's finish there so if i wrote Don't need this. I wrote.

04:00:10Class thread ID, but this is a real bug. I saw this in an early version checker base and. That query, we're going to find a we want to find any. Comment where. Was it the thread ID that was wrong? Where did I put this?

...51I made a note somewhere. Ah. Yeah, I didn't actually query like that. I don't remember when I saw this a while ago. So if we find a comment where the parent thread ID, that's actually cleaner to express. That's fine. couple of comments in the real production database check invalid this one also is false because if you can find anything like that let's give you a file name you are recheck regression thread id and if i had a like a github issue you know i would say this is github issue one two three and i would also say like you know should have was reported in bug one two three fixed in pr number four five six so when this runs in the background and this situation happens again and i come look at the failing check i have left myself the breadcrumbs to actually go fix the bug to understand it regression oh no what did i do wrong

04:02:35Uninitialized constant Comment is not... I didn't load the helper. It's got to load the helper to boot the Rails environment.

...57so here is a real record again this is all in the production database this one was a mystery because this was one of those before my time kind of things we look at these this comment there is a discussion where this was supposed to be a reply to kyle jcs if it seems if i click reply and then preview and then post my comment it loses its parent so tedu posted this comment as a reply And 12 years ago, so like, this is early on in the site, actually later in the site than I would have thought in the first year of the site, at least we'll say the comment got posted as a top level comment without a parent. And JCS came in and he fixed the production data. He also exercised it on his own comment. Yeah, so we're highlighting a couple of these in the same discussion. So this is that example of as long as humans have access to the production data, it's not safe, right? So he fixed the parent comment ID. He did not remember to fix the thread ID. This is an implementation detail of lobsters that we could probably drop now. but each reply thread starting from a top level parent has a unique ID. And this mostly exists because MySQL didn't really support, well, and Rails definitely didn't support recursive common table expressions at the time the site was originally written 13 years ago. So this thread ID worked around that limitation And when he fixed this bug, he fixed one of the fields, but not both. And so this one, I have a really confident understanding of what the bug was and totally why it appeared and what happened. But if this was like a rare intermittent kind of thing, if this was, because this is, you know, literally two comments in 600,000, if I had a bug with a one in 300,000 Error rate, especially in a distributed system with background jobs with third party api's with all of this other stuff if I fixed a bug at that kind of frequency. I would almost always write a checker for it. 
Because it's going to take days weeks months to happen again, and this one, even if it only happened because a person went and edited the record. I mean, look at this. This is literally one line of code and one, two, three, four, five, six, seven, eight lines of boilerplate. That is a nothing to maintain. Even if it is a human error, I would just Leave this in. Leave this on. Maybe five years from now, I go and edit a comment and I goof this up. Having a checker is so cheap. Having a check is so cheap. The slowest thing about this check was booting the Rails app. Actually running this check and you know it doesn't break out a timing profile, but actually running that one query is a you know it took a millisecond against the database.
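The thread ID regression check discussed above boils down to one comparison: a reply should carry the same thread ID as its parent. A plain-Ruby sketch follows. The `ThreadedComment` struct, its field names (`parent_comment_id`, `thread_id`), and `mismatched_thread_ids` are assumptions modeled on the description, and the issue and PR numbers in the comment are the same placeholders used on stream.

```ruby
# Hypothetical comment rows; field names are assumed from the description.
ThreadedComment = Struct.new(:id, :parent_comment_id, :thread_id, keyword_init: true)

# Regression check. Breadcrumbs for whoever sees it fail later:
# reported in issue #123 (placeholder), fixed in PR #456 (placeholder).
# A reply must share its parent's thread_id.
def mismatched_thread_ids(comments)
  by_id = comments.to_h { |c| [c.id, c] }
  comments.select do |c|
    parent = by_id[c.parent_comment_id]
    parent && parent.thread_id != c.thread_id
  end
end

COMMENTS = [
  ThreadedComment.new(id: 1, parent_comment_id: nil, thread_id: 10),
  ThreadedComment.new(id: 2, parent_comment_id: 1, thread_id: 10),
  # Posted as top-level by mistake, then re-parented by hand: the parent
  # was fixed but the thread_id still points at its own thread.
  ThreadedComment.new(id: 3, parent_comment_id: 1, thread_id: 11)
].freeze

puts mismatched_thread_ids(COMMENTS).map(&:id).inspect
```

The check is a single comparison plus boilerplate, which is the point made above: it costs almost nothing to keep around in case a manual edit repeats the mistake years later.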

04:07:00Checks, man, they're great. I just closed the terminal with the... So Vim, when you resize terminals, it's not really made to be a multiplexer and it chopped off the output of running that whole suite. I don't want to keep the stream running long enough to get that all again, but you could see that these kinds of issues, like something is wrong in this regex that needs to be reported and debugged and get some tests around it, we caught that with a check. Someone fixed a different bug in comment and didn't quite fix the data. We caught that with a check. Our domain name expired. We could have caught that with a check. There's a lot of value in this, and I hope also that businesses see there's a lot of value in it. So that's the demo. Oh, big stretch. I'm going to call the stream here. Yeah. Little disorganized, only got one exception, two exceptions out of the tool. That's not bad. That's a pretty good demo. There's a placeholder site there, but I want that in the logs. pushcx https://github.com/recheckdev/r…
And then let me grab these. pushcx https://github.com/recheckdev/r…
You know, tell your friends, tell your boss. What would be most useful actually is, if you have a, like, smaller SaaS, and here I'm thinking, you know, a dev team under 15 or so, where things are a little less formal and you have production access and would like to beta test this, I would love to have some beta testers. I will pair with you on adding it to your code base, although you saw it's just one command. Takes about 30 seconds. And helping you start finding your bugs. Because you know they exist. This kind of stuff happens. Anyway, that's my big pitch. That's my demo. Thomas, I hope you enjoyed it. Because you especially... I keep saying you because you've been like... chamlis_ thanks! recheck looks great
twitchtd thanks for the stream
TheYagich thanks for the stream!
prompt in the Twitch chat every time I've mentioned it the last couple of streams, which I really appreciate. That kind of encouragement, thank you. And that's all I got. All right, so the next scheduled stream will be Thursday morning, 9 a.m. Chicago time. And I didn't, there's a, yeah, we're not to it yet. In a couple of weeks, in mid-August, I will be out of town for a while and there won't be streams, but that's like three weeks out. So I will hopefully see folks on Thursday. Thank you, chamlis, I appreciate it. And at some point here the beta will be mature enough that I will just add this into Lobsters, and Lobsters will be one of its first users. So we're getting there. Thanks for tuning in. See you next time. Take care.