Wrap it in quotes and pray

Streamed

Zig’s AI ban and the poker metaphor for investing in contributors, similar to PR #1907. A wave of LLM-authored slop pull requests. Caches dropdowns and hide actions. Fixing the cache on /rss by iterating in prod (yolo); config file expressiveness. Watching green scrolling text like the Matrix. Do observability tools earn their complexity?

scratch


topics
  https://lobste.rs/s/ifcyr1/contributor_poker_zig_s_ai_ban
    https://github.com/lobsters/lobsters/pull/1907
    this last week's wave of llm PRs
      https://github.com/lobsters/lobsters/pull/2007
      https://github.com/lobsters/lobsters/pull/2006
      https://github.com/lobsters/lobsters/pull/2005
      https://github.com/lobsters/lobsters/pull/2004 maybe?
      https://github.com/lobsters/lobsters/pull/2003
      https://github.com/lobsters/lobsters/pull/2002
  PRs
    caches dropdown https://github.com/lobsters/lobsters/pull/1982
    hide action https://github.com/lobsters/lobsters/pull/1962
  issues
  sqlite perf



title

post-stream
  schedule anubis with hunter
    

Transcripts are generated with whisperx, so they mistranscribe basically every username and technical term. They're OK but not great, advice appreciated.

Recording



01:56It's Lobsters Office Hours. This is Lobsters. pushcx Welcome to Lobsters office hours, drop questions in chat anytime!
And let's put the message in chat. Questions in chat anytime. So howdy, I'm Peter. And this is kind of open door time to drop in, chat about anything related to the site. Or just look over my shoulder as I maintain stuff. So I got my scratch file, my Vim, browser, my nine other browsers. All right. Yes. What do we got? We got a... I am badly behind on looking at issues. And then SQLite perf is something I wanted to pick up, especially if Thomas happens to be around for this stream. And then, is there anything funny or how to do these? Yeah.

03:10This one, this article on top was real interesting.

...18Maybe two months ago, we got a pull request to Lobsters that was conspicuously written by an LLM. And the submitter was pretty emphatic that it wasn't, which was a little unproductive. And Zig's rationale here in this article, let's grab this for the notes, was that they have this poker metaphor. I don't love the metaphor because it's confrontational. Poker is a zero-sum game or, you know, a negative-sum game if you're playing in a casino with a rake. And they're trying to get at this idea of hidden information, where they can't tell which contributions are going to lead towards good contributors. Because maintainer time is so limited, you really want to focus it on the places where you're going to help people become more regular and better contributors. And there's not really a point in spending your time on an LLM-authored pull request under that model, because the LLM is not going to learn anything. I mean, maybe, you know, in a vague sense, the next model 12 months from now might be 0.0001% better because of that. But no, not in the way that if you put some time into talking to a human, they will both learn something and be more likely to contribute back at like the 20 or 40% level measured over the next week or two. And with the AI pull request we got, that's pretty much what I said, was like, there's a bunch of stuff wrong with this. If I put time into reviewing something, I'm likely to get a good response out of that. Let's go find that pull request. But like, there's not really any point. And especially with the person fibbing about authorship, it's kind of obvious that if I spend a bunch of time reviewing, they're just going to paste my comments into Claude and have Claude deal with it. This is still going? Man, GitHub. I would be more sympathetic if they hadn't done it to themselves. Where was it? Was it not even on the first page anymore? Yeah, tag combination filters. This is the one I'm thinking of.

07:12Yeah: when I review code, even bad code, I know I'm helping somebody learn the code base in Rails. It's not going to happen: I'd put a ton of time into a critique that would just be copied and pasted into a coding agent. Looking at this PR feels like an invitation to waste a lot of time. And Zig, who are a much bigger and more active project, with many more maintainers and contributors and lines of code, on every dimension they're bigger. I don't know, very similar logic in that post. So real interesting reading. Let's jump back to pulls, if we even have a list. It seems reasonable. Oh, you know, speaking of that, one thing I had mentioned in there, and just seeing it, was I said that I asked which coding agent they used because over the last few days I saw three real obvious LLM agent pull requests. And that was the first time in four months. Things were pretty quiet after this. And then this last week, I think I've had five pull requests that are slop. Let's take a look.

08:50Yeah, so this one, graefchen Heya limesHi
says it is by Nenukclaw, which is an OpenClaw instance. So that's particularly obnoxious. These two by Charlotte NM. Oh, hey, Grave. How's it going? These two were kind of... pushcx https://github.com/lobsters/lob…
They have no text on them. I'm not going to bother going through each of these. Here, I'll share this link in case anybody wants to follow links. But they were conspicuously not human code. And the account, is it even still open? It's a brand new account, like it was created some number of minutes or hours before opening a pull request against Lobsters, which is just... yeah, it doesn't have their... when was this account created? Maybe it was created and idled for a while. But like, this is conspicuously someone who is... what's the term I want here? Shoot. There is a proper term in trust and safety, but it's somebody setting up a synthetic account, and then they're trying to generate a backlog of plausible-looking activity. And so like this account that is presumably, where was that? Oh yeah, there we go. So we can say gh api users. Got this off screen because I ran it for some personal stuff. But anyways, this account is kind of conspicuously getting set up to look like a real healthy account. Yeah. So they created the account a couple of years ago. It has zero activity. And then at three years and two months old, all of a sudden it clones the Lobsters repo and opens a couple of pull requests in, I don't know, not much time. What was the time between these two things? 7:45, 7:50, yeah. You didn't implement that in five minutes. That's not plausible. What were the other ones? This one I'm not sure about, actually. They just sort of wanted to turn the site into a single page app. So like, I'm not even sure I would call it broken. So if you're on a story and it says like this story has 10 comments and then you click reply and you leave a comment, it still says 10 comments, graefchen Also 5 Minutes feels very automated to me. For some reasons. limesO
which is fine, the number is correct from the page load. And there are a couple of instances of these numbers, and they didn't get them all. And it just, it's not useful, and they didn't... the description of this, though, was... I mean, it's a weird approach. I don't know. This one actually like has a plausible profile and stuff. These two, yeah, here we go. All right, so this one and this one are absolutely LLM. So in the last week there have been six. Yeah, so I was saying five or six. And I think three of these are the tag filter thing and two of them are that dependent destroy. So what's happening is they are pulling off of this first page of issues. They're just literally like, what's here on the first page, and they're either pulling the first thing that says good first issue or they're like skipping anything that requires some thought, like feature requests or planning. This one about a parser: one of the bots, I think it was Charlotte NM, screwed up in their pull request and gave this issue's number while actually trying to fix a different issue. So the fact that they aren't consistently pulling the first or the most recently updated or the newest good first issue tells me they're loading /issues in some kind of harness, whether that's an OpenClaw instance, NanoClaw, one of those, or some custom harness that is instructed to find PRs and throw slop at them. I don't know. It also says that there is not a human in the loop on these.
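
The timing argument here is plain arithmetic on when the two pull requests were opened. A minimal sketch in Python, assuming ISO-style timestamps of the kind the GitHub API returns. Only the 7:45 and 7:50 clock times come from the stream; the date is made up, and the 30-minute threshold is an arbitrary illustration rather than a rule anyone applies.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"


def minutes_between(first_opened: str, second_opened: str) -> float:
    """Minutes between two pull requests being opened."""
    a = datetime.strptime(first_opened, FMT)
    b = datetime.strptime(second_opened, FMT)
    return abs((b - a).total_seconds()) / 60


def implausibly_fast(first_opened: str, second_opened: str,
                     min_minutes: float = 30.0) -> bool:
    """True when the second PR followed too quickly to be hand-written.

    min_minutes is a hypothetical threshold chosen for this sketch.
    """
    return minutes_between(first_opened, second_opened) < min_minutes


# Hypothetical date; clock times as read out on stream.
gap = minutes_between("2000-01-01T07:45:00Z", "2000-01-01T07:50:00Z")
print(gap, implausibly_fast("2000-01-01T07:45:00Z", "2000-01-01T07:50:00Z"))
```

A short gap on its own proves nothing (someone can prepare two branches and push both), so it is one signal to weigh next to the empty descriptions and the idle account history.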

15:20graefchen Automation FTW /s limesYay
Let's see where we're at here.

...39Yeah.

18:42Let's look at inline block.

19:23This actually looks really good now. Yeah, my only real concern there was the use of anchor positioning. It doesn't... yeah, this comment didn't get edited, so those are out of date. I'd like to see this one, because we've gotten fiddly with these story things. So let's pull this branch down.

20:05I did a fetch just before I started. Did that give me a tag automatically? This is the one place in Jujutsu where I'm still a little rusty. I think I have, yeah, I have his repo in here. New on top of 72 caches dropdown. Yeah, is it... ah, it wasn't so far off.

21:11Let's take a look.

...23So we can have one open. That's interesting. How do you manage that? There are details and summary. Position relative. Is there JavaScript I didn't see?

...57Is it the name? You can only have one with a given name open at the same time. There's something clever happening, and I'm not clever enough to get it.

22:32Aha, yes, the name. This enables multiple details elements to be connected with only one open at a time. That's lovely. I keep learning stuff from these.

...51Where's the single D2, list D2?

23:57Let's get this merged. Let's get this in the notes.

24:18That really does look nice. All right.

...28And we have another one from Federico. I think I saw this one a week ago. Yes.

25:25Story is defined: we're on a details or the list page, so create all the hide buttons in the context of the story. Sure. Story is not defined: all right, all the hide buttons in the document.

26:02What is it? OK, so he thinks of this single story mode as the details page. Sure. So we create all hide buttons in the context of the story. OK, so we also create a Heidler. The story is not defined Heidler. I got to reread this, both to see the variable name, but then also.

...37Let's get that.

27:05He just entirely deleted the submitter element. Why does that work? Oh, because he's handling it down here as the submit button. Oh, and he added a comment. If no.story action was triggered by hide alert,

...37Okay. I get what's happening here. Yeah. The one thing to do is grab the submit buttons.

...51And then... Each of them... Okay. This is all very reasonable. I get it.

28:12We may get cat cam on this stream. He's come around, and now that it's warmed up here in Chicago, he's going back up on his supervising perch. We'll see if he goes up in a minute. Fingers crossed. This is great.

29:15All right. Nice to start off getting two pull requests merged, especially that have hung out for a second. That's great.

...38So let's see something happened here because this got bumped. I didn't manage to make time that last weekend for it. Skip non search paths. Yeah, we talked about this. And then.

31:05Schedule with Hunter. So last stream, last, what was that, Tuesday, I merged the

...27or no, I fixed up the full page caching for RSS and JSON and that's had a significant effect. I spent a minute looking at prod and we don't have like health heartbeat analytics, but we're back down to normal page loads all the time. I've occasionally seen some slow stuff, but we're doing a heck of a lot better. And jump over to logs. So if we tail the action log, the real important thing is we don't see nearly as many JSON and RSS feeds getting served. You see stuff like, you know, someone checking this odd combination.

32:30But it's not the like every other hit that we were seeing before.

...39These are almost certainly logged in users with their token. How can I throw them away? If I filter down to just grep for username colon... it's nobody in the log.

33:07Yeah, so it's possible that we see some dog piling where we're regenerating these caches multiple times. I wonder if these folks have cookies or if just this base level one is broken because we shouldn't be seeing these happening over and over like this. Maybe you see two together because two requests arrived. A second request arrived while the first one was still processing and filling the cache. That's dogpiling. But that really only takes 100 milliseconds-ish under normal circumstances. Maybe 150 is normal. And then when we're running slow, maybe it takes 300. That's a little bit slow and not absolutely wrecked. And then we definitely shouldn't see a bunch of RSS hits a few seconds later.

34:28Hmm.

...36So either it's not working for this topmost level. Let's just watch this for a second and let's actually save this. We will grep for RSS and then let's grab the timestamp.

35:06Yeah, so that first one looked like dogpiling. See, I was thinking maybe what's happening is I happened to sample right as the cache got cleared. So we could dogpile in front of clearing the cache, and then it's just on a cron job that runs every 60 seconds or something. It's not a cron job. It's the scheduler now, but same deal. And so then we would see a dog pile immediately after the cache gets cleared. But seeing these in bursts of two and three a couple of seconds apart, that says we're not using the cache.
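
The reasoning about gaps between hits can be written down as a small classifier. This is a sketch under stated assumptions, not anything that runs on the site: `hit_times` is a hypothetical list of seconds at which requests for one path reached the app, the 0.3 second fill window is taken from the "maybe it takes 300" estimate, and the 180 second expiry is the figure guessed at on stream.

```python
from typing import List


def classify_gaps(hit_times: List[float], ttl_seconds: float,
                  fill_seconds: float = 0.3) -> List[str]:
    """Label the gap between each pair of consecutive app hits."""
    labels = []
    for prev, cur in zip(hit_times, hit_times[1:]):
        gap = cur - prev
        if gap <= fill_seconds:
            # Second request arrived while the first was still filling.
            labels.append("dogpile")
        elif gap >= ttl_seconds:
            # Cache was cleared in between, so a refill is expected.
            labels.append("expected")
        else:
            # Could straddle a cache clear once; repeated, it means no cache.
            labels.append("suspicious")
    return labels


def cache_seems_bypassed(hit_times: List[float], ttl_seconds: float) -> bool:
    """One odd gap can be sampling luck; several in a row cannot."""
    return classify_gaps(hit_times, ttl_seconds).count("suspicious") >= 2


observed = [0.0, 0.1, 2.5, 5.0, 200.0]      # made-up sample
print(classify_gaps(observed, ttl_seconds=180))
print(cache_seems_bypassed(observed, ttl_seconds=180))
```

Requiring two or more suspicious gaps mirrors the point made on stream: a single burst right after the cache is cleared is normal, but bursts a couple of seconds apart, over and over, are not.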

36:11OK, so we've got a rss.rss. Let's look at that caddy config.

...39This doesn't attempt to use the.

...54This doesn't attempt to use the MIME type as the extension. Keeping these two things in sync, what file name are we generating and then what file name are we checking, is

37:15Not great. So we are successfully serving newest.rss and comments because the URL looks like the file name. So if I search for comments.rss, get back to log. This is going to be dead quiet because the cache is working. If I leave this open for, what is the cache set to, 180 seconds? If I leave this open, we might see a little dog pile of two or three hits in the same second, but then we won't see anything for three minutes until the cache gets cleared.

38:05The other way we could know this is by inspecting the caddy hit log. But I don't know that it logs whether it used the cache or not or used the, honestly, whether it used one of these variables or which rewrite triggered. That feels like a missed opportunity because I'm debugging that sort of thing fairly regularly.

...55Yeah, and all the JSON ones... Our URLs do end in .json. And people... Yeah.

39:21So really what I want here, see, it's tempting to add a like dot RSS at the end, but then we might start serving RSS in response to requests for story pages if the RSS cache gets filled before. Yeah, that's ugly. I think I just want to special case this one route.

40:16Yeah, we're hitting the expressive limits of caddy. I think maybe I can say or here. Sorry, it's on. Use a matcher.

...50So what I want to say is something like, if you request RSS, then we will try files RSS dot RSS. But I want this or this, right?

41:25If we have multiple file blocks, do they get ORed together? I think so.

...40Because my general intuition for these matchers is if I write different matchers, they get ANDed together. But if I repeat a matcher, especially with headers, they get ORed together. Try policy as first exist. Split. Let's try files with a short line for it.

42:15Empty file matcher with no files. We'll see if the requested file. That's not what we have.

...43Yeah, it kind of gets me on, you kind of see there's a little bit of cruft in Caddyfile where there's these statements and then CEL expressions that are just a little bit more expressive. So there's always two ways to say things instead of one.

43:15Although I get why their cursor is like that. So if I said this.

...55OK, so if I do this, how can I test? Also, I'm in the wrong place. I should be making a, yeah. Let's fetch the remote and get on top of main.

44:47You see the header thing. This is the behavior I was trying to describe. Different header fields within the same set are ANDed. Multiple values for field are ORed. Are multiple file directives a bug or are they ORed?
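
The rule being read out (different header fields in one set are ANDed, multiple values for the same field are ORed) can be modeled in a few lines. This is a sketch of the combination logic only, assuming exact string comparison; Caddy's real header matcher has more features than this, and the header names used in the example are hypothetical.

```python
from typing import Dict, List


def header_set_matches(request_headers: Dict[str, str],
                       wanted: Dict[str, List[str]]) -> bool:
    """AND across fields, OR across the values listed for one field."""
    # Header names are case-insensitive, so compare them lowercased.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return all(
        any(lowered.get(field.lower()) == value for value in values)  # OR
        for field, values in wanted.items()                           # AND
    )


# Hypothetical matcher: Accept is one of two values AND X-Demo is "1".
wanted = {
    "Accept": ["application/rss+xml", "application/json"],
    "X-Demo": ["1"],
}
print(header_set_matches({"accept": "application/json", "x-demo": "1"}, wanted))
print(header_set_matches({"accept": "application/json"}, wanted))
```

Whether repeated `file` blocks follow the same OR rule is exactly the open question at this point in the stream; the sketch does not answer it.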

45:24So there is a minor inconvenience here with the way Hatchbox handles its config files that makes this hard to test. So it's our automation that smooshes together these Caddyfiles.

...45If I try to deploy I also have to push to GitHub. So if this doesn't work as an approach, then I have this broken commit hanging out.

47:19Look, I know I could build this up, right? I could have, instead of visitor, I could say visitor could just be these first two, right? It's got the host. It's got the no cookie, which means you're not logged in. And then this could be its own matcher. And then this could be its own matcher. And then I could say rewrite, CEL, at visitor, first one or second one. It's just so dang wordy. And I think this is the default.

48:12If it wasn't such a busy route, I wouldn't be jumping through this hoop.

...26Yeah, so we're seeing these. This one's working fine because it's a couple seconds apart, or a couple, a minute apart each time. So I must have turned the cache expiration back down to, well, 60 seconds is roughly what you see here. So like that one's fine.

...57I think I could just copy that into place, right?

49:27Excuse me, little sneeze break there.

50:02anotherjamlover12 would you mind speaking a little louder? I'm not sure why I cannot hear
I can copy this into place and hit the update Caddy button on Hatchbox manually and then commit if it works. Yeah, I've done that before. I was just trying to remember this process. Hey, boss. Yeah, I'm seeing this thing because of you. Oh, sure. Hey, anotherjamlover. Thank you for telling me. And I think I can just... All right. This boost... I am actually pretty quiet right now. So all right, well, thanks for letting me know, and yes, I will try and project my voice a little bit more. Okay. So let's copy this up.

51:15If I replace the production one, that's fine. I'll just run the deploy script, which I want to do this stream anyways, because I merged fedamps' two pull requests. So yeah, it's fine if I overwrite. So we'll say lobsters current Hatchbox. And then off screen, I'm pulling up the Hatchbox control panel, and I got to do this off screen because one or two of these screens is pretty relaxed about showing the API keys on screen. All right. So yeah, so if I hit update Caddy, that'll kick off a Caddy job. All right, light mode warning here. I'm going to bring that up and let's see if we get in the output... So either this will succeed and I can commit, or it'll fail and then I will probably have to break out the different matcher variables like I was saying. Let's look at the logs. Honestly, this is probably a failure if it took more than a second or two. I think it retries. Yeah. All right, so hold on. Can I stop the job? No, I cannot. I can grab this config file so it doesn't get cleaned up.

53:09So that I can test manually. Good. All right, unrecognized subdirective path. So that's... I was trying to match this. Oh, you know, I should have... should have set a root. And then.

...46Why didn't Path work? Because Path is its own matcher, it's not part of a file block.

54:03But I really want to say only run try files if the path is RSS. OK, let's go look at try files again. So I can give it a root, which I did.

...49remaining after splitting the path. We have the path as a file that exists.

55:29So really what I want to say is... No, I really do want to split on the path, not just the last segment of it, but the whole thing, because otherwise... So we have other RSS feeds. So if you load slash newest... It is totally possible that the cache could be filled where newest.rss exists, newest.html doesn't yet exist. And so if I just add RSS to the end of this, we will end up serving RSS to HTML requesters. So if I took this off, well, this would just serve anything if that cache is full. That's much worse.
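
The hazard described here is easier to see with the lookup written out. A minimal sketch, assuming the cache is just a set of file names: `newest.html`, `newest.rss` and `rss.rss` are the names mentioned on stream, while the function names and the exact probe order are hypothetical and are not the site's real config.

```python
from typing import List, Optional, Set


def first_existing(candidates: List[str], cached: Set[str]) -> Optional[str]:
    """Return the first candidate file name present in the cache."""
    for name in candidates:
        if name in cached:
            return name
    return None


def naive_lookup(path: str, cached: Set[str]) -> Optional[str]:
    """Tempting fix: also try '<path>.rss' for every request."""
    base = path.lstrip("/")
    return first_existing([base, base + ".html", base + ".rss"], cached)


def special_cased_lookup(path: str, cached: Set[str]) -> Optional[str]:
    """Only the /rss route is allowed to fall back to rss.rss."""
    base = path.lstrip("/")
    candidates = [base, base + ".html"]
    if path == "/rss":
        candidates.append("rss.rss")
    return first_existing(candidates, cached)


# The RSS caches are filled, the HTML page for /newest is not yet.
cached = {"newest.rss", "rss.rss"}
print(naive_lookup("/newest", cached))          # RSS served to an HTML request
print(special_cased_lookup("/newest", cached))  # miss, falls through to the app
print(special_cased_lookup("/rss", cached))
```

The special case costs an extra rule, but it keeps a cache miss on an HTML page a plain miss instead of a wrong content type.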

56:37Empty file. File match dot relative. The root relative path of the file. Absolute path including the root.

57:34I don't want to use, I mean, I think I do want to use a CEL expression because then I would be able to say path equals equals RSS and try files, blah, blah. But if I do that, I get another matcher token. Is that what they call these variables? And then I would have to update the rewrite line as well, which I'm trying to minimize how much stuff I'm updating.

58:21To match directories, glob patterns. Switch it as a fallback.

...52it's a shame hunter's not on he knows caddy files so well get back here on this

59:50Make a path regex. But yeah, then I gotta go update the

01:00:16So if I have a directive, I can have a rewrite.

...34And it takes an optional matcher.

...45So I can either have a named matcher. How do I combine named matchers into one? Path matchers.

01:01:08That's exact. So this would get me, I could say, rewrite

...17slash rss to rss.rss. No, I would still need try files. I still would need the some kind of way of combining the named matchers.

...52anotherjamlover12 Why did you choose Caddy over Nginx?
Man, I wish this just looked like an imperative programming language or, you know, looked more like a...

01:02:04I chose Caddy over nginx. Well, its config file is more expressive than nginx, which... pushcx https://hatchbox.io/
was causing us pain with our previous setup, but mostly it's picked because we're using Hatchbox for provisioning and deployment. This is a provisioning and deployment tool that costs, I think, 10 bucks a month for each host. And it works, like, it's just a Rails setup that actually works, which we were struggling to maintain on our own with Ansible. And the reason we maintain our own is this caching. If we didn't have the ability to customize this caching and customize these matchers for dealing with really badly behaved spam bots, we would have to put a CDN and a WAF in front of this. And we may end up with that anyways, but yeah. Multiple matchers of the same type can be merged using Boolean or, as described in their respective sections. Yeah, so I have to write an expression matcher.

01:03:51And then I can do and an or. Yeah, all right. So you're a visitor. And then we will say cached file exists or RSS cached file exists.

01:04:35So then we will say use cache is an expression where you are a visitor and cached file exists or RSS cached file exists. I always wonder about TOCTOU with these. I try not to think too hard about it, because we have enough hassles with the cache; I don't need to go looking for problems. But there is some difference between the time of check, plus this time of check, and then down into the time of use, milesfoob Oh cool! I used Caddy a few months ago to add an API key check to an ollama server I wanted to expose. Hadn't heard of Caddy before that.
where it actually tries to serve the file back. And especially because it is an external process that deletes these cache files, I wonder. Now, hey, Miles used Caddy to add an API key check to an ollama server. Did you just, like, use this header directive to check that somebody had a cookie, something like that? That kind of milesfoob yeah I think so, I'd have to log in to see, but it was super simple
fixed shared secret auth is real straightforward to tack on.
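
The time-of-check versus time-of-use worry raised a moment earlier is a general file-handling pattern, and it can be shown outside of Caddy. A minimal sketch in Python, with no claim about how Caddy itself opens files: the first function checks and then opens, so an external cleaner can delete the file in between; the second just tries to open and treats a missing file as a cache miss. The function names and the throwaway `rss.rss` file are for illustration only.

```python
import os
import tempfile
from typing import Optional


def serve_checked(path: str) -> Optional[bytes]:
    """Check, then use: the file can vanish between the two steps."""
    if os.path.exists(path):              # time of check
        with open(path, "rb") as f:       # time of use, may raise if deleted
            return f.read()
    return None


def serve_atomic(path: str) -> Optional[bytes]:
    """Try to open and treat a missing file as an ordinary cache miss."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None


# Demo with a throwaway file standing in for a cache entry.
cache_dir = tempfile.mkdtemp()
entry = os.path.join(cache_dir, "rss.rss")
with open(entry, "wb") as f:
    f.write(b"<rss/>")

hit = serve_atomic(entry)
os.remove(entry)                          # the external cleaner runs
miss = serve_atomic(entry)
print(hit, miss)
```

The window in `serve_checked` is tiny, which is why the problem rarely shows up, but on a busy route with a cleaner running every minute it is not zero.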

01:06:18dlamz I'm looking to use it to put mTLS in front of one of my services. it's just an easy config
Cool. Yeah, some of this stuff with Caddy is just incredible. And I'm aware that it has these limitations because it's compiling these to some internal representation that's very highly performant in a way. You know, in a general purpose programming language, I could screw something up and have a Caddyfile that takes 10,000 milliseconds to make a decision on how to serve a request. That would be brutal, but very easy to do in an imperative language.

...54TLS. Oh, yeah. For that kind of TLS termination, that's very straightforward. I mean, it's... It's like this. No wait, so there's the redirect. And then we have the caches and stuff. Oh, it won't be here. So the way that our config gets compiled with the Hatchbox config is... It's a little MacGyvered. So I don't have that on the screen right here. It's on the server, but... Yeah, doing a reverse proxy is pretty dang simple.

01:07:49So here, this is the one that can't say path. It would have to say path here.

01:08:01I think that looks valid. Let's copy that up and try again.

...10And now I have some kind of typo. So that's worse. Doesn't like... okay. So wait, let's go through these line by line. So it's using this config. This HTTP auto HTTPS: is only on HTTPS, but no connection policies. That's a weird warning. I wonder why we didn't get that last time. Maybe because it was preempted by this error. Nabile saw the typos
Nabile cahed_files
Nabile (hi)
dlamz :(
Since I didn't touch anything about HTTP, yeah, dlamz, I think you cursed us by mentioning TLS termination. Did I type cache files wrong?

01:09:03anotherjamlover12 oh no...
Aha, yeah, you got it. So then let's make sure those are correct. Nabile singular too ;)
Cached files exist, singular or plural. Say singular. All right, so now that matches. Now that matches. There we go. Thanks, Nabile. I actually fixed one of my bugs before you could tell me I had it. Nabile you're welcome
Of course, I have the benefit of not having lag. All right, so then what's our new one? Is this just you want me to throw backticks around it?

...45So it's parsed as a single token. I think you do. Let's just throw backticks around it, see what we get.

01:10:03It's a little funny that their syntax is so fiddly, because it doesn't run in the request cycle. So I wouldn't think that it has the same kind of performance needs that leads to things being less expressive. What do we think it doesn't like?

...43Omitting the matcher name. So rather than say, maybe this at is only for declaring one. And what I want to say is curly braces around the named matchers I created. And I would love to see an example. That's not what I want.

01:11:30The matcher definitions, right? I'm not defining one. I'm just reusing it later. Yeah. But when you reuse one here, you put an at in front of it, you don't wrap it in braces, because it's not a matcher.

...57Let's just try it, see where it gets. So this one then wants to be, I feel like this is not going to be it. I hope it's not just mad about the parentheses. But I think CEL permits matched parentheses. I didn't copy. Valid configuration. OK. So I'm going to take a quick bathroom break, and then I'm going to hit the Deploy button again, because I don't want any distractions while I'm trying to put this up. So where's this? I'm going to step away for two minutes, and then we'll deploy.

01:14:37milesfoob I feel weird bouncing during a potty break but have to leave, peace all!
All righty. Let's turn off the break sound.

...45Oh, well, too bad. Miles had to take off. All right. So I'm going to just deploy right here, which I think is just no command. It's just you run it. Ready, run.

01:15:17Reload.

...25OK. Now I watch the logs for RSS. And I didn't just, you know, knock prod offline, right? Oh, no. Absolutely knocked prod offline. Great. What'd I break? Well, first things first. Let's get... Is there any good way to get logs if I bounce the server? Probably not.

01:16:03I gotta pull this...

...15Yeah.

...23This log is not useful. All right. Let's just lobsters deploy to stand prod back up, and then I will think more.

...38Because this wasn't committed, we're overriding it right now.

...58Would you like to come back up?

01:17:26So now.

...32Now we're still serving a 500. We definitely shouldn't be. Let me bounce caddy specifically. Nabile it's back up
There we go. All right.

01:18:02Yep. So what do we think happened here?

...13Something about this it didn't like. And I do wonder if this file match variable didn't get filled in because these things are now separated, or because there's two of them. Like, this is just globally referring to a file match rather than A or B. And so if A, like the home page, matches, that says use cache, but then it always runs B, which is going to replace that. I wondered a little about this.

01:19:29Nabile is there a way to trace this?
I don't know. I took a peek off stream at the Caddy log and it didn't have anything complaining or explaining about the 500. And I had to peek off screen because it's got PII, it's got user IP addresses in there, so I try not to put that up on stream, right? But there was nothing in there or in the journal, the systemd journal, about it.

01:20:13There was just the usual hits.

...25And so I'm kind of a little bit stuck on debugging this. I wish I had... it would have to be a pretty complete staging environment to allow me to fill the cache and then test that the correct thing got served out of it,

...55or a pretty complicated container setup that amounted to a staging environment in a box. That's not a small project.

01:21:15Let's duplicate these. Oh yeah, you know, you say for tracing, this comment here is me tracing stuff. It's setting headers, and then when I ran curl commands I could check: did the headers show up, did I see the right things.

...52All right, so let's try this. Let's bring this up here, and let's say we will use the cache, I think, by using CEL. So if I can put one matcher here, can I just say expression directly? Yeah, respond, and then that's an expression. Okay, so let's try. Let's say visitor, and let's say cached file exists, which also wants to be in curly braces, right? So now we're using this file match, and so the global file match data is what we expect. And then we'd basically duplicate that down here and say, if RSS cached file exists. Then I wouldn't have a separate use cache named matcher anymore.

01:23:44So there's a little duplication, which I don't love. But I can make my peace with. Let's try turning on that header just in case. And we will just say.

01:24:14Have it on all the time to see again because these things are shared. If I set the header down here, I'm only going to see the second one, right?

...37So let's say if you are a visitor. We will set x lobsters cache. Let's call it cache file exists.

...58Can be file match relative. And then down here, I lost my E. We'll set a different header of RSS cached file exists. I see. I'm kind of... what I feel like is happening is that there's

01:25:54there's kind of a disconnect between whether we're being declarative or imperative. And the way that different lines get combined makes the Caddyfile feel very declarative, but it's explicitly imperative, because there's stuff like, use the value in this global variable from the last file match attempted. All right, let's copy this up. Let's validate. You know what, we have the one here in tilde. All right, so it thinks that's a valid config.

01:26:55And I just overwrote the Caddyfile, so if I want to put it back fast, I have to run a deploy, which is the safe thing to do. dhruv2038 hiii
Even if it takes an extra 10 seconds, yeah, that's probably better than like manually shoving these files around, kicking the server off, because pretty quick I'm going to get confused on what's what. All right, let's get up my curl. Hey, dhruv. So we're up.

01:27:49That worked. We're throwing a 400 on the home page. That definitely didn't work. And we're not actually printing the value of the thing. We're probably only down for visitors. But let's redeploy.

01:28:19And let me kick Caddy to make sure that it's up to date, too. Nabile sorry, was looking for an equivalent to picom-inspect that shows rules matches... can you somehow use --debug/-v for verbose logging in the hope it tells you anything useful ?
And then yeah, you're not back up. Are you still deploying?

...45So love that.

...54Can I use, I don't know, are those switches for the caddy command line or for the running caddy service? Why did we not come back? Nabile caddy has a bunch of subcommands taking these
Let's kick caddy again, because it's kind of on its own cycle. OK, that's up. All right, we're back.

01:29:50Nabile sorry i'm a nginx guy
Wow, that was definitely us watching the cache fill. That slow page load. All right. We don't really have any debugging info besides it blew up.

01:30:10And rather than print the thing requested, it just, like, printed the type of these things.

...26file_match.relative. I swore I used that.

...39Upon matching, four new placeholders will be made available. The root relative path of the file. That is not what we got. Our thing printed.

...59We have some kind of debugging at the wrong layer of abstraction. You know what happened? Did we see the curly braces as well? It's like this got expanded to be the class of the thing, and then the whole thing got turned into a string. It's like in Python when you name a method on an object and then you have the method, rather than the result of the method. That's very much what this feels like. Nabile it doesn't interpolate the string or something ?
To this one back ticks around it didn't want to not have the curly braces.

01:32:04I mean, some kind of interpolation happened, because this is not the literal value, but it's like the interpolation happened at a different time in the cycle than I expected. Why?

01:33:02Header matcher field value. But then value is the header field value. That's great.

...30We're not doing a search and replace. We're not doing a defer. Yeah, so string. Show me one that uses a placeholder.

01:34:22Nabile wrap it in quotes and pray ?
wrap it in quotes and pray that's. that's a little tempting. yeah see well there's quotes but it's in the context of a replacement. header.

...52Yeah, so it's not typed for lack of a better term.

01:35:31Wish I knew what it called the curly brace syntax so I could see if there was documentation of that. Is that a concept?

...46It's not a block. It's not a directive. There's tokens, there's quotes. The caddy file is lexed into tokens before being parsed. In which case, that kind of implies that my substitution is happening at compile time or parse time, where, to avoid escaping quotes now, inside quoted tokens,

01:36:35Yeah, so Nabile, your idea of throwing it in quotes and praying actually seems pretty reasonable. Like, that might move the evaluation time from parse time to run time, effectively. Placeholders. Placeholders bounded by curly braces contain the identifier. Not all placeholders are available at all times. Some directives set placeholders. Some are globally available.

01:37:38Not all config fields support placeholders, but most do where you would expect it. So it works, except when it doesn't.

...52This is actually pretty good docs, but I have this one very fiddly little question that's not addressed. All right, so let's give this one more try. I'm going to copy the lobsters current hatchbox Caddyfile.pre to my home directory so that I can replace it a lot faster.

01:38:33See, I think I'm going to get the literal string.

...55Maybe backticks would be what I want. Let's try both. So we'll copy this up. Ooh, lost track of where I'm at here. Which terminal? All right. Let's validate.

01:39:26Let's have my terminal stall out. That's no bueno. Oh, there you are. That's so off-putting. This terminal that's SSH'd to prod just kind of stalled for a while.

01:40:11All right, so if this breaks, I can swap back in the Caddyfile.pre and bounce. Reload faster. That's throwing a 400 and not printing a useful thing. So let's copy Caddyfile.pre back into lobsters current hatchbox. And bounce. And we're back OK. Let's do it again, but with ticks.

01:41:25Nope. Same debugging. Let's put the file back and bounce and confirm. OK. chamlis_ am I seeing that the header setting only goes off `@visitor`, which doesn't do any file-matching?
So I'm pretty stymied on getting this to give me the debugging I want.

01:42:03Yeah, chamlis is correct. And I wanted it that way. I mean, honestly, I didn't need @visitor. It would have been fine to, you know, briefly serve this header nobody's looking for or cares about. But I knew that me running curl here was not gonna touch anything. Just, you know, it's not gonna have a cookie. So it will match @visitor, so I will see the debugging, such as it is. chamlis_ I guess I wouldn't expect it to have run that matcher and filled the placeholder unless you explicitly added that dependency
But I'm just getting this instead of what I expected, which would be the match.

...55Did I read that wrong?

01:43:04The root-relative path of the file. That's what I'm trying to debug: show me what path you think you're matching.

...18And I did expect this to say rss.rss, and instead I'm seeing this value. See, if this was a Boolean, I would be like, oh, maybe it's doing a truthiness thing where, when it's true, it prints the name of the placeholder, and when it's false, it's nil, right? You wouldn't expect it to have run that matcher and filled the placeholder unless you explicitly added that dependency.

01:44:03Added the dependency where?

...13So my expectation is this is like assigning a variable when I make a named matcher. chamlis_ like if you had two file matchers that did different things, how would it know which one to use the result of, unless you were actively using one of them to decide an action
And when I define a named matcher, I am basically assigning to the visitor variable. And I would expect these to run every single time. Like if you had two file matchers, that's what I have. How do I know which one to use the result of unless you were actively using one of them to decide an action? I mean I am here. chamlis_ but not for the header directive
This is that confusion I had a minute ago about is this file declarative or is it imperative, but not for the header directive.

01:45:09chamlis_ I'm suggesting that the match results go out of scope
Oh. So you're saying that it's like, yeah, yeah, I was just getting there, that it's like a function call. OK, so if I said cached file exists, then this is like a scope, and that file matcher is going to be replaced. OK, I see how you get there. Let's try it. Do you, chamlis, have a preference? I think without backticks, right? We didn't really see a difference between these things with backticks, quotes, and nothing. So I think I'm going to go back to nothing, because that's what I had here on line 75 before going down this rabbit hole. Nabile correct
If you have a preference, I'm happy to put your idea first here because this is your idea. Okay.

01:46:44chamlis_ in the caddy config I had for my blog I didn't use any quotes and that did work
Yeah, and then so if it's out of scope, it falls back to printing the name of the placeholder instead. I kind of get it. And you had... OK, then we'll go with this. Let's copy it, let's validate it. All right, fingers crossed, let's reload. Threw a 400, and we didn't print anything, and we didn't print anything, and we didn't print anything. All right, I'm gonna put it back, and then we can look at those in a second. Let's copy that up, let's reload, and let's make sure prod is still up. Yes. OK, so what do we think here? We didn't see any of our debugging headers, and we fell back to the 400s. So the 400s make sense because we haven't actually changed any logic. Oh, no, wait. We did get the headers here, here, and here. And they're all actually the correct values. Like this second one says, yeah, we matched rss.rss. Wait, this looks totally correct. I just, was it on a different line or something? I read right through it. And for the homepage, yeah, that is cached as index.html. So all of these values look correct. I don't know why we're throwing a 400.
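What the debugging headers showed can be pictured with a toy replacer. This is only a sketch of the behavior observed on stream, not Caddy's implementation: `expand` and `known` are hypothetical names, and the one assumption is that a placeholder nobody has filled is left in the output as literal text.

```python
import re

# Matches {name} with no nested braces.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")

def expand(template, known):
    """Replace {name} with known[name]; leave unknown placeholders untouched."""
    def substitute(match):
        name = match.group(1)
        return str(known[name]) if name in known else match.group(0)
    return PLACEHOLDER.sub(substitute, template)

template = "{file_match.relative}"

# A directive whose matcher never ran a file match: nothing filled the value.
unfilled = expand(template, {})

# A directive guarded by the file matcher: the match filled the value first.
filled = expand(template, {"file_match.relative": "rss.rss"})

print(unfilled)
print(filled)
```

Under that assumption, the first case prints the placeholder's own name in braces, which is the confusing value seen in the earlier debugging headers.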

01:48:45Nabile i also missed them lol, prolly them being lowercase
not yeah all of these things are like 403 so we're not accidentally tripping one of those like except for the fact that it's a 400 instead of a 200 this looks like exactly what i would expect out of it working

01:49:28So why did it throw a 400?

...50Let me scroll up and double check that we didn't miss those headers previously. Yeah, no, we didn't. It's just it was long enough and the curly braces made it stand out. I read right through it this time. Nabile are the files really on disk ?
So... Nabile, yes, they are. pushcx https://caddyserver.com/docs/ca…
Because try_files is what checks that they exist on disk. And I'll throw you the link there to the Caddy docs. I would control-F it for try_files. So...

01:50:53Here's a horror show question. Is it a 400 because two rewrites are firing? No, it shouldn't be possible, because this can't match rss.rss, and this one has the path. They can't be true at the same time, cached file and RSS cached file.

01:51:21So it shouldn't be possible for both of these rewrites to be true and attempt to rewrite. I'm scarred by years of Apache's mod rewrite here. So I may be bringing in an assumption there.

...57What do you say about rewrite?

01:52:30Mutually exclusive to other rewrite directives in the same block; only the first matching rewrite will be executed. A request matcher that matches a request before the rewrite might not match the same request after. Yeah, that's fine. So I read that as saying that if the first rewrite fires... Well, you know what? We can check pretty quick if they're interfering. Because if they are, me commenting this out would effectively be a no-op. Because all this does over the version I keep replacing with is break this into two variables and add the debugging.
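The documented rule read out above (within one block, only the first matching rewrite runs) can be pictured with a toy model. This is a hedged sketch, not Caddy's code: `first_matching_rewrite`, the rule list, and the `/cache/...` target paths are made up for illustration.

```python
def first_matching_rewrite(path, rules):
    """Return the target of the first rule whose predicate matches, else the path unchanged."""
    for matches, target in rules:
        if matches(path):
            return target
    return path

# Hypothetical rules standing in for the two rewrites discussed on stream,
# plus an overlapping rule to show that order decides which one wins.
rules = [
    (lambda p: p == "/rss", "/cache/rss.rss"),              # RSS cached file
    (lambda p: p == "/", "/cache/index.html"),              # home page cached file
    (lambda p: p.startswith("/rss"), "/cache/other.rss"),   # overlaps with the first rule
]

print(first_matching_rewrite("/rss", rules))
print(first_matching_rewrite("/about", rules))
```

In this model a request for `/rss` never reaches the third rule, so two rewrites cannot both fire on one request.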

01:53:36copy that up. Let's validate. Let's reload.

...51We've got a 400. Let me see the 400. It's just a generic error, right? No, it's blank. Even better. Let's put the file back. Let's confirm prod is back up. Yes. I closed the terminal because I got so much output spam, so I didn't look to see that this was in there, but I would expect it to be, and just fine. So something about this change separating these two is causing 400s. OK, there aren't any other changes, so maybe it doesn't like this rewrite.

01:55:19Because previously just said. Rewrite at visitor. It's this. Big difference between named matchers that have an app. And use a.

...40Should this say rewrite at visitor and cache file exists? Should it say that?

...52Maybe we had one error. What did I do? I jumped to the top, right? Yeah, OK. Maybe I had one error masking another.

01:56:27Unrecognized matcher name: @visitor @cache file. OK, so right there, you read that as one and you got mad about it.

...46This is some kind of issue with CEL then, maybe.

01:57:01Where do you want to explain expressions?

...28Most other request matchers can be used as functions.

...41Matcher name can be omitted if defining a named matcher that consists solely of a CEL expression. Must be quoted.

01:58:36Nabile oh CEL, the thing used in firebase rules ? huh
So that diff, it expected an at here. chamlis_ I think using named matchers in CEL expressions might not be allowed
What if we push the condition down? Oh man, I have so much scar tissue over writing Firebase rules. What if I said here expression @visitor, and then here I also said expression @visitor, and then here I was only saying rewrite @cached file exists. Which is not a great name. That one might want to be like use cached file. Because now it's two things, but that's fine for the moment.

01:59:34And now each only uses one.

...43Here, load a matcher. Oh, when you have an expression, it did say it had to be in backticks.

...54Actually, if it's a named matcher, can I just name it like that?

02:00:08Getting matcher module @visitor, module not registered. You know, if I can't use named matchers in CEL expressions, I might cry a little. What if I put visitor? Like I'm referring to it like this. See, it's weird to me that the @ sometimes means define and sometimes means dereference. Getting matcher module visitor, not registered.

...49I'm really trying to avoid copying and pasting this into both. I'm trying to learn something about caddy syntax that's not just you have to duplicate everything. Especially because I'm unhappy with this matcher that says no cookie. So like, I feel like I'm eventually going to update this and break the one that I don't remember to update.

02:01:30Let's try it with ticks.

...36Nabile yeah, perhaps there's some arbitrary limitation going on here
Dislikes. Let's try it with curlies. Since I don't have a theory here, this is just me cycling through all the variations I can think of on syntax. Provided module name expression, provision expression, CEL request matcher expects return type of bool, not google.protobuf.Any.

02:02:17What the hell?

...34chamlis_ looking at this issue I think you might have to duplicate https://github.com/caddyserver/…
Let me try one last fruitless thing and then I will check out your link. Yeah, of course it is fruitless.

...53It's not possible to combine named matchers. We tried implementing it, but we had to abandon it because it was unreasonably complicated and had many edge cases. Let's see the setup here.

02:03:19So they're saying they want to define one based on another, which is what I'm trying to do combining them. Request a change. Currently, we have to write that. Yep. Allows named matches to contain other named matches. So that's the direction I'm going right now.

...56What does embedded mean? Embedded in another matcher?

02:04:07So I can say matcher one, matcher two, three, four, not matcher two. Note that matcher three and four are equivalent. Right, this is what I want to do.

...32Yeah, it looks like it opened a can of worms. Just judging by my scroll bar. Pretty nice to craft arbitrary Boolean expressions. Yeah, except that that also doesn't work.

...56For named matchers.

02:05:03All right, so this is a dead end. Let's just go back a ways. So I have to find whatever the right way in CEL is to refer to two named matchers, because not this, and we thought it was this, but that just throws 400s.

02:06:04chamlis_ further down that issue they say CEL doesn't support that either
Do they? You can't use... It's literally the next comment. It was right off screen. No, named matchers are Caddyfile sugar. That's... So that's why I keep banging my head on this syntax: they don't think of it as its own thing. They think of it as just desugaring. They don't exist in the adapted JSON configs. They don't exist to be referenced by expression matchers. Is there a way to combine multiple named matchers? No, it's not. Just duplicate, use an expression matcher, not C, which now supports all other matchers as pseudo functions.

02:07:02Should be true, I'm already using expression matchers in the named ones, right? If there's somewhere to call our own custom named matchers, you need to implement the interface

...30I am definitely not writing a caddy module.

...48All right, so let's let's join the Department of Redundancy Department.

02:08:02get rid of visitor and instead of being cache file exists we'll call you use cached file let's look at this diff

...39So effectively, the only diff is this variable got renamed. And I'm so gun-shy that I want to see that work.

02:09:02Unrecognized matcher name @visitor. Did I still refer to it somewhere? Yes.

...21You're going to have to duplicate this too. Use cached file.

...33I guess we don't need to add it. Content security policy on an RSS feed is kind of meaningless because it's not HTML. All right, let's get this up.

02:10:06OK. Load. And we got a 200. I don't see the header.

...21Oh, there it is. Just hard to pick out. OK, so that's correct. And I should not see the header when I ask for the main RSS. It was after frame options and before permitted, so it's missing. OK. So let's add this back by uncommenting. Call this use RSS cached file. And let's duplicate this.

02:11:29And then this rewrite becomes use RSS cached file. So does this use RSS group.

...49And then I'm going to leave off the CSP because we don't need it.

02:12:13Hey, that got a 200. That got a 200. chamlis_ not sure if it was intentional but you dropped the rewrite for the non-rss cache
We deployed it, and the site's not down. Look at that.

...27chamlis_ yay!
Nabile grats
Let's go SSH over, go in the log.

...37And we're no longer seeing hits. Well, that's going to be a dog pile, actually. So if we get another 60 seconds and then we get another burst, then we have it cached. Yeah, it was a little fast. Why? Yeah, we are not caching correctly.

02:13:17So let's leave that open. I want to just see these both at the same time.

...45You see, I didn't see mine. Why did mine get served from the cache? And these others.

02:14:22I wonder if they have filters. I wonder if there's just that much traffic. OK, hang on. I'm going to. Tail action log off screen.

...56trying to grab the it's dot headers and then i'm trying to figure out in the log real quick how do I grep whether somebody has a cookie without actually starting to dump people's cookies to the stream.

02:15:50So I can grab the headers.

02:16:03Yeah, the cookies are just not actually getting logged, which is probably for the best. It stops me from logging really sensitive stuff that I almost never need, but now I actually need it so that I can filter it out. Well, I could

...35if ash action log and then let's grab for rss would they be in the parents apparently not so what i'm trying to get at is it's possible that these are bypassing that Nabile i got 1.4k of response headers, was it like that before ? w/ http2 i'm not sure if 'link' & 'feature-policy' is injected by the browser or sent by caddy
@visitor named matcher, right, the check-the-cookie one, because somebody's coming in with a cookie. It's not impossible that we just have this much bot traffic, that some of the bots are showing up with a cookie, although we shouldn't have sent them. You get 1.4 kilobytes of response headers? That's a lot of headers. I mean, to my eyeball this looks like a couple hundred bytes, but OK, I guess it's 1.4. There's just that much going on between the caching stuff. Huh, why didn't I see... oh no, I did see Lobsters CFE, so there's the cached file name.

02:19:03Realistically, people would not be hitting /rss with cookies set, because RSS readers don't send cookies, he said, generalizing. I'm sure there is one that does, but they really don't. That's why we had to do the whole thing with RSS tokens.

02:20:03How many hits do we have here?

...16So first hit is from a couple of days ago. So if I said, in the top 1,000 action log,

...38graefchen hean limesGiggle
many were RSS. And then, I didn't typo, what are we mad about? Oh, hey again, graefchen. Yes. So there's 220, and if I said tail, there's 38. So we got an order of magnitude difference. So I'm going to say that caching is improved.
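The before-and-after check above (count RSS hits in the first 1,000 log lines, then in the last 1,000) can be sketched like this. The function names and the sample log lines are hypothetical, and a plain substring match is an assumption about what the grep was doing.

```python
def count_matching(lines, needle="rss"):
    """Count lines containing the needle, like `grep -c`."""
    return sum(1 for line in lines if needle in line)

def compare_head_tail(lines, window=1000, needle="rss"):
    """Return (matches in the first `window` lines, matches in the last `window` lines)."""
    lines = list(lines)
    return count_matching(lines[:window], needle), count_matching(lines[-window:], needle)

# Made-up log lines; the real action log format is not shown in the stream.
log = ["GET /rss", "GET /rss", "GET /", "GET /newest", "GET /rss", "GET /"]
before, after = compare_head_tail(log, window=3)
print(before, after)
```

A falling count in the tail window means fewer RSS requests are reaching the app, which is what a working page cache should look like from the app's log.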

02:21:34So I guess let's commit this. That was a big struggle. marcoroth_ Hello World HeyGuys
I don't need to keep these debugging headers on. Wait, where's my first rewrite?

...59I deleted it. We have been not serving

02:22:10graefchen Hello marcoroth_ limesHi
Hey Marco, we have been not serving the index and other cached stuff.

...21So let's do it again. Let's copy that up. Let's get up the prod. Let's validate. All right, let's reload.

...44Lobsters, and we're getting a 200, and we're getting the CFE header. Great. OK, now I can commit. I just have to turn off the debugging.

02:23:45Nabile the weird headers are gone :p
Nabile back to 443 bytes
The weird headers, what was weird?

...55Oh. Nabile link & feature-policy (this one's deprecated too)
The. Link and feature policy. I don't know where feature policy would be coming from.

02:24:26Maybe. Oh, I bet they're coming from Rails. So you're getting smaller headers if you're hitting the file page cache because Caddy isn't setting as many. Which feels so dangerous to say that's probably fine, because some of those are going to be security headers.

02:25:02So, if I grab the home page that came out of the cache, it has very few, and we know it came out of the cache because we can see the debugging. If I add some nonsense just to break the cache, it got ignored. Oh, right, because we don't actually care about those params. This should go through to Rails, and that's when I get the Link header. Don't see Feature-Policy, but maybe I'm not actually using any. But yeah, I can see there's a different Vary header, Accept-Encoding versus Accept. It now has a Via header where it didn't before.

02:26:04This cache has an ETag, which is fine, yeah. This one we would prefer not be cached because it's a 404. Maybe it comes along and gets filled in later. Rails sets XSS stuff.

...27So I think that's the difference you're seeing there, Nabile, is whether Caddy is directly serving your request out of the cache or not. Nabile right
whether Rails is setting stuff. Yeah. And so like frame options, I would prefer to have that header set for everything. But to do that, I would have to migrate it from Rails to Caddy and like This one matters so that if we accidentally had an XSS violation, we would be setting the CSP header to protect people who loaded that cache. That's worth copying and maintaining. But like frame origin,

02:27:40is not as helpful because we know if it's coming out of the cache that the requester was not logged in they're a visitor not a user so like it's not great like i would prefer to have it because it would be safer long term it's just painful to maintain and shove it up to this
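Comparing the headers of a cache-served response against a Rails-served one by eye is error-prone, so here is a small sketch of doing it mechanically. `header_diff` is a hypothetical helper and the sample header values are invented; only the header names (Link, Via, Vary, ETag) come from the discussion above.

```python
def header_diff(cached, dynamic):
    """Compare two header dicts case-insensitively.

    Returns (names only in cached, names only in dynamic, names whose values differ).
    """
    cached_l = {name.lower(): value for name, value in cached.items()}
    dynamic_l = {name.lower(): value for name, value in dynamic.items()}
    only_cached = sorted(set(cached_l) - set(dynamic_l))
    only_dynamic = sorted(set(dynamic_l) - set(cached_l))
    shared = set(cached_l) & set(dynamic_l)
    changed = sorted(name for name in shared if cached_l[name] != dynamic_l[name])
    return only_cached, only_dynamic, changed

# Invented sample values, for illustration only.
from_cache = {"ETag": '"abc"', "Vary": "Accept-Encoding"}
from_rails = {"Vary": "Accept", "Link": "<https://example.com/>; rel=preload", "Via": "1.1 Caddy"}

only_cached, only_dynamic, changed = header_diff(from_cache, from_rails)
print(only_cached, only_dynamic, changed)
```

Any security header that shows up only in the dynamic list is a candidate for copying into the web server config so cached pages get it too.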

02:28:10And all of this is performance, right? Like if Rails was rendering pages in a millisecond instead of 150, I wouldn't bother with any of this caching.

...39So I guess QED rails can't scale, right?

02:29:26marcoroth_ Just rewrite it in Rust, this solves all problems Kappa
Yeah. Rust isn't going to be what I would pick, but rewriting it gets more attractive.

02:30:17graefchen Nah. Rewrite it in COBOL or Fortran. Like the cool kidz today. limesEZ
Well, you know, LLMs are so great at style transfer, porting from one language to another is the thing that they'll be best at, right?

02:31:20Let's find chamlis's link. Oh, wait, I still have it here.

02:32:03bsandro ehlo \o/
Well, you know... Hey, bsandro. graefchen heya bsandro limesHi
All right, so... Yeah, this one, I'm going to be writing a long-ass commit message because I don't want to have to relearn all of these six edge cases that have piled up here. So we have to special case RSS because...

...33The full page caches try files.

...47davidofterra I asked an LLM to port some Turbo Pascal and real mode assembly to Go on a whim. It worked surprisingly well, although there was some duplicate code.
Because the cached file is saved as rss.rss. Huh. That's a new one.

02:33:13Nabile you may want to move the commit message into the caddyfile itself
path.rss because that would return a request for slash s slash abc123 are, oh, let's say, newest into newest.rss. If the RSS cache is filled first, let's put the predicate first.

02:34:05I might, it's,

...20yeah i'm trying to think about like when I find this again in a while, will I understand why I did any of this.

...30you're right this section should go in the.

02:35:40I'm going to call it a placeholder.

02:36:21Nabile yup
Yeah, this part goes first. Value for the file matcher relative placeholder seems to be filled by executing the matcher and is only present

02:38:06Yeah, that's better.

...21Let's take a peek at that. Make sure that big-ass comment is there. It is. Man, that was a lot harder than I expected. Well, I appreciate the kibitzing, Nabile and other folks. Thank you. And... Oh, good. Nabile glad to be of any help
More stuff. Let's get this.

02:39:09Sounds like we all went on a learning journey together. All right. I'm gonna have to handle this right now.

...30Come here.

...43Nabile it's a good way to put this :p
This shouldn't affect us. So I saw a Bluesky post that a bunch of folks on their way to RubyKaigi were finding a bunch of interesting vulnerabilities. I think it's going to be a couple interesting days for dependencies. This one doesn't affect us because we don't use def method, def module, or def class. Certainly not from views. Yeah, we don't use them at all.

02:40:36marcoroth_ the shipped a new Ruby version for this CVE specifcially
What's the over-under on my edit being totally invalid and the caddy file not breaking right now when I do the proper deploy? Oh boy, that's exciting.

02:41:00I wonder if it'll be... Well, you know, we don't use def module method or class at all, so I'm not too worried about it. Jamamp_ "oh boy, that's exciting" is scary to hear when deploying to or playing around in prod loadErp
Although this does look a little rough. Is this an RCE?

...25Yeah, JamAmp, one of my friends say like, how did he put it? Ah, yeah, he used to say in high school that the scariest thing he ever heard me say in the computer lab or at a LAN party was, -oh, because he was like, if you say that, something has gone terribly wrong. What is a code execution sink?

02:42:05marcoroth_ it doesn't seem too bad of an issue, especially not that would warrant a new Ruby release
OK, so you can call something. All right, did we just break prod? No, look at that. Well, it's totally possible that it's worse. You know, maybe they feared that you could get arbitrary code execution instead of just... Oh yeah, that looks a lot like, closes the method definition earlier. Yeah, that sounds a lot like an RCE.

...49All right, so that's getting served. That's getting served. I mean, nobody hits that except me, so that's getting served. Everybody's happy. Oh, why did I not comment that one out?

02:43:13chamlis_ phew
I didn't hit the update caddy button, so the debugging is still on.

...21All right, so that update ran off screen there. And we are still, there we go. And we don't still have the debugging header, good. OK, now everything is where we expect it to be. So what's happening there is, speaking of comments in our Caddyfile, even though most of our Caddy config lives in this one file, it is not part of the Hatchbox deployment lifecycle. This is a conversation I've had, or a feature request I've had, with Hatchbox folks a couple of times: I would really like to be able to shove all of our config files under Jujutsu, under version control. The trick is these expansions encode file server in default, are maintained by Hatchbox, and they want to maintain their config files in their repos, not in users'. And so that's why we have this hook to say, well, their config file has to include our config file. But their deployment process does not bounce Caddy by default, because they understandably go, well, I only did a Rails deploy, so I don't need to bounce Caddy unless I know that something changed in the Hatchbox Caddy config that necessitates it. So they don't expect us to be making any changes that would necessitate bouncing Caddy. I thought about... So there's other stuff we install, and we've hacked up the deployment process a little bit. I could bounce Caddy in here, and that would force the use of our new Caddy config, and I don't think I have a justified reason for not doing that.

02:46:09Right?

...22There's nothing here that mentions Caddy. Fix the cache. Oh, look. Thanks, chamlis. Yeah, you've been fixing stuff on that for a while.

...37Yeah, that's the only mention of

...52What did I do here? Oh, yeah. This was... I did try and make it so that people who were deploying the Lobsters app to start their own sister site would have to do less search and replace. But Hatchbox was basically like, yeah, we're not going to expose a whole bunch of... environment variables and such for you to do that. And so I went, OK, too bad for sister site owners, you know, all one of them.

02:47:48chamlis_ caddy is marginally less annoying to configure than apache, but equally annoying to try and debug
Honestly, the. The error messages out of caddy are a lot better. And the fact that you can run that validate command is a lot better. The caddy docs are also better than the Apache docs. Man, I spent so much time with a tab open to like the four different mod rewrite docs, which is exactly what I would have had up were we doing this on Apache. It's one of those where Caddy has the benefit of 20, 25 years of experience before it got started. So they just were able to pick better everything.

02:48:56Oh, but you know what, that's why. So if I do this, the caddy file has to exist on disk. So in the in the hatch box admin, on the settings panel, there's a it's what I say here, like that there's a text area should be one word on their panel. And if you kick off a caddy update, they take the text that's in the box, which is this, and then they do this templating. grayhatter_ hey @pushcx hope your thurs is going well
And then they copy that output file to a slash tmp file on the box. And we saw that that was slash tmp slash caddy file. Then there was a timestamp on there.

02:49:53And then after they do that, They bounce caddy and they delete that file.

02:50:03And the generated caddy config is not a persistent file on the server.

...13So it's not present for me to do that. Yeah, that's just too bad.

...33Let's clean up that Caddyfile and Caddyfile.pre that I left in the home dir there. Let's look at top for a second. Yeah, see, now it's been like an hour and a half, two hours since I started dicking with the caching stuff, but caching /rss, which is a very popular route, improving the hit rate on it is probably what took 0.7 off of our load average. And grayhatter, yeah, it's going pretty well. Even if I'm fixing stuff that is not well instrumented enough. Right? Because I put on my senior developer hat, and the secret difference between being an intermediate developer and a senior developer is that at the end of every project or every fix, as a senior developer, you go, how could I have prevented this bug in the first place? And this is a series of bugs where the caching system is not reliable. And so it's like, OK, well, I guess we could spin up a staging environment in test. That's a lot of complexity. But it's the integration here that got broken, where Rails is writing one file name and then not writing that one file name. That's the monkey patch. And then the Hatchbox config has to match it. Ah, fuck, we can't even set up the staging thing, because we have to interpolate the Hatchbox config file that we can't see to configure Caddy and then test that combination. Oh, that's interesting. grayhatter_ I don't think I agree with that as stated, I don't think that's a sr eng trait... you see it more in eng who are very good, and often very experienced, but I see many more eng who I would consider sr, who don't do this... I think it's a competency thing, not a experience thing
Ah, the background scripts are running. So we still have a couple of things that haven't been moved to jobs. like mail new activity and mastodon. So that's why load is jumping up for this two seconds.

02:53:12chamlis_ would you consider running prometheus or something to monitor cache hit-rates?
That's totally possible. I appreciate the implied compliment. Prometheus is one of those

...28observability tools right that's the right word. I think we've discussed it before, but it's been like a while. yeah.

...49I think if.

...57So. We have rounds-to-no observability tooling set up. Like, if I log in and I run htop and I see that the load is high, we know that the load is high. Otherwise, we don't know. And so we've even had things recently, especially with the scrapers, where somebody in IRC complains that the site is sluggish and I see that message. Well, typically hours later, but I have seen it as little as 10 minutes later, and looked, and there's no server load and things are responding instantly. And then we have no historical observability stuff to measure load average, hit rate, any of that kind of stuff.

02:54:58grayhatter_ that's actually why I'm here, I wanted to ask how the bots have been lately
So I guess as much as I've thought about it, the decision there is reducing operational complexity because more moving parts is more stuff that can break. Does Prometheus or another observability tool fix more than it breaks?

02:55:29And I lean towards no. Nabile dump date + uptime to a file and graph it ? :p
There's also the like cost of running the servers, although that's fairly low, but like, you know, so like one or two $5 a month things, I don't care too much about it. grayhatter_ tail -n 50 -F srctree.gr.ht.log | grep --line-buffered -E 'Chrome/13[2-7]\.0\.0.*' | sed -u -n -r 's/(.*) - - .*/\1/p' | xargs -t -I "[]" nft add element inet filter abuse-http '{ [] timeout 90d comment "annoying botnet" }'
A million, it's very expensive.
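Nabile's suggestion of dumping a timestamp plus the load average to a file can be sketched in a few lines. This is only an illustration, not something set up on the stream: the function names, the CSV layout, and the log path in the usage note are all assumptions, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os
import time
from datetime import datetime, timezone


def load_sample() -> str:
    """Return one CSV row: UTC timestamp plus 1-, 5-, and 15-minute load averages."""
    one, five, fifteen = os.getloadavg()  # Unix-only
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp},{one:.2f},{five:.2f},{fifteen:.2f}"


def record(path: str, samples: int, interval_seconds: float) -> None:
    """Append `samples` rows to `path`, pausing `interval_seconds` between rows."""
    with open(path, "a", encoding="utf-8") as log:
        for i in range(samples):
            log.write(load_sample() + "\n")
            log.flush()  # keep the file current if the process is killed
            if i < samples - 1:
                time.sleep(interval_seconds)
```

Something like `record("/tmp/loadavg.csv", samples=1440, interval_seconds=60)` would cover roughly a day, and the resulting file can be graphed with any CSV-aware tool.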

...55Nabile, as we saw with the grayhatter_ nft list table inet filter | grep "botnet" | wc -l 27380
grayhatter_ this oneliner has been running for days
lines. Even with my attempt at wide events, we're not logging as much as I wanted. grayhatter, what is that? Is that a grayhatter_ that's the github thing I made
sourcetree.gr.ht.log.
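For readers who don't parse shell pipelines at a glance, here is a rough Python sketch of the filtering half of grayhatter_'s one-liner: keep log lines whose user agent claims Chrome 132 through 137, then pull out everything before ` - - ` as the client address. The regular expressions are copied from the pasted command; the function name is hypothetical, the access-log layout is an assumption, and the final `nft add element` step is deliberately left out.

```python
import re
from typing import Iterable, Iterator

# Same user-agent filter as the grep stage of the one-liner.
CHROME_RE = re.compile(r"Chrome/13[2-7]\.0\.0")
# Same capture as the sed stage: everything before " - - " (greedy, as in sed).
CLIENT_RE = re.compile(r"(.*) - - .*")


def flagged_clients(lines: Iterable[str]) -> Iterator[str]:
    """Yield the client field of each log line that matches the Chrome filter."""
    for line in lines:
        if not CHROME_RE.search(line):
            continue
        match = CLIENT_RE.match(line)
        if match:
            yield match.group(1)
```

Each yielded address is what the original command hands to `xargs` for insertion into the `abuse-http` set with a 90-day timeout.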

02:56:41grayhatter_ src
grayhatter_ you have srt
Hmm.

...57chamlis_ I have no direct experience with observability stuff so don't know how much it could help/harm
So, chamlis, to finish the thought before I get totally sidetracked by questions and other stuff: it just has to feel like it's pulling its weight in terms of complexity, and because we're not a commercial service and we've been doing okay at whack-a-mole with the bots, I lean against it. But if the full page caching was no longer sufficient for keeping the bots from beating the shit out of the site, yeah, some kind of observability tool. Honestly, if Hunter and I can get Anubis shipped, which I would really hope to do real soon here, then we have less interest in observability, because we're going to have fewer, not downtimes, but fewer slowdowns.

02:58:22grayhatter_ I would not deploy anubis... but I've stood on this soapbox before
Nabile i saw iocaine in the log, is it in use and does it make any diff ?
So grayhatter, grayhatter_ Nabile I don't believe iocaine works
We're setting it up to, out of the box, just filter /search and just filter known botnets, and everybody else should get passed through. And then iocaine, I just saw somebody talking about it. Where was it? Maybe it was Bluesky, maybe it was Hacker News. But they were saying they were surprised that they were already getting 25 hits a second

02:59:00grayhatter_ can you limit it to user agents?
graefchen lobsters, someone talked about it with that very cool map of ip addresses. limesLurk
dzwdz oh i thought this was a new sr.ht feature
iocaine is kind of two concepts. Number one is like a lightweight WAF, where it tries to identify which of these things are bots, and then the second part is serve them garbage. And I really only want that front part, right? Like, I want the WAF part. I don't care about serving them garbage. And theoretically, it's poisoning the LLMs and they'll have nonsense. I don't know. Anubis can be limited to user agents. graefchen *regarding iocaine
And if you pull up the open pull request we have on it, I think there is a setting related to that there. Let's pull out some tabs.

03:00:11grayhatter_ if you do it wrong, it poisons other datasets too... I say too, but I don't believe that iocaine actually has any effect on LLMs
Nabile supposedly scrapers would notice poisoned data and stop attacking, but that assumes they care
dzwdz i mean if they don't care that's even better
chamlis_ yeah I'm dubious of how effective poisoning training data can be
grayhatter_ dzwdz IFF it works
Yeah, yeah. Honestly, the issues we've had with scrapers have been the ones that don't notice and don't care. You know, they have so much funding, they don't build a proper control plane for their thing, and so it's like they don't notice that they're sending so much traffic the site is slowing down, because that level of data centralization costs, and they don't notice that they're scraping pages over and over far faster than they change, and they don't notice and they don't notice and they don't notice. Like, we've had over a decade of we don't really care too much about scraping, especially from things that respect our crawl delay, because they're just not causing that much load, and most of the stuff they're doing is beneficial, like building search engines, or at least harmless. This wave of badly coded scrapers, they use hundreds of thousands of IPs to skip any rate limiting, or to skip our basic rate limiting, and then they just hammer the shit out of the server and we slow down, and that's the only thing I care about. Like, honestly, if the scrapers would respect the crawl delay, they'd still be able to scrape every single URL on our site, because we only have about a million. And they would be able to keep up to date trivially. And it would take them, what, like a week? What's a million seconds, a week? Something in that neighborhood? Week and a half? Eleven days? I don't remember. grayhatter_ 1m is a week
I did the division once.

03:02:42grayhatter_ 1b is many years
And if they could just keep it to a dull roar, I wouldn't really care. dzwdz more like 2 weeks
A million is a week? Thank you. It's, ..
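The back-of-the-envelope division is easy to redo. A minimal sketch, assuming roughly a million URLs (the figure from the stream) and one request per second; the one-second delay is an illustrative assumption, not the site's actual crawl-delay setting, and the function name is made up here.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def crawl_days(url_count: int, crawl_delay_seconds: float) -> float:
    """Days needed to fetch url_count pages at one request per crawl_delay_seconds."""
    return url_count * crawl_delay_seconds / SECONDS_PER_DAY


# One million URLs at an assumed one request per second.
print(f"{crawl_days(1_000_000, 1):.1f} days")  # prints "11.6 days"
```

So a million seconds is a bit over a week and a half, which matches the "eleven days" guess, and a billion seconds at the same rate works out to a little under 32 years.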

...58Right. It's just... They could politely get everything they want and we would never notice and never care. It's just the aggressive, lazily written stuff that's annoying, because it's so aggressive it knocks us down. Well, I say that. We haven't actually crashed because of these. We've had significant slowdowns where instead of loading in, you know, a tenth of a second, pages take tens of seconds to load. That's real fucking rough. Right now, we grab the home page. Like, this is... I'm on a standard crappy American home high-speed bandwidth. There are, what's the right term? Disgust quotes around high speed on that one. dzwdz might want to stop recording that vim macro
Okay, so we got that page in a quarter of a second. And that's even with the terminal stuff.

03:04:20Recording room. Yeah, I must have nudged into it. There we go.

...30I actually expected that to be a lot faster. I thought that was going to say something like 20 milliseconds, because hitting the full page cache is fast as heck. And caddy serves, oh my god, caddy serves a staggering amount of traffic.

...56Thanks, TC.

03:05:02I should explain that. That's why our caching is so important and why I basically just spent all of office hours on it, even though I wanted to look at the issues. Let me double check the format here real fast. So let's set up.

...29Just look at that. And then this one calls it.

...38Is it .path or is it .url?

...47It's url. It's full path. Come on.

03:06:04It's url, yeah.

...13There we go. All right. So...

...23Here's the Rails server log on the left. This is hits that actually make it through to Rails. dzwdz severe-weather.com
So this is... dzwdz tf was that
That get past the full page cache. And so you can see, like, yeah, American weekday, we're at a pretty steady state here. This is in the neighborhood of 10 a second. Are we in favor or opposed to severe weather? dzwdz they popped up in the log
I don't expect any. dzwdz idk what it is
So this is all fine. Oh. It was probably a spam bot that hit the search engine and made it through or something like that. So I'm going to bring up the caddy log. This is URLs that hit the full page cache or that hit one of our bot protections. Now, you might notice the speed difference between these two. This is live. This is prod right now, live, both of these. Oh, that was some bot finished its run trying to spam us. There's somebody who loaded, you know, a page with 40 avatars on it. But the difference between these two is why I put so much time into the full page cache. Because if Rails was trying to serve at this speed, we would need like four more boxes and realistically we would need a CDN and all of this kind of nonsense. This will pick back up. Somebody else will, yeah, see there's some other bots started scraping us and now the pace is picking up again. This is just the reality of a production service, right? What was that about a severe, oh, I saw that. I saw that, I'm on. Which one of you was it? Put your hand up. Oh yeah, so this one is, there is a particular bot, chamlis_ I don't even see the URLs any more, I just see bot, user, rss reader
dzwdz lol
grayhatter_ chamlis_ I understood that reference
that constantly tries to hit us to post URLs, and it's not logged in. I don't know where it figured out this URL structure. Yeah, it's funny how prophetic The Matrix was, and mostly in that my future is eating goop and reading green scrolling text. dzwdz stream title?
grayhatter_ literally green scrolling text currently
Yeah, and I suppose that is actually what I did, because I could see that text flying by on the right and I could tell you which of our bots was our bots, like, we own that, which was our noisy neighbors, or what kind of bot was hitting us, because we saw two different kinds there. All right. Who was Hi Mom? I saw that. Who is Hi Mom on the stream? Not mad or anything. I wondered if anybody would think of it. And I appreciate that you didn't throw up something, I don't know, scatological or try and shove ASCII art in there. Don't get any ideas, please.

03:10:00dzwdz there's that one classic ascii goatse
Email marketing company left 809, what on earth is this? You know, if you post that, the Twitch moderator is going to get you for that. Ah, I thought it was going to say 809 million dollars or something. I was like, why is there a business thing? This is just some old post somebody was scraping us.

...31Where's my little banner? We've gone for over three hours. We have seen all kinds of nonsense. Oh, here's the enumerator. That's another bot that's always here. There's a bot that's always trying to find user accounts and enumerate them. We've seen four or five examples of... grayhatter_ can I have that IP address?
dzwdz the .json ones are almost certainly well behaved bots, right?
users whose accounts have been taken over by spam bots, and they've posted, like, I don't know, I think of it as garbage tier spam, or it's not even targeted to the site. So if somebody had posted

03:11:25"check out my startup," I would think maybe there was a human on the other line. But when you see garbage tier spam like "best escorts in Mumbai," that was one of them, you know it's a totally automated bot. grayhatter, what IP address do you want? And the JSON ones, yes, they're almost certainly well behaved bots, and more importantly, oh no, it's two things. grayhatter_ from the request you know is a bot trying to do user enumeration
grayhatter_ that's the kinda stuff I used to work on at IG
The JSON ones are often users writing their own bots, their own integrations, or just playing with the site. And the other one, grayhatter, they have several thousand IPs. They especially like to come in through Tor. dzwdz anything interesting in the user agents?
dzwdz :((((((
So at this point, I think we have something like half of Tor blocked because they just keep tripping the rate limit on bad behavior. dzwdz i meant in the user agents of the .json bots
grayhatter_ really? I'm not seeing that
dzwdz because they're the good ones, so i'd assume the user agents are meaningful
Yes, there is infinite complexity in the user agents. The other thing that JSON is, is there are two or three different RSS readers. Oh, sure, we could. Let's see, we'll grab this, we will grep for json. dzwdz there might be sensitive data in these, might not want to show that live
And then, hold on. I'm going to pull one of these off screen so I can see the user agent. Because I never remember the lookup path for the Caddy file. I'm in the Rails one all the time, but there's a ref. for, let's see, request. Is there a .headers? What do you call it? So we're in .request.headers.user-agent. No, that's not it. All right, here we go. There should not be sensitive data in a user agent. Honestly, the thing I would expect to see in a user agent is spam. There have been enough websites over the years that print user agents that people have put spam links in user agents in the hopes that they will show up in automated logs.

03:14:50grayhatter_ I've seen those
grayhatter_ I miss those
dzwdz grep .json too
dzwdz i think these are the interesting ones
And let's leap off the head, see how many there are.

...59Yeah, REC is, I think, a Python library. Oh, I left off the, yeah, you're right.

03:15:16Thanks for catching that. So rather than do 10,000, let's just start with 100. We should get there in not terribly long, because we see a pretty high rate of JSON hits. And that'll give us a reasonable sample.
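The sampling being done here (take the first hundred `.json` hits and look at their user agents) can be sketched offline too. This assumes Caddy's JSON access log layout, where each line is one JSON object with the path under `request.uri` and header values stored as lists under `request.headers`; the function name and the limit are made up for illustration, and this is not the command run on stream.

```python
import json
from collections import Counter
from typing import Iterable


def json_user_agents(lines: Iterable[str], limit: int = 100) -> Counter:
    """Count User-Agent values for the first `limit` requests whose path ends in .json."""
    counts: Counter = Counter()
    seen = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        request = entry.get("request", {})
        path = request.get("uri", "").split("?", 1)[0]  # drop any query string
        if not path.endswith(".json"):
            continue
        # Header values are lists in the assumed log format; take the first one.
        agents = request.get("headers", {}).get("User-Agent", [])
        counts[agents[0] if agents else ""] += 1
        seen += 1
        if seen >= limit:
            break
    return counts
```

Calling `json_user_agents(open(path))` on a log file and then `.most_common(20)` on the result gives a quick ranking, which makes a repeated agent stand out the same way it did in the scrolling output.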

...42This is probably a bot. That's a pretty old version number for Chrome. dzwdz and by "sensitive" i mean that i might e.g. add some information about a personal bot in the user agent, but not want its existence to be public
grayhatter_ take a look at my grep command
dzwdz ForumRetry? huh
And you can see it's like repeated a lot. All right, so here's things that are just hitting JSON. Yeah, so we see a bunch of programming tools. I don't know what a pilot agent is or the pilot protocol, but that sounds fun to check out.

03:16:16dzwdz claw/35
Ah, so whatever this pilot thing is, I hope that's... Well, it should be requesting robots.txt and seeing that it should fuck off. dzwdz oh
But yeah, Claw is an unofficial iOS client.

...42dzwdz i thought it was the other sort of claw
Oh, well, no, that's... pushcx https://apps.apple.com/il/app/c…
So I was saying, oh, this is probably garbage now. No, here we go. I was referring to Claw, the unofficial lobsters app. dzwdz no results for ForumRetry, hm
The other sort of Claw may also be named for us. dzwdz i wonder what it is
So back in, I'll close out the stream with a little story. Back in August... Well, not in that two minutes, dzwdz. Oh, you're saying you Googled it, yeah. So back in August, somebody posted a blog post that admitted it was written by Claude, and it was about startups. So I removed it because that's off topic twice. Like, number one, it's about startup-y stuff. And number two, it's written by fucking Claude. Like, we all have access to Claude. If you wanted a blog post about it, we could just do that.

03:17:55And didn't think anything of it because, you know, remove posts about entrepreneurship pretty regularly. It's settled down, but we go through bursts where it'll be like one a week for a while. dzwdz "Hyperadar" lmao
And then in January, I heard about this open claw thing and I pull up the webpage and it's all lobster themed. Nabile the other site exists for that very purpose tbh
And I was like, well, that's confusing as shit. grayhatter_ a C&D? lol
And so I sent the open claw guy an email saying like, hey, could you please not make your programming thing lobster themed? Because that's really confusing. dzwdz peter to peter solidarity
marcoroth_ who are you to tell me what to do?
I know this is kind of a pain in the ass. No, not an actual cease and desist, just like a polite email from me, because the guy was also named Peter and I was like, I'm aware there's a lot of irony here that a different Peter is running a lobster programming themed thing. It's a small world. And so I said, could you, I know it's a pain in the ass, but could you please not be lobster themed? Marco, that is not the response I got. grayhatter_ lmao
The response I got was you should unban my blog so I can have Claude write more blog posts about startups. marcoroth_ lol
dzwdz no fucking way
And I was like, sorry, what? Nabile xd
Because I did not recognize that this was the guy I banned, the site I banned in August for submitting startup stuff. graefchen Many things are lobster themed, surprisingly, like the Lobster Programming Language. limesO
I sent him an email and I was like, is this, for you, like, the only concern here, getting your blog unbanned so that you can get traffic? And the response was pretty much yes. And I looked into it more. So I blocked his blog in August grayhatter_ people who enjoy using LLM are unable to think at or above 3rd order thinking
Nabile is this a real human being
grayhatter_ X to doubt
and then a week later he had Claude write a blog post about how he should be allowed to have Claude write blog posts and post them on Lobsters, so he was real grumpy about it, I guess. And then in November he... Yes, this is a real human being. It's the guy who created OpenClaw. grayhatter_ real" and "human being
dzwdz this is too good to be true
It had a different name and then in November he gave it the lobster emoji and started telling it to be lobster themed and then named it OpenClaw.

03:20:50grayhatter_ two attributes I don't feel describe people who do stuff like that
Then in January I found out about it, and I didn't know the middle part. I didn't know he was real mad that his blog was removed for being off topic. So when I asked him to, like, please not name and theme OpenClaw like Lobsters, he only wanted to talk about his blog being banned. It was not great. It's kind of annoying to have pushcx https://lobste.rs/about#emoji
dzwdz do you have a link to that blogpost about not being able to post on lobste.rs
a viral programming project with the same branding. Like, oh yeah, the reason, for anybody who doesn't know about the emoji, right, like a couple of years ago we ran a fundraiser for the Unicode Consortium, just to donate to supporting this wonderful nonprofit. So... Oh, it's... Yes. I don't remember his domain off the top of my head.

03:22:07I would have to... What was his blog called? It's been... marcoroth_ https://steipete.me/posts/2025/…
It was like January that I had this interaction. marcoroth_ this?
So I would have to dig back. Oh, I put it in my little log file, didn't I? Oh, there you go. Marco, yes, that's it. Yeah, so this is the first time he started using the Lobster and then August. So maybe it was end of July then that I removed his post because I want to say this showed up like a week later.

03:23:06Yeah.

...16Yeah, this. You should.

...27The whole thing is kind of silly. I wonder if it wouldn't be better to judge the value of writing on its own merits. graefchen That smells so much like AI. limesO
Nabile THE X
Okay, guy, I will give you an infinite stream of writing from Claude, and you can judge the merits of that writing and do nothing else with your life but judge the merits of LLM output. Yeah, great. That's because it is. He made a whole tweet thread about... Nabile the bigger picture, the bottom line
Nabile no x, no y, no z. no slop.
grayhatter_ there's no way he's actually read that
Having Claude write this thing to, like, protest against having slop removed, he wrote more slop. I don't know. I can't say I understand the motivation here besides wanted a bunch of traffic and is real grumpy that he doesn't get traffic.

03:24:28grayhatter_ because if he did, there's no way he'd have put his name on it
marcoroth_ you can always fix slop with more slop Kappa
So yeah, I look forward to OpenClaw and NanoClaw. Those fads will fade out. graefchen Slop the slop with more slop. limesPoggers
Heck, we practically outlived Jordan Peterson, so we'll outlive this thing too. Nabile sad bait
There's your ridiculous story to end the stream with. Slop to slop with more slop. chamlis_ would never have guessed that connection
grayhatter_ > The value of the writing should be judged on its own merits.
grayhatter_ and by the values and understanding of the author
Yeah, the idea of a community where people are communicating just did not seem to be something inside of his wheelhouse. It's also, yeah, and grayhatter, it's funny, highlight that again. It's also like dzwdz well there's OpenClaw-Digest
dzwdz and what i thought was frottage-bot
marcoroth_ people? communicate? with each other? that's not automatable with Agents...
Hey buddy, your blog was judged by its own merits and it was removed on its own merits. So when, in, what was it, late January or early February, there was that OpenClaw instance that wrote an angry blog post because its slop was grayhatter_ you *should* judge writing on it's merits... but you should spend more time on people, on individuals who have demonstrated they care about others and correctness
removed from a PR, it hit the same thing of like, you're discriminating against LLMs and judge on its own merits. So that was very familiar. Which one was that? It was a Python project. grayhatter_ not reading something by someone you think is stupid **is** judging something on it's own merits
Nabile hmm, what they want is bots mindlessly agreeing with slop output, right ?
And a couple of months ago, like the claw instance put a slop PR There's going to be too many Python posts to find it, isn't there?

03:26:51grayhatter_ because you're trash, I'm not gonna waste my time on you
chamlis_ https://theshamblog.com/an-ai-a…
Yeah. Anyway. Yep, that's the one. Thank you, chamlis. That's the exact one I was thinking of. graefchen I am very appy to be able to use lobster filter. limesOks
The hit piece has... Ah, there it is. Are we going to evaluate code on its own merits? It's the exact same argument because it's from... dzwdz wait, what?
grayhatter_ signed his name to is the correct way to describe that
dzwdz what are you talking about
marcoroth_ tbf, it wasn't his personal agent
same guy. The guy who wrote OpenClaw is the same guy who wrote that. And I say wrote because he's the proximate human, right, the last responsible person, because it's, you know, there's the chain of: other Peter wrote grayhatter_ AI wrote something, and he signed his name to it
grayhatter_ he didn't write it
dzwdz that's a stretch
had Claude write the blog post, and then other Peter wrote OpenClaw, which wrote this blog post. Marco, we don't know that. We know it was an OpenClaw instance, we don't know what individual, but like the last known human in the chain is the same guy and it's the same crap.

03:28:34dzwdz and this reasoning is really dumb ngl
I'll spend some time this afternoon catching up on these issues. Oh, DZ, did you see I, if you want a screenshot, I pulled a couple of more JSON user agents here while we were chatting. dzwdz if you get a ransom note that was written in ms word, you don't blame microsoft because they're the last known humans
Devourer. Okay. I wonder what flex lobsters is. That sounds custom.

03:29:04dzwdz i sadly did not find anything about Devourer
dzwdz [citation needed]
grayhatter_ uh
Yeah, but Microsoft doesn't kidnap people or make software tools that are intended for kidnapping people.

...25dzwdz ever heard of military contracts?
grayhatter_ @dzwdz agreed
No, I have never heard of military contracts.

...34dzwdz they're great, lobste.rs should get one
grayhatter_ lol
chamlis_ thanks for the stream!
Alrighty. Yeah, it would be a great support for the site, right? No one would complain. All right, everybody, I'm gonna roll out. The next scheduled stream will be Tuesday afternoon, 2 p.m. Chicago time. Everybody can bring pastries. dzwdz is going to bring a military contract, I'll bring some JSON. chamlis can bring bug fixes for my Caddy config again.

03:30:12Take care, folks.