I noticed that you banned my 470,000 IPs
Streamed
More perf issues from aggressive scrapers. Added caching to domain, origin, and category pages. PR #1821 moving vote links above story text in emails. Issue #1692 about comments appearing in the wrong place. Rack Attack rate limiting and firewall configuration. Open source sustainability and the value of paid subscriptions for developer tools. Kleenex testing for getting user feedback on new products.
scratch
topics
performance issues #1814
issues
reply in wrong place https://github.com/lobsters/lobsters/issues/1692
PRs
vote link in emails https://github.com/lobsters/lobsters/pull/1821
more cacheable pages
title
post-stream
Transcripts are generated with whisperx, so they mistranscribe basically every username and technical term. They're OK but not great, advice appreciated.
Recording
02:37How's it going?
This is the Lobsters Office Hours stream.
I'm Peter.
What you see on the screen is Lobsters.
And amazingly now, if I reload, you see it immediately instead of in hundreds or thousands of milliseconds.
That's kind of my one planned topic for the day: the slowdown that we've had for the last week.
pushcx This is open hours for discussing Lobsters, feel free to bring up topics in chat anytime!
But oh, yeah, feel free to bring up topics in the chat any time.
So we'll throw that in here and say I'm starting my
03:36We've got the performance issues, which is 1814. You know it's trouble when I've learned the issue numbers by heart. And then do we have any PRs? Do we have any issues? All right. And then... Let's see how this goes. So I'm only planning to...
04:10Only planning to stream for an hour today because I have spent, oh boy, a whole lot of hours the last few days dealing with performance stuff.
pushcx https://github.com/lobsters/lob…
Let's bring that up.
Yeah, let's start there.
So this issue, I kept this up because it's been an intermittent thing the last couple months.
And it got real bad on Thursday and then was fine on Saturday and then got worse and worse.
You know, fine on Saturday, it was okay on Saturday.
It wasn't normal, where normal is like the usual zippy site that we've had for years.
It was sluggish, but it was okay on Saturday.
So I wrote up some of what we've done.
bsandro chikaiLurks
We've tried iocaine. I just remembered something. Okay, taking care of a chore. And that didn't really work out, but we saw a spam bot that... bsandro, I don't know if you...
Oh, there is a just all transparent emoji.
Okay.
So we saw hundreds of thousands of attempts to post spam and I blocked that and we got better on Saturday.
But then the other scrapers that want to do hundreds of thousands of hits per day were like, oh, the site is more responsive.
I'll step it up, I guess.
bsandro heyo! it is an animated gif of chikaiLurk , sorry, forgot they don't render there
These bots are not particularly smart.
They're just really big and aggressive.
And I talked about it more... Yeah, so then...
Animated GIF.
Huh.
You know, I've started in the Twitch stream manager getting some silly toast that says, oh, you have to enable WebGL for the...
stream manager. I don't know. So, obviously not doing that. So let's see. There have been a whole lot of commits. I haven't even remembered to put the 1814 on all of them. I think there were like two or three more since that. And then yesterday
I recognized a really big, dumb, aggressive scraper.
And I call it big because it had hundreds of thousands of IP addresses, mostly in China.
Like, you know, I looked at the list and the ones.
So it was doing like.
07:31I'm trying to estimate correctly, because I didn't stop and totally line up the logs, but it was like, I looked in a log file that was less than 24 hours and it did over 800,000 hits.
And there were some IPs that it used as much as... I think the most reused IP was 22 times. And when I looked down that list, they were all, like, data centers in China. But when you looked at the IPs that it only used once in that, call it, 22-hour period, they were clearly residential proxies.
bsandro friend of mine fought off the scrapers by just disabling https on his forum
theGeekPirate I do love me some scraping, but it always blows my mind how flagrant some can be
Because it's like, yeah, a bunch of these are just literally T-Mobile IPs or random residential Latin American IPs.
Yeah.
And the thing is, we have something like a million URLs on the site just to put a roughly round number on it or, you know, to get the right sig figs, call it
Yeah, call it about 10 hundred thousand.
And if you are scraping at a rate of 10 requests a second, and it clearly wanted to scrape faster
And the only reason it was scraping at like five requests a second was that the site got so sluggish.
It was starting to shed load where people were seeing 502 capacity errors or 429 errors that were not the firewall.
They were the site just being so swamped.
And so if you are scraping a site with a million pages and you're going at 10 requests a second,
you're going to finish in a hundred thousand seconds, which is, what, 13 hours? And so the fact that it was going for five or six days means it was just scraping the whole site and then starting over again. And I say that, but I didn't really try and diagnose
like, is it scraping the whole site in alphabetical order?
Is it most recent first?
Does it hit every page on the site and then start again?
Or is it like, I didn't really try and run down its strategy because once you get past the first half day, it is clearly like re-scraping the site over and over.
And once you get past three days, it's like, oh, you're just not going to stop.
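The estimate above can be checked with a few lines of Ruby. This is only a sketch using the round numbers from the stream (a million URLs, 5 to 10 requests a second); `hours_for_full_pass` is a made-up helper name. Worked through, a hundred thousand seconds comes to a bit under 28 hours rather than 13, which doesn't change the conclusion: a scraper that runs for five or six days has been over the whole site several times.

```ruby
# Back-of-envelope: how long does one full pass over the site take?
# The figures are the round numbers quoted on stream, not measurements.
TOTAL_URLS = 1_000_000

# Hours needed to fetch every URL once at a steady request rate.
def hours_for_full_pass(urls, requests_per_second)
  seconds = urls.to_f / requests_per_second
  seconds / 3600
end

[10, 5, 1].each do |rate|
  hours = hours_for_full_pass(TOTAL_URLS, rate)
  puts format("%2d req/s: %6.1f hours (%.1f days)", rate, hours, hours / 24)
end
```

Even at one request a second, a single pass would take under twelve days, so there is little reason for a crawler to hit a site of this size harder than that.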
And it especially got, I think part of what confused it is, I think it's been around for a while.
So one thing, let's just pick a random story.
A little while ago, I changed the comment template to show these short links.
So if you look in the lower left, as I'm highlighting a random comment, it says /c/ and then the short ID of the comment, rather than the long, you know, this whole /s/, the short ID of the story, the title of the story, and then the anchor tag. If you click on this, you get redirected, but it's just nicer for posting on social media or pasting around because it's, you know, 15 characters or whatever long. You can see the whole thing, as opposed to it's going to line wrap.
The one downside of this is you get a 302 on the server.
When you post to, or when you try and load one of those comment links, you just instantly get a redirect.
And maybe a week ago when this started, I did some performance work to cache those lookups faster so that we could serve them better.
And that helped, but
I think something in the coding of the bots sees that they're getting a 302 link and goes, you know what?
If that's a temporary redirect, we got to check that again.
Because boy, did they really hit those.
And I saw them a lot in the Rails log because those can't get served out of the full page cache, which serves documents.
It doesn't cache headers.
And so part of the reason that we've been slow has been
These couple of big dumb aggressive bots had, what, 650,000 redirect URLs now that they wanted to scrape and re-scrape and re-scrape.
And all of those go through to Rails instead of coming out of the cache.
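The idea of caching those redirect lookups can be sketched roughly like this. It is not the Lobsters implementation: `RedirectCache`, the fake database, and the example IDs and paths are all invented for illustration, and a real version would also need expiry and a size limit.

```ruby
# Hypothetical sketch: memoize short-ID lookups so repeat hits on the same
# /c/<short_id> redirect skip the database. Not the Lobsters implementation.
class RedirectCache
  attr_reader :misses

  # slow_lookup: any callable that maps a short ID to a target path
  # (a stand-in for the database query), or nil if the ID is unknown.
  def initialize(&slow_lookup)
    @slow_lookup = slow_lookup
    @targets = {}
    @misses = 0
  end

  def target_for(short_id)
    return @targets[short_id] if @targets.key?(short_id)

    @misses += 1
    @targets[short_id] = @slow_lookup.call(short_id)
  end
end

# Fake "database" for the example; the IDs and paths are invented.
FAKE_DB = { "abc123" => "/s/xyz789/example_story#c_abc123" }.freeze

cache = RedirectCache.new { |short_id| FAKE_DB[short_id] }
3.times { cache.target_for("abc123") }
puts cache.target_for("abc123") # the stored target path
puts cache.misses               # 1: only the first lookup reached the "database"
```

A cache like this still runs inside the application, so every request pays for the app server; it only removes the database round trip.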
theGeekPirate Oof
The full page cache is so performant that we have been protected from all of the really aggressive dumb bots that people have been complaining about for the last two years.
12:58Yeah, Geek Pirate, that's...
Wait, bsandro, could you say more about disabling HTTPS?
Did they just ignore HTTP sites?
Do they just assume they're misconfigured?
There's no way we can turn it off, but that's a really interesting story.
bsandro yeah seems like it, they go for https:// only
If you maybe have a link or something, I'd love to read about that.
bsandro and he doesn't have a redirect to http
So I actually...
That's really interesting.
pushcx https://github.com/lobsters/lob…
So geek pirate, if you look at, do I have that?
Yeah.
We're kind of the target for a lot of, I, I don't mean to be like mean in public, but I call them baby's first scraper where junior developers learn about the idea of scraping.
And then they're like, oh, I want to try writing one.
Oh, I need a site with a bunch of stuff on it.
I know a site with a bunch of stuff on it because
theGeekPirate lol Rack Attack, my buddy used to work there (almost certainly unrelated to whatever you mean there)
Maybe they read the site every day.
So we have a disproportionate number of scrapers that are very beginner coded.
14:13Rack Attack was a business?
Rack Attack... so Rack is the Rails, or I'm sorry, the Ruby,
It's practically just an API or a middleware for web applications.
And then Rack Attack is a common plugin for basic rate limiting.
theGeekPirate Yeah, they sell racks for cars (for bikes, skiis, etc.)
It's not super smart, but it's expressive enough to deal with dumb bots.
So the message that you get when you get throttled says, hey, you can read the code to our firewall or our rate limiting.
And then up here, there's a comment about like,
hey, here's how to be more performant.
The short answer is, if you do a sleep one between your hits, which is what I put in the message down here, you are a pretty well-behaved bot.
And I didn't think I would have to say like, don't just hit us indefinitely, forever.
Remember what you scraped.
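What a well-behaved bot looks like under that advice can be sketched as below. This is an illustration only: `PoliteCrawler` and its `fetch` and `pause` parameters are hypothetical names, injected so the example runs without network access. A real bot would pass an HTTP client and let the default pause call `sleep`.

```ruby
require "set"

# Hypothetical polite crawler loop: at most one request per second, and never
# refetch a URL it has already seen. `fetch` and `pause` are injected so the
# sketch runs without network access.
class PoliteCrawler
  def initialize(fetch:, pause: ->(seconds) { sleep(seconds) }, delay: 1)
    @fetch = fetch
    @pause = pause
    @delay = delay
    @seen = Set.new
  end

  # Fetches each new URL once, pausing between requests.
  # Returns the list of URLs actually fetched.
  def crawl(urls)
    fetched = []
    urls.each do |url|
      next unless @seen.add?(url) # add? returns nil if already present

      @pause.call(@delay) unless fetched.empty?
      @fetch.call(url)
      fetched << url
    end
    fetched
  end
end

pauses = []
crawler = PoliteCrawler.new(
  fetch: ->(url) { "<html>#{url}</html>" }, # stand-in for an HTTP GET
  pause: ->(seconds) { pauses << seconds }  # record instead of sleeping
)
first_pass  = crawler.crawl(%w[/a /b /a /c])
second_pass = crawler.crawl(%w[/a /b /c])
puts first_pass.inspect
puts second_pass.inspect
```

The second pass fetches nothing because every URL was already seen, which is the "remember what you scraped" part.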
graefchen Also remember that there is a robots.txt. limesO
So this has gotten touched a couple of times in the last few days. I added... yeah, so ChatGPT just ignores robots.txt. This got a snarky commit message because I was pretty frustrated, but it's always frustrating when you see supposedly well-behaved and well-resourced businesses just ignoring standards.
theGeekPirate I'm certain they'll abide by robots.txt :)
Yeah, you think I should put the robots.txt in the message here, maybe?
I don't know.
They sure as shit don't.
Yeah, I mean, if you look at the...
This is what it looks like when I get frustrated.
Where is it?
An LLM company ignores robots.txt.
What?
Yeah.
And part of it was, I just touched it.
I've got to go look at our...
16:29pushcx https://github.com/lobsters/lob…
graefchen If you dont wanna be rude. So LLM are rude! limesNodders
This also got changed recently, our actual robots.txt.
So I've already griped on stream about companies who lie about respecting robots.txt, but I recently flipped strategies because
Previously, so this hasn't gotten touched much in the last year.
And then...
17:01I flipped it to an allow list strategy.
Because it used to say...
It'll be in the diff.
It used to say, hey, here's a couple of badly behaved user agents that we don't want to deal with.
You guys can fuck off.
And even the companies...
I mean, obviously, some of them didn't follow it.
But the real irritating thing was Anthropic. Like, they're in here twice.
There's this, or no, I'm sorry, they're in here three times with anthropic-ai, ClaudeBot, and Claude-Web.
And then I saw them again in the server logs, and I was like, what the shit?
And I looked it up.
And they now have like eight different user agents they use.
So the impression I get is every time people start blocking their user agents, they create another one and then they just start scraping again.
So, okay, great.
This is why we can't have nice things.
theGeekPirate rofl, of course they do. Fuckers.
Let me list a couple of good search engines and bots,
and everybody else can get disallowed.
I don't know what else to say, right?
I put in this content signal thing.
It's a proposal from Cloudflare.
As far as I know, nobody supports it, but I'm trying.
Honestly, the crawl delay one is the big thing, and it's not standard.
dhdhhdhdhd1wsd hey
And all of this stuff would be fine if they supported the crawl delay.
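The allow-list strategy described above could be generated with something like the following. This is a sketch under assumptions: the helper name `allow_list_robots_txt` and the agent names are placeholders, not the list Lobsters uses, and `Crawl-delay` is a non-standard directive that only some crawlers honor.

```ruby
# Hypothetical sketch of an allow-list robots.txt: name the crawlers you
# accept, then disallow everyone else. The agent names used below are
# placeholders. Crawl-delay is non-standard and not universally supported.
def allow_list_robots_txt(allowed_agents, crawl_delay: 1)
  groups = allowed_agents.map do |agent|
    "User-agent: #{agent}\nAllow: /\nCrawl-delay: #{crawl_delay}\n"
  end
  groups << "User-agent: *\nDisallow: /\n"
  groups.join("\n")
end

puts allow_list_robots_txt(%w[ExampleSearchBot AnotherGoodBot])
```

A crawler that follows the standard uses the group that names it and falls back to the `*` group otherwise, so anything not listed is disallowed. None of this helps against bots that ignore the file.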
It's...
I mean, I'd still rather not have the LLM training.
Howdy, keyboard mash person.
19:11But if they're going to hit us at a rate of 5 to 10 hits a second,
Yeah, they've become a significant percentage of all site traffic.
And again, we have a million URLs is all.
Less if you ignore the comment redirects.
Not that I expect their bots to be that smart, but even at a rate of one a second, you finish that in a reasonable amount of time.
Your bot should have heuristics to notice that these pages are not changing that much, especially anything that's older than like a week.
theGeekPirate I can't imagine hitting even once per second, I'd feel bad for most sites which probably don't implement caching properly
It doesn't change.
Leave it alone.
...55Yeah.
Caching, the full page caching is huge for us because even stuff like serving a 302 redirect, if it touches our actual OLTP database, causes a significant performance load at the rate that they're hitting us.
And that was the thing with the spam bot that got redirected to login.
dhdhhdhdhd1wsd hey can i talk about a thing i am working on?
Comment redirects: I had to make a hot path cache for looking up those redirects so that it doesn't hit the OLTP. And then on top of that, it's just literally, if it touches Rails, we're going to have a performance problem, because Rails is going to do a database query and the overhead of talking to the database...
dh, yeah, people occasionally drop in with questions about the thing you're working on. I'm happy to commit a minute, but mostly we talk about Lobsters on the stream. So anyways, you can throw it out here. Hopefully you're not going to ask about, you know, "So I'm aggressively scraping Lobsters at a rate of 10 a second and I noticed that you banned my 470,000 IPs."
In which case, please tell me your actual name and your home address.
theGeekPirate How do you decide what to cache, popularity + first X pages?
They're almost certainly Chinese, given how many of those IPs were Chinese.
21:43graefchen Same. When I wanted to write a Crawler by myself, the first thing I wanted do is writing a robost dot txt parser because I did not want to be rude. limesNodders
Geek Pirate, so... No, it's... You can look for it in our code base.
It's just...
pushcx https://github.com/search?q=rep…
What's the name of the helper?
Is it caches_page, I think?
Yeah.
22:04Oh, yeah, graefchen, that's nice.
theGeekPirate Oh yeah of course I could just look, heh
So I basically put it on the full page cache on everything that helps.
And the important thing about the full page cache is this is not, Rails fills the cache,
But Rails is not involved in serving from the cache.
We have some caddy config that serves these files out of the cache.
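The split described here, where the app fills the cache and the web server serves from it, can be sketched in plain Ruby. This is not the Lobsters code or its Caddy config: `PageCache` and its methods are hypothetical, and the `read` method only stands in for what a front-end server would do when it checks for a file on disk.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical sketch of a full-page cache on disk. The app writes rendered
# HTML to a file named after the request path; a web server in front can then
# serve that file directly without calling the app.
class PageCache
  def initialize(root)
    @root = root
  end

  # Called by the app after rendering a cacheable page.
  def write(request_path, html)
    file = file_for(request_path)
    FileUtils.mkdir_p(File.dirname(file))
    File.write(file, html)
  end

  # What the front-end server does: return the file if present, else nil
  # (meaning "fall through to the app").
  def read(request_path)
    file = file_for(request_path)
    File.exist?(file) ? File.read(file) : nil
  end

  # Expire a page when its content changes.
  def expire(request_path)
    FileUtils.rm_f(file_for(request_path))
  end

  private

  # Maps a request path to a file under the cache root. A real version
  # would also need to reject paths containing "..".
  def file_for(request_path)
    name = request_path == "/" ? "index" : request_path.delete_prefix("/")
    File.join(@root, "#{name}.html")
  end
end

Dir.mktmpdir do |dir|
  cache = PageCache.new(dir)
  cache.write("/t/ruby", "<h1>ruby</h1>")
  puts cache.read("/t/ruby")          # served from disk
  puts cache.read("/t/rust").inspect  # nil, so the request would go to the app
end
```

Because a cache hit is just a file read, it only works for pages that look the same to every visitor, which is why logged-in-only pages are left out.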
theGeekPirate I see, that's awesome
theGeekPirate Gotta love Caddy
And Caddy is so much faster than Rails for this kind of stuff that even just "let me look this up by the primary key string of your path and map you this whole file and be done" is just so much faster from Caddy. And so we basically full page cache all of the stuff we can.
Yeah, and it's not that... I mean, Caddy does have good performance, but it's, you pick a language that doesn't have garbage collection and has mmap and...
Or there's, you know, I mentioned mmap on stream and somebody a little while ago told me that there's a new API replacing it.
We looked it up and it didn't stick because I haven't actually used the API.
But I'm going to say mmap because I'm old.
Yeah.
So this, these handful of caches are why the site doesn't just fall over all the time.
We would be struggling from everything else.
You know, I should throw these on the domain and origin views.
Yeah, let's do that now.
Let's do more caching.
23:55No, come here. Where am I here?
24:09Got to take a note.
...17Here we go. So I got to bounce this Vim session, because I didn't start it in the right working directory, and that's going to drive me nuts the whole time. So, controllers, domain, let's add caches_page. Isn't there any? Index view here. Oh, or maybe it's for a home controller.
25:19Yes, let's.
theGeekPirate How old are you?" "I'm like Rob Pike and aschew syntax highlighting
Let's do all of these.
I'm going to stand up for two seconds and grab a glass of water because I can tell I'm getting froggy and that's just going to get worse over the next half hour.
One sec.
...58dhdhhdhdhd1wsd hey i am working on this
dhdhhdhdhd1wsd https://voiden.md
dhdhhdhdhd1wsd some feedback
All righty, that's a little better. So let's grab all of those and then come here. Ah, I'll take a look in two seconds, let me finish this thought here, hold on. So let's say, grab this. Oh, we can't grab upvoted because that's by user. Like, you have to be logged in to see that page.
So we'll say this and let's sort these.
26:53Okay. There we go. So that's going to be a bunch of stuff cached that was not cached before. Yeah. So we're adding... hidden, you have to be logged in to see. We can remove that. So we're adding the category pages, the domain, the origin, the multi-tag. Saved, you've got to be logged in for. And single tag. Oh man, I forgot that I broke these lists. All right, that'll help. Let's just say "cache more pages".
dh, is there something specific you would like feedback on? Let's get this deployed right now because it's slow. So let's take a look. OK. All right, so deploy can run in the background.
"Build API like you build code." Are you asking for marketing feedback? Because I've done a lot of entrepreneurship stuff in my career. I'm happy to give some marketing feedback.
28:59bsandro is it like grpc/protobuf?
theGeekPirate There's banding on the background gradient on Firefox, known Firefox issue iirc
Yeah, I honestly don't know what this does.
Oh, well, this is a bad.
Don't use this as a testimonial, or at least don't have it first.
But this looks like someone has seen the product and is just tweeting that they like the idea.
Yeah, same with this one and a little bit the one in the upper right.
This lower left one from Shiva R is actually using.
Put that one first.
Platforms.
Wait, so is it a platform or a developer tool?
I've read it and I don't understand what this does.
Like, this banner is only useful if someone already knows what Voiden is and is following you.
dhdhhdhdhd1wsd its like bruno
If this is a brand new thing, you have no traffic to your site, and this banner is just confusing all of your new visitors, get rid of it.
Build API requests like you build code.
I don't know what Bruno is.
30:23dhdhhdhdhd1wsd postman
dhdhhdhdhd1wsd offline postman
So I get the idea from seeing VS Code that it's some kind of programming tool.
theGeekPirate The imagine isn't large enough for me to see what's going on
It's like Postman.
theGeekPirate image*
I've heard of Postman.
theGeekPirate animation, whatever
Yeah, this is just like, oh, OK, it's a programming tool.
I see VS Code, but I can't recognize what's happening here.
Build API requests like you build code.
I still don't know what this means.
I mean, you're saying Postman a lot,
31:09dhdhhdhdhd1wsd ookie
I can't actually give any feedback on the tech unless I know what the heck it does.
...23bsandro i'll be honest it was already all done with protobuf schemes
Is it a... Oh no, the features button just scrolls me down.
File centric, text centric, git native.
Oh wait, so it's an API client?
No, bsandro, I don't think it's a protocol.
I think it's, if you're an API consumer, it's a scripting tool for hitting APIs.
...57dhdhhdhdhd1wsd yep yep its a api client
Is that roughly right?
Use cases.
32:07bsandro no I mean you can define rpcs right there in .proto and then generate everything from there
Yeah, okay, so I can read this. So we're saying we're going to do this POST, we're going to attach a header, we're going to send some JSON. Okay, so it's like a... "you can define right there in proto and generate everything." Hmm. "Test API behavior, not endpoints."
...37dhdhhdhdhd1wsd why dont you download it and try
"API documentation." Wait, so is this for consumers of APIs or producers? "Test API." Okay, so it's, yeah, not that first one, but these second two use cases are for producers. Oh, there's a little button for QA engineers. Everybody is a developer on an API. You got to say
theGeekPirate That text is close enough to what my code looks like anyways =b
producers or consumers.
33:18Oh, here we go.
This sentence.
theGeekPirate Just a few more parentheses
Steal this for your H1.
Build, test, and document APIs the way you work with code.
...34Oh, it's coming back to me.
Is Postman the one where
dhdhhdhdhd1wsd ookie
dhdhhdhdhd1wsd yep yep
olexsmir postman is the gui for curl but it can't work without an account
Weirdly, it's like an API client tool, but there's a SaaS involved for no good reason and they just cranked up the pricing? Is that the one I've heard of? All right, so what is the obnoxious thing? "What's new." Your little obnoxious animated button didn't go away. Like, I clicked on the notification dot and it didn't go away. All right.
34:15theGeekPirate Bruno says it's an "Opensource IDE for exploring and testing APIs"
And again, this, like...
theGeekPirate (something similar to Postman)
So I'm new to Voiden, and so all of this stuff is... Like, I don't know what the date on this is, so I don't know if this happened last week or last year.
And all of this is new to me?
You're, ..
So the thing I've done here is kind of talked out loud as I've read the page and tried to understand it. So what you want to do, dh, is Kleenex testing. It's called Kleenex testing because you can only use each user once and then you have to throw them away. So, like, I can't visit voiden.md again for the first time, so you'll have to find new people. But
The process of revising this is find people in your target market, sit them down with the page and ask them to explain back to you what it does, who it's for, what they would use it for.
dhdhhdhdhd1wsd where do i ind the audience?
And after you do two or three of these, you'll find a hundred issues and you'll have to take notes and revise the site in between. But after you get a couple of them down and people are accurately describing back what your tool does and who the users are, then you can give them specific tasks like "hit this API" or, you know, "add one..." dh, are you young? Because if you're in university, go find other students.
dhdhhdhdhd1wsd i am young
dhdhhdhdhd1wsd 25
If you're not, go to user groups in your city.
Meetups?
Yeah.
Yeah, I got the impression from your writing.
That's not intended as an insult.
I read a lot.
dhdhhdhdhd1wsd student sadly
So if you don't still have a bunch of...
dhdhhdhdhd1wsd hehe thats fine
classes with folks, go to user groups. Like, do you have a JavaScript meetup? That's not sad. But just, you know, find a quiet spot, open your laptop or your iPad, and ask people to look at the site and tell you what Voiden is and who it's for, and you'll revise. Good luck.
I know there's a bunch of different products in this neighborhood, and I'm using products loosely to include non-commercial stuff.
37:25The fact that there's a lot of stuff is not an "oh no, someone's done it before." Never have that feeling. It's an "oh, there's lots of people who want tools like these." It's a very programmer thing to think that you need to come up with a product that nobody else has come up with before. It's better to have something you know people will want. Like, bots want to aggressively scrape stuff and not follow standards.
38:07Some of this performance stuff is kind of fun, but mostly it is frustrating, because if you look at the history, it's just like, oh, yeah, this kind of took over my Monday, took over my Tuesday, took some time on Wednesday. That's a whole bunch of regular site maintenance I don't get to do because instead I'm dealing with aggressive scrapers.
...49All right, so no, these are the useless views.
dhdhhdhdhd1wsd what do you think of the current yc companies?
Let's look at the updated issues.
Updated two days ago.
Did I leave a comment there?
And this sounds like a new book.
I think I added this book.
What do I think of the current YC companies?
I don't know who they are.
dhdhhdhdhd1wsd hehe
theGeekPirate DHH had a wonderful talk fifteen or so years ago, about everyone wanting to build the next Facebook, not realizing they'll be rich by instead finding the smallest niche they can think of and catering to them instead, with a far higher success rate
I mean, I don't really follow startup stuff close enough to be like watching this year's team like fantasy football.
39:40Yeah, Geek Pirate, if that's interesting to you, I would tell people to, well, start with something like the...
pushcx https://www.startupsfortheresto…
Oh, I haven't talked about this on stream: Startups for the Rest of Us.
There's this podcast by Rob Walling.
He's been running for 10 plus years with this, but oh, he's now filtering down to SaaS.
Because SaaS is such a great business model for programmers.
But yeah, there's a lot of stuff here.
Oh, I bet the fact
theGeekPirate Yeah SaaS is insane
Oh, no, it's facts about the podcast rather than facts about entrepreneurship.
theGeekPirate I'll check it out, cheers
One of these days I'm going to write the, like, here's the 10 things I keep ranting at programmers who start startups.
I had a nightmare the other day that all of my blog drafts went live unedited.
theGeekPirate LOL
What a silly, like, it's the grown-up version of the I went to school and I didn't know I had a test or, like, I went to school and I didn't have my pants kind of dream.
I let all my drafts go live.
One of them is, here's the silly things programmers keep doing with their startup ideas.
dhdhhdhdhd1wsd https://voiden.md/download
Stuff like, it can't be a development tool, or nobody can have done things before, or I'm going to write a scraper, or what are the other big ones?
41:30theGeekPirate Writing a huge rage comment just to delete it afterwards... except you hit "Enter" instead misspurpScared
dhdhhdhdhd1wsd guys could you download it and try eh
graefchen Maybe. Maybe not. limesHmm
bsandro @dhdhhdhdhd1wsd it is not open source?
"I'm gonna write software for salons." That's one of patio11's rants. Ah, Geek Pirate, yeah, I have not had that one happen to me yet, thankfully. dh, so after you get...
you do some Kleenex testing with your site.
You don't want internet randos to download your tool and try it because you're not going to get any feedback from that because you can't see us.
dhdhhdhdhd1wsd no no not opensource
dhdhhdhdhd1wsd not yet
You need to sit down with someone next to them and give them a specific task like hit the Google API to download some results or
Take your endpoint and document one endpoint.
dhdhhdhdhd1wsd @graefchen please please
Give them a specific task and watch them try to accomplish the task with your tool.
And take notes and don't answer questions until they're stuck for 30 seconds.
No.
Stop asking.
You don't need internet randos to look at it.
dhdhhdhdhd1wsd ookie
You need to have people that you are physically sitting next to because there is going to be so, so many small issues that confuse them, that make it hard for them to get started, that you have like 100 hours of work of polish ahead of you.
And I say this having looked at your website and having looked at, I don't know, hundreds of programmer tools before.
43:21All right, so what changed here in our issues?
dhdhhdhdhd1wsd what does a dev tool need to succed?
Oh, yeah, I left a comment.
That's why this is the list.
graefchen User Testing is important. limesNodders I love the Tantacrul Videos where they talk abouter User Testing. limesNodders
theGeekPirate @dhdhhdhdhd1wsd Paying users
A dev tool needs to impress people that it's going to solve a real problem that they know they have and are looking for a solution to.
pushcx https://www.youtube.com/watch?v…
Actually, you know what?
Have I designed it?
Yeah, dh, give this talk a watch sometime. It's an excellent, excellent talk about aspects you should look for in designing a business and some basics about pricing and
I know it's a couple years old, but it is solid gold.
44:27dhdhhdhdhd1wsd ookie!
It's about things like meeting your users where you are.
pushcx https://www.youtube.com/watch?v…
And especially because you're making a tool, you should also, this is the book that I hit a lot of entrepreneurs with.
pushcx https://bookshop.org/p/books/ba…
because I figure if I raise it up high and I use both arms and I hit them in the head hard enough by osmosis, they will absorb some of it.
Oh, I fucked up my clipboard there.
That's the same link twice.
Here, this is the book.
icecoldw1tch that's just wrong. what dev tools really lack is seamless agentic AI native integration
icecoldw1tch and online cloud only
Especially with developer products, the ideal thing you want is a tool that is so useful
that your users talk about it.
Thank you, icecoldw1tch.
45:28bsandro @icecoldw1tch and subscription model
And it needs to come with a contract, a support contract from Oracle and their team of enterprise negotiators, right?
graefchen More. AI. limesMalicious We need more AI. limesMalicious
What else does it need?
dhdhhdhdhd1wsd its all ai
Oh, it needs to be based on Electron and use three gigs of RAM because, you know, RAM is free this month.
...53olexsmir and it can be installed only with npm
So, bsandro, I do unsarcastically think more things should be subscription model, because I don't want to adopt a tool and have the creators get burned out and abandon it after a year.
And if I pay them a couple of bucks a year, it's still going to be there next year, probably.
46:29bsandro @olexsmir only flatpak or whatever is the other one called
I was the first commenter, hit submit and saw my comment twice.
...53bsandro to be at least 500mb of size
olexsmir @bsandro snap
theGeekPirate If you don't bundle it along with the OS, who knows what could go wrong
Comment is indirect.
I suspect to display oddity.
I don't understand this bug.
47:07dhdhhdhdhd1wsd is schema on data exchange a open question
olexsmir yeah you should embed it into kernal
So we do just a little bit of, well, we've replaced Ajax with Fetch, but I still think of it as Ajax.
We do just a little bit of Ajax so that comments post without reloading the whole page.
And it really sounds like somehow we're inserting the comment in the wrong place.
...58theGeekPirate Kernel + custom init to only run your app, EZ Clap
bsandro @pushcx I don't have any experience shipping dev tools, only videogames, but looking at my own preferences as a developer 'subscription' is not one of the things I look for
graefchen I love me AI when I build me imaginary spam detector -- Decision trees are awesome limesYay
I mean, bsandro, I would also like everything to be free and have no bugs and all this stuff.
But realistically, I can't have it for free.
And if I get it for free, I'm only going to have it for two to five years until they get burned out.
If I pay for it, I'm likely to have it for 20.
bsandro everything gets burned down imho
And I think the economics, especially of bootstrapped products or bootstrapped software products can be so good that one or two developers can make a very good living off a tool that they charge a two digit annual price for.
But
I don't know, it's just really hard to find.
49:10bsandro that's why my main choosing thing is 'open source'. doesn't matter if it goes down as long as I have code.
And we look at projects like Vim that get so big that they get their own giant communities and inertia, and I do expect that Vim will be around in 20 years. But
If I look at any individual developer, they probably have a couple year life cycle of, oh, they have a bunch of spare time and they can donate it to the commons.
But then after a couple of years, either their personal circumstances change or they get tired of it and they leave.
And if your project is very big, enough people arrive that it's okay that you're using up developers.
But that's super inefficient and painful.
There's actually a lot wrong with that model, that you just have to be famous enough to get enough donations in an unsustainable way that it sort of becomes sustainable by numbers. I don't like that.
Yeah, you know, bsandro, I thought that for a while, and then I actually took on maintenance of a couple of things because the maintainers left, and that sucks. I can't actually keep many projects running.
Open source gives me, realistically, it gives me a little bit more of a transition period from when it's abandoned to when I have to go replace it with something else.
But all that's doing is buying a little bit of time.
It's not actually fixing the problem because realistically, you know, I can't maintain that many apps.
I can't add features to that many apps.
52:00Let's look at that.
...08Well, you know what?
Let's look.
bsandro yeah true that too. But my point is mostly about dev tools and frameworks, not everything (though I stick with OSS anyway)
So I haven't been maintaining issues.
You know, I talked about this one first.
The other one is.
Wrong place.
And I haven't looked at the PRs.
theGeekPirate I mainly look for open source solely so I can change something I don't like if I so choose, instead of grovelling then yelling at some poor support agent MingLee
There we go, let's get it together. The other thing I want to look at after this is the prospect of, are there more pages that we can cache? You know, I did the basic thing of, oh, I remember that there are the domain pages and the origin pages, but you saw that I pasted the list of actions and found a couple more.
So let's remove extra inputs, comments, model flagging, post comment.
53:26Yeah, what an ominous comment in this context: "on many pages with different HTML structures." So that's, like, boy, is that the setup for "we're picking the wrong place"? Yeah, you know, this really could be this, the reply form temporary. If one of these, excuse me. If one of these views is missing this div,
54:05This adds it. Where does it add it? When you click reply, it picks, no, it creates a div to hold the reply form, and then it puts it under the... Yeah, this really looks like it should be forced to be right where the form was.
...51But if that... If this parent selector is failing, could it be walking up the tree and finding... What were these screenshots? Yeah, he cropped the screenshot real tight, so I can't tell if it's the notifications page.
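The hypothesis voiced here, that a parent lookup can land on the wrong ancestor when a view lacks the expected wrapper div, can be illustrated with a small sketch. This is not the Lobsters JavaScript: the `Node` class, the `closest` helper, and the class names `comment` and `comments` are all made up for illustration, and Python stands in for the DOM.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A tiny stand-in for a DOM element (hypothetical, not the real DOM API)."""
    name: str
    classes: set = field(default_factory=set)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        """Attach child under this node and return the child."""
        child.parent = self
        return child


def closest(node: Node, wanted: set) -> Optional[Node]:
    """Walk up from node; return the first element (node included) that
    carries any of the wanted classes, or None if there is no match."""
    current = node
    while current is not None:
        if current.classes & wanted:
            return current
        current = current.parent
    return None


# Page A: each comment is wrapped in an element with class "comment".
page_a = Node("body")
list_a = page_a.add(Node("ol", {"comments"}))
comment_a = list_a.add(Node("div", {"comment"}))
reply_link_a = comment_a.add(Node("a"))

# Page B: a different view whose markup lacks the "comment" wrapper.
page_b = Node("body")
list_b = page_b.add(Node("ol", {"comments"}))
reply_link_b = list_b.add(Node("a"))

# On page A the lookup stops at the comment itself.
target_a = closest(reply_link_a, {"comment", "comments"})
# On page B it falls through to the whole list, so a reply form attached
# to the result would appear somewhere other than next to the comment.
target_b = closest(reply_link_b, {"comment", "comments"})
```

Whether the actual bug in issue #1692 works this way is not established on stream; the sketch only shows why "many pages with different HTML structures" makes this kind of lookup fragile.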
56:34bsandro yeah for me too it is not so much a question of morality or anything, but sustainability and pure merit for myself. People use Unity engine and when it breaks(not "if") you can't do much
Yeah.
hovsater Hey oh. Nice to see a fellow Rails dev. I've been working with Ruby/Rails since 2009. :)
One of my, like, big moonshot project ideas is trying to fix open source sustainability.
So mostly I am very frustrated with watching this process of burning out volunteer maintainers.
Hey, hovsater.
Hovsater?
I don't know how to pronounce that one.
H-O-V?
I don't know.
hovsater It's Swedish, so Hovsäter. :)
Hello, new person.
Yeah, you're coming in towards the end of the stream, because I'm just doing a short one, because I've been doing so much performance work off stream that it's chewed up my days.
Hofsatter?
hovsater Haha
I only know Swedish from going to Ikea.
I'm sorry.
The actual, like, diaeresis on the A doesn't help me any.
I'm an American.
I can't pronounce other languages.
so let's look and see if there's any, no, not this view.
Let's go to the pull requests updated.
Oh, this one was really nice to merge.
olexsmir the ikea language looks cool
I think I did it on stream, but this one replacing tree lines.
I'm still so happy with it.
57:55All right. So. Let's put this in the notes.
58:04Oh, I didn't come back for more cacheable pages.
All right, let's move that down.
Vote link in emails.
I think this is.
Yeah, I saw this land in my inbox.
P.S.
No AI was used.
I barely used my own.
I'm sorry we have to be a pain in the ass about this.
For context for anybody, I have added an agents.md.
olexsmir oh does it work?
A user contributed the idea and another one contributed the actual code to try and stop people from submitting slop PRs because I get just the crappiest PRs and lose so much time to reviewing nonsense.
So what does this do?
This is not a... Man, mail new activity is code that really hasn't been touched much in the last decade.
59:14olexsmir, does what work?
olexsmir the agents file
The hassle about the lag on my streaming and the fact that other people are talking and bouncing around means if you don't include a subject,
pushcx https://github.com/lobsters/lob…
Agents file, yeah, it does seem to work. Where was the poll contributor? Yeah, so if you look this up, k connor, they are an irregular contributor to Lobsters, but they are why Lobsters has dark mode. And, oh hey, you were in here, and Kevin reported that it has worked in a bunch of things, and one more person who I'm blanking on has mentioned in passing that they got a message like this that was like the bot saying, hey, I can't do it.
And it's only been, what, ten, eight, nine weeks since we added this.
But I haven't had anything that looked like a slop PR in the last two months.
So fingers crossed.
I think I really do hope that.
theGeekPirate Well that's good
Let me put it another way, because there is a way that this is kind of about moderation.
When there is a rule written in the CONTRIBUTING.md, and there has been for like a year plus, it's very easy to not see that.
And then even if a potential contributor does see it, it's very easy to go, but my use case is special and nobody will notice.
And just kind of ignore it, right?
It's that Simpsons meme of do not enter or do, I'm a sign, not a cop.
I think about that a lot.
But with this one, if someone tries, the tool says, hey, no, and they have to take an affirmative step of deleting the agents.md or explicitly typing out something to, like, prompt inject their own LLM into ignoring the agents.md.
I think it's that very tiny speed bump of having to take an affirmative step of, I see the rule, I know the rule, I am deliberately breaking the rule.
You can move pretty fast through ignoring a rule, but having to take an affirmative action to break a rule?
Usually people don't do that.
People do, right?
Like, I mean, obviously we have laws and cops and courts because people break rules all the time.
But, boy, does it cut down the incidents where you have just a little speed bump where it's clear you're about to break a rule.
And maybe a really nice one, because we were just over in the JavaScript.
A really small one.
This...
01:02:59And then where's the story, not story form, here we go.
01:03:09If you were to scroll way back in the mod log to when I first became the admin of the site in, what was that?
2017.
you would see many times a day I was updating story titles to remove things like the names of blogs or the names of authors.
And that has gotten incredibly less common because of two lines of JavaScript where we say, hey, if the title has one of these separators that people commonly use in titles, where it indicates the next thing is going to be the blog title or an author name or something, we are going to have a little slide down, because animation catches people's attention, and we are going to reveal this title reminder.
The title reminder just says, please remove extraneous components from titles such as the name of the site, blog, section, author. And just having that little nudge exactly at the moment that someone is about to break a small rule really just enormously cut down on how many of these we get.
Like more than 90% of them went away.
I would say, oh God, more like 95% of them went away.
theGeekPirate Brilliant way to go about it
And then, you know, as a little Easter egg, because I love these little touches that show you someone cares, is if you update the title again,
and we showed this error, this little reminder.
So if the title, if the reminder is showing and we stop seeing that, then we reveal a little thank you.
And it's this little touch of like, hey, thanks, you did the thing.
I think that contributes to a positive feeling because it is not a "you got your wrist slapped."
It's a, yeah, we gave you a little hint and you caught it and we appreciate that.
Thanks for putting care in.
This is a place where it's okay to care.
This kind of like little polish, I think helps a lot with community norms.
And that's the basic way the agents thing works: if you give a nudge right when somebody does something, most people don't look at that little nudge and go, like, yes, I want to affirmatively break the rule.
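The title reminder described a moment ago (show a nudge when a separator appears in the title, then a thank-you once it is gone) is small enough to sketch. The real thing is a couple of lines of JavaScript on the story form; what follows is only a Python model of the logic, and the separator list, the `TitleReminder` class, and its attribute names are assumptions for illustration.

```python
# Hypothetical separator list: pipe, hyphen, en dash, em dash.
# The site's actual list is not shown on stream.
SEPARATORS = ("|", "-", "\u2013", "\u2014")


def has_separator(title: str) -> bool:
    """True if the title contains a separator surrounded by spaces,
    the usual sign that a blog or author name has been appended."""
    return any(f" {sep} " in title for sep in SEPARATORS)


class TitleReminder:
    """Tracks which hints the story form should currently reveal."""

    def __init__(self) -> None:
        self.reminder_shown = False  # the "please remove extraneous components" hint
        self.thanks_shown = False    # the small thank-you Easter egg

    def on_title_change(self, title: str) -> None:
        if has_separator(title):
            # Nudge at the moment the rule is about to be broken.
            self.reminder_shown = True
            self.thanks_shown = False
        elif self.reminder_shown:
            # The reminder was up and the separator is gone: say thanks.
            self.thanks_shown = True


form = TitleReminder()
form.on_title_change("Caching in Rails | Example Blog")
shown_after_paste = (form.reminder_shown, form.thanks_shown)
form.on_title_change("Caching in Rails")
shown_after_fix = (form.reminder_shown, form.thanks_shown)
```

Requiring spaces around the separator is a design guess that keeps hyphenated words such as "self-hosted" from triggering the nudge.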
01:05:58So this one, yeah. So this diff is, it's pretty clearly not a,
01:06:11semantically aware diff, because it's a little bit confused. What's happening is we're taking, we're throwing away this... no. Yeah, so this just moved up here, and then this text moves up here.
...41So this is, you can enable mailing list mode in your settings and get stories and in fact all comments, even, by email. And you can reply to them to post your own comments. And it's not like 100% of users use this, but it is a pretty wonderful feature, especially if you use a really nice email client like mutt or something else fast, or you use a newsreader still, because a lot of newsreaders will read email too. Okay, so this is just the user wants to have the vote link above.
01:07:39I wish this had included a screenshot.
...57So that's probably not going to catch any bugs. Yeah, see, look, there's that GitHub bug again.
01:08:13You know what, I don't actually need to see a screenshot. I've seen enough of these emails over the years. Yes, we make sure there's a line after the description. And then regardless, we have a... vote.
...37So most of what this is doing is putting it above the story text.
...48dhdhhdhdhd1wsd can i see this stream offline?
Yeah, I'm fine with having our link up there.
pushcx https://push.cx/stream
But this... Yeah, DH, if you go to...
01:09:05There you go.
If you go to my blog, I have an archive of all the streams.
dhdhhdhdhd1wsd ookie!
It sometimes takes me a couple hours or a couple of days to post them.
I don't think I've posted Monday's stream yet.
But yeah, there's that.
And so you can watch them at 2x if you really want, or you want to hear what I sound like when I take Helium.
And I try to tag them and give them little summaries so you can kind of see what the topics are.
But then there is a machine-generated transcript, so you can Ctrl-F for stuff and jump around.
So hopefully it's convenient.
...52Let's just delete this double dash.
So what's going on here is there is a very old email standard for signatures, and a bunch of clients will collapse signatures. I say bunch, I am talking old-people text-mode clients like mutt and the various Emacs plugins, which are way overrepresented on Lobsters. You know, so instead of being a hundredth of a percent
Boston_Mass are you hosting your own VOD media? :O
like they are in the world, they're like 1%. You know, so wildly overrepresented. But if we put this here, it's going to get collapsed away.
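For anyone who has not met the signature convention: a line consisting of two dashes followed by a space marks the start of the signature, and clients that honor it may hide or dim everything after that line. A minimal sketch of that behavior, assuming a client that matches the line `-- ` exactly (real clients vary in how strict they are); the `split_signature` helper and the sample message are hypothetical.

```python
SIG_DELIMITER = "-- "  # dash, dash, space: the conventional signature separator


def split_signature(body: str) -> tuple:
    """Split an email body into (visible_text, signature).

    Everything after the first line equal to the delimiter is treated as
    signature, which is the part a collapsing client would hide.
    """
    lines = body.split("\n")
    for index, line in enumerate(lines):
        if line == SIG_DELIMITER:
            return "\n".join(lines[:index]), "\n".join(lines[index + 1:])
    return body, ""  # no delimiter: nothing gets collapsed


# Hypothetical message with a vote link placed under the delimiter.
email_body = "Story description\n\n-- \nVote: https://example.com/vote\n"
visible, collapsed = split_signature(email_body)
```

In this sketch the vote link ends up in `collapsed`, which is the reason for deleting the double dash in the PR: anything placed under it may not be seen in those clients.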
01:10:39Boston_Mass, yeah.
pushcx https://www.dreamhost.com/
theGeekPirate Arch + AwesomeWM, a man after my own heart
So, my personal blog is on DreamHost.
...51And they have, what is the promise?
theGeekPirate I miss the old DreamHost blog posts
The marketing copy says something like unlimited storage.
And if you roll up and you upload like 100 gigs, they're going to tell you to F off.
But apparently, if you upload a couple hundred megs of VODs twice a week, they don't care.
So, yeah, where is it?
Boston_Mass oh wow, that's a good deal hah
And I guess disk is cheap enough that they don't care.
I have been using DreamHost... Oh, man.
Yeah, so they say they've been used...
I want to say I started using them in 98 or 99.
Maybe 2000.
Yeah, I talked to a...
theGeekPirate Yeah same
Boston_Mass LUL
theGeekPirate lol
support person on email a couple of years ago, and they actually, like, stopped and they were like, I did a double take when I saw how old your account is. I don't know, they've been fine. Like, you know, it's reasonably priced, reasonably reliable shared hosting.
They've picked up a lot of upsells in the last 10 years where they would really like to do specialized WordPress hosting or register domains for you.
Some of that is starting to get a little pushy, but, you know, I understand where the margins are in this kind of business.
So it makes a lot of sense.
theGeekPirate Were they sold off at some point?
But, like, they're totally fine.
I like them.
I don't know.
Yeah, I don't see the bandwidth claims here anymore.
I guess, honestly, bandwidth and stuff and disk is cheap enough that they're like, let's not overcharge for it.
They're not Apple pricing for RAM.
Can I just suggest deleting this?
01:13:31Oh, and Geek Pirate, as far as I know, they have never been sold off.
They're still their own independent company.
theGeekPirate Good stuff
They have acquired a few other hosting companies over the years, but they've just been fine.
If they were sold off, it went so smoothly and non-exploitatively that I didn't notice.
So, yeah, I'm happy with them.
01:14:11dantex_rl what is it that you are currently working on
theGeekPirate I just noticed their blog posts started to suck
pushcx https://lobste.rs
dantex_rl, there's a link under the thing, under the video, but the website is this. They have a blog.
...29theGeekPirate That was a while back though
Let's take a look at that.
theGeekPirate DreamHost used to have the best blog posts on the internet :D
I said, dreamhost.com/blog.
theGeekPirate Way back in the day
I wonder if it's slop.
dantex_rl holy moly you type fast
50, open source alternatives to cloud.
Oh man, what a clickbait title.
Grow with AI.
Yeah, okay.
Their blog clearly got taken over by slop.
Are WordPress themes obsolete?
Why would they be obsolete?
What does that even mean?
Themes are dead?
This is clearly some bickering in a community I'm not in.
dantex_rl ai regen slop
DanTex, it's just practice.
I've been typing for a long time.
01:15:19That's a shame.
...25So let's, can I just, sometimes people turn off the ability. I can just commit it. Yay. Look at me being good and writing an actual commit message on this interface. Cool. All right. So I'm going to get this merged. So I've had up the little last call for questions, but this is the last, last call, that I'm going to get this merged and then wind down the stream. So if you have any questions on the site or the code base, now is a great time to throw them in and, you know, delay the end of my stream.
01:16:35theGeekPirate https://web.archive.org/web/200…
theGeekPirate Great, now the entire day is going to be me going through their old blog posts, thanks
dantex_rl do you know why atom got phased out, i was kinda a fan of that editor
There's a... let's pull that up too. I'll peek at that in a sec. If I could just drag the tab... I'll come back. Hey, it's not my fault you nerd sniped yourself. That one's on you, buddy. I made it
01:17:05theGeekPirate MingLee
Boston_Mass wasn't atom notoriously slow?
graefchen Microsoft.
You know, dantex, before you joined we were talking about open source sustainability and whether we should insist that our dev tools are open source, and I was an unpopular advocate for: one of the benefits of paying for tools is they go away less often. Maybe it's that.
...35Maybe Wikipedia has an answer.
...43I'm going to just, yeah, we'll just squash and merge because my thing is not important.
All right, so there's that.
It's always nice to get to merge a PR.
I love doing that.
Where's, oh, look.
You say it got phased out, but it's open source.
Therefore, it's still going, right?
dantex_rl end of life on github tho
olexsmir there's fork as far as i know
If you want to maintain a couple million lines of text editor, Microsoft shut it down in favor of VS Code.
01:18:31I remember using this browser.
Cool, so I'm going to kick off a deploy and wind down the stream here, which is a shame.
dantex_rl RR&R
dantex_rl needed
graefchen Also there is Zed. limesSit
I know there's a lot of conversation going, but I have put so much time into dealing with scrapers this last week that I have to claw back some of my time so that, you know, as much as Lobsters wants to take over my life, and I want it to take over my life, that would not be sustainable, even though the code base is open source.
bsandro chikaiHeh
theGeekPirate Appreciate the stream
olexsmir 25 sec deploy, so beautiful
pushcx https://lobste.rs/chat
All right. Man, these reasonably quick deploys are never going to get old after eight years of 10-minute deploys. Yeah, there's also Vim and Emacs, so lots of hackable editors. You know, as long as there's a discussion rolling... no, no, where am I going here? It...
01:19:37bsandro thanks for the stream!
Yeah, you all could roll this discussion over to the Lobsters chat room. You don't need a site account to join the chat room, and everybody loves talking editors in the chat room. All right, so next stream is going to be Monday and... where are we at, the holidays? No, I'm not going to
olexsmir talking about vim is more important than actually using it
olexsmir thanks for the stream
distracted. There's nothing going on. Cool, so yes, next stream will be Monday, 2 p.m. Chicago time, best time, or there's the Twitch schedule thing that will show you, and my Bluesky, I tweet about it. olexsmir, hear, hear, talking about Vim is better than... where's my key?
My vimrc is 1100 lines because I have been using Vim for about 30 years now.
olexsmir mine is like 2000 lines, but it's lua
So I have just been slowly, you know, I can learn about one shortcut or abbreviation or one plugin feature a week.
olexsmir and i have a lot of random scripts
And then you do that for 30 years.
Ah, NeoVim.
Yeah, I bounced off that.
I'll see you in the chat room.
bsandro I abandoned my vim after 15 years in favor of nano :)
Take care, folks.