Discussion about this post

Fukitol

My experience has been similar. I am *very* negative about LLM code quality, its use in production code, and real productivity gains when you account for time spent babysitting, correcting, rewinding, testing, and adequately explaining bugs that the LLM can't "see" (especially ephemeral UX things like "it's clunky"). The insane hype around it, mostly from people who have no idea what they're talking about (and often admit as much), makes me sick.

But that said, for throwaway scripts, boring implementations of nice-to-have features in low-stakes side projects, and the like, it's ... fine.

The code it comes up with often makes me a sad panda, even after several rounds of iteration. Its idea of when it's time to commit is deranged, no doubt because bad commits overwhelmingly outnumber good ones in its training corpus.

But many of the things I've had it do for me would just never have got done before. Too tedious or too low value for the effort. Playing with a shiny new toy added enough entertainment value to get them done, though in many cases I suspect it cost me time rather than saving any.

One difference is that I don't trust it with anything I don't already know how to do. Otherwise, how will I know it's been done correctly?

Another is that I mainly use DeepSeek (via Aider and Vim integration). Claude is a little better, but not 100x-the-value better. When Claude goes down a rabbit hole and wastes tokens, it costs me dollars. A week of pitting DeepSeek against my least-likely-to-ever-get-done feature ideas in a couple of pet projects cost me a grand total of 30c and got me a pile of working, if ugly, code.

Jay

I've been using Codex a bit lately on a personal C++ project (as that's all I have these days). Your experience sounds much like mine so far. Between the novelty and the reduced friction of starting, I'm getting a bit more done than I might otherwise. I'm also willing to start in on tasks I might otherwise delay or ignore, because I can get some sort of start for "free" from the agent. I feel like critiquing and revising a bad implementation is a bit easier than opening an empty editor and starting from there.

I can't imagine how much of a mess you'd end up with if you just let these things run, though, without the critiquing or revising.
