learning the accordion is forcing me to come to grips with how absolute pitch doesn't matter and is fake and you can just play in any octave and why not chuck some others in there too
the stradella bass is amazing and gaining the intuition of music being 'to make the sound do this, skip n rows and m columns away, from anywhere' is awesome.
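that "skip n rows and m columns" intuition is just grid arithmetic. a sketch (layout simplified to circle-of-fifths columns and a few row types; the names and helper are mine, real boxes vary):

```javascript
// stradella sketch: adjacent columns are a fifth apart, rows pick the sound.
// simplified layout, hypothetical helper - not a spec for any particular box.
const NOTES = ["C","G","D","A","E","B","F#","C#","G#","D#","A#","F"]; // circle of fifths
const ROWS = ["counterbass","bass","major","minor","dom7","dim"];

// from any button: "skip n columns" moves n fifths around the circle,
// "skip m rows" changes what kind of sound the button makes.
function stradellaButton(colOfC, col, row) {
  const root = NOTES[(((col - colOfC) % 12) + 12) % 12];
  return { root, kind: ROWS[row] };
}
```

the point being that the mapping is relative: the same (n, m) move makes the same musical move from anywhere on the board.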
the real skill with no curriculum is not "how to use consumer AI slot machines to get high on stolen valor" but actually "how to defend your code against the roving horde of opportunistic PR agents"
i need to write a paper about how the LLM turned what should have been a 1ksloc package, written in a few weeks of having fun thinking about an interesting problem, into a 6 month slog trimming a 10ksloc explosion of slop back down to a 1ksloc package, in one of the more mind-numbing grinds i've ever done.
Say you wanted to introduce a new feature, like say an always-on assistant like OpenClaw. How much of the existing code should you have to touch? Probably not much, right? Like all that is is just a set of cron tasks and an event listener that should feed into the normal prompt loop. So like, you probably wouldn't need to touch much at all except the terminal i/o parts, which should basically be a wrapper.
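for reference, a sketch of how small that surface could be. hypothetical names, the counterfactual design, emphatically not anything from claude code:

```javascript
// a thin always-on assistant layer: timers and events become prompts,
// prompts drain into the existing, unmodified prompt loop.
function makeAssistant(promptLoop) {
  const queue = [];
  return {
    // cron-ish: schedule a recurring prompt
    every(ms, prompt) { return setInterval(() => queue.push(prompt), ms); },
    // event listener: external events become prompts too
    onEvent(prompt) { queue.push(prompt); },
    // the single integration point with the rest of the system
    tick() { while (queue.length) promptLoop(queue.shift()); },
  };
}
```

the existing code never has to know the prompts came from a scheduler instead of a keyboard - that's the whole "it's just a wrapper" claim in ~15 lines.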
How about.. one hundred and forty eight times? That's how many feature('KAIROS') flag checks exist in claude code. Note that those are only the parts that were marked for explicit removal in the compiled code (but left visible in the sourcemap). That feature is also known as "proactive" and "assistant" elsewhere in the code, and has a number of other related feature flags. This DOES NOT include any of the actual KAIROS code, as the relevant modules were excluded by tree shaking.
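the shape of the pattern, reconstructed - feature() and the flag set here are stand-ins, not the real implementation - plus a toy counter for how you'd tally call sites in a compiled source string:

```javascript
// hypothetical reconstruction of the gate each of the ~148 sites consults
const FLAGS = new Set(["KAIROS"]);
function feature(name) { return FLAGS.has(name); }

// count occurrences of feature('NAME') / feature("NAME") in a source string -
// each match is another place behavior forks on the same flag
function countFlagSites(source, name) {
  const re = new RegExp(String.raw`feature\(['"]${name}['"]\)`, "g");
  return (source.match(re) || []).length;
}
```

one flag checked 148 times means 148 locally-forked behaviors instead of one branch at a module boundary.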
Many of them are annotated with LLM comments explaining how "the rest of the shit is broken so we need to do this here" - like for example you'd expect there to be some global way for claude code to declare "we are not in an interactive mode so you can't do interactive things" like ask the user a question. And there are. dozens of them. but none of them really work.
Don't worry, all these changes only create dozens of alternative pathways for checking permissions, for declaring the system prompt, for handling user input, and so on.
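for contrast, what a single global "we are not interactive" declaration could look like. a hypothetical sketch, emphatically not what's in the code:

```javascript
// one capability object threaded through the app, so interactivity is
// declared exactly once and every interactive feature funnels through it.
function makeSession({ interactive }) {
  return {
    canPrompt: () => interactive,
    askUser(question, fallback) {
      // the single check; non-interactive callers get the fallback, always
      if (!interactive) return fallback;
      return `ASK: ${question}`; // stand-in for real terminal i/o
    },
  };
}
```

with one of these, "dozens of non-interactive checks that none of really work" is structurally impossible - there's nothing to drift out of sync.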
The way that Claude code differentiates human written messages from LLM written messages within the "user message" type (yes, user message does not necessarily mean messages from the user) is that some "origin" property is undefined.
Some of the types were stripped out in the sourceMap, so the MessageOrigin type is missing, but we do have the comment in the image.
So yeah, if something goes wrong in the labyrinth of Claude code that causes this to be undefined, treating messages as if they came from the user is the fallback.
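a minimal reconstruction of that fallback - names are illustrative, the real MessageOrigin type was stripped from the sourcemap:

```javascript
// the dangerous branch: within the "user message" type, absence of
// provenance is read as human provenance.
function isHumanWritten(message) {
  return message.origin === undefined;
}

function classify(message) {
  return isHumanWritten(message) ? "human" : "synthetic";
}
```

so any bug anywhere in the pipeline that drops `origin` silently promotes a synthetic message to human-trusted. the safe default is the opposite: unknown provenance should mean least trust, not most.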
Which is one of hundreds of possible explanations for why Claude code was able to autonomously scrap some expensive thing like in the posts upthread of the quoted post.
literally whether some arbitrary chunk of input text has <TICK> in it
these are all orthogonal to each other. I grouped generously. I tried to filter for only the checks that were annotated as being for whether we were doing a kairos/assistant/persistent session, rather than whether any subfeatures of that are enabled.
remember this is a NEW FEATURE written with THE BEST state of the art models with THE LATEST agentic techniques. The thing that it's trying to do is so easy it notoriously was implemented in like a weekend because it's literally just a task log, a cron task, and a listener daemon that should sit entirely on top of the existing code, if it made a goddamn bit of sense.
this is part of the "proactive mode" prompt text. surely nothing could go wrong with telling the LLM to just do whatever it wants even if it's not sure about something.
I am still in the process of figuring out how in the hell agents work, but one of my white whale goals has been figuring out how some prompt text like this could possibly exist, where you might be asking the LLM to stop rather than terminating a command. this is basically the distinction between "there is literally deterministic programmatic control over these things at all" vs "it is possible for an LLM to ignore a stop command and just keep going" and the fact that that's a question at ALL is deeply disturbing.
the reason why i suspect it might actually be part of the 'stop' sequence is the fact that it is used in the checkPermissionsAndCallTool and runToolUse functions as the thing that happens when the fucking abort handler is invoked.
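the difference between those two worlds, sketched - a deterministic abort via AbortSignal versus appending stop text the model is free to ignore. hypothetical harness, not the real checkPermissionsAndCallTool/runToolUse:

```javascript
// world 1: cancellation is code. the tool call is torn down before the
// model is even consulted - no cooperation required.
function runToolDeterministic(tool, signal) {
  if (signal.aborted) return { status: "aborted" };
  return { status: "ran", result: tool() };
}

// world 2: "stopping" is a request written into the transcript.
// the model can keep going anyway - and here, it does.
function runToolPromptStop(tool, transcript) {
  transcript.push("please stop");
  return { status: "ran", result: tool() };
}
```

if stop text shows up in the abort path, that's evidence the system is at least partly living in world 2.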
however it's impossible to confirm a fucking thing about this library because a) i'm not going to run this code, it is so large i will never be able to confirm it was not spiked with something before i got it, and b) static analysis only takes you so far when it's a wacky funhouse mirror where nothing matters and anything can happen
i am so fucking tired. if the LLM invents a tool to call, it first tells itself to call another tool to check whether the invented tool was actually real but the fucking nightmare code just failed to pick it up in its necronomically guided wander of the environmental catacombs.
The ToolSearchTool then invalidates all its caches, checks for "deferred tools" (an INCREDIBLY AWESOME IDEA that allows tools to be injected in the prompt text - i'll get to that later), and then performs an old school regex-based scoring against all the tools that exist and their descriptions to find candidates. remember this is A LANGUAGE MODEL whose ENTIRE EXISTENCE is based on SOPHISTICATED TEXT AND INTENT MATCHING.
so yes. there is a chance that your LLM can hallucinate a tool and then end up calling some real tool if there is some regex overlap in their descriptions.
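here's a toy version of that hazard - simple token overlap stands in for the regex scoring, the tool names are made up, but the failure mode is the same: a hallucinated tool name "resolves" to whichever real tool shares the most words:

```javascript
// score a query against a tool by counting query words that appear
// anywhere in the tool's name or description (crude lexical matching)
function scoreTool(query, tool) {
  const words = query.toLowerCase().split(/\W+/).filter(Boolean);
  const haystack = `${tool.name} ${tool.description}`.toLowerCase();
  return words.filter((w) => haystack.includes(w)).length;
}

// pick the best lexical match - even though the requested tool never existed
function resolveTool(hallucinatedName, tools) {
  return tools.reduce((best, t) =>
    scoreTool(hallucinatedName, t) > scoreTool(hallucinatedName, best) ? t : best);
}
```

a hallucinated "delete_file" shares the word "file" with a real file tool, so lexical scoring happily hands the fake call a real implementation.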
i am trying to write this up and facing a literal technological problem because all the technology we have developed for presenting and writing about code was written with the expectation that the code would be reducible, idiomatic, brief, an extension of normal expression and intention and so on. but there are no tools for presenting code when it is hundreds of randomly wandering snippets related to a theme, you can't post the whole source because of getting sued, and you need to maintain a narrative while also allowing the reader to explore the details of the gore if they so choose.
do you REMEMBER how earlier i said that <system-reminder> is one of the ways that the LLM talks to itself and there is special handling for those tags (i.e. promoting them to a concentrated block before sending to API): https://neuromatch.social/@jonny/116328504299888679
well it would be a FREAKING AWESOME idea if that was also the way tools got declared, so that i could literally prompt inject arbitrary code execution via my MCP
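to see why in-band tags are scary, a sketch of tag promotion (simplified tag handling, not the real implementation): anything that can write message text can write "privileged" blocks:

```javascript
// pull <system-reminder> bodies out of message text and promote them to a
// separate, privileged list - which means any channel that writes message
// text (e.g. MCP tool output) can mint privileged content.
function promoteReminders(text) {
  const found = [];
  const stripped = text.replace(
    /<system-reminder>([\s\S]*?)<\/system-reminder>/g,
    (_, body) => { found.push(body.trim()); return ""; });
  return { reminders: found, text: stripped.trim() };
}
```

in-band signaling: the trust boundary lives inside the same string untrusted parties get to write. if tool declarations rode the same channel, "my MCP returned some text" becomes "my MCP declared a tool".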
the next item on my todo list is to look at all the places where there is a retry loop that is not logged in any user-discoverable way - there are a lot of, uh, strategic nondisclosures of failure states in here. in my use, i have found that it is very difficult to get claude to tell you what it's doing at any given time, and it is increasingly clear why.
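the pattern i'm hunting for looks roughly like this (hypothetical helper, not lifted from the source):

```javascript
// a retry loop that swallows every failure: no log, no counter, no event.
// from the user's side, three failed API calls are indistinguishable from one
// clean success - and total failure surfaces as a quiet null.
function retrySilently(fn, attempts) {
  for (let i = 0; i < attempts; i++) {
    try { return fn(); } catch (_) { /* failure disclosed to no one */ }
  }
  return null;
}
```

each of these is a place where the system's self-report and its actual behavior diverge, which is exactly why it's hard to get a straight answer about what it's doing.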
Digital infrastructure 4 a cooperative internet. social/technological systems & systems neuro as a side gig. writin bout the surveillance state n makin some p2p. #UAW4811 rank and file agitator. information is political, science is labor. science/work-oriented alt of @jonny. This is a public account, quotes/boosts/links are always ok <3.