Gralkor — April 8, 2026

I needed a better memory plugin for OpenClaw, so I made one – Gralkor (https://lnkd.in/gQyn2HTA)

I don’t mean better than the default, I mean better than the top OpenClaw memory plugins.

I started with the best open source, temporally-aware memory available – Graphiti (https://lnkd.in/gpRn5SXC). I’ve worked with many graph and vector memory systems and Graphiti still amazes me. Graphiti’s strengths are perfect for a long-running personal agent – I really appreciate Zep sharing it with us.

On top of Graphiti, I’ve put a lot of myself and the latest research into Gralkor.

I was quite surprised at how other memory plugins work. Typically they just capture individual question and answer pairs – not much to extract context from! What about ideas that come together slowly over the course of a whole conversation?

Instead, I learned heaps about OpenClaw’s hooks and figured out how to ingest whole episodes that make sense as tasks and conversations. More context, richer extraction, deeper understanding.
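To make the episode idea concrete, here is a minimal sketch of that kind of grouping. Everything here is hypothetical (the names, the message shape, the 30-minute boundary heuristic) – it's the pattern, not Gralkor's actual implementation:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Message:
    role: str       # "user", "assistant", ...
    text: str
    at: datetime

def group_into_episodes(messages, gap=timedelta(minutes=30)):
    """Group a flat message stream into whole-conversation episodes.

    Instead of ingesting each question/answer pair on its own, we only
    cut a new episode when there is a long silence, so ideas that come
    together slowly stay inside one extraction context.
    """
    episodes, current = [], []
    for msg in messages:
        if current and msg.at - current[-1].at > gap:
            episodes.append(current)
            current = []
        current.append(msg)
    if current:
        episodes.append(current)
    return episodes
```

Each resulting episode can then be handed to the extractor as one unit, giving it the full conversational arc to work from.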

Did you know that most memory plugins for OpenClaw only remember dialog? When your agent tells you it did this or that last week, it doesn’t remember doing it – it remembers saying it did. Ask how, and it will extrapolate confidently – and the error compounds in memory. Your agents mostly don’t remember what they thought either, including how they solved their last problem – I sure couldn’t work under those conditions!

Instead, I built a distillation process to ingest thoughts and actions in context with dialog, tuning for the highest fidelity possible without crowding the graph with tool call parameters.
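The distillation step might look something like this. The event types and field names are made up for illustration – they are not OpenClaw's real transcript format or Gralkor's actual code:

```python
def distill(events):
    """Distill an agent transcript for memory ingestion.

    Keeps dialog, thoughts, and a one-line record of each action, but
    drops bulky tool-call parameters so they don't crowd the graph.
    Event shapes here are hypothetical.
    """
    lines = []
    for ev in events:
        kind = ev["type"]
        if kind in ("user", "assistant"):
            lines.append(f'{kind}: {ev["text"]}')
        elif kind == "thinking":
            lines.append(f'thought: {ev["text"]}')
        elif kind == "tool_call":
            # Record *that* the tool was used, not its full arguments.
            lines.append(f'action: called {ev["tool"]}')
        elif kind == "tool_result":
            lines.append(f'result: {ev["summary"]}')
    return "\n".join(lines)
```

The trade-off is fidelity versus graph noise: thoughts and actions survive in context with the dialog, while kilobytes of tool parameters do not.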

Gralkor provides a simple platform to experiment with memory consolidation and learning. You’ve got cron; just add Thinker CLI and Gralkor to start your quest for recursive self-improvement. We can learn together – ask me for my reflection cron! This is showing up in research a lot now as ERL.

Finally, custom ontologies! You can define your own entities and relationships, using a configuration scheme designed for accurate classification.

You could focus on standard domain language, or structure your agent’s memory around your model of the world. This is another idea starting to come up in research.
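To illustrate the shape of a custom ontology, here is a hypothetical config and a tiny validity check – the schema and names are mine, not Gralkor's actual configuration format:

```python
# A made-up ontology: you declare your own entity and relationship
# types, and extraction is constrained to them so classification
# stays accurate.
ONTOLOGY = {
    "entities": {
        "Person": {"description": "A human the agent interacts with"},
        "Project": {"description": "A piece of work with a goal"},
    },
    "relationships": {
        "WORKS_ON": {"source": "Person", "target": "Project"},
    },
}

def validate_edge(ontology, rel, source_type, target_type):
    """Reject extracted edges that don't fit the declared ontology."""
    spec = ontology["relationships"].get(rel)
    return (
        spec is not None
        and spec["source"] == source_type
        and spec["target"] == target_type
    )
```

Constraining extraction to a declared vocabulary like this is what keeps classification accurate instead of letting the model invent a new edge type per conversation.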

So, enjoy Gralkor (https://lnkd.in/g79xCK2V). Star it, let me know what you think, tell your friends – all those nice things. Great trees need strong roots.

Trunk Sync and Seance —

Trunk Sync has a new “seance” feature.

Are you worried about inheriting AI-generated code you don’t understand? No problem, you can always talk to the guy who just wrote it.

Resurrect the long-dead coding agent responsible at exactly the moment in code and context when they changed that line. Learn how the code works and why it works that way straight from the programmer, rather than through post-hoc analysis (guessing).

Seance is a feature of Trunk Sync, which I use for extreme continuous integration with my coding agents. It was the challenge of not being able to personally defend main – normally my last line of defence – that drove me to create Seance.

Typical example at https://lnkd.in/gqMEeBE4 – wanting to know why a Docker image was changed.

In your project folder:

ppm i @susu-eng/trunk-sync
trunk-sync install

Please remember, I am just sharing my own experiments. I only hope it’s interesting for you.

Trunk Sync: Maximum continuous integration for coding agents. Agents work in parallel on local worktrees, across remote machines – any mix, all with agentic conflict resolution. No resolving conflicts by hand, or discovering that an agent never pushed its work.

Seance: Talk to dead coding agents. Point at any line of code and rewind the codebase and session back to the exact moment it was written. Ask the agent what it was thinking. Understand generated code on demand and stop worrying about keeping up with every change your agents make.
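The first piece of plumbing a rewind like this needs is finding the commit that last changed a given line. Here is a sketch of that step using `git blame` porcelain output – this is generic git mechanics, not Seance's actual code, and the session lookup that follows it is not shown:

```python
import subprocess

def commit_for_line(path, line, repo="."):
    """Return the hash of the commit that last changed `line` of `path`.

    Once you have the commit, you can check out that exact moment and
    look up the agent session that produced it (the session lookup is
    the part left out here).
    """
    out = subprocess.run(
        ["git", "blame", "-L", f"{line},{line}", "--porcelain", path],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    # The first token of --porcelain output is the commit hash.
    return out.split()[0]
```

With the hash in hand, `git checkout <hash>` rewinds the codebase, and the matching transcript rewinds the conversation.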

Academic: When you can run multiple Claude Code agents on the same codebase from anywhere and without breaking each other’s work, your comprehension becomes the bottleneck. People are framing this as “cognitive debt”, and here we are exploring the far right of this debate – extreme post-hoc understanding. Don’t worry about cognitive debt at all – just build as fast as you can and make it easier to catch up selectively. I’m not endorsing – just experimenting and learning like you.

Caveats: There’s a flag for pushing Claude transcripts in case the session doing the work was on another machine or needs to be accessed after Claude cleans it up. A better version (please feel free to PR) would push transcripts to a server so they can be accessed securely outside of Git.

There’s another command for summoning the developer who instructed the agent to write the code, but that one is occult – best kept as an easter egg 😂

Thinker CLI —

I’m sharing Thinker CLI.

You’ve seen me talk about how valuable CLIs are in agent-land already:
– Self-documenting
– Model domain objects and lifecycles
– Model workflows
– Provide fast feedback
– Teach agents incrementally (rather than requiring full usage baked into a skill)
– Run by any shell-using agent
Give an agent a good CLI and it can do _the thing_ even if it doesn’t know how, because _how_ is baked into the CLI.

Thinker CLI brings all these benefits _and it’s super simple_.

Thinker lets anybody define (and share!) a guided, multi-step thought process for your agent in a JSON config file. Agents follow user directions (or automation) to invoke Thinker with the config file, and Thinker walks them through the multi-turn process call by call, using structured inputs, structured outputs, interpolation into templates, and strict validation. This way work is presented to the agent clearly and incrementally, and validated at each step. The agent can “think through” complicated work, programmed in advance.
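The core loop of that pattern – templated steps, state accumulated from structured outputs, validation before advancing – can be sketched in a few lines. The config schema and field names below are invented for illustration; Thinker CLI's real format may differ:

```python
import string

# Hypothetical two-step config, in the spirit described above.
CONFIG = {
    "steps": [
        {"prompt": "List the key facts about: ${topic}", "output": "facts"},
        {"prompt": "Summarise these facts: ${facts}", "output": "summary"},
    ]
}

def next_prompt(config, step_index, state):
    """Interpolate collected state into the current step's template."""
    template = string.Template(config["steps"][step_index]["prompt"])
    return template.substitute(state)

def record_output(config, step_index, state, value):
    """Validate and store a step's output before advancing."""
    if not value.strip():
        raise ValueError("step output must be non-empty")
    state[config["steps"][step_index]["output"]] = value
    return state
```

Each call hands the agent exactly one step, and each step's output becomes input to the next – the "how" lives in the config, not in the agent's head.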

I’ve been using this approach – human-guided CoT sequences with structured inputs and outputs – to great effect in my projects for years now. With good design, it _way_ outperforms the generalised reasoning processes built into current models. I’m really happy I can share it in such a simple way.

Used in an agent, you can define steps for searching memory, saving back into memory, researching online, or producing complex artefacts: Thinker CLI lets you compose any of your agent’s functionality into linear sequences using natural language.

Links:
– If you want to read more: https://lnkd.in/g3khXusD
– If you want to tell your agent to install: https://lnkd.in/g-SzcWiU
– Example of a coding agent running it: https://lnkd.in/gyDxBNGv (I normally use Thinker with OpenClaw, but this was easier to get logs of. You see how any agent can use it)

Defensive programming and coding agents —

Codex and Claude are way too defensive. I think this is a good time to talk about defensive programming.

Say I believe some scenario is impossible, and that if it somehow happens there will be an error – a console error, a failed request, something noisy – but life will go on.

This is actually good. I am probably not wrong, so there is no reason to complicate my code. If I am wrong, great! Through failing fast (and good observability) I will discover my wrongness and we will all be better off for it. The effects of being wrong in software are cumulative and sometimes fatal, so we want to uncover wrongness early.

Building a good understanding of how data actually flows through your system is important. You should not just guess. You also should not defend against everything by default. You should actually check, be confident that you know, and be eager to discover that you are wrong.

The worst thing is taking a wild guess at how some unexpected edge case should be handled, when you really have no idea why it would have happened, or what the downstream implications of your handling will be. It is routine for coding agents to mishandle errors, downgrade them to warnings, or just completely swallow them – errors that reveal critical misunderstandings and corresponding design problems in your (their?) software.
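A toy example of the two styles, with names invented for illustration – a price lookup where I believe an unknown SKU is impossible:

```python
def unit_price_defensive(catalog, sku):
    # Agent-style defensiveness: the "impossible" case is silently
    # papered over, and a data-flow bug ships as free items.
    return catalog.get(sku, 0.0)

def unit_price_fail_fast(catalog, sku):
    # Fail fast: if the scenario I believe impossible ever happens,
    # I find out immediately, with a loud and specific error.
    if sku not in catalog:
        raise KeyError(f"unknown SKU {sku!r}: catalog and orders disagree")
    return catalog[sku]
```

The defensive version never crashes, which is exactly the problem: the wrongness it hides keeps compounding downstream, while the fail-fast version surfaces it at the first occurrence.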

Coding agents love defensive programming. There could be many reasons for this, but two come to mind:
– They just don’t want it to crash, like the early JS mentality.
– They don’t want to “miss an edge case”, perhaps reflective of a lot of training data produced by people who didn’t want to “miss an edge case”.

When you vibe code (not agentically engineer, or whatever we’re calling it) and everything looks amazing, how much of the implementation is just failing quietly because of defensive programming? Perhaps it helps explain the early-euphoria-hard-crash we saw many vibe coders go through.

Instruct your coding agents to fail fast and loud.