First live observations: Insight Cards & Entity Extraction (v10.55) #897
Replies: 4 comments
Addendum: Milvus backend credit. Worth mentioning here: @henry201605 is also actively testing the new v10.55 features on Milvus (as evidenced by PR #898). It would be great to hear your observations on how entity extraction and insight cards behave on a Milvus corpus! 🐋
Status update on the three open questions (v10.57.3):
1. InsightGenerator tag exclusion list: not yet implemented. The gap detector still fires on status/metadata tags.
2. CI observation retention policy: the recommendation in the post stands.
3. Insight card acknowledgement: no "acknowledged / won't fix" flag yet. For now, the workaround is deleting the unwanted card; it will regenerate on the next cycle unless the underlying tag pattern changes.
Curious to hear from others running on different corpus sizes, particularly the entity extraction yield (links per 500 scanned) and the insight card survival rate (net-new vs. generated). On this 9,300-memory corpus the ratio was 6/189 (3.2%); I would expect that to be higher on a younger corpus where fewer cards have been stored previously.
@henry201605, how are Steps 5 and 6 looking on your Milvus corpus?
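For anyone comparing numbers, it helps to compute the two ratios the same way so they line up across corpora. A minimal sketch; the function names are made up for illustration and are not part of the project:

```python
def survival_rate(net_new: int, generated: int) -> float:
    """Share of generated insight cards that were written rather than deduplicated."""
    if generated <= 0:
        raise ValueError("generated must be positive")
    return net_new / generated


def links_per_500(links_created: int, memories_scanned: int) -> float:
    """Entity-extraction yield, normalised to links per 500 scanned memories."""
    if memories_scanned <= 0:
        raise ValueError("memories_scanned must be positive")
    return links_created * 500 / memories_scanned


# Numbers from this corpus: 6 net-new cards out of 189 generated.
print(f"{survival_rate(6, 189):.1%}")  # 3.2%
```

Normalising the extraction yield to a fixed scan size keeps a small corpus and a large one comparable even when the scan batches differ.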
Great to see the features in action on a real corpus! Here are our numbers for comparison: smaller scale, but the same stack.
Our Setup
Observations
Entity Extraction:
Insight Cards:
Harvest + locale patterns (our fork):
Thanks for the credit on
Thank you for these detailed observations — they directly shaped v10.58.0 (released today, May 16 2026)! All three issues you raised are now addressed:
Details: PR #939 | CHANGELOG | Release v10.58.0
What this is
After enabling `MCP_INSIGHT_CARDS_ENABLED=true` and running a full `maintain` cycle today, here are the first-hand observations from using Insight Cards (#869) and Entity Extraction (#868) in a real memory corpus of ~9,300 memories.
Credit where it is due:
`tag_match` AND/OR filtering, contributed by @filhocf in PR "[Feature]: tags filter that narrows down results (aka AND filter)" #889
The maintain cycle output
With all 6 steps running (`MCP_INSIGHT_CARDS_ENABLED=true`, consolidation enabled), the deduplication on Step 6 is working well: 183 previously stored cards were correctly skipped, and only 6 genuinely new ones were written.
Key observations from the first 106 insight cards
1. `conflict:unresolved` gap: known false positive
The gap detector flagged: "Tag 'conflict:unresolved' has 140 memories but no decisions recorded."
This is a false positive. The tag is applied by the session consolidation hook as a status marker on session summaries ("this session had unresolved items"). It is not a knowledge domain requiring decision documentation.
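One way to suppress this class of false positive is to filter status-marker tags before gap detection sees them. A minimal sketch, assuming the detector works from a tag-to-count mapping; `DEFAULT_EXCLUDED_TAGS` and `filter_gap_candidates` are hypothetical names, not part of the InsightGenerator:

```python
from typing import Iterable

# Hypothetical defaults: tags that mark status/metadata rather than knowledge domains.
DEFAULT_EXCLUDED_TAGS = frozenset({"conflict:unresolved", "automated", "__test__"})


def filter_gap_candidates(
    tag_counts: dict[str, int],
    excluded: Iterable[str] = DEFAULT_EXCLUDED_TAGS,
) -> dict[str, int]:
    """Drop status-marker tags so they never become gap candidates."""
    excluded_set = set(excluded)
    return {tag: n for tag, n in tag_counts.items() if tag not in excluded_set}


# Counts taken from the cards discussed in this post.
counts = {"conflict:unresolved": 140, "ci": 2476}
print(filter_gap_candidates(counts))  # {'ci': 2476}
```

Keeping the exclusion set as a parameter means it could be fed from configuration rather than hard-coded.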
Takeaway for future improvement: the InsightGenerator gap detector would benefit from a configurable exclusion list of tags that are metadata/status markers rather than knowledge domains (e.g. `conflict:unresolved`, `automated`, `__test__`).
2. `radar:2026-04-02`: real data quality issue surfaced
64 LinkedIn posts harvested by an agent in April had `memory_type=NULL`. The insight trend detector tried to sort them and hit an error. This exposed a pre-existing data quality gap and a bug in the InsightGenerator.
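A minimal sketch of how a NULL `memory_type` can break a sort, assuming memories are handled as plain dicts. The record shape below is illustrative and not taken from `insights.py`:

```python
# Illustrative records; field names and values are assumptions, not real corpus data.
memories = [
    {"id": 1, "memory_type": "decision"},
    {"id": 2, "memory_type": None},  # key exists, value is None (NULL in storage)
]

# dict.get only uses the default when the key is MISSING, so None leaks through.
assert memories[1].get("memory_type", "") is None

try:
    sorted(memories, key=lambda m: m.get("memory_type", ""))
    sort_failed = False
except TypeError:
    # Comparing None with str raises TypeError in Python 3.
    sort_failed = True

# Coalescing with `or` handles both a missing key and an explicit None.
safe = sorted(memories, key=lambda m: m.get("memory_type") or "")
print(sort_failed, [m["id"] for m in safe])  # True [2, 1]
```

Note that `or ""` also replaces other falsy values such as `0` or an empty list, which is harmless for string fields but worth remembering elsewhere.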
Fix shipped as v10.55.2: three `dict.get("key", default)` calls in `insights.py` were replaced with `or ""` / `or []`. The gotcha: `dict.get` does not fall back to `default` when the key exists with a `None` value. All 64 memories were also bulk-updated to `memory_type="reference"`.
3. `ci` tag: signal/noise problem made visible
The gap detector surfaced: "Tag 'ci' has 2,476 memories but no decisions recorded."
On inspection: 2,276 of those are automated CI run observation dumps. The insight card is technically correct but practically noise. The `ci` tag does have 39 decisions; they're just drowned out.
Recommendation: use `memory_type=observation` + tag `temporary` for automated CI dumps so the 7-day retention policy auto-expires them. Reserve `decision`/`learning` types for actionable CI outcomes.
4. Architecture trend reversal: real gap caught
The trend detector noticed the `architecture` tag recently shifted from `decision`+`pattern` → `learning`+`observation`. This was accurate: formal architecture decision memories had stopped being recorded as the codebase matured.
Action taken: stored an explicit architecture direction decision memory documenting the current state (modular server layer, Strategy Pattern for storage, HTTP as primary interface, Hybrid sync separation of concerns).
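To make the idea of a type-mix shift concrete, here is a rough sketch of comparing the dominant memory types in an older window against a recent one. This is only an illustration of the concept; how the actual trend detector works is not shown in this post, and the function names and sample windows are made up:

```python
from collections import Counter


def dominant_types(memory_types: list[str], top_n: int = 2) -> set[str]:
    """Return the top_n most frequent memory types in a window."""
    return {t for t, _ in Counter(memory_types).most_common(top_n)}


def detect_shift(older: list[str], recent: list[str], top_n: int = 2):
    """Return (before, after) dominant-type sets if they differ, else None."""
    before = dominant_types(older, top_n)
    after = dominant_types(recent, top_n)
    return (before, after) if before != after else None


# Made-up windows for one tag, shaped like the architecture example above.
older = ["decision", "decision", "pattern", "pattern", "learning"]
recent = ["learning", "learning", "observation", "observation", "decision"]
print(detect_shift(older, recent))
```

Comparing sets of dominant types rather than raw counts keeps the check insensitive to overall volume changes between windows.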
How to enable these features
Add to your `.env`:
Then trigger a maintain cycle via MCP tool:
Or via HTTP API:
Open questions / future improvements
`temporary` tag by default to self-expire?
Happy to hear how others are finding these features on larger corpora. 🧠