Memory
Mango uses a vector store to remember successful interactions. Every time a query works correctly, it is saved. The next time you ask a similar question, that example is injected into the prompt — making the answer faster and more accurate.
How it works
- Store — after a successful tool call,
(question, tool_name, tool_args, result_summary)is saved automatically. - Retrieve — at the start of each
ask(), the memory is searched for questions semantically similar to the current one. - Inject — matches above the similarity threshold are added to the system prompt as few-shot examples.
This happens transparently. You don't need to manage it manually.
Three memory layers
ChromaAgentMemory maintains three separate ChromaDB collections:
| Layer | Collection | Description |
|---|---|---|
| Auto-save | mango_memory | Populated automatically during use. Grows over time. |
| Text notes | mango_memory_text | Free-form domain knowledge you add manually. |
| Training | mango_memory_training | Gold-standard examples you load explicitly. Injected first, never overwritten by auto-save. |
Training entries take priority over auto-saved ones in the prompt — the LLM uses them directly without exploring the schema first, which reduces latency and improves accuracy.
Setup
from mango.integrations.chromadb import ChromaAgentMemory
# Persistent (survives restarts)
memory = ChromaAgentMemory(persist_dir="./mango_memory")
# In-memory (for testing)
memory = ChromaAgentMemory(persist_dir=":memory:")
Pass memory to MangoAgent:
agent = MangoAgent(..., agent_memory=memory)
Training
Load verified question→query pairs so the agent learns your domain before it starts answering.
Format
Training data is a JSONL file. Each line is one entry:
{"type": "training", "question": "How many orders were placed in 2024?", "tool_name": "run_mql", "tool_args": {"operation": "count", "collection": "orders", "filter": {"created_at": {"$gte": "2024-01-01", "$lte": "2024-12-31"}}}}
{"type": "training", "question": "Top 10 products by revenue", "tool_name": "run_mql", "tool_args": {"operation": "aggregate", "collection": "orders", "pipeline": [{"$unwind": "$items"}, {"$group": {"_id": "$items.product_id", "revenue": {"$sum": "$items.total"}}}, {"$sort": {"revenue": -1}}, {"$limit": 10}]}}
{"type": "text", "text": "'active user' means a user with at least one order in the last 90 days"}
type can be "training" (question→tool call) or "text" (free-form note). Both are supported in the same file.
CLI
mango train --file knowledge.jsonl
Output:
Training complete: 59 entries loaded (0 errors)
Python
from mango.memory.models import TrainingEntry
from mango.memory import make_entry_id
entry = TrainingEntry(
id=make_entry_id(),
question="How many orders were placed in 2024?",
tool_name="run_mql",
tool_args={
"operation": "count",
"collection": "orders",
"filter": {"created_at": {"$gte": "2024-01-01", "$lte": "2024-12-31"}},
},
)
await memory.train(entry)
REST API
POST /memory/train
Content-Type: application/json
{
"question": "How many orders were placed in 2024?",
"tool_name": "run_mql",
"tool_args": {"operation": "count", "collection": "orders", "filter": {"created_at": {"$gte": "2024-01-01"}}}
}
Correcting mistakes in chat
If the agent gives a wrong answer and saves an incorrect query to memory, the user can ask to remove it in plain language — no UI needed.
Enable this by registering DeleteLastMemoryEntryTool after the agent is created:
from mango.tools.memory_tools import DeleteLastMemoryEntryTool
agent = MangoAgent(...)
tools.register(DeleteLastMemoryEntryTool(memory, lambda: agent._last_memory_entry_id))
Now users can say things like:
- "that was wrong, delete it"
- "forget the last query"
- "don't save that"
The agent calls delete_last_memory_entry and removes the entry from the vector index. The next similar question starts fresh.
Pre-load domain knowledge
You can save free-form text that the LLM can retrieve when relevant:
# Business terminology
await memory.save_text("'active user' means a user with at least one order in the last 90 days")
await memory.save_text("'revenue' always refers to the total_amount field in the orders collection")
# Field semantics
await memory.save_text("the status field uses integers: 1=pending, 2=shipped, 3=delivered, 4=cancelled")
# Relationships
await memory.save_text("orders.user_id references the _id field in the users collection")
Inspect what's stored
# Search auto-saved entries
entries = await memory.retrieve(
"how many users signed up last month?",
top_k=10,
similarity_threshold=0.0, # 0.0 = return everything
)
for e in entries:
print(f"{e.question} → {e.tool_name}({e.tool_args}) [{e.similarity:.2f}]")
# Count entries per layer
print(f"Auto-saved: {memory.count()}")
print(f"Training: {memory.training_count()}")
# Delete a specific entry
await memory.delete(entry_id)
Export and import
Move memory between environments or back it up:
# Export all entries (auto-save + text + training)
entries = await memory.export_all()
# Import into another memory instance
imported = await memory.import_all(entries)
print(f"Imported {imported} entries")
REST API:
GET /memory/export
POST /memory/import (body: list of dicts from export_all)
MemoryEntry
@dataclass
class MemoryEntry:
id: str
question: str # original natural language question
tool_name: str # tool that was called
tool_args: dict # exact arguments that worked
result_summary: str # first 300 chars of the result
similarity: float # filled by retrieve(), cosine similarity score
timestamp: str # ISO timestamp
TrainingEntry
@dataclass
class TrainingEntry:
id: str
question: str
tool_name: str
tool_args: dict
result_summary: str # optional
similarity: float # filled by get_training_entries()
timestamp: str
Configuration
| Parameter | Default | Description |
|---|---|---|
persist_dir | ".mango_memory" | Directory for ChromaDB storage. Use ":memory:" for ephemeral. |
collection_name | "mango_memory" | Base name for ChromaDB collections. Training uses {name}_training, text uses {name}_text. |
memory_top_k | 3 | Max auto-saved examples to inject per question (set on MangoAgent). |
similarity_threshold | 0.6 | Minimum similarity score to include an example. |
training_top_k | 3 | Max training examples to inject per question (set on MangoAgent). |
Custom memory backend
Mango's memory system is built on an abstract interface. Implement MemoryService to use any vector store:
from mango.memory import MemoryService
class MyPineconeMemory(MemoryService):
async def store(self, entry: MemoryEntry) -> None: ...
async def retrieve(self, question, top_k, similarity_threshold) -> list[MemoryEntry]: ...
async def delete(self, entry_id) -> None: ...
async def save_text(self, text) -> str: ...
async def search_text(self, query, top_k, similarity_threshold) -> list[TextMemoryEntry]: ...
def count(self) -> int: ...
→ See Custom Memory Backend for the full guide.