# Memory
Mango uses a vector store to remember successful interactions. Every time a query works correctly, it is saved. The next time you ask a similar question, that example is injected into the prompt — making the answer faster and more accurate.
## How it works
- **Store** — after a successful tool call, `(question, tool_name, tool_args, result_summary)` is saved automatically.
- **Retrieve** — at the start of each `ask()`, the memory is searched for questions semantically similar to the current one.
- **Inject** — matches above the similarity threshold are added to the system prompt as few-shot examples.
This happens transparently. You don't need to manage it manually.
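The store/retrieve/inject cycle can be sketched with a toy example. This is not Mango's implementation — word-overlap (Jaccard) similarity stands in for real vector embeddings, and the helper names are hypothetical:

```python
# Toy illustration of the store → retrieve → inject cycle.
# Word overlap stands in for real embeddings; nothing here is a Mango API.

def similarity(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

store = []  # each entry: (question, tool_name, tool_args)

def remember(question, tool_name, tool_args):
    """Called after a successful tool call."""
    store.append((question, tool_name, tool_args))

def inject_examples(question, top_k=3, threshold=0.3):
    """Build a few-shot block from the most similar stored examples."""
    scored = sorted(store, key=lambda e: similarity(question, e[0]), reverse=True)
    picked = [e for e in scored if similarity(question, e[0]) >= threshold][:top_k]
    return "\n".join(f"Q: {q}\nTool: {t}({a})" for q, t, a in picked)

remember("how many users are active?", "count_documents",
         {"collection": "users", "filter": {"active": True}})

# A similar question later pulls the stored example into the prompt:
prompt_block = inject_examples("how many users are active today?")
print(prompt_block)
```

In the real agent this block is appended to the system prompt, so the LLM sees a worked example of the right tool call before answering.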
## Setup
```python
from mango.integrations.chromadb import ChromaAgentMemory

# Persistent (survives restarts)
memory = ChromaAgentMemory(persist_dir="./mango_memory")

# In-memory (for testing)
memory = ChromaAgentMemory(persist_dir=":memory:")
```
Pass `memory` to `MangoAgent`:

```python
agent = MangoAgent(..., agent_memory=memory)
```
## Pre-load domain knowledge
You can save free-form text that the LLM can retrieve when relevant:
```python
# Business terminology
await memory.save_text("'active user' means a user with at least one order in the last 90 days")
await memory.save_text("'revenue' always refers to the total_amount field in the orders collection")

# Field semantics
await memory.save_text("the status field uses integers: 1=pending, 2=shipped, 3=delivered, 4=cancelled")

# Relationships
await memory.save_text("orders.user_id references the _id field in the users collection")
```
The LLM will retrieve these when answering questions that touch on these concepts.
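The matching itself can be pictured with a toy ranker. This sketch uses keyword overlap purely for illustration — Mango actually ranks notes by vector similarity via ChromaDB:

```python
# Toy sketch: pick the pre-loaded note most relevant to a question.
# Keyword overlap stands in for real embedding similarity.
import re

notes = [
    "'active user' means a user with at least one order in the last 90 days",
    "'revenue' always refers to the total_amount field in the orders collection",
]

def tokens(s: str) -> set[str]:
    return set(re.findall(r"\w+", s.lower()))

def best_note(question: str) -> str:
    q = tokens(question)
    return max(notes, key=lambda n: len(q & tokens(n)))

matched = best_note("how is revenue calculated?")
print(matched)
```

A question mentioning "revenue" surfaces the revenue note, which then rides along in the prompt so the LLM knows to aggregate `total_amount` in `orders`.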
## Inspect what's stored
```python
# Search for entries similar to a question
entries = await memory.retrieve(
    "how many users signed up last month?",
    top_k=10,
    similarity_threshold=0.0,  # 0.0 = return everything
)
for e in entries:
    print(f"{e.question} → {e.tool_name}({e.tool_args}) [{e.similarity:.2f}]")

# Count total stored entries
print(f"Total memories: {memory.count()}")

# Delete an entry
await memory.delete(entry_id)
```
## MemoryEntry
```python
@dataclass
class MemoryEntry:
    id: str
    question: str        # original natural language question
    tool_name: str       # tool that was called
    tool_args: dict      # exact arguments that worked
    result_summary: str  # first 300 chars of the result
    similarity: float    # filled by retrieve(); cosine similarity score
    timestamp: str       # ISO timestamp
```
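To make the fields concrete, here is a hypothetical construction of one entry — the values and the `id` are illustrative, not something Mango exposes for you to build by hand:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class MemoryEntry:
    id: str
    question: str
    tool_name: str
    tool_args: dict
    result_summary: str
    similarity: float
    timestamp: str

raw_result = "x" * 1000  # pretend tool output

entry = MemoryEntry(
    id="abc123",                                   # illustrative id
    question="how many users signed up last month?",
    tool_name="count_documents",
    tool_args={"collection": "users"},
    result_summary=raw_result[:300],               # first 300 chars, per the field doc
    similarity=0.0,                                # filled later by retrieve()
    timestamp=datetime.now(timezone.utc).isoformat(),
)
print(len(entry.result_summary))
```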
## Configuration
| Parameter | Default | Description |
|---|---|---|
| `persist_dir` | `".mango_memory"` | Directory for ChromaDB storage. Use `":memory:"` for ephemeral. |
| `collection_name` | `"mango_memory"` | Base name for ChromaDB collections. |
| `memory_top_k` | `3` | Max examples to inject per question (set on `MangoAgent`). |
| `similarity_threshold` | `0.6` | Minimum similarity score to include an example. |
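The last two knobs interact in a fixed order: candidates below `similarity_threshold` are dropped first, then at most `memory_top_k` of the survivors are injected. A minimal sketch of that filtering (not library code):

```python
def select_examples(scored, top_k=3, similarity_threshold=0.6):
    """scored: list of (similarity, example) pairs."""
    # 1. Drop everything below the threshold.
    kept = [(s, e) for s, e in scored if s >= similarity_threshold]
    # 2. Keep only the top_k highest-scoring survivors.
    kept.sort(key=lambda p: p[0], reverse=True)
    return [e for _, e in kept[:top_k]]

candidates = [(0.91, "ex1"), (0.75, "ex2"), (0.62, "ex3"), (0.41, "ex4")]
kept = select_examples(candidates)
print(kept)  # "ex4" falls below 0.6 and is never injected
```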
## Custom memory backend
Mango's memory system is built on an abstract interface. Implement `MemoryService` to use any vector store:
```python
from mango.memory import MemoryService

class MyPineconeMemory(MemoryService):
    async def store(self, entry: MemoryEntry) -> None: ...
    async def retrieve(self, question, top_k, similarity_threshold) -> list[MemoryEntry]: ...
    async def delete(self, entry_id) -> None: ...
    async def save_text(self, text) -> str: ...
    async def search_text(self, query, top_k, similarity_threshold) -> list[TextMemoryEntry]: ...
    def count(self) -> int: ...
```
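As a rough sketch of what a backend must do, here is a toy list-backed implementation. The `MemoryEntry` class is redeclared locally so the sketch runs standalone; a real backend would subclass `mango.memory.MemoryService` and use genuine embeddings instead of word overlap:

```python
import asyncio
from dataclasses import dataclass

# Local stand-in so this sketch runs without mango installed.
@dataclass
class MemoryEntry:
    id: str
    question: str
    tool_name: str
    tool_args: dict
    result_summary: str
    similarity: float
    timestamp: str

class ListMemory:
    """Minimal in-memory backend mirroring the MemoryService shape."""

    def __init__(self):
        self._entries: dict[str, MemoryEntry] = {}

    async def store(self, entry: MemoryEntry) -> None:
        self._entries[entry.id] = entry

    async def retrieve(self, question, top_k, similarity_threshold):
        def sim(a, b):  # word overlap stands in for embedding similarity
            wa, wb = set(a.lower().split()), set(b.lower().split())
            return len(wa & wb) / len(wa | wb)
        scored = [(sim(question, e.question), e) for e in self._entries.values()]
        scored = [(s, e) for s, e in scored if s >= similarity_threshold]
        scored.sort(key=lambda p: p[0], reverse=True)
        return [e for _, e in scored[:top_k]]

    async def delete(self, entry_id) -> None:
        self._entries.pop(entry_id, None)

    def count(self) -> int:
        return len(self._entries)

mem = ListMemory()

async def demo():
    await mem.store(MemoryEntry(
        id="1", question="count active users", tool_name="count_documents",
        tool_args={"collection": "users"}, result_summary="42",
        similarity=0.0, timestamp="2024-01-01T00:00:00",
    ))
    return await mem.retrieve("count active users today",
                              top_k=3, similarity_threshold=0.2)

hits = asyncio.run(demo())
print(len(hits), mem.count())
```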
→ See Custom Memory Backend for the full guide.