Memory

Mango uses a vector store to remember successful interactions. Every time a query works correctly, it is saved. The next time you ask a similar question, that example is injected into the prompt — making the answer faster and more accurate.

How it works

  1. Store — after a successful tool call, (question, tool_name, tool_args, result_summary) is saved automatically.
  2. Retrieve — at the start of each ask(), the memory is searched for questions semantically similar to the current one.
  3. Inject — matches above the similarity threshold are added to the system prompt as few-shot examples.

This happens transparently. You don't need to manage it manually.

Three memory layers

ChromaAgentMemory maintains three separate ChromaDB collections:

LayerCollectionDescription
Auto-savemango_memoryPopulated automatically during use. Grows over time.
Text notesmango_memory_textFree-form domain knowledge you add manually.
Trainingmango_memory_trainingGold-standard examples you load explicitly. Injected first, never overwritten by auto-save.

Training entries take priority over auto-saved ones in the prompt — the LLM uses them directly without exploring the schema first, which reduces latency and improves accuracy.

Setup

from mango.integrations.chromadb import ChromaAgentMemory

# Persistent (survives restarts)
memory = ChromaAgentMemory(persist_dir="./mango_memory")

# In-memory (for testing)
memory = ChromaAgentMemory(persist_dir=":memory:")

Pass memory to MangoAgent:

agent = MangoAgent(..., agent_memory=memory)

Training

Load verified question→query pairs so the agent learns your domain before it starts answering.

Format

Training data is a JSONL file. Each line is one entry:

{"type": "training", "question": "How many orders were placed in 2024?", "tool_name": "run_mql", "tool_args": {"operation": "count", "collection": "orders", "filter": {"created_at": {"$gte": "2024-01-01", "$lte": "2024-12-31"}}}}
{"type": "training", "question": "Top 10 products by revenue", "tool_name": "run_mql", "tool_args": {"operation": "aggregate", "collection": "orders", "pipeline": [{"$unwind": "$items"}, {"$group": {"_id": "$items.product_id", "revenue": {"$sum": "$items.total"}}}, {"$sort": {"revenue": -1}}, {"$limit": 10}]}}
{"type": "text", "text": "'active user' means a user with at least one order in the last 90 days"}

type can be "training" (question→tool call) or "text" (free-form note). Both are supported in the same file.

CLI

mango train --file knowledge.jsonl

Output:

Training complete: 59 entries loaded (0 errors)

Python

from mango.memory.models import TrainingEntry
from mango.memory import make_entry_id

entry = TrainingEntry(
    id=make_entry_id(),
    question="How many orders were placed in 2024?",
    tool_name="run_mql",
    tool_args={
        "operation": "count",
        "collection": "orders",
        "filter": {"created_at": {"$gte": "2024-01-01", "$lte": "2024-12-31"}},
    },
)
await memory.train(entry)

REST API

POST /memory/train
Content-Type: application/json

{
  "question": "How many orders were placed in 2024?",
  "tool_name": "run_mql",
  "tool_args": {"operation": "count", "collection": "orders", "filter": {"created_at": {"$gte": "2024-01-01"}}}
}

Correcting mistakes in chat

If the agent gives a wrong answer and saves an incorrect query to memory, the user can ask to remove it in plain language — no UI needed.

Enable this by registering DeleteLastMemoryEntryTool after the agent is created:

from mango.tools.memory_tools import DeleteLastMemoryEntryTool

agent = MangoAgent(...)
tools.register(DeleteLastMemoryEntryTool(memory, lambda: agent._last_memory_entry_id))

Now users can say things like:

  • "that was wrong, delete it"
  • "forget the last query"
  • "don't save that"

The agent calls delete_last_memory_entry and removes the entry from the vector index. The next similar question starts fresh.

Pre-load domain knowledge

You can save free-form text that the LLM can retrieve when relevant:

# Business terminology
await memory.save_text("'active user' means a user with at least one order in the last 90 days")
await memory.save_text("'revenue' always refers to the total_amount field in the orders collection")

# Field semantics
await memory.save_text("the status field uses integers: 1=pending, 2=shipped, 3=delivered, 4=cancelled")

# Relationships
await memory.save_text("orders.user_id references the _id field in the users collection")

Inspect what's stored

# Search auto-saved entries
entries = await memory.retrieve(
    "how many users signed up last month?",
    top_k=10,
    similarity_threshold=0.0,  # 0.0 = return everything
)
for e in entries:
    print(f"{e.question}{e.tool_name}({e.tool_args}) [{e.similarity:.2f}]")

# Count entries per layer
print(f"Auto-saved: {memory.count()}")
print(f"Training:   {memory.training_count()}")

# Delete a specific entry
await memory.delete(entry_id)

Export and import

Move memory between environments or back it up:

# Export all entries (auto-save + text + training)
entries = await memory.export_all()

# Import into another memory instance
imported = await memory.import_all(entries)
print(f"Imported {imported} entries")

REST API:

GET  /memory/export
POST /memory/import   (body: list of dicts from export_all)

MemoryEntry

@dataclass
class MemoryEntry:
    id: str
    question: str       # original natural language question
    tool_name: str      # tool that was called
    tool_args: dict     # exact arguments that worked
    result_summary: str # first 300 chars of the result
    similarity: float   # filled by retrieve(), cosine similarity score
    timestamp: str      # ISO timestamp

TrainingEntry

@dataclass
class TrainingEntry:
    id: str
    question: str
    tool_name: str
    tool_args: dict
    result_summary: str  # optional
    similarity: float    # filled by get_training_entries()
    timestamp: str

Configuration

ParameterDefaultDescription
persist_dir".mango_memory"Directory for ChromaDB storage. Use ":memory:" for ephemeral.
collection_name"mango_memory"Base name for ChromaDB collections. Training uses {name}_training, text uses {name}_text.
memory_top_k3Max auto-saved examples to inject per question (set on MangoAgent).
similarity_threshold0.6Minimum similarity score to include an example.
training_top_k3Max training examples to inject per question (set on MangoAgent).

Custom memory backend

Mango's memory system is built on an abstract interface. Implement MemoryService to use any vector store:

from mango.memory import MemoryService

class MyPineconeMemory(MemoryService):
    async def store(self, entry: MemoryEntry) -> None: ...
    async def retrieve(self, question, top_k, similarity_threshold) -> list[MemoryEntry]: ...
    async def delete(self, entry_id) -> None: ...
    async def save_text(self, text) -> str: ...
    async def search_text(self, query, top_k, similarity_threshold) -> list[TextMemoryEntry]: ...
    def count(self) -> int: ...

→ See Custom Memory Backend for the full guide.