Infinite Memory for AI.
Locally.

A local-first vector memory engine for Python. No API keys. No cloud bills. Just high-performance, offline RAG for your LLM agents.

Read The Docs

Data Flow Architecture

User Query → L1 Cache (O(1) hash) → Vector DB (Chroma) → LLM Agent

Hot path (cache hit): the L1 cache answers immediately and the vector search is skipped.
Cold path (cache miss): the query falls through to the Chroma vector store, and the retrieved context is handed to the LLM agent.

System Capabilities

⚡

O(1) Semantic Cache

Why search vectors twice? MemLoop hashes queries so repeated questions are intercepted instantly, cutting latency by up to 99% for recurring topics.
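MemLoop's cache internals aren't shown here, but the idea behind an O(1) query cache can be sketched in a few lines: normalize the query, hash it, and keep an LRU map in front of the vector store. This is an illustrative exact-match variant (class and method names are hypothetical, not MemLoop's API; MemLoop's cache_similarity_threshold suggests it also matches near-duplicate queries by embedding distance):

```python
import hashlib
from collections import OrderedDict

class QueryCache:
    """Toy O(1) query cache: normalize, hash, look up, with LRU eviction."""

    def __init__(self, max_size=512):
        self.max_size = max_size
        self._store = OrderedDict()

    @staticmethod
    def _key(query):
        # Normalize so trivial variations ("  What is X? ") hash identically.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def get(self, query):
        key = self._key(query)
        if key in self._store:
            self._store.move_to_end(key)     # mark as recently used
            return self._store[key]
        return None                          # miss: caller falls back to vector search

    def put(self, query, context):
        key = self._key(query)
        self._store[key] = context
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict least recently used

cache = QueryCache(max_size=2)
cache.put("What is a transformer?", "cached context")
print(cache.get("what is a  transformer?"))  # hit despite case/spacing: cached context
```

On a miss, the caller runs the normal vector search and stores the result, so the second time a topic comes up the answer is a dictionary lookup rather than an embedding plus a nearest-neighbor query.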

🔒

100% Offline

Your data never leaves localhost. We use ChromaDB and lightweight SentenceTransformers that run on your CPU. Perfect for sensitive contracts, medical data, or PII.

📂

Universal Ingestion

Point to a folder of .pdf, .csv, or .txt files. MemLoop handles the ETL pipeline automatically.

🔖

Page-Level Citations

Hallucination killer. Every retrieval comes with source metadata: {source: "manual.pdf", page: 42}.
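The pattern behind page-level citations is to attach {source, page} metadata to every chunk at ingest time and carry it through to retrieval. A minimal sketch (helper names are illustrative, not MemLoop internals):

```python
def make_chunks(pages, source):
    """Attach {source, page} metadata to each chunk at ingest time."""
    chunks = []
    for page_number, text in enumerate(pages, start=1):
        chunks.append({"text": text, "source": source, "page": page_number})
    return chunks

def format_citation(chunk):
    """Render a retrieved chunk with its provenance."""
    return f'{chunk["text"]} (Source: {chunk["source"]}, p. {chunk["page"]})'

chunks = make_chunks(["Reset the device...", "Warranty terms..."], "manual.pdf")
print(format_citation(chunks[1]))
# Warranty terms... (Source: manual.pdf, p. 2)
```

Because the metadata travels with the chunk through the vector store, every answer can point back to the exact page it came from.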

Developer documentation

Install and run your first agent in under 30 seconds.

# 1. Install via pip
pip install memloop google-generativeai

# 2. Build a RAG agent (20 lines)
import google.generativeai as genai
from memloop import MemLoop

# Setup
genai.configure(api_key="YOUR_KEY")
model = genai.GenerativeModel('gemini-pro')
brain = MemLoop()

# Ingest (run once, remember forever)
brain.learn_url("https://docs.python.org/3/")

# Retrieve & generate
query = "How do decorators work?"
context = brain.recall(query)
prompt = f"Context:\n{context}\n\nUser: {query}"
response = model.generate_content(prompt)
print(response.text)

Gemini RAG Starter

Drop-in script that learns a URL, retrieves context, and answers.

import google.generativeai as genai
from memloop import MemLoop

genai.configure(api_key="YOUR_GEMINI_KEY")
model = genai.GenerativeModel('gemini-2.5-flash')
brain = MemLoop()

print(f"Learned {brain.learn_url('https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)')} chunks.")

query = "What is a transformer?"
context = brain.recall(query)

response = model.generate_content(f"Answer using this context:\n{context}\n\nUser: {query}")

print(f"\nContext Found:\n{context[:200]}...\n")
print(f"Gemini Says:\n{response.text}")

A comprehensive reference for power users.

MemLoop API at a glance

All public functions and status fields from the core engine.

Initialization

MemLoop(db_path, chunk_size, chunk_overlap, cache_max_size, cache_similarity_threshold, retrieval_max_distance, short_term_limit)

Configures chunking, cache policy, and retrieval threshold.

Ingestion

learn_url(url, follow_links=False, max_pages=10) -> int

learn_local(folder_path) -> int

learn_doc(file_path, page_number=None) -> int

Memory

add_memory(text) -> None

recall(query, n_results=5, include_short_term=True) -> str

Cache & Management

forget_cache() -> None

forget_source(source) -> None

status() -> dict

__repr__() -> str

Status Fields

long_term_count

short_term_count

cache_size

cache_max

class MemLoop(db_path="./memloop_data")

Initializes the local vector engine. Data is persisted to disk.

Usage examples (one per function)

Copy and paste these snippets to get started instantly.

Initialization
from memloop import MemLoop

brain = MemLoop(
    db_path="./memloop_data",
    chunk_size=500,
    chunk_overlap=100,
    cache_max_size=512,
    cache_similarity_threshold=0.15,
    retrieval_max_distance=1.2,
    short_term_limit=10,
)
learn_url()
count = brain.learn_url(
    "https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)",
    follow_links=False,
    max_pages=10,
)
print(f"Indexed {count} chunks.")
learn_local()
count = brain.learn_local("./docs")
print(f"Indexed {count} chunks from local folder.")
learn_doc()
count = brain.learn_doc("./manual.pdf", page_number=2)
print(f"Indexed {count} chunks from page 2.")
add_memory()
brain.add_memory("Customer asked about pricing tiers and enterprise plan.")
recall()
context = brain.recall(
    "What is a transformer?",
    n_results=5,
    include_short_term=True,
)
print(context)
forget_cache()
brain.forget_cache()
forget_source()
brain.forget_source("https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)")
status()
stats = brain.status()
print(stats)
# {"long_term_count": 1200, "short_term_count": 3, "cache_size": 14, "cache_max": 512}

.learn_url(url: str) -> int

Scrapes, cleans, and chunks a webpage. Returns chunk count.

.learn_local(folder_path: str) -> int

Recursively ingests a folder. Supports .pdf, .csv, .txt, .md.

.recall(query: str) -> str

1. Checks Semantic Cache (O(1)).
2. Performs Vector Search.
3. Returns formatted text with citations.
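Put together, the hot/cold path of recall() can be sketched with a toy in-memory store. The bag-of-words "embedding" here stands in for SentenceTransformers, and the dict stands in for both the cache and Chroma; citation formatting is omitted. All names are illustrative, not MemLoop's implementation:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' standing in for SentenceTransformers."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = ["transformers use self attention", "decorators wrap python functions"]
vectors = [embed(d) for d in docs]
cache = {}

def recall(query):
    key = " ".join(query.lower().split())
    if key in cache:                       # 1. hot path: O(1) cache hit
        return cache[key]
    q = embed(query)                       # 2. cold path: vector search
    best = max(range(len(docs)), key=lambda i: cosine(q, vectors[i]))
    cache[key] = docs[best]                # 3. memoize for next time
    return cache[key]

print(recall("what is self attention"))   # cold path: runs the search
print(recall("what is self attention"))   # hot path: no search at all
```

The first call pays for an embedding and a similarity scan; the repeat is a single dictionary lookup, which is the 99% latency win described above.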

Use the terminal for quick testing and data management.

$ memloop
[SYSTEM]: Initializing Neural Link...
> /learn https://en.wikipedia.org/wiki/Artificial_intelligence
[SYSTEM]: Absorbed 45 chunks.
> What is the history of AI?
[MEMLOOP]: "AI history began in antiquity..." (Source: Wikipedia)