Ingestion Strategies | OpenMemory

Advanced strategies for ingesting large documents, websites, and multimodal data into OpenMemory.

Ingesting data into long-term memory takes more than copy-paste: you need to consider chunking, metadata, and relevance.

Pipelines

OpenMemory supports ingestion pipelines for:

  • Documents: PDF, DOCX, TXT, Markdown.
  • Web: URL scraping (HTML to Markdown).
  • Audio/Video: MP3, WAV, MP4, MOV (via Whisper & FFmpeg).
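
As a rough illustration of how inputs map onto these three pipelines, the sketch below routes a filename or URL by its extension. The name `pipelineFor` and the pipeline labels are hypothetical, chosen only for this example; OpenMemory's own type detection may work differently.

```javascript
// Hypothetical lookup table: the labels are illustrative, not OpenMemory identifiers.
const PIPELINES = {
  document: ["pdf", "docx", "txt", "md", "markdown"],
  media: ["mp3", "wav", "mp4", "mov"],
};

function pipelineFor(input) {
  // Anything that looks like an http(s) URL goes to the web scraper.
  if (/^https?:\/\//i.test(input)) return "web";

  // Otherwise decide by the (case-insensitive) file extension.
  const dot = input.lastIndexOf(".");
  const ext = dot === -1 ? "" : input.slice(dot + 1).toLowerCase();
  for (const [pipeline, extensions] of Object.entries(PIPELINES)) {
    if (extensions.includes(ext)) return pipeline;
  }
  return null; // unsupported or unknown type
}
```

Returning `null` for unknown types lets the caller decide whether to skip the input or fail loudly.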

API Usage

To ingest a file through the API, send its base64-encoded contents together with a content type and a user ID:

POST /memory/ingest
{
  "content_type": "audio/mp3",
  "data": "base64_encoded_audio...",
  "user_id": "user_1"
}
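
A minimal Node.js sketch of building and sending that request might look like the following. The function names and the `baseUrl` parameter are assumptions made for this example, and any authentication your deployment requires is left out.

```javascript
// Builds the JSON body shown above from raw file bytes (Buffer or Uint8Array).
function buildIngestBody(bytes, contentType, userId) {
  return {
    content_type: contentType,
    data: Buffer.from(bytes).toString("base64"),
    user_id: userId,
  };
}

// Assumption: `baseUrl` is wherever your OpenMemory server is reachable.
// Uses the global fetch available in Node.js 18 and later.
async function postIngest(baseUrl, body) {
  const response = await fetch(`${baseUrl}/memory/ingest`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Ingest failed with HTTP ${response.status}`);
  }
  // The response format is not described here, so it is returned untouched.
  return response;
}
```

Keeping the body builder separate from the network call makes the payload easy to inspect or log before anything is sent.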

Best Practices

1. Add Metadata

Always tag ingested content with its source so that each memory can be traced back to where it came from.

await ingest(mem, "paper.pdf", {
  metadata: { source: "research_paper", year: 2024 }
});
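
When ingesting a batch of files, one way to keep tags consistent is to merge the metadata shared by the whole batch with per-file fields. The helper below is a sketch: `ingestOptionsFor` and the `filename` key are hypothetical names, and it only assumes the `ingest(mem, path, { metadata })` call shape shown above.

```javascript
// Hypothetical helper: returns an options object shaped like the one above,
// combining batch-wide metadata with the file's own name.
function ingestOptionsFor(filePath, shared = {}) {
  // Take the last path segment, accepting both "/" and "\" separators.
  const filename = filePath.split(/[\\/]/).pop();
  return { metadata: { ...shared, filename } };
}

// Possible usage inside an async function, given the `ingest` and `mem`
// from the example above:
//
//   const shared = { source: "research_paper", year: 2024 };
//   for (const file of ["papers/a.pdf", "papers/b.pdf"]) {
//     await ingest(mem, file, ingestOptionsFor(file, shared));
//   }
```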

2. Use Semantic Chunking

For complex documents, use semantic chunking to keep related ideas together.
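
To make the idea concrete, here is a paragraph-aware chunker. It is a lightweight stand-in for semantic chunking, not OpenMemory's implementation: it never cuts inside a paragraph and packs neighbouring paragraphs together up to a size limit. The name `chunkByParagraph` and the `maxChars` default are assumptions for this sketch.

```javascript
// Splits text on blank lines and greedily packs whole paragraphs into chunks
// of at most `maxChars` characters. A single paragraph longer than the limit
// is kept intact as its own oversized chunk rather than being cut mid-thought.
function chunkByParagraph(text, maxChars = 1000) {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current ? current + "\n\n" + paragraph : paragraph;
    if (current && candidate.length > maxChars) {
      // Adding this paragraph would overflow: close the chunk, start a new one.
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

A fuller semantic chunker would typically also compare the meaning of adjacent passages (for example with embeddings) before deciding where to split; paragraph boundaries are only a cheap approximation of that.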

3. Rate Limiting

When ingesting a whole website, be mindful of your embedding provider's rate limits. OpenMemory handles backoff automatically in the background.
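
Because OpenMemory already backs off on its own, nothing more is required on your side. If you also want to protect calls made by your own code, such as a crawler that submits pages one by one, capped exponential backoff with jitter is a common pattern. The sketch below is generic; `backoffDelayMs`, `retryWithBackoff`, and their defaults are hypothetical and not part of OpenMemory.

```javascript
// Delay ceiling for a given retry attempt (0-based): doubles each time,
// never exceeding `maxMs`.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `task` (a function returning a promise) and retries it on failure,
// sleeping a random time between 0 and the current ceiling ("full jitter").
async function retryWithBackoff(task, { retries = 5, baseMs = 500, maxMs = 8000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Give up and surface the last error once the retry budget is spent.
      if (attempt >= retries) throw err;
      const delay = Math.random() * backoffDelayMs(attempt, baseMs, maxMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The random jitter spreads retries out so that many concurrent workers do not all hit the provider again at the same instant.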

© 2025 OpenMemory · MIT License