How to Self-Host Cognee with Dokploy or Docker Compose


If you’re building AI applications that need persistent memory and knowledge graphs, Cognee is a powerful open-source platform worth exploring. It transforms your data into structured knowledge graphs with semantic search capabilities, making it perfect for RAG applications, chatbots, and AI assistants. While cloud solutions exist, self-hosting gives you complete control over your data, privacy, and costs.

In this comprehensive guide, I’ll show you how to self-host Cognee on your own infrastructure using either Dokploy (the easiest method) or Docker Compose for more control. We’ll set up a production-ready deployment with PostgreSQL and pgvector for both metadata and vector storage, using OpenAI for the embedding model, plus the MCP server for AI assistant integration.

What is Cognee?

Cognee is an AI memory platform that organizes your data into knowledge graphs. Unlike simple vector databases, Cognee builds semantic relationships between your data points, enabling more intelligent retrieval and reasoning for AI applications.

Key Features of Cognee

  • Knowledge Graph Construction: Automatically extracts entities and relationships from your documents
  • Vector Embeddings: Semantic search using configurable embedding providers (OpenAI, Gemini, Ollama, etc.)
  • Multi-Provider LLM Support: Works with OpenAI, Anthropic, Google Gemini, Ollama, and more
  • MCP Integration: Model Context Protocol support for AI coding assistants like Cursor, Claude, and VS Code
  • REST API: Full-featured API for data ingestion, processing, and search
  • Flexible Storage: Supports PostgreSQL, SQLite, Neo4j, and various vector stores
  • Code Intelligence: Special pipelines for analyzing and understanding codebases
  • Dataset Management: Organize data into separate datasets with permissions
  • Session Memory: Maintain conversational context across interactions

Why Self-Host Cognee?

Benefits of Self-Hosting:

  • Complete data privacy and ownership
  • No usage limits or API costs (beyond LLM providers)
  • Custom infrastructure and scaling options
  • Integration with private networks and services
  • Full control over model and embedding choices

Use Cases:

  • Building AI assistants with long-term memory
  • RAG (Retrieval Augmented Generation) applications
  • Code analysis and documentation tools
  • Knowledge management systems
  • AI-powered search for internal documents

Prerequisites

Before you begin, make sure you have:

  • A VPS or Server: Minimum 4GB RAM and 2 CPU cores recommended
  • A Domain Name: For accessing your Cognee API (e.g., cognee.yourdomain.com)
  • Docker Installed: Docker and Docker Compose (Dokploy includes this)
  • OpenAI API Key: For embeddings and LLM operations (or alternative provider)
  • Basic Command Line Knowledge: For running deployment commands

Hosting Recommendations

For production use, we recommend a VPS with at least 4GB RAM. We use the pgvector/pgvector:pg17 image, which is PostgreSQL 17 with the pgvector extension: a single database handles both relational data (metadata, users, datasets) AND vector embeddings for semantic search, which simplifies deployment significantly. Providers like Hetzner, DigitalOcean, or AWS work well.

Understanding the Storage Architecture

Before diving into deployment, it’s helpful to understand how Cognee stores data:

Three Storage Layers

Cognee uses three storage layers, and our Docker Compose handles all of them:

  1. Relational Database (DB_PROVIDER=postgres): Stores metadata, user accounts, datasets, document information, and pipeline state
  2. Vector Database (VECTOR_DB_PROVIDER=pgvector): Stores embeddings for semantic similarity search using the pgvector extension
  3. Graph Database (GRAPH_DATABASE_PROVIDER=kuzu): Stores knowledge graph data (entities and relationships) in a file-based directory

We use pgvector/pgvector:pg17 - PostgreSQL 17 with the pgvector extension - for both relational and vector storage. For the graph database, Kuzu stores data inside the container’s filesystem, so we need a volume to persist it across container restarts.
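
Once your stack is running, you can confirm the pgvector extension is active in the shared PostgreSQL instance (use the service name from your compose file: cognee-postgres in the Dokploy stack below, postgres in the plain Docker Compose variant). An empty result just means Cognee hasn’t initialized the extension yet:

docker compose exec cognee-postgres psql -U cognee -d cognee_db \
  -c "SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';"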

Option 1: Deploy with Dokploy (Easiest Method)

Dokploy is an open-source Platform as a Service that simplifies deploying Docker applications. If you haven’t set up Dokploy yet, check out our Dokploy Installation Guide.

Step 1: Install Dokploy

If not already installed:

curl -sSL https://dokploy.com/install.sh | sh

Access Dokploy at http://your-vps-ip:3000 and complete the setup.

Step 2: Create a New Project

  1. Log in to Dokploy dashboard
  2. Click “Create Project” and name it (e.g., “Cognee”)
  3. Inside the project, click “Add Service” → “Compose”
  4. Select “Docker Compose” type (not Stack)
  5. Name it “cognee-stack”

Step 3: Add Docker Compose Configuration

Go to the General tab and paste the following Docker Compose configuration:

services:
  cognee:
    image: cognee/cognee:main
    networks:
      - dokploy-network
      - cognee-network
    volumes:
      - cognee-data:/app/.cognee_system
    environment:
      - HOST=0.0.0.0
      - ENVIRONMENT=production
      - LOG_LEVEL=INFO
      # Authentication (REQUIRED for public deployment)
      - REQUIRE_AUTHENTICATION=true
      # LLM Configuration
      - LLM_API_KEY=${LLM_API_KEY}
      - LLM_PROVIDER=${LLM_PROVIDER:-openai}
      - LLM_MODEL=${LLM_MODEL:-gpt-4o-mini}
      # Embedding Configuration (OpenAI)
      - EMBEDDING_PROVIDER=${EMBEDDING_PROVIDER:-openai}
      - EMBEDDING_MODEL=${EMBEDDING_MODEL:-openai/text-embedding-3-small}
      - EMBEDDING_DIMENSIONS=${EMBEDDING_DIMENSIONS:-1536}
      - EMBEDDING_API_KEY=${LLM_API_KEY}
      # Database Configuration (PostgreSQL for relational data)
      - DB_PROVIDER=postgres
      - DB_HOST=cognee-postgres
      - DB_PORT=5432
      - DB_NAME=cognee_db
      - DB_USERNAME=cognee
      - DB_PASSWORD=${DB_PASSWORD}
      # Vector Database (pgvector - uses SAME PostgreSQL instance)
      - VECTOR_DB_PROVIDER=pgvector
      # Graph Database (default Kuzu - file-based)
      - GRAPH_DATABASE_PROVIDER=${GRAPH_DATABASE_PROVIDER:-kuzu}
    depends_on:
      cognee-postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 4GB

  cognee-mcp:
    image: cognee/cognee-mcp:main
    networks:
      - dokploy-network
      - cognee-network
    environment:
      - TRANSPORT_MODE=http
      - API_URL=http://cognee:8000
      - LOG_LEVEL=INFO
    depends_on:
      cognee:
        condition: service_healthy
    restart: unless-stopped

  cognee-postgres:
    image: pgvector/pgvector:pg17
    networks:
      - cognee-network
    environment:
      - POSTGRES_USER=cognee
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=cognee_db
    volumes:
      - cognee-postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U cognee -d cognee_db"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

networks:
  cognee-network:
    name: cognee-network
  dokploy-network:
    external: true

volumes:
  cognee-data:
  cognee-postgres-data:

Volume Explanation

  • cognee-data:/app/.cognee_system - Persists Kuzu graph database and system files
  • cognee-postgres-data:/var/lib/postgresql/data - Persists PostgreSQL data (relational + vector)

Using named volumes ensures data persists across Dokploy deployments.
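
After the first deployment, you can confirm the volumes exist on the host (Compose may prefix the names with the project name, so filter rather than matching exactly):

# List all Cognee-related volumes on the host
docker volume ls | grep cognee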

UI Not Available in Docker

The Cognee frontend Docker image (cognee-frontend) is experimental and currently not well-supported. For the Cognee UI, you need to run cognee-cli -ui locally with a Python installation, which launches both frontend and backend. For Docker deployments, use the Swagger UI at https://cognee.yourdomain.com/docs for full API access.

Configuring Domains in Dokploy

This Docker Compose configuration doesn’t include Traefik labels or exposed ports. Instead, configure domains through Dokploy’s Domain tab:

  1. After deploying, go to the Domains tab for your compose service
  2. Click Add Domain and configure:
    • Domain: cognee.yourdomain.com
    • Container: Select the cognee service
    • Port: 8000
    • Enable HTTPS for automatic SSL
  3. Repeat for the MCP service:
    • Domain: mcp.yourdomain.com
    • Container: Select the cognee-mcp service
    • Port: 8000

This approach is cleaner than inline Traefik labels and allows easy domain management through the Dokploy UI.

Important Notes

  • The dokploy-network is required for Traefik routing
  • Don’t set container_name as it causes issues with Dokploy features
  • The same PostgreSQL instance (cognee-postgres) is used for BOTH relational data AND vector storage via pgvector

Step 4: Configure Environment Variables

Go to the Environment tab and add these variables:

# OpenAI API Key (required for LLM and embeddings)
LLM_API_KEY=sk-your-openai-api-key-here

# Database Password (generate a strong password)
DB_PASSWORD=your-secure-database-password-here

# Optional: LLM Configuration
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini

# Optional: Embedding Configuration
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=openai/text-embedding-3-small
EMBEDDING_DIMENSIONS=1536

# Optional: Graph Database (kuzu is default, can use neo4j or falkordb)
GRAPH_DATABASE_PROVIDER=kuzu

Generate a secure database password:

openssl rand -base64 32

Step 5: Configure DNS

Before deploying, set up your DNS A records:

  1. cognee.yourdomain.com → Your VPS IP
  2. mcp.yourdomain.com → Your VPS IP

Step 6: Deploy and Configure Domains

  1. Click “Deploy” and wait for the services to start
  2. Monitor the logs in the Deployments or Logs tab
  3. Go to the Domains tab and add domains for each service (see notes above)
  4. Wait about 30 seconds for Traefik to generate SSL certificates

Once deployed, verify:

# Check API health
curl https://cognee.yourdomain.com/health

# Check MCP health
curl https://mcp.yourdomain.com/health

# Access API documentation
open https://cognee.yourdomain.com/docs

Option 2: Deploy with Docker Compose Only

For more manual control or if you prefer not to use Dokploy, here’s how to deploy with Docker Compose directly.

Step 1: Prepare Your Server

Update system and install Docker:

# Update packages
sudo apt update && sudo apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Install Docker Compose plugin
sudo apt install docker-compose-plugin -y

Step 2: Create Project Directory

mkdir -p ~/cognee
cd ~/cognee

Step 3: Create Docker Compose File

nano docker-compose.yml

Paste the following configuration:

services:
  cognee:
    image: cognee/cognee:main
    container_name: cognee
    networks:
      - cognee-network
    volumes:
      - cognee-data:/app/.cognee_system
    environment:
      - HOST=0.0.0.0
      - ENVIRONMENT=production
      - LOG_LEVEL=INFO
      # Authentication (REQUIRED for public deployment)
      - REQUIRE_AUTHENTICATION=true
      # LLM Configuration
      - LLM_API_KEY=${LLM_API_KEY}
      - LLM_PROVIDER=${LLM_PROVIDER:-openai}
      - LLM_MODEL=${LLM_MODEL:-gpt-4o-mini}
      # Embedding Configuration (OpenAI)
      - EMBEDDING_PROVIDER=${EMBEDDING_PROVIDER:-openai}
      - EMBEDDING_MODEL=${EMBEDDING_MODEL:-openai/text-embedding-3-small}
      - EMBEDDING_DIMENSIONS=${EMBEDDING_DIMENSIONS:-1536}
      - EMBEDDING_API_KEY=${LLM_API_KEY}
      # Database Configuration (PostgreSQL for relational data)
      - DB_PROVIDER=postgres
      - DB_HOST=postgres
      - DB_PORT=5432
      - DB_NAME=cognee_db
      - DB_USERNAME=cognee
      - DB_PASSWORD=${DB_PASSWORD}
      # Vector Database (pgvector - uses SAME PostgreSQL instance)
      - VECTOR_DB_PROVIDER=pgvector
      # Graph Database
      - GRAPH_DATABASE_PROVIDER=${GRAPH_DATABASE_PROVIDER:-kuzu}
    ports:
      - "8000:8000"
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 4GB

  cognee-mcp:
    image: cognee/cognee-mcp:main
    container_name: cognee-mcp
    networks:
      - cognee-network
    environment:
      - TRANSPORT_MODE=http
      - API_URL=http://cognee:8000
      - LOG_LEVEL=INFO
    ports:
      - "8001:8000"
    depends_on:
      cognee:
        condition: service_healthy
    restart: unless-stopped

  postgres:
    image: pgvector/pgvector:pg17
    container_name: cognee-postgres
    networks:
      - cognee-network
    environment:
      - POSTGRES_USER=cognee
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=cognee_db
    volumes:
      - cognee-postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U cognee -d cognee_db"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

networks:
  cognee-network:
    name: cognee-network

volumes:
  cognee-postgres-data:
  cognee-data:

Volume Explanation

  • cognee-data:/app/.cognee_system - Persists Kuzu graph database and Cognee system files
  • cognee-postgres-data:/var/lib/postgresql/data - Persists PostgreSQL data (relational + vector via pgvector)

The pgvector/pgvector:pg17 image is PostgreSQL 17 with the pgvector extension. When we set DB_PROVIDER=postgres and VECTOR_DB_PROVIDER=pgvector, Cognee uses the same PostgreSQL database for both relational and vector storage.

UI Not Available in Docker

The Cognee frontend Docker image is experimental and currently not well-supported. To access the Cognee UI, you need to run cognee-cli -ui locally with a Python installation. For Docker deployments, use the Swagger UI at http://localhost:8000/docs for full API access.

Step 4: Create Environment File

nano .env

Add your configuration:

# OpenAI API Key (required)
LLM_API_KEY=sk-your-openai-api-key-here

# LLM Configuration
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini

# Embedding Configuration
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=openai/text-embedding-3-small
EMBEDDING_DIMENSIONS=1536

# Database Configuration
DB_PASSWORD=your-secure-database-password

# Graph Database Provider (kuzu, neo4j, or falkordb)
GRAPH_DATABASE_PROVIDER=kuzu

Step 5: Start Cognee

# Start services
docker compose up -d

# View logs
docker compose logs -f

# Check status
docker compose ps
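
Before putting a reverse proxy in front, confirm both services respond on the ports published above:

# Cognee API (published on 8000)
curl http://localhost:8000/health

# MCP server (published on 8001)
curl http://localhost:8001/health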

Step 6: Set Up Reverse Proxy with Nginx

For production with custom domains and SSL:

sudo apt install nginx certbot python3-certbot-nginx -y

Create Nginx configuration:

sudo nano /etc/nginx/sites-available/cognee

Paste:

# Cognee API
server {
    listen 80;
    server_name cognee.yourdomain.com;

    location / {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300;
        proxy_connect_timeout 300;
        proxy_send_timeout 300;
    }
}

# MCP Server
server {
    listen 80;
    server_name mcp.yourdomain.com;

    location / {
        proxy_pass http://localhost:8001;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300;
        proxy_connect_timeout 300;
        proxy_send_timeout 300;
    }
}

Enable and get SSL:

# Enable site
sudo ln -s /etc/nginx/sites-available/cognee /etc/nginx/sites-enabled/

# Test configuration
sudo nginx -t

# Reload Nginx
sudo systemctl reload nginx

# Get SSL certificates
sudo certbot --nginx -d cognee.yourdomain.com -d mcp.yourdomain.com
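
With Nginx terminating TLS, you may also want to stop the app ports from being reachable directly from the internet. One simple approach (a tweak to the compose file above, not part of the original) is to bind the published ports to localhost so only Nginx on the same host can reach them:

    ports:
      - "127.0.0.1:8000:8000"   # cognee: reachable only via Nginx on this host

Apply the same pattern to the MCP service’s 8001 mapping, then run docker compose up -d to recreate the containers.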

Option 3: MCP Server Only (Without Full API)

If you only need the MCP server for AI coding assistants like Cursor or Claude Code, you can run the MCP server standalone without the full Cognee API stack. This is a lightweight option perfect for personal development environments.

When to Use MCP-Only Mode

The standalone MCP server is ideal when:

  • You only need AI assistant memory features (not the full REST API)
  • You want a minimal, single-container deployment
  • You’re using it for personal development, not shared team knowledge graphs
  • You want quick setup without managing PostgreSQL

Each MCP instance maintains its own separate data in this mode.

Quick Start with Docker

# Set your API key
export LLM_API_KEY=your_openai_api_key_here

# Create env file
echo "LLM_API_KEY=$LLM_API_KEY" > .env

# Start MCP server
docker run -e TRANSPORT_MODE=http --env-file ./.env -p 8000:8000 --rm -it cognee/cognee-mcp:main

Verify the Server

curl http://localhost:8000/health

Connect to AI Clients

Once running, connect your AI coding assistant:

Cursor IDE:

{
  "mcpServers": {
    "cognee": {
      "url": "http://localhost:8000/mcp"
    }
  }
}

Claude Code:

claude mcp add --transport http cognee http://localhost:8000/mcp -s project

Docker Compose for MCP-Only (with Persistence)

For persistent storage with the standalone MCP server:

services:
  cognee-mcp:
    image: cognee/cognee-mcp:main
    container_name: cognee-mcp
    environment:
      - TRANSPORT_MODE=http
      - LLM_API_KEY=${LLM_API_KEY}
      - LLM_PROVIDER=${LLM_PROVIDER:-openai}
      - LLM_MODEL=${LLM_MODEL:-gpt-4o-mini}
      - LOG_LEVEL=INFO
    volumes:
      - cognee-mcp-data:/app/.cognee_system
    ports:
      - "8000:8000"
    restart: unless-stopped

volumes:
  cognee-mcp-data:

Create .env file:

LLM_API_KEY=sk-your-openai-api-key-here
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini

Start with:

docker compose up -d

Standalone vs API Mode

Standalone Mode (shown above): Each MCP instance has its own database. Data is not shared between instances.

API Mode (Options 1 & 2): Multiple MCP clients connect to a shared Cognee backend with centralized PostgreSQL storage. Use this for team collaboration or when you need the full REST API.

Configuration Options

Cognee is highly configurable. Here are the key options you can customize:

LLM Providers

Cognee supports multiple LLM providers. Update the environment variables accordingly:

OpenAI (Default):

LLM_PROVIDER=openai
LLM_MODEL=gpt-4o-mini
LLM_API_KEY=sk-your-key

Anthropic Claude:

LLM_PROVIDER=anthropic
LLM_MODEL=claude-3-5-sonnet-20241022
LLM_API_KEY=sk-ant-your-key

Google Gemini:

LLM_PROVIDER=gemini
LLM_MODEL=gemini/gemini-2.0-flash
LLM_API_KEY=AIza-your-key

Ollama (Local):

LLM_PROVIDER=ollama
LLM_MODEL=llama3.1:8b
LLM_ENDPOINT=http://host.docker.internal:11434/v1
LLM_API_KEY=ollama
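
One caveat for the Ollama example: on Linux, host.docker.internal doesn’t resolve inside containers by default. If the endpoint above can’t be reached, add a host-gateway mapping to the cognee service in your compose file (supported since Docker 20.10):

    extra_hosts:
      - "host.docker.internal:host-gateway"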

Embedding Providers

Configure embedding models for vector search:

OpenAI (Default):

EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=openai/text-embedding-3-small
EMBEDDING_DIMENSIONS=1536

OpenAI Large (Better Quality):

EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=openai/text-embedding-3-large
EMBEDDING_DIMENSIONS=3072

Google Gemini:

EMBEDDING_PROVIDER=gemini
EMBEDDING_MODEL=gemini/text-embedding-004
EMBEDDING_DIMENSIONS=768
EMBEDDING_API_KEY=AIza-your-key

Dimension Consistency

If you change embedding dimensions, you must reset your vector database. The dimensions must match between your embedding provider and vector store configuration. Since we use pgvector (same PostgreSQL), resetting means dropping and recreating the vector tables.
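
If you do switch dimensions, a hedged way to find the affected tables before dropping them (Cognee’s table names depend on your datasets, so list them rather than guessing; use cognee-postgres as the service name for the Dokploy stack):

docker compose exec postgres psql -U cognee -d cognee_db -c \
  "SELECT table_name FROM information_schema.columns WHERE udt_name = 'vector';"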

Graph Database Options

Cognee supports different graph databases for knowledge graph storage:

Kuzu (Default - File-based):

GRAPH_DATABASE_PROVIDER=kuzu

Neo4j (For production/multi-agent):

GRAPH_DATABASE_PROVIDER=neo4j
GRAPH_DATABASE_URL=bolt://neo4j:7687
GRAPH_DATABASE_USERNAME=neo4j
GRAPH_DATABASE_PASSWORD=your-password

To add Neo4j to your Docker Compose:

  neo4j:
    image: neo4j:latest
    container_name: cognee-neo4j
    networks:
      - cognee-network
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      - NEO4J_AUTH=neo4j/your-password
      - NEO4J_PLUGINS=["apoc", "graph-data-science"]
    volumes:
      - neo4j-data:/data

Remember to also declare neo4j-data under the top-level volumes: key, next to the existing entries.

Using the Cognee API

Once deployed, you can interact with Cognee via its REST API. Since authentication is enabled, you’ll need to register and login first.

Accessing Cognee

Cognee provides several ways to interact with it:

  1. Swagger UI - Interactive API documentation at /docs (e.g., https://cognee.yourdomain.com/docs) - Recommended for Docker deployments
  2. REST API - All operations via HTTP endpoints
  3. MCP Integration - Through AI coding assistants like Cursor or Claude
  4. CLI with Web UI - Run cognee-cli -ui locally to launch a full web interface (requires local Python installation)

Note: The Cognee Web UI is currently only available through the CLI (cognee-cli -ui), not via Docker. For Docker deployments, use the Swagger UI at /docs for complete API access.

Check Health

curl https://cognee.yourdomain.com/health

Register a User

curl -X POST "https://cognee.yourdomain.com/api/v1/auth/register" \
  -H "Content-Type: application/json" \
  -d '{"email": "[email protected]", "password": "your-strong-password"}'

Login and Get Token

TOKEN=$(curl -s -X POST "https://cognee.yourdomain.com/api/v1/auth/login" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "[email protected]&password=your-strong-password" | jq -r .access_token)

echo $TOKEN

Create a Dataset

curl -X POST "https://cognee.yourdomain.com/api/v1/datasets" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"name": "my_documents"}'

Add Data

curl -X POST "https://cognee.yourdomain.com/api/v1/add" \
  -H "Authorization: Bearer $TOKEN" \
  -F "data=@/path/to/document.pdf" \
  -F "datasetName=my_documents"

Build Knowledge Graph (Cognify)

curl -X POST "https://cognee.yourdomain.com/api/v1/cognify" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"datasets": ["my_documents"]}'
curl -X POST "https://cognee.yourdomain.com/api/v1/search" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"query": "What are the main topics?", "datasets": ["my_documents"], "top_k": 10}'

View API Documentation (Swagger UI)

The Swagger UI is your main interface for exploring and testing the API:

https://cognee.yourdomain.com/docs

This interactive documentation lets you:

  • Browse all available endpoints
  • Test API calls directly in the browser
  • View request/response schemas
  • Authenticate and manage your session

Using MCP with Self-Hosted Cognee

One of Cognee’s powerful features is its Model Context Protocol (MCP) integration. This allows AI coding assistants like Cursor, Claude Code, and VS Code extensions to use Cognee as persistent memory.

What is MCP?

MCP (Model Context Protocol) is a standard for connecting AI assistants to external tools and data sources. Cognee’s MCP server provides 11 tools including:

  • add: Store documents and data in memory
  • cognify: Transform data into knowledge graphs
  • search: Semantic search across your knowledge
  • codify: Analyze and index code repositories
  • save_interaction: Store conversation context
  • get_developer_rules: Retrieve coding patterns and rules
  • list_datasets: View all stored datasets
  • prune: Clear all memory for a fresh start

MCP is Already Included!

If you followed the Docker Compose configurations above, the MCP server is already running as part of your deployment:

  • Dokploy: Available at https://mcp.yourdomain.com
  • Docker Compose: Available at http://localhost:8001 (or your configured domain)

The MCP server connects to the Cognee backend internally via the Docker network (http://cognee:8000).

Connecting Cursor IDE

  1. Open Cursor Settings → Tools & MCP
  2. Click + Add MCP Server
  3. Add this configuration to mcp.json:

For local development:

{
  "mcpServers": {
    "cognee": {
      "url": "http://localhost:8001/mcp"
    }
  }
}

For your public MCP server:

{
  "mcpServers": {
    "cognee": {
      "url": "https://mcp.yourdomain.com/mcp"
    }
  }
}

  4. Refresh the MCP connection in Cursor
  5. Use Agent mode to access Cognee tools

Connecting Claude Code

# For local development
claude mcp add --transport http cognee http://localhost:8001/mcp -s project

# For remote server
claude mcp add --transport http cognee https://mcp.yourdomain.com/mcp -s project

Using MCP Tools

Once connected, you can ask your AI assistant to:

  • “Add this file to Cognee memory”
  • “Search Cognee for authentication patterns”
  • “Codify this repository to build a knowledge graph”
  • “Save our conversation as developer rules”
  • “List all my Cognee datasets”

The AI will automatically use the appropriate Cognee MCP tools.

MCP Authentication

The MCP server connects to your Cognee backend internally. If you need to authenticate MCP requests to the Cognee API, you can add an API_TOKEN environment variable to the MCP service configuration.
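
A hedged sketch of what that could look like in the compose service definition (treat API_TOKEN as the variable name suggested above and check the cognee-mcp image documentation for the exact name it expects):

  cognee-mcp:
    image: cognee/cognee-mcp:main
    environment:
      - TRANSPORT_MODE=http
      - API_URL=http://cognee:8000
      - API_TOKEN=${COGNEE_API_TOKEN}  # assumption: forwarded to the Cognee API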

Maintenance and Backups

Regular Backups

For Dokploy deployments, configure automated backups through Dokploy’s interface or follow our Dokploy Backups Guide.

For Docker Compose with PostgreSQL:

# Manual backup (includes both relational data AND vector embeddings)
docker compose exec -T postgres pg_dump -U cognee cognee_db > backup-$(date +%Y%m%d).sql

# Restore backup
cat backup-20241127.sql | docker compose exec -T postgres psql -U cognee cognee_db

Automated backup script (backup.sh):

#!/bin/bash
BACKUP_DIR="/backups/cognee"
DATE=$(date +%Y%m%d-%H%M)
mkdir -p $BACKUP_DIR

# Backup PostgreSQL (contains both relational and vector data)
docker compose exec -T postgres pg_dump -U cognee cognee_db | gzip > $BACKUP_DIR/cognee-$DATE.sql.gz

# Keep only last 7 days
find $BACKUP_DIR -name "cognee-*.sql.gz" -mtime +7 -delete

Add to crontab:

chmod +x backup.sh
crontab -e
# Add: 0 2 * * * /path/to/backup.sh
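
To restore from one of the compressed backups the script produces (the filename here is an example):

gunzip -c /backups/cognee/cognee-20241127-0200.sql.gz | \
  docker compose exec -T postgres psql -U cognee cognee_db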

Updating Cognee

With Dokploy:

  1. Go to your service
  2. Click “Redeploy”
  3. Dokploy pulls the latest image

With Docker Compose:

cd ~/cognee
docker compose pull
docker compose up -d

Version Pinning

For production stability, consider pinning to a specific version tag instead of :main:

image: cognee/cognee:v0.1.0
image: cognee/cognee-mcp:v0.1.0

Check the GitHub releases for versions.

Security Best Practices

  • Authentication Enabled: We set REQUIRE_AUTHENTICATION=true - never disable this for public deployments
  • Use HTTPS: Always use SSL/TLS in production (Traefik/Certbot handles this)
  • Strong Passwords: Use complex passwords for database and user accounts
  • Environment Variables: Never commit .env files to version control
  • Firewall: Only expose necessary ports (80, 443)
  • Regular Updates: Keep Cognee and Docker images updated
  • Backup Encryption: Encrypt database backups at rest
  • Network Isolation: Use Docker networks to isolate services
  • Monitor Logs: Set up log monitoring for security events
  • Rate Limiting: Consider adding rate limits via Nginx or Traefik
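
As a starting point for the last item, here’s a minimal Nginx rate-limiting sketch (the zone name and limits are illustrative; tune them for your traffic):

# In the http block (e.g., /etc/nginx/nginx.conf)
limit_req_zone $binary_remote_addr zone=cognee_api:10m rate=10r/s;

# Inside the server block for cognee.yourdomain.com
location / {
    limit_req zone=cognee_api burst=20 nodelay;
    # ... existing proxy_pass configuration ...
}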

Conclusion

Self-hosting Cognee gives you a powerful AI memory platform with complete control over your data and infrastructure. The combination of PostgreSQL with pgvector efficiently handles both relational data and vector embeddings in a single database, while OpenAI provides high-quality embeddings for semantic search.

Key highlights of this setup:

  • Single PostgreSQL instance with pgvector handles both metadata AND vector storage
  • MCP server included for seamless AI assistant integration
  • Authentication enabled for secure public deployment
  • Production-ready with health checks, resource limits, and proper networking

Whether you choose the simplicity of Dokploy or the flexibility of Docker Compose, you can have Cognee running in minutes. The MCP integration makes it particularly powerful for AI-assisted development, allowing your coding assistants to maintain persistent memory across sessions.

Next Steps

  • Explore the Cognee documentation for advanced features
  • Set up automated backups with our Dokploy Backups Guide
  • Try the MCP integration with Cursor or Claude Code
  • Experiment with different LLM and embedding providers
  • Build custom pipelines for your specific use cases
  • View Cognee on GitHub

Have questions about self-hosting Cognee? Drop a comment below!

Frequently Asked Questions

How is Cognee different from a vector database?

While vector databases like Pinecone or Weaviate store embeddings for semantic search, Cognee goes further by building knowledge graphs that capture relationships between entities. This enables more intelligent retrieval that understands context and connections, not just similarity scores.

Cognee uses vector search as one component (via pgvector in our setup) but adds:

  • Entity extraction and relationship mapping
  • Graph-based reasoning
  • Multi-hop queries across related data
  • Automatic summarization and chunking

Why use pgvector instead of a dedicated vector database?

Using pgvector/pgvector:pg17 gives you PostgreSQL with the pgvector extension, which serves both purposes:

Advantages:

  • Single database to manage, backup, and maintain
  • ACID transactions across both relational and vector data
  • Lower resource usage than running separate databases
  • Simpler deployment and networking

When to consider alternatives:

  • Very large vector datasets (billions of vectors)
  • Need for specialized vector search features
  • Already have Qdrant/Weaviate/Pinecone infrastructure

For most self-hosted deployments, pgvector is excellent and simplifies operations significantly.

Do I have to use OpenAI?

OpenAI is the default and easiest option, but Cognee supports multiple providers:

LLM Providers:

  • OpenAI (GPT-4, GPT-4o-mini)
  • Anthropic (Claude)
  • Google Gemini
  • Ollama (local models)
  • Any OpenAI-compatible endpoint

Embedding Providers:

  • OpenAI (text-embedding-3-small/large)
  • Google Gemini
  • Ollama
  • Fastembed (local, CPU-friendly)

You can mix providers - for example, use a local Ollama LLM with OpenAI embeddings.

How much does it cost to self-host Cognee?

Monthly costs (example):

  • VPS with 4GB RAM (Hetzner): $8/month
  • Domain: $1/month
  • OpenAI API usage: Variable ($5-50/month depending on usage)

Total: $15-60/month depending on usage

The main variable cost is LLM/embedding API usage. Using local models with Ollama can reduce this to nearly zero.

Can I use Cognee without the MCP server?

Absolutely! The MCP server is optional. You can remove the cognee-mcp service from the Docker Compose and use Cognee purely as a REST API for:

  • Building RAG applications
  • Creating AI assistants with memory
  • Document analysis and search
  • Knowledge management systems

The MCP integration is specifically useful for AI coding assistants like Cursor and Claude Code.

Which graph database should I choose?

Kuzu (default) works well for single-server deployments and is the easiest to set up (file-based, no additional services).

Neo4j is recommended for:

  • Multi-agent deployments (concurrent access)
  • Large-scale knowledge graphs
  • When you need Neo4j’s visualization tools
  • Enterprise features and support

FalkorDB is a good middle ground offering both graph and vector capabilities.

Start with Kuzu and migrate to Neo4j if you need more scalability.

How do I completely reset Cognee?

To completely reset Cognee:

# Stop services and remove all named volumes
# (WARNING: deletes all data, including vector embeddings and the Kuzu graph)
docker compose down -v

# Start fresh
docker compose up -d

For a soft reset (keep user data but clear knowledge graphs), use the Cognee API:

curl -X POST "https://cognee.yourdomain.com/api/v1/prune" \
  -H "Authorization: Bearer $TOKEN"

Can I migrate data from another vector database?

Cognee doesn’t directly import from other vector databases, but you can:

  1. Export your documents from the source system
  2. Re-ingest them into Cognee using the /add endpoint
  3. Run cognify to build the knowledge graph

The knowledge graph structure Cognee creates is different from raw vector embeddings, so re-processing is typically the best approach anyway.
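
A hedged sketch of that re-ingest loop using the endpoints shown earlier (assumes your exported files sit in ./export, $TOKEN comes from the login step, and migrated_docs is just an example dataset name):

# Re-ingest exported documents one by one
for f in ./export/*; do
  curl -X POST "https://cognee.yourdomain.com/api/v1/add" \
    -H "Authorization: Bearer $TOKEN" \
    -F "data=@$f" \
    -F "datasetName=migrated_docs"
done

# Rebuild the knowledge graph from the new dataset
curl -X POST "https://cognee.yourdomain.com/api/v1/cognify" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"datasets": ["migrated_docs"]}'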

How do I monitor Cognee in production?

Cognee provides several monitoring options:

  1. Health endpoint: GET /health for basic liveness checks
  2. Detailed health: GET /health/detailed for component status
  3. Logs: Container logs show processing status and errors
  4. Dataset status: GET /api/v1/datasets/{id}/status for processing state

For production monitoring:

  • Set up uptime monitoring (UptimeRobot, Pingdom)
  • Configure log aggregation (Loki, ELK)
  • Monitor PostgreSQL metrics
  • Track API response times
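
For a bare-bones uptime check you could run from cron alongside the options above (the mail command and recipient are placeholders; swap in your own alerting):

#!/bin/bash
# Alert when the health endpoint stops responding
if ! curl -fsS --max-time 10 https://cognee.yourdomain.com/health > /dev/null; then
  echo "Cognee health check failed at $(date)" | mail -s "Cognee down" [email protected]
fi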
