

CORE uses embedding models to convert text into vector representations for semantic search, memory retrieval, and knowledge graph operations. Vectors are stored in pgvector with a fixed dimension: all embeddings must match the configured EMBEDDING_MODEL_SIZE.
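A minimal sketch of what the fixed-dimension constraint means in practice (the table and column names here are illustrative, not CORE's actual schema): the pgvector column is declared once with a dimension, and every stored embedding must match it.

```python
# Illustrative sketch of the fixed-dimension constraint.
# Table/column names are assumptions, not CORE's actual schema.
EMBEDDING_MODEL_SIZE = 1536

# A pgvector column is declared with a fixed dimension:
CREATE_TABLE_SQL = f"""
CREATE TABLE memory_vectors (
    id BIGSERIAL PRIMARY KEY,
    embedding vector({EMBEDDING_MODEL_SIZE})
);
"""

def check_dimension(embedding: list[float]) -> list[float]:
    """Reject vectors that don't match the configured dimension."""
    if len(embedding) != EMBEDDING_MODEL_SIZE:
        raise ValueError(
            f"expected {EMBEDDING_MODEL_SIZE} dims, got {len(embedding)}"
        )
    return embedding
```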

Configuration

EMBEDDINGS_PROVIDER=openai          # openai | google | ollama
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_MODEL_SIZE=1536           # must match pgvector column dimension
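As a rough sketch of how these variables could be consumed (this is illustrative; CORE's internal config loader may differ), a loader reads the three settings, applies the defaults shown above, and rejects unknown providers:

```python
import os

# Illustrative config loader; defaults mirror the example above.
def load_embedding_config(env=os.environ):
    provider = env.get("EMBEDDINGS_PROVIDER", "openai")
    if provider not in {"openai", "google", "ollama"}:
        raise ValueError(f"unsupported provider: {provider}")
    return {
        "provider": provider,
        "model": env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        "size": int(env.get("EMBEDDING_MODEL_SIZE", "1536")),
    }
```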

Providers

OpenAI

Requires OPENAI_API_KEY. Works with direct API or proxies via OPENAI_BASE_URL.
| Model | Dimensions | Notes |
| --- | --- | --- |
| text-embedding-3-small | 1536 | Default. Good balance of cost and quality |
| text-embedding-3-large | 3072 | Higher quality, higher cost |
| text-embedding-ada-002 | 1536 | Legacy |

EMBEDDINGS_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_MODEL_SIZE=1536
OPENAI_API_KEY=sk-...
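To illustrate how OPENAI_BASE_URL enables proxies, here is a hedged sketch of building a request for OpenAI's /v1/embeddings endpoint. The helper name is hypothetical; only the endpoint path and body fields reflect OpenAI's documented API.

```python
import os

def openai_embed_request(text: str, env=os.environ) -> dict:
    """Sketch: build a request for OpenAI's /v1/embeddings endpoint.
    OPENAI_BASE_URL lets a proxy stand in for the direct API."""
    base = env.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    return {
        "url": f"{base}/embeddings",
        "headers": {"Authorization": f"Bearer {env['OPENAI_API_KEY']}"},
        "body": {
            "model": env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
            "input": text,
        },
    }
```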

Google Gemini

Requires GOOGLE_GENERATIVE_AI_API_KEY. Get a free key from Google AI Studio.
| Model | Default Dimensions | Configurable | Notes |
| --- | --- | --- | --- |
| text-embedding-004 | 768 | No | Stable, production-ready |
| gemini-embedding-001 | 3072 | Yes (256, 512, 768, 1024, 3072) | Higher quality |
| gemini-embedding-2-preview | 3072 | Yes (256, 512, 768, 1024, 3072) | Latest, best quality |

For models with configurable dimensions, EMBEDDING_MODEL_SIZE controls the output dimensionality. This lets you match existing pgvector columns without changing the schema.
EMBEDDINGS_PROVIDER=google
EMBEDDING_MODEL=gemini-embedding-2-preview
EMBEDDING_MODEL_SIZE=1024
GOOGLE_GENERATIVE_AI_API_KEY=AIza...
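As a hedged sketch of how the configurable dimension reaches the provider: the Gemini embedContent REST API accepts an output-dimensionality field on the request, so EMBEDDING_MODEL_SIZE can be passed through directly. The helper below is illustrative, and the exact payload shape may vary by API version.

```python
# Illustrative request builder; field names follow the Gemini REST API,
# but verify against the current API reference before relying on them.
def gemini_embed_request(text: str,
                         model: str = "gemini-embedding-2-preview",
                         size: int = 1024) -> dict:
    """Sketch: EMBEDDING_MODEL_SIZE flows into outputDimensionality."""
    return {
        "model": f"models/{model}",
        "content": {"parts": [{"text": text}]},
        "outputDimensionality": size,
    }
```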

Ollama (Self-Hosted)

Requires a running Ollama instance. No API key needed: fully local and private.
| Model | Dimensions | Notes |
| --- | --- | --- |
| mxbai-embed-large | 1024 | Default for Ollama. Good general-purpose |
| nomic-embed-text | 768 | Lightweight |
| all-minilm | 384 | Smallest, fastest |
| snowflake-arctic-embed | 1024 | High quality for retrieval |
| bge-large | 1024 | Strong multilingual support |

Pull the model first:
ollama pull mxbai-embed-large
EMBEDDINGS_PROVIDER=ollama
EMBEDDING_MODEL=mxbai-embed-large
EMBEDDING_MODEL_SIZE=1024
OLLAMA_URL=http://ollama:11434
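For reference, a hedged sketch of the request that a client would send to the configured OLLAMA_URL. Ollama's embeddings endpoint is /api/embeddings with a model and prompt in the body; the helper name itself is hypothetical.

```python
# Illustrative request builder for Ollama's /api/embeddings endpoint.
def ollama_embed_request(text: str,
                         model: str = "mxbai-embed-large",
                         base_url: str = "http://ollama:11434") -> dict:
    """Sketch: build the POST request for a local Ollama instance.
    No API key is needed; everything stays on your own host."""
    return {
        "url": f"{base_url}/api/embeddings",
        "body": {"model": model, "prompt": text},
    }
```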

Choosing a Model

| Use Case | Recommended Model | Provider | Dimensions |
| --- | --- | --- | --- |
| Best quality (cloud) | gemini-embedding-2-preview | Google | 3072 |
| Low cost (cloud) | text-embedding-004 | Google | 768 |
| Balanced (cloud) | text-embedding-3-small | OpenAI | 1536 |
| Fully local / air-gapped | mxbai-embed-large | Ollama | 1024 |
| Minimum resources | all-minilm | Ollama | 384 |

Dimension Mismatch Handling

pgvector columns are created with a fixed dimension. If the embedding model returns vectors that don’t match EMBEDDING_MODEL_SIZE:
  • Too short: padded with zeros. Works but degrades retrieval quality. A warning is logged.
  • Too long: fails with an error. Update EMBEDDING_MODEL_SIZE and re-embed.
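The two cases above can be sketched as a small policy function (illustrative only, not CORE's actual code): pad-and-warn when the vector is short, raise when it is too long.

```python
import logging

def fit_to_dimension(vec: list[float], expected: int,
                     log=logging.getLogger(__name__)) -> list[float]:
    """Sketch of the mismatch policy described above."""
    if len(vec) < expected:
        # Too short: pad with zeros and warn (retrieval quality degrades).
        log.warning("padding embedding from %d to %d dims", len(vec), expected)
        return vec + [0.0] * (expected - len(vec))
    if len(vec) > expected:
        # Too long: hard error; the config must be fixed and content re-embedded.
        raise ValueError(
            f"embedding has {len(vec)} dims but EMBEDDING_MODEL_SIZE is "
            f"{expected}; update EMBEDDING_MODEL_SIZE and re-embed"
        )
    return vec
```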

Switching Embedding Models

Changing models requires re-embedding all existing vectors since different models produce incompatible vector spaces.
  1. Update EMBEDDINGS_PROVIDER, EMBEDDING_MODEL, and EMBEDDING_MODEL_SIZE
  2. If the dimension changed, update your pgvector column dimension
  3. Re-embed all existing content
Google models with configurable dimensions (gemini-embedding-001, gemini-embedding-2-preview) can output at your existing dimension size, avoiding step 2. Re-embedding is still needed since vector spaces differ between models.
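The re-embedding step can be sketched as a simple migration loop. Everything here is hypothetical scaffolding (rows, embed_fn, store_fn are stand-ins, not CORE APIs); it only illustrates the shape of step 3 with a dimension check per row.

```python
# Hypothetical migration loop for step 3; embed_fn and store_fn are
# illustrative stand-ins, not CORE functions.
def reembed_all(rows, embed_fn, store_fn, expected_size: int) -> int:
    """Re-embed every row with the new model, validating dimensions.

    Returns the number of rows re-embedded."""
    count = 0
    for row in rows:
        vec = embed_fn(row["text"])
        if len(vec) != expected_size:
            raise ValueError(f"row {row['id']}: got {len(vec)} dims, "
                             f"expected {expected_size}")
        store_fn(row["id"], vec)
        count += 1
    return count
```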