Embeddings

bigRAG supports multiple embedding providers. Each collection can use a different provider and model.

Supported Models

Provider	Model	Dimensions	Notes
OpenAI	`text-embedding-3-small`	1536	Default model
OpenAI	`text-embedding-3-large`	3072	Best quality (OpenAI)
Cohere	`embed-english-v3.0`	1024	English-optimized
Cohere	`embed-multilingual-v3.0`	1024	Multilingual support
Cohere	`embed-english-light-v3.0`	384	Lightweight English model
Cohere	`embed-multilingual-light-v3.0`	384	Lightweight multilingual model
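For client-side checks, the table can be mirrored as a small lookup. The sketch below is illustrative only: `MODEL_DIMENSIONS` and `expected_dimension` are hypothetical names, not part of bigRAG, and the values are copied from the table above. The models endpoint described in the next section remains the authoritative source.

```python
# Dimensions copied from the Supported Models table.
# MODEL_DIMENSIONS and expected_dimension are hypothetical helpers, not bigRAG APIs.
MODEL_DIMENSIONS = {
    ("openai", "text-embedding-3-small"): 1536,
    ("openai", "text-embedding-3-large"): 3072,
    ("cohere", "embed-english-v3.0"): 1024,
    ("cohere", "embed-multilingual-v3.0"): 1024,
    ("cohere", "embed-english-light-v3.0"): 384,
    ("cohere", "embed-multilingual-light-v3.0"): 384,
}


def expected_dimension(provider: str, model: str) -> int:
    """Return the documented dimension for a provider/model pair."""
    try:
        return MODEL_DIMENSIONS[(provider.lower(), model)]
    except KeyError:
        # Suppress the KeyError context so callers see one clear error.
        raise ValueError(f"unknown embedding model: {provider}/{model}") from None
```

A caller could compare `expected_dimension("cohere", "embed-english-v3.0")` against the `dimension` it is about to send and fail early on a mismatch.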

Listing Available Models

curl http://localhost:6100/v1/embeddings/models \
  -H "Authorization: Bearer $BIGRAG_API_SECRET"

Returns all available embedding models with their dimensions and descriptions.

Configuring per Collection

The embedding provider, model, API key, and vector dimension are set when a collection is created. In the examples below, `dimension` matches the model's dimension from the Supported Models table.

OpenAI

curl -X POST http://localhost:6100/v1/collections \
  -H "Authorization: Bearer $BIGRAG_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "openai_collection",
    "embedding_provider": "openai",
    "embedding_model": "text-embedding-3-small",
    "embedding_api_key": "sk-...",
    "dimension": 1536
  }'

Cohere

curl -X POST http://localhost:6100/v1/collections \
  -H "Authorization: Bearer $BIGRAG_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "cohere_collection",
    "embedding_provider": "cohere",
    "embedding_model": "embed-english-v3.0",
    "embedding_api_key": "your-cohere-api-key",
    "dimension": 1024
  }'
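The same request can be issued from a script. The following is a minimal sketch using only the Python standard library. The endpoint, headers, and JSON field names are taken from the curl examples above, while the function names `build_create_collection_request` and `create_collection` are hypothetical. The response body is printed as-is because its format is not described here.

```python
import json
import urllib.request


def build_create_collection_request(base_url, api_secret, name, provider,
                                    model, provider_api_key, dimension):
    """Build (but do not send) the POST /v1/collections request."""
    payload = {
        "name": name,
        "embedding_provider": provider,
        "embedding_model": model,
        "embedding_api_key": provider_api_key,
        "dimension": dimension,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/collections",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_secret}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_collection(request):
    """Send the request and print the raw response (format not assumed)."""
    with urllib.request.urlopen(request) as response:
        print(response.status, response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect or log (with the API key redacted) before anything goes over the network.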

How Embedding Works

1. When a document is processed, its text is split into chunks.
2. Each chunk is sent to the configured embedding provider's API.
3. The resulting vector is stored in Milvus alongside the chunk text.
4. Queries are embedded using the same model for consistent similarity comparison.
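The flow above can be illustrated with a toy, self-contained sketch. Nothing here is bigRAG's actual implementation: `toy_embed` is a hashed bag-of-words stand-in for the provider API call, `InMemoryIndex` stands in for Milvus, and the word-count chunking is an assumption made purely for illustration. What it does show is why queries must go through the same embedding function as the stored chunks.

```python
import hashlib
import math


def split_into_chunks(text, max_words=50):
    """Step 1: split text into fixed-size word chunks (assumed strategy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def toy_embed(text, dimension=256):
    """Step 2 stand-in: hashed bag-of-words vector, L2-normalized."""
    vector = [0.0] * dimension
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % dimension] += 1.0
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]


def cosine_similarity(a, b):
    """Dot product; equals cosine similarity because inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


class InMemoryIndex:
    """Step 3 stand-in for Milvus: stores (vector, chunk text) pairs."""

    def __init__(self, embed):
        self.embed = embed
        self.rows = []

    def ingest(self, document, max_words=50):
        for chunk in split_into_chunks(document, max_words):
            self.rows.append((self.embed(chunk), chunk))

    def search(self, query, top_k=1):
        # Step 4: embed the query with the SAME function used at ingest time.
        query_vector = self.embed(query)
        scored = [(cosine_similarity(query_vector, vector), chunk)
                  for vector, chunk in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]
```

If `search` used a different embedding function than `ingest`, the two sets of vectors would live in unrelated spaces and the similarity scores would be meaningless.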

Concurrency

Embedding requests are gated by a semaphore that caps how many run at the same time, so the provider API is not overwhelmed. Configure the limit with `BIGRAG_EMBEDDING_CONCURRENCY`:

BIGRAG_EMBEDDING_CONCURRENCY=8  # Max concurrent embedding requests (default: 8)
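As a rough illustration of semaphore-based throttling, the sketch below caps in-flight calls with `asyncio.Semaphore`. It is not bigRAG's code: `embed_all` and `fake_embed` are hypothetical names, and reading the limit from the environment with a fallback of 8 simply mirrors the setting above.

```python
import asyncio
import os

# Mirrors the setting above; the fallback of 8 matches the documented default.
DEFAULT_LIMIT = int(os.environ.get("BIGRAG_EMBEDDING_CONCURRENCY", "8"))


async def fake_embed(chunk):
    """Hypothetical stand-in for a provider API call."""
    await asyncio.sleep(0.01)
    return [float(len(chunk))]


async def embed_all(chunks, embed=fake_embed, limit=DEFAULT_LIMIT):
    """Embed every chunk with at most `limit` calls in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(chunk):
        # Tasks beyond the limit wait here until a slot is released.
        async with semaphore:
            return await embed(chunk)

    # gather() returns results in the same order as the input chunks.
    return await asyncio.gather(*(guarded(chunk) for chunk in chunks))
```

A semaphore bounds concurrency rather than requests per second, so a provider with a strict per-minute quota may still need a lower limit or separate retry handling.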

The embedding model and dimension cannot be changed after a collection is created. All documents in a collection must use the same embedding model, because vectors produced by different models are not comparable in a similarity search.