Embeddings

Supported embedding providers and models for vector search.

bigRAG supports multiple embedding providers. Each collection can use a different provider and model.

Supported Models

Provider  Model                          Dimensions  Notes
OpenAI    text-embedding-3-small         1536        Default model
OpenAI    text-embedding-3-large         3072        Best quality (OpenAI)
Cohere    embed-english-v3.0             1024        English-optimized
Cohere    embed-multilingual-v3.0        1024        Multilingual support
Cohere    embed-english-light-v3.0       384         Lightweight English model
Cohere    embed-multilingual-light-v3.0  384         Lightweight multilingual model

Listing Available Models

curl http://localhost:6100/v1/embeddings/models \
  -H "Authorization: Bearer $BIGRAG_API_SECRET"

Returns all available embedding models with their dimensions and descriptions.
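
The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_models_request` and `list_embedding_models` are hypothetical, and the response body is returned as parsed JSON without assuming any field layout, since the response schema is not shown here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6100"  # same host and port as the curl example


def build_models_request(api_secret, base_url=BASE_URL):
    """Build the GET request for the embedding models endpoint (hypothetical helper)."""
    return urllib.request.Request(
        f"{base_url}/v1/embeddings/models",
        headers={"Authorization": f"Bearer {api_secret}"},
        method="GET",
    )


def list_embedding_models(api_secret, base_url=BASE_URL):
    """Send the request and return the decoded JSON body as-is."""
    request = build_models_request(api_secret, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running server, calling `list_embedding_models(os.environ["BIGRAG_API_SECRET"])` (after `import os`) should return whatever the endpoint responds with.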

Configuring per Collection

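Embedding configuration is set when creating a collection: the request body names the provider, the model, the provider API key, and the vector dimension. The dimension should match the value listed for the model under Supported Models.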

OpenAI

curl -X POST http://localhost:6100/v1/collections \
  -H "Authorization: Bearer $BIGRAG_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "openai_collection",
    "embedding_provider": "openai",
    "embedding_model": "text-embedding-3-small",
    "embedding_api_key": "sk-...",
    "dimension": 1536
  }'

Cohere

curl -X POST http://localhost:6100/v1/collections \
  -H "Authorization: Bearer $BIGRAG_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "cohere_collection",
    "embedding_provider": "cohere",
    "embedding_model": "embed-english-v3.0",
    "embedding_api_key": "your-cohere-api-key",
    "dimension": 1024
  }'
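
Because the dimension has to match the chosen model, it can help to derive it rather than type it by hand. The following is a minimal sketch, assuming the field names from the curl examples above and the dimensions from the Supported Models table. `collection_payload` and `MODEL_DIMENSIONS` are hypothetical names, not part of bigRAG.

```python
import json

# Dimensions copied from the Supported Models table.
MODEL_DIMENSIONS = {
    ("openai", "text-embedding-3-small"): 1536,
    ("openai", "text-embedding-3-large"): 3072,
    ("cohere", "embed-english-v3.0"): 1024,
    ("cohere", "embed-multilingual-v3.0"): 1024,
    ("cohere", "embed-english-light-v3.0"): 384,
    ("cohere", "embed-multilingual-light-v3.0"): 384,
}


def collection_payload(name, provider, model, api_key):
    """Build the JSON body for POST /v1/collections (hypothetical helper)."""
    try:
        dimension = MODEL_DIMENSIONS[(provider, model)]
    except KeyError:
        raise ValueError(f"unknown provider/model pair: {provider}/{model}") from None
    return {
        "name": name,
        "embedding_provider": provider,
        "embedding_model": model,
        "embedding_api_key": api_key,
        "dimension": dimension,
    }


body = collection_payload(
    "cohere_collection", "cohere", "embed-english-v3.0", "your-cohere-api-key"
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with `-d`, and an unknown provider/model pair fails early with a `ValueError` instead of producing a collection request with a mismatched dimension.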

How Embedding Works

  1. When a document is processed, its text is split into chunks
  2. Each chunk is sent to the configured embedding provider's API
  3. The resulting vector is stored in Milvus alongside the chunk text
  4. Queries are embedded using the same model for consistent similarity comparison
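
The four steps above can be illustrated with a toy, in-memory version. This is a sketch of the general flow, not bigRAG's implementation: `toy_embed` is a hash-based stand-in for the provider API, a plain Python list stands in for Milvus, and the chunk size and all function names are assumptions made for illustration.

```python
import hashlib
import math


def split_into_chunks(text, chunk_size=40):
    """Step 1: split text into chunks of at most chunk_size characters, on word boundaries."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > chunk_size and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def toy_embed(text, dimension=16):
    """Step 2 stand-in: hash each word into a fixed-size, L2-normalized vector."""
    vec = [0.0] * dimension
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dimension] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def index_document(text, store):
    """Step 3: store each vector alongside its chunk text."""
    for chunk in split_into_chunks(text):
        store.append((toy_embed(chunk), chunk))


def search(query, store, top_k=1):
    """Step 4: embed the query with the same function and rank by cosine similarity."""
    q = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for vec, chunk in store]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


store = []
index_document(
    "Embeddings map text to vectors. Similar text produces nearby vectors in the space.",
    store,
)
print(search(store[0][1], store))
```

Embedding the query with a different function than the one used at indexing time would place it in an unrelated vector space, which is why step 4 reuses the same model.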

Concurrency

A semaphore caps the number of embedding requests in flight at once, to avoid overwhelming the provider API. Configure the concurrency limit:

BIGRAG_EMBEDDING_CONCURRENCY=8  # Max concurrent embedding requests (default: 8)
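
As an illustration of how a semaphore bounds concurrency, here is a minimal sketch using `asyncio.Semaphore`. It is not bigRAG's code: the provider call is simulated with a short sleep, and `embed_chunk` and `embed_all` are hypothetical names. Only the environment variable name and its default of 8 come from the setting above.

```python
import asyncio
import os

# Same variable as above; the fallback mirrors the documented default.
CONCURRENCY = int(os.environ.get("BIGRAG_EMBEDDING_CONCURRENCY", "8"))


async def embed_chunk(chunk, semaphore, stats):
    """Embed one chunk while holding a semaphore slot (provider call is simulated)."""
    async with semaphore:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # stand-in for the provider API round trip
        stats["in_flight"] -= 1
        return [float(len(chunk))]  # placeholder vector


async def embed_all(chunks, limit=CONCURRENCY):
    """Embed all chunks, never running more than `limit` requests at once."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"in_flight": 0, "peak": 0}
    vectors = await asyncio.gather(*(embed_chunk(c, semaphore, stats) for c in chunks))
    return vectors, stats["peak"]


vectors, peak = asyncio.run(embed_all([f"chunk {i}" for i in range(20)], limit=4))
print(len(vectors), peak)
```

All 20 chunks are embedded, but the recorded peak never exceeds the limit passed in. A semaphore bounds how many requests run concurrently, not how many are sent per second, so provider rate limits may still need separate handling.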

The embedding model and dimension cannot be changed after a collection is created. All documents in a collection must use the same embedding model for consistent vector search.
