Ollama
Ollama is an open-source tool that allows users to run large language models (LLMs) locally on their own computers. To use Ollama, install it from the Ollama website, then download the model you want to run with the `ollama run` command.
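For example, the models recommended in the sections below can be downloaded and started from the command line (this assumes Ollama is already installed and its local server is running):

```shell
# Download the recommended models (a one-time step; each is a few GB)
ollama pull llama3.1:8b              # chat
ollama pull qwen2.5-coder:1.5b-base  # autocomplete
ollama pull nomic-embed-text         # embeddings

# Start an interactive chat session with the chat model
ollama run llama3.1:8b
```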
Chat model
We recommend configuring Llama3.1 8B as your chat model.
Package or config.yaml
```yaml
models:
  - name: Llama3.1 8B
    provider: ollama
    model: llama3.1:8b
```
config.json
```json
{
  "models": [
    {
      "title": "Llama3.1 8B",
      "provider": "ollama",
      "model": "llama3.1:8b"
    }
  ]
}
```
Autocomplete model
We recommend configuring Qwen2.5-Coder 1.5B as your autocomplete model.
Package or config.yaml
```yaml
models:
  - name: Qwen2.5-Coder 1.5B
    provider: ollama
    model: qwen2.5-coder:1.5b-base
    roles:
      - autocomplete
```
config.json
```json
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5-Coder 1.5B",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b-base"
  }
}
```
Embeddings model
We recommend configuring Nomic Embed Text as your embeddings model.
Package or config.yaml
```yaml
models:
  - name: Nomic Embed Text
    provider: ollama
    model: nomic-embed-text
    roles:
      - embed
```
config.json
```json
{
  "embeddingsProvider": {
    "provider": "ollama",
    "model": "nomic-embed-text"
  }
}
```
Reranking model
Ollama currently does not offer any reranking models.
See the model providers overview for a list of reranking model providers.
Using a remote instance
To configure a remote instance of Ollama, add the `apiBase` property to your model configuration:
Package or config.yaml
```yaml
models:
  - name: Llama3.1 8B
    provider: ollama
    model: llama3.1:8b
    apiBase: http://<my endpoint>:11434
```
config.json
```json
{
  "models": [
    {
      "title": "Llama3.1 8B",
      "provider": "ollama",
      "model": "llama3.1:8b",
      "apiBase": "http://<my endpoint>:11434"
    }
  ]
}
```
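To check that the remote instance is reachable before pointing your configuration at it, you can query Ollama's `/api/tags` route, which lists the models installed on that server (replace `<my endpoint>` with your host):

```shell
# Lists the models available on the remote Ollama server as JSON
curl http://<my endpoint>:11434/api/tags
```

Note that by default Ollama only listens on localhost; the remote machine typically needs the `OLLAMA_HOST=0.0.0.0` environment variable set so the server accepts connections from other hosts.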