Diffstat (limited to 'docs')
-rw-r--r--  docs/docs/02-installation/01-docker.md                   | 14
-rw-r--r--  docs/docs/02-installation/04-kubernetes.md               | 16
-rw-r--r--  docs/docs/03-configuration/02-different-ai-providers.md  | 26
3 files changed, 29 insertions, 27 deletions
diff --git a/docs/docs/02-installation/01-docker.md b/docs/docs/02-installation/01-docker.md
index 02dceac3..e2fb38d6 100644
--- a/docs/docs/02-installation/01-docker.md
+++ b/docs/docs/02-installation/01-docker.md
@@ -59,19 +59,7 @@ OPENAI_API_KEY=<key>
 
 Learn more about the costs of using openai [here](../administration/openai).
 
-<details>
-  <summary>If you want to use Ollama (https://ollama.com/) instead for local inference.</summary>
-
-  **Note:** The quality of the tags you'll get will depend on the quality of the model you choose.
-
-  - Make sure ollama is running.
-  - Set the `OLLAMA_BASE_URL` env variable to the address of the ollama API.
-  - Set `INFERENCE_TEXT_MODEL` to the model you want to use for text inference in ollama (for example: `llama3.1`)
-  - Set `INFERENCE_IMAGE_MODEL` to the model you want to use for image inference in ollama (for example: `llava`)
-  - Make sure that you `ollama pull`-ed the models that you want to use.
-  - You might want to tune the `INFERENCE_CONTEXT_LENGTH` as the default is quite small. The larger the value, the better the quality of the tags, but the more expensive the inference will be.
-
-</details>
+If you want to use a different AI provider (e.g. Ollama for local inference), check out the [different AI providers](../configuration/different-ai-providers) guide.
 
 ### 5. Start the service
diff --git a/docs/docs/02-installation/04-kubernetes.md b/docs/docs/02-installation/04-kubernetes.md
index 8cdc96b6..9014cf22 100644
--- a/docs/docs/02-installation/04-kubernetes.md
+++ b/docs/docs/02-installation/04-kubernetes.md
@@ -22,7 +22,7 @@ To see all available configuration options check the [documentation](../configur
 
 To configure the neccessary secrets for the application copy the `.secrets_sample` file to `.secrets` and change the sample secrets to your generated secrets.
 
-> Note: You **should** change the random strings. You can use `openssl rand -base64 36` to generate the random strings.
+> Note: You **should** change the random strings. You can use `openssl rand -base64 36` to generate the random strings.
 
 ### 3. Setup OpenAI
@@ -37,19 +37,7 @@ OPENAI_API_KEY=<key>
 
 Learn more about the costs of using openai [here](../administration/openai).
 
-<details>
-  <summary>[EXPERIMENTAL] If you want to use Ollama (https://ollama.com/) instead for local inference.</summary>
-
-  **Note:** The quality of the tags you'll get will depend on the quality of the model you choose. Running local models is a recent addition and not as battle tested as using openai, so proceed with care (and potentially expect a bunch of inference failures).
-
-  - Make sure ollama is running.
-  - Set the `OLLAMA_BASE_URL` env variable to the address of the ollama API.
-  - Set `INFERENCE_TEXT_MODEL` to the model you want to use for text inference in ollama (for example: `mistral`)
-  - Set `INFERENCE_IMAGE_MODEL` to the model you want to use for image inference in ollama (for example: `llava`)
-  - Make sure that you `ollama pull`-ed the models that you want to use.
-
-
-</details>
+If you want to use a different AI provider (e.g. Ollama for local inference), check out the [different AI providers](../configuration/different-ai-providers) guide.
 
 ### 4. Deploy the service
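The `<details>` blocks removed above told users to `ollama pull` their chosen models before starting the service; that step still applies when following the linked guide. A minimal sketch of it, using the example model names from the removed text (`llama3.1` for text, `llava` for images):

```
# Pull the example models before pointing karakeep at Ollama
ollama pull llama3.1
ollama pull llava
```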
diff --git a/docs/docs/03-configuration/02-different-ai-providers.md b/docs/docs/03-configuration/02-different-ai-providers.md
index 9a86e04f..7d1a4589 100644
--- a/docs/docs/03-configuration/02-different-ai-providers.md
+++ b/docs/docs/03-configuration/02-different-ai-providers.md
@@ -18,6 +18,28 @@ OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 
 Ollama is a local LLM provider that you can use to run your own LLM server. You'll need to pass ollama's address to karakeep and you need to ensure that it's accessible from within the karakeep container (e.g. no localhost addresses).
 
+Ollama provides two API endpoints:
+
+1. **OpenAI-compatible API (Recommended)** - Uses the `/v1` chat endpoint which handles message formatting automatically
+2. **Native Ollama API** - Requires manual formatting for some models
+
+### Option 1: OpenAI-compatible API (Recommended)
+
+This approach uses Ollama's OpenAI-compatible endpoint and is more reliable with various models:
+
+```
+OPENAI_API_KEY=ollama
+OPENAI_BASE_URL=http://ollama.mylab.com:11434/v1
+
+# Make sure to pull the models in ollama first. Example models:
+INFERENCE_TEXT_MODEL=gemma3
+INFERENCE_IMAGE_MODEL=llava
+```
+
+### Option 2: Native Ollama API
+
+Alternatively, you can use the native Ollama API:
+
 ```
 # MAKE SURE YOU DON'T HAVE OPENAI_API_KEY set, otherwise it takes precedence.
@@ -31,6 +53,10 @@ INFERENCE_IMAGE_MODEL=llava
 # INFERENCE_OUTPUT_SCHEMA=plain
 ```
 
+:::tip
+If you experience issues with certain models (especially OpenAI's gpt-oss models or other models requiring specific chat formats), try using the OpenAI-compatible API endpoint instead.
+:::
+
 ## Gemini
 
 Gemini has an OpenAI-compatible API. You need to get an api key from Google AI Studio.
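To sanity-check the Option 1 setup added above before wiring it into karakeep, you can query Ollama's OpenAI-compatible endpoint directly. A sketch assuming the example host `ollama.mylab.com:11434` and the `gemma3` model from the diff:

```
# Any successful JSON reply means the /v1 endpoint is reachable and the model is pulled
curl http://ollama.mylab.com:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma3", "messages": [{"role": "user", "content": "Say hello"}]}'
```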
