feature: Allow customizing the inference's context length

author: MohamedBassem <me@mbassem.com> 2024-10-12 17:25:01 +0000
committer: MohamedBassem <me@mbassem.com> 2024-10-12 17:37:42 +0000
commit: 1b09682685f54f29957163be9b9f9fc2de3b49cc (patch)
tree: 7f10a7635cf984acd45147c24ec3e1d35798e8ba /docs
parent: c16173ea0fdbf6cc47b13756c0a77e8399669055 (diff)
download: karakeep-1b09682685f54f29957163be9b9f9fc2de3b49cc.tar.zst
1 files changed, 11 insertions, 10 deletions
diff --git a/docs/docs/03-configuration.md b/docs/docs/03-configuration.md
index 3d674d63..98fa7a1a 100644
--- a/docs/docs/03-configuration.md
+++ b/docs/docs/03-configuration.md
@@ -48,16 +48,17 @@ Either `OPENAI_API_KEY` or `OLLAMA_BASE_URL` need to be set for automatic taggin
 - Running local models is a recent addition and not as battle tested as using OpenAI, so proceed with care (and potentially expect a bunch of inference failures).
   :::
 
-| Name                      | Required | Default     | Description                                                                                                                                                                                     |
-| ------------------------- | -------- | ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| OPENAI_API_KEY            | No       | Not set     | The OpenAI key used for automatic tagging. More on that in [here](/openai).                                                                                                                     |
-| OPENAI_BASE_URL           | No       | Not set     | If you just want to use OpenAI you don't need to pass this variable. If, however, you want to use some other openai compatible API (e.g. azure openai service), set this to the url of the API. |
-| OLLAMA_BASE_URL           | No       | Not set     | If you want to use ollama for local inference, set the address of ollama API here.                                                                                                              |
-| OLLAMA_KEEP_ALIVE         | No       | Not set     | Controls how long the model will stay loaded into memory following the request (example value: "5m").                                                                                           |
-| INFERENCE_TEXT_MODEL      | No       | gpt-4o-mini | The model to use for text inference. You'll need to change this to some other model if you're using ollama.                                                                                     |
-| INFERENCE_IMAGE_MODEL     | No       | gpt-4o-mini | The model to use for image inference. You'll need to change this to some other model if you're using ollama and that model needs to support vision APIs (e.g. llava).                           |
-| INFERENCE_LANG            | No       | english     | The language in which the tags will be generated.                                                                                                                                               |
-| INFERENCE_JOB_TIMEOUT_SEC | No       | 30          | How long to wait for the inference job to finish before timing out. If you're running ollama without powerful GPUs, you might want to increase the timeout a bit.                               |
+| Name                      | Required | Default     | Description                                                                                                                                                                                                                                                                                                                                                                           |
+| ------------------------- | -------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| OPENAI_API_KEY            | No       | Not set     | The OpenAI key used for automatic tagging. More on that in [here](/openai).                                                                                                                                                                                                                                                                                                           |
+| OPENAI_BASE_URL           | No       | Not set     | If you just want to use OpenAI you don't need to pass this variable. If, however, you want to use some other openai compatible API (e.g. azure openai service), set this to the url of the API.                                                                                                                                                                                       |
+| OLLAMA_BASE_URL           | No       | Not set     | If you want to use ollama for local inference, set the address of ollama API here.                                                                                                                                                                                                                                                                                                    |
+| OLLAMA_KEEP_ALIVE         | No       | Not set     | Controls how long the model will stay loaded into memory following the request (example value: "5m").                                                                                                                                                                                                                                                                                 |
+| INFERENCE_TEXT_MODEL      | No       | gpt-4o-mini | The model to use for text inference. You'll need to change this to some other model if you're using ollama.                                                                                                                                                                                                                                                                           |
+| INFERENCE_IMAGE_MODEL     | No       | gpt-4o-mini | The model to use for image inference. You'll need to change this to some other model if you're using ollama and that model needs to support vision APIs (e.g. llava).                                                                                                                                                                                                                 |
+| INFERENCE_CONTEXT_LENGTH  | No       | 2048        | The max number of tokens that we'll pass to the inference model. If your content is larger than this size, it'll be truncated to fit. The larger this value, the more of the content will be used in tag inference, but the more expensive the inference will be (money-wise on openAI and resource-wise on ollama). Check the model you're using for its max supported content size. |
+| INFERENCE_LANG            | No       | english     | The language in which the tags will be generated.                                                                                                                                                                                                                                                                                                                                     |
+| INFERENCE_JOB_TIMEOUT_SEC | No       | 30          | How long to wait for the inference job to finish before timing out. If you're running ollama without powerful GPUs, you might want to increase the timeout a bit.                                                                                                                                                                                                                     |
 
 ## Crawler Configs
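
To illustrate the behaviour that the new `INFERENCE_CONTEXT_LENGTH` row describes (content larger than the limit is truncated to fit), here is a minimal sketch. It is not the project's implementation: the helper names `read_context_length` and `truncate_to_context` are hypothetical, and whitespace-separated words stand in for real model tokens, which an actual tokenizer would count differently.

```python
import os

# Default taken from the configuration table in this patch.
DEFAULT_CONTEXT_LENGTH = 2048


def read_context_length(env=None):
    """Return INFERENCE_CONTEXT_LENGTH as an int, or the documented default.

    Hypothetical helper: `env` is any mapping of variable names to strings
    and defaults to the process environment.
    """
    env = os.environ if env is None else env
    raw = env.get("INFERENCE_CONTEXT_LENGTH")
    if raw is None or raw.strip() == "":
        return DEFAULT_CONTEXT_LENGTH
    value = int(raw)
    if value <= 0:
        raise ValueError("INFERENCE_CONTEXT_LENGTH must be a positive integer")
    return value


def truncate_to_context(text, max_tokens):
    """Keep at most `max_tokens` whitespace-separated words.

    Assumption: words approximate tokens. A real model tokenizer would
    usually produce a different count for the same text.
    """
    words = text.split()
    if len(words) <= max_tokens:
        return text
    return " ".join(words[:max_tokens])
```

With these helpers, `truncate_to_context(content, read_context_length())` would pass along at most the configured number of words. Raising the limit lets more of the content inform the tags, at the cost of more expensive inference, as the table notes.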