author     Benjamin Michaelis <github@relay.benjamin.michaelis.net>  2025-10-25 11:12:08 -0700
committer  GitHub <noreply@github.com>  2025-10-25 19:12:08 +0100
commit     046c29dcf1083f0ab89b080f7696e6d642a6bd17 (patch)
tree       76ecd012080781c87ce53f045721adfabe88019b /docs
parent     8c0aae33b878827ca0978d9979bb4f2b51ef2f6e (diff)
download   karakeep-046c29dcf1083f0ab89b080f7696e6d642a6bd17.tar.zst
fix: update OpenAI API to use max_completion_tokens instead of max_tokens (#2000)
* fix: update OpenAI API to use max_completion_tokens instead of max_tokens

  The OpenAI API has deprecated max_tokens in favor of max_completion_tokens for newer models. This change updates both text and image model calls.

* feat: add support for max_completion_tokens in OpenAI inference configuration
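To make the parameter switch concrete, here is a minimal sketch against the OpenAI Node SDK. This is not Karakeep's actual implementation; the helper name, model, and control flow are illustrative. Only the max_tokens / max_completion_tokens parameters and the INFERENCE_USE_MAX_COMPLETION_TOKENS flag come from this commit.

// Illustrative sketch only, not Karakeep's actual code. It shows the
// switch this commit describes: newer models (GPT-5, o-series) reject
// max_tokens and require max_completion_tokens instead.
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function runInference(prompt: string, maxOutputTokens: number) {
  // Mirrors the INFERENCE_USE_MAX_COMPLETION_TOKENS flag documented below.
  const useNewParam =
    process.env.INFERENCE_USE_MAX_COMPLETION_TOKENS === "true";

  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model name
    messages: [{ role: "user", content: prompt }],
    // Spread exactly one of the two mutually exclusive token-limit
    // parameters, depending on what the target model accepts.
    ...(useNewParam
      ? { max_completion_tokens: maxOutputTokens }
      : { max_tokens: maxOutputTokens }),
  });
  return completion.choices[0]?.message?.content ?? "";
}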
Diffstat (limited to 'docs')
-rw-r--r--  docs/docs/03-configuration.md  1
1 file changed, 1 insertion, 0 deletions
diff --git a/docs/docs/03-configuration.md b/docs/docs/03-configuration.md
index f57968be..d9e55322 100644
--- a/docs/docs/03-configuration.md
+++ b/docs/docs/03-configuration.md
@@ -95,6 +95,7 @@ Either `OPENAI_API_KEY` or `OLLAMA_BASE_URL` needs to be set for automatic tagging
| EMBEDDING_TEXT_MODEL | No | text-embedding-3-small | The model to be used for generating embeddings for the text. |
| INFERENCE_CONTEXT_LENGTH | No | 2048 | The max number of tokens that we'll pass to the inference model. If your content is larger than this size, it'll be truncated to fit. The larger this value, the more of the content will be used in tag inference, but the more expensive the inference will be (money-wise on openAI and resource-wise on ollama). Check the model you're using for its max supported content size. |
| INFERENCE_MAX_OUTPUT_TOKENS | No | 2048 | The maximum number of tokens that the inference model is allowed to generate in its response. This controls the length of AI-generated content like tags and summaries. Increase this if you need longer responses, but be aware that higher values will increase costs (for OpenAI) and processing time. |
+| INFERENCE_USE_MAX_COMPLETION_TOKENS | No | false | \[OpenAI Only\] Whether to use the newer `max_completion_tokens` parameter instead of the deprecated `max_tokens` parameter. Set to `true` if using GPT-5 or o-series models which require this. Will become the default in a future release. |
| INFERENCE_LANG | No | english | The language in which the tags will be generated. |
| INFERENCE_NUM_WORKERS | No | 1 | Number of concurrent workers for AI inference tasks (tagging and summarization). Increase this if you have multiple AI inference requests and want to process them in parallel. |
| INFERENCE_ENABLE_AUTO_TAGGING | No | true | Whether automatic AI tagging is enabled or disabled. |
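As a usage note, a deployment targeting GPT-5 or an o-series model would opt in through its environment configuration. A minimal sketch, assuming a standard .env file; the key value is a placeholder, and only the last line is new in this commit:

# Hypothetical .env fragment for an o-series / GPT-5 deployment.
OPENAI_API_KEY=<your key>
INFERENCE_MAX_OUTPUT_TOKENS=2048
INFERENCE_USE_MAX_COMPLETION_TOKENS=true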