 docs/docs/03-configuration.md | 2 +-
 docs/docs/06-openai.md        | 4 ++--
 packages/shared/config.ts     | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/docs/docs/03-configuration.md b/docs/docs/03-configuration.md
index 3790d289..995a1af9 100644
--- a/docs/docs/03-configuration.md
+++ b/docs/docs/03-configuration.md
@@ -56,7 +56,7 @@ Either `OPENAI_API_KEY` or `OLLAMA_BASE_URL` need to be set for automatic taggin
 | OPENAI_BASE_URL | No | Not set | If you just want to use OpenAI you don't need to pass this variable. If, however, you want to use some other openai compatible API (e.g. azure openai service), set this to the url of the API. |
 | OLLAMA_BASE_URL | No | Not set | If you want to use ollama for local inference, set the address of ollama API here. |
 | OLLAMA_KEEP_ALIVE | No | Not set | Controls how long the model will stay loaded into memory following the request (example value: "5m"). |
-| INFERENCE_TEXT_MODEL | No | gpt-4o-mini | The model to use for text inference. You'll need to change this to some other model if you're using ollama. |
+| INFERENCE_TEXT_MODEL | No | gpt-4.1-mini | The model to use for text inference. You'll need to change this to some other model if you're using ollama. |
 | INFERENCE_IMAGE_MODEL | No | gpt-4o-mini | The model to use for image inference. You'll need to change this to some other model if you're using ollama and that model needs to support vision APIs (e.g. llava). |
 | EMBEDDING_TEXT_MODEL | No | text-embedding-3-small | The model to be used for generating embeddings for the text. |
 | INFERENCE_CONTEXT_LENGTH | No | 2048 | The max number of tokens that we'll pass to the inference model. If your content is larger than this size, it'll be truncated to fit. The larger this value, the more of the content will be used in tag inference, but the more expensive the inference will be (money-wise on openAI and resource-wise on ollama). Check the model you're using for its max supported content size. |
diff --git a/docs/docs/06-openai.md b/docs/docs/06-openai.md
index 289f44c2..32218da8 100644
--- a/docs/docs/06-openai.md
+++ b/docs/docs/06-openai.md
@@ -4,8 +4,8 @@ This service uses OpenAI for automatic tagging. This means that you'll incur som
 
 ## Text Tagging
 
-For text tagging, we use the `gpt-4o-mini` model. This model is [extremely cheap](https://openai.com/api/pricing). Cost per inference varies depending on the content size per article. Though, roughly, You'll be able to generate tags for almost 3000+ bookmarks for less than $1.
+For text tagging, we use the `gpt-4.1-mini` model. This model is [extremely cheap](https://openai.com/api/pricing). Cost per inference varies depending on the content size per article. Though, roughly, You'll be able to generate tags for almost 3000+ bookmarks for less than $1.
 
 ## Image Tagging
 
-For image uploads, we use the `gpt-4o-mini` model for extracting tags from the image. You can learn more about the costs of using this model [here](https://platform.openai.com/docs/guides/vision/calculating-costs). To lower the costs, we're using the low resolution mode (fixed number of tokens regardless of image size). You'll be able to run inference for 1000+ images for less than a $1.
+For image uploads, we use the `gpt-4o-mini` model for extracting tags from the image. You can learn more about the costs of using this model [here](https://platform.openai.com/docs/guides/images?api-mode=chat#calculating-costs). To lower the costs, we're using the low resolution mode (fixed number of tokens regardless of image size). You'll be able to run inference for 1000+ images for less than a $1.
diff --git a/packages/shared/config.ts b/packages/shared/config.ts
index 8abb5902..046583c6 100644
--- a/packages/shared/config.ts
+++ b/packages/shared/config.ts
@@ -32,7 +32,7 @@ const allEnv = z.object({
   OLLAMA_KEEP_ALIVE: z.string().optional(),
   INFERENCE_JOB_TIMEOUT_SEC: z.coerce.number().default(30),
   INFERENCE_FETCH_TIMEOUT_SEC: z.coerce.number().default(300),
-  INFERENCE_TEXT_MODEL: z.string().default("gpt-4o-mini"),
+  INFERENCE_TEXT_MODEL: z.string().default("gpt-4.1-mini"),
   INFERENCE_IMAGE_MODEL: z.string().default("gpt-4o-mini"),
   EMBEDDING_TEXT_MODEL: z.string().default("text-embedding-3-small"),
   INFERENCE_CONTEXT_LENGTH: z.coerce.number().default(2048),
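
The behavior this change relies on is zod's `.default()`: the new `gpt-4.1-mini` default only applies when `INFERENCE_TEXT_MODEL` is unset, so ollama users who already override it (as the docs table advises) are unaffected. A minimal sketch of that resolution logic, written without zod so it stands alone (the function name `resolveModel` is made up for illustration; it mirrors `.default()` semantics, not the actual schema code):

```typescript
// Sketch of how a zod `.default()` resolves an env-driven model name:
// an explicitly set env var wins, otherwise the schema default applies.
function resolveModel(envValue: string | undefined, fallback: string): string {
  return envValue ?? fallback;
}

// Nothing set in the environment: the new text-model default applies.
console.log(resolveModel(process.env.INFERENCE_TEXT_MODEL, "gpt-4.1-mini"));

// An ollama user's explicit override still takes precedence.
console.log(resolveModel("llama3", "gpt-4.1-mini")); // "llama3"
```

This is why the commit can change the default in one place (`packages/shared/config.ts`) without a migration: existing deployments that pinned a model keep it, and only deployments relying on the implicit default move to `gpt-4.1-mini`.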