| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
| |
* feat(ai): Support restricting AI tags to a subset of existing tags
Co-authored-by: Claude <noreply@anthropic.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* feat(ocr): add LLM-based OCR support alongside Tesseract
Add support for using configured LLM inference providers (OpenAI or Ollama)
for OCR text extraction from images as an alternative to Tesseract.
Changes:
- Add OCR_USE_LLM environment variable flag (default: false)
- Add buildOCRPrompt function for LLM-based text extraction
- Add readImageTextWithLLM function in asset preprocessing worker
- Update extractAndSaveImageText to route between Tesseract and LLM OCR
- Update documentation with the new configuration option
When OCR_USE_LLM is enabled, the system uses the configured inference model
to extract text from images. If no inference provider is configured, it
falls back to Tesseract.
https://claude.ai/code/session_01Y7h7kDAmqXKXEWDmWbVkDs
* format
---------
Co-authored-by: Claude <noreply@anthropic.com>
|
| | |
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* feat: add customizable tag styles
* add tag lang setting
* ui settings cleanup
* fix migration
* change look of the field
* more fixes
* fix tests
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* feat: lazy load tiktoken to reduce memory footprint
The js-tiktoken module loads a large encoding dictionary into memory
immediately on import. This change defers the loading of the encoding
until it's actually needed by using a lazy getter pattern.
This reduces memory usage for processes that import this module but
don't actually use the token encoding functions.
* fix: use createRequire for lazy tiktoken import in ES module
The previous implementation used bare require() which fails at runtime
in ES modules (ReferenceError: require is not defined). This fixes it
by using createRequire from Node's 'module' package, which creates a
require function that works in ES module contexts.
* refactor: convert tiktoken lazy loading to async dynamic imports
Changed from createRequire to async import() for lazy loading tiktoken,
making buildTextPrompt and buildSummaryPrompt async. This is cleaner for
ES modules and properly defers the large tiktoken encoding data until
it's actually needed.
Updated all callers to await these async functions:
- packages/trpc/routers/bookmarks.ts
- apps/workers/workers/inference/tagging.ts
- apps/workers/workers/inference/summarize.ts
- apps/web/components/settings/AISettings.tsx (converted to useEffect)
* feat: add untruncated prompt builders for UI previews
Added buildTextPromptUntruncated and buildSummaryPromptUntruncated
functions that don't require token counting or truncation. These are
synchronous and don't load tiktoken, making them perfect for UI
previews where exact token limits aren't needed.
Updated AISettings.tsx to use these untruncated versions, eliminating
the need for useEffect/useState and avoiding unnecessary tiktoken
loading in the browser.
* fix
* fix
---------
Co-authored-by: Claude <noreply@anthropic.com>
|
| | |
|
| |
|
| |
Corrected "who's" to "whose" in buildImagePrompt and buildTextPrompt.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* feat: add optional `thinking` key to tagging response schema
* prompt: fix indent
* prompt: remove extra 'language' word
* prompt: use xml as separator
* revert: dont use a thinking tags
Signed-off-by: thiswillbeyourgithub
<26625900+thiswillbeyourgithub@users.noreply.github.com>
* prompt: don't ask to include website tags
* prompt: aim for 5 tags
* prompt: dont tell bot its a bot
* prompt: propose a tag_error
* Revert "prompt: propose a tag_error"
This reverts commit 78c5099a187960cc3697b77f2b2bd687edb015f3.
* minor prompt tweaks
* minor prompt tweaks take 2
---------
Signed-off-by: thiswillbeyourgithub
Co-authored-by: Mohamed Bassem <me@mbassem.com>
|
| |
|
|
| |
choking the tokenizer. Fixes #1622
|
| | |
|
| | |
|
| |
|
|
| |
content
|
| | |
|
| | |
|
| |
|