path: root/apps/workers
Commit message | Author | Age | Files | Lines
...
* fix: Don't enqueue video tasks when video download is disabled | Mohamed Bassem | 2025-09-06 | 1 | -8/+10
|
* fix: fix long worker log lines when downloading base64 images | Mohamed Bassem | 2025-08-30 | 1 | -1/+3
|
* fix: Respect wal mode for the queue db | Mohamed Bassem | 2025-08-30 | 1 | -1/+1
|
* fix: dangling assets created by changing crawling config | MohamedBassem | 2025-08-22 | 1 | -5/+6
|
* fix(workers): Drop the withTimeout wrappers | MohamedBassem | 2025-08-22 | 2 | -10/+2
|
* feat: Export prometheus metrics from the workers | MohamedBassem | 2025-08-22 | 14 | -5/+111
|
* refactor: Refactor crawlerWorker to use tryCatch | MohamedBassem | 2025-07-27 | 1 | -123/+117
|
* refactor: Extract meilisearch as a plugin | MohamedBassem | 2025-07-27 | 3 | -61/+45
|
* chore: More turbo fixes | MohamedBassem | 2025-07-27 | 1 | -2/+2
|
* fix: Ensure that all packages are ESM packages | MohamedBassem | 2025-07-27 | 1 | -0/+1
|
* deps: Upgrade vite | Mohamed Bassem | 2025-07-26 | 1 | -1/+1
|
* fix: Run workers in prod without tsx. Fixes #1673 | Mohamed Bassem | 2025-07-19 | 2 | -2/+26
|
* feat: Allow setting browserless crawling per user | Mohamed Bassem | 2025-07-19 | 1 | -1/+19
|
* Revert "fix: Fix the types of the bookmark types in the db query" | Mohamed Bassem | 2025-07-13 | 2 | -21/+1
|     This reverts commit 4ba3e8047a5b1f160169617187436c09e91662ec.
* fix: Fix the types of the bookmark types in the db query | Mohamed Bassem | 2025-07-13 | 2 | -1/+21
|
* feat: Add proper proxy support. fixes #1265 | Mohamed Bassem | 2025-07-13 | 2 | -9/+87
|
* deps: Upgrade typescript to 5.8 | Mohamed Bassem | 2025-07-12 | 1 | -1/+1
|
* deps: Upgrade drizzle | Mohamed Bassem | 2025-07-12 | 1 | -1/+1
|
* fix: Prioritize crawling user added links over bulk imports. fixes #1717 | Mohamed Bassem | 2025-07-12 | 5 | -24/+55
|
* fix: Fix search indexing after content split | Mohamed Bassem | 2025-07-06 | 1 | -7/+4
|
* feat: Store large html content in the asset db | Mohamed Bassem | 2025-07-06 | 5 | -9/+135
|
* feat: Add per user storage quota | Mohamed Bassem | 2025-07-06 | 4 | -75/+183
|
* feat(workers): Allow customizing max parallelism for a bunch of workers. Fixes #724 | Mohamed Bassem | 2025-07-05 | 5 | -5/+7
* fix(workers): A more lenient JSON parsing for LLM responses. Fixes #1267 | Mohamed Bassem | 2025-07-04 | 1 | -1/+39
|
* fix(workers): Disable the metascraper readability as it's causing slowness in worker | Mohamed Bassem | 2025-06-22 | 1 | -2/+0
* fix(workers): Fix jsdom console logs leaking into worker logs | Mohamed Bassem | 2025-06-22 | 1 | -2/+3
|
* feat(workers): adding a local metascraper plugin for Reddit posts (#1302) | David Woods | 2025-06-22 | 3 | -13/+115
|     * chore: metascraper 5.x comes with its own types, including @types/metascraper is now redundant; also updating to latest versions of metascraper libraries
|
|     * feat (workers): creating a local metascraper plugin for Reddit posts
|
|     In the past, the preview images for bookmarks from Reddit links were poorly chosen. Reddit does not use opengraph tags, so metascraper-images simply looked for all images on the page and returned the first. This tended to be the profile picture for the poster for the Reddit link. This new plugin, using the existing metascraper framework, provides a better selection of image for the bookmark when the URL domain is 'reddit'.
|
|     In addition, recent changes (I believe this was a side effect of adding the metascraper-author and/or the metascraper-publisher plugins, but it could also be related to the metascraper-readability plugin) broke what used to be a good choice of bookmark title. Previously, titles looked like 'Tinyauth just reached 1000 stars! : r/selfhosted' with both thread title and subreddit mentioned. After this update, all Reddit posts now have the same title: 'The heart of the internet'. To return to the better format, this new metascraper-reddit plugin now attempts to retrieve the better title from reddit URLs.
|
|     Note that in order to gain precedence in title selection, the 'metascraperReddit()' inclusion in the crawlerWorkers.ts metascraper instantiation list had to be moved above metascraperReadability().
|
|     * chore: updated Hoarder in text to Karakeep
|
|     * chore: update metascraper versions
|
|     fix for metascraper types has been merged; the expect-error comment can be removed
|
|     * chore: merge with master
|
|     ---------
|
|     Co-authored-by: Mohamed Bassem <me@mbassem.com>
* feat(workers): migrate from puppeteer to playwright (#1296) | Mael | 2025-06-22 | 2 | -34/+39
|     * feat: convert to playwright
|
|     Convert crawling to use Playwright instead of Chrome.
|     - Update Dockerfile to include Playwright
|     - Update crawler worker to use Playwright API
|     - Update dependencies
|
|     * feat: convert from Puppeteer to Playwright for crawling
|
|     * feat: update docker-compose
|
|     * use separate browser context for better isolation
|
|     * skip chrome download in linux script
|
|     * readd the stealth plugin
|
|     ---------
|
|     Co-authored-by: Mohamed Bassem <me@mbassem.com>
* chore: More oxlint changes | Mohamed Bassem | 2025-06-22 | 3 | -7/+4
|
* chore: migrate away from eslint to oxlint (#1642) | xuatz | 2025-06-22 | 5 | -12/+27
|     * chore: migrate away from eslint to oxlint
|
|     * revert turbo task name lint
|
|     * it seems like we can remove the seemingly default globals
* fix: Fix webhook not firing on deletion. Fixes #1613 | Mohamed Bassem | 2025-06-21 | 1 | -18/+19
|
* fix(workers): video downloader should log yt-dlp errors (#1624) | irobot | 2025-06-21 | 1 | -3/+6
|     In the event that yt-dlp errors out, the error details should be logged. yt-dlp prints out the error message to stderr.
* feat: Allow specifying the overwrite mode for singlefile archives. Fixes #1125 | Mohamed Bassem | 2025-06-01 | 1 | -3/+3
|
* feat: Generate RSS feeds from lists (#1507) | Mohamed Bassem | 2025-05-31 | 1 | -31/+3
|     * refactor: Move bookmark utils from shared-react to shared
|
|     * Expose RSS feeds for lists
|
|     * Add e2e tests
|
|     * Slightly improve the look of the share dialog
|
|     * allow specifying a limit in the rss endpoint
* feat: Add AI auto summarization. Fixes #1163 | Mohamed Bassem | 2025-05-18 | 12 | -100/+264
|
* feat: Allow enabling/disabling RSS feeds | Mohamed Bassem | 2025-05-17 | 1 | -0/+1
|
* feat: Implement generic rule engine (#1318) | Mohamed Bassem | 2025-04-27 | 5 | -16/+133
|     * Add schema for the new rule engine
|
|     * Add rule engine backend logic
|
|     * Implement the worker logic and event firing
|
|     * Implement the UI changes for the rule engine
|
|     * Ensure that when a referenced list or tag are deleted, the corresponding event/action is
|
|     * Dont show smart lists in rule engine events
|
|     * Add privacy validations for attached tag and list ids
|
|     * Move the rules logic into a models
* chore: rename missing files/conf from Hoarder to Karakeep (#1280) | adripo | 2025-04-21 | 2 | -3/+3
|     * refactor: Rename remaining project configuration from Hoarder to Karakeep
|
|     * some fixes
|
|     ---------
|
|     Co-authored-by: Mohamed Bassem <me@mbassem.com>
* fix(workers): Fix dompurify to run on readability's input not output | Mohamed Bassem | 2025-04-21 | 1 | -4/+12
|
* deps: Upgrade readability to 0.6 & adblocker to 2.5.1 | Mohamed Bassem | 2025-04-21 | 1 | -2/+2
|
* fix(workers): Close browser if connect on demand (#1151) | Chang-Yen Tseng | 2025-04-16 | 1 | -0/+3
|
* feat: Add an MCP server for karakeep | Mohamed Bassem | 2025-04-13 | 1 | -1/+1
|
* chore: Rename hoarder packages to karakeep | MohamedBassem | 2025-04-12 | 14 | -73/+73
|
* feat(workers): Add CRAWLER_SCREENSHOT_TIMEOUT_SEC (#1155) | Chang-Yen Tseng | 2025-03-27 | 1 | -10/+18
|
* feat(workers): Adds publisher and author og:meta tags to Bookmark (#1141) | erik-nilcoast | 2025-03-22 | 3 | -1/+32
|
* deps: Upgrade pdfjs and dompurify | Mohamed Bassem | 2025-03-22 | 1 | -4/+3
|
* feat(workers): allows videoWorker to use ytdlp command line arguments specified in the config. Fixes #775 #792 (#1117) | erik-nilcoast | 2025-03-16 | 1 | -1/+2
* fix: Revert the accidental upgrade of deps. #1107 | Mohamed Bassem | 2025-03-10 | 1 | -1/+1
|
* build(deps): bump dompurify from 3.0.9 to 3.2.4 (#1102) | dependabot[bot] | 2025-03-09 | 1 | -1/+1
|     Bumps [dompurify](https://github.com/cure53/DOMPurify) from 3.0.9 to 3.2.4.
|     - [Release notes](https://github.com/cure53/DOMPurify/releases)
|     - [Commits](https://github.com/cure53/DOMPurify/compare/3.0.9...3.2.4)
|
|     ---
|     updated-dependencies:
|     - dependency-name: dompurify
|       dependency-type: direct:production
|     ...
|
|     Signed-off-by: dependabot[bot] <support@github.com>
|     Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix(workers): Small typo fix in assetPreprocessingWorker.ts | Chris | 2025-03-08 | 1 | -2/+2
|