path: root/apps/workers/package.json
Commit message | Author | Age | Files | Lines
* fix: Stricter SSRF validation (#2082) | Mohamed Bassem | 2025-11-02 | 1 | -0/+2
    * fix: Stricter SSRF validation
    * skip dns resolution if running in proxy context
    * more fixes
    * Add LRU cache
    * change the env variable for internal hostnames
    * make dns resolution timeout configurable
    * upgrade ipaddr
    * handle ipv6
    * handle proxy bypass for request interceptor
* deps: Upgrade metascraper plugins | Mohamed Bassem | 2025-10-26 | 1 | -11/+11
* deps: Upgrade metascraper-readability 5.49.6 | Mohamed Bassem | 2025-10-26 | 1 | -1/+1
* fix: fix bundling liteque in the workers | Mohamed Bassem | 2025-09-14 | 1 | -0/+1
* refactor: Move callsites to liteque to be behind a plugin | Mohamed Bassem | 2025-09-14 | 1 | -1/+0
* fix: Respect wal mode for the queue db | Mohamed Bassem | 2025-08-30 | 1 | -1/+1
* feat: Export prometheus metrics from the workers | MohamedBassem | 2025-08-22 | 1 | -0/+4
* refactor: Extract meilisearch as a plugin | MohamedBassem | 2025-07-27 | 1 | -0/+1
* chore: More turbo fixes | MohamedBassem | 2025-07-27 | 1 | -2/+2
* fix: Ensure that all packages are ESM packages | MohamedBassem | 2025-07-27 | 1 | -0/+1
* deps: Upgrade vite | Mohamed Bassem | 2025-07-26 | 1 | -1/+1
* fix: Run workers in prod without tsx. Fixes #1673 | Mohamed Bassem | 2025-07-19 | 1 | -2/+5
* feat: Add proper proxy support. fixes #1265 | Mohamed Bassem | 2025-07-13 | 1 | -0/+2
* deps: Upgrade typescript to 5.8 | Mohamed Bassem | 2025-07-12 | 1 | -1/+1
* deps: Upgrade drizzle | Mohamed Bassem | 2025-07-12 | 1 | -1/+1
* fix: Prioritize crawling user added links over bulk imports. fixes #1717 | Mohamed Bassem | 2025-07-12 | 1 | -1/+1
* feat(workers): adding a local metascraper plugin for Reddit posts (#1302) | David Woods | 2025-06-22 | 1 | -13/+12
    * chore: metascraper 5.x comes with its own types, including @types/metascraper is now redundant; also updating to latest versions of metascraper libraries
    * feat (workers): creating a local metascraper plugin for Reddit posts

      In the past, the preview images for bookmarks from Reddit links were poorly chosen. Reddit does not use opengraph tags, so metascraper-images simply looked for all images on the page and returned the first. This tended to be the profile picture for the poster for the Reddit link. This new plugin, using the existing metascraper framework, provides a better selection of image for the bookmark when the URL domain is 'reddit'.

      In addition, recent changes (I believe this was a side effect of adding the metascraper-author and/or the metascraper-publisher plugins, but it could also be related to the metascraper-readability plugin) broke what used to be a good choice of bookmark title. Previously, titles looked like 'Tinyauth just reached 1000 stars! : r/selfhosted' with both thread title and subreddit mentioned. After this update, all Reddit posts now have the same title: 'The heart of the internet'. To return to the better format, this new metascraper-reddit plugin now attempts to retrieve the better title from reddit URLs.

      Note that in order to gain precedence in title selection, the 'metascraperReddit()' inclusion in the crawlerWorkers.ts metascraper instantiation list had to be moved above metascraperReadability().
    * chore: updated Hoarder in text to Karakeep
    * chore: update metascraper versions

      fix for metascraper types has been merged; the expect-error comment can be removed
    * chore: merge with master
    ---------
    Co-authored-by: Mohamed Bassem <me@mbassem.com>
* feat(workers): migrate from puppeteer to playwright (#1296) | Mael | 2025-06-22 | 1 | -3/+3
    * feat: convert to playwright

      Convert crawling to use Playwright instead of Chrome.
      - Update Dockerfile to include Playwright
      - Update crawler worker to use Playwright API
      - Update dependencies
    * feat: convert from Puppeteer to Playwright for crawling
    * feat: update docker-compose
    * use separate browser context for better isolation
    * skip chrome download in linux script
    * readd the stealth plugin
    ---------
    Co-authored-by: Mohamed Bassem <me@mbassem.com>
* chore: migrate away from eslint to oxlint (#1642) | xuatz | 2025-06-22 | 1 | -9/+2
    * chore: migrate away from eslint to oxlint
    * revert turbo task name lint
    * it seems like we can remove the seemingly default globals
* deps: Upgrade readability to 0.6 & adblocker to 2.5.1 | Mohamed Bassem | 2025-04-21 | 1 | -2/+2
* feat: Add an MCP server for karakeep | Mohamed Bassem | 2025-04-13 | 1 | -1/+1
* chore: Rename hoarder packages to karakeep | MohamedBassem | 2025-04-12 | 1 | -9/+9
* feat(workers): Adds publisher and author og:meta tags to Bookmark (#1141) | erik-nilcoast | 2025-03-22 | 1 | -1/+4
* deps: Upgrade pdfjs and dompurify | Mohamed Bassem | 2025-03-22 | 1 | -4/+3
* fix: Revert the accidental upgrade of deps. #1107 | Mohamed Bassem | 2025-03-10 | 1 | -1/+1
* build(deps): bump dompurify from 3.0.9 to 3.2.4 (#1102) | dependabot[bot] | 2025-03-09 | 1 | -1/+1
    Bumps [dompurify](https://github.com/cure53/DOMPurify) from 3.0.9 to 3.2.4.
    - [Release notes](https://github.com/cure53/DOMPurify/releases)
    - [Commits](https://github.com/cure53/DOMPurify/compare/3.0.9...3.2.4)

    ---
    updated-dependencies:
    - dependency-name: dompurify
      dependency-type: direct:production
    ...

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat: Add PDF screenshot generation and display (#995) | Ahmad Mujahid | 2025-02-17 | 1 | -2/+3
    * Updated pdf2json to 3.1.5
    * Extract and store a screenshot from PDF files using pdf2pic
    * Installing graphicsmagick and ghostscript
    * Generate Missing PDF screenshot with tidyAssets worker for backward support
    * Display PDF screenshot instead of the PDF in web if it exists.
    * Display PDF screenshot in mobile app if exists.
    * Updated pnpm-lock.yaml
    * Removed console.log
    * Revert the unnecessary changes in package.json
    * Revert pnpm-lock changes
    * Prevent rendering PDF files if the screenshot is not generated
    * refactor: replace useEffect with useMemo for section initialization
    * feat: show PDF file download button and handle large PDFs by defaulting to screenshot view
    * feat: add file size to openapi spec
    * feature: Add Assets preprocessing in fix mode to admin actions
    * i18n: add reprocess_assets_fix_mode translation
    * i18n: Add missing ar translations
    * A bunch of fixes
    * Fix openspec schema
    ---------
    Co-authored-by: Mohamed Bassem <me@mbassem.com>
* fix: Fix node22 error in worker container. Fixes #962 | Mohamed Bassem | 2025-02-02 | 1 | -1/+1
* fix: Abort all IO when workers timeout instead of detaching. Fixes #742 | Mohamed Bassem | 2025-02-01 | 1 | -1/+1
* deps: Upgrade typescript to 5.7 | Mohamed Bassem | 2025-02-01 | 1 | -1/+1
* chore: add format:fix and lint:fix scripts to all packages | Mohamed Bassem (aider) | 2024-12-31 | 1 | -0/+2
* deps: Upgrade drizzle-orm to 0.38.3 | Mohamed Bassem | 2024-12-29 | 1 | -1/+1
* fix(workers): Don't block connection to chrome when failing to download adblock list. #674 | Mohamed Bassem | 2024-11-21 | 1 | -1/+2
* fix: Feed refreshes were not getting re-enqueued for failed jobs | Mohamed Bassem | 2024-11-09 | 1 | -1/+1
* feature: Schedule RSS feed refreshes every hour | Mohamed Bassem | 2024-11-03 | 1 | -2/+4
* feature: Add support for subscribing to RSS feeds. Fixes #202 | Mohamed Bassem | 2024-11-03 | 1 | -0/+2
* deps: Extract the queue implementation into its own repos | Mohamed Bassem | 2024-10-27 | 1 | -1/+1
* refactor: Move inference to the shared package | Mohamed Bassem | 2024-10-26 | 1 | -2/+0
* feature: Add OCR support for images. Fixes #296 | Mohamed Bassem | 2024-10-20 | 1 | -0/+1
* fix(workers): Pin execa to avoid ERR_PACKAGE_PATH_NOT_EXPORTED error | Your Name | 2024-10-19 | 1 | -1/+1
* deps: Upgrade metascraper for faster docker builds | MohamedBassem | 2024-10-12 | 1 | -10/+10
* feature: Allow customizing the inference's context length | MohamedBassem | 2024-10-12 | 1 | -1/+1
* deps: Upgrade openai package | MohamedBassem | 2024-10-05 | 1 | -1/+1
* deps: Upgrade drizzle and next auth drizzle adapter | MohamedBassem | 2024-09-15 | 1 | -1/+1
* build: Fix sherif failures by sorting deps | MohamedBassem | 2024-08-31 | 1 | -1/+1
* refactor: Replace the usage of bullMQ with the hoarder sqlite-based queue (#309) | Mohamed Bassem | 2024-07-21 | 1 | -1/+1
* feature: Full page archival with monolith. Fixes #132 | MohamedBassem | 2024-05-26 | 1 | -0/+1
* fix(crawler): Better extraction for amazon images | MohamedBassem | 2024-04-23 | 1 | -0/+1
* feature: Add PDF support (#88) | Ahmad Mujahid | 2024-04-11 | 1 | -0/+2
    * feature: Add PDF support
    * fix: PDF feature enhancements
    * fix: Freeze expo-share-intent version to prevent breaking changes
    * fix: set endOfLine to auto for cross-platform development
    * fix: Upgrading eslint/parser and eslint-plugin to 7.6.0 to solve the linting issues
    * fix: enhancing PDF feature
    * fix: Allowing null in filename for backward compatibility
    * fix: update pnpm file with pnpm 9.0.0-alpha-8
    * fix:(web): PDF Preview for web
* format: Add missing lint and format, and format the entire repo | MohamedBassem | 2024-03-30 | 1 | -0/+2