path: root/apps/workers/crawlerWorker.ts
| Commit message | Author | Date | Files | Lines (-/+) |
|---|---|---|---|---|
| refactor: Start tracking bookmark assets in the assets table | MohamedBassem | 2024-10-06 | 1 | -60/+83 |
| refactor: Include userId in the assets table | MohamedBassem | 2024-10-06 | 1 | -0/+5 |
| feature(web): Add ability to manually trigger full page archives. Fixes #398 ... | kamtschatka | 2024-09-30 | 1 | -3/+5 |
| fix(workers): Log stacktrace on worker error. #424 (#429) | kamtschatka | 2024-09-26 | 1 | -1/+3 |
| fix(workers): Shutdown workers on SIGTERM | MohamedBassem | 2024-07-28 | 1 | -0/+4 |
| fix: async/await issues with the new queue (#319) | kamtschatka | 2024-07-21 | 1 | -2/+2 |
| refactor: Replace the usage of bullMQ with the hoarder sqlite-based queue (#309) | Mohamed Bassem | 2024-07-21 | 1 | -31/+29 |
| fix: monolith not embedding SVG files correctly. Fixes #289 (#306) | kamtschatka | 2024-07-14 | 1 | -5/+2 |
| refactor: added the bookmark type to the database (#256) | kamtschatka | 2024-07-01 | 1 | -0/+6 |
| refactor: remove redundant code from crawler worker and refactor handling of ... | kamtschatka | 2024-06-29 | 1 | -32/+49 |
| feature: Automatically transfer image urls into bookmared assets. Fixes #246 | MohamedBassem | 2024-06-23 | 1 | -6/+16 |
| refactor: extract assets into their own database table. #215 (#220) | kamtschatka | 2024-06-23 | 1 | -29/+71 |
| feature: add support for PDF links. Fixes #28 (#216) | kamtschatka | 2024-06-22 | 1 | -57/+163 |
| fix: Trigger search re-index on bookmark tag manual updates. Fixes #208 (#210) | kamtschatka | 2024-06-09 | 1 | -5/+2 |
| fix(crawler): Only update the database if full page archival is enabled | MohamedBassem | 2024-05-26 | 1 | -19/+19 |
| feature: Full page archival with monolith. Fixes #132 | MohamedBassem | 2024-05-26 | 1 | -1/+65 |
| feature(crawler): Allow connecting to browser's websocket address and launchi... | MohamedBassem | 2024-05-15 | 1 | -28/+55 |
| feature: Take full page screenshots #143 (#148) | kamtschatka | 2024-05-12 | 1 | -1/+2 |
| feature(crawler): Allow increasing crawler concurrency and configure storing ... | MohamedBassem | 2024-04-26 | 1 | -0/+13 |
| fix(crawler): Better extraction for amazon images | MohamedBassem | 2024-04-23 | 1 | -0/+2 |
| fix(workers): Set a modern user agent and update the default viewport size | MohamedBassem | 2024-04-23 | 1 | -0/+7 |
| feature: Allow recrawling bookmarks without running inference jobs | MohamedBassem | 2024-04-20 | 1 | -7/+29 |
| feature: Download images and screenshots | MohamedBassem | 2024-04-20 | 1 | -28/+130 |
| feature: Recrawl failed links from admin UI (#95) | Ahmad Mujahid | 2024-04-11 | 1 | -0/+20 |
| fix: Increase default navigation timeout to 30s, make it configurable and add... | MohamedBassem | 2024-04-11 | 1 | -1/+1 |
| fix(crawler): Skip validating URLs in metascrapper as it was already being va... | MohamedBassem | 2024-04-09 | 1 | -0/+3 |
| fix(workers): Increase default timeout to 60s, make it configurable and impro... | MohamedBassem | 2024-04-06 | 1 | -11/+21 |
| fix(workers): Add a timeout to the crawling job to prevent it from getting st... | MohamedBassem | 2024-04-02 | 1 | -1/+2 |
| chore(workers): Remove unused configuration options | MohamedBassem | 2024-03-31 | 1 | -2/+0 |
| format: Add missing lint and format, and format the entire repo | MohamedBassem | 2024-03-30 | 1 | -5/+6 |
| refactor: Validate env variables using zod | MohamedBassem | 2024-03-27 | 1 | -1/+1 |
| docker: Use external chrome docker container | MohamedBassem | 2024-03-24 | 1 | -10/+40 |
| fix(workers): Fix the leaky browser instances in workers during development | MohamedBassem | 2024-03-21 | 1 | -28/+30 |
| fix: Simple validations for crawled URLs | MohamedBassem | 2024-03-21 | 1 | -1/+17 |
| structure: Create apps dir and copy tooling dir from t3-turbo repo | MohamedBassem | 2024-03-14 | 1 | -0/+201 |