path: root/apps/workers/crawlerWorker.ts
Date | Commit message | Author | Files | Lines
2024-10-06 | refactor: Start tracking bookmark assets in the assets table | MohamedBassem | 1 | -60/+83
2024-10-06 | refactor: Include userId in the assets table | MohamedBassem | 1 | -0/+5
2024-09-30 | feature(web): Add ability to manually trigger full page archives. Fixes #398 ... | kamtschatka | 1 | -3/+5
2024-09-26 | fix(workers): Log stacktrace on worker error. #424 (#429) | kamtschatka | 1 | -1/+3
2024-07-28 | fix(workers): Shutdown workers on SIGTERM | MohamedBassem | 1 | -0/+4
2024-07-21 | fix: async/await issues with the new queue (#319) | kamtschatka | 1 | -2/+2
2024-07-21 | refactor: Replace the usage of bullMQ with the hoarder sqlite-based queue (#309) | Mohamed Bassem | 1 | -31/+29
2024-07-14 | fix: monolith not embedding SVG files correctly. Fixes #289 (#306) | kamtschatka | 1 | -5/+2
2024-07-01 | refactor: added the bookmark type to the database (#256) | kamtschatka | 1 | -0/+6
2024-06-29 | refactor: remove redundant code from crawler worker and refactor handling of ... | kamtschatka | 1 | -32/+49
2024-06-23 | feature: Automatically transfer image urls into bookmared assets. Fixes #246 | MohamedBassem | 1 | -6/+16
2024-06-23 | refactor: extract assets into their own database table. #215 (#220) | kamtschatka | 1 | -29/+71
2024-06-22 | feature: add support for PDF links. Fixes #28 (#216) | kamtschatka | 1 | -57/+163
2024-06-09 | fix: Trigger search re-index on bookmark tag manual updates. Fixes #208 (#210) | kamtschatka | 1 | -5/+2
2024-05-26 | fix(crawler): Only update the database if full page archival is enabled | MohamedBassem | 1 | -19/+19
2024-05-26 | feature: Full page archival with monolith. Fixes #132 | MohamedBassem | 1 | -1/+65
2024-05-15 | feature(crawler): Allow connecting to browser's websocket address and launchi... | MohamedBassem | 1 | -28/+55
2024-05-12 | feature: Take full page screenshots #143 (#148) | kamtschatka | 1 | -1/+2
2024-04-26 | feature(crawler): Allow increasing crawler concurrency and configure storing ... | MohamedBassem | 1 | -0/+13
2024-04-23 | fix(crawler): Better extraction for amazon images | MohamedBassem | 1 | -0/+2
2024-04-23 | fix(workers): Set a modern user agent and update the default viewport size | MohamedBassem | 1 | -0/+7
2024-04-20 | feature: Allow recrawling bookmarks without running inference jobs | MohamedBassem | 1 | -7/+29
2024-04-20 | feature: Download images and screenshots | MohamedBassem | 1 | -28/+130
2024-04-11 | feature: Recrawl failed links from admin UI (#95) | Ahmad Mujahid | 1 | -0/+20
2024-04-11 | fix: Increase default navigation timeout to 30s, make it configurable and add... | MohamedBassem | 1 | -1/+1
2024-04-09 | fix(crawler): Skip validating URLs in metascrapper as it was already being va... | MohamedBassem | 1 | -0/+3
2024-04-06 | fix(workers): Increase default timeout to 60s, make it configurable and impro... | MohamedBassem | 1 | -11/+21
2024-04-02 | fix(workers): Add a timeout to the crawling job to prevent it from getting st... | MohamedBassem | 1 | -1/+2
2024-03-31 | chore(workers): Remove unused configuration options | MohamedBassem | 1 | -2/+0
2024-03-30 | format: Add missing lint and format, and format the entire repo | MohamedBassem | 1 | -5/+6
2024-03-27 | refactor: Validate env variables using zod | MohamedBassem | 1 | -1/+1
2024-03-24 | docker: Use external chrome docker container | MohamedBassem | 1 | -10/+40
2024-03-21 | fix(workers): Fix the leaky browser instances in workers during development | MohamedBassem | 1 | -28/+30
2024-03-21 | fix: Simple validations for crawled URLs | MohamedBassem | 1 | -1/+17
2024-03-14 | structure: Create apps dir and copy tooling dir from t3-turbo repo | MohamedBassem | 1 | -0/+0
2024-03-05 | feature: Store html content of links in the database | MohamedBassem | 1 | -0/+1
2024-03-05 | fix: Use puppeteer adblocker to block cookies notices | MohamedBassem | 1 | -0/+6
2024-03-02 | feature: Store full link content and index them | MohamedBassem | 1 | -1/+12
2024-03-01 | feature: Add full text search support | MohamedBassem | 1 | -0/+8
2024-02-23 | db: Migrate from prisma to drizzle | MohamedBassem | 1 | -10/+10
2024-02-20 | branding: Rename app to Hoarder | MohamedBassem | 1 | -4/+4
2024-02-17 | build: Fix docker images | MohamedBassem | 1 | -1/+5
2024-02-17 | fix: Let the crawler wait a bit more for page load | MohamedBassem | 1 | -2/+12
2024-02-14 | fix: Harden puppeteer against browser disconnections and exceptions | MohamedBassem | 1 | -16/+33
2024-02-14 | feature: Add ability to refresh bookmark details | MohamedBassem | 1 | -1/+13
2024-02-11 | fix: Fix build for workers package and add it to CI | MohamedBassem | 1 | -8/+42
2024-02-09 | [feature] Use puppeteer for fetching websites | MohamedBassem | 1 | -4/+32
2024-02-09 | [chore] Linting and formating tweaking | MohamedBassem | 1 | -5/+11
2024-02-09 | [refactor] Extract the bookmark model to be a high level model to support oth... | MohamedBassem | 1 | -23/+9
2024-02-08 | [refactor] Move the different packages to the package subdir | MohamedBassem | 1 | -0/+0