path: root/apps/workers/crawlerWorker.ts
Date | Commit message | Author | Files | Lines
2024-04-26 | feature(crawler): Allow increasing crawler concurrency and configure storing ... | MohamedBassem | 1 | -0/+13
2024-04-23 | fix(crawler): Better extraction for amazon images | MohamedBassem | 1 | -0/+2
2024-04-23 | fix(workers): Set a modern user agent and update the default viewport size | MohamedBassem | 1 | -0/+7
2024-04-20 | feature: Allow recrawling bookmarks without running inference jobs | MohamedBassem | 1 | -7/+29
2024-04-20 | feature: Download images and screenshots | MohamedBassem | 1 | -28/+130
2024-04-11 | feature: Recrawl failed links from admin UI (#95) | Ahmad Mujahid | 1 | -0/+20
2024-04-11 | fix: Increase default navigation timeout to 30s, make it configurable and add... | MohamedBassem | 1 | -1/+1
2024-04-09 | fix(crawler): Skip validating URLs in metascrapper as it was already being va... | MohamedBassem | 1 | -0/+3
2024-04-06 | fix(workers): Increase default timeout to 60s, make it configurable and impro... | MohamedBassem | 1 | -11/+21
2024-04-02 | fix(workers): Add a timeout to the crawling job to prevent it from getting st... | MohamedBassem | 1 | -1/+2
2024-03-31 | chore(workers): Remove unused configuration options | MohamedBassem | 1 | -2/+0
2024-03-30 | format: Add missing lint and format, and format the entire repo | MohamedBassem | 1 | -5/+6
2024-03-27 | refactor: Validate env variables using zod | MohamedBassem | 1 | -1/+1
2024-03-24 | docker: Use external chrome docker container | MohamedBassem | 1 | -10/+40
2024-03-21 | fix(workers): Fix the leaky browser instances in workers during development | MohamedBassem | 1 | -28/+30
2024-03-21 | fix: Simple validations for crawled URLs | MohamedBassem | 1 | -1/+17
2024-03-14 | structure: Create apps dir and copy tooling dir from t3-turbo repo | MohamedBassem | 1 | -0/+0
2024-03-05 | feature: Store html content of links in the database | MohamedBassem | 1 | -0/+1
2024-03-05 | fix: Use puppeteer adblocker to block cookies notices | MohamedBassem | 1 | -0/+6
2024-03-02 | feature: Store full link content and index them | MohamedBassem | 1 | -1/+12
2024-03-01 | feature: Add full text search support | MohamedBassem | 1 | -0/+8
2024-02-23 | db: Migrate from prisma to drizzle | MohamedBassem | 1 | -10/+10
2024-02-20 | branding: Rename app to Hoarder | MohamedBassem | 1 | -4/+4
2024-02-17 | build: Fix docker images | MohamedBassem | 1 | -1/+5
2024-02-17 | fix: Let the crawler wait a bit more for page load | MohamedBassem | 1 | -2/+12
2024-02-14 | fix: Harden puppeteer against browser disconnections and exceptions | MohamedBassem | 1 | -16/+33
2024-02-14 | feature: Add ability to refresh bookmark details | MohamedBassem | 1 | -1/+13
2024-02-11 | fix: Fix build for workers package and add it to CI | MohamedBassem | 1 | -8/+42
2024-02-09 | [feature] Use puppeteer for fetching websites | MohamedBassem | 1 | -4/+32
2024-02-09 | [chore] Linting and formating tweaking | MohamedBassem | 1 | -5/+11
2024-02-09 | [refactor] Extract the bookmark model to be a high level model to support oth... | MohamedBassem | 1 | -23/+9
2024-02-08 | [refactor] Move the different packages to the package subdir | MohamedBassem | 1 | -0/+0
2024-02-07 | [feature] Add openAI integration for extracting tags from articles | MohamedBassem | 1 | -0/+6
2024-02-07 | [refactor] Rename the crawlers package to workers | MohamedBassem | 1 | -0/+0
2024-02-06 | Implement metadata fetching logic in the crawler | MohamedBassem | 1 | -2/+68
2024-02-06 | Init package and start bullmq workers | MohamedBassem | 1 | -0/+6