path: root/apps/workers
Commit message    Author    Age    Files    Lines
* chore: add format:fix and lint:fix scripts to all packages    Mohamed Bassem (aider)    2024-12-31    1    -0/+2
|
* deps: Upgrade drizzle-orm to 0.38.3    Mohamed Bassem    2024-12-29    1    -1/+1
|
* refactor: Move asset preprocessing to its own worker out of the inference worker    Mohamed Bassem    2024-12-26    5    -118/+231
|
* feature: Store crawling status code and allow users to find broken links. Fixes #169    Mohamed Bassem    2024-12-08    1    -4/+6
|
* feature(workers): Allow running hoarder without chrome as a hard dependency. Fixes #650    Mohamed Bassem    2024-11-30    1    -11/+35
|
* fix(workers): Add spaces in tag placeholders for better tokenization    Mohamed Bassem    2024-11-24    1    -3/+3
|
* feature: Add support for tag placeholders in custom prompts. #111 (#612)    kamtschatka    2024-11-24    1    -1/+42
|     * PR for #111
|       added a $tags, $aiTags and $userTags placeholder that will be replaced with all tags, ai tags or user tags during inference
|     * Use the new buildImpersonatingTRPCClient util
|     ---------
|     Co-authored-by: Mohamed Bassem <me@mbassem.com>
* fix(workers): Set a timeout on the screenshot call and completely skip it if screenshotting is disabled    Mohamed Bassem    2024-11-23    1    -13/+32
|
* fix(workers): Don't block connection to chrome when failing to download adblock list. #674    Mohamed Bassem    2024-11-21    2    -7/+24
|
* chore(workers): Add extra logging for browser connection errors    Mohamed Bassem    2024-11-21    1    -1/+1
|
* fix: Stop erroring in video download when there's no media found    Mohamed Bassem    2024-11-09    1    -1/+5
|
* fix: Improve the robustness of the feed worker    Mohamed Bassem    2024-11-09    1    -4/+27
|
* fix: Remove old downloaded video when it gets refreshed    Mohamed Bassem    2024-11-09    1    -0/+2
|
* fix: Only update bookmark tagging/crawling status when worker is out of retries    Mohamed Bassem    2024-11-09    5    -19/+26
|
* fix: Feed refreshes were not getting re-enqueued for failed jobs    Mohamed Bassem    2024-11-09    1    -1/+1
|
* fix: Pass arguments to monolith and yt-dlp as array for better escaping    Mohamed Bassem    2024-11-03    2    -2/+2
|
* fix: Fix bug in tag normalization regex. Fixes #595    Mohamed Bassem    2024-11-03    1    -1/+1
|
* feature: Schedule RSS feed refreshes every hour    Mohamed Bassem    2024-11-03    3    -3/+37
|
* feature: Add support for subscribing to RSS feeds. Fixes #202    Mohamed Bassem    2024-11-03    4    -2/+190
|
* feature: Archive videos using yt-dlp. Fixes #215 (#525)    kamtschatka    2024-10-28    4    -52/+272
|     * Allow downloading more content from a webpage and index it #215
|       Added a worker that allows downloading videos depending on the environment variables
|       refactored the code a bit
|       added new video asset
|       updated documentation
|     * Some tweaks
|     * Drop the dependency on the yt-dlp wrapper
|     * Update openapi specs
|     * Don't log an error when the url is not supported
|     * Better handle supported websites that don't download anything
|     ---------
|     Co-authored-by: Mohamed Bassem <me@mbassem.com>
* deps: Extract the queue implementation into its own repos    Mohamed Bassem    2024-10-27    5    -5/+5
|
* fix: Index the summary in search    Mohamed Bassem    2024-10-27    1    -0/+1
|
* feature: Add a summarize with AI button for links    Mohamed Bassem    2024-10-27    1    -3/+6
|
* refactor: Move inference to the shared package    Mohamed Bassem    2024-10-26    3    -159/+2
|
* feature: Add OCR support for images. Fixes #296    Mohamed Bassem    2024-10-20    3    -1/+44
|
* fix(workers): Pin execa to avoid ERR_PACKAGE_PATH_NOT_EXPORTED error    Your Name    2024-10-19    1    -1/+1
|
* feature: Allow resetting user password, change their roles and create new users. Fixes #495 (#567)    kamtschatka    2024-10-19    1    -1/+1
|     * How do I set the variable "user" or "system" for AI inference #262
|       changed from system to user
|     * Make Myself an Admin #560
|       added user management functionality to the admin page
|     * A bunch of UI fixes and simplifications
|     ---------
|     Co-authored-by: Mohamed Bassem <me@mbassem.com>
* deps: Upgrade metascraper for faster docker builds    MohamedBassem    2024-10-12    1    -10/+10
|
* feature: Allow customizing the inference's context length    MohamedBassem    2024-10-12    4    -19/+11
|
* feature: Introduce a mechanism to cleanup dangling assets    MohamedBassem    2024-10-12    2    -3/+116
|
* refactor: Start tracking bookmark assets in the assets table    MohamedBassem    2024-10-06    1    -60/+83
|
* refactor: Include userId in the assets table    MohamedBassem    2024-10-06    1    -0/+5
|
* deps: Upgrade openai package    MohamedBassem    2024-10-05    1    -1/+1
|
* feature(web): Add ability to manually trigger full page archives. Fixes #398 (#418)    kamtschatka    2024-09-30    1    -3/+5
|     * [Feature Request] Ability to select what to "crawl full page archive" #398
|       Added the ability to start a full page crawl for links and also in bulk operations
|       added the ability to refresh links as a bulk operation as well
|     * minor icon and wording changes
|     ---------
|     Co-authored-by: MohamedBassem <me@mbassem.com>
* feature(web): Add the ability to customize the inference prompts. Fixes #170    MohamedBassem    2024-09-29    1    -39/+42
|
* fix(workers): Log stacktrace on worker error. #424 (#429)    kamtschatka    2024-09-26    3    -3/+7
|     extended logging when an exception occurs, so it is possible to see the stacktrace of a failed execution
* deps: Upgrade drizzle and next auth drizzle adapter    MohamedBassem    2024-09-15    1    -1/+1
|
* feature(worker): Allow configuring inference job timeout and ollama keep alive. Fixes #389 #224    MohamedBassem    2024-09-15    2    -1/+2
|
* build: Fix sherif failures by sorting deps    MohamedBassem    2024-08-31    1    -1/+1
|
* fix(workers): Shutdown workers on SIGTERM    MohamedBassem    2024-07-28    2    -0/+9
|
* fix: async/await issues with the new queue (#319)    kamtschatka    2024-07-21    2    -3/+3
|
* refactor: Replace the usage of bullMQ with the hoarder sqlite-based queue (#309)    Mohamed Bassem    2024-07-21    5    -72/+75
|
* fix: monolith not embedding SVG files correctly. Fixes #289 (#306)    kamtschatka    2024-07-14    1    -5/+2
|     passing in the URL of the page to have the proper URL for resolving relative paths
* refactor: added the bookmark type to the database (#256)    kamtschatka    2024-07-01    1    -0/+6
|     * refactoring asset types
|       Extracted out functions to silently delete assets and to update them after crawling
|       Generalized the mapping of assets to bookmark fields to make extending them easier
|     * Added the bookmark type to the database
|       Introduced an enum to have better type safety
|       cleaned up the code and based some code on the type directly
|     * add BookmarkType.UNKNWON
|     * lint and remove unused function
|     ---------
|     Co-authored-by: MohamedBassem <me@mbassem.com>
* refactor: remove redundant code from crawler worker and refactor handling of asset types (#253)    kamtschatka    2024-06-29    1    -32/+49
|     * refactoring asset types
|       Extracted out functions to silently delete assets and to update them after crawling
|       Generalized the mapping of assets to bookmark fields to make extending them easier
|     * revert silentDeleteAsset and hide better-sqlite3
|     ---------
|     Co-authored-by: MohamedBassem <me@mbassem.com>
* feature: Automatically transfer image urls into bookmarked assets. Fixes #246    MohamedBassem    2024-06-23    1    -6/+16
|
* refactor: extract assets into their own database table. #215 (#220)    kamtschatka    2024-06-23    1    -29/+71
|     * Allow downloading more content from a webpage and index it #215
|       added a new table that contains the information about assets for link bookmarks
|       created migration code that transfers the existing data into the new table
|     * Allow downloading more content from a webpage and index it #215
|       removed the old asset columns from the database
|       updated the UI to use the data from the linkBookmarkAssets array
|     * generalize the assets table to not be linked in particular to links
|     * fix migrations post merge
|     * fix missing asset ids in the getBookmarks call
|     ---------
|     Co-authored-by: MohamedBassem <me@mbassem.com>
* feature: add support for PDF links. Fixes #28 (#216)    kamtschatka    2024-06-22    1    -57/+163
|     * feature request: pdf support #28
|       Added a new sourceUrl column to the asset bookmarks
|       Added transforming a link bookmark pointing at a pdf to an asset bookmark
|       made sure the "View Original" link is also shown for asset bookmarks that have a sourceURL
|       updated gitignore for IDEA
|     * remove pdf parsing from the crawler
|     * extract the http logic into its own function to avoid duplicating the post-processing actions (openai/index)
|     * Add 5s timeout to the content type fetch
|     ---------
|     Co-authored-by: MohamedBassem <me@mbassem.com>
* fix: Trigger search re-index on bookmark tag manual updates. Fixes #208 (#210)    kamtschatka    2024-06-09    2    -10/+4
|     * re-index of database is not scanning all places when bookmark tags are changed. Manual indexing is working as workaround #208
|       introduced a new function to trigger a reindex to reduce copy/paste
|       added missing reindexes when tags are deleted/bookmarks are updated
|     * give functions a bit more descriptive name
|     ---------
|     Co-authored-by: kamtschatka <simon.schatka@gmx.at>
|     Co-authored-by: MohamedBassem <me@mbassem.com>
* fix(workers): AI inferred tags can contain " " at the beginning. Fixes #184 (#194)    kamtschatka    2024-06-07    1    -3/+5
|     added a trim to tags to prevent whitespace at the beginning/end of tags
|     Co-authored-by: kamtschatka <simon.schatka@gmx.at>