/ karakeep
Commit message Author Files -/+
deps: Upgrade drizzle-orm to 0.38.3 Mohamed Bassem 5 -15/+114
refactor: Move asset preprocessing to its own worker out of the inference worker Mohamed Bassem 7 -120/+258
feature: Store crawling status code and allow users to find broken links. Fixes… Mohamed Bassem 9 -6/+1628
feature(workers): Allow running hoarder without chrome as a hard dependency.… Mohamed Bassem 1 -11/+35
fix(workers): Add spaces in tag placeholders for better tokenization Mohamed Bassem 1 -3/+3
feature: Add support for tag placeholders in custom prompts. #111 (#612)
* PR for #111
added $tags, $aiTags and $userTags placeholders that will be replaced with all tags, AI tags or user tags during inference
* Use the new buildImpersonatingTRPCClient util
---------
Co-authored-by: Mohamed Bassem <me@mbassem.com>
kamtschatka 2 -1/+47
fix(workers): Set a timeout on the screenshot call and completely skip it if… Mohamed Bassem 1 -13/+32
fix(workers): Don't block connection to chrome when failing to download adblock… Mohamed Bassem 5 -117/+120
chore(workers): Add extra logging for browser connection errors Mohamed Bassem 1 -1/+1
fix: Stop erroring in video download when there's no media found Mohamed Bassem 1 -1/+5
fix: Improve the robustness of the feed worker Mohamed Bassem 1 -4/+27
fix: Remove old downloaded video when it gets refreshed Mohamed Bassem 1 -0/+2
fix: Only update bookmark tagging/crawling status when worker is out of retries Mohamed Bassem 5 -19/+26
fix: Feed refreshes were not getting re-enqueued for failed jobs Mohamed Bassem 4 -9/+15
fix: Pass arguments to monolith and yt-dlp as array for better escaping Mohamed Bassem 2 -2/+2
fix: Fix bug in tag normalization regex. Fixes #595 Mohamed Bassem 1 -1/+1
feature: Schedule RSS feed refreshes every hour Mohamed Bassem 5 -11/+66
feature: Add support for subscribing to RSS feeds. Fixes #202 Mohamed Bassem 16 -3/+2280
feature: Archive videos using yt-dlp. Fixes #215 (#525)
* Allow downloading more content from a webpage and index it #215
Added a worker that allows downloading videos depending on the environment variables
refactored the code a bit
added new video asset
updated documentation
* Some tweaks
* Drop the dependency on the yt-dlp wrapper
* Update openapi specs
* Don't log an error when the URL is not supported
* Better handle supported websites that don't download anything
---------
Co-authored-by: Mohamed Bassem <me@mbassem.com>
kamtschatka 17 -71/+403
deps: Extract the queue implementation into its own repos Mohamed Bassem 23 -1336/+65
fix: Index the summary in search Mohamed Bassem 3 -1/+4
feature: Add a summarize with AI button for links Mohamed Bassem 12 -11/+1536
refactor: Move inference to the shared package Mohamed Bassem 6 -165/+166
feature: Add OCR support for images. Fixes #296 Mohamed Bassem 8 -2/+139
fix(workers): Pin execa to avoid ERR_PACKAGE_PATH_NOT_EXPORTED error Your Name 2 -11/+19
feature: Allow resetting user passwords, changing their roles and creating new users.…
* How do I set the variable "user" or "system" for AI inference #262
changed from system to user
* Make Myself an Admin #560
added user management functionality to the admin page
* A bunch of UI fixes and simplifications
---------
Co-authored-by: Mohamed Bassem <me@mbassem.com>
kamtschatka 9 -52/+711
deps: Upgrade metascraper for faster docker builds MohamedBassem 2 -126/+242
feature: Allow customizing the inference's context length MohamedBassem 9 -36/+51
feature: Introduce a mechanism to cleanup dangling assets MohamedBassem 10 -8/+351
refactor: Start tracking bookmark assets in the assets table MohamedBassem 14 -175/+1581
refactor: Include userId in the assets table MohamedBassem 7 -0/+1235
deps: Upgrade openai package MohamedBassem 2 -23/+12
feature(web): Add ability to manually trigger full page archives. Fixes #398…
* [Feature Request] Ability to select what to "crawl full page archive" #398
Added the ability to start a full page crawl for links and also in bulk operations
added the ability to refresh links as a bulk operation as well
* minor icon and wording changes
---------
Co-authored-by: MohamedBassem <me@mbassem.com>
kamtschatka 5 -6/+89
feature(web): Add the ability to customize the inference prompts. Fixes #170 MohamedBassem 13 -39/+1764
fix(workers): Log stacktrace on worker error. #424 (#429)
extended logging when an exception occurs, so it is possible to see the stacktrace of a failed execution
kamtschatka 3 -3/+7
deps: Upgrade drizzle and next auth drizzle adapter MohamedBassem 7 -54/+89
feature(worker): Allow configuring inference job timeout and ollama keep alive.… MohamedBassem 4 -19/+26
build: Fix sherif failures by sorting deps MohamedBassem 6 -12/+12
fix(workers): Shutdown workers on SIGTERM MohamedBassem 2 -0/+9
fix: async/await issues with the new queue (#319) kamtschatka 6 -25/+27
refactor: Replace the usage of bullMQ with the hoarder sqlite-based queue (#309) Mohamed Bassem 13 -344/+128
fix: monolith not embedding SVG files correctly. Fixes #289 (#306)
passing in the URL of the page to have the proper URL for resolving relative paths
kamtschatka 1 -5/+2
refactor: added the bookmark type to the database (#256)
* refactoring asset types
Extracted out functions to silently delete assets and to update them after crawling
Generalized the mapping of assets to bookmark fields to make extending them easier
* Added the bookmark type to the database
Introduced an enum to have better type safety
cleaned up the code and based some code on the type directly
* add BookmarkType.UNKNWON
* lint and remove unused function
---------
Co-authored-by: MohamedBassem <me@mbassem.com>
kamtschatka 27 -120/+1266
refactor: remove redundant code from crawler worker and refactor handling of…
* refactoring asset types
Extracted out functions to silently delete assets and to update them after crawling
Generalized the mapping of assets to bookmark fields to make extending them easier
* revert silentDeleteAsset and hide better-sqlite3
---------
Co-authored-by: MohamedBassem <me@mbassem.com>
kamtschatka 3 -65/+80
feature: Automatically transfer image urls into bookmarked assets. Fixes #246 MohamedBassem 2 -9/+23
refactor: extract assets into their own database table. #215 (#220)
* Allow downloading more content from a webpage and index it #215
added a new table that contains the information about assets for link bookmarks
created migration code that transfers the existing data into the new table
* Allow downloading more content from a webpage and index it #215
removed the old asset columns from the database
updated the UI to use the data from the linkBookmarkAssets array
* generalize the assets table to not be linked in particular to links
* fix migrations post merge
* fix missing asset ids in the getBookmarks call
---------
Co-authored-by: MohamedBassem <me@mbassem.com>
kamtschatka 6 -52/+1271
feature: add support for PDF links. Fixes #28 (#216)
* feature request: pdf support #28
Added a new sourceUrl column to the asset bookmarks
Added transforming a link bookmark pointing at a pdf to an asset bookmark
made sure the "View Original" link is also shown for asset bookmarks that have a sourceURL
updated gitignore for IDEA
* remove pdf parsing from the crawler
* extract the http logic into its own function to avoid duplicating the post-processing actions (openai/index)
* Add 5s timeout to the content type fetch
---------
Co-authored-by: MohamedBassem <me@mbassem.com>
kamtschatka 10 -93/+1263
fix: Trigger search re-index on bookmark tag manual updates. Fixes #208 (#210)
* re-index of database is not scanning all places when bookmark tags are changed. Manual indexing is working as workaround #208
introduced a new function to trigger a reindex to reduce copy/paste
added missing reindexes when tags are deleted/bookmarks are updated
* give functions a bit more descriptive name
---------
Co-authored-by: kamtschatka <simon.schatka@gmx.at>
Co-authored-by: MohamedBassem <me@mbassem.com>
kamtschatka 6 -55/+41
fix(workers): AI inferred tags can contain " " at the beginning. Fixes #184…
added a trim to tags to prevent whitespace at the beginning/end of tags
Co-authored-by: kamtschatka <simon.schatka@gmx.at>
kamtschatka 1 -3/+5
fix(crawler): Only update the database if full page archival is enabled MohamedBassem 1 -19/+19
feature: Full page archival with monolith. Fixes #132 MohamedBassem 14 -7/+1259
feature(inference): Improve ollama tagging (#162)
* Inference Failed with Ollama #20
Changed the prompt to be split in 2, so ollama does not forget them
* Update apps/workers/openaiWorker.ts
Co-authored-by: Mohamed Bassem <me@mbassem.com>
---------
Co-authored-by: kamtschatka <simon.schatka@gmx.at>
Co-authored-by: Mohamed Bassem <me@mbassem.com>
kamtschatka 1 -5/+12
feature(crawler): Allow connecting to browser's websocket address and launching… MohamedBassem 3 -36/+70
feature: Take full page screenshots #143 (#148)
Added the fullPage flag to take full screen screenshots
updated the UI accordingly to properly show the screenshots instead of scaling it down
Co-authored-by: kamtschatka <simon.schatka@gmx.at>
kamtschatka 4 -3/+9
fix(inference): Attempt to reuse existing identical tags MohamedBassem 2 -23/+63
feature(crawler): Allow increasing crawler concurrency and configure storing… MohamedBassem 3 -4/+26
fix(crawler): Better extraction for amazon images MohamedBassem 3 -0/+20
fix(workers): Increase robustness of search worker and add extra logging. Fixes… MohamedBassem 1 -24/+45
fix(workers): Set a modern user agent and update the default viewport size MohamedBassem 1 -0/+7
feature: Allow recrawling bookmarks without running inference jobs MohamedBassem 4 -9/+46
feature: Download images and screenshots MohamedBassem 22 -135/+1373
fix: Fix slice call in the content truncation logic which was resulting in… MohamedBassem 1 -1/+1
feature: Add title to bookmarks and allow editing them. Fixes #27 MohamedBassem 17 -54/+1240
fix: Differentiate between pending in db and in redis in admin job stats MohamedBassem 3 -26/+64
feature: Recrawl failed links from admin UI (#95)
* feature: Retry failed crawling URLs
* fix: Enhancing visuals and some minor changes.
Ahmad Mujahid 8 -25/+1067
fix: Increase default navigation timeout to 30s, make it configurable and add… MohamedBassem 5 -6/+17
feature: Add PDF support (#88)
* feature: Add PDF support
* fix: PDF feature enhancements
* fix: Freeze expo-share-intent version to prevent breaking changes
* fix: set endOfLine to auto for cross-platform development
* fix: Upgrading eslint/parser and eslint-plugin to 7.6.0 to solve the linting issues
* fix: enhancing PDF feature
* fix: Allowing null in filename for backward compatibility
* fix: update pnpm file with pnpm 9.0.0-alpha-8
* fix(web): PDF Preview for web
Ahmad Mujahid 24 -107/+2387
feature(inference): Upgrade the default vision model to the new gpt-4-turbo MohamedBassem 4 -10/+11
fix(crawler): Skip validating URLs in metascraper as it was already being… MohamedBassem 1 -0/+3
fix(workers): Increase default timeout to 60s, make it configurable and improve… MohamedBassem 3 -11/+29
feature: Include server version in the admin UI. Fixes #66 MohamedBassem 8 -14/+92
fix(workers): Add a timeout to the crawling job to prevent it from getting… MohamedBassem 2 -1/+18
feat(workers): Allow configuring the language in which the tags are generated.… MohamedBassem 4 -28/+8
chore(workers): Remove unused configuration options MohamedBassem 2 -6/+0
format: Add missing lint and format, and format the entire repo MohamedBassem 57 -192/+255
fix: Sort search results by relevance MohamedBassem 3 -1/+26
feature(web): Add support for attaching notes to bookmarks MohamedBassem 10 -2/+1012
fix: Drop the 2k char limit on notes. Fixes #25 MohamedBassem 2 -7/+12
fix: Attempt to increase the reliability of the ollama inference MohamedBassem 4 -17/+49
feature: Add support for local models using ollama MohamedBassem 10 -81/+206
refactor: Validate env variables using zod MohamedBassem 7 -46/+91
docker: Use external chrome docker container MohamedBassem 8 -33/+61
fix(workers): Fix the leaky browser instances in workers during development MohamedBassem 3 -29/+46
fix: Simple validations for crawled URLs MohamedBassem 1 -1/+17
fix(workers): Drop the restriction of the tags being lowercase and without… MohamedBassem 1 -2/+2
refactor: Change asset storage to be the filesystem instead of sqlite MohamedBassem 16 -75/+2006
Feature: Add support for uploading images and automatically inferring their…
* feature: Experimental support for asset uploads
* feature(web): Add new bookmark type asset
* feature: Add support for automatically tagging images
* fix: Add support for image assets in preview page
* use next Image for fetching the images
* Fix auth and error codes in the route handlers
* Add support for image uploads on mobile
* Fix typing of upload requests
* Remove the ugly dragging box
* Bump mobile version to 1.3
* Change the editor card placeholder to mention uploading images
* Fix a typo
* Change ios icon for photo library
* Silence typescript error
Mohamed Bassem 31 -79/+2736
docker: Fix dockerfiles to adapt to the new repo structure MohamedBassem 8 -172/+46
structure: Create apps dir and copy tooling dir from t3-turbo repo MohamedBassem 396 -9511/+10350