Commit history: apps/workers/crawlerWorker.ts
(newest first; each entry shows author, date, files changed, and -removed/+added lines)
* feat: Add AI auto summarization. Fixes #1163 (Mohamed Bassem, 2025-05-18; 1 file, -877/+0)
* chore: rename missing files/conf from Hoarder to Karakeep (#1280) (adripo, 2025-04-21; 1 file, -1/+1)
    - refactor: Rename remaining project configuration from Hoarder to Karakeep
    - some fixes
    Co-authored-by: Mohamed Bassem <me@mbassem.com>
* fix(workers): Fix dompurify to run on readability's input, not its output (Mohamed Bassem, 2025-04-21; 1 file, -4/+12)
* fix(workers): Close the browser when connecting on demand (#1151) (Chang-Yen Tseng, 2025-04-16; 1 file, -0/+3)
* chore: Rename hoarder packages to karakeep (MohamedBassem, 2025-04-12; 1 file, -8/+8)
* feat(workers): Add CRAWLER_SCREENSHOT_TIMEOUT_SEC (#1155) (Chang-Yen Tseng, 2025-03-27; 1 file, -10/+18)
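The screenshot-timeout commit above introduces a per-call time limit. A minimal sketch of such a guard in TypeScript, assuming a Promise-race-style wrapper; the helper name and wiring are illustrative, not the project's actual code:

```typescript
// Reject a long-running operation after a configurable number of seconds.
// "withTimeout" and "label" are invented names for illustration only.
function withTimeout<T>(promise: Promise<T>, timeoutSec: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutSec}s`)),
      timeoutSec * 1000,
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// A fake "screenshot" that resolves quickly stays under the limit.
withTimeout(Promise.resolve("screenshot-bytes"), 5, "screenshot")
  .then((v) => console.log(v)); // screenshot-bytes
```

The same pattern works for any browser call that can hang, which is presumably why the commit makes the limit an environment variable rather than a constant.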
* feat(workers): Adds publisher and author og:meta tags to Bookmark (#1141) (erik-nilcoast, 2025-03-22; 1 file, -0/+24)
* feat: Add PDF screenshot generation and display (#995) (Ahmad Mujahid, 2025-02-17; 1 file, -0/+1)
    - Updated pdf2json to 3.1.5
    - Extract and store a screenshot from PDF files using pdf2pic
    - Installing graphicsmagick and ghostscript
    - Generate missing PDF screenshot with tidyAssets worker for backward support
    - Display PDF screenshot instead of the PDF in web if it exists
    - Display PDF screenshot in mobile app if it exists
    - Updated pnpm-lock.yaml
    - Removed console.log
    - Revert the unnecessary changes in package.json
    - Revert pnpm-lock changes
    - Prevent rendering PDF files if the screenshot is not generated
    - refactor: replace useEffect with useMemo for section initialization
    - feat: show PDF file download button and handle large PDFs by defaulting to screenshot view
    - feat: add file size to openapi spec
    - feature: Add assets preprocessing in fix mode to admin actions
    - i18n: add reprocess_assets_fix_mode translation
    - i18n: Add missing ar translations
    - A bunch of fixes
    - Fix openspec schema
    Co-authored-by: Mohamed Bassem <me@mbassem.com>
* fix: Don't rearchive singlefile uploads and consider them as archives (Mohamed Bassem, 2025-02-02; 1 file, -2/+6)
* fix: Abort all IO when workers time out instead of detaching. Fixes #742 (Mohamed Bassem, 2025-02-01; 1 file, -13/+62)
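The "abort all IO" fix replaces detaching from a stuck job with actually cancelling its in-flight work. A hedged sketch of how one AbortSignal can fan out to every pending operation; the function names are invented for illustration, and AbortController is standard in Node.js 15+:

```typescript
// A cancellable delay standing in for any IO step (fetch, screenshot, etc.).
// Real IO APIs such as fetch() accept the same signal directly.
function abortableDelay(ms: number, signal: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal.aborted) return reject(new Error("aborted"));
    const timer = setTimeout(resolve, ms);
    signal.addEventListener(
      "abort",
      () => { clearTimeout(timer); reject(new Error("aborted")); },
      { once: true },
    );
  });
}

const controller = new AbortController();
// A worker-level timeout would call controller.abort() when the job overruns,
// causing every operation holding this signal to reject promptly:
controller.abort();
console.log(controller.signal.aborted); // true
```

The key property is that nothing keeps running in the background after the worker gives up, which is what "detaching" failed to guarantee.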
* feat: Change webhooks to be configurable by users (Mohamed Bassem, 2025-01-19; 1 file, -2/+2)
* feat(webhook): Implement webhook functionality for bookmark events (#852) (玄猫, 2025-01-19; 1 file, -0/+4)
    - feat(webhook): Implement webhook functionality for bookmark events
      - Added WebhookWorker to handle webhook requests
      - Integrated webhook triggering in crawlerWorker after video processing
      - Updated main worker initialization to include WebhookWorker
      - Enhanced configuration to support webhook URLs, token, and timeout
      - Documented webhook configuration options in the documentation
      - Introduced zWebhookRequestSchema for validating webhook requests
    - feat(webhook): Update webhook handling and configuration
      - Changed webhook operation type from "create" to "crawled" in crawlerWorker and documentation
      - Enhanced webhook retry logic in WebhookWorker to support multiple attempts
      - Updated Docker configuration to include new webhook environment variables
      - Improved validation for webhook configuration in shared config
      - Adjusted zWebhookRequestSchema to reflect the new operation type
      - Updated documentation to clarify webhook configuration options and usage
    - minor modifications
    Co-authored-by: Mohamed Bassem <me@mbassem.com>
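The webhook entries mention a "crawled" operation validated by zWebhookRequestSchema. As a purely illustrative sketch (the field names below are assumptions, not the project's real schema), the request body a worker POSTs to each configured webhook URL might look like:

```typescript
// Hypothetical shape of a webhook request; only the "crawled" operation
// string is taken from the commit messages, the rest is invented.
interface WebhookRequest {
  bookmarkId: string;  // hypothetical field
  userId: string;      // hypothetical field
  operation: "crawled";
}

const payload: WebhookRequest = {
  bookmarkId: "bm_456",
  userId: "user_789",
  operation: "crawled",
};

// The worker would serialize this and POST it, retrying a few times on
// failure (the follow-up commit adds multi-attempt retry logic).
console.log(JSON.stringify(payload));
```

Validating the payload with a zod schema on both ends (as zWebhookRequestSchema suggests) keeps producers and consumers in sync when the operation type changes, which is exactly what the "create" to "crawled" rename required.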
* feat: Add support for singlefile extension uploads. #172 (Mohamed Bassem, 2025-01-11; 1 file, -6/+30)
* refactor: Move asset preprocessing to its own worker, out of the inference worker (Mohamed Bassem, 2024-12-26; 1 file, -17/+18)
* feature: Store crawling status code and allow users to find broken links. Fixes #169 (Mohamed Bassem, 2024-12-08; 1 file, -4/+6)
* feature(workers): Allow running hoarder without chrome as a hard dependency. Fixes #650 (Mohamed Bassem, 2024-11-30; 1 file, -11/+35)
* fix(workers): Set a timeout on the screenshot call and completely skip it if screenshotting is disabled (Mohamed Bassem, 2024-11-23; 1 file, -13/+32)
* fix(workers): Don't block connection to chrome when failing to download the adblock list. #674 (Mohamed Bassem, 2024-11-21; 1 file, -6/+22)
* chore(workers): Add extra logging for browser connection errors (Mohamed Bassem, 2024-11-21; 1 file, -1/+1)
* fix: Only update bookmark tagging/crawling status when the worker is out of retries (Mohamed Bassem, 2024-11-09; 1 file, -4/+4)
* fix: Pass arguments to monolith and yt-dlp as an array for better escaping (Mohamed Bassem, 2024-11-03; 1 file, -1/+1)
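The escaping fix above relies on a general rule: passing argv as an array (execFile/spawn) bypasses the shell entirely, so quotes, spaces, and metacharacters in a URL can never be reinterpreted as commands. A small demonstration using Node itself in place of monolith or yt-dlp:

```typescript
// With array arguments there is no shell parsing step: each element of the
// array arrives as exactly one argv entry in the child process.
import { spawnSync } from "node:child_process";

// A URL that would be dangerous if interpolated into a shell string.
const untrusted = 'https://example.com/?q="; echo pwned';

// node -e "<script>" <arg> exposes the extra arg as process.argv[1].
const result = spawnSync(
  process.execPath,
  ["-e", "console.log(process.argv[1])", untrusted],
  { encoding: "utf8" },
);

// The child receives the URL verbatim; nothing was shell-expanded.
console.log(result.stdout.trim());
```

Building a single command string and passing it through a shell would require quoting every argument correctly; the array form makes that entire class of bug impossible.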
* feature: Archive videos using yt-dlp. Fixes #215 (#525) (kamtschatka, 2024-10-28; 1 file, -49/+10)
    - Allow downloading more content from a webpage and index it (#215): added a worker that downloads videos depending on the environment variables, refactored the code a bit, added a new video asset, updated documentation
    - Some tweaks
    - Drop the dependency on the yt-dlp wrapper
    - Update openapi specs
    - Dont log an error when the url is not supported
    - Better handle supported websites that dont download anything
    Co-authored-by: Mohamed Bassem <me@mbassem.com>
* deps: Extract the queue implementation into its own repo (Mohamed Bassem, 2024-10-27; 1 file, -1/+1)
* refactor: Start tracking bookmark assets in the assets table (MohamedBassem, 2024-10-06; 1 file, -60/+83)
* refactor: Include userId in the assets table (MohamedBassem, 2024-10-06; 1 file, -0/+5)
* feature(web): Add ability to manually trigger full page archives. Fixes #398 (#418) (kamtschatka, 2024-09-30; 1 file, -3/+5)
    - [Feature Request] Ability to select what to "crawl full page archive" (#398): added the ability to start a full page crawl for links, and added refreshing links as a bulk operation as well
    - minor icon and wording changes
    Co-authored-by: MohamedBassem <me@mbassem.com>
* fix(workers): Log stacktrace on worker error. #424 (#429) (kamtschatka, 2024-09-26; 1 file, -1/+3)
    Extended logging when an exception occurs, so it is possible to see the stacktrace of a failed execution.
* fix(workers): Shutdown workers on SIGTERM (MohamedBassem, 2024-07-28; 1 file, -0/+4)
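For the SIGTERM commit, a hedged sketch of cooperative worker shutdown; the StoppableWorker interface and its stop() method are hypothetical names, not the project's actual API:

```typescript
// Each worker exposes some way to stop pulling new jobs (name is invented).
interface StoppableWorker {
  name: string;
  stop(): void;
}

// Stop every worker and report which ones were shut down.
function shutdownAll(workers: StoppableWorker[]): string[] {
  const stopped: string[] = [];
  for (const w of workers) {
    w.stop();
    stopped.push(w.name);
  }
  return stopped;
}

const workers: StoppableWorker[] = [
  { name: "crawler", stop() { /* close browser, release queue lease */ } },
  { name: "webhook", stop() { /* flush pending deliveries */ } },
];

// Docker sends SIGTERM on `docker stop`; without this handler the process
// would be SIGKILLed mid-job after the grace period.
process.on("SIGTERM", () => {
  shutdownAll(workers);
  process.exit(0);
});

console.log(shutdownAll(workers)); // [ 'crawler', 'webhook' ]
```

Handling SIGTERM matters most in containerized deployments, where an abrupt kill can leave jobs stuck in a "running" state in the queue.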
* fix: async/await issues with the new queue (#319) (kamtschatka, 2024-07-21; 1 file, -2/+2)
* refactor: Replace the usage of bullMQ with the hoarder sqlite-based queue (#309) (Mohamed Bassem, 2024-07-21; 1 file, -31/+29)
* fix: monolith not embedding SVG files correctly. Fixes #289 (#306) (kamtschatka, 2024-07-14; 1 file, -5/+2)
    Passing in the URL of the page to have the proper URL for resolving relative paths.
* refactor: added the bookmark type to the database (#256) (kamtschatka, 2024-07-01; 1 file, -0/+6)
    - refactoring asset types: extracted out functions to silently delete assets and to update them after crawling; generalized the mapping of assets to bookmark fields to make extending them easier
    - Added the bookmark type to the database: introduced an enum to have better type safety; cleaned up the code and based some code on the type directly
    - add BookmarkType.UNKNWON
    - lint and remove unused function
    Co-authored-by: MohamedBassem <me@mbassem.com>
* refactor: remove redundant code from crawler worker and refactor handling of asset types (#253) (kamtschatka, 2024-06-29; 1 file, -32/+49)
    - refactoring asset types: extracted out functions to silently delete assets and to update them after crawling; generalized the mapping of assets to bookmark fields to make extending them easier
    - revert silentDeleteAsset and hide better-sqlite3
    Co-authored-by: MohamedBassem <me@mbassem.com>
* feature: Automatically transfer image urls into bookmarked assets. Fixes #246 (MohamedBassem, 2024-06-23; 1 file, -6/+16)
* refactor: extract assets into their own database table. #215 (#220) (kamtschatka, 2024-06-23; 1 file, -29/+71)
    - Added a new table that contains the information about assets for link bookmarks, plus migration code that transfers the existing data into the new table
    - Removed the old asset columns from the database and updated the UI to use the data from the linkBookmarkAssets array
    - generalize the assets table to not be linked in particular to links
    - fix migrations post merge
    - fix missing asset ids in the getBookmarks call
    Co-authored-by: MohamedBassem <me@mbassem.com>
* feature: add support for PDF links. Fixes #28 (#216) (kamtschatka, 2024-06-22; 1 file, -57/+163)
    - Added a new sourceUrl column to the asset bookmarks
    - Added transforming a link bookmark pointing at a pdf to an asset bookmark
    - Made sure the "View Original" link is also shown for asset bookmarks that have a sourceURL
    - Updated gitignore for IDEA
    - remove pdf parsing from the crawler
    - extract the http logic into its own function to avoid duplicating the post-processing actions (openai/index)
    - Add 5s timeout to the content type fetch
    Co-authored-by: MohamedBassem <me@mbassem.com>
* fix: Trigger search re-index on bookmark tag manual updates. Fixes #208 (#210) (kamtschatka, 2024-06-09; 1 file, -5/+2)
    - Re-indexing was not covering all places when bookmark tags are changed; manual indexing worked as a workaround (#208). Introduced a new function to trigger a reindex to reduce copy/paste, and added missing reindexes when tags are deleted or bookmarks are updated
    - give functions a bit more descriptive name
    Co-authored-by: kamtschatka <simon.schatka@gmx.at>
    Co-authored-by: MohamedBassem <me@mbassem.com>
* fix(crawler): Only update the database if full page archival is enabled (MohamedBassem, 2024-05-26; 1 file, -19/+19)
* feature: Full page archival with monolith. Fixes #132 (MohamedBassem, 2024-05-26; 1 file, -1/+65)
* feature(crawler): Allow connecting to the browser's websocket address and launching the browser on demand. This enables support for browserless (MohamedBassem, 2024-05-15; 1 file, -28/+55)
* feature: Take full page screenshots #143 (#148) (kamtschatka, 2024-05-12; 1 file, -1/+2)
    Added the fullPage flag to take full-page screenshots, and updated the UI to properly show the screenshots instead of scaling them down.
    Co-authored-by: kamtschatka <simon.schatka@gmx.at>
* feature(crawler): Allow increasing crawler concurrency and configuring the storing of images and screenshots (MohamedBassem, 2024-04-26; 1 file, -0/+13)
* fix(crawler): Better extraction for amazon images (MohamedBassem, 2024-04-23; 1 file, -0/+2)
* fix(workers): Set a modern user agent and update the default viewport size (MohamedBassem, 2024-04-23; 1 file, -0/+7)
* feature: Allow recrawling bookmarks without running inference jobs (MohamedBassem, 2024-04-20; 1 file, -7/+29)
* feature: Download images and screenshots (MohamedBassem, 2024-04-20; 1 file, -28/+130)
* feature: Recrawl failed links from admin UI (#95) (Ahmad Mujahid, 2024-04-11; 1 file, -0/+20)
    - feature: Retry failed crawling URLs
    - fix: Enhancing visuals and some minor changes
* fix: Increase default navigation timeout to 30s, make it configurable, and add retries to crawling jobs (MohamedBassem, 2024-04-11; 1 file, -1/+1)
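The navigation-timeout commit pairs a configurable limit with job retries. A minimal retry-helper sketch; the environment variable name and attempt count below are illustrative, not the project's actual configuration:

```typescript
// Illustrative: read a timeout from an env var with a 30s default,
// as the commit message describes (the variable name is an assumption).
const NAV_TIMEOUT_MS =
  Number(process.env.CRAWLER_NAVIGATION_TIMEOUT_SEC ?? "30") * 1000;

// Run fn up to `attempts` times, rethrowing the last error if all fail.
async function withRetries<T>(attempts: number, fn: () => Promise<T>): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
    }
  }
  throw lastErr;
}

// Example: a crawl that fails twice and succeeds on the third attempt.
let calls = 0;
withRetries(3, async () => {
  calls++;
  if (calls < 3) throw new Error("navigation timed out");
  return "crawled";
}).then((r) => console.log(r, calls)); // crawled 3
```

Retrying at the job level (rather than inside the page navigation) lets a fresh browser context handle transient failures like slow DNS or a wedged tab.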
* fix(crawler): Skip validating URLs in metascrapper as it was already being validated. Fixes #22 (MohamedBassem, 2024-04-09; 1 file, -0/+3)
* fix(workers): Increase default timeout to 60s, make it configurable, and improve logging (MohamedBassem, 2024-04-06; 1 file, -11/+21)