author MohamedBassem <me@mbassem.com> 2024-05-26 10:14:42 +0000
committer MohamedBassem <me@mbassem.com> 2024-05-26 10:14:42 +0000
commit 9198c1b7e15c79a9b0452e8c2a6b702df6a37b60
tree 5311b9758727315161bd23c21aa32ef8f4c602f3 /docs
parent dedc5fb24536832eae2c18d84efa2a92272c955c
docs: Document the new CRAWLER_FULL_PAGE_ARCHIVE flag
Diffstat (limited to 'docs')
-rw-r--r-- docs/docs/03-configuration.md | 1
1 files changed, 1 insertions, 0 deletions
diff --git a/docs/docs/03-configuration.md b/docs/docs/03-configuration.md
index fc9e70db..277d182e 100644
--- a/docs/docs/03-configuration.md
+++ b/docs/docs/03-configuration.md
@@ -47,5 +47,6 @@ Either `OPENAI_API_KEY` or `OLLAMA_BASE_URL` need to be set for automatic taggin
| CRAWLER_DOWNLOAD_BANNER_IMAGE | No | true | Whether to cache the banner image used in the cards locally or fetch it each time directly from the website. Caching it consumes more storage space, but is more resilient against link rot and rate limits from websites. |
| CRAWLER_STORE_SCREENSHOT | No | true | Whether to store a screenshot from the crawled website or not. Screenshots act as a fallback for when we fail to extract an image from a website. You can also view the stored screenshots for any link. |
| CRAWLER_FULL_PAGE_SCREENSHOT | No | false | Whether to store a screenshot of the full page or not. Disabled by default, as it can lead to much higher disk usage. If disabled, the screenshot will only include the visible part of the page |
+| CRAWLER_FULL_PAGE_ARCHIVE | No | false | Whether to store a full local copy of the page or not. Disabled by default, as it can lead to much higher disk usage. If disabled, only the readable text of the page is archived. |
| CRAWLER_JOB_TIMEOUT_SEC | No | 60 | How long to wait for the crawler job to finish before timing out. If you have a slow internet connection or a low powered device, you might want to bump this up a bit |
| CRAWLER_NAVIGATE_TIMEOUT_SEC | No | 30 | How long to spend navigating to the page (along with its redirects). Increase this if you have a slow internet connection |
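
For illustration, turning on the flag documented in this commit might look like the env-file fragment below. This is a sketch only: it assumes the variables are supplied through an env file, and the file name `.env` is an assumption that does not appear in this diff. The variable names and default values come from the table rows above.

```
# Hypothetical .env fragment; the file name and location are assumptions.
# Store a full local copy of each crawled page (default: false).
CRAWLER_FULL_PAGE_ARCHIVE=true
# Full-page archives take longer to produce, so a longer job timeout
# than the default of 60 seconds may help on slow connections.
CRAWLER_JOB_TIMEOUT_SEC=120
```

Leaving `CRAWLER_FULL_PAGE_ARCHIVE` unset keeps the default behavior described in the table, where only the readable text of the page is archived.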