aboutsummaryrefslogtreecommitdiffstats
path: root/README.md
blob: 6e01b7047f387d99600828c7261b8c733bf35ac1 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
# kbLean 4 Knowledge Base CLI

A persistent, formally verified knowledge base stored as a Git repository of Lean 4 files. Every declaration is type-checked by Lean's kernel. An LLM agent interacts with it entirely through the `kb` CLI.

---

## Core Concepts

- **Persistent**: stored as `.lean` files in a Git repositorysurvives across sessions.
- **Formally verified**: every addition is type-checked by Lean's kernel at `kb check` / `kb commit` time.
- **Domain-organised**: declarations are grouped into named domain files. Each domain is one `.lean` file. An agent can load a specific domain into its context without loading the entire KB.
- **Version-controlled**: full history, branching, and reproducibility via Git.

---

## Planned Domain Structure

The KB is organised into four top-level sections, each subdivided into domain files that correspond to a specific area of knowledge. The subsections are the natural unit an agent loadsfor example, loading only `Physics` or only `MoneyFinance` without pulling in unrelated facts.

### Mathematics

| Domain file | Coverage |
|-------------|----------|
| `ElementaryMath` | Arithmetic, fractions, percentages; reasoning about parity and sign of underspecified values |
| `Algebra` | Variables, number systems, algebraic operations and symbols |
| `Geometry` | Plane, solid, coordinate, transformations, high-dimensional, packing, curves, polyforms, topology, tilings |
| `Plotting` | Functions, equations, parametric plots, inequalities, number lines, polar plots |
| `Calculus` | Integrals, derivatives, limits, sequences, products, series, sums, vector analysis, transforms, continuity, domain & range |
| `DifferentialEquations` | Ordinary (ODE) and partial (PDE) differential equations |
| `Statistics` | Descriptive statistics, inference, regression, random variables |
| `Functions` | Domain & range, injectivity, surjectivity, continuity, periodic, odd/even, special, number-theoretic, representation formulas |

### Science & Technology

| Domain file | Coverage |
|-------------|----------|
| `Physics` | Mechanics, oscillations & waves, statistical physics, thermodynamics, electricity & magnetism, optics, relativity, nuclear, quantum, particle, astrophysics, constants, principles, effects, fluid mechanics |
| `Chemistry` | Matter from atoms and ions to biomolecules |
| `UnitsMeasures` | Conversion and analysis of scientific and everyday measurement systems |
| `Engineering` | Aerospace, energy, electrical, control systems, mechanical, civil, fluid mechanics, acoustics, steam tables, measurement devices |
| `ComputationalSciences` | Cellular automata, substitution systems, Turing machines, complexity, algebraic codes, fractals, image processing, hashing |
| `EarthSciences` | Geology, geodesy, oceanography, climatology |
| `LifeSciences` | Human anatomy, animal anatomy, neuroscience, genomics, molecular biology, animals and plants |
| `SpaceAstronomy` | Phenomena and technology beyond Earth's atmosphere |

### Society & Culture

| Domain file | Coverage |
|-------------|----------|
| `People` | Biographical and demographic data on notable individuals and public figures |
| `ArtsMedia` | Music, films, comics, sculptures, paintings, and related media |
| `History` | Timelines, monetary value analysis, historical events, inventions, conflicts |
| `WordsLinguistics` | Word properties, languages, document length, notable texts, transliterations, Morse code, Soundex, encodings, emotions |
| `MoneyFinance` | Stock data, mortgages & loans, salaries, bonds, derivative valuation, income tax, sales tax |
| `DatesTimes` | Dates, calendars, time systems |
| `FoodNutrition` | Nutritional data and dietary guidelines |
| `Demographics` | Health care, age/sex, gender equality, business, citizenship, crime, education, income & employment, language, race, marriage, mobility, religion, military, HDI |

### Everyday Life

| Domain file | Coverage |
|-------------|----------|
| `PersonalHealth` | Physical exercise, food and nutrition, child growth & development |
| `PersonalFinance` | Mortgages & loans, stock data, currency conversion, cost of living, gasoline, food, electricity prices |
| `Entertainment` | Music, films, artworks, cultural productions |
| `HouseholdScience` | Scientific reasoning applied to everyday physical phenomena |
| `HouseholdMath` | Everyday calculations and measurements |
| `Programming` | Algorithms, data structures, programming concepts |
| `Hobbies` | Recreational activities and personal projects |
| `TodaysWorld` | World economy, Finland economy, weather |

---

## Knowledge Representation

| Kind | Description | Lean form |
|------|-------------|-----------|
| **Type** | A category or class | `opaque Person : Type` |
| **Subtype** | A type extending another | `opaque Philosopher : Type` + coercion axiom |
| **Instance** | A named individual | `opaque socrates : Person` |
| **Property** | A function from a type to a value | `opaque birthYear : PersonNat` |
| **Relation** | A binary predicate | `opaque bornIn : PersonPlaceProp` |
| **Fact** | An asserted value or relation | `axiom fact_birthYear_socrates : birthYear socrates = 470` |
| **Rule** | A universally quantified axiom | `axiom ancestors_transitive : ∀ x y z, …` |
| **Definition** | A named computable expression | `def yearsToDouble (r : Float) : Float := …` |

---

## CLI Commands

### Adding knowledge  *(all require `in <Domain>`)*

| Command | Description |
|---------|-------------|
| `add type <n> [extends <Parent>] in <D>` | Declare a type, optionally as a subtype |
| `add instance <n> : <Type> in <D>` | Declare a named individual |
| `add prop <n> : <DomType> -> <Range> in <D>` | Add a property |
| `add rel <n> : <D1> -> <D2> -> Prop in <D>` | Add a binary relation |
| `add fact <fact> in <D>` | Assert a fact: `prop(inst) = val` or `rel(a, b)` |
| `add rule <n> : <statement> in <D>` | Add a universally quantified axiom |
| `add def <n> <params> : <Type> := <body> in <D>` | Define a reusable computed shortcut |

Both `→` (Unicode) and `->` (ASCII) are accepted as type arrows.

### Inspecting knowledge

| Command | Description |
|---------|-------------|
| `list <types\|instances\|props\|rels\|facts\|rules\|defs\|all> [of <Domain>]` | List declarations, optionally filtered by domain |
| `show def <n>` | Print the Lean source of a declaration |
| `domains` | List all domains active this session with declaration counts |
| `domains stats` | Detailed breakdown per domain and kind |

### Querying

| Command | Description |
|---------|-------------|
| `eval <expr>` | Evaluate a computable Lean expression via `#eval` |
| `query <prop>` | Attempt to prove a proposition (Yes / No / Unknown) |

### Build and version control

| Command | Description |
|---------|-------------|
| `init <dir>` | Create a new KB repository |
| `check` | Sync imports and run `lake build` to verify the whole KB |
| `commit <message>` | Verify then `git commit` all changes |
| `status` | Show `git status` and session summary |
| `rm <n>` | Remove a declaration (warns about dependents) |

---

## How `check` Works

`kb check` does three things before running `lake build`:

1. Scans the KB root for all `*.lean` domain files.
2. Inspects which domains reference names declared in other domains, and injects the necessary `import` lines.
3. Rewrites `KnowledgeBase.lean` to import all domain files so `lake build` compiles everything.

Add a batch of declarations, then run `check` once at the end rather than after each individual addition.

---

## Example Workflow

```bash
# Initialise
kb init ./myKB
cd myKB

# Base types shared across domains
kb add type Entity in People
kb add type Person extends Entity in People
kb add type Place extends Entity in People
kb add type City extends Place in People

# Properties and relations
kb add prop birthYear : Person -> Nat in People
kb add rel bornIn : Person -> Place -> Prop in People

# Instances and facts
kb add instance socrates : Person in People
kb add instance athens : City in People
kb add fact birthYear(socrates) = 470 in People
kb add fact bornIn(socrates, athens) in People

# Verify and commit
kb check
kb commit "Added Socrates and basic person/place ontology"

# Agent loads only the People domain for a biographical query
kb list all of People
```

---

## Query Capabilities

- **Computational**: `kb eval "yearsToDouble 0.05"` — runs `#eval` on any computable Lean expression.
- **Logical**: `kb query "even (2 + 2)"` — attempts to prove the proposition using `decide`, `aesop`, or `simp`.
- **Selective loading**: `kb list all of Physics` — an agent pulls only the Physics domain into its context for a focused query.

---

## Extensibility

- Domain files from the planned structure are created on first use — no separate setup step.
- Definitions (`kb add def`) become permanent reusable shortcuts stored in Lean files.
- The KB grows incrementally across sessions — each `kb commit` is a checkpoint in Git history.