Compare commits
10 Commits
76fe7fb668...5336eb2c16
| Author | SHA1 | Date |
|---|---|---|
| | 5336eb2c16 | |
| | 8986a515e2 | |
| | 89746b1076 | |
| | 443f9f4f4b | |
| | df9b2b5f29 | |
| | 8699d8f7ab | |
| | 7e67c3fcf9 | |
| | de886e30ea | |
| | f807a3efa5 | |
| | 034fc2b8b6 | |
2
.gitignore
vendored
@ -1 +1,3 @@
venv/
instance/
__pycache__/
78
WORK.md
Normal file
@ -0,0 +1,78 @@
# What’s going on inside `mathstream`

Figured I’d write down how the whole thing is wired, more like a brain dump than polished docs. If you’re diving in to fix something, this should save you a bunch of spelunking.

## Directory map (so you know where to poke)

```
mathstream/
    __init__.py      # re-export central
    engine.py        # arithmetic guts
    exceptions.py    # custom errors
    number.py        # StreamNumber, manual GC, watcher
    utils.py         # sqlite junk drawer

test.py              # smoke/integration script
```

## StreamNumber – the heart of it

`mathstream/number.py` owns the `StreamNumber` class. The class does a couple of jobs:

- It wraps a file of digits (either you give us a path or a literal; literals get canonicalised and dropped into `instance/log/literal_<hash>.txt`).
- It streams via `.stream(chunk_size)`, so reads go chunk by chunk instead of pulling in the whole file; every time we read we call `touch_log_file` so the usage timestamp keeps moving.
- When a new stream gets created we check if it lives under `LOG_DIR`. If yes, we register it with the sqlite tracker (`register_log_file`) and also bump a ref counter via `register_reference`.
- There’s a weakref finaliser plus a global `_ACTIVE_COUNTER` that keeps tabs on python-side references. If the object falls out of scope we run `_finalize_instance`, which decrements the counter and, if that was the last one, calls `release_reference` (that may nuke the file instantly).
- Explicit `free()` exists for people who want deterministic cleanup. It’s basically like `free()` in C: drop the ref count and optionally delete the file now. There’s an alias `free_stream`, and the class is a context manager, so `with StreamNumber(...) as sn:` cleans up automatically.

So any time you’ve got a staged result hanging around in memory, the watcher knows about it. Once you ditch it (either by `free()` or just letting the object die), the sqlite ref count drops.

## Engine – maths without ints

Lives in `mathstream/engine.py`. All the operators (`add/sub/mul/div/mod/pow`) pull chunks from the `StreamNumber` inputs, normalise them into sign + digit strings, run grade-school algorithms, then write the result back into `LOG_DIR`.

- `_write_result` is the important bit: it writes to disk, calls `register_log_file`, then wraps the file in a new `StreamNumber`. Because of that call, every staged result is tracked automatically.
- We’re careful about signs: division and modulo follow Python’s floor division rules. Divide-by-zero is intercepted and converted into `DivideByZeroError`.
- `clear_logs()` wipes the folder and calls `wipe_log_records()` to empty sqlite so the next run isn’t polluted.
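
If “Python’s floor division rules” is fuzzy for negative operands, here’s a small sketch with plain ints. `floor_divmod` is a hypothetical helper, not something in `engine.py`; it works on magnitudes and then adjusts, and the loop at the bottom checks it against the built-in `divmod`.

```python
from typing import Tuple


def floor_divmod(a: int, b: int) -> Tuple[int, int]:
    """Floor-style quotient and remainder built from magnitudes only."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    q_abs, r_abs = abs(a) // abs(b), abs(a) % abs(b)
    same_sign = (a < 0) == (b < 0)
    if same_sign or r_abs == 0:
        q = q_abs if same_sign else -q_abs
        r = r_abs
    else:
        # Signs differ and there is a remainder: round the quotient down.
        q = -(q_abs + 1)
        r = abs(b) - r_abs
    # A non-zero remainder always takes the divisor's sign.
    return q, (r if b > 0 else -r)


# Cross-check against the built-in for a spread of signs.
for a in range(-7, 8):
    for b in (-3, -2, 2, 3):
        assert floor_divmod(a, b) == divmod(a, b)

print(floor_divmod(7, -2))   # (-4, -1)
print(floor_divmod(-7, 2))   # (-4, 1)
```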

## Exceptions

`mathstream/exceptions.py` just defines `MathStreamError` and the more specific `DivideByZeroError`. Nothing fancy, just so we don’t leak raw `ZeroDivisionError`.

## SQLite watcher (`mathstream/utils.py`)

This is the garbage-collection HQ. On import we run `_ensure_db(reset=True)` so every run starts from a clean DB (no migrations, no surprises). Two tables:

- `logs` → metadata about every staged file: created time, last access, access count.
- `refs` → current reference count (think “how many StreamNumber instances think they own this file”).

Important functions:

- `register_log_file(path)` – ensures both tables have a row (initial ref count 0).
- `register_reference(path)` – increments the ref count, updates last access, access count etc. Called whenever a new `StreamNumber` points at the staged file.
- `touch_log_file(path)` – called from `.stream()` so we know the file is being read.
- `release_reference(path, delete_file=True)` – the inverse of register. If the count hits zero we remove the DB row and (optionally) delete the file right away.
- `collect_garbage(score_threshold)` – this is the periodic sweeper. Computes `score = age / ((ref_count + 1) * (access_count + 1))`. Bigger score means older + less used. If score >= threshold the file gets unlinked and removed from the DB. Negative thresholds raise an error on purpose.
- `tracked_files()` – dumb helper that dumps `{path: ref_count}` out of the DB.
- `wipe_log_records()` – nukes both tables; used by `clear_logs`.
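
The scoring rule is easier to eyeball with numbers plugged in. A minimal sketch, assuming age is measured in seconds (the unit isn’t stated above) and using made-up `gc_score` and `should_collect` helpers rather than the real `utils.py` code; the `ValueError` is also an assumption, since the notes only say negative thresholds blow up.

```python
def gc_score(age: float, ref_count: int, access_count: int) -> float:
    """score = age / ((ref_count + 1) * (access_count + 1))"""
    return age / ((ref_count + 1) * (access_count + 1))


def should_collect(score: float, threshold: float) -> bool:
    """Collect when the score meets the threshold; reject negative thresholds."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")  # error type is assumed
    return score >= threshold


# An untouched, unreferenced file scores its full age ...
stale = gc_score(age=120.0, ref_count=0, access_count=0)
# ... while references and reads shrink the score quickly.
busy = gc_score(age=120.0, ref_count=1, access_count=5)

print(stale, busy)                                               # 120.0 10.0
print(should_collect(stale, 60.0), should_collect(busy, 60.0))   # True False
```

With a threshold of 0 every tracked file qualifies, because the score can’t drop below zero as long as age is non-negative.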

## How cleanup flows

1. You run an operation (`add`, `mul`, whatever). Result file lands in `LOG_DIR`, gets registered, comes back as a `StreamNumber`.
2. You stream it or create more streams from it – metadata keeps getting updated via `touch_log_file`/`register_reference`.
3. When you’re done, call `.free()` or just drop references. Manual free is immediate. Otherwise the weakref finaliser catches it eventually.
4. `release_reference` is what actually removes the sqlite entries and unlinks the data file when there are no logical references left.
5. If you still have detritus (e.g. you crashed before refs hit zero), run `collect_garbage(threshold)` to sweep anything whose age outweighs usage.
6. `active_streams()` reports what’s still alive in Python land; `tracked_files()` shows what the DB thinks is referenced.

## Example run (`test.py`)

`test.py` is half regression, half reference script. It:

- seeds some numbers, runs every operation, checks results.
- makes sure `DivideByZeroError` fires.
- frees every staged number to prove files vanish on the spot.
- runs `collect_garbage(0)` just to make sure nothing else lingers.
- dumps `active_streams()` and `tracked_files()` so you can see python vs sqlite state.

If the logs ever seem suspicious, run that script; it’ll tell you immediately whether something’s still referenced or if the GC is forgetting to clean up.
211
collatz.py
Normal file
@ -0,0 +1,211 @@
#!/usr/bin/env python3
import curses
import time
import os
from pathlib import Path
from mathstream import StreamNumber, add, mul, div, is_even, clear_logs


LOG_DIR = Path("instance/log")


def collatz_step(n, three, two, one):
    return div(n, two) if is_even(n) else add(mul(n, three), one)


def draw_header(win, step, elapsed, avg_step, digits_len):
    """Render header above graph and panels."""
    win.erase()
    cols = curses.COLS - 1
    bar = "─" * cols
    lines = [
        f" Collatz (3n + 1) Streamed Viewer ",
        f" Step: {step}",
        f" Elapsed: {elapsed:8.2f}s | Avg/Step: {avg_step:8.5f}s",
        f" Digits: {digits_len:,} | ↑↓ scroll number | PgUp/PgDn scroll log | q quit",
    ]
    for i, line in enumerate(lines):
        win.addstr(i, 0, line[:cols], curses.color_pair(1))
    win.addstr(len(lines), 0, bar, curses.color_pair(2))
    win.noutrefresh()


def draw_graph(win, graph_buf, direction_up, width):
    """Render a single-line graph: grey ░ for empty, colored █ for data."""
    win.erase()
    # Derive the actual width of the window so we never draw past its edge.
    _, max_x = win.getmaxyx()
    effective_width = max_x or width
    padding = 3  # leave space for arrow and a small gap
    cols = max(0, effective_width - padding)

    # Clamp buffer to visible width
    visible = graph_buf[-cols:] if len(graph_buf) > cols else graph_buf
    fill_len = len(visible)

    # arrow first
    if effective_width > 0:
        arrow = "▲" if direction_up else "▼"
        arrow_color = curses.color_pair(6 if direction_up else 5)
        try:
            win.addstr(0, 0, arrow, arrow_color)
        except curses.error:
            pass

    # draw filled section
    for i, val in enumerate(visible):
        col = padding + i
        if col >= effective_width:
            break
        color = curses.color_pair(6 if val > 0 else 5)
        try:
            win.addstr(0, col, "█", color)
        except curses.error:
            break

    # fill remaining with ░
    remaining = max(0, cols - fill_len)
    fill_start = padding + fill_len
    if remaining > 0 and fill_start < effective_width:
        run = min(remaining, effective_width - fill_start)
        if run > 0:
            try:
                win.addstr(0, fill_start, "░" * run, curses.color_pair(7))
            except curses.error:
                pass

    win.noutrefresh()


def draw_number(win, digits, scroll):
    win.erase()
    cols = curses.COLS
    lines = [digits[i:i + cols - 1] for i in range(0, len(digits), cols - 1)]
    max_lines = win.getmaxyx()[0]
    scroll = max(0, min(scroll, max(0, len(lines) - max_lines)))
    for i, chunk in enumerate(lines[scroll:scroll + max_lines]):
        try:
            win.addstr(i, 0, chunk, curses.color_pair(3))
        except curses.error:
            pass
    win.noutrefresh()
    return scroll


def draw_log_list(win, scroll):
    win.erase()
    if not LOG_DIR.exists():
        LOG_DIR.mkdir(parents=True, exist_ok=True)
    files = sorted(LOG_DIR.iterdir(), key=os.path.getmtime, reverse=True)
    names = [f"{f.name}" for f in files]
    max_lines = win.getmaxyx()[0]
    scroll = max(0, min(scroll, max(0, len(names) - max_lines)))
    for i, name in enumerate(names[scroll:scroll + max_lines]):
        try:
            win.addstr(i, 0, name[: curses.COLS - 1], curses.color_pair(4))
        except curses.error:
            pass
    win.noutrefresh()
    return scroll


def run_collatz(stdscr):
    curses.curs_set(0)
    curses.start_color()
    curses.use_default_colors()
    curses.init_pair(1, curses.COLOR_CYAN, -1)  # header text
    curses.init_pair(2, curses.COLOR_BLACK, curses.COLOR_CYAN)  # separator
    curses.init_pair(3, curses.COLOR_WHITE, -1)  # number
    curses.init_pair(4, curses.COLOR_YELLOW, -1)  # log list
    curses.init_pair(5, curses.COLOR_RED, -1)
    curses.init_pair(6, curses.COLOR_GREEN, -1)
    curses.init_pair(7, curses.COLOR_WHITE, -1)  # grey for empty

    stdscr.nodelay(True)
    stdscr.timeout(100)

    start_file = Path("start.txt")
    if not start_file.exists():
        stdscr.addstr(0, 0, "Missing start.txt — please create one with your starting number.")
        stdscr.refresh()
        stdscr.getch()
        return

    clear_logs()
    n = StreamNumber(start_file)
    one, two, three = (StreamNumber(literal=s) for s in ("1", "2", "3"))

    start_time = time.time()
    step = 0
    num_scroll = 0
    log_scroll = 0

    header_h = 5
    graph_h = 1
    num_h = (curses.LINES - header_h - graph_h) * 3 // 4
    log_h = curses.LINES - header_h - graph_h - num_h - 1

    num_win = curses.newwin(num_h, curses.COLS, header_h + graph_h + 1, 0)
    graph_win = curses.newwin(graph_h, curses.COLS, header_h + 1, 0)
    log_win = curses.newwin(log_h, curses.COLS, header_h + graph_h + num_h + 2, 0)

    last_len = 0
    graph_buf = []

    while True:
        step += 1
        n = collatz_step(n, three, two, one)
        digits = "".join(n.stream()) or "0"
        cur_len = len(digits)
        diff = cur_len - last_len
        last_len = cur_len

        graph_width = curses.COLS - 4

        # Add new value to graph buffer
        if diff != 0:
            graph_buf.append(1 if diff > 0 else -1)
        else:
            graph_buf.append(0)

        # Shift left if full
        if len(graph_buf) > graph_width:
            graph_buf = graph_buf[-graph_width:]

        direction_up = diff >= 0
        elapsed = time.time() - start_time
        avg_step = elapsed / step if step else 0.0

        draw_header(stdscr, step, elapsed, avg_step, len(digits))
        draw_graph(graph_win, graph_buf, direction_up, curses.COLS)
        num_scroll = draw_number(num_win, digits, num_scroll)
        log_scroll = draw_log_list(log_win, log_scroll)

        curses.doupdate()

        ch = stdscr.getch()
        if ch == ord("q"):
            break
        elif ch == curses.KEY_UP:
            num_scroll = max(0, num_scroll - 1)
        elif ch == curses.KEY_DOWN:
            num_scroll += 1
        elif ch == curses.KEY_PPAGE:
            log_scroll = max(0, log_scroll - 3)
        elif ch == curses.KEY_NPAGE:
            log_scroll += 3

        if digits == "1":
            stdscr.nodelay(False)
            stdscr.addstr(curses.LINES - 1, 0, "Reached 1 — press any key to exit.")
            stdscr.refresh()
            stdscr.getch()
            break


def main():
    curses.wrapper(run_collatz)


if __name__ == "__main__":
    main()
93
mathstream/README.md
Normal file
@ -0,0 +1,93 @@
# Mathstream Library

`mathstream` offers streamed, string-based arithmetic for very large integers that you may not want to load entirely into memory. Instead of parsing numbers into Python `int` values, you work with digit files on disk via `StreamNumber` and call math operations that operate chunk-by-chunk.

## Quick Start

```bash
python -m venv venv
source venv/bin/activate
pip install -e .
```

Create digit files anywhere you like (the examples below use `instance/log`), or supply ad-hoc literals, then construct `StreamNumber` objects and call the helpers:

```python
from mathstream import (
    StreamNumber,
    add,
    sub,
    mul,
    div,
    mod,
    pow,
    is_even,
    is_odd,
    free_stream,
    collect_garbage,
)

a = StreamNumber("instance/log/huge.txt")
b = StreamNumber(literal="34567")
e = StreamNumber(literal="3")

print("sum =", "".join(add(a, b).stream()))
print("difference =", "".join(sub(a, b).stream()))
print("product =", "".join(mul(a, b).stream()))
print("quotient =", "".join(div(a, b).stream()))
print("modulo =", "".join(mod(a, b).stream()))
print("power =", "".join(pow(a, e).stream()))
print("a is even?", is_even(a))
print("b is odd?", is_odd(b))

# drop staged artifacts immediately when you are done
free_stream(b)

# reclaim space for files whose age outweighs their use
collect_garbage(0.5)
```

Each arithmetic call writes its result back into `instance/log` (configurable via `mathstream.number.LOG_DIR`) so you can stream the digits later or reuse them in further operations.
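
As a rough picture of how a result file can be named from its operands, the sketch below hashes each operand path and joins the pieces. `path_hash` and `staged_name` are illustrative names that are not exported by `mathstream`, and the exact naming used by the library should be treated as an implementation detail.

```python
import hashlib
from pathlib import Path

LOG_DIR = Path("instance/log")  # default staging folder named in this README


def path_hash(path: Path) -> str:
    """Short identifier derived from the path string, not the file contents."""
    return hashlib.sha1(str(path).encode()).hexdigest()[:10]


def staged_name(operation: str, *operands: Path) -> Path:
    """Build a result path of the form <operation>_<hashA>_<hashB>.bin."""
    operand_hash = "_".join(path_hash(p) for p in operands)
    return LOG_DIR / f"{operation}_{operand_hash}.bin"


a = Path("instance/log/huge.txt")
b = Path("instance/log/other.txt")

print(staged_name("add", a, b))
# Same operation and operands map to the same file, so a repeated call overwrites it.
print(staged_name("add", a, b) == staged_name("add", a, b))
# Operand order matters, which keeps sub(a, b) and sub(b, a) apart.
print(staged_name("sub", a, b) == staged_name("sub", b, a))
```

Because the hash covers the path rather than the digits, rewriting an input file in place does not change the name of the result file.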

## Core Concepts

- **StreamNumber(path | literal=...)** – Wraps a digit text file or creates one for an integer literal inside `LOG_DIR`. Literal operands are persisted as `literal_<hash>.txt`, so repeated runs reuse the same staged file (note that `clear_logs()` removes these cache files too).
- **`.stream(chunk_size)`** – Yields strings of digits with the provided chunk size. Operations in `mathstream.engine` consume these streams to avoid loading the entire number at once.
- **Automatic staging** – Outputs are stored under `LOG_DIR` with hashes based on input file paths, letting you compose operations without manual bookkeeping.
- **Sign-aware** – Addition, subtraction, multiplication, division (`//` behavior), modulo, and exponentiation (non-negative exponents) all respect operand sign. Division/modulo follow Python’s floor-division rules.
- **Utilities** – `clear_logs()` wipes prior staged results so you can start fresh.
- **Manual freeing** – Call `stream.free()` (or `free_stream(stream)`) once you are done with a staged number to release its reference immediately. Logger metadata keeps per-path reference counts so the final free removes the backing file on the spot.
- **Parity helpers** – `is_even` and `is_odd` inspect the streamed digits without materializing the integer.
- **Garbage collection** – `collect_garbage(score_threshold)` computes a score from file age, access count, and reference count (tracked in `instance/mathstream_logs.sqlite`, freshly truncated each run). Files whose score meets or exceeds the threshold are deleted, letting you tune how aggressively to reclaim space. Both staged results and literal caches participate. Use `tracked_files()` or `active_streams()` to inspect current state.
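
As a rough illustration of the `.stream(chunk_size)` and parity ideas above, the sketch below reads a digit file in fixed-size chunks and decides parity from the final digit only. `stream_digits` and `last_digit_is_even` are illustrative names, not part of the `mathstream` API, and the library's own implementation may differ.

```python
import tempfile
from pathlib import Path
from typing import Iterator


def stream_digits(path: Path, chunk_size: int = 4096) -> Iterator[str]:
    """Yield the file's text in chunks of at most chunk_size characters."""
    with open(path, "r", encoding="utf-8") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def last_digit_is_even(path: Path, chunk_size: int = 4096) -> bool:
    """Decide parity while holding only one chunk in memory at a time."""
    last = ""
    for chunk in stream_digits(path, chunk_size):
        stripped = chunk.strip()
        if stripped:
            last = stripped[-1]
    if not last.isdigit():
        raise ValueError(f"No trailing digit found in {path}")
    return int(last) % 2 == 0


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("-12345678901234567890\n", encoding="utf-8")
    chunks = list(stream_digits(sample, chunk_size=8))
    sample_is_even = last_digit_is_even(sample, chunk_size=8)

print(chunks)
print(sample_is_even)
```

Only the last non-blank character matters for parity, so the sign and every earlier digit can be discarded as soon as the next chunk arrives.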

Divide-by-zero scenarios raise the custom `DivideByZeroError` so callers can distinguish mathstream issues from Python’s native exceptions.

## Example Script

`test.py` in the repository root demonstrates a minimal workflow:

1. Writes sample operands to `tests/*.txt`.
2. Calls every arithmetic primitive plus the modulo/parity helpers.
3. Asserts that the streamed outputs match known values (helpful for quick regression checks).

Run it via:

```bash
python test.py
```

## Extending

- To hook into other storage backends, implement your own `StreamNumber` variant with the same `.stream()` interface.
- Need gcd or another helper? Compose the existing primitives (e.g., repeated `mod` calls, or the remainder that `_divide_abs` already tracks) or add new helpers following the same streamed pattern.
- For more control over output locations, override `LOG_DIR` before using the operations:

```python
from pathlib import Path

from mathstream import engine, number

# engine imports LOG_DIR by name, so point both modules at the new folder
stage_dir = Path("/tmp/my_mathstage")
number.LOG_DIR = stage_dir
engine.LOG_DIR = stage_dir
engine.clear_logs()
```
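
For the gcd suggestion in the list above, Euclid's algorithm needs nothing beyond a modulo primitive and a zero test. The sketch below is deliberately generic: `gcd_with` and its `is_zero` callback are hypothetical and not part of `mathstream`, and the demonstration uses plain Python integers so it runs on its own.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def gcd_with(a: T, b: T, mod: Callable[[T, T], T], is_zero: Callable[[T], bool]) -> T:
    """Euclid's algorithm expressed only through a mod primitive."""
    while not is_zero(b):
        a, b = b, mod(a, b)
    return a


# Plain ints stand in for streamed numbers here.
print(gcd_with(1071, 462, lambda x, y: x % y, lambda x: x == 0))  # 21
```

With streamed operands, each `mod` call would stage a new file, so intermediate results are worth freeing as the loop advances.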

With these building blocks, you can manipulate arbitrarily large integers without ever converting them to Python `int` values. Happy streaming!
@ -1,2 +1,23 @@
|
||||
from .engine import clear_logs, add, sub, mul, div
|
||||
from .number import StreamNumber
|
||||
from .engine import clear_logs, add, sub, mul, div, mod, pow, is_even, is_odd
|
||||
from .exceptions import MathStreamError, DivideByZeroError
|
||||
from .number import StreamNumber, free_stream, active_streams
|
||||
from .utils import collect_garbage, tracked_files
|
||||
|
||||
__all__ = [
|
||||
"clear_logs",
|
||||
"collect_garbage",
|
||||
"tracked_files",
|
||||
"add",
|
||||
"sub",
|
||||
"mul",
|
||||
"div",
|
||||
"mod",
|
||||
"pow",
|
||||
"is_even",
|
||||
"is_odd",
|
||||
"StreamNumber",
|
||||
"free_stream",
|
||||
"active_streams",
|
||||
"MathStreamError",
|
||||
"DivideByZeroError",
|
||||
]
|
||||
|
||||
@ -1,45 +1,329 @@
|
||||
from pathlib import Path
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Iterable, Tuple
|
||||
|
||||
from .exceptions import DivideByZeroError
|
||||
from .number import StreamNumber, LOG_DIR
|
||||
from .utils import register_log_file, wipe_log_records
|
||||
|
||||
|
||||
def _ensure_log_dir() -> None:
|
||||
LOG_DIR.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
|
||||
def _strip_leading_zeros(digits: str) -> str:
|
||||
digits = digits.lstrip("0")
|
||||
return digits or "0"
|
||||
|
||||
|
||||
def _normalize_stream(num: StreamNumber) -> Tuple[int, str]:
|
||||
"""Return (sign, digits) tuple for the streamed number."""
|
||||
parts: list[str] = []
|
||||
for chunk in num.stream():
|
||||
chunk = chunk.strip()
|
||||
if not chunk:
|
||||
continue
|
||||
parts.append(chunk)
|
||||
raw = "".join(parts)
|
||||
|
||||
if not raw:
|
||||
raise ValueError(f"Stream for {num.path} is empty")
|
||||
|
||||
sign = 1
|
||||
if raw[0] in "+-":
|
||||
sign = -1 if raw[0] == "-" else 1
|
||||
raw = raw[1:]
|
||||
|
||||
if not raw.isdigit():
|
||||
raise ValueError(f"Non-digit characters found in stream for {num.path}")
|
||||
|
||||
digits = _strip_leading_zeros(raw)
|
||||
if digits == "0":
|
||||
sign = 1
|
||||
return sign, digits
|
||||
|
||||
|
||||
def _compare_abs(a: str, b: str) -> int:
|
||||
"""Compare two positive digit strings."""
|
||||
if len(a) != len(b):
|
||||
return 1 if len(a) > len(b) else -1
|
||||
if a == b:
|
||||
return 0
|
||||
return 1 if a > b else -1
|
||||
|
||||
|
||||
def _add_abs(a: str, b: str) -> str:
|
||||
carry = 0
|
||||
idx_a = len(a) - 1
|
||||
idx_b = len(b) - 1
|
||||
out: list[str] = []
|
||||
|
||||
while idx_a >= 0 or idx_b >= 0 or carry:
|
||||
da = ord(a[idx_a]) - 48 if idx_a >= 0 else 0
|
||||
db = ord(b[idx_b]) - 48 if idx_b >= 0 else 0
|
||||
total = da + db + carry
|
||||
carry, digit = divmod(total, 10)
|
||||
out.append(str(digit))
|
||||
idx_a -= 1
|
||||
idx_b -= 1
|
||||
|
||||
return "".join(reversed(out))
|
||||
|
||||
|
||||
def _sub_abs(a: str, b: str) -> str:
|
||||
"""Return a - b for digit strings assuming a >= b."""
|
||||
borrow = 0
|
||||
idx_a = len(a) - 1
|
||||
idx_b = len(b) - 1
|
||||
out: list[str] = []
|
||||
|
||||
while idx_a >= 0:
|
||||
da = ord(a[idx_a]) - 48
|
||||
db = ord(b[idx_b]) - 48 if idx_b >= 0 else 0
|
||||
diff = da - borrow - db
|
||||
if diff < 0:
|
||||
diff += 10
|
||||
borrow = 1
|
||||
else:
|
||||
borrow = 0
|
||||
out.append(str(diff))
|
||||
idx_a -= 1
|
||||
idx_b -= 1
|
||||
|
||||
return _strip_leading_zeros("".join(reversed(out)))
|
||||
|
||||
|
||||
def _multiply_abs(a: str, b: str) -> str:
|
||||
if a == "0" or b == "0":
|
||||
return "0"
|
||||
result = [0] * (len(a) + len(b))
|
||||
for i in range(len(a) - 1, -1, -1):
|
||||
ai = ord(a[i]) - 48
|
||||
carry = 0
|
||||
for j in range(len(b) - 1, -1, -1):
|
||||
bj = ord(b[j]) - 48
|
||||
pos = i + j + 1
|
||||
total = result[pos] + ai * bj + carry
|
||||
carry, result[pos] = divmod(total, 10)
|
||||
result[i] += carry
|
||||
return _strip_leading_zeros("".join(str(d) for d in result))
|
||||
|
||||
|
||||
def _multiply_digit(num: str, digit: int) -> str:
|
||||
if digit == 0 or num == "0":
|
||||
return "0"
|
||||
carry = 0
|
||||
out: list[str] = []
|
||||
for i in range(len(num) - 1, -1, -1):
|
||||
total = (ord(num[i]) - 48) * digit + carry
|
||||
carry, d = divmod(total, 10)
|
||||
out.append(str(d))
|
||||
if carry:
|
||||
out.append(str(carry))
|
||||
return "".join(reversed(out))
|
||||
|
||||
|
||||
def _divide_abs(dividend: str, divisor: str) -> Tuple[str, str]:
|
||||
if divisor == "0":
|
||||
raise DivideByZeroError("division by zero")
|
||||
if dividend == "0":
|
||||
return "0", "0"
|
||||
|
||||
quotient_digits: list[str] = []
|
||||
remainder = "0"
|
||||
|
||||
for digit in dividend:
|
||||
remainder = _strip_leading_zeros(remainder + digit)
|
||||
q_digit = 0
|
||||
for guess in range(9, -1, -1):
|
||||
candidate = _multiply_digit(divisor, guess)
|
||||
if _compare_abs(candidate, remainder) <= 0:
|
||||
q_digit = guess
|
||||
remainder = _sub_abs(remainder, candidate) if guess else remainder
|
||||
break
|
||||
quotient_digits.append(str(q_digit))
|
||||
|
||||
quotient = _strip_leading_zeros("".join(quotient_digits))
|
||||
remainder = _strip_leading_zeros(remainder)
|
||||
return quotient, remainder
|
||||
|
||||
|
||||
def _is_zero(digits: str) -> bool:
|
||||
return digits == "0"
|
||||
|
||||
|
||||
def _is_odd(digits: str) -> bool:
|
||||
return (ord(digits[-1]) - 48) % 2 == 1
|
||||
|
||||
|
||||
def _halve(digits: str) -> str:
|
||||
carry = 0
|
||||
out: list[str] = []
|
||||
for ch in digits:
|
||||
current = carry * 10 + (ord(ch) - 48)
|
||||
quotient = current // 2
|
||||
carry = current % 2
|
||||
out.append(str(quotient))
|
||||
return _strip_leading_zeros("".join(out))
|
||||
|
||||
|
||||
def _write_result(operation: str, operands: Iterable[StreamNumber], digits: str) -> StreamNumber:
|
||||
_ensure_log_dir()
|
||||
operand_hash = "_".join(num.hash for num in operands)
|
||||
out_file = LOG_DIR / f"{operation}_{operand_hash}.bin"
|
||||
with open(out_file, "w", encoding="utf-8") as out:
|
||||
out.write(digits)
|
||||
register_log_file(out_file)
|
||||
return StreamNumber(out_file)
|
||||
|
||||
|
||||
def clear_logs():
|
||||
if LOG_DIR.exists():
|
||||
for p in LOG_DIR.glob("*"):
|
||||
p.unlink()
|
||||
LOG_DIR.mkdir(parents=True, exist_ok=True)
|
||||
_ensure_log_dir()
|
||||
wipe_log_records()
|
||||
|
||||
|
||||
def add(num_a: StreamNumber, num_b: StreamNumber) -> StreamNumber:
|
||||
"""Digit-by-digit streamed addition."""
|
||||
out_file = LOG_DIR / f"{num_a.hash}_add_{num_b.hash}.bin"
|
||||
"""Return num_a + num_b without loading full ints into memory."""
|
||||
sign_a, a_digits = _normalize_stream(num_a)
|
||||
sign_b, b_digits = _normalize_stream(num_b)
|
||||
|
||||
carry = 0
|
||||
a_buf = list(num_a.stream(1))
|
||||
b_buf = list(num_b.stream(1))
|
||||
if sign_a == sign_b:
|
||||
digits = _add_abs(a_digits, b_digits)
|
||||
sign = sign_a
|
||||
else:
|
||||
cmp = _compare_abs(a_digits, b_digits)
|
||||
if cmp == 0:
|
||||
digits = "0"
|
||||
sign = 1
|
||||
elif cmp > 0:
|
||||
digits = _sub_abs(a_digits, b_digits)
|
||||
sign = sign_a
|
||||
else:
|
||||
digits = _sub_abs(b_digits, a_digits)
|
||||
sign = sign_b
|
||||
|
||||
# align lengths
|
||||
max_len = max(len(a_buf), len(b_buf))
|
||||
a_buf = ["0"] * (max_len - len(a_buf)) + a_buf
|
||||
b_buf = ["0"] * (max_len - len(b_buf)) + b_buf
|
||||
result = digits if sign > 0 or digits == "0" else f"-{digits}"
|
||||
return _write_result("add", (num_a, num_b), result)
|
||||
|
||||
with open(out_file, "wb") as out:
|
||||
for i in range(max_len - 1, -1, -1):
|
||||
s = int(a_buf[i]) + int(b_buf[i]) + carry
|
||||
carry, digit = divmod(s, 10)
|
||||
out.write(str(digit).encode())
|
||||
if carry:
|
||||
out.write(str(carry).encode())
|
||||
return StreamNumber(out_file)
|
||||
|
||||
def sub(num_a, num_b):
|
||||
"""Basic streamed subtraction (assumes a >= b)."""
|
||||
# similar pattern with borrow propagation...
|
||||
pass
|
||||
def sub(num_a: StreamNumber, num_b: StreamNumber) -> StreamNumber:
|
||||
"""Return num_a - num_b using streamed integer arithmetic."""
|
||||
sign_a, a_digits = _normalize_stream(num_a)
|
||||
sign_b, b_digits = _normalize_stream(num_b)
|
||||
|
||||
def mul(num_a, num_b):
|
||||
"""Chunked multiplication using repeated addition."""
|
||||
# create temporary stage files for partial sums
|
||||
pass
|
||||
if sign_a != sign_b:
|
||||
digits = _add_abs(a_digits, b_digits)
|
||||
sign = sign_a
|
||||
else:
|
||||
cmp = _compare_abs(a_digits, b_digits)
|
||||
if cmp == 0:
|
||||
digits = "0"
|
||||
sign = 1
|
||||
elif cmp > 0:
|
||||
digits = _sub_abs(a_digits, b_digits)
|
||||
sign = sign_a
|
||||
else:
|
||||
digits = _sub_abs(b_digits, a_digits)
|
||||
sign = -sign_a
|
||||
|
||||
def div(num_a, num_b):
|
||||
"""Long division, streamed stage by stage."""
|
||||
# create multiple intermediate files: div_stage_1, div_stage_2, etc.
|
||||
pass
|
||||
result = digits if sign > 0 or digits == "0" else f"-{digits}"
|
||||
return _write_result("sub", (num_a, num_b), result)
|
||||
|
||||
|
||||
def mul(num_a: StreamNumber, num_b: StreamNumber) -> StreamNumber:
|
||||
"""Return num_a * num_b with grade-school multiplication."""
|
||||
sign_a, a_digits = _normalize_stream(num_a)
|
||||
sign_b, b_digits = _normalize_stream(num_b)
|
||||
|
||||
digits = _multiply_abs(a_digits, b_digits)
|
||||
sign = 1 if digits == "0" else sign_a * sign_b
|
||||
result = digits if sign > 0 else f"-{digits}"
|
||||
return _write_result("mul", (num_a, num_b), result)
|
||||
|
||||
|
||||
def div(num_a: StreamNumber, num_b: StreamNumber) -> StreamNumber:
|
||||
"""Return floor division num_a // num_b with streamed long division."""
|
||||
sign_a, a_digits = _normalize_stream(num_a)
|
||||
sign_b, b_digits = _normalize_stream(num_b)
|
||||
|
||||
quotient, remainder = _divide_abs(a_digits, b_digits)
|
||||
|
||||
if quotient == "0" and remainder == "0":
|
||||
return _write_result("div", (num_a, num_b), "0")
|
||||
|
||||
sign_product = sign_a * sign_b
|
||||
if sign_product < 0 and remainder != "0":
|
||||
quotient = _add_abs(quotient, "1")
|
||||
sign = -1
|
||||
else:
|
||||
sign = sign_product if quotient != "0" else 1
|
||||
|
||||
result = quotient if sign > 0 else f"-{quotient}"
|
||||
return _write_result("div", (num_a, num_b), result)
|
||||
|
||||
|
||||
def mod(num_a: StreamNumber, num_b: StreamNumber) -> StreamNumber:
|
||||
"""Return num_a % num_b following Python's floor-division semantics."""
|
||||
sign_a, a_digits = _normalize_stream(num_a)
|
||||
sign_b, b_digits = _normalize_stream(num_b)
|
||||
|
||||
if b_digits == "0":
|
||||
raise DivideByZeroError("modulo by zero")
|
||||
|
||||
_, remainder = _divide_abs(a_digits, b_digits)
|
||||
|
||||
if remainder == "0":
|
||||
return _write_result("mod", (num_a, num_b), "0")
|
||||
|
||||
if sign_a == sign_b:
|
||||
digits = remainder
|
||||
else:
|
||||
digits = _sub_abs(b_digits, remainder)
|
||||
|
||||
sign = 1 if sign_b > 0 else -1
|
||||
result = digits if sign > 0 else f"-{digits}"
|
||||
return _write_result("mod", (num_a, num_b), result)
|
||||
|
||||
|
||||
def pow(num_a: StreamNumber, num_b: StreamNumber) -> StreamNumber:
|
||||
"""Return num_a ** num_b using repeated squaring (integer exponent only)."""
|
||||
base_sign, base_digits = _normalize_stream(num_a)
|
||||
exp_sign, exp_digits = _normalize_stream(num_b)
|
||||
|
||||
if exp_sign < 0:
|
||||
raise ValueError("Negative exponents are not supported for integer streams.")
|
||||
|
||||
if exp_digits == "0":
|
||||
return _write_result("pow", (num_a, num_b), "1")
|
||||
|
||||
result_digits = "1"
|
||||
base_abs = base_digits
|
||||
exponent = exp_digits
|
||||
|
||||
while not _is_zero(exponent):
|
||||
if _is_odd(exponent):
|
||||
result_digits = _multiply_abs(result_digits, base_abs)
|
||||
exponent = _halve(exponent)
|
||||
if not _is_zero(exponent):
|
||||
base_abs = _multiply_abs(base_abs, base_abs)
|
||||
|
||||
base_negative = base_sign < 0
|
||||
result_sign = -1 if base_negative and _is_odd(exp_digits) else 1
|
||||
if result_digits == "0":
|
||||
result_sign = 1
|
||||
result = result_digits if result_sign > 0 else f"-{result_digits}"
|
||||
return _write_result("pow", (num_a, num_b), result)
|
||||
|
||||
|
||||
def is_even(num: StreamNumber) -> bool:
|
||||
"""Return True if the streamed integer is even."""
|
||||
_, digits = _normalize_stream(num)
|
||||
return (ord(digits[-1]) - 48) % 2 == 0
|
||||
|
||||
|
||||
def is_odd(num: StreamNumber) -> bool:
|
||||
"""Return True if the streamed integer is odd."""
|
||||
return not is_even(num)
|
||||
|
||||
6
mathstream/exceptions.py
Normal file
@ -0,0 +1,6 @@
class MathStreamError(Exception):
    """Base class for mathstream-specific errors."""


class DivideByZeroError(MathStreamError):
    """Raised when division or modulo operations encounter a zero divisor."""
@ -1,27 +1,151 @@
import hashlib
import weakref
from collections import Counter
from pathlib import Path
from typing import Dict, Optional, Union

from .utils import (
    register_log_file,
    register_reference,
    touch_log_file,
    release_reference,
)

LOG_DIR = Path("./instance/log")


def _ensure_log_dir() -> None:
    LOG_DIR.mkdir(parents=True, exist_ok=True)


def _canonicalize_literal(value: str) -> str:
    raw = value.strip()
    if not raw:
        raise ValueError("Literal value cannot be empty.")

    sign = ""
    digits = raw
    if raw[0] in "+-":
        sign = "-" if raw[0] == "-" else ""
        digits = raw[1:]

    if not digits or not digits.isdigit():
        raise ValueError(f"Literal must be an integer string, got: {value!r}")

    digits = digits.lstrip("0") or "0"
    if digits == "0":
        sign = ""
    return f"{sign}{digits}"


def _is_in_log_dir(path: Path) -> bool:
    try:
        path.resolve().relative_to(LOG_DIR.resolve())
        return True
    except ValueError:
        return False
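Review note on `_canonicalize_literal`: `str.isdigit()` also returns True for some non-ASCII digit characters (for example the superscript `²`), so such input would pass validation and be written to the literal file. If that is not intended, a stricter check is easy to sketch. The function below is a hypothetical alternative (`canonicalize` and `_INT_RE` are names invented for this note), not the code in this change.

```python
import re

# Optional sign followed by one or more ASCII digits, nothing else.
_INT_RE = re.compile(r"([+-]?)([0-9]+)")


def canonicalize(value: str) -> str:
    """Normalize an integer literal: no '+', no leading zeros, no '-0'."""
    match = _INT_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"Literal must be an integer string, got: {value!r}")
    sign, digits = match.groups()
    digits = digits.lstrip("0") or "0"
    if digits == "0" or sign != "-":
        return digits
    return f"-{digits}"
```

For plain ASCII input this should agree with the function in the diff: `"+007"` becomes `"7"`, `"-0"` becomes `"0"`, and surrounding whitespace is ignored.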
class StreamNumber:
    # Old constructor (removed by this change; the compare view lists deleted lines too):
    def __init__(self, file_path):
        self.path = Path(file_path)
        if not self.path.exists():
            raise FileNotFoundError(self.path)
    # New constructor (added by this change):
    def __init__(
        self,
        file_path: Optional[Union[str, Path]] = None,
        *,
        literal: Optional[str] = None,
    ):
        if (file_path is None) == (literal is None):
            raise ValueError("Provide exactly one of file_path or literal.")

        if literal is not None:
            normalized = _canonicalize_literal(literal)
            _ensure_log_dir()
            literal_hash = hashlib.sha1(normalized.encode()).hexdigest()[:10]
            self.path = LOG_DIR / f"literal_{literal_hash}.txt"
            self.path.write_text(normalized, encoding="utf-8")
        else:
            self.path = Path(file_path)
            if not self.path.exists():
                raise FileNotFoundError(self.path)

        self.hash = hashlib.sha1(str(self.path).encode()).hexdigest()[:10]
        self._normalized_path = str(self.path.resolve())
        self._released = False

        _increment_active(self.path)

        if _is_in_log_dir(self.path):
            register_log_file(self.path)
            register_reference(self.path)

        self._finalizer = weakref.finalize(
            self, _finalize_instance, self._normalized_path
        )

    def __repr__(self):
        return f"<StreamNumber {self.path.name}>"

    def stream(self, chunk_size=4096):
        """Yield chunks of digits as strings."""
        if _is_in_log_dir(self.path):
            touch_log_file(self.path)
        with open(self.path, "r", encoding="utf-8") as f:
            while chunk := f.read(chunk_size):
                yield chunk.strip().replace(",", ".")

    def write_stage(self, stage, data: str):
        """Write intermediate stage result."""
        _ensure_log_dir()
        stage_file = LOG_DIR / f"{self.hash}_stage_{stage}.bin"
        with open(stage_file, "wb") as f:
            f.write(data.encode())
        register_log_file(stage_file)
        return stage_file

    def free(self, *, delete_file: bool = True) -> None:
        """Release this stream's reference and optionally delete the staged file."""
        if self._released:
            return
        self._released = True
        if self._finalizer.alive:
            self._finalizer.detach()
        _decrement_active(Path(self._normalized_path), delete_file=delete_file)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.free()
_ACTIVE_COUNTER: Counter[str] = Counter()


def _increment_active(path: Path) -> None:
    key = str(path.resolve())
    _ACTIVE_COUNTER[key] += 1


def _decrement_active(path: Path, delete_file: bool = True) -> None:
    key = str(path.resolve())
    current = _ACTIVE_COUNTER.get(key, 0)
    if current <= 1:
        _ACTIVE_COUNTER.pop(key, None)
    else:
        _ACTIVE_COUNTER[key] = current - 1

    if _is_in_log_dir(path):
        release_reference(path, delete_file=delete_file)


def _finalize_instance(path_str: str) -> None:
    _decrement_active(Path(path_str))


def free_stream(number: StreamNumber, *, delete_file: bool = True) -> None:
    """Convenience helper mirroring manual memory management semantics."""
    number.free(delete_file=delete_file)


def active_streams() -> Dict[str, int]:
    """Return the active StreamNumber paths mapped to in-memory reference counts."""
    return dict(_ACTIVE_COUNTER)
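Review note: the lifetime handling in this file combines a per-path `Counter`, a `weakref.finalize` callback and an explicit `free()` that detaches the finalizer so the release cannot run twice. The self-contained sketch below shows that pattern without any file or sqlite involvement. `Handle`, `ACTIVE` and `_release` are hypothetical names made up for this note.

```python
import gc
import weakref
from collections import Counter

ACTIVE = Counter()  # key -> number of live handles


def _release(key: str) -> None:
    """Drop one reference and forget the key when the last one goes."""
    if ACTIVE[key] <= 1:
        ACTIVE.pop(key, None)
    else:
        ACTIVE[key] -= 1


class Handle:
    def __init__(self, key: str):
        self.key = key
        self._released = False
        ACTIVE[key] += 1
        # The callback receives the key, not self, so it cannot keep the
        # object alive and block collection.
        self._finalizer = weakref.finalize(self, _release, key)

    def free(self) -> None:
        if self._released:
            return
        self._released = True
        self._finalizer.detach()  # garbage collection must not release again
        _release(self.key)


a = Handle("x")
b = Handle("x")
a.free()
a.free()      # second call is a no-op
del b         # finalizer fires when the object is collected
gc.collect()  # explicit collection for interpreters without refcounting
```

Passing the resolved path string to the finalizer, as the diff does with `self._normalized_path`, follows the same rule: a bound method or a reference to `self` in the callback would keep the object alive until interpreter exit.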
220
mathstream/utils.py
Normal file
@ -0,0 +1,220 @@
import sqlite3
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable, List, Dict

LOG_DB_PATH = Path("./instance/mathstream_logs.sqlite")


def _normalize_paths(paths: Iterable[Path]) -> List[str]:
    return [str(Path(p).resolve()) for p in paths]


def _ensure_db(reset: bool = False) -> None:
    LOG_DB_PATH.parent.mkdir(parents=True, exist_ok=True)
    with sqlite3.connect(LOG_DB_PATH) as conn:
        conn.execute(
            """
            CREATE TABLE IF NOT EXISTS logs (
                path TEXT PRIMARY KEY,
                created_at REAL,
                last_access REAL,
                access_count INTEGER DEFAULT 0
            )
            """
        )
        conn.execute(
            """
            CREATE TABLE IF NOT EXISTS refs (
                path TEXT PRIMARY KEY,
                ref_count INTEGER DEFAULT 0
            )
            """
        )
        if reset:
            conn.execute("DELETE FROM logs")
            conn.execute("DELETE FROM refs")
        conn.commit()


_ensure_db(reset=True)
def register_log_file(path: Path) -> None:
    """Ensure the log database is aware of a file's existence."""
    normalized = _normalize_paths([path])[0]
    _ensure_db()
    timestamp = datetime.now(timezone.utc).timestamp()
    with sqlite3.connect(LOG_DB_PATH) as conn:
        conn.execute(
            """
            INSERT INTO logs (path, created_at, last_access, access_count)
            VALUES (?, ?, ?, 0)
            ON CONFLICT(path)
            DO NOTHING
            """,
            (normalized, timestamp, timestamp),
        )
        conn.execute(
            """
            INSERT INTO refs (path, ref_count)
            VALUES (?, 0)
            ON CONFLICT(path)
            DO NOTHING
            """,
            (normalized,),
        )
        conn.commit()


def register_reference(path: Path) -> None:
    """Increment reference count similarly to Python's ref counter."""
    normalized = _normalize_paths([path])[0]
    _ensure_db()
    timestamp = datetime.now(timezone.utc).timestamp()
    with sqlite3.connect(LOG_DB_PATH) as conn:
        conn.execute(
            """
            INSERT INTO logs (path, created_at, last_access, access_count)
            VALUES (?, ?, ?, 1)
            ON CONFLICT(path)
            DO NOTHING
            """,
            (normalized, timestamp, timestamp),
        )
        conn.execute(
            """
            INSERT INTO refs (path, ref_count)
            VALUES (?, 1)
            ON CONFLICT(path)
            DO UPDATE SET ref_count = ref_count + 1
            """,
            (normalized,),
        )
        conn.execute(
            """
            UPDATE logs
            SET last_access = ?, access_count = access_count + 1
            WHERE path = ?
            """,
            (timestamp, normalized),
        )
        conn.commit()
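Review note: `register_reference` relies on SQLite's upsert clause (`INSERT ... ON CONFLICT ... DO UPDATE`), which needs SQLite 3.24 or newer. The sketch below isolates that one statement against an in-memory database so the counting behaviour can be checked without touching `instance/`. The helper names `add_ref` and `ref_count` are invented for this note.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE refs (path TEXT PRIMARY KEY, ref_count INTEGER DEFAULT 0)"
)


def add_ref(path: str) -> None:
    """Insert the path with count 1, or bump the count if it already exists."""
    conn.execute(
        """
        INSERT INTO refs (path, ref_count) VALUES (?, 1)
        ON CONFLICT(path) DO UPDATE SET ref_count = ref_count + 1
        """,
        (path,),
    )


def ref_count(path: str) -> int:
    """Return the stored count, or 0 for an unknown path."""
    row = conn.execute(
        "SELECT ref_count FROM refs WHERE path = ?", (path,)
    ).fetchone()
    return row[0] if row else 0


add_ref("a.txt")
add_ref("a.txt")
add_ref("b.txt")
```

Inside `DO UPDATE`, the unqualified `ref_count` refers to the row already in the table, so the statement is an atomic increment rather than a read followed by a write.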
def touch_log_file(path: Path) -> None:
    """Refresh access metadata when a file is streamed."""
    normalized = _normalize_paths([path])[0]
    _ensure_db()
    timestamp = datetime.now(timezone.utc).timestamp()
    with sqlite3.connect(LOG_DB_PATH) as conn:
        conn.execute(
            """
            INSERT INTO logs (path, created_at, last_access, access_count)
            VALUES (?, ?, ?, 1)
            ON CONFLICT(path)
            DO UPDATE SET
                last_access = excluded.last_access,
                access_count = logs.access_count + 1
            """,
            (normalized, timestamp, timestamp),
        )
        conn.commit()


def wipe_log_records() -> None:
    """Drop all bookkeeping (used after manual log purges)."""
    _ensure_db()
    with sqlite3.connect(LOG_DB_PATH) as conn:
        conn.execute("DELETE FROM logs")
        conn.execute("DELETE FROM refs")
        conn.commit()


def _delete_records(paths: List[Path]) -> None:
    if not paths:
        return
    normalized = [(str(p.resolve()),) for p in paths]
    with sqlite3.connect(LOG_DB_PATH) as conn:
        conn.executemany("DELETE FROM logs WHERE path = ?", normalized)
        conn.executemany("DELETE FROM refs WHERE path = ?", normalized)
        conn.commit()
def collect_garbage(score_threshold: float) -> list[Path]:
    """Remove seldom-used staged files based on an age/refcount score."""
    if score_threshold < 0:
        raise ValueError("score_threshold must be non-negative")
    _ensure_db()
    now = datetime.now(timezone.utc).timestamp()
    with sqlite3.connect(LOG_DB_PATH) as conn:
        rows = conn.execute(
            """
            SELECT
                l.path,
                COALESCE(l.created_at, ?),
                COALESCE(l.last_access, l.created_at, ?),
                COALESCE(l.access_count, 0),
                COALESCE(r.ref_count, 0)
            FROM logs l
            LEFT JOIN refs r ON l.path = r.path
            """,
            (now, now),
        ).fetchall()

    removed: list[Path] = []
    for path_str, created_at, last_access, access_count, ref_count in rows:
        path = Path(path_str)
        age = now - (last_access or created_at or now)
        score = age / ((ref_count + 1) * (access_count + 1))
        if score < score_threshold:
            continue
        if path.exists():
            try:
                path.unlink()
            except OSError:
                continue
        removed.append(path)

    _delete_records(removed)
    return removed
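Review note: the eviction score in `collect_garbage` is `age / ((ref_count + 1) * (access_count + 1))`, and a file is removed when its score is at or above the threshold. The small sketch below restates the formula with two worked values. `gc_score` and `should_collect` are names invented for this note.

```python
def gc_score(age_seconds: float, ref_count: int, access_count: int) -> float:
    """Older files score higher; references and accesses pull the score down."""
    return age_seconds / ((ref_count + 1) * (access_count + 1))


def should_collect(score: float, threshold: float) -> bool:
    """Mirror of the `if score < score_threshold: continue` guard."""
    return score >= threshold


# Ten minutes idle, never referenced or read: 600 / (1 * 1)
idle = gc_score(600, ref_count=0, access_count=0)

# Same age, one live reference and two reads: 600 / (2 * 3)
busy = gc_score(600, ref_count=1, access_count=2)
```

One consequence worth keeping in mind: with a threshold of 0 every tracked file qualifies, including files that still have a non-zero `ref_count`, because the score can never be below zero as long as timestamps do not lie in the future. The `collect_garbage(0)` call in `test.py` therefore acts as a full purge of tracked files.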
def release_reference(path: Path, delete_file: bool = True) -> bool:
    """Decrease the reference count and optionally delete the file when it hits zero."""
    normalized = _normalize_paths([path])[0]
    _ensure_db()
    with sqlite3.connect(LOG_DB_PATH) as conn:
        row = conn.execute(
            "SELECT ref_count FROM refs WHERE path = ?", (normalized,)
        ).fetchone()
        if row is None:
            return False
        current = row[0] or 0
        new_count = max(current - 1, 0)
        if new_count > 0:
            conn.execute(
                "UPDATE refs SET ref_count = ? WHERE path = ?", (new_count, normalized)
            )
            conn.commit()
            return False
        conn.execute("DELETE FROM refs WHERE path = ?", (normalized,))
        conn.execute("DELETE FROM logs WHERE path = ?", (normalized,))
        conn.commit()

    removed = False
    if delete_file and path.exists():
        try:
            path.unlink()
            removed = True
        except OSError:
            removed = False
    return removed


def tracked_files() -> Dict[str, int]:
    """Return a mapping of tracked file paths to their reference counts."""
    _ensure_db()
    with sqlite3.connect(LOG_DB_PATH) as conn:
        rows = conn.execute("SELECT path, ref_count FROM refs").fetchall()
    return {path: ref_count for path, ref_count in rows}
98
seed_start.py
Normal file
@ -0,0 +1,98 @@
#!/usr/bin/env python3
"""
Ultra-fast seed generator for mathstream start.txt.

Usage:
    python seed_start.py --seed 10 --mode huge

Modes:
    ur = /dev/urandom (1 byte per step)
    ran = Python random.randint(0,255)
    asc = random printable ASCII ord()
    seq = deterministic sequence 0–255 loop
    huge = massive random digit chunks (SSD-limited chaos)
"""

import argparse
import random
import time
from pathlib import Path
from mathstream import StreamNumber, add, clear_logs
from tqdm import tqdm


def archive_start_file(start_path: Path):
    """Archive old start.txt and reset to 0."""
    if start_path.exists():
        timestamp = int(time.time())
        backup = start_path.with_name(f"start.{timestamp}.txt")
        backup.write_text(start_path.read_text())
    start_path.write_text("0")


def seed_once(start_path: Path, byte_val: str):
    """Add a single number (string form) to start.txt using mathstream (streamed)."""
    current = StreamNumber(start_path)
    delta = StreamNumber(literal=byte_val)
    result = add(current, delta)
    new_value = "".join(result.stream())
    start_path.write_text(new_value)


def fast_huge_random_string(size_bytes=65536):
    """Return a huge decimal string generated from /dev/urandom bytes."""
    with open("/dev/urandom", "rb") as rnd:
        chunk = rnd.read(size_bytes)
    # Convert to digits quickly
    digits = ''.join(str(b % 10) for b in chunk)
    # Trim leading zeros so mathstream doesn’t choke on '00000'
    return digits.lstrip('0') or "0"
def main():
    parser = argparse.ArgumentParser(description="Fast seeding for start.txt using mathstream")
    parser.add_argument("--seed", type=int, required=True, help="number of random additions")
    parser.add_argument("--mode", choices=["ur", "ran", "asc", "seq", "huge"], default="ur", help="random mode")
    parser.add_argument("--chunk", type=int, default=65536,
                        help="bytes per chunk for huge mode (default 64KB)")
    args = parser.parse_args()

    start_path = Path("start.txt")

    clear_logs()
    archive_start_file(start_path)

    print(f"Seeding {args.seed} iterations with mode '{args.mode}'")

    seq_val = 0

    if args.mode == "ur":
        with open("/dev/urandom", "rb") as rnd:
            for _ in tqdm(range(args.seed), desc="Seeding", unit="byte", ncols=80):
                byte_val = rnd.read(1)[0]
                seed_once(start_path, str(byte_val))

    elif args.mode == "ran":
        for _ in tqdm(range(args.seed), desc="Seeding", unit="val", ncols=80):
            seed_once(start_path, str(random.randint(0, 255)))

    elif args.mode == "asc":
        printable = [chr(i) for i in range(32, 127)]
        for _ in tqdm(range(args.seed), desc="Seeding", unit="char", ncols=80):
            seed_once(start_path, str(ord(random.choice(printable))))

    elif args.mode == "seq":
        for _ in tqdm(range(args.seed), desc="Seeding", unit="seq", ncols=80):
            seed_once(start_path, str(seq_val))
            seq_val = (seq_val + 1) % 256

    elif args.mode == "huge":
        for _ in tqdm(range(args.seed), desc="Seeding", unit="huge", ncols=80):
            huge_str = fast_huge_random_string(args.chunk)
            seed_once(start_path, huge_str)

    print(f"\nFinal start.txt value: {start_path.read_text().strip()}")


if __name__ == "__main__":
    main()
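Review note on `fast_huge_random_string`: mapping a byte to a digit with `b % 10` is slightly biased, because 256 is not a multiple of 10 (digits 0 to 5 each come from 26 byte values, digits 6 to 9 from 25). For a seeding script that may not matter. If uniform digits are wanted, rejection sampling is a small change. The sketch below is a hypothetical alternative (`random_digits` is a name invented here) that uses `os.urandom` instead of opening `/dev/urandom` directly, which also makes it portable.

```python
import os


def random_digits(count: int) -> str:
    """Return `count` uniformly distributed decimal digits as a string."""
    out = []
    while len(out) < count:
        for b in os.urandom(count - len(out)):
            # 250 is a multiple of 10, so b % 10 is uniform for b < 250.
            # Bytes 250..255 are discarded and replaced on the next pass.
            if b < 250:
                out.append(str(b % 10))
    return "".join(out)
```

Unlike the function in the diff, this sketch does not strip leading zeros; a caller that feeds the result to `StreamNumber(literal=...)` can rely on the literal canonicalisation for that.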
1
start.txt
Normal file
@ -0,0 +1 @@
55569392576944383732069997790263232211253447162098935262971634652345115098934212633724484589756741539606575
139
test.py
Normal file
@ -0,0 +1,139 @@
from __future__ import annotations

from pathlib import Path

from mathstream import (
    StreamNumber,
    add,
    sub,
    mul,
    div,
    mod,
    pow,
    is_even,
    is_odd,
    clear_logs,
    collect_garbage,
    DivideByZeroError,
    active_streams,
    tracked_files,
)

NUMBERS_DIR = Path(__file__).parent / "tests"

def write_number(name: str, digits: str) -> StreamNumber:
    """Persist digits to disk and return a streamable handle."""
    NUMBERS_DIR.mkdir(parents=True, exist_ok=True)
    target = NUMBERS_DIR / f"{name}.txt"
    target.write_text(digits, encoding="utf-8")
    return StreamNumber(target)


def read_number(num: StreamNumber) -> str:
    """Collapse streamed chunks back into a concrete string."""
    return "".join(num.stream())


def check(label: str, result: StreamNumber, expected: str) -> None:
    actual = read_number(result)
    assert (
        actual == expected
    ), f"{label} expected {expected}, got {actual}"
    print(f"{label} = {actual}")


def check_bool(label: str, value: bool, expected: bool) -> None:
    assert value is expected, f"{label} expected {expected}, got {value}"
    print(f"{label} = {value}")
def main() -> None:
    clear_logs()

    # Build a handful of example operands on disk.
    big = write_number("huge", "98765432123456789")
    small = write_number("tiny", "34567")
    negative = write_number("negative", "-1200")
    exponent = write_number("power", "5")
    negative_divisor = write_number("neg_divisor", "-34567")
    literal_even = StreamNumber(literal="2000")
    literal_odd = StreamNumber(literal="-3")
    zero_literal = StreamNumber(literal="0")

    # Showcase the core operations.
    total = add(big, small)
    difference = sub(big, small)
    product = mul(small, negative)
    quotient = div(big, small)
    powered = pow(small, exponent)
    modulus = mod(big, small)
    neg_mod_pos = mod(negative, small)
    pos_mod_neg = mod(small, negative)
    neg_mod_neg = mod(negative, negative_divisor)
    literal_combo = add(literal_even, literal_odd)

    print("Operands stored under:", NUMBERS_DIR)
    check("huge + tiny", total, "98765432123491356")
    check("huge - tiny", difference, "98765432123422222")
    check("tiny * negative", product, "-41480400")
    check("huge // tiny", quotient, "2857217349595")
    check("tiny ** power", powered, "49352419431622775997607")
    check("huge % tiny", modulus, "6424")
    check("negative % tiny", neg_mod_pos, "33367")
    check("tiny % negative", pos_mod_neg, "-233")
    check("negative % neg_divisor", neg_mod_neg, "-1200")
    check("literal_even + literal_odd", literal_combo, "1997")
    check_bool("is_even(negative)", is_even(negative), True)
    check_bool("is_even(tiny)", is_even(small), False)
    check_bool("is_odd(tiny)", is_odd(small), True)
    check_bool("is_odd(negative)", is_odd(negative), False)
    check_bool("is_even(literal_even)", is_even(literal_even), True)
    check_bool("is_odd(literal_odd)", is_odd(literal_odd), True)

    # Custom exception coverage
    try:
        div(literal_even, zero_literal)
    except DivideByZeroError:
        print("div(literal_even, zero_literal) raised DivideByZeroError as expected")
    else:
        raise AssertionError("div by zero did not raise DivideByZeroError")

    try:
        mod(literal_even, zero_literal)
    except DivideByZeroError:
        print("mod(literal_even, zero_literal) raised DivideByZeroError as expected")
    else:
        raise AssertionError("mod by zero did not raise DivideByZeroError")

    # manual frees should immediately drop staged files
    staged = [
        total,
        difference,
        product,
        quotient,
        powered,
        modulus,
        neg_mod_pos,
        pos_mod_neg,
        neg_mod_neg,
        literal_combo,
    ]
    for stream in staged:
        stream.free()

    literal_even.free()
    literal_odd.free()
    zero_literal.free()

    check_bool("total freed file gone", total.path.exists(), False)
    check_bool("literal_even freed file gone", literal_even.path.exists(), False)

    removed = collect_garbage(0)
    print(f"collect_garbage removed {len(removed)} files after manual free")
    check_bool("huge operand persists", big.path.exists(), True)
    print("Active streams:", active_streams())
    print("Tracked files:", tracked_files())


if __name__ == "__main__":
    main()
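Review note: the expected values for the signed `mod` checks above (`33367`, `-233`, `-1200`) match Python's floor semantics, where the remainder takes the sign of the divisor. The `div` and `mod` implementations are not part of this section of the diff, so the sketch below only shows one way to derive floor results from a division of magnitudes. `floor_divmod` and the local `DivideByZeroError` stand-in are hypothetical names for this note.

```python
class DivideByZeroError(Exception):
    """Local stand-in for mathstream's DivideByZeroError."""


def floor_divmod(a: int, b: int):
    """Floor division and modulo built from a division of absolute values."""
    if b == 0:
        raise DivideByZeroError("zero divisor")
    q, r = abs(a) // abs(b), abs(a) % abs(b)
    if (a < 0) != (b < 0):
        # Signs differ: the true quotient is negative, so round toward
        # negative infinity and flip the remainder into the divisor's range.
        q = -q
        if r:
            q -= 1
            r = abs(b) - r
    if b < 0:
        r = -r  # the remainder carries the divisor's sign
    return q, r
```

The three signed cases from `test.py` can be checked directly against Python's own `divmod`, which uses the same convention.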
1
tests/huge.txt
Normal file
@ -0,0 +1 @@
98765432123456789
1
tests/neg_divisor.txt
Normal file
@ -0,0 +1 @@
-34567
1
tests/negative.txt
Normal file
@ -0,0 +1 @@
-1200
1
tests/power.txt
Normal file
@ -0,0 +1 @@
5
1
tests/start.1762331106.txt
Normal file
@ -0,0 +1 @@
993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887993284887
1
tests/tiny.txt
Normal file
@ -0,0 +1 @@
34567