mathy/mathstream/README.md
2025-11-05 08:35:01 +01:00

# Mathstream Library
`mathstream` offers streamed, string-based arithmetic for very large integers that you may not want to load entirely into memory. Instead of parsing numbers into Python `int` values, you work with digit files on disk via `StreamNumber` and call math operations that operate chunk-by-chunk.
## Quick Start
```bash
python -m venv venv
source venv/bin/activate
pip install -e .
```
Create digit files anywhere you like (the examples below use `instance/log`), or supply ad-hoc literals, then construct `StreamNumber` objects and call the helpers:
```python
from mathstream import (
    StreamNumber,
    add,
    sub,
    mul,
    div,
    mod,
    pow,
    is_even,
    is_odd,
    collect_garbage,
)
a = StreamNumber("instance/log/huge.txt")
b = StreamNumber(literal="34567")
e = StreamNumber(literal="3")
print("sum =", "".join(add(a, b).stream()))
print("difference =", "".join(sub(a, b).stream()))
print("product =", "".join(mul(a, b).stream()))
print("quotient =", "".join(div(a, b).stream()))
print("modulo =", "".join(mod(a, b).stream()))
print("power =", "".join(pow(a, e).stream()))
print("a is even?", is_even(a))
print("b is odd?", is_odd(b))
# reclaim space for files whose age outweighs their use
collect_garbage(0.5)
```
Each arithmetic call writes its result back into `instance/log` (configurable via `mathstream.number.LOG_DIR`) so you can stream the digits later or reuse them in further operations.
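
To picture how staged outputs can be named deterministically, here is a minimal sketch. It is not mathstream's actual code: the helper `staged_path`, the SHA-256 digest, the `<op>_<digest>.txt` naming, and the sample file names are all assumptions made for illustration. Only the idea of deriving an output name from the input file paths comes from this README.

```python
import hashlib
from pathlib import Path

LOG_DIR = Path("instance/log")  # mirrors the default directory named in this README


def staged_path(op: str, *input_paths: str) -> Path:
    """Hypothetical helper: map an operation and its input paths to an output file."""
    key = "|".join((op, *input_paths)).encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()[:16]
    return LOG_DIR / f"{op}_{digest}.txt"


# The input file names below are made up for the example.
first = staged_path("add", "instance/log/huge.txt", "instance/log/literal_ab12.txt")
again = staged_path("add", "instance/log/huge.txt", "instance/log/literal_ab12.txt")
other = staged_path("mul", "instance/log/huge.txt", "instance/log/literal_ab12.txt")

print(first == again)  # same operation and inputs map to the same file
print(first == other)  # a different operation maps to a different file
```

Under this scheme, repeating an operation on the same inputs lands on the same staged file, which is what makes reuse across calls possible.
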
## Core Concepts
- **`StreamNumber(path | literal=...)`**: Wraps a digit text file, or creates one inside `LOG_DIR` for an integer literal. Literal operands are persisted as `literal_<hash>.txt`, so repeated runs reuse the same staged file (note that `clear_logs()` removes these cache files too).
- **`.stream(chunk_size)`**: Yields the digits as strings of the given chunk size. Operations in `mathstream.engine` consume these streams to avoid loading the entire number at once.
- **Automatic staging**: Outputs are stored under `LOG_DIR` with hashes based on input file paths, letting you compose operations without manual bookkeeping.
- **Sign-aware**: Addition, subtraction, multiplication, division (`//` behavior), modulo, and exponentiation (non-negative exponents only) all respect operand sign. Division and modulo follow Python's floor-division rules.
- **Utilities**: `clear_logs()` wipes prior staged results so you can start fresh.
- **Parity helpers**: `is_even` and `is_odd` inspect the streamed digits without materializing the integer.
- **Garbage collection**: `collect_garbage(score_threshold)` computes a score from file age, access count, and reference count (tracked in `instance/mathstream_logs.sqlite`, which is truncated at the start of each run). Files whose score meets or exceeds the threshold are deleted, letting you tune how aggressively to reclaim space. Both staged results and literal caches participate.
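
The streaming and parity ideas above can be sketched with the standard library alone. This is an illustration, not the library's implementation: `stream_digits` and `is_even_streamed` are hypothetical stand-ins for `.stream()` and `is_even`, and the sketch assumes a plain ASCII digit file with an optional trailing newline.

```python
import os
import tempfile
from typing import Iterator


def stream_digits(path: str, chunk_size: int = 4096) -> Iterator[str]:
    """Yield the file's contents in chunks of at most chunk_size characters."""
    with open(path, "r", encoding="ascii") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def is_even_streamed(chunks: Iterator[str]) -> bool:
    """Decide parity from the last digit seen, holding one chunk at a time."""
    last_digit = None
    for chunk in chunks:
        stripped = chunk.strip()  # ignore a trailing newline
        if stripped:
            last_digit = stripped[-1]
    if last_digit is None:
        raise ValueError("no digits found")
    return int(last_digit) % 2 == 0


digits = "12345678901234567890" * 50 + "7"  # a 1001-digit odd number

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "digits.txt")
    with open(path, "w", encoding="ascii") as handle:
        handle.write(digits + "\n")
    chunks = list(stream_digits(path, chunk_size=64))
    number_is_even = is_even_streamed(stream_digits(path, chunk_size=64))

print(len(chunks), number_is_even)
```

Only the final digit decides parity, so the check never needs more than one chunk in memory at a time.
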

Divide-by-zero scenarios raise the custom `DivideByZeroError` so callers can distinguish mathstream issues from Python's native exceptions.
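
For readers unfamiliar with floor-division sign rules, the sketch below derives a signed quotient and remainder from a division on magnitudes. It uses ordinary Python `int` values for clarity rather than streams. The `DivideByZeroError` class here is a local stand-in defined only for the example, and `floor_divmod` is a hypothetical name, not part of mathstream.

```python
class DivideByZeroError(Exception):
    """Local stand-in for the library's exception (defined here for the example)."""


def floor_divmod(a: int, b: int) -> tuple:
    """Return (quotient, remainder) following Python's floor-division rules."""
    if b == 0:
        raise DivideByZeroError("division by zero")
    # Divide the magnitudes first, then fix up the signs.
    q, r = divmod(abs(a), abs(b))
    if (a < 0) != (b < 0):
        if r != 0:
            # Signs differ and the division is inexact: round toward -infinity.
            q = -q - 1
            r = abs(b) - r
        else:
            q = -q
    # A non-zero remainder always takes the sign of the divisor.
    if b < 0:
        r = -r
    return q, r


print(floor_divmod(-7, 2))   # (-4, 1)
print(floor_divmod(7, -2))   # (-4, -1)
print(floor_divmod(-7, -2))  # (3, -1)
```

The results match Python's built-in `divmod` for non-zero divisors, which is the behavior the bullet list above describes for `div` and `mod`.
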
## Example Script
`test.py` in the repository root demonstrates a minimal workflow:
1. Writes sample operands to `tests/*.txt`.
2. Calls every arithmetic primitive plus the modulo/parity helpers.
3. Asserts that the streamed outputs match known values (helpful for quick regression checks).

Run it via:
```bash
python test.py
```
## Extending
- To hook into other storage backends, implement your own `StreamNumber` variant with the same `.stream()` interface.
- Need gcd or another derived operation? Compose the existing primitives (e.g., repeated subtraction, or `div` plus the remainder tracking inside `_divide_abs`), or add new helpers following the same streamed pattern.
- For more control over output locations, override `LOG_DIR` before using the operations:
```python
from mathstream import engine
from pathlib import Path
engine.LOG_DIR = Path("/tmp/my_mathstage")
engine.clear_logs()
```
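
As a rough picture of the streamed pattern mentioned above, the following sketch adds two non-negative decimal strings one fixed-width chunk at a time, carrying between chunks. It is an assumption-laden illustration: `add_chunked` is a hypothetical name, the inputs are held as in-memory strings for brevity (a file-backed version would read chunks from disk instead), and the inputs are assumed to have no sign and no leading zeros.

```python
def add_chunked(x: str, y: str, chunk_size: int = 9) -> str:
    """Add two non-negative decimal strings one fixed-width chunk at a time."""
    base = 10 ** chunk_size
    pieces = []
    carry = 0
    i, j = len(x), len(y)  # walk both operands from the least significant end
    while i > 0 or j > 0 or carry:
        xa = int(x[max(0, i - chunk_size):i]) if i > 0 else 0
        ya = int(y[max(0, j - chunk_size):j]) if j > 0 else 0
        carry, part = divmod(xa + ya + carry, base)
        i -= chunk_size
        j -= chunk_size
        more = i > 0 or j > 0 or carry
        # Inner chunks keep their leading zeros; the top chunk does not.
        pieces.append(str(part).zfill(chunk_size) if more else str(part))
    return "".join(reversed(pieces)) or "0"


print(add_chunked("999999999", "1"))
print(add_chunked("9" * 50, "1", chunk_size=4))
```

Each step touches only `chunk_size` digits of each operand plus a carry, which is the property that lets the same approach work on digit files too large to load whole.
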
With these building blocks, you can manipulate arbitrarily large integers without loading them fully into memory. Happy streaming!