Persistence
Oxibase provides durable storage through a combination of Write-Ahead Logging (WAL) and periodic snapshots. This architecture ensures data durability while maintaining high performance.
Overview
Oxibase’s persistence layer consists of two main components:
- Write-Ahead Log (WAL): Records all changes before they’re applied to memory
- Snapshots: Periodic full copies of the database state
This dual approach provides:
- Durability: Changes are persisted before acknowledgment
- Fast Recovery: Snapshots reduce recovery time
- Crash Safety: WAL ensures no committed transactions are lost
Enabling Persistence
To enable persistence, use a file:// connection string:
use oxibase::Database;
// In-memory only (no persistence)
let db = Database::open("memory://")?;
// With disk persistence
let db = Database::open("file:///path/to/database")?;
Command line:
# In-memory
oxibase
# With persistence
oxibase --db "file:///path/to/database"
Write-Ahead Log (WAL)
How WAL Works
- When a transaction commits, changes are first written to the WAL file
- The WAL is synced to disk (based on sync_mode)
- Changes are then applied to the in-memory structures
- Transaction is acknowledged to the client
This sequence ensures that committed transactions survive crashes.
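Concretely, by the time a write call returns, this sequence has already completed, so the change is durable according to the configured sync_mode. A minimal sketch (assuming a `users` table with a `name` column already exists):

```rust
use oxibase::Database;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let db = Database::open("file:///path/to/database")?;

    // When this call returns, the INSERT has already been appended to the WAL
    // and synced to disk according to sync_mode, so it survives a crash that
    // happens immediately afterwards.
    db.execute("INSERT INTO users (name) VALUES ('alice')")?;

    Ok(())
}
```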
WAL Manager Responsibilities
The WAL manager handles sequential writes with durability guarantees:
- Sequential writes with CRC32 checksums for corruption detection
- Two-phase recovery: Phase 1 identifies committed transactions, Phase 2 applies their changes
- Optional compression for reduced I/O when enabled
- Atomic writes ensuring either complete entries or none
WAL Entry Structure:
[WAL Header]
    Magic: 0x57414C00
    Version: 1
    Checksum: CRC32
[Entry 1]
    Timestamp: i64
    TxnId: i64
    OpType: Insert/Update/Delete
    TableId: i64
    RowData: [serialized row]
    Checksum: CRC32
[Entry 2]
...
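The sketch below shows how an entry with this layout could be framed and checksummed. The `WalEntry` type, the field encoding, and the `crc32fast` dependency are illustrative assumptions, not Oxibase's actual implementation:

```rust
use crc32fast::Hasher;

// Hypothetical mirror of the on-disk entry layout shown above.
struct WalEntry {
    timestamp: i64,
    txn_id: i64,
    op_type: u8, // e.g. 0 = Insert, 1 = Update, 2 = Delete
    table_id: i64,
    row_data: Vec<u8>, // serialized row
}

impl WalEntry {
    /// Serialize the entry and append a CRC32 over the payload so that
    /// corruption can be detected during recovery.
    fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::new();
        buf.extend_from_slice(&self.timestamp.to_le_bytes());
        buf.extend_from_slice(&self.txn_id.to_le_bytes());
        buf.push(self.op_type);
        buf.extend_from_slice(&self.table_id.to_le_bytes());
        buf.extend_from_slice(&(self.row_data.len() as u32).to_le_bytes());
        buf.extend_from_slice(&self.row_data);

        let mut hasher = Hasher::new();
        hasher.update(&buf);
        buf.extend_from_slice(&hasher.finalize().to_le_bytes());
        buf
    }
}
```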
WAL Operations
The WAL records these operations:
- INSERT: New row insertions with full row data
- UPDATE: Row modifications with old→new value pairs
- DELETE: Row deletions with deletion markers
- CREATE TABLE: Table creation with schema metadata (DDL)
- DROP TABLE: Table deletion with cleanup operations (DDL)
- CREATE INDEX: Index creation with index specifications (DDL)
- DROP INDEX: Index deletion operations (DDL)
- ALTER TABLE: Schema changes with transformation rules
WAL Segment Management
WAL files are managed in segments for efficient rotation:
- Active segment: Currently being written to
- Archived segments: Completed segments waiting for snapshot
- Cleanup: Old segments deleted after successful snapshots
- Recovery: Segments replayed in order during startup
WAL Configuration
Configure WAL behavior using PRAGMA:
-- Sync mode: 0=None, 1=Normal (default), 2=Full
PRAGMA sync_mode = 1;
-- Number of operations before automatic WAL flush
PRAGMA wal_flush_trigger = 32768;
| Sync Mode | Value | Behavior |
|---|---|---|
| None | 0 | No sync (fastest, but data may be lost on crash) |
| Normal | 1 | Sync on commit (balanced performance and durability) |
| Full | 2 | Sync every operation (slowest, maximum durability) |
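The mode can also be changed at runtime from application code, for example to tighten durability around a critical batch of writes:

```rust
use oxibase::Database;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let db = Database::open("file:///path/to/database")?;

    // Sync every operation while the critical writes run.
    db.execute("PRAGMA sync_mode = 2")?;
    // ... critical writes ...

    // Return to the default balanced mode afterwards.
    db.execute("PRAGMA sync_mode = 1")?;
    Ok(())
}
```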
WAL Files
WAL files are stored in the database directory:
/path/to/database/
    wal/
        wal_000001.log
        wal_000002.log
        ...
Old WAL files are automatically cleaned up after successful snapshots.
Snapshots
How Snapshots Work
Snapshots capture the complete database state at a point in time:
- All table data and schema
- All index definitions
- Current transaction state
After a snapshot is created, older WAL entries can be safely deleted.
Snapshot System Architecture
The snapshot system uses a binary format with atomic writes:
- Binary format with magic bytes (`0x50414E53` = "SNAP")
- Atomic 3-phase writes: temp file → sync → rename for crash safety
- Tracks source LSN for incremental recovery
- Compressed storage to reduce disk usage
Snapshot File Structure:
[Snapshot Header]
    Magic: 0x50414E53
    Version: 1
    Timestamp: i64
    LSN: i64 (WAL position)
    TableCount: u32
    Checksum: CRC32
[Table 1 Schema]
    TableId: i64
    Name: String
    Columns: [Column definitions]
    Indexes: [Index definitions]
[Table 1 Data]
    RowCount: u64
    Rows: [serialized rows...]
[Table 2 Schema]
...
[Footer]
    TotalSize: u64
    Checksum: CRC32
3-Phase Atomic Write Process
Snapshots use atomic writes to prevent corruption:
- Phase 1 - Write: Create temporary file with complete snapshot data
- Phase 2 - Sync: Force all writes to disk with fsync()
- Phase 3 - Rename: Atomically move temp file to final location
This ensures either a complete, valid snapshot exists, or none at all.
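A minimal sketch of the same write → sync → rename pattern using the standard library (illustrative only, not Oxibase's internal snapshot writer):

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

/// Write `data` to `final_path` atomically: a crash at any point leaves either
/// the previous file or the complete new file, never a partial one.
fn atomic_write(final_path: &Path, data: &[u8]) -> std::io::Result<()> {
    let tmp_path = final_path.with_extension("tmp");

    // Phase 1 - Write: create a temporary file with the complete contents.
    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(data)?;

    // Phase 2 - Sync: force the writes to disk before making them visible.
    tmp.sync_all()?;

    // Phase 3 - Rename: atomically move the temp file to its final location.
    fs::rename(&tmp_path, final_path)
}
```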
Snapshot Configuration
-- Interval between automatic snapshots (in seconds, default: 300)
PRAGMA snapshot_interval = 300;
-- Number of snapshots to retain (default: 5)
PRAGMA keep_snapshots = 5;
-- Manually create a snapshot
PRAGMA create_snapshot;
Snapshot Files
Snapshots are stored as binary files:
/path/to/database/
    snapshots/
        snapshot_1704067200.bin
        snapshot_1704067500.bin
        ...
The filename includes the Unix timestamp of creation for easy sorting.
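Because the timestamp is embedded in the name, the newest snapshot can be located with a simple directory scan. The helper below is an illustrative sketch, not part of Oxibase's API:

```rust
use std::fs;
use std::path::PathBuf;

/// Return the most recent snapshot in `dir` by parsing the Unix timestamp
/// out of filenames of the form `snapshot_<timestamp>.bin`.
fn latest_snapshot(dir: &str) -> std::io::Result<Option<PathBuf>> {
    let mut newest: Option<(u64, PathBuf)> = None;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let Some(name) = path.file_name().and_then(|n| n.to_str()) else {
            continue;
        };
        if let Some(ts) = name
            .strip_prefix("snapshot_")
            .and_then(|rest| rest.strip_suffix(".bin"))
            .and_then(|ts| ts.parse::<u64>().ok())
        {
            if newest.as_ref().map_or(true, |(t, _)| ts > *t) {
                newest = Some((ts, path));
            }
        }
    }
    Ok(newest.map(|(_, path)| path))
}
```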
Snapshot Retention Policy
- Automatic cleanup: Old snapshots are deleted when `keep_snapshots` is exceeded
- Time-based retention: Can be configured for compliance requirements
- Manual retention: Important snapshots can be preserved indefinitely
Recovery Process
When opening a database, Oxibase performs recovery automatically:
2-Phase Recovery Algorithm
Phase 1 - Analysis:
- Read snapshot metadata to get checkpoint LSN (Log Sequence Number)
- Scan WAL entries to identify committed vs. uncommitted transactions
- Build transaction dependency graph for proper ordering
Phase 2 - Redo:
- Load table snapshots for fastest recovery path
- Replay WAL entries from checkpoint LSN forward
- Apply only entries from committed transactions
- Rebuild indexes in single pass after replay completes
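In outline, the algorithm amounts to the sketch below; the `WalEntry` type, the `apply` helper, and the way commit markers are collected are hypothetical placeholders for the real WAL structures:

```rust
use std::collections::HashSet;

// Hypothetical placeholder for a decoded WAL entry.
struct WalEntry {
    lsn: i64,
    txn_id: i64,
    // ... operation payload ...
}

fn recover(entries: &[WalEntry], committed_txns: &[i64], checkpoint_lsn: i64) {
    // Phase 1 - Analysis: collect the set of transactions that committed.
    let committed: HashSet<i64> = committed_txns.iter().copied().collect();

    // Phase 2 - Redo: replay entries after the snapshot checkpoint, in LSN
    // order, applying only those belonging to committed transactions.
    for entry in entries.iter().filter(|e| e.lsn > checkpoint_lsn) {
        if committed.contains(&entry.txn_id) {
            apply(entry);
        }
    }
    // Indexes are rebuilt in a single pass once replay completes.
}

fn apply(_entry: &WalEntry) {
    // Apply the INSERT/UPDATE/DELETE or DDL operation to in-memory tables.
}
```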
Detailed Recovery Flow
graph TD
A[Database Open] --> B{Find Latest Snapshot?}
B -->|Yes| C[Load Snapshot Metadata]
B -->|No| D[Start from Empty State]
C --> E[Get Checkpoint LSN]
D --> E
E --> F[Phase 1: Analyze WAL]
F --> G[Identify Committed Transactions]
G --> H[Build Transaction Graph]
H --> I[Phase 2: Replay WAL]
I --> J[Apply Committed Entries]
J --> K[Rebuild Indexes]
K --> L[Verify Data Consistency]
L --> M[Database Ready]
WAL Replay Process
During WAL replay, Oxibase:
- Validates checksums on each WAL entry for corruption detection
- Applies operations in strict commit order using LSN sequencing
- Handles DDL operations by reconstructing schema changes
- Rebuilds indexes from scratch after all data is loaded (faster than incremental updates)
- Validates constraints to ensure data integrity
Index Rebuilding
After WAL replay, indexes are rebuilt efficiently:
- Single-pass rebuild: All indexes populated in one table scan
- Batch operations: Reduces lock contention during rebuild
- Parallel building: Multiple indexes built concurrently where possible
- Memory optimization: Uses arena-based storage to minimize allocations
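Conceptually, the single-pass rebuild scans each table once and feeds every row to all of that table's indexes. The types below are hypothetical and only illustrate the shape of the work:

```rust
use std::collections::BTreeMap;

// Hypothetical in-memory representations.
type Row = Vec<String>;

struct Index {
    key_column: usize,
    map: BTreeMap<String, Vec<usize>>, // key -> row ids
}

/// Populate every index of a table in a single scan over its rows.
fn rebuild_indexes(rows: &[Row], indexes: &mut [Index]) {
    for (row_id, row) in rows.iter().enumerate() {
        for index in indexes.iter_mut() {
            let key = row[index.key_column].clone();
            index.map.entry(key).or_default().push(row_id);
        }
    }
}
```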
Recovery Performance
Recovery time depends on:
- Snapshot recency: More recent snapshots = less WAL to replay
- WAL size: Amount of uncheckpointed changes
- Index complexity: Time to rebuild indexes after replay
- Hardware: SSDs provide much faster recovery than HDDs
Error Recovery
If corruption is detected during recovery:
- Partial recovery: Recover up to last valid WAL entry
- Snapshot fallback: Use older snapshot if current is corrupted
- Logging: Detailed logs of recovery progress and any issues
- Graceful degradation: Database opens in consistent state even with some data loss
Recovery is transparent to the application.
Recovery Example
// Opening automatically triggers recovery if needed
let db = Database::open("file:///path/to/database")?;
// Database is ready with all committed data restored
let results = db.query("SELECT * FROM users")?;
Recovery Monitoring
Monitor recovery progress programmatically:
-- Check if database is in recovery mode
PRAGMA recovery_status;
-- Get recovery statistics after startup
PRAGMA recovery_stats;
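Assuming these PRAGMAs return rows like an ordinary statement, they can also be read from application code:

```rust
use oxibase::Database;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let db = Database::open("file:///path/to/database")?;

    // Assumption: status PRAGMAs can be read through query() like a SELECT.
    let _status = db.query("PRAGMA recovery_status")?;
    let _stats = db.query("PRAGMA recovery_stats")?;
    // Inspect the returned rows to decide whether recovery has finished.

    Ok(())
}
```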
Configuration Options
Connection String Parameters
file:///path/to/database?sync_mode=2&snapshot_interval=60&keep_snapshots=3
| Parameter | Description | Default |
|---|---|---|
| sync_mode | WAL sync mode (0, 1, 2) | 1 |
| snapshot_interval | Seconds between snapshots | 300 |
| keep_snapshots | Number of snapshots to keep | 5 |
PRAGMA Commands
-- Read current settings
PRAGMA sync_mode;
PRAGMA snapshot_interval;
PRAGMA keep_snapshots;
PRAGMA wal_flush_trigger;
-- Modify settings
PRAGMA sync_mode = 2;
PRAGMA snapshot_interval = 60;
PRAGMA keep_snapshots = 3;
PRAGMA wal_flush_trigger = 10000;
-- Manually trigger snapshot
PRAGMA create_snapshot;
Performance Characteristics
WAL Performance
| Sync Mode | Durability | Performance Impact | Use Case |
|---|---|---|---|
| 0 (None) | None | ~5-10x faster | Development only |
| 1 (Normal) | Commit durability | ~2x slower | General production |
| 2 (Full) | Every operation | ~10x slower | Critical data |
Snapshot Performance
- Creation time: Proportional to database size
- Memory usage: Minimal (shared data structures)
- I/O pattern: Sequential writes, atomic rename
- Compression: Optional, reduces size by 20-50%
Recovery Performance
| Scenario | Recovery Time | Notes |
|---|---|---|
| Clean shutdown | < 1 second | No WAL replay needed |
| Recent snapshot | 1-30 seconds | Minimal WAL replay |
| No recent snapshot | 1-5 minutes | Full WAL replay |
| Large database | 5-30 minutes | Index rebuilding |
WAL Compression
When enabled, WAL compression provides:
- Space savings: 30-70% reduction in WAL size
- CPU overhead: 5-15% increase in write latency
- Recovery speed: Faster due to smaller files
- Memory usage: Minimal additional overhead
Best Practices
Durability vs Performance
Choose sync_mode based on your requirements:
| Use Case | Recommended sync_mode | Trade-offs |
|---|---|---|
| Development/Testing | 0 (None) | Fastest, no durability |
| General Use | 1 (Normal) | Balanced performance/durability |
| Financial/Critical Data | 2 (Full) | Maximum durability, slower |
Advanced Configuration
WAL Tuning
-- For high-write workloads
PRAGMA wal_flush_trigger = 10000; -- Flush every 10k operations
PRAGMA sync_mode = 1; -- Normal sync
-- For low-write, high-durability
PRAGMA wal_flush_trigger = 1000; -- Flush more frequently
PRAGMA sync_mode = 2; -- Full sync
Snapshot Strategy
-- For high-availability systems
PRAGMA snapshot_interval = 60; -- Every minute
PRAGMA keep_snapshots = 10; -- Keep more snapshots
-- For cost-optimized systems
PRAGMA snapshot_interval = 3600; -- Every hour
PRAGMA keep_snapshots = 3; -- Fewer snapshots
Snapshot Frequency
- Frequent snapshots (low interval): Faster recovery, more disk I/O
- Infrequent snapshots (high interval): Slower recovery, less disk I/O
For databases with high write rates, consider shorter intervals.
Disk Space Management
Monitor disk usage:
- WAL files grow until the next snapshot
- Old snapshots are retained based on `keep_snapshots`
- Plan for peak WAL size between snapshots
Disk Space Formula:
Peak WAL Size = Write Rate × Snapshot Interval
Total Snapshot Storage = keep_snapshots × Average Snapshot Size
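For example, at a sustained write rate of 1 MB/s with the default `snapshot_interval` of 300 seconds, the WAL can reach roughly 300 MB between snapshots; with `keep_snapshots = 5` and snapshots averaging 2 GB each, snapshot storage totals about 10 GB.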
Backup Strategy
For backups:
- Create a manual snapshot: `PRAGMA create_snapshot;`
- Copy the entire database directory while the database is idle
- For hot backups, use filesystem snapshots (ZFS, LVM)
Monitoring and Maintenance
Health Checks
-- WAL statistics
PRAGMA wal_stats;
-- Snapshot information
PRAGMA snapshot_info;
-- Recovery status
PRAGMA recovery_status;
Maintenance Tasks
-- Force WAL checkpoint (creates snapshot)
PRAGMA create_snapshot;
-- Clean up old snapshots manually
PRAGMA cleanup_snapshots;
-- Validate database integrity
PRAGMA integrity_check;
Directory Structure
A persistent database creates this directory structure:
/path/to/database/
    db.lock                    # Lock file enforcing single-writer access
    wal/
        wal_NNNNNN.log         # WAL segment files
    snapshots/
        snapshot_TIMESTAMP.bin # Snapshot files
Error Handling
Corrupt WAL
If WAL corruption is detected during recovery:
- Oxibase attempts to recover up to the last valid entry
- Corrupted entries at the end are discarded
- A warning is logged
Disk Full
If disk becomes full:
- WAL writes will fail
- Transactions will be rolled back
- Free disk space before continuing
Lock Contention
Only one process can open a database directory:
- A lock file (`db.lock`) prevents concurrent access (see the sketch below)
- If a previous process crashed, the stale lock is automatically released on open
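A single-writer lock of this kind is conventionally taken by creating the lock file exclusively. The sketch below illustrates the idea and is not Oxibase's actual locking code:

```rust
use std::fs::{File, OpenOptions};
use std::io::ErrorKind;

/// Try to take exclusive ownership of a database directory by creating
/// db.lock; fails if another process appears to hold it already.
fn acquire_lock(dir: &str) -> std::io::Result<File> {
    match OpenOptions::new()
        .write(true)
        .create_new(true) // error if the file already exists
        .open(format!("{dir}/db.lock"))
    {
        Ok(file) => Ok(file),
        Err(e) if e.kind() == ErrorKind::AlreadyExists => {
            // A previous owner may have crashed; the real implementation
            // detects the stale lock and reclaims it on open.
            Err(e)
        }
        Err(e) => Err(e),
    }
}
```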
Example: Complete Configuration
use oxibase::Database;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Open with custom persistence settings
let db = Database::open(
"file:///var/lib/myapp/data?sync_mode=2&snapshot_interval=120"
)?;
// Fine-tune at runtime
db.execute("PRAGMA wal_flush_trigger = 5000")?;
db.execute("PRAGMA keep_snapshots = 7")?;
// Your application logic...
db.execute("CREATE TABLE events (id INTEGER PRIMARY KEY AUTO_INCREMENT, data JSON)")?;
// Force a snapshot before maintenance
db.execute("PRAGMA create_snapshot")?;
Ok(())
}