Persistence

Oxibase provides durable storage through a combination of Write-Ahead Logging (WAL) and periodic snapshots. This architecture ensures data durability while maintaining high performance.

Overview

Oxibase’s persistence layer consists of two main components:

  1. Write-Ahead Log (WAL): Records all changes before they’re applied to memory
  2. Snapshots: Periodic full copies of the database state

This dual approach provides:

  • Durability: Changes are persisted before acknowledgment
  • Fast Recovery: Snapshots reduce recovery time
  • Crash Safety: WAL ensures no committed transactions are lost

Enabling Persistence

To enable persistence, use a file:// connection string:

use oxibase::Database;

// In-memory only (no persistence)
let db = Database::open("memory://")?;

// With disk persistence
let db = Database::open("file:///path/to/database")?;

Command line:

# In-memory
oxibase

# With persistence
oxibase --db "file:///path/to/database"

Write-Ahead Log (WAL)

How WAL Works

  1. When a transaction commits, changes are first written to the WAL file
  2. The WAL is synced to disk (based on sync_mode)
  3. Changes are then applied to the in-memory structures
  4. Transaction is acknowledged to the client

This sequence ensures that committed transactions survive crashes.
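
A minimal sketch of this ordering, using simplified stand-in types (WalManager and Table here are illustrative, not Oxibase's actual internals):

use std::fs::File;
use std::io::{Result, Write};

// Simplified stand-ins for illustration; not Oxibase's real types.
struct WalManager { file: File }
struct Table { rows: Vec<Vec<u8>> }

impl WalManager {
    fn append(&mut self, change: &[u8]) -> Result<()> { self.file.write_all(change) }
    fn sync(&mut self) -> Result<()> { self.file.sync_data() }
}

// Commit ordering: log first, sync, then mutate memory, then acknowledge.
fn commit(wal: &mut WalManager, table: &mut Table, change: Vec<u8>) -> Result<()> {
    wal.append(&change)?;    // 1. write the change to the WAL file
    wal.sync()?;             // 2. sync to disk (behaviour depends on sync_mode)
    table.rows.push(change); // 3. apply to the in-memory structures
    Ok(())                   // 4. acknowledge the commit to the client
}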

WAL Manager Responsibilities

The WAL manager handles sequential writes with durability guarantees:

  • Sequential writes with CRC32 checksums for corruption detection
  • Two-phase recovery: Phase 1 identifies committed transactions, Phase 2 applies their changes
  • Optional compression for reduced I/O when enabled
  • Atomic writes ensuring either complete entries or none

WAL Entry Structure:

[WAL Header]
Magic: 0x57414C00
Version: 1
Checksum: CRC32

[Entry 1]
Timestamp: i64
TxnId: i64
OpType: Insert/Update/Delete
TableId: i64
RowData: [serialized row]
Checksum: CRC32

[Entry 2]
...
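
As a sketch, an entry with this layout might be defined and checksummed as follows (using the crc32fast crate; the field encoding is illustrative, not Oxibase's exact on-disk format):

use crc32fast::Hasher;

// Illustrative entry mirroring the layout above; not the exact on-disk encoding.
struct WalEntry {
    timestamp: i64,
    txn_id: i64,
    op_type: u8,       // e.g. 0 = Insert, 1 = Update, 2 = Delete
    table_id: i64,
    row_data: Vec<u8>, // serialized row
}

impl WalEntry {
    // Serialize the fields, then append a CRC32 over the encoded bytes so
    // corruption can be detected during recovery.
    fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::new();
        buf.extend_from_slice(&self.timestamp.to_le_bytes());
        buf.extend_from_slice(&self.txn_id.to_le_bytes());
        buf.push(self.op_type);
        buf.extend_from_slice(&self.table_id.to_le_bytes());
        buf.extend_from_slice(&(self.row_data.len() as u32).to_le_bytes());
        buf.extend_from_slice(&self.row_data);

        let mut hasher = Hasher::new();
        hasher.update(&buf);
        buf.extend_from_slice(&hasher.finalize().to_le_bytes());
        buf
    }
}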

WAL Operations

The WAL records these operations:

  • INSERT: New row insertions with full row data
  • UPDATE: Row modifications with old→new value pairs
  • DELETE: Row deletions with deletion markers
  • CREATE TABLE: Table creation with schema metadata (DDL)
  • DROP TABLE: Table deletion with cleanup operations (DDL)
  • CREATE INDEX: Index creation with index specifications (DDL)
  • DROP INDEX: Index deletion operations (DDL)
  • ALTER TABLE: Schema changes with transformation rules
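
These operations can be pictured as a single tagged enum written to the log; the variants below follow the list above, but the payload types are placeholders rather than Oxibase's real definitions:

// Illustrative only; payloads stand in for the real row/schema encodings.
enum WalOperation {
    Insert      { table_id: i64, row: Vec<u8> },
    Update      { table_id: i64, old_row: Vec<u8>, new_row: Vec<u8> },
    Delete      { table_id: i64, row_id: i64 },
    CreateTable { schema: Vec<u8> },                    // DDL: schema metadata
    DropTable   { table_id: i64 },                      // DDL
    CreateIndex { table_id: i64, spec: Vec<u8> },       // DDL: index specification
    DropIndex   { index_id: i64 },                      // DDL
    AlterTable  { table_id: i64, transform: Vec<u8> },  // transformation rules
}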

WAL Segment Management

WAL files are managed in segments for efficient rotation:

  • Active segment: Currently being written to
  • Archived segments: Completed segments waiting for snapshot
  • Cleanup: Old segments deleted after successful snapshots
  • Recovery: Segments replayed in order during startup
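
A sketch of size-based rotation under the wal_NNNNNN.log naming scheme shown below; the threshold and policy are assumptions for illustration:

use std::fs::File;
use std::io::Result;
use std::path::Path;

const SEGMENT_MAX_BYTES: u64 = 64 * 1024 * 1024; // assumed rotation threshold

// Start a new active segment once the current one grows past the threshold.
// The old segment becomes an archived segment until the next snapshot.
fn maybe_rotate(wal_dir: &Path, active: &File, seq: &mut u64) -> Result<Option<File>> {
    if active.metadata()?.len() < SEGMENT_MAX_BYTES {
        return Ok(None); // keep appending to the current segment
    }
    *seq += 1;
    let path = wal_dir.join(format!("wal_{:06}.log", *seq));
    Ok(Some(File::create(path)?))
}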

WAL Configuration

Configure WAL behavior using PRAGMA:

-- Sync mode: 0=None, 1=Normal (default), 2=Full
PRAGMA sync_mode = 1;

-- Number of operations before automatic WAL flush
PRAGMA wal_flush_trigger = 32768;

Sync Mode   Value   Behavior
None        0       No sync (fastest, but data may be lost on crash)
Normal      1       Sync on commit (balanced performance and durability)
Full        2       Sync every operation (slowest, maximum durability)

WAL Files

WAL files are stored in the database directory:

/path/to/database/
  wal/
    wal_000001.log
    wal_000002.log
    ...

Old WAL files are automatically cleaned up after successful snapshots.

Snapshots

How Snapshots Work

Snapshots capture the complete database state at a point in time:

  1. All table data and schema
  2. All index definitions
  3. Current transaction state

After a snapshot is created, older WAL entries can be safely deleted.

Snapshot System Architecture

The snapshot system uses binary format with atomic writes:

  • Binary format with magic bytes (0x50414E53 = “SNAP”)
  • Atomic 3-phase writes: temp file → sync → rename for crash safety
  • Tracks source LSN for incremental recovery
  • Compressed storage to reduce disk usage

Snapshot File Structure:

[Snapshot Header]
Magic: 0x50414E53
Version: 1
Timestamp: i64
LSN: i64 (WAL position)
TableCount: u32
Checksum: CRC32

[Table 1 Schema]
TableId: i64
Name: String
Columns: [Column definitions]
Indexes: [Index definitions]

[Table 1 Data]
RowCount: u64
Rows: [serialized rows...]

[Table 2 Schema]
...

[Footer]
TotalSize: u64
Checksum: CRC32

3-Phase Atomic Write Process

Snapshots use atomic writes to prevent corruption:

  1. Phase 1 - Write: Create temporary file with complete snapshot data
  2. Phase 2 - Sync: Force all writes to disk with fsync()
  3. Phase 3 - Rename: Atomically move temp file to final location

This ensures that either a complete, valid snapshot exists or none at all.
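
A minimal sketch of this pattern using only the standard library (paths and file names are illustrative):

use std::fs::{self, File};
use std::io::{Result, Write};
use std::path::Path;

// Write → sync → rename: a reader never sees a partially written snapshot.
fn write_snapshot_atomically(dir: &Path, bytes: &[u8]) -> Result<()> {
    let tmp = dir.join("snapshot.tmp");            // illustrative name
    let dst = dir.join("snapshot_1704067200.bin"); // illustrative name

    // Phase 1 - Write: complete snapshot goes into a temporary file.
    let mut f = File::create(&tmp)?;
    f.write_all(bytes)?;

    // Phase 2 - Sync: force the data to stable storage.
    f.sync_all()?;

    // Phase 3 - Rename: atomically move the temp file into place.
    fs::rename(&tmp, &dst)?;

    // Syncing the directory also makes the rename durable (Unix filesystems).
    File::open(dir)?.sync_all()?;
    Ok(())
}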

Snapshot Configuration

-- Interval between automatic snapshots (in seconds, default: 300)
PRAGMA snapshot_interval = 300;

-- Number of snapshots to retain (default: 5)
PRAGMA keep_snapshots = 5;

-- Manually create a snapshot
PRAGMA create_snapshot;

Snapshot Files

Snapshots are stored as binary files:

/path/to/database/
  snapshots/
    snapshot_1704067200.bin
    snapshot_1704067500.bin
    ...

The filename includes the Unix timestamp of creation for easy sorting.

Snapshot Retention Policy

  • Automatic cleanup: Old snapshots deleted when keep_snapshots exceeded
  • Time-based retention: Can be configured for compliance requirements
  • Manual retention: Important snapshots can be preserved indefinitely

Recovery Process

When opening a database, Oxibase performs recovery automatically:

2-Phase Recovery Algorithm

Phase 1 - Analysis:

  1. Read snapshot metadata to get checkpoint LSN (Log Sequence Number)
  2. Scan WAL entries to identify committed vs. uncommitted transactions
  3. Build transaction dependency graph for proper ordering

Phase 2 - Redo:

  1. Load table snapshots for fastest recovery path
  2. Replay WAL entries from checkpoint LSN forward
  3. Apply only entries from committed transactions
  4. Rebuild indexes in a single pass after replay completes
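
A compressed sketch of the two phases (the dependency-graph ordering is omitted, and the Entry shape is hypothetical):

use std::collections::HashSet;

// Hypothetical decoded WAL entry; real entries come from the WAL decoder.
struct Entry { txn_id: i64, lsn: i64, is_commit_marker: bool }

// Phase 1 - Analysis: only transactions with a commit marker count as committed.
fn committed_txns(entries: &[Entry]) -> HashSet<i64> {
    entries.iter().filter(|e| e.is_commit_marker).map(|e| e.txn_id).collect()
}

// Phase 2 - Redo: replay committed entries in LSN order, starting after the
// snapshot's checkpoint LSN; entries from uncommitted transactions are skipped.
fn redo(mut entries: Vec<Entry>, checkpoint_lsn: i64, apply: impl Fn(&Entry)) {
    let committed = committed_txns(&entries);
    entries.sort_by_key(|e| e.lsn);
    for e in &entries {
        if e.lsn > checkpoint_lsn && committed.contains(&e.txn_id) && !e.is_commit_marker {
            apply(e);
        }
    }
}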

Detailed Recovery Flow

graph TD
    A[Database Open] --> B{Find Latest Snapshot?}
    B -->|Yes| C[Load Snapshot Metadata]
    B -->|No| D[Start from Empty State]

    C --> E[Get Checkpoint LSN]
    D --> E

    E --> F[Phase 1: Analyze WAL]
    F --> G[Identify Committed Transactions]
    G --> H[Build Transaction Graph]

    H --> I[Phase 2: Replay WAL]
    I --> J[Apply Committed Entries]
    J --> K[Rebuild Indexes]

    K --> L[Verify Data Consistency]
    L --> M[Database Ready]

WAL Replay Process

During WAL replay, Oxibase:

  1. Validates checksums on each WAL entry for corruption detection
  2. Applies operations in strict commit order using LSN sequencing
  3. Handles DDL operations by reconstructing schema changes
  4. Rebuilds indexes from scratch after all data is loaded (faster than incremental updates)
  5. Validates constraints to ensure data integrity
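
For step 1, checksum validation amounts to recomputing the CRC over an entry's payload and comparing it with the stored value; a sketch assuming the crc32fast crate and a trailing 4-byte little-endian checksum:

use crc32fast::Hasher;

// Returns true if the entry's trailing CRC32 matches its payload.
// Replay stops at the first entry that fails this check.
fn entry_is_valid(encoded: &[u8]) -> bool {
    if encoded.len() < 4 {
        return false;
    }
    let (payload, trailer) = encoded.split_at(encoded.len() - 4);
    let stored = u32::from_le_bytes(trailer.try_into().expect("4-byte checksum"));
    let mut hasher = Hasher::new();
    hasher.update(payload);
    hasher.finalize() == stored
}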

Index Rebuilding

After WAL replay, indexes are rebuilt efficiently:

  • Single-pass rebuild: All indexes populated in one table scan
  • Batch operations: Reduces lock contention during rebuild
  • Parallel building: Multiple indexes built concurrently where possible
  • Memory optimization: Uses arena-based storage to minimize allocations
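
A sketch of the single-pass idea with hypothetical row and index types: one scan over the table feeds every index, instead of one scan per index:

use std::collections::BTreeMap;

// Hypothetical shapes; Oxibase's real row and index types will differ.
type Row = Vec<i64>;
struct Index { column: usize, map: BTreeMap<i64, Vec<usize>> }

fn rebuild_indexes(rows: &[Row], indexes: &mut [Index]) {
    for (row_id, row) in rows.iter().enumerate() {
        // Every index is updated from the same pass over the table.
        for idx in indexes.iter_mut() {
            idx.map.entry(row[idx.column]).or_default().push(row_id);
        }
    }
}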

Recovery Performance

Recovery time depends on:

  • Snapshot recency: More recent snapshots = less WAL to replay
  • WAL size: Amount of uncheckpointed changes
  • Index complexity: Time to rebuild indexes after replay
  • Hardware: SSDs provide much faster recovery than HDDs

Error Recovery

If corruption is detected during recovery:

  1. Partial recovery: Recover up to last valid WAL entry
  2. Snapshot fallback: Use older snapshot if current is corrupted
  3. Logging: Detailed logs of recovery progress and any issues
  4. Graceful degradation: Database opens in consistent state even with some data loss

Recovery is transparent to the application.

Recovery Example

// Opening automatically triggers recovery if needed
let db = Database::open("file:///path/to/database")?;

// Database is ready with all committed data restored
let results = db.query("SELECT * FROM users")?;

Recovery Monitoring

Monitor recovery progress programmatically:

-- Check if database is in recovery mode
PRAGMA recovery_status;

-- Get recovery statistics after startup
PRAGMA recovery_stats;

Configuration Options

Connection String Parameters

file:///path/to/database?sync_mode=2&snapshot_interval=60&keep_snapshots=3

Parameter           Description                   Default
sync_mode           WAL sync mode (0, 1, 2)       1
snapshot_interval   Seconds between snapshots     300
keep_snapshots      Number of snapshots to keep   5

PRAGMA Commands

-- Read current settings
PRAGMA sync_mode;
PRAGMA snapshot_interval;
PRAGMA keep_snapshots;
PRAGMA wal_flush_trigger;

-- Modify settings
PRAGMA sync_mode = 2;
PRAGMA snapshot_interval = 60;
PRAGMA keep_snapshots = 3;
PRAGMA wal_flush_trigger = 10000;

-- Manually trigger snapshot
PRAGMA create_snapshot;

Performance Characteristics

WAL Performance

Sync Mode    Durability          Performance Impact   Use Case
0 (None)     None                ~5-10x faster        Development only
1 (Normal)   Commit durability   ~2x slower           General production
2 (Full)     Every operation     ~10x slower          Critical data

Snapshot Performance

  • Creation time: Proportional to database size
  • Memory usage: Minimal (shared data structures)
  • I/O pattern: Sequential writes, atomic rename
  • Compression: Optional, reduces size by 20-50%

Recovery Performance

Scenario             Recovery Time   Notes
Clean shutdown       < 1 second      No WAL replay needed
Recent snapshot      1-30 seconds    Minimal WAL replay
No recent snapshot   1-5 minutes     Full WAL replay
Large database       5-30 minutes    Index rebuilding

WAL Compression

When enabled, WAL compression provides:

  • Space savings: 30-70% reduction in WAL size
  • CPU overhead: 5-15% increase in write latency
  • Recovery speed: Faster due to smaller files
  • Memory usage: Minimal additional overhead
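
One way such compression could be wired in, shown with the flate2 crate purely as an illustration (Oxibase's actual codec is not specified here): compress each encoded entry before appending it to the segment.

use std::io::Write;
use flate2::{write::ZlibEncoder, Compression};

// Compress an encoded WAL entry before it is appended to the active segment.
fn compress_entry(encoded: &[u8]) -> std::io::Result<Vec<u8>> {
    let mut enc = ZlibEncoder::new(Vec::new(), Compression::fast());
    enc.write_all(encoded)?;
    enc.finish() // returns the compressed bytes
}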

Best Practices

Durability vs Performance

Choose sync_mode based on your requirements:

Use Case                  Recommended sync_mode   Trade-offs
Development/Testing       0 (None)                Fastest, no durability
General Use               1 (Normal)              Balanced performance/durability
Financial/Critical Data   2 (Full)                Maximum durability, slower

Advanced Configuration

WAL Tuning

-- For high-write workloads
PRAGMA wal_flush_trigger = 10000;  -- Flush every 10k operations
PRAGMA sync_mode = 1;               -- Normal sync

-- For low-write, high-durability
PRAGMA wal_flush_trigger = 1000;   -- Flush more frequently
PRAGMA sync_mode = 2;               -- Full sync

Snapshot Strategy

-- For high-availability systems
PRAGMA snapshot_interval = 60;     -- Every minute
PRAGMA keep_snapshots = 10;        -- Keep more snapshots

-- For cost-optimized systems
PRAGMA snapshot_interval = 3600;   -- Every hour
PRAGMA keep_snapshots = 3;         -- Fewer snapshots

Snapshot Frequency

  • Frequent snapshots (low interval): Faster recovery, more disk I/O
  • Infrequent snapshots (high interval): Slower recovery, less disk I/O

For databases with high write rates, consider shorter intervals.

Disk Space Management

Monitor disk usage:

  • WAL files grow until the next snapshot
  • Old snapshots are retained based on keep_snapshots
  • Plan for peak WAL size between snapshots

Disk Space Formula:

Peak WAL Size = Write Rate × Snapshot Interval
Total Snapshot Space = keep_snapshots × Average Snapshot Size
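
As an illustrative example, a workload that generates roughly 2 MB/s of WAL with the default 300-second snapshot interval needs about 600 MB of headroom for the WAL alone, on top of keep_snapshots times the typical snapshot size for retained snapshots.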

Backup Strategy

For backups:

  1. Create a manual snapshot: PRAGMA create_snapshot;
  2. Copy the entire database directory while the database is idle
  3. For hot backups, use filesystem snapshots (ZFS, LVM)

Monitoring and Maintenance

Health Checks

-- WAL statistics
PRAGMA wal_stats;

-- Snapshot information
PRAGMA snapshot_info;

-- Recovery status
PRAGMA recovery_status;

Maintenance Tasks

-- Force WAL checkpoint (creates snapshot)
PRAGMA create_snapshot;

-- Clean up old snapshots manually
PRAGMA cleanup_snapshots;

-- Validate database integrity
PRAGMA integrity_check;

Directory Structure

A persistent database creates this directory structure:

/path/to/database/
  db.lock              # Lock file for single-writer
  wal/
    wal_NNNNNN.log     # WAL segment files
  snapshots/
    snapshot_TIMESTAMP.bin  # Snapshot files

Error Handling

Corrupt WAL

If WAL corruption is detected during recovery:

  • Oxibase attempts to recover up to the last valid entry
  • Corrupted entries at the end are discarded
  • A warning is logged

Disk Full

If disk becomes full:

  • WAL writes will fail
  • Transactions will be rolled back
  • Free disk space before continuing

Lock Contention

Only one process can open a database directory:

  • A lock file (db.lock) prevents concurrent access
  • If a previous process crashed, the lock is automatically released on open

Example: Complete Configuration

use oxibase::Database;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open with custom persistence settings
    let db = Database::open(
        "file:///var/lib/myapp/data?sync_mode=2&snapshot_interval=120"
    )?;

    // Fine-tune at runtime
    db.execute("PRAGMA wal_flush_trigger = 5000")?;
    db.execute("PRAGMA keep_snapshots = 7")?;

    // Your application logic...
    db.execute("CREATE TABLE events (id INTEGER PRIMARY KEY AUTO_INCREMENT, data JSON)")?;

    // Force a snapshot before maintenance
    db.execute("PRAGMA create_snapshot")?;

    Ok(())
}

Copyright © 2025-2026 Oxibase Contributors. Gabriel Maeztu.