Making SQLite Fly as a Desktop Video Database
SQLite powers FrameQuery's video library, search indexes, and biometric data. Here is how we tuned it from a single mutex bottleneck to a concurrent, trigger-maintained, full-text-searchable backend.
FrameQuery stores everything about your video library in a single SQLite database file. Video metadata, source folders, processing status, tags, projects, face data, perceptual hashes. One file, portable and easy to back up. But as the schema grew to 30+ tables and libraries scaled to tens of thousands of clips, we had to move past the default SQLite configuration to keep the app responsive.
The single-mutex problem
Our first implementation was straightforward. One connection wrapped in a Rust Arc<Mutex<Connection>>, shared across all threads.
This works. It is also slow. SQLite in WAL (Write-Ahead Logging) mode supports concurrent readers alongside a single writer. The mutex eliminated that advantage entirely. Every read waited for every write. Every write waited for every read. The UI thread blocked while a background scan inserted new videos. Searches stalled while the dashboard loaded stats.
WAL mode is one of SQLite's best features for desktop applications, and we were throwing it away with a single lock.
Reader/writer split
The fix was separating read and write connections:
/// Singleton DB connection for writes
static DB: OnceLock<Arc<Mutex<Connection>>> = OnceLock::new();
/// Pool of read-only connections for concurrent reads
static READ_POOL: OnceLock<ReadConnectionPool> = OnceLock::new();
The write connection stays behind a mutex because SQLite only supports one writer at a time. That constraint is fundamental to SQLite and not something we can work around. But reads are a different story.
ReadConnectionPool maintains a Vec<Connection> with a configurable maximum size. When a thread needs to read, it calls acquire(), which pops an existing connection from the pool or creates a new one if the pool is empty. Each read connection is opened with the SQLITE_OPEN_READ_ONLY flag, which tells SQLite this connection will never write.
A ReadPoolGuard RAII wrapper ensures the connection returns to the pool when the guard goes out of scope. No manual cleanup, no forgotten connections.
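For illustration, here is a minimal sketch of the same pattern using Python's stdlib sqlite3 module (the shipping code is Rust; the class and method names mirror the ones above, everything else is hypothetical). A context manager stands in for the ReadPoolGuard RAII wrapper, and opening with mode=ro plays the role of SQLITE_OPEN_READ_ONLY:

```python
import os
import sqlite3
import tempfile
import threading
from contextlib import contextmanager

class ReadConnectionPool:
    """Pool of read-only SQLite connections (sketch of the Rust ReadConnectionPool)."""

    def __init__(self, db_path, max_size=4):
        self.db_path = db_path
        self.max_size = max_size
        self._lock = threading.Lock()
        self._idle = []  # reusable connections

    def _open(self):
        # mode=ro mirrors SQLITE_OPEN_READ_ONLY: this connection can never write.
        return sqlite3.connect(f"file:{self.db_path}?mode=ro", uri=True,
                               check_same_thread=False)

    @contextmanager
    def acquire(self):
        # Pop an idle connection, or open a new one if the pool is empty.
        with self._lock:
            conn = self._idle.pop() if self._idle else self._open()
        try:
            yield conn
        finally:
            # This finally block plays the role of the ReadPoolGuard: the
            # connection always returns to the pool, capped at max_size.
            with self._lock:
                if len(self._idle) < self.max_size:
                    self._idle.append(conn)
                else:
                    conn.close()

# Demo: one writer connection, reads through the pool.
path = os.path.join(tempfile.mkdtemp(), "library.db")
writer = sqlite3.connect(path)
writer.execute("CREATE TABLE videos(id INTEGER PRIMARY KEY, title TEXT)")
writer.execute("INSERT INTO videos(title) VALUES ('clip.mp4')")
writer.commit()

pool = ReadConnectionPool(path)
with pool.acquire() as conn:
    count = conn.execute("SELECT COUNT(*) FROM videos").fetchone()[0]
```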
The result: the UI can query video metadata, run searches, and load dashboard stats while a background thread inserts newly discovered clips. Reads no longer serialize behind writes.
PRAGMA tuning
SQLite's defaults are conservative. They prioritize safety and compatibility over performance. For a desktop application where the database file lives on a local SSD and a power loss means at most losing the last few seconds of work, we can be more aggressive.
PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;
PRAGMA cache_size=-131072; -- 128 MB
PRAGMA temp_store=MEMORY;
PRAGMA mmap_size=1073741824; -- 1 GB
PRAGMA foreign_keys=ON;
PRAGMA busy_timeout=5000; -- 5 seconds
Each of these is a deliberate choice.
journal_mode=WAL enables the concurrent reader/writer model. Without WAL, SQLite uses a rollback journal that locks the entire database during writes. WAL is strictly better for our workload.
synchronous=NORMAL is the key performance setting. The default (FULL) calls fsync after every transaction commit, which is expensive. In WAL mode, NORMAL is still safe against data corruption from a crash. You can lose the last transaction if the OS crashes mid-write, but the database will not become corrupted. For a desktop app, that trade-off is easy to accept.
cache_size=-131072 gives SQLite 128 MB of page cache. The negative number means kibibytes, not pages. This keeps hot pages (indexes, frequently accessed tables) in memory. When a user browses their library, the same pages get hit repeatedly. Keeping them cached avoids redundant disk reads.
mmap_size=1073741824 enables memory-mapped I/O for up to 1 GB of the database file. Instead of read() syscalls, SQLite accesses pages through the virtual memory system. This reduces context switches and lets the OS page cache work more efficiently. For read-heavy workloads on a local SSD, the improvement is measurable.
busy_timeout=5000 tells SQLite to retry for up to 5 seconds when it encounters a lock conflict instead of immediately returning SQLITE_BUSY. This smooths out brief contention between the write connection and read pool without requiring application-level retry logic.
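Applied once at open time, the whole block is a single script. A sketch using Python's stdlib sqlite3 (open_tuned is a hypothetical helper name; the PRAGMAs are the ones above):

```python
import os
import sqlite3
import tempfile

PRAGMAS = """
PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;
PRAGMA cache_size=-131072;    -- 128 MB of page cache
PRAGMA temp_store=MEMORY;
PRAGMA mmap_size=1073741824;  -- 1 GB of memory-mapped I/O
PRAGMA foreign_keys=ON;
PRAGMA busy_timeout=5000;     -- retry locks for up to 5 seconds
"""

def open_tuned(db_path):
    # Open the database and apply the tuning PRAGMAs in one script.
    conn = sqlite3.connect(db_path)
    conn.executescript(PRAGMAS)
    return conn

# WAL requires a real file; in-memory databases ignore journal_mode=WAL.
db_path = os.path.join(tempfile.mkdtemp(), "library.db")
conn = open_tuned(db_path)
```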
Prepared statement caching
Every query in FrameQuery uses prepare_cached() instead of prepare():
conn.prepare_cached(&sql).map_err(|e| ...)?
This is a small change with a meaningful impact. SQLite parses SQL into a bytecode program before executing it. Parsing is not free, especially for complex queries with joins and subqueries. prepare_cached() keeps the compiled bytecode in a per-connection cache keyed by the SQL string. The second time the same query runs, parsing is skipped entirely.
We use this pattern across roughly 100 query sites in the codebase, on both read and write paths. For queries that run thousands of times per session (fetching thumbnails, checking processing status, resolving tags), the cumulative savings are significant.
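The same idea exists in other bindings. Python's stdlib sqlite3, for example, keeps a per-connection LRU cache of compiled statements keyed by the SQL text, with the size fixed at connect time; a sketch of the equivalent pattern:

```python
import sqlite3

# Each connection caches compiled statements keyed by SQL text, so re-running
# the same query skips the parse/compile step -- the same idea as rusqlite's
# prepare_cached(). The default cache holds 128 statements; we raise it here.
conn = sqlite3.connect(":memory:", cached_statements=256)
conn.execute("CREATE TABLE videos(id INTEGER PRIMARY KEY, status TEXT)")

# The first execute compiles the statement; the remaining 999 iterations
# reuse the cached bytecode program because the SQL text is identical.
for _ in range(1000):
    conn.execute("INSERT INTO videos(status) VALUES (?)", ("pending",))

total = conn.execute("SELECT COUNT(*) FROM videos").fetchone()[0]
```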
Trigger-maintained dashboard stats
The FrameQuery dashboard shows summary statistics: total videos, total duration, total file size. The naive implementation runs aggregate queries:
SELECT COUNT(*) FROM videos;
SELECT SUM(duration) FROM videos;
SELECT SUM(size) FROM videos;
These are full table scans. For a library with 50,000 videos they are fast enough individually, but they add latency to every dashboard load, tie up a pooled read connection for the duration of each scan, and churn hot pages out of the cache.
Starting in schema version 32, we pre-compute these values and maintain them with triggers:
CREATE TABLE IF NOT EXISTS stats_cache (
key TEXT PRIMARY KEY,
value_int INTEGER NOT NULL DEFAULT 0,
value_real REAL NOT NULL DEFAULT 0.0
);
Three triggers keep the cache in sync:
- trg_stats_video_insert increments the counts and sums when a video is inserted.
- trg_stats_video_delete decrements them when a video is deleted.
- trg_stats_video_update adjusts duration and size when those columns change.
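The shipped triggers live in the schema migration; here is a minimal sketch of the insert/delete pair (column names assumed, the update trigger omitted for brevity), runnable against Python's stdlib sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE videos(id INTEGER PRIMARY KEY, duration REAL, size INTEGER);
CREATE TABLE IF NOT EXISTS stats_cache (
    key TEXT PRIMARY KEY,
    value_int INTEGER NOT NULL DEFAULT 0,
    value_real REAL NOT NULL DEFAULT 0.0
);
INSERT INTO stats_cache(key)
    VALUES ('total_videos'), ('total_duration'), ('total_size');

-- Sketch of trg_stats_video_insert: bump the count and running sums.
CREATE TRIGGER trg_stats_video_insert AFTER INSERT ON videos BEGIN
    UPDATE stats_cache SET value_int  = value_int  + 1            WHERE key = 'total_videos';
    UPDATE stats_cache SET value_real = value_real + NEW.duration WHERE key = 'total_duration';
    UPDATE stats_cache SET value_int  = value_int  + NEW.size     WHERE key = 'total_size';
END;

-- Sketch of trg_stats_video_delete: the mirror image.
CREATE TRIGGER trg_stats_video_delete AFTER DELETE ON videos BEGIN
    UPDATE stats_cache SET value_int  = value_int  - 1            WHERE key = 'total_videos';
    UPDATE stats_cache SET value_real = value_real - OLD.duration WHERE key = 'total_duration';
    UPDATE stats_cache SET value_int  = value_int  - OLD.size     WHERE key = 'total_size';
END;
""")

conn.execute("INSERT INTO videos(duration, size) VALUES (10.0, 500)")
conn.execute("INSERT INTO videos(duration, size) VALUES (20.0, 700)")
conn.execute("DELETE FROM videos WHERE id = 1")
```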
Reading a stat becomes a single primary-key lookup instead of a table scan:
get_cached_stat_int(conn, "total_videos") // returns 0 if key is missing
get_cached_stat_real(conn, "total_duration")
The fallback to zero is intentional. Users upgrading from older versions (pre-v32) will not have the stats_cache table populated yet. The migration backfills the values, but returning zero gracefully during the transition is safer than crashing.
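Each helper reduces to one indexed query plus the fallback. A Python sketch of the lookup (the real helpers are Rust; the default parameter is an illustration):

```python
import sqlite3

def get_cached_stat_int(conn, key, default=0):
    # Single primary-key lookup against stats_cache; fall back to the default
    # when the row is missing (e.g. mid-upgrade from a pre-v32 schema).
    row = conn.execute(
        "SELECT value_int FROM stats_cache WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row is not None else default

def get_cached_stat_real(conn, key, default=0.0):
    row = conn.execute(
        "SELECT value_real FROM stats_cache WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row is not None else default
```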
The trade-off is complexity. Every write path that touches the videos table now has trigger overhead. In practice, the triggers add negligible cost to inserts and deletes (a few integer additions), and they eliminate repeated full table scans on reads. For a dashboard that refreshes frequently, this is clearly worth it.
Indexing strategy
FrameQuery's schema has 30+ indexes. A few patterns are worth calling out.
Compound indexes match common query patterns. When the UI filters videos by source folder and processing status, this index covers it without touching the main table:
CREATE INDEX idx_videos_source_status ON videos(source_id, status);
Partial indexes reduce index size for sparse columns. Not every video has been hashed yet, so we only index rows where the hash exists:
CREATE INDEX idx_videos_src_size_qhash
ON videos(source_id, size, quick_hash) WHERE quick_hash IS NOT NULL;
CREATE INDEX idx_videos_src_size_dhash
ON videos(source_id, size, detail_hash) WHERE detail_hash IS NOT NULL;
These indexes power duplicate detection. When a new file is discovered, we check if any existing video in the same source has the same size and hash. The partial index means we only pay storage costs for videos that have actually been hashed, and the query planner can skip un-hashed rows entirely.
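A sketch of that probe against the quick-hash index, using Python's stdlib sqlite3 (is_known_duplicate and the sample rows are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE videos(
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL,
    size INTEGER NOT NULL,
    quick_hash BLOB
);
CREATE INDEX idx_videos_src_size_qhash
    ON videos(source_id, size, quick_hash) WHERE quick_hash IS NOT NULL;
""")
conn.execute("INSERT INTO videos(source_id, size, quick_hash) VALUES (1, 1024, x'deadbeef')")
conn.execute("INSERT INTO videos(source_id, size, quick_hash) VALUES (1, 2048, NULL)")  # not hashed yet

def is_known_duplicate(conn, source_id, size, quick_hash):
    # quick_hash = ? can only be true when quick_hash IS NOT NULL, so the
    # planner can satisfy this probe from the (small) partial index alone.
    row = conn.execute(
        "SELECT 1 FROM videos WHERE source_id = ? AND size = ? AND quick_hash = ? LIMIT 1",
        (source_id, size, quick_hash),
    ).fetchone()
    return row is not None
```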
Junction table indexes cover both directions of many-to-many relationships:
CREATE INDEX idx_video_tags_map_video ON video_tags_map(video_id);
CREATE INDEX idx_video_tags_map_tag ON video_tags_map(tag_id);
This lets us efficiently answer both "what tags does this video have" and "what videos have this tag" without full table scans on the junction table.
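Both query shapes in one small runnable sketch (Python's stdlib sqlite3; the sample rows are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE video_tags_map(
    video_id INTEGER NOT NULL,
    tag_id   INTEGER NOT NULL,
    PRIMARY KEY (video_id, tag_id)
);
CREATE INDEX idx_video_tags_map_video ON video_tags_map(video_id);
CREATE INDEX idx_video_tags_map_tag   ON video_tags_map(tag_id);
INSERT INTO video_tags_map VALUES (1, 10), (1, 11), (2, 10);
""")

# "What tags does this video have?" -- served by the video_id index.
tags_for_video = [r[0] for r in conn.execute(
    "SELECT tag_id FROM video_tags_map WHERE video_id = ? ORDER BY tag_id", (1,))]

# "What videos have this tag?" -- served by the tag_id index.
videos_for_tag = [r[0] for r in conn.execute(
    "SELECT video_id FROM video_tags_map WHERE tag_id = ? ORDER BY video_id", (10,))]
```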
Full-text search with Tantivy
SQLite's built-in FTS5 is capable, but FrameQuery's search needs go beyond what it handles well. We need field-level boosting, stemming, phrase queries, range filters on numeric fields, and biometric filtering. We use Tantivy, a Rust full-text search library, as a separate index alongside SQLite.
The tokenizer pipeline handles English text:
pub const EN_STEM_TOKENIZER: &str = "en_stem";
let analyzer = TextAnalyzer::builder(SimpleTokenizer::default())
.filter(RemoveLongFilter::limit(40))
.filter(LowerCaser)
.filter(StopWordFilter::new(Language::English)?)
.filter(Stemmer::new(Language::English))
.build();
The pipeline splits text into tokens on non-alphanumeric boundaries, drops anything over 40 characters (garbage strings from OCR or corrupt metadata), lowercases, removes English stop words, and applies English stemming. This means "running" and "run" match each other, and common words like "the" and "is" do not pollute results.
The search schema (currently version 6) indexes seven text fields and six numeric fields per document. Each text field gets a different BM25 boost weight:
Filename (3.0) > match_text (2.5) > transcript (2.0) >
tags (1.8) > scene_desc (1.5) > objects (1.2) > catch-all (1.0)
The reasoning: if a user searches for "interview," a file literally named "interview.mp4" is almost certainly the best result. A transcript containing the word "interview" is a strong signal. An object detection label is weaker. Boosted queries for each field are combined with Occur::Should, so documents matching multiple fields rank higher.
The Tantivy writer uses a 50 MB heap and commits once per video. A single commit batches the video document, all its scene documents, and all transcript segments. This keeps the index consistent (you never see partial results for a video) and amortizes the commit overhead across many documents.
Perceptual and visual hashes
Schema version 31 added two binary columns to the videos table:
- thumb_hash (28 bytes): A ThumbHash encoding of the video's representative frame. This is small enough to store inline and decode instantly into a blurred placeholder image. When you scroll through a large library, FrameQuery can show placeholders before the actual thumbnails load from disk.
- perceptual_hash (32 bytes): A perceptual hash for near-duplicate detection. Two videos with a small Hamming distance between their perceptual hashes are visually similar, even if they differ in resolution, codec, or compression level.
Both are stored as BLOBs rather than hex strings. This halves the storage cost and makes Hamming distance comparisons faster since they operate on raw bytes instead of parsing hex.
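The comparison itself is just a popcount over the XOR of the two byte strings. A sketch (function name hypothetical; the distance threshold that counts as "near-duplicate" is a separate tuning choice):

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    # Number of differing bits between two equal-length raw-byte hashes:
    # XOR each byte pair, then count the set bits in the result.
    if len(a) != len(b):
        raise ValueError("hashes must be the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```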
What we would do differently
If we started over, we would implement the reader/writer split from day one. The single-mutex approach was simpler to build but created performance problems that were hard to diagnose. Connection pool issues look like slow queries, and slow queries look like missing indexes. Separating reads from writes made everything easier to reason about.
We would also add the stats_cache triggers earlier. Aggregate queries are fine when your table has 100 rows. They become a problem so gradually that you do not notice until the dashboard feels sluggish and you are not sure why.
The PRAGMA tuning was the easiest win. A few lines of SQL, applied once when the database is opened, with a large and immediate impact on throughput. If you are building a desktop application with SQLite, start with these settings.
Want a video search tool built on these foundations? Join the waitlist to try FrameQuery.