
Why Our Thumbnails Were Secretly Lossless (and How We Fixed It)

The image crate's WebP encoder ignores the quality parameter and always outputs lossless. We found out the hard way, and rebuilt the entire thumbnail pipeline while we were at it.

FrameQuery Team · 26 March 2026 · 7 min read

We had a constant defined in image_utils.rs that looked perfectly reasonable:

pub const THUMBNAIL_WEBP_QUALITY: u8 = 82;
pub const THUMBNAIL_MAX_WIDTH: u32 = 640;

Quality 82 is a solid choice for thumbnails. Visually indistinguishable from the source at 640px wide, and significantly smaller than lossless. Except our thumbnails were not quality 82. They were lossless. Every single one.

The discovery

We noticed thumbnail storage was significantly larger than our back-of-the-envelope estimates. A library of a few thousand clips was producing gigabytes of thumbnail data where we expected hundreds of megabytes. The thumbnails looked fine, they were just too big.

The culprit was the image crate (version 0.24), which is the standard Rust library for image encoding and decoding. Its default WebP encoder path is lossless-only. The quality parameter we were passing was silently ignored. No warning, no error. The API accepts the value and does nothing with it.

This is the kind of bug that survives code review because the code reads correctly. You see a quality parameter being set, you assume it works.

The fix: use the deprecated API

The image crate does have a lossy WebP encoder. It is just not the default path, and it is marked as deprecated with planned removal in a future major version. We had to use it anyway.

// Use the lossy VP8 encoder with the caller's quality setting.
// Note: lossy encoding is deprecated in the image crate (planned
// removal in a future major version). When that happens, migrate to
// ravif (AVIF) or another lossy codec.
#[expect(deprecated, reason = "image 0.24 deprecates lossy WebP; will migrate when removed")]
let encoder = image::codecs::webp::WebPEncoder::new_with_quality(
    &mut writer,
    image::codecs::webp::WebPQuality::lossy(quality),
);
encoder.encode(rgb.as_raw(), w, h, image::ColorType::Rgb8)?;

The #[expect(deprecated)] attribute suppresses the deprecation warning while documenting why we are using a deprecated API. If a future image release un-deprecates the path, the expect attribute itself produces a warning, because the lint it expected never fired. If the path is removed outright, the build fails with a hard error. Either way, we get a loud reminder to migrate.

The planned replacement is AVIF via the ravif crate. AVIF achieves 20 to 50 percent smaller file sizes than WebP at equivalent perceptual quality. We will switch when the lossy WebP path is actually removed, not before. Working code on a deprecated API beats a premature migration.

Rebuilding the pipeline

Fixing the encoding bug meant touching the thumbnail generation code. While we were in there, we rebuilt the entire pipeline. What started as a one-line fix turned into a week of work that touched resizing, placeholders, deduplication, tone mapping, and memory allocation.

SIMD-accelerated resizing

The image crate's built-in resize is functional but slow. We replaced it with fast_image_resize, which uses SIMD instructions for the heavy lifting.

pub fn fast_resize(
    img: &image::DynamicImage,
    dst_width: u32,
    dst_height: u32,
) -> Result<image::DynamicImage, Box<dyn std::error::Error + Send + Sync>> {
    use fast_image_resize as fir;

    // Work in packed RGB8; fast_image_resize operates on raw pixel buffers.
    let rgb = img.to_rgb8();
    let (src_w, src_h) = (rgb.width(), rgb.height());

    let src_image = fir::images::Image::from_vec_u8(
        src_w, src_h, rgb.into_raw(), fir::PixelType::U8x3,
    )?;
    let mut dst_image = fir::images::Image::new(dst_width, dst_height, fir::PixelType::U8x3);

    let mut resizer = fir::Resizer::new();
    resizer.resize(
        &src_image,
        &mut dst_image,
        &fir::ResizeOptions::new()
            .resize_alg(fir::ResizeAlg::Convolution(fir::FilterType::Lanczos3)),
    )?;

    let out = image::RgbImage::from_raw(dst_width, dst_height, dst_image.into_vec())
        .ok_or("resized buffer has unexpected length")?;
    Ok(image::DynamicImage::ImageRgb8(out))
}

The speedup is roughly 14x over the image crate's resize on the same hardware. The library dispatches at runtime to AVX2 on x86-64 machines, NEON on ARM, or a scalar fallback when neither is available. We use Lanczos3 filtering, which is slower than bilinear but produces noticeably sharper downscaled images.

This function is not just used for thumbnails. Face detection models expect 640x640 input, face embedding models expect 112x112, and all of these resizes now go through the same fast path.
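The resize targets themselves come from a small sizing step: cap the width at THUMBNAIL_MAX_WIDTH and scale the height to match. A sketch of that logic (the helper name is ours, not the exact function in image_utils.rs; the constant is the one shown earlier):

```rust
const THUMBNAIL_MAX_WIDTH: u32 = 640;

/// Cap width at THUMBNAIL_MAX_WIDTH, preserving aspect ratio.
/// Illustrative helper, not FrameQuery's exact code.
fn thumbnail_dims(src_w: u32, src_h: u32) -> (u32, u32) {
    if src_w <= THUMBNAIL_MAX_WIDTH {
        (src_w, src_h)
    } else {
        // Integer math in u64 to avoid overflow on large frames.
        let h = (src_h as u64 * THUMBNAIL_MAX_WIDTH as u64 / src_w as u64).max(1) as u32;
        (THUMBNAIL_MAX_WIDTH, h)
    }
}
```

A 1920x1080 frame becomes 640x360; anything already narrower than 640px passes through untouched.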

ThumbHash placeholders

When you open a large library in FrameQuery, there is a moment before the real thumbnails load from disk. Without placeholders, you see a grid of blank rectangles that pop into existence one by one. It looks broken even when it is working correctly.

We solved this with ThumbHash, a compact image placeholder format.

pub fn compute_thumb_hash(
    img: &image::DynamicImage,
) -> Result<Vec<u8>, Box<dyn std::error::Error + Send + Sync>> {
    // ThumbHash needs a small input; cap at 100px wide, keeping aspect.
    const TW: u32 = 100;
    let (w, h) = (img.width(), img.height());
    let small = if w > TW {
        let th = ((h as u64 * TW as u64) / w as u64).max(1) as u32;
        fast_resize(img, TW, th)?
    } else {
        img.clone()
    };
    let rgba = small.to_rgba8();
    let (sw, sh) = (rgba.width() as usize, rgba.height() as usize);
    Ok(thumbhash::rgba_to_thumb_hash(sw, sh, rgba.as_raw()))
}

A ThumbHash is approximately 28 bytes. It encodes the aspect ratio, average colour, and a blurred approximation of the image structure. We downscale to 100 pixels wide before hashing and store the result as a BLOB in SQLite alongside each clip's metadata.

On the frontend, ThumbHash decodes are essentially instant. The grid fills immediately with blurred colour approximations, and real thumbnails replace them as they load. The visual difference between "nothing" and "blurred preview" is significant for perceived performance.

Perceptual hashing for deduplication

Video editors accumulate duplicates. Different exports of the same clip, copies across drives, re-encoded versions with slightly different compression. We needed a way to detect near-duplicates without doing pixel-by-pixel comparison across the entire library.

// `Img` is a thin newtype adapter implementing the `blockhash::Image`
// trait for `image::DynamicImage` (the blockhash crate's optional
// `image` feature provides an equivalent impl out of the box).
pub fn compute_perceptual_hash(img: &image::DynamicImage) -> Vec<u8> {
    let hash: blockhash::Blockhash256 = blockhash::blockhash256(&Img(img));
    let bytes: [u8; 32] = hash.into();
    bytes.to_vec()
}

Blockhash produces a 256-bit perceptual hash. Two visually identical images that differ only in encoding will produce hashes with a small Hamming distance. We store each hash as a 32-byte BLOB in SQLite and flag pairs with a Hamming distance under roughly 25 of 256 bits as near-duplicates.
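The near-duplicate check itself is just a popcount over XORed hash bytes. A minimal sketch (function names are ours; the threshold is the one above):

```rust
/// Number of differing bits between two 256-bit blockhashes.
fn hamming_distance(a: &[u8; 32], b: &[u8; 32]) -> u32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Flag pairs under ~25 differing bits (out of 256) as near-duplicates.
fn is_near_duplicate(a: &[u8; 32], b: &[u8; 32]) -> bool {
    hamming_distance(a, b) < 25
}
```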

The choice of blockhash over alternatives like pHash or dHash came down to simplicity and reliability. Blockhash uses integer-only operations (no floating point), is fully deterministic, and is robust to compression artefacts. When the same clip gets re-encoded at different quality levels, the hash stays stable. It is not particularly robust to crops or major aspect ratio changes, but for our use case of detecting the same footage re-exported with different codecs or bitrates, it works well.

HDR tone mapping for RAW footage

Professional cinema cameras shoot in wide colour gamuts (Rec.2020, DCI-P3) and high dynamic range. Thumbnails need to be sRGB for display. Naive clamping destroys highlights and shifts colours. We needed proper tone mapping.

const HABLE_A: f32 = 0.15;  // Shoulder strength
const HABLE_B: f32 = 0.50;  // Linear strength
const HABLE_C: f32 = 0.10;  // Linear angle
const HABLE_D: f32 = 0.20;  // Toe strength
const HABLE_E: f32 = 0.02;  // Toe numerator
const HABLE_F: f32 = 0.30;  // Toe denominator
const WHITE_POINT: f32 = 11.2;

// Hable's filmic curve, applied per channel.
fn hable(x: f32) -> f32 {
    (x * (HABLE_A * x + HABLE_C * HABLE_B) + HABLE_D * HABLE_E)
        / (x * (HABLE_A * x + HABLE_B) + HABLE_D * HABLE_F)
        - HABLE_E / HABLE_F
}

pub fn tonemap_rgb16_to_srgb8(src: &[u16]) -> Vec<u8> {
    let scale = 1.0 / hable(WHITE_POINT); // normalise so WHITE_POINT maps to 1.0
    src.iter().map(|&s| {
        // 1. Normalize u16 to [0, WHITE_POINT] linear
        let x = f32::from(s) / f32::from(u16::MAX) * WHITE_POINT;
        // 2. Hable filmic tone map to [0, 1]
        let t = (hable(x) * scale).clamp(0.0, 1.0);
        // 3. OKLAB gamut mapping runs here in the full pipeline (elided)
        // 4. Apply sRGB transfer function to [0, 255]
        let g = if t <= 0.0031308 { 12.92 * t } else { 1.055 * t.powf(1.0 / 2.4) - 0.055 };
        (g * 255.0 + 0.5) as u8
    }).collect()
}

We use the Hable filmic curve, originally developed for Uncharted 2. It compresses highlights gracefully instead of clipping them, which preserves detail in bright skies and specular reflections. The constants above are the standard Hable parameters, well-tested in both game engines and film pipelines.
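For reference, the curve those constants parameterise is, with W the white point and the final value normalised so that W maps to 1:

```latex
f(x) = \frac{x\,(A x + C B) + D E}{x\,(A x + B) + D F} - \frac{E}{F},
\qquad
\mathrm{tonemap}(x) = \frac{f(x)}{f(W)}
```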

After tone mapping, colours may still fall outside the sRGB gamut. We handle this with OKLAB gamut mapping: a binary search (8 iterations, less than 0.4 percent chroma error) that reduces chroma until the colour fits within sRGB while preserving lightness and hue. This means an out-of-gamut red becomes a less saturated red, not a shifted orange. The palette crate handles the OKLAB and OKLCH color space conversions.
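As a self-contained illustration of the chroma binary search (not FrameQuery's actual code, which goes through the palette crate), here is the idea against a from-scratch OKLab-to-linear-sRGB conversion using Björn Ottosson's published coefficients; function names are ours and hue is in radians:

```rust
/// OKLab -> linear sRGB (Ottosson's reference coefficients).
fn oklab_to_linear_srgb(l: f32, a: f32, b: f32) -> [f32; 3] {
    let l_ = l + 0.3963377774 * a + 0.2158037573 * b;
    let m_ = l - 0.1055613458 * a - 0.0638541728 * b;
    let s_ = l - 0.0894841775 * a - 1.2914855480 * b;
    let (x, y, z) = (l_ * l_ * l_, m_ * m_ * m_, s_ * s_ * s_);
    [
        4.0767416621 * x - 3.3077115913 * y + 0.2309699292 * z,
        -1.2684380046 * x + 2.6097574011 * y - 0.3413193965 * z,
        -0.0041960863 * x - 0.7034186147 * y + 1.7076147010 * z,
    ]
}

fn in_srgb_gamut(rgb: [f32; 3]) -> bool {
    rgb.iter().all(|&c| (0.0..=1.0).contains(&c))
}

/// Reduce chroma (binary search, 8 iterations) until the OKLCH colour
/// fits in sRGB, preserving lightness `l` and hue `h` (radians).
fn gamut_map_oklch(l: f32, c: f32, h: f32) -> [f32; 3] {
    let to_rgb = |c: f32| oklab_to_linear_srgb(l, c * h.cos(), c * h.sin());
    if in_srgb_gamut(to_rgb(c)) {
        return to_rgb(c); // already representable: no change
    }
    // Invariant: to_rgb(lo) is in gamut (chroma 0 is a grey), to_rgb(hi) is not.
    let (mut lo, mut hi) = (0.0_f32, c);
    for _ in 0..8 {
        let mid = 0.5 * (lo + hi);
        if in_srgb_gamut(to_rgb(mid)) { lo = mid } else { hi = mid }
    }
    to_rgb(lo)
}
```

Because lightness and hue are held fixed while only chroma shrinks, an out-of-gamut red comes back as a duller red rather than drifting towards orange, which is exactly the behaviour described above.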

FFmpeg seeking optimisation

FrameQuery uses FFmpeg to extract frames from video files before processing them through the Rust pipeline. The order of arguments matters more than you might expect.

# Fast (input seek, keyframe-based):
ffmpeg -ss <time> -i input.mp4 -vframes 1 thumb.jpg

# Slow (output seek, frame-by-frame decode):
ffmpeg -i input.mp4 -ss <time> -vframes 1 thumb.jpg

Placing -ss before -i tells FFmpeg to seek to the nearest keyframe before opening the input, then decode only a few frames to reach the exact timestamp. Placing it after -i means FFmpeg decodes every frame from the start of the file until it reaches the target. For a timestamp 30 minutes into a file, that is the difference between milliseconds and minutes.
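In Rust, the ordering constraint is easy to encode once in a command builder and then forget about. A sketch using std::process::Command (the helper is illustrative, not FrameQuery's exact code):

```rust
use std::process::Command;

/// Build an ffmpeg invocation that input-seeks (the fast path):
/// `-ss` is pushed before `-i`, so FFmpeg jumps to the nearest
/// keyframe before decoding instead of decoding from the start.
fn thumbnail_cmd(input: &str, seconds: f64, out: &str) -> Command {
    let mut cmd = Command::new("ffmpeg");
    cmd.arg("-ss").arg(seconds.to_string()) // input seek: must precede -i
        .arg("-i").arg(input)
        .args(["-vframes", "1", "-y"])
        .arg(out);
    cmd
}
```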

We have FFmpeg output PNG rather than WebP because FFmpeg builds do not always include a WebP encoder. The pattern is: extract a temporary PNG via FFmpeg, then encode to lossy WebP in Rust where we control the encoder. One extra step, but it works on every FFmpeg build.

Arena allocation for batch processing

Thumbnail generation processes clips in batches. Each clip involves multiple allocations: decoded frames, resized buffers, intermediate colour conversions. In the standard allocator, each of these is individually allocated and freed, which adds up when processing thousands of clips.

/// Convert 16-bit-per-channel pixel data to 8-bit, allocating the output
/// in the per-clip arena rather than the global allocator.
pub fn rgb16_to_rgb8_arena<'a>(arena: &'a bumpalo::Bump, src: &[u16]) -> &'a [u8] {
    let dst = arena.alloc_slice_fill_default(src.len());
    for (d, &s) in dst.iter_mut().zip(src.iter()) {
        *d = (s >> 8) as u8; // keep the high byte of each 16-bit channel
    }
    dst
}

We use bumpalo, an arena allocator, to batch all temporary allocations for a single clip into one contiguous region. When the clip is done, the entire arena resets in one operation. No per-allocation free, no fragmentation, no deallocation overhead. For batch processing, this is measurably faster than the default allocator.

What we learned

A silently ignored quality parameter led us to rebuild an entire pipeline. The original bug was a one-line fix (use the explicit lossy encoder), but investigating it revealed that the surrounding code had accumulated technical debt: slow resizing, no placeholders, no deduplication, naive tone mapping.

Sometimes a small bug is a signal that an entire subsystem needs attention. The thumbnail pipeline went from "technically working but subtly wrong" to something we are confident in. Thumbnails are smaller, load faster, handle HDR correctly, and detect duplicates.

We are still on the deprecated WebP API. When the image crate removes it, we will migrate to AVIF. Until then, it works.

Join the waitlist to try FrameQuery when it launches.