CUDA, Metal, and CPU: Cross-Platform GPU Acceleration in a Desktop Video App
FrameQuery uses NVIDIA CUDA on Windows and Linux, Apple Metal on macOS, and falls back to CPU when no GPU is available. All from a single Rust codebase.
FrameQuery runs on Windows, macOS, and Linux. On each platform, it needs to decode professional video formats as fast as possible. That means GPU acceleration. But every platform has a different GPU API: NVIDIA CUDA on Windows and Linux, Apple Metal on macOS. And some machines have no suitable GPU at all.
We handle all three cases from a single Rust codebase.
Why GPU acceleration matters for video
Professional cinema RAW formats like RED R3D, ARRI ARRIRAW, and Blackmagic BRAW store raw sensor data that must be debayered before it becomes a viewable image. Debayering is computationally expensive. At full 8K resolution, a single frame can take hundreds of milliseconds on CPU. Multiply that by thousands of frames and proxy generation becomes painfully slow.
GPUs are well suited to this. Debayering computes each output pixel from a small neighbourhood of sensor values, independently of every other output pixel, so it parallelises well. A GPU kernel can process a full 8K frame in a fraction of the time a CPU takes.
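To make the parallelism concrete, here is a minimal CPU reference sketch of debayering. It assumes an RGGB mosaic and a simple 3x3 averaging filter; the vendor SDKs use their own, more sophisticated algorithms, and the function names site_colour, debayer_pixel, and debayer are hypothetical. The point is that each call to debayer_pixel reads only its own neighbourhood, so every pixel could run as an independent GPU thread.

```rust
/// Colour of the sensor site at (x, y), assuming an RGGB Bayer layout.
/// Returns 0 for red, 1 for green, 2 for blue.
fn site_colour(x: usize, y: usize) -> usize {
    match (y % 2, x % 2) {
        (0, 0) => 0, // red on even rows, even columns
        (1, 1) => 2, // blue on odd rows, odd columns
        _ => 1,      // green everywhere else
    }
}

/// Reconstruct one RGB pixel by averaging the same-coloured sensor sites
/// in the 3x3 neighbourhood around (x, y), clamped at the image border.
/// The function reads shared input and writes nothing shared, which is
/// what makes the work trivially parallel.
fn debayer_pixel(raw: &[u16], width: usize, height: usize, x: usize, y: usize) -> [u16; 3] {
    let mut sum = [0u32; 3];
    let mut count = [0u32; 3];
    for ny in y.saturating_sub(1)..=(y + 1).min(height - 1) {
        for nx in x.saturating_sub(1)..=(x + 1).min(width - 1) {
            let c = site_colour(nx, ny);
            sum[c] += raw[ny * width + nx] as u32;
            count[c] += 1;
        }
    }
    let mut rgb = [0u16; 3];
    for c in 0..3 {
        if count[c] > 0 {
            rgb[c] = (sum[c] / count[c]) as u16;
        }
    }
    rgb
}

/// Debayer a whole frame on the CPU, one pixel at a time.
/// A GPU kernel would run the body of this loop once per thread.
fn debayer(raw: &[u16], width: usize, height: usize) -> Vec<[u16; 3]> {
    assert_eq!(raw.len(), width * height);
    (0..width * height)
        .map(|i| debayer_pixel(raw, width, height, i % width, i / width))
        .collect()
}
```

An 8K frame has tens of millions of pixels, so the sequential loop in debayer is exactly the part that benefits from being spread across thousands of GPU threads.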
Three platforms, three GPU backends
Rust's conditional compilation lets us include platform-specific GPU code only where it belongs. CUDA code compiles on Windows and Linux. Metal code compiles on macOS. If no GPU backend is usable on a given machine, the application still works, just without GPU acceleration.
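As a rough illustration of the compile-time split, the sketch below uses Rust's cfg attribute keyed on target_os. This is an assumption about the mechanism: FrameQuery might equally use Cargo features or a build script. The Backend enum and compiled_gpu_backend function are hypothetical names, not FrameQuery's actual API.

```rust
/// Decode backends the application knows about (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Cuda,
    Metal,
    Cpu,
}

/// On Windows and Linux, only the CUDA variant of this function is compiled.
#[cfg(any(target_os = "windows", target_os = "linux"))]
pub fn compiled_gpu_backend() -> Option<Backend> {
    Some(Backend::Cuda)
}

/// On macOS, only the Metal variant is compiled.
#[cfg(target_os = "macos")]
pub fn compiled_gpu_backend() -> Option<Backend> {
    Some(Backend::Metal)
}

/// On any other target, no GPU backend is built in at all.
#[cfg(not(any(target_os = "windows", target_os = "linux", target_os = "macos")))]
pub fn compiled_gpu_backend() -> Option<Backend> {
    None
}
```

Exactly one of the three definitions exists in any given build, so a macOS binary contains no CUDA code and a Windows binary contains no Metal code. Whether the compiled-in backend can actually be used is a separate, runtime question.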
At runtime, FrameQuery probes the GPU before committing to a decode path. A Windows machine might not have an NVIDIA GPU. A Mac might have an integrated GPU that does not support the Metal features we need. The app detects what is available and selects the best option automatically.
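The selection step can be sketched as a pure function over the probe results. Everything here is illustrative: the ProbeResult fields stand in for whatever checks the real probe performs (driver presence, device capabilities, required Metal features), and DecodePath and select_decode_path are hypothetical names.

```rust
/// Decode paths the app can choose between (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DecodePath {
    Cuda,
    Metal,
    Cpu,
}

/// Outcome of a hypothetical runtime probe. Both flags default to false,
/// so an unprobed or failed probe naturally selects the CPU path.
#[derive(Debug, Clone, Copy, Default)]
pub struct ProbeResult {
    /// A CUDA-capable NVIDIA device and a working driver were found.
    pub cuda_device_usable: bool,
    /// A Metal device exposing the features the decoder needs was found.
    pub metal_features_supported: bool,
}

/// Pick the fastest path the probe says is usable, falling back to CPU.
pub fn select_decode_path(probe: ProbeResult) -> DecodePath {
    if probe.cuda_device_usable {
        DecodePath::Cuda
    } else if probe.metal_features_supported {
        DecodePath::Metal
    } else {
        DecodePath::Cpu
    }
}
```

Keeping the decision separate from the probing makes it easy to test every combination without real hardware, and a probe that fails outright simply yields the default result and therefore the CPU path.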
NVIDIA GPUs (Windows and Linux)
On machines with NVIDIA GPUs, FrameQuery uses CUDA to overlap CPU and GPU work during decode. While the GPU is processing one frame, the CPU is already preparing the next. This pipelining keeps both the CPU and GPU busy and minimises idle time, which is critical when decoding thousands of frames for proxy generation.
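The overlap described above follows a producer and consumer shape. The sketch below imitates it with a thread and a bounded channel from the Rust standard library; it is an analogy, not CUDA code, and a real implementation would more likely rely on CUDA streams and asynchronous copies. The run_pipeline function, its parameters, and the queue depth of 2 are all assumptions made for illustration.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Run a two-stage pipeline over `frame_count` frames.
/// `prepare` stands in for the CPU stage (reading and unpacking a frame);
/// `process` stands in for the GPU stage (debayering that frame).
/// The two stages run concurrently, and results keep frame order.
pub fn run_pipeline<P, G, T, U>(frame_count: usize, prepare: P, process: G) -> Vec<U>
where
    P: Fn(usize) -> T + Send + 'static,
    G: Fn(T) -> U,
    T: Send + 'static,
{
    // A bounded queue lets the CPU stage run at most two frames ahead,
    // which caps memory use while still hiding preparation latency.
    let (tx, rx) = sync_channel::<T>(2);

    let producer = thread::spawn(move || {
        for index in 0..frame_count {
            // Stop early if the consumer has gone away.
            if tx.send(prepare(index)).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the consumer's iteration.
    });

    // While this thread processes frame N, the producer prepares frame N + 1.
    let results: Vec<U> = rx.iter().map(process).collect();
    producer.join().expect("prepare thread panicked");
    results
}
```

For example, run_pipeline(4, |i| vec![i as u8; 3], |buf: Vec<u8>| buf.len()) prepares four small buffers on one thread while the calling thread consumes them. The bounded channel is the design choice that matters: an unbounded queue would let the CPU stage race ahead and hold many large frames in memory at once.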
Apple Silicon and AMD (macOS)
On macOS, FrameQuery uses Metal for GPU-accelerated decode. Apple Silicon's unified memory architecture is particularly well-suited to this workload. Data decoded by the CPU is already accessible to the GPU without needing to be copied across a bus, which means a simpler and faster pipeline on Mac hardware.
CPU fallback
When no supported GPU is available, FrameQuery falls back to CPU decode. The RED, ARRI, and Blackmagic SDKs all support this natively. CPU decode is slower but completely reliable and produces identical output. For single-frame operations like thumbnail generation, the CPU path is sometimes preferable, since the fixed cost of setting up the GPU and moving data to it can outweigh the speed-up for a single frame.
Performance you can feel
With GPU acceleration enabled, proxy generation from cinema RAW files is fast enough to be practical for large libraries. Processing a day's worth of RED or Blackmagic footage takes a fraction of the shooting time, not a multiple of it.
The decoded frames stream directly into hardware-accelerated encoding where available (NVENC on NVIDIA, Quick Sync on Intel, AMF on AMD), with software encoding as a universal fallback.
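Encoder selection follows the same pattern as decode selection: try the hardware options, then fall back to software. The sketch below assumes a fixed preference order (NVENC, then Quick Sync, then AMF), which the text above does not specify; the Encoder and GpuVendor enums and the select_encoder function are hypothetical names.

```rust
/// Encoders named in the text, plus the universal software fallback.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoder {
    Nvenc,
    QuickSync,
    Amf,
    Software,
}

/// GPU vendors whose hardware encoders might be present.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GpuVendor {
    Nvidia,
    Intel,
    Amd,
}

/// Choose a hardware encoder matching one of the detected vendors,
/// or software encoding if none is available.
/// The preference order below is an assumption for illustration only.
pub fn select_encoder(available: &[GpuVendor]) -> Encoder {
    const PREFERENCE: [(GpuVendor, Encoder); 3] = [
        (GpuVendor::Nvidia, Encoder::Nvenc),
        (GpuVendor::Intel, Encoder::QuickSync),
        (GpuVendor::Amd, Encoder::Amf),
    ];
    PREFERENCE
        .iter()
        .find(|(vendor, _)| available.contains(vendor))
        .map(|(_, encoder)| *encoder)
        .unwrap_or(Encoder::Software)
}
```

Because the fallback is the last step of a single expression, there is no code path that returns without an encoder, which mirrors the guarantee that software encoding is always available.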
Why this matters for editors
You should not need to think about GPU APIs or decode pipelines. When you drop R3D, ARRIRAW, or BRAW files into FrameQuery, the app automatically uses the fastest decode path your hardware supports. No configuration, no GPU settings to manage, no "please install CUDA toolkit" dialogs.
A single install works on a high-end workstation with dual NVIDIA GPUs and on a MacBook Air. Performance scales with your hardware, but the app always works.
Join the waitlist to try FrameQuery on your own hardware.