Computer vision and image processing applications demand high performance, reliability, and often real-time capabilities. From autonomous vehicles and robotics to augmented reality and medical imaging, these systems process enormous amounts of visual data and must do so efficiently and safely. Rust, with its combination of performance comparable to C/C++ and memory safety guarantees without garbage collection, has emerged as an excellent choice for computer vision development.
In this comprehensive guide, we’ll explore Rust’s ecosystem for computer vision and image processing as it stands in early 2025. We’ll examine the libraries, frameworks, and tools that have matured over the years, providing developers with robust building blocks for creating efficient and reliable vision applications. Whether you’re building real-time video processing systems, image analysis tools, or integrating computer vision with machine learning, this guide will help you navigate the rich landscape of Rust’s computer vision ecosystem.
Image Processing Foundations
At the core of computer vision are libraries for loading, manipulating, and saving images:
Image: Rust’s Core Image Library
// Using the image crate for basic image processing
// Cargo.toml:
// [dependencies]
// image = "0.24"
use image::{GenericImageView, ImageBuffer, Rgb};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Load an image from file
let img = image::open("input.jpg")?;
// Get image dimensions
let (width, height) = img.dimensions();
println!("Image dimensions: {}x{}", width, height);
// Convert to RGB format
let rgb_img = img.to_rgb8();
// Create a new image for the output
let mut output = ImageBuffer::new(width, height);
// Process each pixel - simple grayscale conversion
for (x, y, pixel) in rgb_img.enumerate_pixels() {
let r = pixel[0] as u32;
let g = pixel[1] as u32;
let b = pixel[2] as u32;
// Standard grayscale conversion formula
let gray = ((0.299 * r as f32) + (0.587 * g as f32) + (0.114 * b as f32)) as u8;
output.put_pixel(x, y, Rgb([gray, gray, gray]));
}
// Save the processed image
output.save("grayscale.jpg")?;
println!("Grayscale image saved as grayscale.jpg");
// Create a simple image filter - horizontal blur
let mut blurred = ImageBuffer::new(width, height);
for y in 0..height {
for x in 0..width {
let mut r_sum = 0;
let mut g_sum = 0;
let mut b_sum = 0;
let mut count = 0;
// Simple horizontal blur with a 5-pixel kernel centered on x
for dx in -2i64..=2 {
let nx = x as i64 + dx;
// Skip taps outside the image so border pixels are not counted more than once
if nx >= 0 && nx < width as i64 {
let pixel = rgb_img.get_pixel(nx as u32, y);
r_sum += pixel[0] as u32;
g_sum += pixel[1] as u32;
b_sum += pixel[2] as u32;
count += 1;
}
}
// Calculate average
let r_avg = (r_sum / count) as u8;
let g_avg = (g_sum / count) as u8;
let b_avg = (b_sum / count) as u8;
blurred.put_pixel(x, y, Rgb([r_avg, g_avg, b_avg]));
}
}
// Save the blurred image
blurred.save("blurred.jpg")?;
println!("Blurred image saved as blurred.jpg");
Ok(())
}
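The border handling is the part of a box blur that is easiest to get wrong, so it can help to look at it in isolation. The following is a minimal, crate-free sketch of a horizontal box blur over a single row of samples; the function name `box_blur_row` and its `radius` parameter are illustrative and not part of the image crate.

```rust
/// Horizontal box blur over one row of 8-bit samples.
/// Taps that fall outside the row are skipped, so border samples
/// are averaged over fewer neighbours.
/// `box_blur_row` and `radius` are illustrative names, not an `image` crate API.
fn box_blur_row(row: &[u8], radius: usize) -> Vec<u8> {
    let len = row.len();
    (0..len)
        .map(|x| {
            // Clamp the window [x - radius, x + radius] to the row bounds
            let start = x.saturating_sub(radius);
            let end = (x + radius + 1).min(len); // exclusive upper bound
            let sum: u32 = row[start..end].iter().map(|&v| v as u32).sum();
            (sum / (end - start) as u32) as u8
        })
        .collect()
}
```

Calling `box_blur_row(&row, 2)` corresponds to a 5-sample kernel. Skipping out-of-range taps is only one possible border policy; replicating the edge sample is another common choice.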
ImageProc: Advanced Image Processing
// Using imageproc for advanced image processing
// Cargo.toml:
// [dependencies]
// image = "0.24"
// imageproc = "0.23"
use image::{GrayImage, Luma, Rgb};
use imageproc::{
corners::corners_fast9,
drawing::draw_cross_mut,
edges::canny,
filter::gaussian_blur_f32,
gradients::sobel_gradients,
morphology::{dilate, erode},
};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Load an image
let img = image::open("input.jpg")?;
let rgb_img = img.to_rgb8();
let gray_img = img.to_luma8();
// Edge detection with Canny algorithm
let edges = canny(&gray_img, 50.0, 100.0);
edges.save("edges.jpg")?;
println!("Edge detection saved as edges.jpg");
// Gaussian blur
let blurred = gaussian_blur_f32(&rgb_img, 2.0);
blurred.save("gaussian_blur.jpg")?;
println!("Gaussian blur saved as gaussian_blur.jpg");
// Sobel gradient magnitude (sobel_gradients returns one u16 magnitude per pixel)
let gradients = sobel_gradients(&gray_img);
// Clamp the u16 magnitudes into a displayable 8-bit image
let mut gradient_img = GrayImage::new(gray_img.width(), gray_img.height());
for (x, y, pixel) in gradient_img.enumerate_pixels_mut() {
let magnitude = gradients.get_pixel(x, y)[0];
*pixel = Luma([magnitude.min(255) as u8]);
}
gradient_img.save("gradients.jpg")?;
println!("Gradient magnitude saved as gradients.jpg");
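// Morphological operations. imageproc's dilate and erode take a distance norm and
// a radius k (Norm::LInf with k = 1 is a 3x3 square) and treat every non-zero
// pixel as foreground, so they are applied to the binary edge map here.
let dilated = dilate(&edges, imageproc::distance_transform::Norm::LInf, 1);
dilated.save("dilated.jpg")?;
println!("Dilated image saved as dilated.jpg");
let eroded = erode(&edges, imageproc::distance_transform::Norm::LInf, 1);
eroded.save("eroded.jpg")?;
println!("Eroded image saved as eroded.jpg");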
// Corner detection
let corners = corners_fast9(&gray_img, 50);
// Draw corners on the original image
let mut corner_img = rgb_img.clone();
for corner in corners {
// Corner exposes its position through the x and y fields
draw_cross_mut(&mut corner_img, Rgb([255, 0, 0]), corner.x as i32, corner.y as i32);
}
corner_img.save("corners.jpg")?;
println!("Corner detection saved as corners.jpg");
Ok(())
}
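To make the Sobel step less of a black box, here is a minimal sketch of the gradient-magnitude computation written against a plain row-major byte slice. The helper name `sobel_magnitude` is hypothetical and is not an imageproc function; border pixels are simply left at zero, which is an assumption of this sketch rather than a statement about how imageproc treats borders.

```rust
/// Gradient magnitude at every interior pixel of a row-major grayscale image,
/// computed with the 3x3 Sobel kernels. Border pixels are left at 0.
/// `sobel_magnitude` is an illustrative helper, not an imageproc function.
fn sobel_magnitude(img: &[u8], width: usize, height: usize) -> Vec<f32> {
    assert_eq!(img.len(), width * height);
    let mut out = vec![0.0f32; width * height];
    if width < 3 || height < 3 {
        return out;
    }
    let px = |x: usize, y: usize| img[y * width + x] as f32;
    for y in 1..height - 1 {
        for x in 1..width - 1 {
            // Horizontal derivative: weighted right column minus left column
            let gx = (px(x + 1, y - 1) + 2.0 * px(x + 1, y) + px(x + 1, y + 1))
                - (px(x - 1, y - 1) + 2.0 * px(x - 1, y) + px(x - 1, y + 1));
            // Vertical derivative: weighted bottom row minus top row
            let gy = (px(x - 1, y + 1) + 2.0 * px(x, y + 1) + px(x + 1, y + 1))
                - (px(x - 1, y - 1) + 2.0 * px(x, y - 1) + px(x + 1, y - 1));
            out[y * width + x] = (gx * gx + gy * gy).sqrt();
        }
    }
    out
}
```

A hard vertical edge from 0 to 255 yields a magnitude of 1020 at the centre of a 3x3 patch, which is why gradient images need a wider type than `u8` and must be clamped or rescaled before being saved as 8-bit.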
Computer Vision Algorithms
The examples in this section use the opencv crate, which provides Rust bindings to the OpenCV C++ library (OpenCV itself must be installed on the system):
Feature Detection and Matching
// Feature detection and matching
// Cargo.toml:
// [dependencies]
// image = "0.24"
// opencv = { version = "0.82", features = ["opencv-4"] }
use opencv::{
core::{DMatch, KeyPoint, Mat, Vector},
features2d::{BFMatcher, ORB, ORB_ScoreType},
imgcodecs::{imread, IMREAD_GRAYSCALE},
prelude::*,
};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Load two images for feature matching
let img1 = imread("scene.jpg", IMREAD_GRAYSCALE)?;
let img2 = imread("object.jpg", IMREAD_GRAYSCALE)?;
// Create ORB feature detector (associated-function style of the opencv 0.82 bindings;
// other crate versions may name these constructors differently)
let mut orb = ORB::create(500, 1.2, 8, 31, 0, 2, ORB_ScoreType::HARRIS_SCORE, 31, 20)?;
// Detect keypoints and compute descriptors
let mut keypoints1 = Vector::<KeyPoint>::new();
let mut keypoints2 = Vector::<KeyPoint>::new();
let mut descriptors1 = Mat::default();
let mut descriptors2 = Mat::default();
orb.detect_and_compute(
&img1,
&Mat::default(),
&mut keypoints1,
&mut descriptors1,
false,
)?;
orb.detect_and_compute(
&img2,
&Mat::default(),
&mut keypoints2,
&mut descriptors2,
false,
)?;
println!("Found {} keypoints in image 1", keypoints1.len());
println!("Found {} keypoints in image 2", keypoints2.len());
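// Match descriptors with a brute-force matcher using Hamming distance (ORB descriptors are binary)
let matcher = BFMatcher::create(opencv::core::NORM_HAMMING, false)?;
let mut matches = Vector::<DMatch>::new();
// train_match is the binding for the overload that takes both query and train descriptors
matcher.train_match(&descriptors1, &descriptors2, &mut matches, &Mat::default())?;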
// Sort matches by distance
let mut matches_vec = matches.to_vec();
matches_vec.sort_by(|a, b| a.distance.total_cmp(&b.distance));
// Take only good matches (e.g., top 10%)
let num_good_matches = (matches_vec.len() as f32 * 0.1) as usize;
let good_matches = matches_vec.into_iter().take(num_good_matches).collect::<Vec<_>>();
println!("Found {} good matches", good_matches.len());
// Draw matches
let mut img_matches = Mat::default();
let mut good_matches_vec = Vector::<DMatch>::new();
for m in good_matches {
good_matches_vec.push(m);
}
opencv::features2d::draw_matches(
&img1,
&keypoints1,
&img2,
&keypoints2,
&good_matches_vec,
&mut img_matches,
opencv::core::Scalar::new(0.0, 255.0, 0.0, 0.0),
opencv::core::Scalar::new(0.0, 0.0, 255.0, 0.0),
&Vector::<std::ffi::c_char>::new(), // empty mask: draw every match
opencv::features2d::DrawMatchesFlags::DEFAULT,
)?;
// Save the result
opencv::imgcodecs::imwrite("matches.jpg", &img_matches, &Vector::<i32>::new())?;
println!("Feature matches saved as matches.jpg");
Ok(())
}
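ORB descriptors are bit strings, which is why the matcher is configured with the Hamming norm. As a rough illustration of what a brute-force matcher does with such descriptors, here is a crate-free sketch; `hamming` and `brute_force_match` are hypothetical helper names, and descriptors are modelled as plain byte vectors rather than OpenCV `Mat` rows.

```rust
/// Number of differing bits between two equal-length binary descriptors.
fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// For each query descriptor, find the closest train descriptor.
/// Returns (query index, train index, Hamming distance) triples.
/// Ties are resolved in favour of the first train descriptor encountered.
fn brute_force_match(query: &[Vec<u8>], train: &[Vec<u8>]) -> Vec<(usize, usize, u32)> {
    query
        .iter()
        .enumerate()
        .filter_map(|(qi, q)| {
            train
                .iter()
                .enumerate()
                .map(|(ti, t)| (qi, ti, hamming(q, t)))
                .min_by_key(|&(_, _, d)| d)
        })
        .collect()
}
```

The cost is proportional to the product of the two descriptor counts, which is acceptable for a few hundred keypoints per image but is the reason approximate matchers exist for larger sets.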
Optical Flow
// Optical flow calculation
// Cargo.toml:
// [dependencies]
// opencv = { version = "0.82", features = ["opencv-4"] }
use opencv::{
core::{Mat, Point2f, Scalar, Vector},
imgcodecs::{imread, IMREAD_COLOR},
imgproc::{circle, line},
prelude::*,
video::calc_optical_flow_pyr_lk,
};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Load two consecutive frames
let frame1 = imread("frame1.jpg", IMREAD_COLOR)?;
let frame2 = imread("frame2.jpg", IMREAD_COLOR)?;
// Convert to grayscale
let mut gray1 = Mat::default();
let mut gray2 = Mat::default();
opencv::imgproc::cvt_color(&frame1, &mut gray1, opencv::imgproc::COLOR_BGR2GRAY, 0)?;
opencv::imgproc::cvt_color(&frame2, &mut gray2, opencv::imgproc::COLOR_BGR2GRAY, 0)?;
// Find good features to track
let mut points1 = Vector::<Point2f>::new();
opencv::imgproc::good_features_to_track(
&gray1,
&mut points1,
100, // max corners
0.01, // quality level
10.0, // min distance
&Mat::default(),
3, // block size
false,
0.04, // k
)?;
println!("Found {} feature points", points1.len());
// Calculate optical flow with the pyramidal Lucas-Kanade tracker
let mut points2 = Vector::<Point2f>::new();
let mut status = Vector::<u8>::new();
let mut err = Vector::<f32>::new();
calc_optical_flow_pyr_lk(
&gray1,
&gray2,
&points1,
&mut points2,
&mut status,
&mut err,
opencv::core::Size::new(21, 21),
3,
opencv::core::TermCriteria::new(
opencv::core::TermCriteria_Type::COUNT as i32 + opencv::core::TermCriteria_Type::EPS as i32,
30,
0.01,
)?,
0,
0.001,
)?;
// Draw the tracks
let mut result = frame2.clone();
for i in 0..points1.len() {
if status.get(i)? == 1 {
let p1 = points1.get(i)?;
let p2 = points2.get(i)?;
// Draw the movement line
line(
&mut result,
opencv::core::Point::new(p1.x as i32, p1.y as i32),
opencv::core::Point::new(p2.x as i32, p2.y as i32),
Scalar::new(0.0, 255.0, 0.0, 0.0), // Green
2,
opencv::imgproc::LINE_AA,
0,
)?;
// Draw the current position
circle(
&mut result,
opencv::core::Point::new(p2.x as i32, p2.y as i32),
5,
Scalar::new(0.0, 0.0, 255.0, 0.0), // Red
-1,
opencv::imgproc::LINE_AA,
0,
)?;
}
}
// Save the result
opencv::imgcodecs::imwrite("optical_flow.jpg", &result, &Vector::<i32>::new())?;
println!("Optical flow saved as optical_flow.jpg");
Ok(())
}
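The tracker used above is a pyramidal, iterative implementation, but its core is a small least-squares problem per window: under the brightness-constancy assumption, each pixel contributes one equation Ix*u + Iy*v + It = 0. The following sketch solves that 2x2 system for a single window. The function name `lucas_kanade_window` and the singularity threshold are assumptions made for illustration; this is not the OpenCV implementation.

```rust
/// Solve the Lucas-Kanade normal equations for one window.
/// `ix` and `iy` are spatial derivatives and `it` is the temporal derivative,
/// all sampled at the same pixels. Returns None when the 2x2 system is
/// (near-)singular, i.e. the window has too little texture to track.
/// The name and the 1e-6 threshold are illustrative assumptions.
fn lucas_kanade_window(ix: &[f32], iy: &[f32], it: &[f32]) -> Option<(f32, f32)> {
    let (mut sxx, mut sxy, mut syy, mut sxt, mut syt) = (0.0f32, 0.0, 0.0, 0.0, 0.0);
    for ((&gx, &gy), &gt) in ix.iter().zip(iy).zip(it) {
        sxx += gx * gx;
        sxy += gx * gy;
        syy += gy * gy;
        sxt += gx * gt;
        syt += gy * gt;
    }
    let det = sxx * syy - sxy * sxy;
    if det.abs() < 1e-6 {
        return None;
    }
    // Solve [sxx sxy; sxy syy] [u v]^T = -[sxt syt]^T via the closed-form 2x2 inverse
    let u = (-syy * sxt + sxy * syt) / det;
    let v = (sxy * sxt - sxx * syt) / det;
    Some((u, v))
}
```

The `None` case corresponds to flat or edge-only regions, which is the reason the example first selects corner-like points with `good_features_to_track` before tracking them.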
Deep Learning Integration
Rust provides tools for integrating computer vision with deep learning:
Tract: Neural Network Inference
// Using Tract for neural network inference
// Cargo.toml:
// [dependencies]
// tract-onnx = "0.19"
// image = "0.24"
// ndarray = "0.15"
use ndarray::Array4;
use tract_onnx::prelude::*;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Load an ONNX model
let model = tract_onnx::onnx()
.model_for_path("mobilenet_v2.onnx")?
.with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, 224, 224)))?
.into_optimized()?
.into_runnable()?;
// Load and preprocess the image
let img = image::open("cat.jpg")?;
let resized = img.resize_exact(224, 224, image::imageops::FilterType::Triangle);
// Convert to RGB if needed
let rgb_image = resized.to_rgb8();
// Prepare input tensor
let mut input = Array4::zeros((1, 3, 224, 224));
// Normalize pixel values and reorder channels to CHW format
for y in 0..224 {
for x in 0..224 {
let pixel = rgb_image.get_pixel(x, y);
// Normalize to [0, 1] and then apply ImageNet normalization
input[[0, 0, y as usize, x as usize]] = (pixel[0] as f32 / 255.0 - 0.485) / 0.229;
input[[0, 1, y as usize, x as usize]] = (pixel[1] as f32 / 255.0 - 0.456) / 0.224;
input[[0, 2, y as usize, x as usize]] = (pixel[2] as f32 / 255.0 - 0.406) / 0.225;
}
}
// Run inference (in tract 0.19, run takes TValue inputs, hence the .into())
let result = model.run(tvec!(input.into_tensor().into()))?;
// Get output tensor
let output = result[0].to_array_view::<f32>()?;
// Find top 5 predictions
let mut scores: Vec<(usize, f32)> = output
.iter()
.enumerate()
.map(|(i, &score)| (i, score))
.collect();
scores.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
// Load class labels
let labels = std::fs::read_to_string("imagenet_classes.txt")?
.lines()
.map(|s| s.to_string())
.collect::<Vec<_>>();
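// Print top 5 predictions. The percentages are only meaningful if the model ends in a
// softmax layer; if it emits raw logits, apply softmax to the scores first.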
println!("Top 5 predictions:");
for (i, (class_idx, score)) in scores.iter().take(5).enumerate() {
println!(
"{}. {} - {:.2}%",
i + 1,
labels.get(*class_idx).unwrap_or(&format!("Class {}", class_idx)),
score * 100.0
);
}
Ok(())
}
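Whether `mobilenet_v2.onnx` outputs probabilities or raw logits depends on how the model was exported, so it is worth treating the output as logits unless you know otherwise. As a hedged sketch of the post-processing step, the following crate-free helpers convert logits to probabilities and pick the best classes; `softmax` and `top_k` are illustrative names, not tract APIs.

```rust
/// Numerically stable softmax: subtract the maximum before exponentiating
/// so that large logits do not overflow.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Indices and scores of the `k` highest-scoring classes, best first.
fn top_k(scores: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = scores.iter().copied().enumerate().collect();
    // total_cmp gives a total order, so NaN scores cannot cause a panic
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    indexed.truncate(k);
    indexed
}
```

Applied to the example, the flattened output tensor would be collected into a `Vec<f32>`, passed through `softmax`, and the result of `top_k(&probs, 5)` printed as percentages.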
Real-time Video Processing
Rust provides tools for real-time video processing:
Video Processing with OpenCV
// Real-time video processing with OpenCV
// Cargo.toml:
// [dependencies]
// opencv = { version = "0.82", features = ["opencv-4"] }
use opencv::{
core::{Mat, Scalar},
highgui,
imgproc,
prelude::*,
videoio::{VideoCapture, VideoWriter, CAP_ANY},
};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Open a video file or camera
let mut cap = VideoCapture::new(0, CAP_ANY)?; // 0 for default camera
if !cap.is_opened()? {
return Err("Could not open video capture device".into());
}
// Get video properties
let width = cap.get(opencv::videoio::CAP_PROP_FRAME_WIDTH)? as i32;
let height = cap.get(opencv::videoio::CAP_PROP_FRAME_HEIGHT)? as i32;
let fps = cap.get(opencv::videoio::CAP_PROP_FPS)?;
// Some cameras report 0 FPS; fall back to a nominal rate so the writer gets a valid value
let fps = if fps > 0.0 { fps } else { 30.0 };
println!("Video: {}x{}, {} FPS", width, height, fps);
// Create video writer for output (Motion JPEG; older crate versions take i8 arguments here)
let fourcc = VideoWriter::fourcc('M', 'J', 'P', 'G')?;
let mut writer = VideoWriter::new(
"processed_video.avi",
fourcc,
fps,
opencv::core::Size::new(width, height),
true,
)?;
// Create a window for display
highgui::named_window("Video Processing", highgui::WINDOW_AUTOSIZE)?;
// Process video frames
let mut frame_count = 0;
let max_frames = 100; // Process up to 100 frames
while frame_count < max_frames {
// Read a frame
let mut frame = Mat::default();
if !cap.read(&mut frame)? || frame.empty() {
break;
}
// Convert to grayscale
let mut gray = Mat::default();
imgproc::cvt_color(&frame, &mut gray, imgproc::COLOR_BGR2GRAY, 0)?;
// Apply Gaussian blur
let mut blurred = Mat::default();
imgproc::gaussian_blur(
&gray,
&mut blurred,
opencv::core::Size::new(5, 5),
1.5,
1.5,
opencv::core::BORDER_DEFAULT,
)?;
// Apply Canny edge detection
let mut edges = Mat::default();
imgproc::canny(&blurred, &mut edges, 50.0, 150.0, 3, false)?;
// Convert edges back to BGR for display and saving
let mut edges_bgr = Mat::default();
imgproc::cvt_color(&edges, &mut edges_bgr, imgproc::COLOR_GRAY2BGR, 0)?;
// Add text with frame number
imgproc::put_text(
&mut edges_bgr,
&format!("Frame: {}", frame_count),
opencv::core::Point::new(10, 30),
imgproc::FONT_HERSHEY_SIMPLEX,
1.0,
Scalar::new(0.0, 255.0, 0.0, 0.0), // Green
2,
imgproc::LINE_AA,
false,
)?;
// Display the frame
highgui::imshow("Video Processing", &edges_bgr)?;
// Write the frame to output video
writer.write(&edges_bgr)?;
// Wait for a key press (1ms delay)
let key = highgui::wait_key(1)?;
if key > 0 && key != 255 {
// Break if any key is pressed
break;
}
frame_count += 1;
}
println!("Processed {} frames", frame_count);
println!("Processed video saved as processed_video.avi");
Ok(())
}
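The codec passed to the video writer is identified by a FourCC code, which is just four ASCII characters packed into a 32-bit integer with the first character in the lowest byte. A small stand-alone sketch of that packing follows; the free functions `fourcc` and `fourcc_to_string` are written here for illustration and are not part of the opencv crate.

```rust
/// Pack four ASCII characters into a FourCC code,
/// with the first character in the least significant byte.
fn fourcc(c1: char, c2: char, c3: char, c4: char) -> i32 {
    let byte = |c: char| (c as u32) & 0xFF;
    (byte(c1) | (byte(c2) << 8) | (byte(c3) << 16) | (byte(c4) << 24)) as i32
}

/// Unpack a FourCC code back into its four characters,
/// useful for printing the codec reported by a capture device.
fn fourcc_to_string(code: i32) -> String {
    (code as u32).to_le_bytes().iter().map(|&b| b as char).collect()
}
```

For example, `fourcc('M', 'J', 'P', 'G')` evaluates to `0x47504A4D`, and `fourcc_to_string` turns that value back into `"MJPG"`.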
Conclusion
Rust’s ecosystem for computer vision and image processing has matured significantly, offering a comprehensive set of tools and libraries for building efficient and reliable vision applications. From low-level image manipulation to high-level machine learning integration, Rust provides the building blocks needed to tackle the unique challenges of computer vision development.
The key takeaways from this exploration of Rust’s computer vision ecosystem are:
- Strong foundations with libraries like image and imageproc providing robust image processing capabilities
- OpenCV bindings enabling access to a vast array of computer vision algorithms
- Deep learning integration through libraries like Tract for neural network inference
- Real-time processing capabilities for video and camera input
- Safety and performance that make Rust ideal for vision applications
As computer vision technology continues to evolve, Rust’s focus on performance, safety, and expressiveness makes it an excellent choice for developers building the next generation of vision applications. Whether you’re creating autonomous systems, augmented reality experiences, medical imaging tools, or any other computer vision application, Rust’s ecosystem provides the tools you need to succeed.