Category: Whisper and On-device AI Optimization
-
Whisper Fundamentals: Understanding OpenAI’s Speech Recognition Model Architecture
A tour of Whisper's encoder-decoder architecture, compute bottlenecks, and model-size trade-offs to understand before optimizing it for on-device mobile deployment.
-
Real-time Whisper Is a Battery Nightmare (Here’s How to Fix It)
Streaming Whisper drains 1% battery per minute. VAD, adaptive inference, and thermal management strategies to build production-ready on-device speech recognition.
-
On-Device Inference: Running Whisper Efficiently with ONNX and Core ML
ONNX Runtime with the Core ML execution provider beats native Core ML for Whisper on iOS. Here's why, with conversion scripts, memory tricks, and real benchmark data.
-
Optimizing Whisper for Mobile: Model Quantization and Compression Techniques
Comparing post-training quantization, static quantization, and QAT for deploying Whisper on mobile, with real implementation failures and the ONNX workaround that actually works.