Hybrid Edge AI in Production: On-Device Computer Vision with Dynamic Models and Continuous Training
Introduction
Deploying AI to mobile devices is no longer just about shrinking models—it requires rethinking the entire system architecture.
This case study explores a production-grade hybrid edge AI system that combines:
- On-device computer vision
- Dynamic model delivery
- Cloud-assisted inference fallback
- Continuous training pipelines powered by real user data
The system supports 29 wildlife species, processes images and videos, and operates reliably in environments with unpredictable connectivity.
The Client
A mobile-first outdoor technology platform building an AI-native application for real-time wildlife analysis, requiring high reliability in low-connectivity, field-based environments.
The Problem
1. Static Model Deployment Doesn’t Scale
Traditional mobile ML approaches bundle models with the app:
- Requires app updates for every model improvement
- Increases app size significantly
- Limits flexibility across species or use cases
2. Pure Edge or Pure Cloud Is Insufficient
- Edge-only systems: limited by device compute, memory, and battery constraints
- Cloud-only systems: fail entirely in offline environments
A hybrid approach was required.
3. Training Data Bottleneck
- High-quality labeled data is hard to source
- Synthetic or pre-trained datasets lack domain specificity
- Need for continuous learning from real-world usage
4. Multi-Modal Input Complexity
The system needed to support:
- Single images
- Multiple images
- Video inputs
All while maintaining consistent measurement outputs.
The Hybrid Edge AI Solution
The system was designed as a bi-directional loop: inference flows from the cloud down to devices, and training data flows from devices back up to the cloud.
Key Capabilities
- Dynamic model downloads (on-demand)
- On-device inference using ONNX models
- Cloud inference fallback when available
- Continuous training pipeline using real user data
- Support for video and multi-image inputs
- Pre-inference species validation model
Architecture
1. Model Distribution Layer
Instead of bundling models in the app:
- Models are downloaded on demand
- Users fetch only:
  - Relevant species models
  - Required capabilities
Benefits:
- Reduced app size
- Faster updates without app releases
- Fine-grained control over model deployment
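The download decision above can be sketched as a manifest check. This is a minimal, stdlib-only illustration, not the production client: the `ModelEntry` shape, the integer versions, and the URLs are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelEntry:
    species: str   # which species this model covers
    version: int   # monotonically increasing manifest version (assumed scheme)
    url: str       # download location (illustrative)

def models_to_fetch(manifest, installed, wanted_species):
    """Return manifest entries worth downloading: only models for species
    the user actually needs, and only when the local copy is missing or
    older than the manifest version."""
    to_fetch = []
    for entry in manifest:
        if entry.species not in wanted_species:
            continue  # skip models the user has no use for
        local_version = installed.get(entry.species, -1)
        if entry.version > local_version:
            to_fetch.append(entry)
    return to_fetch
```

Because the check runs per species, the app never pays for models outside the user's selection, which is what keeps the install footprint small.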
2. On-Device Inference (Primary Path)
- Uses ONNX models optimized for mobile
- Executes:
  - Detection
  - Measurement
  - Feature extraction
Runs entirely offline when needed.
3. Cloud Inference (Fallback Layer)
When connectivity is available:
- The app can switch to online models
- Enables:
  - Higher-accuracy models
  - More compute-intensive processing
Hybrid Logic:
- Default → on-device
- Upgrade → cloud (when available)
- Fallback → on-device (when offline)
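The default/upgrade/fallback logic above reduces to a small routing function. A hedged sketch (the function name and the string labels are illustrative, not the app's actual API):

```python
def choose_inference_path(online: bool, cloud_model_available: bool,
                          local_model_ready: bool) -> str:
    """Pick an execution path per the hybrid logic:
    upgrade to cloud when reachable, otherwise default to on-device,
    which also serves as the offline fallback."""
    if online and cloud_model_available:
        return "cloud"      # upgrade: higher-accuracy remote model
    if local_model_ready:
        return "on-device"  # default path and offline fallback
    raise RuntimeError("no model available: offline with no local model downloaded")
```

The error branch is the case the distribution layer is designed to prevent: as long as the relevant species model has been downloaded, the device always has a usable path.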
4. Input Processing Pipeline
Supports multiple input modes:
a. Single Image
- Fast inference path
b. Multi-Image
- Aggregates measurements across frames
- Improves accuracy via redundancy
c. Video Input
- Extracts frames
- Applies inference across sequence
- Selects optimal frames for measurement
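The multi-image and video paths share two steps: pick the best frames, then aggregate per-frame measurements. A stdlib-only sketch under stated assumptions (a sharpness score such as Laplacian variance is computed upstream; median aggregation is one reasonable choice, not necessarily the production one):

```python
from statistics import median

def select_frames(frame_scores, k=5):
    """Keep the k sharpest frames; indices are returned in original
    (temporal) order so downstream steps see a coherent sequence."""
    ranked = sorted(range(len(frame_scores)),
                    key=lambda i: frame_scores[i], reverse=True)
    return sorted(ranked[:k])

def aggregate_measurements(per_frame_values):
    """Median across frames suppresses outlier measurements from
    blurred or partially occluded frames."""
    return median(per_frame_values)
```

This is why redundancy improves accuracy: a single bad frame can dominate a single-image estimate, but it barely moves a median over several frames.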
5. Pre-Validation Classification Model
Before running heavy inference:
- A lightweight classification model:
  - Detects species
  - Validates input
If mismatch:
- Rejects input early
- Prevents incorrect scoring
This reduces:
- False positives
- Unnecessary compute
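The gate can be sketched as a confidence check against the species the user selected. The 0.8 threshold and the dict-of-probabilities interface are illustrative assumptions, not values from the production classifier:

```python
def validate_input(class_probs, expected_species, threshold=0.8):
    """Run before the heavy measurement models: accept the input only if
    the lightweight classifier agrees with the expected species, with
    enough confidence. Anything else is rejected early."""
    predicted = max(class_probs, key=class_probs.get)
    if predicted != expected_species or class_probs[predicted] < threshold:
        return False  # wrong species or low confidence: skip heavy inference
    return True
```

Rejecting here costs one small model invocation; letting a mismatched input through costs a full detection-and-measurement pass plus a wrong score.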
6. Training & Feedback Pipeline
A critical differentiator of the system:
Step 1: Data Collection
- User-uploaded images/videos
- Stored with metadata
Step 2: Admin Review
- Internal admin app used for:
  - Validation
  - Annotation
  - Quality control
Step 3: Training Pipeline
- Triggered via on-demand Azure Functions
- Automates:
  - Data preprocessing
  - Model retraining
  - Evaluation
Step 4: Model Deployment
- Updated models are:
  - Packaged
  - Made available for download
This creates a continuous improvement loop.
7. Mobile Application Layer
- Built using React Native
- Integrates:
  - Camera APIs
  - Local storage
  - Model management system
Handles:
- Model versioning
- Download orchestration
- Inference routing (edge vs cloud)
Implementation Approach
Phase 1: Prototype Validation
- Built initial CV pipeline (as shown in the prototype flow)
- Validated measurement feasibility
Phase 2: Edge Model Deployment
- Converted models to ONNX format
- Integrated on-device inference
Phase 3: Hybrid Inference Layer
- Implemented routing between:
  - Offline models
  - Cloud models
Phase 4: Model Distribution System
- Built dynamic model download system
- Introduced versioning and updates
Phase 5: Training Pipeline Automation
- Integrated Azure Functions
- Enabled continuous retraining
Phase 6: Multi-Modal Input Support
- Added:
  - Video processing
  - Multi-image aggregation
Results
System Performance
- Real-time inference on-device
- Seamless switching between edge and cloud
Scalability
- Models updated without app releases
- Distributed model delivery
Accuracy Improvements
- Continuous learning from real-world data
- Reduced false positives via classification gate
Reliability
- Fully functional in low or no connectivity
- Graceful degradation between inference modes
Key Technical Learnings
1. Hybrid AI > Edge-Only or Cloud-Only
A dual-path system provides:
- Reliability (edge)
- Accuracy (cloud)
2. Model Distribution Is a First-Class Problem
Dynamic downloads enable:
- Faster iteration
- Smaller app footprint
- Better user targeting
3. Feedback Loops Drive Model Quality
Production data is:
- More valuable than synthetic datasets
- Essential for domain-specific accuracy
4. Input Validation Saves Compute
Early classification:
- Prevents wasted inference
- Improves system robustness
5. Multi-Frame Processing Improves Accuracy
Video + multi-image inputs:
- Reduce noise
- Improve measurement consistency
Business Impact (Technical Lens)
- Reduced dependency on app store release cycles
- Lower cloud inference costs via edge-first design
- Improved model accuracy through continuous retraining
- Created a scalable ML platform, not just a feature
Who This Is Ideal For
- Teams building edge AI systems with continuous learning
- Mobile AI platforms needing dynamic model updates
- Applications operating in unreliable connectivity environments
- Products requiring real-world data feedback loops
If you’re building AI systems that must operate reliably in production environments, the real challenge isn’t just inference. It’s:
- Model lifecycle management
- Hybrid execution
- Continuous learning
The teams that solve this will define the next generation of AI-native products.