Ur-Score

Hybrid Edge AI in Production: On-Device Computer Vision with Dynamic Models and Continuous Training

Introduction

Deploying AI to mobile devices is no longer just about shrinking models—it requires rethinking the entire system architecture.

This case study explores a production-grade hybrid edge AI system that combines:

  • On-device computer vision
  • Dynamic model delivery
  • Cloud-assisted inference fallback
  • Continuous training pipelines powered by real user data

The system supports 29 wildlife species, processes images and videos, and operates reliably in environments with unpredictable connectivity.

The Client

A mobile-first outdoor technology platform building an AI-native application for real-time wildlife analysis, requiring high reliability in low-connectivity, field-based environments.

The Problem

1. Static Model Deployment Doesn’t Scale

Traditional mobile ML approaches bundle models with the app:

  • Requires app updates for every model improvement
  • Increases app size significantly
  • Limits flexibility across species or use cases

2. Pure Edge or Pure Cloud Is Insufficient

  • Edge-only systems are limited by device compute, memory, and battery constraints
  • Cloud-only systems fail in offline environments

A hybrid approach was required.

3. Training Data Bottleneck

  • High-quality labeled data is hard to source
  • Synthetic or pre-trained datasets lack domain specificity
  • Need for continuous learning from real-world usage

4. Multi-Modal Input Complexity

The system needed to support:

  • Single images
  • Multiple images
  • Video inputs

All while maintaining consistent measurement outputs.

The Hybrid Edge AI Solution

The system was designed as a bi-directional loop: models flow down to devices for inference, and real-world data flows back up to training.

Key Capabilities
  • Dynamic model downloads (on-demand)
  • On-device inference using ONNX models
  • Cloud inference fallback when available
  • Continuous training pipeline using real user data
  • Support for video and multi-image inputs
  • Pre-inference species validation model

Architecture

1. Model Distribution Layer

Instead of bundling models in the app:

  • Models are downloaded on demand
  • Users fetch only:
    • Relevant species models
    • Required capabilities

Benefits:

  • Reduced app size
  • Faster updates without app releases
  • Fine-grained control over model deployment
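The on-demand download logic above can be sketched as a pure selection function: compare a remote model manifest against what is installed locally, scoped to the species the user actually needs. All names here (`ModelEntry`, `selectDownloads`) are illustrative assumptions, not the production API.

```typescript
// Sketch: deciding which species models to fetch from a model manifest.
// Shapes and names are hypothetical, for illustration only.

interface ModelEntry {
  species: string;   // e.g. "elk"
  version: number;   // monotonically increasing release number
  sizeBytes: number;
}

/**
 * Return manifest entries worth downloading: models for species the user
 * has selected that are either missing locally or outdated.
 */
function selectDownloads(
  manifest: ModelEntry[],
  installed: Map<string, number>,   // species -> installed version
  selectedSpecies: Set<string>,
): ModelEntry[] {
  return manifest.filter((entry) => {
    if (!selectedSpecies.has(entry.species)) return false; // not needed
    const local = installed.get(entry.species);
    return local === undefined || local < entry.version;   // missing or stale
  });
}
```

Keeping this a pure function makes the download policy trivially testable, independent of the network layer that actually fetches the files.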

2. On-Device Inference (Primary Path)

  • Uses ONNX models optimized for mobile
  • Executes:
    • Detection
    • Measurement
    • Feature extraction

Runs entirely offline when needed.

3. Cloud Inference (Fallback Layer)

When connectivity is available:

  • The app can switch to online models
  • Enables:
    • Higher accuracy models
    • More compute-intensive processing

Hybrid Logic:

  • Default → on-device
  • Upgrade → cloud (when available)
  • Fallback → on-device (when offline)
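The three-line routing rule above is small enough to capture directly. This is a minimal sketch with assumed names; the production router would also weigh battery, model availability, and request size.

```typescript
// Sketch of the edge/cloud routing rule: default to on-device, upgrade
// to cloud only when online and opted in, fall back to on-device offline.

type InferencePath = "on-device" | "cloud";

interface RoutingContext {
  online: boolean;         // current connectivity
  cloudPreferred: boolean; // opted into higher-accuracy cloud models
}

function chooseInferencePath(ctx: RoutingContext): InferencePath {
  if (ctx.online && ctx.cloudPreferred) return "cloud";
  return "on-device"; // offline fallback is the same as the default path
}
```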

4. Input Processing Pipeline

Supports multiple input modes:

a. Single Image

  • Fast inference path

b. Multi-Image

  • Aggregates measurements across frames
  • Improves accuracy via redundancy
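One way to realize "accuracy via redundancy" is a robust aggregate over per-image measurements. Using the median here is our assumption for illustration (it discards outlier frames); the production system may use a different estimator.

```typescript
// Sketch: aggregating per-image measurements (e.g. a length in inches)
// into one estimate. Median is an assumed, outlier-robust choice.

function aggregateMeasurements(perFrame: number[]): number {
  if (perFrame.length === 0) throw new Error("no measurements");
  const sorted = [...perFrame].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}
```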

c. Video Input

  • Extracts frames
  • Applies inference across sequence
  • Selects optimal frames for measurement
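Frame selection can be sketched as ranking decoded frames by detector confidence and keeping the top k. The `Frame` shape and scoring criterion are assumptions; in practice "optimal" may also consider blur and subject pose.

```typescript
// Sketch: pick the k highest-confidence frames from a video, returned
// in original video order. Frame scores come from the detection model.

interface Frame {
  index: number;
  confidence: number; // detector confidence that the subject is fully visible
}

function selectOptimalFrames(frames: Frame[], k: number): number[] {
  return [...frames]
    .sort((a, b) => b.confidence - a.confidence) // best frames first
    .slice(0, k)
    .map((f) => f.index)
    .sort((a, b) => a - b); // restore temporal order for measurement
}
```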

5. Pre-Validation Classification Model

Before running heavy inference:

  • A lightweight classification model:
    • Detects species
    • Validates input

If the detected species does not match the selected model:

  • The input is rejected early
  • Incorrect scoring is prevented

This reduces:

  • False positives
  • Unnecessary compute
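The gate described above reduces to two checks before heavy inference runs: is the classifier confident enough, and does its prediction match the model the user selected? The threshold value and all names here are illustrative assumptions.

```typescript
// Sketch of the pre-inference validation gate. The 0.6 threshold is an
// assumed default; in practice it would be tuned per species.

interface GateResult {
  proceed: boolean;
  reason?: string;
}

function validateSpecies(
  predicted: string,
  predictedConfidence: number,
  selectedModelSpecies: string,
  minConfidence = 0.6,
): GateResult {
  if (predictedConfidence < minConfidence) {
    return { proceed: false, reason: "low-confidence input" };
  }
  if (predicted !== selectedModelSpecies) {
    return { proceed: false, reason: "species mismatch" };
  }
  return { proceed: true }; // safe to run detection + measurement
}
```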

6. Training & Feedback Pipeline

A critical differentiator of the system:

Step 1: Data Collection

  • User-uploaded images/videos
  • Stored with metadata

Step 2: Admin Review

  • Internal admin app used for:
    • Validation
    • Annotation
    • Quality control

Step 3: Training Pipeline

  • Triggered via on-demand Azure Functions
  • Automates:
    • Data preprocessing
    • Model retraining
    • Evaluation

Step 4: Model Deployment

  • Updated models are:
    • Packaged
    • Made available for download

This creates a continuous improvement loop.

7. Mobile Application Layer

  • Built using React Native
  • Integrates:
    • Camera APIs
    • Local storage
    • Model management system

Handles:

  • Model versioning
  • Download orchestration
  • Inference routing (edge vs cloud)

Implementation Approach

Phase 1: Prototype Validation

  • Built initial CV pipeline (as shown in the prototype flow)
  • Validated measurement feasibility

Phase 2: Edge Model Deployment

  • Converted models to ONNX format
  • Integrated on-device inference

Phase 3: Hybrid Inference Layer

  • Implemented routing between:
    • Offline models
    • Cloud models

Phase 4: Model Distribution System

  • Built dynamic model download system
  • Introduced versioning and updates

Phase 5: Training Pipeline Automation

  • Integrated Azure Functions
  • Enabled continuous retraining

Phase 6: Multi-Modal Input Support

  • Added:
    • Video processing
    • Multi-image aggregation

Results

System Performance

  • Real-time inference on-device
  • Seamless switching between edge and cloud

Scalability

  • Models updated without app releases
  • Distributed model delivery

Accuracy Improvements

  • Continuous learning from real-world data
  • Reduced false positives via classification gate

Reliability

  • Fully functional in low or no connectivity
  • Graceful degradation between inference modes

Key Technical Learnings

1. Hybrid AI > Edge-Only or Cloud-Only

A dual-path system provides:

  • Reliability (edge)
  • Accuracy (cloud)

2. Model Distribution Is a First-Class Problem

Dynamic downloads enable:

  • Faster iteration
  • Smaller app footprint
  • Better user targeting

3. Feedback Loops Drive Model Quality

Production data is:

  • More valuable than synthetic datasets
  • Essential for domain-specific accuracy

4. Input Validation Saves Compute

Early classification:

  • Prevents wasted inference
  • Improves system robustness

5. Multi-Frame Processing Improves Accuracy

Video + multi-image inputs:

  • Reduce noise
  • Improve measurement consistency

Business Impact (Technical Lens)

  • Reduced dependency on app store release cycles
  • Lower cloud inference costs via edge-first design
  • Improved model accuracy through continuous retraining
  • Created a scalable ML platform, not just a feature

Who This Is Ideal For

  • Teams building edge AI systems with continuous learning
  • Mobile AI platforms needing dynamic model updates
  • Applications operating in unreliable connectivity environments
  • Products requiring real-world data feedback loops

If you’re building AI systems that must operate reliably in production environments, the real challenge isn’t just inference; it’s:

  • Model lifecycle management
  • Hybrid execution
  • Continuous learning

The teams that solve this will define the next generation of AI-native products.