Nagare Media Engine: Task Error Recovery in MPEG NBMP Workflows Through Event Sourcing

Neugebauer, Matthias


Abstract

Multimedia workflows have become complex distributed systems that are deployed in a multi-cloud and multi-edge fashion. Such systems are prone to errors in either hardware or software. Modern multimedia workflows therefore need an appropriate error-handling strategy. Because tasks are often computationally expensive and run for a long time, simply restarting a failed task and redoing its prior work is an inadequate solution. Instead, tasks should create regular checkpoints and continue from the last good state after a restart. We propose an event sourcing approach in which state changes are recorded as published events and can be replayed when needed. In this paper, we adopt this approach for Network-Based Media Processing (NBMP), an MPEG standard published as ISO/IEC 23090-8 that defines APIs and data models for network-distributed multimedia workflows. We implemented our approach as an extension to Nagare Media Engine, our existing open source NBMP implementation based on the Kubernetes platform. We evaluated our approach in scene detection and video encoding scenarios with simulated disruptions and observed significant speedups.
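
The recovery idea summarized in the abstract can be illustrated with a minimal sketch. It is not the Nagare Media Engine implementation or an NBMP API; the names `EventLog`, `replay`, and `run_task`, the event fields, and the in-memory list standing in for a durable event stream are all assumptions made for illustration. A task publishes one event per finished segment, and after a simulated disruption it replays the log to skip work that was already done.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EventLog:
    """Append-only event store (hypothetical; stands in for a durable stream)."""
    events: list = field(default_factory=list)

    def publish(self, event: dict) -> None:
        self.events.append(event)


def replay(log: EventLog) -> set:
    """Rebuild task state: indices of segments already recorded as done."""
    return {e["segment"] for e in log.events if e["type"] == "segment_done"}


def run_task(log: EventLog, segments: list, fail_at: Optional[int] = None) -> int:
    """Process segments not yet recorded as done.

    Returns the number of segments processed in this run. If fail_at is set,
    a disruption is simulated before that segment is processed.
    """
    done = replay(log)
    processed = 0
    for i, seg in enumerate(segments):
        if i in done:
            continue  # work recovered from the event log, not redone
        if fail_at is not None and i == fail_at:
            raise RuntimeError(f"simulated disruption at segment {i}")
        result = seg.upper()  # placeholder for expensive media processing
        log.publish({"type": "segment_done", "segment": i, "result": result})
        processed += 1
    return processed


log = EventLog()
segments = ["a", "b", "c", "d"]

try:
    run_task(log, segments, fail_at=2)  # first attempt stops before segment 2
except RuntimeError:
    pass

resumed = run_task(log, segments)  # restart continues from the last good state
```

In this sketch the restarted task only processes the two segments that were missing from the log, which is the effect the abstract attributes to checkpointing through published events.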

Keywords
nbmp; network-distributed multimedia processing; error recovery; event sourcing; encoding



Publication type
Research article in proceedings (conference)

Peer reviewed
Yes

Publication status
Published

Year
2024

Conference
ACM Multimedia Systems Conference

Venue
Bari

Book title
Proceedings of the 15th ACM Multimedia Systems Conference

Editor
Association for Computing Machinery

Start page
257

End page
263

Publisher
ACM Press

Place
Bari, Italy

Language
English

ISBN
979-8-4007-0412-3

DOI