Towards Model-Driven Engineering for Big Data Analytics – An Exploratory Analysis of Domain-Specific Languages for Machine Learning
Graphical models and general purpose inference algorithms are powerful tools for moving from imperative towards declarative specification of machine learning problems. Although graphical models define the principle information necessary to adapt inference algorithms to specific probabilistic models, entirely model-driven development is not yet possible. However, generating executable code from graphical models could have several advantages. It could reduce the skills necessary to implement probabilistic models and may speed up development processes. Both advantages address pressing industry needs. They come along with increased supply of data scientist labor, the demand of which cannot be fulfilled at the moment. To explore the opportunities of model-driven big data analytics, I review the main modeling languages used in machine learning as well as inference algorithms and corresponding software implementations. Gaps hampering direct code generation from graphical models are identified and closed by proposing an initial conceptualization of a domain-specific modeling language.