Model-Driven Algorithm Selection - An Interactive Analysis Tool for Assessing Algorithm Selection Scenarios

In his seminal work, John Rice (1976) defined the Algorithm Selection Problem roughly as follows: given a portfolio of optimization algorithms and a set of problem instances, one wants to find a model that picks the (ideally) best algorithm for an unseen problem instance. Although this work is more than four decades old, its success story only began in recent years. Notably, algorithm selectors have proven successful across a wide variety of optimization problems. To foster transparency in research, the experimental data for several scenarios in which algorithm selection has been applied successfully is publicly available in the Algorithm Selection Library (ASLib).

Within this thesis, the student will have to (a) train competitive algorithm selection models for the scenarios stored in the ASLib, and (b) implement a dashboard/GUI that displays the data and processed results in a user-friendly and visually appealing way, thereby enabling the user to analyze the underlying data.

To process the data and results in a meaningful way, the ASLib scenarios shall be integrated into the (publicly accessible) online platform for machine learning: OpenML. Here, it may be necessary to create separate scenarios per supervised learning type (classification vs. regression) and performance measure (PAR10, ERT, solution quality, HV contribution, etc.). Once all data sets have been integrated into OpenML, several promising machine learning algorithms need to be trained (and possibly even tuned) to find a powerful algorithm selection model. For this purpose, we strongly recommend using mlr, as it is a very powerful and probably the most comprehensive R package for machine learning, and it also provides a convenient interface to OpenML.
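To illustrate the first part, the following is a minimal sketch of how an algorithm selector could be trained with mlr. It assumes the scenario has already been flattened into a data frame `scenario_df` of instance features together with a (hypothetical) label column `best_algo` naming the best-performing algorithm per instance; both names, as well as the choice of a random forest, are illustrative assumptions rather than part of the thesis specification.

```r
library(mlr)

# Treat algorithm selection as a classification problem:
# predict the best algorithm from the instance features.
task <- makeClassifTask(id = "as-scenario", data = scenario_df,
                        target = "best_algo")
lrn  <- makeLearner("classif.randomForest")

# Estimate selection accuracy via 10-fold cross-validation.
rdesc <- makeResampleDesc("CV", iters = 10)
res   <- resample(lrn, task, rdesc, measures = mmce)

# Fit the final selector on all data and apply it to a new instance
# (here, `new_instance_features` is a one-row data frame of features).
model <- train(lrn, task)
pred  <- predict(model, newdata = new_instance_features)
```

Alternative formulations, e.g., regression models that predict each algorithm's performance and select the argmin, fit the same mlr workflow by swapping the task and learner types.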
In the second (and possibly even more challenging) part of the thesis, the student needs to develop a dashboard (using the R package shiny) that enables its user to import and analyze a given algorithm selection scenario, similarly to the Python tool ASAPy or the recently published profiling tool IOHprofiler. Hence, the dashboard shall on the one hand provide (user-friendly) functionalities for visualizing and analyzing the underlying data (originally stored within the ASLib), and on the other hand also provide the means to analyze the algorithm selection models that have been trained on it. Given the rising demand for explainable machine learning, the dashboard also needs to support further investigations into the explainability of the models (see, e.g., the work on LIME by Ribeiro et al. for further details).
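A minimal skeleton of such a shiny dashboard could look as follows. It assumes the scenario's performance data has already been read into a long-format data frame `perf_df` with (hypothetical) columns `instance_id`, `algorithm`, and `runtime`; the column names and the box-plot view are illustrative assumptions, not a prescribed design.

```r
library(shiny)
library(ggplot2)

ui <- fluidPage(
  titlePanel("ASLib Scenario Explorer"),
  sidebarLayout(
    sidebarPanel(
      # Let the user pick one or more algorithms to compare.
      selectInput("algo", "Algorithm",
                  choices = unique(perf_df$algorithm), multiple = TRUE)
    ),
    mainPanel(plotOutput("perf_plot"))
  )
)

server <- function(input, output) {
  output$perf_plot <- renderPlot({
    # Show the runtime distribution of the selected algorithms
    # on a log scale, as runtimes typically span orders of magnitude.
    df <- subset(perf_df, algorithm %in% input$algo)
    ggplot(df, aes(x = algorithm, y = runtime)) +
      geom_boxplot() +
      scale_y_log10()
  })
}

shinyApp(ui, server)
```

From this skeleton, further tabs for model diagnostics and explainability views (e.g., LIME explanations of individual predictions) can be added as additional reactive outputs.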


References

  • Bischl, B., Lang, M., Kotthoff, L., Schiffner, J., Richter, J., Studerus, E., Casalicchio, G. and Jones, Z.M. (2016).
    mlr: Machine Learning in R.
    In: Journal of Machine Learning Research (JMLR), Volume 17, No. 170, pp. 1–5.
  • Bischl, B., Kerschke, P., Kotthoff, L., Lindauer, M., Malitsky, Y., Fréchette, A., Hoos, H. H., Hutter, F., Leyton-Brown, K., Tierney, K. and Vanschoren, J. (2016).
    ASlib: A Benchmark Library for Algorithm Selection.
    In: Artificial Intelligence Journal (AIJ), Volume 237, pp. 41–58.
    URL: https://www.sciencedirect.com/science/article/abs/pii/S0004370216300388
  • Casalicchio, G., Bossek, J., Lang, M., Kirchhoff, D., Kerschke, P., Hofner, B., Seibold, H., Vanschoren, J. and Bischl, B. (2017).
    OpenML: An R Package to Connect to the Machine Learning Platform OpenML.
    In: Computational Statistics, pp. 1–15.
    URL: https://link.springer.com/article/10.1007/s00180-017-0742-2
  • Rice, J. (1976).
    The Algorithm Selection Problem.
    In: Advances in Computers, Volume 15, pp. 65–118.
    URL: http://www.sciencedirect.com/science/article/pii/S0065245808605203
  • Ribeiro, M.T., Singh, S. and Guestrin, C. (2016).
    "Why Should I Trust You?": Explaining the Predictions of Any Classifier.
    In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16),
    New York, NY, USA, pp. 1135–1144, ACM.
    DOI: 10.1145/2939672.2939778. URL: https://doi.acm.org/10.1145/2939672.2939778.
  • Vanschoren, J., van Rijn, J.N., Bischl, B. and Torgo, L. (2013).
    OpenML: Networked Science in Machine Learning.
    In: ACM SIGKDD Explorations Newsletter, Volume 15, No. 2, pp. 49–60.
    URL: https://dl.acm.org/citation.cfm?id=2641198
  • Wang, H., Ye, F., Doerr, C., van Rijn, S. and Bäck, T. (2018).
    IOHprofiler: A New Tool for Benchmarking and Profiling Iterative Optimization Heuristics.
    In: arXiv preprint.
    URL: https://arxiv.org/abs/1810.05281