Extracting information from images to enhance the performance of text clustering algorithms

Researchers today face large, potentially unbounded volumes of social media data that must be processed in near real time to extract the desired information. Many algorithms have therefore been developed to cluster text data and identify coherent discussions. However, social media data consists not only of text but also of images and videos. In this thesis, the existing text stream clustering algorithm “textClust” should be extended to incorporate information from image data. This information should be extracted and transformed into text using a model that can interpret images, such as Google’s Vision API or Aleph Alpha’s MAGMA model. Students should perform a literature review of current approaches and their technical background, implement one or more suitable approaches in textClust, and conduct a benchmark study comparing textClust’s performance with and without information from images. Good knowledge of Python is required for the implementation.
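
To illustrate the general idea of turning image information into text before clustering, a minimal Python sketch might look as follows. All names here (`Post`, `enrich_post`, `dummy_describe_image`) are hypothetical and introduced only for illustration; they are not part of textClust, the Vision API, or MAGMA. The image model is represented by a placeholder function that a real implementation would replace with a call to the chosen service.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Post:
    """A social media post with optional attached image (hypothetical type)."""
    text: str
    image_path: Optional[str] = None


def enrich_post(post: Post, describe_image: Callable[[str], str]) -> str:
    """Return the post text, extended by a textual image description if one exists."""
    if post.image_path is None:
        return post.text
    description = describe_image(post.image_path).strip()
    if not description:
        # The model returned nothing usable; fall back to the plain text.
        return post.text
    return f"{post.text} {description}".strip()


def dummy_describe_image(image_path: str) -> str:
    # Placeholder (assumption): a real implementation would query an
    # image-interpreting model here and return its labels or caption.
    return "dog park frisbee"


if __name__ == "__main__":
    posts = [
        Post("Great day out"),
        Post("Great day out", image_path="img.jpg"),
    ]
    for post in posts:
        # The enriched string is what would be passed to the text clusterer.
        print(enrich_post(post, dummy_describe_image))
```

Under this assumption, the benchmark described above could compare clustering results on `post.text` alone against results on the enriched text, keeping the clustering configuration otherwise identical.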