Semantic Clustering-Based Approach for Data Abstraction in Enterprise Architecture
The rapid pace of organizational digitalization brings constant change and a growing inventory of software applications within enterprises. As companies of all kinds become increasingly digital, Enterprise Architecture Management (EAM) has become indispensable for managing IT infrastructure and aligning it with strategic objectives. Digital transformation requires enterprises to oversee thousands of applications across departments and projects, which is a complex and critical task. One major challenge within EAM, and at the same time an operational opportunity, is Application Portfolio Rationalization (APR). APR is the task of detecting redundancies and rationalizing the application portfolio by scoping, evaluating, tracking, and analyzing the application landscape. This bachelor thesis proposes a semantic clustering-based approach for data abstraction in enterprise architecture. Natural Language Processing (NLP) models are used to convert textual data into computer-processable representations (embeddings), which serve as input for suitable clustering algorithms. A selection of appropriate models and algorithms is compared and evaluated in a detailed analysis. The proposed model, shown in Figure 1, produces segmented application clusters that help enterprises identify potential redundancies based on the seed features of these applications.
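The embed-then-cluster idea described above can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the application names and descriptions are hypothetical, a normalized bag-of-words vector stands in for a real NLP embedding model, and a simple similarity-threshold grouping (connected components over cosine similarity) stands in for the clustering algorithms the thesis compares. The stopword list and the threshold of 0.5 are likewise assumptions chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical application descriptions (illustrative data, not from the thesis).
APPS = {
    "PayFast": "invoice payment processing for suppliers",
    "BillMaster": "supplier invoice processing and payment",
    "HRPortal": "employee leave and payroll management",
    "StaffDesk": "employee payroll and leave requests",
}
STOPWORDS = {"and", "for", "the", "of"}  # assumed minimal stopword list


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def embed(text, vocab):
    """Toy stand-in for an NLP embedding: L2-normalized bag-of-words counts."""
    counts = Counter(tokenize(text))
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity of two already-normalized vectors (a dot product)."""
    return sum(x * y for x, y in zip(a, b))


def cluster(names, vectors, threshold):
    """Group items whose pairwise similarity reaches the threshold,
    merging transitively with a small union-find structure."""
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return sorted(groups.values())


names = list(APPS)
vocab = sorted({w for text in APPS.values() for w in tokenize(text)})
vectors = [embed(APPS[n], vocab) for n in names]
clusters = cluster(names, vectors, threshold=0.5)

for members in clusters:
    print("possible redundancy candidates:", members)
```

On this toy data the two invoicing tools land in one group and the two HR tools in another, which is the kind of segmentation a portfolio analyst would then review for redundancy. A bag-of-words vector cannot match "supplier" with "suppliers" or capture synonyms, which is one reason to use learned semantic embeddings instead.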