Reverse engineering database queries from examples: State-of-the-art, challenges, and research opportunities

Martins DML


Abstract
With the popularization of data access and usage, an increasing number of users without expert knowledge of databases are required to interact with data. Often, these users face the challenges of writing and reformulating database queries, tasks that consume a considerable amount of time and frequently yield unsatisfactory results. To facilitate this human–database interaction, researchers have investigated the Query By Example (QBE) paradigm, in which database queries are (semi-)automatically discovered from data examples given by users. This paradigm allows non-database experts to formulate queries without relying on complex query languages. In this context, this work presents a systematic review of the recent developments, open challenges, and research opportunities of QBE reported in the literature. It also describes strategies employed to support efficient example acquisition and query reverse engineering. The results show that recent research has focused on enhancing the expressiveness of produced queries, minimizing user interaction, and enabling efficient query learning in the context of data retrieval, exploration, integration, and analytics. Our findings indicate that future research should concentrate on innovative solutions to the challenges of improving controllability and transparency, considering diverse user preferences when learning personalized queries, ensuring data quality, and improving support for additional SQL features and operators.

Keywords
Reverse engineering database queries, Databases, Query discovery, Query synthesis, Query learning



Publication type
Article in Journal

Peer reviewed
Yes

Publication status
Published

Year
2019

Journal
Information Systems

Volume
83

Page range
89 - 100

ISSN
0306-4379
