Dynamic Web Scraping Using GPT-4o: Integrating Visual AI for Adaptive Data Extraction
This paper presents a concept for a dynamic web scraper that automatically navigates a website to reach the desired data without knowing the website’s code in advance. To this end, classic web scraping technologies such as Selenium are combined with recent advances in Large Language Models (LLMs). GPT-4o provides the basis for the web scraper's understanding of the website. The LLM, released by OpenAI in May 2024, offers improved performance, especially in interpreting images [1]. By analyzing rendered screenshots of the website, GPT-4o decides how the web driver should interact with it, for example by clicking buttons or entering strings in input fields. In an iterative process, GPT-4o navigates through the website until it finds the desired data.
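The iterative process described above can be illustrated with a minimal Python sketch. It is not the implementation from this paper. The `Action` type, the JSON reply format, and the names `parse_action`, `navigate`, `take_screenshot`, `ask_model`, and `perform` are all hypothetical and were chosen only for illustration. In a real setup, `take_screenshot` could wrap Selenium's `driver.get_screenshot_as_png()`, `ask_model` could send the screenshot and the goal to GPT-4o, and `perform` could translate an action into web driver calls. Here they are plain callables so the control flow can be read and tested on its own.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    """One step chosen by the model (hypothetical format, assumed for this sketch)."""
    kind: str         # "click", "type", or "done"
    target: str = ""  # label of the element to interact with
    text: str = ""    # text to type, or the extracted data when kind == "done"


def parse_action(reply: str) -> Action:
    """Parse the model's JSON reply into an Action and reject unknown kinds."""
    data = json.loads(reply)
    kind = data.get("kind")
    if kind not in {"click", "type", "done"}:
        raise ValueError(f"unsupported action: {kind!r}")
    return Action(kind=kind, target=data.get("target", ""), text=data.get("text", ""))


def navigate(
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[bytes, str], str],
    perform: Callable[[Action], None],
    goal: str,
    max_steps: int = 10,
) -> Optional[str]:
    """Loop: screenshot -> model decision -> web driver action, until done.

    Returns the extracted data, or None if max_steps is reached first.
    """
    for _ in range(max_steps):
        screenshot = take_screenshot()                    # rendered page as image bytes
        action = parse_action(ask_model(screenshot, goal))
        if action.kind == "done":                         # model reports the data was found
            return action.text
        perform(action)                                   # click or type via the web driver
    return None
```

The step limit is a design choice of this sketch: it keeps the loop from running indefinitely when the model never reports that the data was found.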