Agentic Workflow
Apr 14, 2026 | 4:40 PM - 6:00 PM
Description
Traditional web scrapers rely on brittle rules, selectors, and assumptions about HTML structure that break whenever a page changes, especially on JavaScript-heavy or frequently updated sites. They also lack context awareness: they cannot tell whether a page represents a product, a job posting, or an article, which makes it difficult to extract the relevant information. Modern AI-driven approaches address this by processing web content more like a human would, that is, by understanding context, layout, and semantics instead of relying on fixed patterns. In this training, we introduce how agentic, LLM-driven scraping pipelines overcome these long-standing limitations. In the first part, we explore the architecture behind the Agentic Scraper: asynchronous fetching and parsing, concurrency through worker pools, and the design of rule-based and LLM-driven agents. In the second part, we run an end-to-end agentic scraping workflow and observe how different agent modes behave when given various URLs and contexts.
Level: Beginner‑friendly; basic Python familiarity is helpful but not required.
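
For attendees who want a preview, the architecture named in the description (asynchronous fetching, a worker pool, and a pluggable agent) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Agentic Scraper's actual code: `fetch`, `rule_based_agent`, `worker`, `scrape`, and `PageResult` are all hypothetical names, the fetch step returns canned HTML instead of making a network request, and the keyword rules are invented for the example.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class PageResult:
    url: str
    page_type: str


async def fetch(url: str) -> str:
    """Hypothetical stand-in for a real HTTP fetch; returns canned HTML."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    if "jobs" in url:
        return "<h1>Data Engineer</h1><p>Apply now</p>"
    if "shop" in url:
        return "<h1>Blue Mug</h1><span>Add to cart</span>"
    return "<h1>News</h1><p>Published today</p>"


def rule_based_agent(html: str) -> str:
    """Toy rule-based agent: classify a page by fixed keywords (assumed rules)."""
    text = html.lower()
    if "add to cart" in text:
        return "product"
    if "apply now" in text:
        return "job_posting"
    return "article"


async def worker(queue: asyncio.Queue, results: list, classify) -> None:
    """Pull URLs off the shared queue until the task is cancelled."""
    while True:
        url = await queue.get()
        try:
            html = await fetch(url)
            results.append(PageResult(url, classify(html)))
        finally:
            queue.task_done()  # always mark the item done so join() can finish


async def scrape(urls, num_workers: int = 3, classify=rule_based_agent):
    """Process all URLs concurrently with a fixed-size worker pool."""
    queue: asyncio.Queue = asyncio.Queue()
    for url in urls:
        queue.put_nowait(url)
    results: list = []
    workers = [
        asyncio.create_task(worker(queue, results, classify))
        for _ in range(num_workers)
    ]
    await queue.join()  # wait until every queued URL has been processed
    for task in workers:
        task.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return sorted(results, key=lambda r: r.url)


if __name__ == "__main__":
    demo_urls = [
        "https://example.com/shop/1",
        "https://example.com/jobs/2",
        "https://example.com/news/3",
    ]
    for result in asyncio.run(scrape(demo_urls)):
        print(result.url, result.page_type)
```

The `classify` parameter is where the agent mode would change: in this sketch, an LLM-driven agent would be any function that takes the page HTML and returns a label, passed in place of `rule_based_agent`.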