Parse HTML
Extract text and links from HTML with BeautifulSoup.
Category: beautifulsoup4
Problem
Given an HTML document as a string, extract its visible text and the URL of every link using BeautifulSoup.
Solution
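from bs4 import BeautifulSoup
soup = BeautifulSoup(html, "html.parser")  # html: the markup to parse, as a str
text = soup.get_text(" ", strip=True)  # text of the page, pieces joined by single spaces
links = [a["href"] for a in soup.find_all("a", href=True)]  # skips <a> tags without href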
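The snippet above can be wrapped into a small reusable function. The following is a minimal sketch, not a definitive implementation: the function name `extract_text_and_links`, the `base_url` parameter, and the `sample_html` input are hypothetical names introduced for illustration. It assumes you also want relative links resolved against a base URL, which it does with the standard library's `urljoin`.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_text_and_links(html, base_url=None):
    """Return (text, links) for an HTML string (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop script/style elements so their contents cannot end up in the text.
    for tag in soup(["script", "style"]):
        tag.decompose()
    text = soup.get_text(" ", strip=True)
    # href=True keeps only <a> tags that actually carry an href attribute.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    if base_url is not None:
        # Assumption: relative links should be resolved against base_url.
        links = [urljoin(base_url, href) for href in links]
    return text, links


# Made-up sample input for illustration only.
sample_html = """
<html><body>
  <h1>Docs</h1>
  <p>Read the <a href="/guide">guide</a> or visit
     <a href="https://example.com/faq">the FAQ</a>.</p>
  <a name="top">no href here</a>
  <script>var x = 1;</script>
</body></html>
"""

text, links = extract_text_and_links(sample_html, base_url="https://example.com")
print(links)  # ['https://example.com/guide', 'https://example.com/faq']
```

The anchor without an `href` is skipped by the `href=True` filter, so the list comprehension never raises `KeyError`.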
Notes
- Adapt variable names and paths to your project
- Add error handling for production use, for example around reading or downloading the HTML before parsing it
- See related chapters in the Learning Path