Lucino Garcia

Data Analyst

View project on GitHub

Huberman Lab Podcast Transcript Extraction - Data Scraping

  • Transcripts for each episode were scraped from HubermanTranscripts.com using Python’s Selenium package.
  • The data is current as of April 4, 2024.
  • To reduce computational load and limit the traffic sent to the website, episode links were extracted directly from an HTML element on the site.
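
As a rough illustration of the link-extraction step, the sketch below pulls every link out of one container element in a page's HTML. It is not the project's actual script: it uses Python's standard-library `html.parser` instead of Selenium so that it runs without a browser, and the element id `episode-list`, the sample markup, and the `example.com` URLs are all hypothetical. In a Selenium workflow, the HTML string would typically come from `driver.page_source`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ContainerLinkParser(HTMLParser):
    """Collect href values of <a> tags nested inside one container element."""

    def __init__(self, container_id, base_url):
        super().__init__()
        self.container_id = container_id  # hypothetical id of the element holding the links
        self.base_url = base_url          # used to resolve relative hrefs
        self.container_tag = None         # tag name of the container, set once it is found
        self.depth = 0                    # > 0 while the parser is inside the container
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth == 0:
            # Outside the container: only watch for the container itself.
            if attrs.get("id") == self.container_id:
                self.container_tag = tag
                self.depth = 1
            return
        if tag == self.container_tag:
            # A nested element with the same tag name; track it so the
            # matching end tag does not close the container early.
            self.depth += 1
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if self.depth > 0 and tag == self.container_tag:
            self.depth -= 1


def extract_links(html, container_id, base_url):
    """Return absolute URLs for all links found inside the given container."""
    parser = ContainerLinkParser(container_id, base_url)
    parser.feed(html)
    return parser.links


# Hypothetical markup standing in for the real page source.
SAMPLE_HTML = """
<nav><a href="/about">About</a></nav>
<ul id="episode-list">
  <li><a href="/episode-1">Episode 1</a></li>
  <li><ul><li><a href="/episode-2">Episode 2</a></li></ul></li>
  <li><a href="https://example.com/episode-3">Episode 3</a></li>
</ul>
<footer><a href="/contact">Contact</a></footer>
"""

if __name__ == "__main__":
    for url in extract_links(SAMPLE_HTML, "episode-list", "https://example.com"):
        print(url)
```

Reading all links from a single element means the index page is loaded once, and only the transcript pages themselves need to be requested afterwards.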

Scraping_Script.png

Link to Script
GitHub Repository for Huberman Lab Podcast NLP Topic Extraction Project