[Data on the Mind 2017] Collecting data from the web

Фрилансерский Дизайнерский JavaScript
Views: 21
Upload date: 04.12.2023 16:53
Duration: 03:19:17
Category: Education

Description

Abstract: This workshop will explore different ways to collect data from the web with Python. Have you ever needed to copy and paste hundreds (or thousands!) of tables from different web pages? Or click through combinations of dropdown menu selections and download files? Are you interested in collecting social media or news data? After this workshop, you'll be well on your way to automating these processes. We will first consider getting data from RESTful APIs, walking through the documentation to build a query for a GET request and then writing the response to a CSV file. We'll then discuss web scraping, keeping in mind each website's Terms of Service and making sure we do not violate them. We will look at two ways of scraping: 1) parsing the HTML response of a GET request using BeautifulSoup, and 2) using the Selenium web driver to interact directly with dynamic web content.
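
As a rough illustration of the first topic in the abstract (building a GET query and writing the response to CSV), here is a minimal sketch that uses only the Python standard library. It is not taken from the workshop materials: the endpoint `api.example.com`, the query parameters, the field names, and the canned JSON response are all assumptions made up for the example, and the helper names `build_query_url` and `records_to_csv` are hypothetical.

```python
import csv
import io
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; a real API's documentation
# would specify the actual URL and the parameters it accepts.
BASE_URL = "https://api.example.com/v1/articles"


def build_query_url(base_url, params):
    """Append URL-encoded query parameters to the endpoint."""
    return f"{base_url}?{urlencode(params)}"


def records_to_csv(records, fieldnames, out_file):
    """Write a list of dicts to an open file object as CSV rows.

    Keys not listed in fieldnames are ignored rather than raising an error.
    """
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)


url = build_query_url(BASE_URL, {"q": "data science", "page": 1})

# In a live session the body would come from the network, for example:
#   from urllib.request import urlopen
#   body = urlopen(url).read().decode("utf-8")
# A canned JSON string (assumed shape) stands in here so the sketch runs offline.
body = (
    '{"results": ['
    '{"title": "First", "year": 2016, "extra": "x"}, '
    '{"title": "Second", "year": 2017}'
    "]}"
)
records = json.loads(body)["results"]

# An in-memory buffer keeps the example self-contained; swap it for
# open("articles.csv", "w", newline="") to write an actual file.
buffer = io.StringIO()
records_to_csv(records, ["title", "year"], buffer)
csv_text = buffer.getvalue()
```

Separating the URL construction from the CSV writing means the same `records_to_csv` helper could be reused for data obtained by scraping, as long as each record is a dictionary.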

Instructor: Christopher Hench (University of California, Berkeley)

---

Part of the Data on the Mind 2017 summer workshop: http://www.dataonthemind.org/2017-workshop

Funded by the Estes Fund: http://www.psychonomic.org/page/estesfund

Co-produced with the Berkeley D-Lab: http://dlab.berkeley.edu/

Organized in collaboration with Data on the Mind: http://www.dataonthemind.org

Videography by DeNoise Studios: http://www.denoise.com

Workshop hashtag: #dataonthemind
