Does your algorithm care for copyrights, or for data protection regulation? | Photo: Markus Spiske | Free use

Web Scraping Social Media: Pitfalls of Copyright and Data Protection Law

The growing popularity of web scraping brings with it a plethora of legal questions. In our first article, we analyzed this trend and how the Terms of Service of social media platforms relate to it. In this article, we discuss further questions that web scraping raises under copyright law and data protection law. For copyright law, the German legal situation serves as an example.


A scientist sitting in front of a computer, looking at datasheets.
Data from social media platforms is increasingly used for scientific research. | Photo: National Cancer Institute/Unsplash

Web Scraping Social Media: Legitimate Research or a Breach of Contract?

To make full use of the massive amounts of social media platform data for scientific research, researchers increasingly obtain data with collection methods such as web scraping. Web scraping makes it possible to automatically access and retrieve information directly from social media web interfaces and other websites. The technical process involves two main steps: First, the website is accessed with the help of a webbot or web crawler. Second, the retrieved content is analyzed automatically and, where necessary, the relevant information is extracted.
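The two steps described above can be illustrated with a minimal Python sketch that uses only the standard library. It is an illustration under stated assumptions, not a description of any particular research tool: the function names, the `research-bot/0.1` user agent, the `post` CSS class, and the sample markup are all hypothetical, and real platforms structure their pages differently.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_page(url, user_agent="research-bot/0.1"):
    """Step 1: access the website the way a webbot or crawler would.

    The user agent string is a hypothetical placeholder.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class PostExtractor(HTMLParser):
    """Step 2: analyze the markup and keep the text of matching <p> elements."""

    def __init__(self, css_class="post"):
        super().__init__()
        self.css_class = css_class  # hypothetical class marking a post
        self.posts = []
        self._inside = False        # True while inside a matching <p>
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "p" and self.css_class in classes:
            self._inside = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._inside:
            self.posts.append("".join(self._buffer).strip())
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)


def extract_posts(html, css_class="post"):
    """Run the extractor over an HTML string and return the collected texts."""
    parser = PostExtractor(css_class)
    parser.feed(html)
    parser.close()
    return parser.posts


# Made-up markup standing in for a downloaded page, so the extraction
# step can be tried without any network access.
SAMPLE_HTML = """
<html><body>
  <p class="post">First public post</p>
  <p class="ad">Sponsored</p>
  <p class="post featured">Second <b>public</b> post</p>
</body></html>
"""

if __name__ == "__main__":
    for text in extract_posts(SAMPLE_HTML):
        print(text)
```

In practice, the string returned by `fetch_page` would be passed to `extract_posts`. Keeping the two steps in separate functions mirrors the distinction drawn in the text between accessing a website and extracting information from it.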