Data scrapers win in court over grabbing public user data off social media

Data scrapers have scored a win in court, protecting their ability to grab public user data off sites and repurpose it for their own business activities. The ruling is a direct slap-down to LinkedIn, a professional networking site owned by Microsoft, and a warning shot to people naïve enough to think that data they enter on a social media network could never be captured and used without the network’s permission.

The data scrapers case in a nutshell

The court battle originated with hiQ Labs, a Californian business analytics firm, taking publicly available LinkedIn user data to create tools for predicting when a company should invest in an employee’s development – or when an employee might leave for a different company. LinkedIn did not approve this process and accused hiQ of being a free rider on its users’ data. A court battle ensued, with both parties suing each other and hiQ getting a preliminary ruling that enabled it to keep operating as the case made its way to the Ninth US Circuit Court of Appeals.

“Public user data” is the key issue

The appeals court ruled in favor of hiQ, with the judge explaining that the fight is over access to public user data, not restricted-access information. In her decision, Judge Marsha Berzon reportedly stated, “LinkedIn has no protected property interest in the data contributed by its users, as the users retain ownership over their profiles.” She added that, “as to the publicly available profiles, the users quite evidently intend them to be accessed by others.” This is a restatement of the (apparently) obvious: A public profile on a site is just that – public. It is not clear, however, how this affects the “Let recruiters know you’re open to opportunities” option under the Privacy section on the LinkedIn page.

The battle lines have been drawn

The fight centered on the 1986 Computer Fraud and Abuse Act (CFAA) and its sanctions on those who access a connected computer either “without authorization” or in a way that “exceeds authorized access.” LinkedIn was supported in its argument by Craigslist, the classified ad website. They warned that allowing data-scraping actors such as hiQ to operate could open the door to them finding fresh targets for unwanted emails, texts, or phone messages. On the hiQ side were the Electronic Frontier Foundation (EFF), alternative search engine DuckDuckGo, and the Internet Archive. These parties pointed out that scraping data off a site is not just for hackers, but has lots of beneficial uses. Examples cited by the EFF include investigating racial discrimination on Airbnb, uncovering price discrimination on Amazon, and even developing search engines.

What is data scraping and is it legal?

Technically, web scraping is automated internet browsing: a bot or crawler accesses and records the same data that a living person could collect by hand. In this case, the court ruled that the data scraping was OK because it collected publicly accessible data. However, other cases are pending, such as the appeal of the Facebook v. Power Ventures ruling. That case resulted in Facebook collecting millions in damages and also stopping Power Ventures from automatically aggregating social media posts, even after it had obtained permission to do so from the users.
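
To make the definition of scraping concrete, here is a minimal sketch in Python of what a scraper does once it has a page in hand: it walks the markup and records the fields a person could read off the screen. Everything in it is an assumption made for illustration. The sample page, the "profile", "name" and "title" class names, and the ProfileScraper and scrape_profiles helpers are hypothetical and do not reflect how hiQ's tools or LinkedIn's pages actually work. A real scraper would also fetch pages over the network, which is left out here.

```python
from html.parser import HTMLParser

# Hypothetical sample page standing in for a publicly visible profile listing.
# A real scraper would download this over HTTP instead of hard-coding it.
SAMPLE_PAGE = """
<html><body>
  <div class="profile"><span class="name">Ada Example</span><span class="title">Data Analyst</span></div>
  <div class="profile"><span class="name">Bo Sample</span><span class="title">Engineer</span></div>
</body></html>
"""


class ProfileScraper(HTMLParser):
    """Collects the name and title text from each (hypothetical) profile block."""

    def __init__(self):
        super().__init__()
        self.records = []   # one dict per profile found
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "profile":
            # A new profile block starts: open a fresh record.
            self.records.append({})
        elif tag == "span" and css_class in ("name", "title") and self.records:
            # Remember which field the upcoming text belongs to.
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a field we are tracking.
        if self._field:
            self.records[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_profiles(html):
    """Return a list of {'name': ..., 'title': ...} dicts found in the page."""
    parser = ProfileScraper()
    parser.feed(html)
    parser.close()
    return parser.records


if __name__ == "__main__":
    for record in scrape_profiles(SAMPLE_PAGE):
        print(record)
```

The point of the sketch is that nothing here goes beyond what a visitor could copy down by hand; the bot simply does it faster and at scale, which is why the legal question turns on whether the data is public rather than on the technique itself.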

The subtle variations and broad-brush similarities between these cases make it clear that data scrapers are not going away. This is also a US-specific case. Similar cases are likely to be filed in the EU following the passing of the new copyright law and its much-feared Article 13.

As a PR Consultant and journalist, Frink has covered IT security issues for a number of security software firms, as well as provided reviews and insight on the beer and automotive industries (but usually not at the same time). Otherwise, he’s known for making a great bowl of popcorn and extraordinary messes in a kitchen.