US courts develop nuanced web scraping views as investors wait for hiQ appeal

Sondra Campanelli, Head of News and Marketing (London)

Neudata News

The following article was contributed by Kelly Koscuiszka, partner at Schulte Roth & Zabel LLP.

Web scraping – the automated collection of data on the internet – is increasingly used in information gathering, including by “traditional” research vendors and journalists. Because scraping activity is frequently litigated, this is an evolving area of the law. The industry is closely monitoring the litigation between hiQ Labs and LinkedIn, on which the U.S. Supreme Court could weigh in this term.

Early forms of web scraping (and an unfortunate name) earned the practice a negative reputation. Indeed, in 2010, ticket scalpers used web scraping techniques to attack the Ticketmaster website, resulting in federal criminal charges. In that case, U.S. v. Lowson, the defendants used deceptive measures to impersonate individual ticket buyers, evade anti-fraud measures, and purchase more than a million tickets. The defendants, principals of the unwisely named Wiseguy Tickets Inc., were charged with conspiracy, wire fraud, and two counts under the Computer Fraud and Abuse Act (CFAA). The defendants pleaded guilty before trial, so the key legal issues were never substantively reached.

More recent and common challenges to web scraping occur in the context of civil litigation. Prior to the Ninth Circuit’s decision in hiQ, target sites had largely been successful in stopping unwanted scraping and imposing liability on scrapers, with much of the focus on scraping by competitors.


The preliminary ruling in hiQ v. LinkedIn embraced the notion that there are public parts of the internet to which target sites cannot indiscriminately deny automated access.

In 2017, LinkedIn sent a cease-and-desist letter to hiQ Labs alleging that hiQ’s scraping of LinkedIn’s public profiles violated its user agreement as well as state and federal law, including the CFAA. In response, hiQ sued LinkedIn for injunctive relief and a declaratory judgment that hiQ’s activities were lawful.

In August 2017, the district court granted hiQ a preliminary injunction, ordering LinkedIn to withdraw its cease-and-desist letter and remove any technical barriers to hiQ’s access to public profiles while the case was pending. In September 2019, the Ninth Circuit affirmed the district court's decision.

The district court and Ninth Circuit distinguished hiQ from earlier web scraping cases because the information hiQ sought to scrape was public information that was not behind a paywall or login. The district court analogized this to data in the "public square," and the Ninth Circuit similarly agreed that public LinkedIn profiles fall within the most open category of information on the internet – information for which access is open to the general public and permission is not required. The Ninth Circuit expressed concern that giving companies "free rein" to decide to limit access to data they otherwise make publicly available "risks the possible creation of information monopolies."


LinkedIn appealed the Ninth Circuit decision to the Supreme Court; its petition for certiorari is currently pending. If the Supreme Court agrees to hear the case, it will consider the narrow issue of whether the CFAA's “without authorization” prong applies to accessing public websites.

Earlier this term, the Supreme Court heard oral argument in a different CFAA case, Van Buren v. United States. Van Buren does not involve web scraping but nonetheless has potential implications for web scrapers because it considers what it means under the CFAA to exceed authorized access. Van Buren was a police officer who used his access to a police database to run a license plate for a friend in exchange for money.

Commentators and amici curiae have expressed concern that an overly broad interpretation of exceeding authorized access under the CFAA could criminalize web scraping because it would reach conduct where the user was initially permitted on a site (i.e., did not hack or break into a computer system) but then engaged in activity not authorized by the site.

With Van Buren and hiQ, the Supreme Court has two potential opportunities this term to issue decisions that could impact the legality of web scraping. However, a narrow decision in Van Buren and a denial of certiorari in hiQ would allow the Supreme Court to avoid the web scraping issue altogether this term.


Although the Supreme Court has not yet weighed in on this issue, hiQ has already had a widespread impact in the lower courts.

Most courts that have considered the issue agree with hiQ's notion of a "two-realm" internet – the public internet versus authorization-based portions of the internet. For example, in November 2020, a federal district court denied a web scraper injunctive relief related to scraping data from the password-protected portions of the Facebook platform.

Questions remain about what constitutes a "permission requirement" on a site that would render the data behind it no longer public for CFAA purposes. For example, the district court in hiQ found that a CAPTCHA was not an "access control," such that information on CAPTCHA-protected web pages was deemed public. Similarly, in an unrelated criminal case, the federal district court in D.C. held that violating prohibitions on web scraping in a vendor's terms of use does not constitute "unauthorized access" for CFAA criminal liability purposes.

One outlier is Compulife Software, Inc. v. Newman, in which the Eleventh Circuit held, in the context of a trade secrets case, that scraping publicly accessible information, in large enough quantities, might be a form of misappropriation of trade secrets. The case, however, did not interpret the CFAA and was fact-specific to the trade secret dispute in which the scraper was a competitor of the scraping target.


Web scraping activities are increasingly prevalent, though the law remains unsettled. As we wait to see whether the Supreme Court will provide clarity, the law continues to develop in the lower courts in cases that present increasingly nuanced fact patterns and legal questions. For now, it seems a key consideration in evaluating scraping activity, though certainly not the only consideration, is whether the information is on public portions of a website.


Kelly Koscuiszka, partner at Schulte Roth & Zabel LLP, advises on privacy and data security as well as regulatory and enforcement matters for private funds, broker-dealers, technology companies (including alternative data providers) and individuals. She advises clients on regulatory compliance and privacy laws, and represents clients in regulatory investigations and enforcement actions by the SEC, DOJ, FINRA and other self-regulatory organizations as well as in complex civil litigation matters. 

