Scraping Publicly Available Data Does Not, in a Preliminary Injunction Setting, Violate CFAA

The U.S. District Court for the Northern District of California granted HiQ’s motion for a preliminary injunction, prohibiting LinkedIn from changing its access code to block HiQ’s efforts to scrape employment data from the LinkedIn website. HiQ Labs, Inc. v. LinkedIn Corp., Case No. 17-CV-3301 (N.D. Cal. Aug. 14, 2017).

Plaintiff HiQ initiated this action after Defendant LinkedIn issued a cease and desist letter and attempted to terminate HiQ’s ability to access otherwise publicly available information on the profiles of LinkedIn users. The cease and desist letter threatened action against HiQ under the Computer Fraud and Abuse Act (CFAA), 18 U.S.C. § 1030. LinkedIn also employed various blocking techniques designed to prevent HiQ’s automated data collection. LinkedIn took these steps after years of tolerating HiQ’s access to and use of its data. HiQ contended that LinkedIn’s actions constitute unfair business practices under Cal. Bus. & Prof. Code § 17200 et seq. HiQ moved for a preliminary injunction and, to obtain it, had to show that LinkedIn’s positions under the CFAA and related California state law were likely without merit, at least at the preliminary injunction stage.

The Court granted the motion and stated, “HiQ has raised serious questions as to whether LinkedIn, in blocking HiQ’s access to public data, possibly as a means of limiting competition, violates state law.”

LinkedIn users can set their profiles as being entirely private, or can make them viewable by: (1) their direct connections on the site; (2) a broader network of connections; (3) all other LinkedIn members; or (4) the entire public. When users choose the last option, their profiles are viewable by anyone online regardless of whether that person is a LinkedIn member. LinkedIn also allows public profiles to be accessed via search engines such as Google.

HiQ sells to its client businesses information about their workforces that HiQ generates through analysis of data on LinkedIn users’ publicly available profiles. HiQ gathers the workforce data that forms the foundation of its analytics by automatically collecting it, or harvesting or “scraping” it, from publicly available LinkedIn profiles.

LinkedIn’s User Agreement (its terms of service (TOS) or terms of use (TOU)) prohibits various methods of data collection from its website. LinkedIn argued that HiQ was in violation of those provisions. LinkedIn also stated that it had restricted HiQ’s company page on LinkedIn and that “[a]ny future access of any kind” to LinkedIn by HiQ would be “without permission and without authorization from LinkedIn.” In other words, LinkedIn withdrew any permission that HiQ had to scrape the LinkedIn data. LinkedIn further stated that it had “implemented technical measures to prevent HiQ from accessing, and assisting others to access, LinkedIn’s site, through systems that detect, monitor, and block scraping activity.” LinkedIn argued that any further access to LinkedIn’s data would violate state and federal law, including California Penal Code § 502(c), the federal Computer Fraud and Abuse Act (“CFAA”), 18 U.S.C. § 1030, the state common law of trespass, and the Digital Millennium Copyright Act, 17 U.S.C. § 1201(a)(1) (“No person shall circumvent a technological measure that effectively controls access to a work protected under this title”).

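It was an undisputed fact that HiQ’s entire business depends on its access to LinkedIn’s public profile data. Losing that access would therefore threaten HiQ’s ability to continue operating, and the Court found these potential consequences sufficient to constitute irreparable harm.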

LinkedIn argued that it faces significant harm because HiQ’s data collection threatens the privacy of LinkedIn users: even members who opt to make their profiles publicly viewable retain a significant interest in controlling the use and visibility of their data. In particular, LinkedIn pointed to the interest that some users may have in preventing employers or other parties from tracking changes they have made to their profiles. Over 50 million LinkedIn members have used a “Do Not Broadcast” feature. However, LinkedIn presented little evidence of users’ actual privacy expectations. Out of its hundreds of millions of users, including the 50 million using Do Not Broadcast, LinkedIn identified only three individual complaints specifically raising concerns about data privacy related to third-party data collection. LinkedIn’s professed privacy concerns were also somewhat undermined by the fact that LinkedIn allows other third parties to access user data without its members’ knowledge or consent. For example, LinkedIn offers a product called “Recruiter” that allows professional recruiters to identify possible candidates for other job opportunities.

Furthermore, despite the fact that HiQ has been aggregating LinkedIn’s public data for five years with LinkedIn’s knowledge, LinkedIn presented no evidence of harm, financial or otherwise, resulting from HiQ’s activities.

The CFAA creates civil and criminal liability for any person who “intentionally accesses a computer without authorization or exceeds authorized access, and thereby obtains . . . information from any protected computer.” 18 U.S.C. § 1030(a)(2)(C). As the Supreme Court has explained, the statute “provides two ways of committing the crime of improperly accessing a protected computer: (1) obtaining access without authorization; and (2) obtaining access with authorization but then using that access improperly.” Musacchio v. United States, 136 S. Ct. 709, 713 (2016).

In Facebook, Inc. v. Power Ventures, Inc., the Ninth Circuit held that “a defendant can run afoul of the CFAA when he or she has no permission to access a computer or when such permission has been revoked explicitly.” 844 F.3d 1058, 1067 (9th Cir. 2016). Similarly, in United States v. Nosal (Nosal II), 844 F.3d 1024 (9th Cir. 2016), the Court of Appeals for the Ninth Circuit held that a former employee “whose computer access credentials were affirmatively revoked by [his employer]” acted without authorization in violation of the CFAA “when he or his former employee coconspirators used the login credentials of a current employee” to gain access to the employer’s computer systems.

The trial court distinguished these cases because none of the data in Facebook or Nosal II was public data. “The CFAA must be interpreted in its historical context, mindful of Congress’ purpose. The CFAA was not intended to police traffic to publicly available websites on the Internet. The Internet did not exist in 1984. The CFAA was intended instead to deal with ‘hacking’ or ‘trespass’ onto private, often password-protected mainframe computers. See H.R. Rep. No. 98-894, 1984 U.S.C.C.A.N. 3689, 3691-92, 3695-97 (1984); S. Rep. No. 99-432, 1986 U.S.C.C.A.N. 2479, 2480 (1986). The Ninth Circuit has recognized this statutory purpose, explaining that ‘Congress enacted the CFAA in 1984 primarily to address the growing problem of computer hacking, recognizing that, [i]n intentionally trespassing into someone else’s computer files, the offender obtains at the very least information as to how to break into that computer system.’ United States v. Nosal (Nosal I), 676 F.3d 854, 858 (9th Cir. 2012) (quoting S. Rep. No. 99-432, at 9 (1986), 1986 U.S.C.C.A.N. 2479, 2487 (Conf. Rep.)).”

As HiQ pointed out, applying the CFAA to the accessing of websites open to the public would have sweeping consequences well beyond anything Congress could have contemplated. It would “expand its scope well beyond computer hacking.” Although the Ninth Circuit has specifically rejected the argument that “the CFAA only criminalizes access where the party circumvents a technological access barrier,” Nosal II, 844 F.3d at 1038, the Court reasoned that extending the CFAA as LinkedIn suggested, to create liability for merely viewing a website in contravention of a unilateral directive from a private entity, would effectively create a digital version of Medusa: seeing public information after being told not to do so would expose one to civil and criminal liability.

Finally, the Court opined that the Internet in general, and social networking sites in particular, are equivalent to the “modern public square”; that is, those spaces embrace a social norm that assumes the openness and accessibility of those forums to all comers. Cf. Ampex Corp. v. Cargle, 128 Cal. App. 4th 1569, 1576 (2005).

“Where a website or computer owner has imposed a password authentication system to regulate access, it makes sense to apply a plain meaning reading of ‘access’ ‘without authorization’ such that ‘a defendant can run afoul of the CFAA when he or she has no permission to access a computer or when such permission has been revoked explicitly.’ Power Ventures, 844 F.3d at 1067.” Also, the use of CAPTCHA does not limit access to certain individuals. CAPTCHA is intended “as a way to slow[] a user’s access rather than as a way to deny authorization to access.”