Internet censorship in the US

Posted by David Zaslavsky on November 22, 2011 7:05 PM

Thanks to campaigns like last week’s American Censorship day, computer users around the United States (and beyond) have been sitting up and paying attention to two bills regarding online copyright infringement that are now working their way through Congress: SOPA and PROTECT-IP. There is a lot of hype about how the law represented by these bills would be a terrible affront to free speech, and it may or may not be right, but as usual when it comes to legal matters, many people don’t have the knowledge to judge for themselves. With this blog post and possibly others like it, I’m trying to get relevant information out there so we can all make more informed decisions.

(Full disclosure: I am personally opposed to the passage of SOPA/PROTECT-IP, but I’ve tried not to let that bias come through too strongly.)

A brief history of information exchange

Back in the days before internet use was so widespread, media redistribution was not a major problem. If you wanted to share a song or a video with someone, you had to physically lend them a tape, CD, or DVD. Yes, it was possible to make copies of media, but it required specialized equipment, and more importantly, it took time and effort for ordinary people to do. Media distributors, namely the record companies and movie studios, were largely the only ones with the resources to do this efficiently on a large scale.

With the advent of computers, and specifically high-speed internet access, that’s no longer true. Now, in order to share some information with someone, you no longer need to hand off an actual physical object. Instead, all your computer needs to do is transfer the state of some transistors (your RAM) into electrical signals in a wire (your modem or network card), basically just shift some electrons around. This is a highly repetitive task, exactly the kind of thing computers can do very efficiently. In other words, copying large amounts of information has suddenly become quick and easy enough that individuals can do it.

To most people, this is a good thing. The entire purpose of the internet is to allow individuals to disseminate information widely: blog posts, tweets, personal photos, status updates, news articles, academic publications, advertisements, and all sorts of other types of content all benefit from having as many people see them as possible. But computers don’t distinguish between different kinds of content the way humans do, so technologically speaking, any system that lets people share their blog posts and tweets and so on can be used just as well for sharing music and movies. This represents a problem for the multimedia industry’s traditional business model, where they make money from each copy of a CD or DVD that gets sold.

In response, the representative trade organizations for the music and movie industries, the RIAA and MPAA respectively, have used various tactics to try to stop movies and music from being shared between people.

Technological: deep packet inspection

One tactic the media organizations have tried is making computers “smart” enough to pick out copyrighted content from all other internet traffic and stop it in transit. This is called “deep packet inspection” because rather than examining only the header data of each IP packet, which tells where the packet is from and where it’s going, computers look “deeper” into the packet, at its actual content. This system requires the cooperation of the tier-1 internet service providers, because they are the ones who control the computers that manage all the data sent over the internet. There was a large public outcry against this system because of the potential for abuse; once the technology to look into packet contents comes into common use for one reason (like catching copyright violations), it becomes almost trivially easy for someone with the right kind of access to use it for another purpose (like silencing political opposition). ISPs have also generally opposed the idea because, although they do use deep packet inspection under limited circumstances, checking every packet that passes through their system requires a lot of computational power. Besides, using deep packet inspection to detect copyrighted material in transit is easily defeated by encrypting data before sending it out over the internet, using services like Tor or the SSL encryption that’s built into every browser and most web servers.
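To make the difference between header inspection and deep packet inspection concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the `Packet` class, the `SIGNATURE` bytes standing in for a fingerprint of a copyrighted file, and especially `toy_encrypt`, which is a simple XOR scrambler standing in for real encryption like SSL. It is not how any actual ISP equipment works, just a picture of the idea.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical, heavily simplified packet: two addresses and a payload."""
    src: str
    dst: str
    payload: bytes


# Made-up byte string standing in for a fingerprint of copyrighted content.
SIGNATURE = b"COPYRIGHTED-SONG"


def shallow_inspect(packet: Packet) -> tuple:
    """Ordinary routing only needs the header: where from, where to."""
    return (packet.src, packet.dst)


def deep_inspect(packet: Packet) -> bool:
    """Deep inspection looks inside the payload for a known signature."""
    return SIGNATURE in packet.payload


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR scrambler. NOT real encryption; it only stands in for SSL here.
    Applying it twice with the same key gives back the original data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


plain = Packet("10.0.0.1", "10.0.0.2", b"header COPYRIGHTED-SONG audio data")
scrambled = Packet(plain.src, plain.dst, toy_encrypt(plain.payload, b"k3y"))

print(deep_inspect(plain))         # the signature is visible in the clear
print(deep_inspect(scrambled))     # the same content, scrambled, slips past
print(shallow_inspect(scrambled))  # the header is still readable either way
```

The point of the last three lines is that scrambling the payload hides the content from the inspector without hiding where the packet is going, which is why encryption defeats content matching but not routing.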

Legal: the Digital Millennium Copyright Act

When technological measures proved impractical, the media industries shifted their focus from the ISPs to the “online service providers,” sites like YouTube or Facebook which actually host user-submitted content that may include music or movie files. To do so, they used the provisions of the Digital Millennium Copyright Act, a 1998 law that governs copyright protection for files and digital data.

Normally, copyright law holds any person or organization that distributes copyrighted material accountable for that distribution. However, there is a piece of the DMCA called the Online Copyright Infringement Liability Limitation Act, commonly known as the “safe harbor provision,” which specifies that an online service provider is not legally liable for infringing material uploaded by its users, as long as it removes the content once notified of it.

The takedown procedure works like this:

1. When a copyright owner finds their copyrighted work posted on a content provider site, they send a DMCA takedown notice to the site’s designated DMCA compliance agent.
2. The content provider removes the content and notifies the person who uploaded it.
3. The uploader can dispute the original DMCA takedown notice. Unless the copyright owner files a lawsuit within 14 business days, the content gets put back up.
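The takedown steps can be pictured as a small state machine. The sketch below is only a toy model of the procedure as described in this post; the class, method, and status names are invented for illustration and don't correspond to any real system or to the wording of the law.

```python
from enum import Enum


class Status(Enum):
    ONLINE = "online"
    TAKEN_DOWN = "taken down"
    DISPUTED = "disputed"
    RESTORED = "restored"
    IN_COURT = "in court"


class HostedItem:
    """Hypothetical piece of user-uploaded content on a hosting site."""

    def __init__(self, uploader: str):
        self.uploader = uploader
        self.status = Status.ONLINE
        self.notifications = []  # messages the site sends to the uploader

    def receive_takedown_notice(self, claimant: str) -> None:
        # The site removes the content and tells the uploader about it.
        self.status = Status.TAKEN_DOWN
        self.notifications.append(
            f"To {self.uploader}: removed after a notice from {claimant}"
        )

    def receive_dispute(self) -> None:
        # The uploader disputes the takedown notice.
        if self.status is not Status.TAKEN_DOWN:
            raise ValueError("only taken-down content can be disputed")
        self.status = Status.DISPUTED

    def waiting_period_elapsed(self, lawsuit_filed: bool) -> None:
        # No lawsuit by the deadline means the content goes back up.
        if self.status is not Status.DISPUTED:
            raise ValueError("no dispute is pending")
        self.status = Status.IN_COURT if lawsuit_filed else Status.RESTORED


item = HostedItem("alice")
item.receive_takedown_notice("Big Studio")
item.receive_dispute()
item.waiting_period_elapsed(lawsuit_filed=False)
print(item.status)
```

Notice that the site never has to decide who is right: it only reacts to notices, which is what lets it stay out of the legal dispute.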

The safe harbor provision is what allows sites like YouTube to operate; effectively, they get to pass the legal “blame” for copyright infringement on to their users.

The DMCA does, in principle, allow anyone to get any content taken off a particular website, but only temporarily. A key feature of this procedure is that whenever something gets “flagged” as copyright infringement, the person who uploaded it has to be notified, and has the option to respond and get the content restored. Plus, if a takedown notice turns out to be invalid and it’s not just an honest mistake, there are (supposed to be) consequences for the sender of the notice, which discourages sending frivolous DMCA takedown notices.

Economic: MediaDefender

In addition to sending takedown notices to hosting providers, the RIAA and MPAA also attempted to directly track down users who were publishing copyrighted content and threaten legal action against them. Unlike a large company (like YouTube), an individual internet user often doesn’t have the resources to fight a court battle to demonstrate that he (or she) isn’t doing anything illegal. So most of the time, when random Joe Schmoe gets a notice from the RIAA or MPAA that he’s been caught illegally distributing copyrighted material and is getting served with a lawsuit unless he settles for $XXX, he’s just going to pay up.

This technique, though kind of sleazy, probably would have been more or less acceptable if it were only used on serious, confirmed copyright infringers. But it went much further than that. The media companies decided to “outsource” the work of tracking down and notifying copyright infringers to a separate company, MediaDefender. MediaDefender, in turn, started using automatic computer programs to detect uploaders of copyrighted content and send out infringement notices. As I alluded to earlier, computer programs aren’t smart enough to reliably tell when they’ve found true copyright infringement and when they haven’t. Perhaps you can see where this is going: MediaDefender sent out a lot of threatening letters to people who had done nothing wrong, and in some cases not to people at all. There are stories circulating on the internet about how some researchers got a settlement offer sent to their printer.

The reason this happened is that it’s actually very difficult to make the connection between the uploading of a copyrighted file and the real person responsible for it. The way MediaDefender and companies like it typically locate copyright infringers is by connecting to the BitTorrent network and attempting to download a movie. When you use BitTorrent to download something, you have access to the IP addresses of all the other computers you’re downloading it from. Each of those is uploading the file, which qualifies as illegal distribution. Or so MediaDefender assumed.

There are a few ways this argument can go wrong, though, which they typically failed to check. For one thing, in some cases the file they were downloading was not even copyrighted material, or if it was, the copyright was not owned by one of MediaDefender’s clients. They often checked only the file’s name, without examining its content to see if it was what they thought it was. Secondly, some uploaders connect to the BitTorrent network through a proxy server, or something like Tor. In this case, the IP address the downloader sees is that of the proxy server, which legally is not responsible for the content passing through it.
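Here is a small Python sketch of why checking only the name goes wrong. The function names, the movie title, and the file contents are all hypothetical, and the exact-hash comparison is just one simple stand-in for “examining the content”; a real detection system would presumably need something more robust than an exact hash match.

```python
import hashlib


def matches_by_name(filename: str, title: str) -> bool:
    """Naive check: does the title appear anywhere in the file's name?"""
    return title.lower() in filename.lower()


def matches_by_content(data: bytes, known_hashes: set) -> bool:
    """Stricter check: does the file's SHA-256 hash match a known work?"""
    return hashlib.sha256(data).hexdigest() in known_hashes


# Made-up stand-in for the bytes of an actual copyrighted movie file.
movie_bytes = b"...bytes of the real movie..."
known_hashes = {hashlib.sha256(movie_bytes).hexdigest()}

# A harmless text file that happens to mention the movie in its name.
filename = "Big Movie - my review.txt"
contents = b"I thought it was okay."

print(matches_by_name(filename, "Big Movie"))      # flagged by name alone
print(matches_by_content(contents, known_hashes))  # not flagged by content
```

A name-only check flags the review even though it contains none of the movie. Note that neither check helps with the proxy problem: even a correct content match only identifies the IP address that was visible, not the person behind it.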

After the stories went public and people saw how unreliable the techniques for identifying copyright infringers were, the RIAA and MPAA bowed to public pressure and stopped using MediaDefender’s services.

SOPA and PROTECT-IP

The latest entry in the war on copyright infringement is the bill known as either SOPA or PROTECT-IP, or more recently the E-PARASITE Act, and yes, those acronyms are exactly as stupid as they sound. All three names refer to more or less the same thing; SOPA or E-PARASITE is the version being considered by the House of Representatives, and PROTECT-IP is the version originally introduced in the Senate. These bills are intended to expand on the DMCA, and generally they provide additional means by which copyright owners (meaning the large media companies) can have content taken offline.

But they don’t stop there. Rather than just providing for the removal of copyrighted content, as the DMCA did, these bills contain procedures that would basically cripple any website accused of facilitating the distribution of copyrighted material. If SOPA/PROTECT-IP passes, it becomes illegal for any US-based company to advertise on such a website, provide it with a domain name, or allow any payments to be made to it.

Remember, the reason people are concerned about this is not that they want to avoid paying for DVDs. The problem is twofold:

1. There is a huge potential for these laws to be abused, since they make it possible for anyone (with enough money?) to get material or even entire websites taken offline, without having to prove that they have a legal basis for doing so.
2. There are valid arguments that the whole system of copyright stifles innovation. A lot of progress in art and science is made by building on top of other people’s work, and strong copyright enforcement makes that a risky proposition. It’s much more difficult to come up with something entirely new.

If nothing else, consider that the list of people and companies who are opposed to SOPA and PROTECT-IP includes everyone from Google to Visa to the technology chief of the European Union to even the Vice President (sort of… indirectly). I’d say they have to have some valid points.

In closing, since one of the core themes of this blog is going to the original source, I’d encourage you to look at the actual text of each bill and put some thought into just how good or bad it would be for economic, artistic, and scientific development. Here are the links again, as of the time of posting:

SOPA, House bill H.R. 3261
PROTECT-IP, Senate bill S. 968

To see current information on the status of each bill, head to the Library of Congress legislative information site and search for the bill number, either H.R. 3261 or S. 968.
