Forum

Protecting the Internet Without Wrecking It

How to meet the security threat.

Jonathan Zittrain

March 1, 2008

On November 2, 1988, five to ten percent of the 60,000 computers hooked up to the Internet started acting strangely. Inventories of affected computers revealed that rogue programs were demanding processor time. When concerned administrators terminated these programs, they reappeared and multiplied. They then discovered that renegade code was spreading through the Internet from one machine to another. The software—now commonly thought of as the first Internet worm—was traced to a twenty-three-year-old Cornell University graduate student, Robert Tappan Morris, Jr., who had launched it by infecting a machine at MIT from his terminal in Ithaca, New York.

Morris said he unleashed the worm to count how many machines were connected to the Internet, and analysis of his program confirmed his benign intentions. But his code turned out to be buggy. If Morris had done it right, his program would not have drawn attention to itself. It could have remained installed for days or months, and quietly performed a wide array of activities other than Morris’s digital nose count.

The mainstream media had an intense but brief fascination with the incident. A government inquiry led to the creation of the Defense Department-funded Computer Emergency Response Team Coordination Center at Carnegie Mellon University, which serves as a clearinghouse for information about viruses and other network threats. A Cornell report on what had gone wrong placed the blame solely on Morris, who had engaged in a “juvenile act” that was “selfish and inconsiderate.” It rebuked elements of the media that had branded Morris a hero for dramatically exposing security flaws, noting that it was well known that the computers’ Unix operating systems were imperfect. The report called for university-wide committees to provide advice on security and acceptable use. It described consensus among computer scientists that Morris’s acts warranted some form of punishment, but not “so stern as to damage permanently the perpetrator’s career.”

In the end, Morris apologized, earned three years of criminal probation, performed four hundred hours of community service, and was fined $10,050. He transferred from Cornell to Harvard, founded a dot-com startup with some friends in 1995, and sold it to Yahoo! in 1998 for $49 million. He is now a respected, tenured professor at MIT.

In retrospect, the commission’s recommendations—urging users to patch their systems and hackers to grow up—might seem naïve. But there were few plausible alternatives. Computing architectures, both then and now, are designed for flexibility rather than security. The decentralized, nonproprietary ownership of the Internet and the computers it links made it difficult to implement structural revisions. More important, it was hard to imagine cures that would not entail drastic, wholesale, purpose-altering changes to the very fabric of the Internet. Such changes would have been wildly out of proportion to the perceived threat, and there is no record of their having even been considered.

By design, the university workstations of 1988 were generative: their users could write new code for them or install code written by others. This generative design lives on in today’s personal computers. Networked PCs are able to retrieve and install code from each other. We need merely click on an icon or link to install new code from afar, whether to watch a video newscast embedded within a Web page, update our word processing or spreadsheet software, or browse satellite images.

Generative systems are powerful and valuable, not only because they foster the production of useful things like Web browsers, auction sites, and free encyclopedias, but also because they enable extraordinary numbers of people to devise new ways to express themselves in speech, art, or code and to work with other people. These characteristics can make generative systems very successful even though—perhaps especially because—they lack central coordination and control. That success attracts new participants to the generative system.

The flexibility and power that make generative systems so attractive are, however, not without risks. Such systems are built on the notion that they are never fully complete, that they have many uses yet to be conceived of, and that the public can be trusted to invent good uses and share them. Multiplying breaches of that trust can threaten the very foundations of the system.

Whether through a sneaky vector like the one Morris used, or through the front door, when a trusting user elects to install something that looks interesting without fully understanding it, opportunities for accidents and mischief abound. A hobbyist computer that crashes might be a curiosity, but when a home or office PC with years’ worth of vital correspondence and papers is compromised, it can be a crisis. And when thousands or millions of individual, business, research, and government computers are subject to attack, we may find ourselves faced with a fundamentally new and harrowing scenario. As the unsustainable nature of the current state of affairs becomes more apparent, we are left with a dilemma that cannot be ignored. How do we preserve the extraordinary benefits of generativity, while addressing the growing vulnerabilities that are innate to it?

* * *

How profound is today’s security threat? Since 1988, the Internet has suffered few truly disruptive security incidents. A network designed for communication among academic and government researchers appeared to scale beautifully as hundreds of millions of new users signed on during the 1990s, and three types of controls seemed adequate to address emerging dangers.

First, the hacker ethos frowns upon destructive hacking. Most viruses that followed Morris’s worm had completely innocuous payloads: in 2004, Mydoom spread like wildfire and reputedly cost billions in lost productivity, but the worm did not tamper with data, and it was programmed to stop spreading at a set time. With rare exceptions like the infamous Lovebug worm, which overwrote files with copies of itself, the few highly malicious viruses that run contrary to the hacker ethos were so poorly coded that they failed to spread very far.

Second, network operations centers at universities and other institutions became more professionalized between 1988 and the advent of the mainstream Internet. For a while, most Internet-connected computers were staffed by professionals, administrators who generally heeded admonitions to patch regularly and scout for security breaches. Less adept mainstream consumers began connecting unsecured PCs to the Internet in earnest only in the mid-1990s. Then, transient dial-up connections greatly limited both the amount of time during which they were exposed to security threats, and the amount of time that, if compromised and hijacked, they would contribute to the problem.

Finally, bad code lacked a business model. Programs to trick users into installing them, or to sneak onto the machines, were written for amusement. Bad code was more like graffiti than illegal drugs: there were no economic incentives for its creation.

Today each of these controls has weakened. With the expansion of the community of users, the idea of a set of ethics governing activity on the Internet has evaporated. Anyone is allowed online if he or she can find a way to a computer and a connection, and mainstream users are transitioning rapidly to always-on broadband.

Moreover, PC user awareness of security issues has not kept pace with broadband growth. A December 2005 online safety study found 81 percent of home computers to be lacking first-order protection measures such as current antivirus software, spyware protection, and effective firewalls.

Perhaps most significantly, bad code is now a business. What seemed genuinely remarkable when first discovered is now commonplace: viruses that compromise PCs to create large zombie “botnets” open to later instructions. Such instructions have included directing PCs to become their own e-mail servers, sending spam by the thousands or millions to e-mail addresses harvested from the hard disks of the machines themselves or gleaned from Internet searches, with the entire process typically proceeding behind the backs of the PCs’ owners. At one point, a single botnet occupied fifteen percent of Yahoo!’s search capacity, running random searches on Yahoo! to find text that could be inserted into spam e-mails to throw off spam filters. Dave Dagon, who recently left Georgia Tech to start a bot-fighting company named Damballa, pegs the number of botnet-infected computers at close to 30 million. Dagon said, “Had you told me five years ago that organized crime would control one out of every ten home machines on the Internet, I would not have believed that.” So long as spam remains profitable, that crime will persist.

Botnets can also be used to launch coordinated attacks on a particular Internet endpoint. For example, a criminal can attack an Internet gambling Web site and then extort payment to make the attacks stop. The going rate for a botnet to launch such an attack is reputed to be about $50,000 per day.

Viruses are thus valuable properties. Well-crafted worms and viruses routinely infect vast swaths of Internet-connected personal computers. Antivirus vendor Eugene Kaspersky of Kaspersky Lab told an industry conference that antivirus vendors “may not be able to withstand the onslaught.” IBM’s Internet Security Systems reported a 40 percent increase between 2005 and 2006 in vulnerabilities, meaning situations in which a machine could be compromised. Nearly all of those vulnerabilities could be exploited remotely, and over half allowed attackers to gain full access to the machine and its contents.

As the supply of troubles has increased, the capacity to address them has steadily diminished. Patch development time increased throughout 2006 for all of the top operating system providers. Times shortened modestly across the board in the first half of 2007, but, on average, enterprise vendors were still exposed to vulnerabilities for 55 days—plenty of time for hazardous code to make itself felt. What is more, antivirus researchers and firms require extensive coordination efforts simply to agree on a naming scheme for viruses as they emerge. This is a far cry from a common strategy for battling them.

In addition, the idea of casually cleaning a virus off a PC is gone. When computers are compromised, users are now typically advised to reinstall everything on them. For example, in 2007, some PCs at the U.S. National Defense University fell victim to a virus. The institution shut down its network servers for two weeks and distributed new laptops to instructors. In the absence of such drastic measures, a truly “mal” piece of malware could be programmed to, say, erase hard drives, transpose numbers inside spreadsheets randomly, or intersperse nonsense text at arbitrary intervals in Word documents found on infected computers—and nothing would stand in the way.

Recognition of these basic security problems has been slowly growing in Internet research communities. Nearly two-thirds of academics, social analysts, and industry leaders surveyed by the Pew Internet & American Life Project in 2004 predicted serious attacks on network infrastructure in the coming decade. Security concerns will lead to a fundamental shift in our tolerance of the status quo, either through a catastrophic episode or, more likely, through a glacial death of a thousand cuts.

Consider, in the latter scenario, the burgeoning realm of “badware” beyond viruses and worms: software that is often installed at the user’s invitation. The popular file-sharing program KaZaA, though advertised as “spyware-free,” contains code that users likely do not want. It adds icons to the desktop, modifies Microsoft Internet Explorer, and installs a program that cannot be closed by clicking “Quit.” Uninstalling the program does not uninstall all these extras, and the average user does not know how to get rid of the code itself. What makes such badware “bad” has to do with the level of disclosure made to a consumer before he or she installs it. The most common responses to the security problem cannot easily address this gray zone of software.

Many technologically savvy people think that bad code is simply a Microsoft Windows issue. They believe that the Windows OS and the Internet Explorer browser are particularly poorly designed, and that “better” counterparts (GNU/Linux and Mac OS, or the Firefox and Opera browsers) can help shield a user. But the added protection does not get to the fundamental problem, which is that the point of a PC—regardless of its OS—is that its users can easily reconfigure it to run new software from anywhere. When users make poor decisions about what software to run, the results can be devastating to their machines and, if they are connected to the Internet, to countless others’ machines as well.

The cybersecurity problem defies easy solution because any of its most obvious fixes will undermine the generative essence of the Internet and PC. Bad code is an inevitable side effect of generativity, and as PC users are increasingly victimized by bad code, consumers are likely to reject generative PCs in favor of safe information appliances—digital video recorders, mobile phones, iPods, BlackBerrys, and video game consoles—that optimize a particular application and cannot be modified by users or third parties. It is entirely reasonable for consumers to factor security and stability into their choice. But it is an undesirable choice to have to make.

* * *

On January 9, 2007, Steve Jobs introduced the iPhone to an eager audience crammed into San Francisco’s Moscone Center. A beautiful and brilliantly engineered device, the iPhone blended three products into one: an iPod, with the highest-quality screen Apple had ever produced; a phone, with cleverly integrated functionality, such as voicemail that came wrapped as separately accessible messages; and a device to access the Internet, with a smart and elegant browser, and built-in map, weather, stock, and e-mail capabilities.

This was Steve Jobs’s second revolution. Thirty years earlier, at the First West Coast Computer Faire in nearly the same spot, the twenty-one-year-old Jobs, wearing his first suit, exhibited the Apple II personal computer to great buzz amidst “ten thousand walking, talking computer freaks.” The Apple II was a machine for hobbyists who did not want to fuss with soldering irons: all the ingredients for a functioning PC were provided in a convenient molded plastic case. Instead of puzzling over bits of hardware or typing up punch cards to feed into someone else’s mainframe, Apple owners faced only the hurdle of a cryptic blinking cursor in the upper left corner of the screen: the PC awaited instructions. But the hurdle was not high. Some owners were inspired to program the machines themselves, but beginners, too, could load software written and then shared or sold by their more skilled counterparts. The Apple II was a blank slate, a bold departure from previous technology that had been developed and marketed to perform specific tasks.

The Apple II quickly became popular. And when programmer and entrepreneur Dan Bricklin introduced the first killer application for the Apple II in 1979—VisiCalc, the world’s first spreadsheet program—sales of the ungainly but very cool machine took off. An Apple running VisiCalc helped to convince a skeptical world that there was a place for the PC on everyone’s desk.

The Apple II was quintessentially generative technology. It was a platform. It invited people to tinker with it. Hobbyists wrote programs. Businesses began to plan on selling software. Jobs (and Apple) had no clue how the machine would be used. They had their hunches, but, fortunately for them (and the rest of us), nothing constrained the PC to the hunches of the founders.

The iPhone—for all its startling inventiveness—is precisely the opposite. Rather than a platform that invites innovation, the iPhone comes preprogrammed. You are not allowed to add programs to the all-in-one device that Steve Jobs sells you. Its functionality is locked in, though Apple can change it through remote updates. Indeed, those who managed to tinker with the code and enable iPhone support of more or different applications were on the receiving end of Apple’s threat to transform the iPhone into an iBrick, a threat on which the company delivered. The machine was not to be generative beyond the innovations that Apple (and its exclusive carrier, AT&T) wanted—or perhaps beyond software provided by partners that Apple approves, should it release an expected software development kit.

Jobs was not shy about these restrictions. As he said at the iPhone launch: “We define everything that is on the phone . . . You don’t want your phone to be like a PC. The last thing you want is to have loaded three apps on your phone and then you go to make a call and it doesn’t work anymore.”

In the arc from the Apple II to the iPhone, we learn something important about where the Internet has been, and something even more important about where it is going. The PC revolution was launched with PCs that invited innovation by others. So too with the Internet. Both were designed to accept any contribution that followed a basic set of rules (either coded for a particular operating system, or respecting the protocols of the Internet). Both overwhelmed their respective proprietary, non-generative competitors: PCs crushed stand-alone word processors and the Internet displaced such proprietary online services as CompuServe and AOL.

But the future is looking very different because of the security situation—not generative PCs attached to a generative network, but appliances tethered to a network of control. These appliances take the innovations already created by Internet users and package them neatly and compellingly, which is good—but only if the Internet and PC can remain sufficiently central in the digital ecosystem to compete with locked-down appliances and facilitate the next round of innovations. The balance between the two spheres is precarious, and it is slipping toward the safer appliance. For example, Microsoft’s Xbox 360 video game console is a powerful computer, but, unlike Microsoft’s Windows operating system for PCs, it does not allow just anyone to write software that can run on it. Bill Gates sees the Xbox at the center of the future digital ecosystem, rather than its periphery: “It is a general purpose computer . . . [W]e wouldn’t have done it if it was just a gaming device. We wouldn’t have gotten into the category at all. It was about strategically being in the living room.”

Devices like iPhones and Xbox 360s may be safer to use, and they may seem capacious in features so long as they offer a simple Web browser. But by focusing on security and limiting the damage that users can do through their own ignorance or carelessness, these appliances also limit the beneficial tools that users can create or receive from others—enhancements they may be clueless about when they are purchasing the device.

Security problems related to generative PC platforms may propel people away from PCs and toward information appliances controlled by their makers. If we eliminate the PC from many dens or living rooms, we eliminate the test bed and distribution point of new, useful software from any corner of the globe. We also eliminate the safety valve that keeps those information appliances honest. If TiVo makes a digital video recorder that has too many limits on what people can do with the video they record, people will discover DVR software like MythTV that records and plays TV shows on their PCs. If mobile phones are too expensive, people will use Skype. But people do not buy PCs as insurance policies against appliances that limit their freedoms, even though PCs serve exactly this vital function. People buy them to perform certain tasks at the moment of acquisition. If PCs cannot reliably perform these tasks, most consumers will not see their merit, and the safety valve will be lost. If the PC ceases to be at the center of the information technology ecosystem, the most restrictive aspects of information appliances will come to the fore.

In fact, the dangers may be more subtly packaged. PCs need not entirely disappear as people buy information appliances in their stead. They can themselves be made less generative. Users tired of making the wrong choices about installing code on their PCs might choose to let someone else decide what code should be run. Firewalls can protect against some bad code, but they also complicate the installation of new good code. As antivirus, antispyware, and antibadware barriers proliferate, there are new barriers to the deployment of new good code from unprivileged sources. And in order to guarantee effectiveness, these barriers are becoming increasingly paternalistic, refusing to let users easily overrule them. Especially in environments where the user of the PC does not own it—offices, schools, libraries, and cyber cafés—barriers are being put in place to prevent the running of any code not specifically approved by the relevant gatekeeper. Users may find themselves limited to using a Web browser. And while “Web 2.0” promises that much more use for a browser—consumers can now write papers and use spreadsheets through a browser, and software developers now write for Web platforms like Facebook instead of PC operating systems—these Web platforms are themselves tethered to their makers, their generativity contingent on the continued permission of the platform vendors.

Short of completely banning unfamiliar software, code might be divided into first- and second-class status, with second-class, unapproved software allowed to perform only certain minimal tasks on the machine, operating within a digital sandbox. This technical solution is safer than the status quo but imposes serious limits. It places the operating system creator or installer in the position of deciding what software will and will not run. The PC will itself have become an information appliance, not easily reconfigured or extended by its users.

The key to avoiding such a future is to give the market a reason not to abandon or lock down the PCs that have served it so well, and to give most governments reason to refrain from major intervention into Internet architecture in the name of public safety. The solutions to the generative dilemma will rest on social and legal as much as technical innovation, and the best guideposts can be found in other generative successes in those arenas. Mitigating abuses of openness without resorting to lockdown will depend on a community ethos embodied in responsible groups with shared norms and a sense of public purpose, rather than in the hands of a single gatekeeper, whether public or private.

* * *

We need a strategy that addresses the emerging security troubles of today’s Internet and PCs without killing their openness to innovation. This is easier said than done, because our familiar legal tools are not particularly attuned to maintaining generativity. A simple regulatory intervention—say, banning the creation or distribution of deceptive or harmful code—will not work because it is hard to track the identities of sophisticated wrongdoers, and, even if found, many may not be in cooperative jurisdictions. Moreover, such intervention may have a badly chilling effect: much of the good code we have seen has come from unaccredited people sharing what they have made for fun, collaborating in ways that would make businesslike regulation of their activities burdensome for them. They might be dissuaded from sharing at all.

We can find a balance between needed change and undue restriction if we think about how to move generative approaches and solutions that work at one “layer” of the Internet—content, code, or technical—to another. Consider Wikipedia, the free encyclopedia whose content—the entries and their modifications—is fully generated by the Web community. The origins of Wikipedia lie in the open architecture of the Internet and Web. This allowed Ward Cunningham to invent the wiki, generic software that offers a way of editing or organizing information within an article, and spreading this information to other articles. Unrelated non-techies then used wikis to form Web sites at the content layer, including Wikipedia. People are free not only to edit Wikipedia, but to take all of its contents and experiment with different ways of presenting or changing the material, perhaps by placing the information on otherwise unrelated Web sites in different formats. When abuses of this openness beset Wikipedia with vandalism, copyright infringement, and lies, it turned to its community—aided by some important technical tools—as the primary line of defense, rather than copyright or defamation law. Most recently, this effort has been aided by the introduction of Virgil Griffith’s Wikiscanner, a simple tool that uses Wikipedia’s page histories to expose past instances of article whitewashing by interested parties.

Unlike a form of direct regulation that would have locked down the site, the Wikipedian response so far appears to have held many of Wikipedia’s problems at bay. Why does it work so well? Generative solutions at the content layer seem to have two characteristics that suggest broad approaches to lowering the risks of the generative Internet while preserving its openness. First, much participation in generating Web content—editing Wikipedia entries, blogging, or even engaging in transactions on eBay and Amazon that ask for reviews and ratings to establish reputations—is understood to be an innately social activity. These services solicit and depend upon participation from the public, and their participation mechanisms are easily mastered. The same possibility for broad participation exists one level down at the technical layer, but it has not yet been as fully exploited: mainstream users have thus far been eager to have someone else solve underlying problems, which they perceive as technical rather than social. Second, many content-layer enterprises have developed technical tools to support collective participation, augmenting an individualistic ethos with community-facilitating structures. In the Internet and PC security space, on the other hand, there have been few tools available to tap the power of groups of users to, say, distinguish good code from bad.

The effectiveness of the social layer in Web successes points to two approaches that might save the generative spirit of the Net, or at least keep it alive for another interval. The first is to reconfigure and strengthen the Net’s experimentalist architecture to make it fit better with the vast expansion in the number and types of users. The second is to develop new tools and practices that will enable relevant people and institutions to help secure the Net themselves instead of waiting for someone else to do it.

Generative PCs with Easy Reversion. Wikis are designed so that anyone can edit them. This creates a genuine and ongoing risk of bad edits, through either incompetence or malice. The damage that can be done, however, is minimized by the wiki technology, because it allows bad changes to be quickly reverted. All previous versions of a page are kept, and a few clicks by another user can restore a page to the way it was before later changes were made. So long as there are more users (and automated tools they create) detecting and reverting vandalism than there are users vandalizing, the community wins. (Truly, the price of freedom is eternal vigilance.)
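The reversion mechanism described above can be made concrete with a toy model. What follows is only a sketch under stated assumptions: the `WikiPage` class and its method names are hypothetical and are not drawn from any real wiki software. It shows the one property the argument relies on, namely that every version is kept and that a revert is itself just another saved version.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WikiPage:
    """Toy page that keeps every saved version (hypothetical, not real wiki software)."""
    versions: List[str] = field(default_factory=lambda: [""])

    @property
    def current(self) -> str:
        """The text readers see: always the most recently saved version."""
        return self.versions[-1]

    def edit(self, new_text: str) -> int:
        """Save a new version and return its index in the history."""
        self.versions.append(new_text)
        return len(self.versions) - 1

    def revert_to(self, index: int) -> int:
        """Restore an earlier version by saving a copy of it as the newest one.

        Nothing is deleted, so the bad edit remains visible in the history.
        """
        if not 0 <= index < len(self.versions):
            raise IndexError("no such version")
        return self.edit(self.versions[index])


page = WikiPage()
good = page.edit("The Morris worm spread in November 1988.")
page.edit("VANDALISM")   # a bad edit, through incompetence or malice
page.revert_to(good)     # another user restores the earlier text
print(page.current)
```

Because the revert is cheap and the history is never discarded, the cost of repairing damage stays lower than the cost of causing it, which is the balance the paragraph above depends on.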

Our PCs can be similarly equipped. For years Windows XP (and now Vista) has had a system restore feature, where snapshots are taken of the machine at a moment in time, allowing later bad changes to be rolled back. The process of restoring is tedious, restoration choices can be frustratingly all-or-nothing, and the system restoration files themselves can become corrupted, but it represents progress. Even better would be the introduction of features that are commonplace on wikis: a quick chart of the history of each document, with an ability to see date-stamped sets of changes going back to its creation. Because our standard PC applications assume a safer environment than really exists, these features have never been demanded or implemented. Because wikis are deployed in environments prone to vandalism, their contents are designed to be easily recovered after a problem.

The next stage of this technology lies in new virtual machines, which would obviate the need for cyber cafés and corporate IT departments to lock down their PCs.

In an effort to satisfy the desire for safety without full lockdown, PCs can be designed to pretend to be more than one machine, capable of cycling from one personality to the next. In its simplest implementation, we could divide a PC into two virtual machines: “Red” and “Green.” The Green PC would house reliable software and important data—a stable, mature OS platform and tax returns, term papers, and business documents. The Red PC would have everything else. In this setup, nothing that happens on one PC can easily affect the other, and the Red PC could have a simple reset button that restores a predetermined safe state. Someone could confidently store important data on the Green PC and still use the Red PC for experimentation.

Easy, wiki-style reversion, coupled with virtual PCs, would accommodate the experimentalist spirit of the early Internet while acknowledging the important uses for those PCs that we do not want to disrupt. Still, this is not a complete solution. The Red PC, despite its experimental purpose, might end up accumulating data that the user wants to keep, occasioning the need for what Internet architect David D. Clark calls a “checkpoint Charlie” to move sensitive data from Red to Green without also carrying a virus or anything else undesirable. There is also the question of what software can be deemed safe for Green—which is just another version of the question of what software to run on today’s single-identity PCs.
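A toy model may help fix the moving parts of the Red/Green idea: two isolated stores, a reset button for Red, and a gate for moving data to Green. It is a sketch only. The `VirtualPC` class, `checkpoint_transfer`, and `naive_check` are hypothetical names, plain dictionaries stand in for real virtual machines, and the placeholder inspection function glosses over exactly the hard problem noted above.

```python
import copy
from typing import Callable, Dict


class VirtualPC:
    """Toy stand-in for a virtual machine: a dict mapping file names to contents."""

    def __init__(self, name: str, safe_state: Dict[str, str]):
        self.name = name
        self._safe_state = copy.deepcopy(safe_state)  # snapshot used by reset()
        self.files = copy.deepcopy(safe_state)

    def reset(self) -> None:
        """The 'reset button': discard all changes and return to the safe state."""
        self.files = copy.deepcopy(self._safe_state)


def checkpoint_transfer(src: VirtualPC, dst: VirtualPC, filename: str,
                        looks_safe: Callable[[str], bool]) -> bool:
    """Copy one file from src to dst only if the inspection function approves it."""
    contents = src.files[filename]
    if not looks_safe(contents):
        return False
    dst.files[filename] = contents
    return True


def naive_check(contents: str) -> bool:
    # Placeholder inspection (an assumption): real vetting is the unsolved part.
    return "MALWARE" not in contents


green = VirtualPC("Green", {"taxes.txt": "2007 return"})
red = VirtualPC("Red", {})

red.files["notes.txt"] = "draft worth keeping"
red.files["game.exe"] = "MALWARE payload"

moved_notes = checkpoint_transfer(red, green, "notes.txt", naive_check)
moved_game = checkpoint_transfer(red, green, "game.exe", naive_check)
red.reset()  # everything still on Red is discarded

print(sorted(green.files), red.files)
```

The sketch makes the limitation visible: everything turns on the quality of the inspection step, which is the same judgment a single-identity PC has to make today.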

For these and related reasons, virtual machines will not be panaceas, but they might buy us some more time. And they implement a guiding principle from the Net’s history: an experimentalist spirit is best maintained when failures can be contained as learning experiences rather than expanding to catastrophes.

A Generative Solution to Bad Code. The Internet’s original design relied on few mechanisms of central control. This lack of control has the generative benefit of allowing new services to be introduced, and new destinations to come online, without any up-front vetting or blocking by either private incumbents or public authorities. With this absence of central control comes an absence of measurement. The Internet itself cannot say how many users it has, because it does not maintain user information. There is no awareness at the network level of how much bandwidth is being used by whom. From a generative point of view this is good because it allows initially whimsical but data-intensive uses of the network to thrive (remember goldfish cams?)—and perhaps to become vital (now-routine videoconferencing through Skype, from, unsettlingly, the makers of KaZaA).

But limited measurement is starting to have generative drawbacks. Because we cannot easily measure the network and the character of the activity on it, we cannot easily assess and deal with threats from bad code without laborious and imperfect cooperation among a limited group of security software vendors. The future of the generative Net depends on a wider circle of users able to grasp the basics of what is going on within their machines and between their machines and the network.

What might this system look like? Roughly, it would take the form of toolkits to overcome the digital solipsism that each of our PCs experiences when it attaches to the Internet at large, unaware of the size and dimension of the network to which it connects. These toolkits would run unobtrusively on the PCs of participating users, reporting back—to a central source, or perhaps only to each other—information about the vital signs and running code of that PC, which could help other PCs determine the level of risk posed by new code. When someone is deciding whether to run new software, the toolkit’s connections to other machines could tell the person how many other machines on the Internet are running the code, what proportion of machines belonging to self-described experts are running it, whether those experts have vouched for it, and how long the code has been in the wild.
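To make the idea concrete, here is a minimal sketch of the summary such a toolkit might show someone deciding whether to run new code. Everything in it is an assumption made for illustration: the `Report` record, the `summarize` function, the field names, and the premise that each participating machine files at most one report per piece of code. It is not a description of any existing tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Report:
    """One participating machine's report about one piece of code (hypothetical)."""
    machine_id: str
    code_hash: str
    is_expert: bool    # the machine's owner is a self-described expert
    vouched: bool      # that owner has explicitly vouched for the code
    first_seen: date   # when this machine first ran the code


def summarize(reports, code_hash, total_machines, total_experts, today):
    """Answer the four questions from the text for a single piece of code."""
    matching = [r for r in reports if r.code_hash == code_hash]
    experts = [r for r in matching if r.is_expert]
    earliest = min((r.first_seen for r in matching), default=None)
    return {
        # How many other machines are running the code?
        "machines_running": len(matching),
        "share_of_machines": len(matching) / total_machines if total_machines else 0.0,
        # What proportion of self-described experts' machines are running it?
        "share_of_experts": len(experts) / total_experts if total_experts else 0.0,
        # Have those experts vouched for it?
        "expert_vouches": sum(1 for r in experts if r.vouched),
        # How long has the code been in the wild?
        "days_in_wild": (today - earliest).days if earliest else 0,
    }
```

A design like this leaves the judgment to the person at the keyboard: the toolkit reports counts and proportions, and does not itself declare the code safe or unsafe.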

Building on these ideas about measurement and code assessment, Harvard University’s Berkman Center and the Oxford Internet Institute—multidisciplinary academic enterprises dedicated to charting the future of the Net and improving it—have begun a project called StopBadware, designed to assist rank-and-file Internet users in identifying and avoiding bad code. The idea is not to replicate the work of security vendors like Symantec and McAfee, which for a fee seek to bail new viruses out of our PCs faster than they pour in. Rather, these academic groups are developing a common technical and institutional framework that enables users to devote some bandwidth and processing power for better measurement of the effect of new code. The first step in the toolkit is now available freely for download. Herdict is a small piece of software that assembles vital signs like number of pop-up windows or crashes per hour. It incorporates that data into a dashboard usable by mainstream PC owners. Efforts like Herdict will test the idea that solutions that have worked for generating content might also be applicable to the technical layer. Such a system might also illuminate Internet filtering by governments around the world, as people participate in a system where they can report when they cannot access a Web site, and such reports can be collated by geography.
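Collating such reports by geography is simple in principle. The sketch below assumes, purely for illustration, that each report is a pair of a country code and the site that could not be reached; the function name `collate_by_country` and the placeholder codes are hypothetical.

```python
from collections import Counter, defaultdict


def collate_by_country(reports):
    """Count failed-access reports per site, broken down by country.

    `reports` is an iterable of (country_code, site) pairs, one pair for
    each time a participant reported being unable to reach a site.
    """
    by_site = defaultdict(Counter)
    for country, site in reports:
        by_site[site][country] += 1
    return by_site


# Placeholder data: "AA" and "BB" stand in for real country codes.
sample = [("AA", "example.org"), ("AA", "example.org"), ("BB", "example.org")]
counts = collate_by_country(sample)
```

A site that is unreachable from one country but reachable from everywhere else would stand out in these counts, though the counts alone cannot distinguish filtering from an ordinary outage.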

A full adoption of the lessons of Wikipedia would give PC users the opportunity to have some ownership, some shared stake, in the process of evaluating code, especially because they have a stake in getting it right for their own machines. Sharing useful data from their PCs is one step, but this may work best when the data goes to an entity committed to the public interest of solving PC security problems and willing to share that data with others. The notion of a civic institution here does not necessarily mean cumbersome governance structures and formal lines of authority so much as it means a sense of shared responsibility and participation. Think of the volunteer fire department or neighborhood watch: while not everyone is able to fight fires or is interested in watching, a critical mass of people are prepared to contribute, and such contributions are known to the community more broadly.

The success of tools drawing on group generativity depends on participation, which helps establish the legitimacy of the project both to those participating and those not. Internet users might see themselves only as consumers whose purchasing decisions add up to a market force, but, with the right tools, users can also see themselves as participants in the shaping of generative space—as netizens.

Along with netizens, hardware and software makers could also get involved. OS makers could be asked or required to provide basic tools of transparency that empower users to understand exactly what their machines are doing. These need not be as sophisticated as Herdict. They could provide basic information on what data is going in and out of the box and to whom. Insisting on getting better information to users could be as important as providing a speedometer or fuel gauge on an automobile—even if users do not think they need one.

Internet Service Providers (ISPs) can also reasonably be asked or required to help. Thus far, ISPs have been on the sidelines regarding network security. The justification is that the Internet was rightly designed to be a dumb network, with most of its features and complications pushed to the endpoints. The Internet’s engineers embraced the simplicity of the end-to-end principle for good reasons. It makes the network more flexible, and it puts designers in a mindset of making the system work rather than designing against every possible thing that could go wrong. Since this early architectural decision, “keep the Internet free” advocates have advanced the notion of end-to-end neutrality as an ethical ideal, one that leaves the Internet without filtering by any of its intermediaries, routing packets of information between sender and recipient without anyone looking on the way to see what they contain. Cyberlaw scholars have taken up end-to-end as a battle cry for Internet freedom, invoking it to buttress arguments about the ideological impropriety of filtering Internet traffic or favoring some types or sources of traffic over others.

End-to-end neutrality has indeed been a crucial touchstone for Internet development. But it has limits. End-to-end design preserves users’ freedom only because the users can configure their own machines however they like. But this depends on the increasingly unreliable presumption that whoever runs a machine at a given network endpoint can readily choose how the machine will work. Consider that in response to a network teeming with viruses and spam, network engineers recommend more bandwidth (so the transmission of “deadweights” like viruses and spam does not slow down the much smaller proportion of legitimate mail being carried by the network) and better protection at user endpoints. But users are not well positioned to painstakingly maintain their machines against attack, and intentional inaction at the network level may be self-defeating, because consumers may demand locked-down endpoint environments that promise security and stability with minimum user upkeep.

Strict loyalty to end-to-end neutrality should give way to a new principle asking that any modifications to the Internet’s design or the behavior of ISPs be made in such a way that they will do the least harm to generative possibilities. Thus, it may be preferable in the medium term to screen out viruses through ISP-operated network gateways rather than through constantly updated PCs. To be sure, such network screening theoretically opens the door to undesirable filtering. But we need to balance this speculative risk against the growing threat to generativity. ISPs are in a good position to help in a way that falls short of undesirable perfect enforcement facilitated through endpoint lockdown, by providing a stopgap while we develop the kinds of community-based tools that can promote salutary endpoint screening.

Even search engines can help create a community process that has impact. In 2006, in cooperation with the Harvard and Oxford StopBadware initiative, Google began automatically identifying Web sites that had malicious code hidden in them, ready to infect browsers. Some of these sites were set up for the purpose of spreading viruses, but many more were otherwise-legitimate Web sites that had been hacked. For example, visitors to chuckroast.com can browse fleece jackets and other offerings and place and pay for orders. However, Google found that hackers had subtly changed the chuckroast.com code: the basic functionalities were untouched, but code injected on the home page would infect many visitors’ browsers. Google tagged the problem, and appended to the Google search result: “Warning: This site may harm your computer.” Those who clicked on the results link anyway would get an additional warning from Google and the suggestion to visit StopBadware or pick another page.

The site’s traffic plummeted, and the owner (along with the thousands of others whose sites were listed) was understandably anxious to fix it. But cleaning a hacked site takes more than an amateur Web designer. Requests for specialist review inundated StopBadware researchers. Until StopBadware could check each site and verify it had been cleaned of bad code, the warning pages stayed up. Prior to the Google/StopBadware project, no one took responsibility for this kind of security. Ad hoc alerts to the hacked sites’ webmasters—and their ISPs—garnered little reaction. The sites were fulfilling their intended purposes even as they were spreading viruses to visitors. With Google/StopBadware, Web site owners have experienced a major shift in incentives for keeping their sites clean.

The result is perhaps more powerful than a law that would have directly regulated them, and it could in turn generate a market for firms that help validate, clean, and secure Web sites. Still, the justice of Google/StopBadware and similar efforts remains rough, and market forces alone might not direct the desirable level of attention to those wrongly labeled as people or Web sites to be avoided, or properly labeled but with no place to seek help.

The touchstone for judging such efforts is whether they reflect the generative principle: do the solutions arise from and reinforce a system of experimentation? Are the users of the system able, so far as they are interested, to find out how the resources they control—such as a PC—are participating in the environment? Done well, these interventions can encourage even casual users to have some part in directing what their machines will do, while securing those users’ machines against outsiders who have not been given permission by the users to make use of them. Automatic accessibility by outsiders—whether by vendors, malware authors, or governments—can deprive a system of its generative character as its users are limited in their own control.

Data Portability. The generative Internet was founded and cultivated by people and institutions acting outside traditional markets, and later carried forward by commercial forces. Its success requires an ongoing blend of expertise and contribution from multiple models and motivations. Ultimately, it may also be in order for the law to allocate responsibility both to commercial technology players who are in a position to help but lack the economic incentive to do so, and to those among us, commercially inclined or not, who step forward to solve the pressing problems that elude simpler solutions. How can the law be shaped if one wants to reconcile generative experimentation with other policy goals beyond continued technical stability? The next few proposals are focused on this question about the constructive role of law.

One important step is making locked-down appliances and Web 2.0 software-as-service more palatable. After all, they are here to stay, even if the PC and Internet are saved. The crucial issue here is that a move to tethered appliances and Web services means that more and more of our experiences in the information space will be contingent: a service or product we use at one moment could act completely differently the next, since it can be so quickly reprogrammed by the provider without our assent. Each time we power up a mobile phone, video game console, or BlackBerry, it might have gained some features and lost others. Each time we visit a Web site offering an ongoing service like e-mail access or photo storage, the same is true.

As various services and applications become more self-contained within particular devices, there is a minor intervention the law could make to avoid undue lock-in. Online consumer protection law has included attention to privacy policies. A Web site without a privacy policy, or one that does not live up to whatever policy it posts, is open to charges of unfair or deceptive trade practices. Similarly, makers of tethered appliances and Web sites keeping customer data ought to be asked to offer portability policies. These policies would declare whether users will be allowed to extract their data should they wish to move their activities from one appliance or Web site to another. In some cases, the law could create a right of data portability, in addition to merely insisting on a clear statement of a site’s policies.

A requirement of data portability is a generative insurance policy applying to individual data wherever it might be stored. And the requirement need not be onerous. It could apply only to uniquely provided personal data such as photos and documents, and mandate only that such data ought to readily be extractable by the user in some standardized form. Maintaining data portability will help people pass back and forth between the generative and the non-generative, and, by permitting third-party backup, it will also help prevent a situation in which a non-generative service suddenly goes offline, with no recourse for those who have used the service to store their data.
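A sketch suggests how light such a requirement could be. It assumes a hypothetical service that keeps its records as a list of dictionaries with `owner`, `kind`, `name`, `content`, and `user_provided` fields, and it picks JSON as the standardized form only for illustration; the text above does not name a format.

```python
import json


def export_user_data(store, user_id):
    """Return one user's uniquely provided data as a JSON document.

    Only items the user supplied (photos, documents) are exported;
    data the service derived on its own is left out.
    """
    items = [
        {"kind": item["kind"], "name": item["name"], "content": item["content"]}
        for item in store
        if item["owner"] == user_id and item["user_provided"]
    ]
    return json.dumps({"user": user_id, "items": items}, indent=2, sort_keys=True)


# Hypothetical records held by a photo and document service.
store = [
    {"owner": "ann", "kind": "photo", "name": "beach.jpg",
     "content": "<image bytes, encoded>", "user_provided": True},
    {"owner": "ann", "kind": "ad-profile", "name": "segments",
     "content": "derived by the service", "user_provided": False},
    {"owner": "bob", "kind": "document", "name": "notes.txt",
     "content": "hello", "user_provided": True},
]
exported = export_user_data(store, "ann")
```

The same export could feed a third-party backup, which is what would protect users if the service suddenly went offline.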

Appliance Neutrality. Reasonable people disagree on the value of defining and legally mandating network neutrality. But if there is a present worldwide threat to neutrality in the movement of bits, it comes from enhancements to traditional and emerging “appliancized” services like Google mash-ups and Facebook apps, in which the service provider can be pressured to modify or kill others’ applications on the fly. Surprisingly, parties to the network neutrality debate—who have focused on ISPs—have yet to weigh in on this phenomenon.

In the late 1990s, Microsoft was found to possess a monopoly in the market for PC operating systems. Indeed, it was found to be abusing that monopoly to favor its own applications—such as its Internet Explorer browser—over third-party software, against the wishes of PC makers who wanted to sell their hardware with Windows preinstalled but adjusted to suit the makers’ tastes. Because it had allowed third-party contribution from the start (an ability to run outside software), Microsoft, after achieving market dominance, was forced by the law to meet ongoing requirements to maintain a level playing field between third-party software and its own.

We have not seen the same requirements arising for appliances that do not allow, or strictly control, the ability of third parties to contribute from the start. So long as the market’s favorite video game console maker never opens the door to generative third-party code, it is hard to see how the firm could be found to be violating competition law. A manufacturer is entitled to make an appliance and to try to bolt down its inner workings so that they cannot be modified by others. So when should we consider network neutrality-style mandates for appliancized systems? The answer lies in that subset of appliancized systems that seeks to gain the generative benefits of third-party contribution at one point in time while reserving the right to exclude it later.

The common law recognizes vested expectations. For example, the law of adverse possession dictates that people who openly occupy another’s private property without the owner’s explicit objection (or, for that matter, permission) can, after a lengthy period of time, come to legitimately acquire it. More commonly, property law can find prescriptive easements—rights-of-way across territory that develop by force of habit—if the owner of the territory fails to object in a timely fashion as people go back and forth across it. These and related doctrines point to a deeply held norm: certain consistent behaviors can give rise to obligations, sometimes despite fine print that tries to prevent those obligations from coming about.

Applied to the idea of application neutrality, this norm of protecting settled expectations might suggest the following: if Microsoft wants to make the Xbox a general purpose device but still not open to third-party improvement, no regulation should prevent it. But if Microsoft does so by welcoming third-party contribution, it should not later be able to impose barriers to outside software continuing to work. Such behavior is a bait and switch that is not easy for the market to anticipate and that stands to allow a platform maker to exploit habits of generativity to reach a certain plateau, dominate the market, and then make the result proprietary—exactly what the Microsoft case rightly was brought to prevent.

Generative Software. At the code layer, it is not easy for the law to maintain neutrality between the two models of software production that have emerged with the Net: proprietary software whose source code recipe is nearly always hidden, and free software—free not in terms of the price, but the openness of its code to public view and modification. The free software movement has produced some great works, but under prevailing copyright law even the slightest bit of “poison,” in the form of code from a proprietary source, could amount to legal liability for anyone who copies or even uses the software. These standards threaten the long-term flourishing of the free software movement: the risks are more burdensome than need be.

But there are some changes to the law that would help. The kind of law that shields Wikipedia and Web site hosting companies from liability for unauthorized copyrighted material contributed by outsiders, at least so long as the organization acts expeditiously to remove infringing material once it is notified, ought to be extended to the production of code itself. Code that incorporates infringing material ought not be given a free pass, but those who have promulgated it without knowledge of the infringement would have a chance to repair the code or cease copying it before becoming liable.

Modest changes in patent law could help as well. If those who see value in software patents are correct, infringement is rampant. And to those who think patents chill innovation, the present regime needs reform. To be sure, amateurs who do not have houses to lose to litigation can still contribute to free software projects—they are judgment proof. Others can contribute anonymously, evading any claims of patent infringement since they simply cannot be found. But this turns coding into a gray market activity, eliminating what otherwise could be a thriving middle class of contributing firms should patent warfare ratchet into high gear.

The law can help level the playing field. For patent infringement in the United States, the statute of limitations is six years; for civil copyright infringement it is three. Unfortunately, this limit has little meaning for computer code because the statute of limitations starts from the time of the last infringement. Every time someone copies (or perhaps even runs) the code, the clock starts ticking again on a claim of infringement. This should be changed. The statute of limitations could be clarified for software, requiring that anyone who suspects or should suspect his or her work is being infringed sue within, for instance, one year of becoming aware of the suspect code. For example, the acts of those who contribute to free software projects—namely, releasing their code into a publicly accessible database like SourceForge—could be enough to start the clock ticking on that statute of limitations. In the absence of such a rule, lawyers who think their employers’ proprietary interests have been compromised can wait to sue until a given piece of code has become wildly popular—essentially sandbagging the process in order to let damages rack up.
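The difference between the two rules shows up in simple date arithmetic. The sketch below is a toy model, not legal analysis: it takes the three-year copyright figure and the suggested one-year window from the text, treats public release as the moment the claimant should have become aware, and uses invented dates. The function names are hypothetical.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def deadline_current_rule(copy_dates, limit_years):
    """Current rule as described above: the clock restarts with every copy."""
    return add_years(max(copy_dates), limit_years)


def deadline_proposed_rule(public_release, notice_years=1):
    """Proposed rule: sue within a fixed period of when the code became public."""
    return add_years(public_release, notice_years)


# Invented timeline: code released publicly, then copied again years later.
release = date(2008, 3, 1)
copies = [date(2008, 3, 1), date(2010, 6, 15), date(2014, 1, 10)]

current = deadline_current_rule(copies, limit_years=3)  # keeps moving with each copy
proposed = deadline_proposed_rule(release)              # fixed once the code is public
```

Under the first rule the deadline recedes for as long as anyone keeps copying the code, which is what makes waiting attractive to a claimant; under the second it is settled a year after release.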

Generative Licenses. There is a parallel to how we think about balancing generative and sterile code at the content layer: legal scholars Lawrence Lessig and Yochai Benkler, as well as others, have stressed that even the most rudimentary mixing of cultural icons and elements, including snippets of songs and video, can accrue thousands of dollars in legal liability for copyright infringement without harming the market for the original proprietary goods. Benkler believes that the explosion of amateur creativity online has occurred despite this system. The high costs of copyright enforcement and the widespread availability of tools to produce and disseminate what he calls “creative cultural bricolage” currently allow for a variety of voices to be heard even when what they are saying is theoretically sanctionable by fines up to $30,000 per copy made, $150,000 if the infringement is done “willfully.” As with code, the status quo shoehorns otherwise laudable activity into a sub-rosa gray zone.

As tethered appliances begin to take up more of the information space, making information that much more regulable, we have to guard against the possibility that content produced by citizens who cannot easily clear permissions for all its ingredients will be squeezed out. Even the gray zone will constrict.

* * *

Regimes of legal liability can be helpful when there is a problem and no one has taken ownership of it. No one fully owns today’s problems of copyright infringement and defamation online, just as no one fully owns security problems on the Net. But the solution is not to conscript intermediaries to become the Net police.

Under prevailing law, Wikipedia could get away with much less stringent monitoring of its articles for plagiarized work, and it could leave plainly defamatory material in an article but be shielded in the United States by the Communications Decency Act provision exempting those hosting material from responsibility for what others have provided. Yet Wikipedia polices itself according to an ethical code that encourages contributors to do the right thing rather than the required thing or the profitable thing.

To harness Wikipedia’s ethical instinct across the layers of the generative Internet, we must figure out how to inspire people to act humanely in digital environments. This can be accomplished with tools—some discussed above, others yet to be invented. For the generative Internet to come fully into its own, it must allow us to exploit the connections we have with each other. Such tools allow us to express and live our civic instincts online, trusting that the expression of our collective character will be one at least as good as that imposed by outside sovereigns—sovereigns who, after all, are only people themselves.

Our generative technologies need technically skilled people of good will to keep them going, and the fledgling generative activities—blogging, wikis, social networks—need artistically and intellectually skilled people of good will to serve as true alternatives to a centralized, industrialized information economy that asks us to identify only as consumers of meaning rather than as makers of it. The deciding factor in whether our current infrastructure can endure will be the sum of the perceptions and actions of its users. Traditional state sovereigns, pan-state organizations, and formal multi-stakeholder regimes have roles to play. They can reinforce conditions necessary for generative blossoming, and they can also step in when mere generosity of spirit cannot resolve conflict. But that generosity of spirit is a society’s crucial first line of moderation.

Our fortuitous starting point is a generative device on a neutral Net in tens of millions of hands. Against the trend of sterile devices and services that will replace the PC and Net stand new platforms like Google’s Android, a user-programmable mobile phone not tethered to any one service provider. To maintain that openness, the users of those devices must experience the Net as something with which they identify and to which they belong. We must use the generativity of the Net to engage a constituency that will protect and nurture it.
