On Wikipedia, the Internet Archive, and what MIT's Sanjay Sarma calls the Ziggurat of Facts.
Digitally today we gambol across the Internet like children let out early from school. We go about liking a post on Instagram, posting a video on YouTube, reading an article on Wikipedia, bumping to some new sounds on Spotify, and (for those of us who act our age) looking up a news story on CNN or googling a fact, the weather, or a sports score. Yet, while it seems as though we are roaming through the fields now gaily and freely, we are actually running across other people’s lawns, crossing the private railroad company’s trainyards and trestles, sprinting across endless box-store parking lots, one after another. And these owners are chasing us, just like in the movies—only they are running after us and capturing our data, filming us, in a way, as we all scramble across our rectangles, rather than trying to call the police or beat us some about the head.
And while it may seem at first glance that all of these platforms are the same—they appear free, or freely accessible, for the most part; they all appear to us on the same rectangle of screen and through the same audio speakers; their logos and iconography are all part of our common brand awareness now; they are all, to continue the metaphor, outside—they are not the same. If intellectual property—and copyrighted IP in particular—is something, at the end of days, we will ultimately be able to enjoy together, we should understand the mechanisms that facilitate our ability to share these materials today. Very few sites on the Internet are free as in “freedom”: nonprofit; noncommercial; ad-free; open to all at all times. The sites that are free, in this sense, are our digital parks now, our forests, our civic centers, our real public spaces, and they need our nurture. This includes platforms for the distribution of knowledge online—and also mechanisms for the collective and collaborative production of knowledge. And increasingly, the two go hand in glove.
For the Commons, only Wikipedia, of those in the above list, is part of that process of production. It is our air. It does not belong to someone else. Wikipedia promotes a self-conscious approach to rights—to contributor rights, to use rights, to user rights—as its first order of business. Not only is the actual process of knowledge creation on Wikipedia one that actually produces more equity and fairness within the act of authoring and editing—never closed to anyone, never static; not only is it an encyclopedia that even ten years ago rivalled and usually surpassed the quality of the Encyclopedia Britannica and other longstanding print and online references; today this powerful website, the fifth or sixth or tenth most visited site in the world, shapes knowledge creation as well as reflects the state of what we know.1
It is also, and rather significantly for this volume, the most gigantic and successful realization ever known of the original Enlightenment project. A commentator on Twitter discovered that you can cause a disturbance—and provide moments of serious fear and confusion—by taking almost any Wikipedia article and putting its descriptions of things that exist today into the past tense. Take Wikipedia’s article on water, for example. Today it reads:
Water is an inorganic, transparent, tasteless, odorless, and nearly colorless chemical substance, which is the main constituent of Earth’s hydrosphere and the fluids of all known living organisms. It is vital for all known forms of life, even though it provides no calories or organic nutrients.
Now read it this way:
Water was an inorganic, transparent, tasteless, odorless, and nearly colorless chemical substance, which was the main constituent of Earth’s hydrosphere and the fluids of all known living organisms. It was vital for all known forms of life, even though it provided no calories or organic nutrients.
Or take oxygen.
Oxygen is the chemical element with the symbol O and atomic number 8. It is a member of the chalcogen group in the periodic table, a highly reactive nonmetal, and an oxidizing agent that readily forms oxides with most elements as well as with other compounds. After hydrogen and helium, oxygen is the third-most abundant element in the universe by mass.
Oxygen was the chemical element with the symbol O and atomic number 8. . . . After hydrogen and helium, oxygen was the third-most abundant element in the universe by mass. . . . Oxygen was continuously replenished in Earth’s atmosphere by photosynthesis, which used the energy of sunlight to produce oxygen from water and carbon dioxide.
And more past-tense tension:
The Supreme Court of the United States (SCOTUS) was the highest court in the federal judiciary of the United States of America. It had ultimate (and largely discretionary) appellate jurisdiction over all federal and state court cases that involve a point of federal law, and original jurisdiction over a narrow range of cases, specifically “all Cases affecting Ambassadors, other public Ministers and Consuls, and those in which a State shall be Party.” The Court held the power of judicial review, the ability to invalidate a statute for violating a provision of the Constitution. It was also able to strike down presidential directives for violating either the Constitution or statutory law.
Democracy (Greek: δημοκρατία, dēmokratiā, from dēmos “people” and kratos “rule”) was a form of government in which the people had the authority to choose their governing legislation. Who people are and how authority was shared among them were core issues for democratic theory, development and constitution. Some cornerstones of these issues were freedom of assembly and speech, inclusiveness and equality, membership, consent, voting, right to life and minority rights. . . . Generally, there were two types of democracy: direct and representative. In a direct democracy, the people directly deliberated and decided on legislature. In a representative democracy, the people elected representatives to deliberate and decide on legislature, such as in parliamentary or presidential democracy.
The United States of America (USA), commonly known as the United States (U.S. or US) or America, was a country mostly located in central North America, between Canada and Mexico. It consisted of 50 states, a federal district, five major self-governing territories, and various possessions. At 3.8 million square miles (9.8 million km2), it was the world’s third or fourth-largest country by total area. With a 2019 estimated population of over 328 million, the U.S. was the third most-populous country in the world. The Americans were a racially and ethnically diverse population that was shaped through centuries of immigration. The capital was Washington, D.C., and the most populous city was New York City.
The dystopic nature of this exercise is effective. It also works because many of us have come to recognize the lingua franca, structure, cadence, and content of a Wikipedia article, and rely on the site to provide us with entry points into learning more. Moreover, they are the key entry points online, as each of the above surfaces first in a Google search for the noun.2
Wikipedia’s terms of service enshrine the values that we should seek from any publishing platform—if our creations are meant to be our own, if contributing to knowledge is a part of what we want to accomplish, if the means of production are meant to be our own as well. It summarizes its terms as follows:
Part of our mission is to:
• Empower and Engage people around the world to collect and develop educational content and either publish it under a free license or dedicate it to the public domain.
• Disseminate this content effectively and globally, free of charge.
You are free to:
• Read and Print our articles and other media free of charge.
• Share and Reuse our articles and other media under free and open licenses.
• Contribute To and Edit our various sites or Projects.
Under the following conditions:
• Responsibility—You take responsibility for your edits (since we only host your content).
• Civility—You support a civil environment and do not harass other users.
• Lawful Behavior—You do not violate copyright or other laws.
• No Harm—You do not harm our technology infrastructure.
With the understanding that:
• You License Freely Your Contributions—you generally must license your contributions and edits to our sites or Projects under a free and open license (unless your contribution is in the public domain).
• No Professional Advice—the content of articles and other projects is for informational purposes only and does not constitute professional advice.3
And when it will be able to host and play video, much as Amazon and Facebook and Alphabet/Google/YouTube do—which is coming, by the way—Wikipedia will become the most important media platform of all time.4
In 2017, the Metropolitan Museum of Art in New York City released more than three hundred thousand images of items in its collections with a license that allows anyone, anywhere to use and modify them freely and for free—a license that facilitated their incorporation into Wikipedia. The thinking—it was not as sudden a shift as all that; Wikipedians had been working with the museum, as with other cultural institutions, from 2009, and social tagging at museums and so-called cultural informatics kicked off even before that—was that the Met could become more relevant to people’s lives by having its knowledge circulate around the world, and on platforms even beyond its own control, than if it were to continue to husband its resources on its own website alone. Still, the concept of releasing art and culture information—images and text, but also metadata—into a world that could manipulate that information and make things out of it is not that easy a concept for every museum curator (or for any lover of facts, truth, and knowledge, for that matter) to embrace.
The Met’s objective was also to introduce authoritative, canonical, definitive images of its holdings to the online world—and counter the persistent duplication of low-grade and otherwise faulty images of its artwork around the world, the so-called “Yellow Milkmaid Syndrome.” The term comes from a 2011 white paper that the European Commission–supported cultural agency Europeana published to drum up support for its own plan—realized the following year, five years before the Met’s public initiative—to release material relating to European museums and cultural-heritage organizations freely and openly on the web. Harry Verwayen, today executive director of Europeana and then one of the paper’s co-authors, summarized the challenge at the time, “the release of open metadata from the perspective of the business model of the cultural institutions.”
Why does it make sense for them to open up their data? The study showed that this depends to a large extent on the role that metadata plays in the business model of the institution. By and large, all institutions agreed that the principle advantages of opening their metadata is that this will increase their relevance in the digital space, it will engage new users with their holdings, and perhaps most importantly, that it is in alignment with their mission to make our shared cultural heritage more accessible to society.
But by themselves these arguments were not in all cases sufficiently convincing to make the bold move to open the data. There is also a fear that the authenticity of the works would be jeopardised if made available for anyone to re-use without attribution, and a loss of potential income if all control would be given away. All understandable arguments from institutions who are increasingly under financial pressure. Nevertheless one could feel that the balance was tilting towards opening access.
An illustrating anecdote was provided by the Rijksmuseum. The Milkmaid, one of Johannes Vermeer’s most famous pieces, depicts a scene of a woman quietly pouring milk into a bowl. During a survey the Rijksmuseum discovered that there were over 10,000 copies of the image on the Internet—mostly poor, yellowish reproductions. As a result of all of these low-quality copies on the web, according to the Rijksmuseum, “people simply didn’t believe the postcards in our museum shop were showing the original painting. This was the trigger for us to put high-resolution images of the original work with open metadata on the web ourselves. Opening up our data is our best defence against the ‘Yellow Milkmaid.’”
Open metadata for 20 million items in European collections from more than 2,200 institutions were released to the world that year.5
The Met took this initiative further. The Museum brought on a Wikipedian-in-residence to advocate within the Wikipedia community for more Met imagery and material to appear in Wikipedia. Through organized edit-a-thons that Wikipedia sponsored at the museum, as well as through individual initiatives among Wikipedia editors, Met Museum materials began to be published in key articles—on George Washington, Vincent van Gogh, Babylon, Benjamin Franklin, Henry VIII—across the vast encyclopedia. Images viewed by a hundred or two hundred web users a month on the Met’s own website suddenly were being seen by hundreds of thousands of web users of Wikipedia and elsewhere. Visitors to the museum in 2018 numbered 7 million; and to the Met website, 30 million; but to the Wikipedia pages featuring Met images and data, 190 million.6 The Met also began to explore opportunities to collaborate with other institutions interested in pushing technology, and artificial intelligence in particular, to spread art and culture knowledge—and to collaborate with the crowd. A new series of initiatives—involving MIT and others—launched in 2019 to accelerate experimentation between knowledge institutions and society.7
The Commons is growing almost exponentially fast.8 More than 1.4 billion works are online now, licensed with a Creative Commons license. Wikipedia hosts more than 40 million pieces of freely licensed content.9 Search tools are being built to discover freely licensed content exclusively.10 For knowledge institutions to embrace the Commons will be to recognize—in some direct way, but also in some symbolic, shamanistic, almost spiritual way—the power of the real public sphere, the power of society, and the power of the people (for whom, after all, we design our knowledge institutions). The system/the network/the apparatus allowing us to make this turn, and to secure it, is only twenty years old. The bright shiny objects of the Internet—Instagram (founded in 2010), Facebook (2004), Twitter (2006), YouTube (2005), Netflix (1997), WeChat (2011), and all the others—have fixed our attention, as happens, but it’s the combination of Wikipedia and Creative Commons, together with the relentless drumbeat of the Free Software Foundation, the Electronic Frontier Foundation, and other advocacy groups, that really have our best future in mind. And that future includes the past, as it is fair to say that the majority of all creative works created by mankind are (much stronger language than “should be”) already part of the Commons today.
This kind of movement by knowledge institutions—hundreds in Europe, the Met, and now more institutions in the United States—is being propelled now not only by Commons activists, Wikipedia editors, and key organizations like the Wiki Education Foundation, all of whom are seeking, in the spirit of this book, greater access to knowledge, but also by cultural and educational institutions themselves, institutions whose leaders and funders are seeking greater connectivity to a public that in turn might be seeking access to knowledge that they possess, control, or can contribute. The advantages for marketing culture and education via the inheritor to the Enlightenment Encyclopédie—which happens, again, to be one of the most popular web platforms in the world—are clear enough, but so too is the burgeoning impulse to install authoritative, curated assets for the world to see through the mushroom clouds and shrapnel of lies big and small, fakery, deceptions, misrepresentations, half-truths, and (at best) incomplete information all exploding now online. In many ways, knowledge institutions together—museums, libraries, archives, galleries, universities—represent a kind of Fifth Estate, beyond the four estates of the executive, the legislature, the judiciary, and the press. Each of these institutions can contribute to our self-knowledge as humans, and in essential ways; together they deserve a name.
Projects are under way now to much more systematically bring your favorite knowledge institution, anywhere in the world that it may be, out to Wikipedia and also to bring Wikipedia into the knowledge institution—training faculty and students to write with it, training teachers to both teach and strengthen it, and training the world to read it (and indeed any information) with a careful and critical eye. The statistics provided today by Wikipedia’s newest calculator, for the GLAM sector—galleries, libraries, archives, and museums—would have made Diderot proud. The British Museum images in Wikipedia are spread across pages that to date have 5.7 billion views. The Imperial War Museum: 7.2 billion. The Library of Congress: 36 billion. NASA: 30.1 billion. Media from Public Library of Science journals—1.9 billion.11 And the effort, true to its roots, also runs across subject matter as varied as that of the original Encyclopédie, but more modern now. One such effort, on improving Wikipedia’s content in and related to women’s studies, reached 346 million readers in about a year.12 Imagine if all the cultural and educational institutions in eighteenth-century France had had staff members who were writing for the Enlightenment encyclopedia—and buying subscriptions!—rather than just a tiny coterie of upper-class white men. And imagine if their understanding of liberty (and they, too, had some imagination!) was such that their contributions, like today’s to Wikipedia, would have been governed by Wikipedia’s four freedoms:
To ensure the graceful functioning of this ecosystem, works of authorship should be free, and by freedom we mean:
• the freedom to use the work and enjoy the benefits of using it[;]
• the freedom to study the work and to apply knowledge acquired from it[;]
• the freedom to make and redistribute copies, in whole or in part, of the information or expression[; and]
• the freedom to make changes and improvements, and to distribute derivative works[.]13
And imagine if all things published and posted then and since had had Wikipedia’s requirement not of simply being true, but of being verifiable.14 Imagine how much more speedily this whole process of global enlightenment would be moving! When the museum is built for our time and our work—the Extraordinary Museum of Knowledge, or the Ziggurat, in MIT vice president Sanjay Sarma’s words, of Facts—what culture of ours would we want featured and displayed?15 What would be the items and artifacts we’d be most proud of? What would we want our museum visitors to be able to do with them? Amazon shipping envelopes? Netflix DVDs? Or Wikipedia, which by then will have become the Bible, Koran, and Netflix of information? For the Commons is not only a destination and ultimately the living museum of all creativity, it is also a dynamic showcase for one of the newest forms of human innovation: Commons-based peer production, and the Commons-based peer production of knowledge in particular.16