Information Architecture in Wikipedia

This article was featured in the ASIS&T Bulletin special edition on Information Architecture, published in July 2015.

Information architecture (IA) is a mess on Wikipedia. We all know the coverage is inadequate, to say the least. In fact when Ted Nelson gave his keynote at the Information Architecture Summit this year, he mentioned looking up information architecture on Wikipedia to prepare for the talk, and the entire crowd groaned. What can we do as a community to correct this rather embarrassing problem?

Sadly, while we have been “defining the damn thing” within our own relatively closed community, Wikipedia editors, or Wikipedians, have been labeling many of our artifacts “stub” to “start” quality and of “low” to “mid-importance.” Years-old, unaddressed criticism of the Wikipedia article on information architecture includes the admonishments that we have no academic conferences or publications, that we use “peacock” terms and unverified claims, and that our thought leaders and pioneers are not notable enough for Wikipedia. Could this be true?

In 1996 Richard Saul Wurman described information architecture as “the creating of systemic, structural, and orderly principles to make something work – the thoughtful making of either artifact, or idea, or policy that informs because it is clear.” [[1], p.16] In fact, Wurman coined the phrase information architecture in his 1976 program for the American Institute of Architects annual meeting. If we take Wurman as a start, then IA is a field of practice that is approaching its 40th anniversary [[2]]. Unfortunately, this long history is not reflected in the world’s largest public encyclopedia.

Preserving and elevating the artifacts of so many years of information architecture practice is critical. It is our livelihood. The current Wikipedia article for information architecture requires findable, up-to-date information that highlights academic research and proceedings and references to notable IA practices and thought leadership.

I proposed to create WikiProject Information Architecture [[3]] as a way for our community to ensure that information architecture is described and cited accurately and consistently across all related Wikipedia articles. WikiProject Information Architecture intends to recognize and celebrate the foundations of this field by discovering and informing the public about the history, tools, practitioners and benefits of information architecture.

Information Architecture in Wikipedia

In February 2003, the first Wikipedia article about information architecture appeared. The entire entry contained a single sentence: “The architecture of information is important to the process of comprehension.” [[4]] A few months later a few more sentences were added, along with a simple outline of an information architecture process.

A few names of “notable information architects” appeared in July, including Jon Magliola and Jeremy Bernstein, both described as “New York-based information architects.” [[5]] These names were immediately joined by others: Joseph Abrams, Jesse James Garrett, Bill McGill, Peter Morville and Louis Rosenfeld. By August 2004 a reference to the word “web” was removed with a comment that “info arch is not necessarily web based.” [[6]] In September, all of the names were removed and in October 2004, the category “information architects” was added. At this time the article was a simple description with no citations or attributions.

The earliest versions of the information architecture article were a collection of statements and names without attribution or external references. Without these basic things, there is no way to test the validity of the information presented in the article. For example, the list of notable information architects could have been anyone. There were no links to articles about the listed people, so notability is unclear (despite what we in the industry may know about Jesse, Lou and Peter). Since the names were added anonymously, readers can’t verify whether people added their own names to plug their own businesses. Wikipedia has a fairly strict conflict of interest policy. That, combined with a lack of verification and notability, is probably why the names were removed. In fact, without reference to an independent source for any of the content, the article failed pretty much all of the basic content rules for a Wikipedia article. It should have been deleted.

Yet, the information architecture article continued to exist. After the initial article was created, more people started adding and deleting and shaping the article. Wikipedia maintains revision statistics on every article via an external tool called X!’s Tools. To date, the article has been edited 720 times at a rate of about one edit per week. Nearly half of the edits (48.1%) were anonymous. There are 169 people watching the page, which means they will get an alert anytime the page is changed; 334 pages link to the information architecture article and 167 link out. Many of the out links are topics embedded in the information science and semantic web templates that are included on the page. While not technically a part of the article body, the fact that information architecture is included on these templates lends some notability to the topic. Only 17 out links are external to Wikipedia [[7]]. Clearly there is a lot of information already on Wikipedia that can be used to help develop information architecture as a topic there.

WikiProject Information Architecture

Back in 2002, Andrew Hinton presented 25 Theses [[8]], stating that access to findable, usable information is a necessity. Wikipedia and data from Wikipedia are increasingly being used by people, search engines and content developers to discover, uncover and share information about all kinds of subjects. For many it has become an essential resource.

Wikipedia is more than just a global, multi-language, linked encyclopedia. It is a community of people committed to accurate and deep discovery of a world of topics. I see a common value between information architects and Wikipedians. Our field has historically had difficulty disambiguating itself from other practices, such as interaction design, systems architecture and building architecture. What better way to do this than by using our own expertise at disambiguation and structure?

To support this effort and ensure its ongoing success, I proposed a WikiProject for information architecture [[9]]. WikiProjects are groups of Wikipedia editors who work on a specific topic or set of tasks to improve Wikipedia. It is a way our community can help to improve the quality and discoverability of IA on Wikipedia.

In the end I have a rather grand goal: WikiProject Information Architecture will help present the information architecture field to the world more accurately and with more authority through discovery of IA thought leaders, activity, achievement and links to the history of knowledge organization and discovery. The official description is somewhat more modest: “WikiProject Information Architecture deals with the categorization, creation and improvement of Wikipedia articles under the purview of information architecture and related topics.” [[9]]

Are You Hooked on Editing?

Find a Wikipedia Meetup near you at https://en.wikipedia.org/wiki/Wikipedia:Meetup

 

The Information Architecture Editathon

At the 2015 Information Architecture Summit I presented a poster and an interactive session on Information Architecture and Wikipedia. On poster night, I presented my WikiProjects proposal alongside artifacts from the initial establishment of the Information Architecture Institute, including Hinton’s 25 Theses, as well as a list of over 800 published resources discussing information architecture topics.

Sunday, April 26, was the Wikipedia IA Editathon. I invited attendees to become Wikipedians to help discover and promote the people, events and history of information architecture. My plan for the editathon was to present a very brief introduction to editing rules, after which we would work on the information architecture article and related pages. I invited three Wikipedians from the Minnesota Wikipedia Group to help facilitate the session.

If you were unable to attend, I encourage you to join us in our continuing endeavor. Below is a brief introduction to Wikipedia editing and some content rules. I created a complete tutorial [[10]] and suggestions for next steps. The WikiProject Information Architecture page contains sample articles to work from and an online, editable list of IA practitioners, topics, methodologies and links to research, publications and educational programs. The lists are evolving thanks to a small army of new Wikipedians.

Anatomy of a Wikipedia Article

Wikipedia articles have several components, including the article itself, the talk page, the edit view and the history view. Every Wikipedia page edit is publicly visible, and every page edit you make is traceable to your user account or IP address.

Talk pages are Wikipedia’s version of peer review. Editors use the talk page of an article to discuss possible additions and problems with a page. Anyone can edit these pages and add to the discussion. Both article pages and talk pages have edit views. The View History page lets you take a peek at changes that have occurred during the lifetime of the article from the time it was created.

To experiment with page editing, you can use the shared sandbox [[11]] or your personal sandbox. Your personal sandbox is a good place to start a new article. The shared sandbox is more public and might bring you immediate feedback. You can also use the Article Wizard [[12]], which walks you through creating and submitting a new article, including tips and links to support forums and other Wikipedians who can help you.

Wikipedia’s editing language is called Wiki Markup. A cheat sheet [[13]] for commonly used markup is available to help you remember how to create headings, formatting and links. One of the most important pieces of markup is the reference tag, <ref>Citation</ref>. This tag allows references and footnotes in the article to appear wherever the template code {{reflist}} is placed. Any text that appears between double curly brackets indicates a template; templates can be Wikipedia-created content or content you create yourself.
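To make the relationship between the reference tag and the reference list concrete, here is a minimal sketch in Python that pulls the citations out of a piece of wiki markup and checks whether a {{reflist}} template is present. The function names and the sample text are illustrative assumptions, not part of any Wikipedia tool, and the pattern only covers the simple tag forms described above.

```python
import re

# Matches <ref>...</ref> and <ref name="x">...</ref>.
# Self-closing reuse tags such as <ref name="x" /> are deliberately skipped.
REF_PATTERN = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.DOTALL | re.IGNORECASE)


def extract_references(wikitext: str) -> list[str]:
    """Return the citation text found inside each <ref>...</ref> pair."""
    return [match.strip() for match in REF_PATTERN.findall(wikitext)]


def has_reflist(wikitext: str) -> bool:
    """Report whether the markup contains a {{reflist}} template."""
    return re.search(r"\{\{\s*reflist\b", wikitext, re.IGNORECASE) is not None


# Hypothetical article fragment, for illustration only.
SAMPLE = (
    "Wurman described information architecture in 1996."
    "<ref>Wurman, R. S. (1997). Information Architects. Graphis.</ref>\n"
    "== References ==\n"
    "{{reflist}}\n"
)

references = extract_references(SAMPLE)
reflist_present = has_reflist(SAMPLE)
```

A page with citations but no {{reflist}} would be a candidate for a quick fix, since its footnotes have no designated place to appear.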

While it is recommended that you create a user account, Wikipedia doesn’t require one to edit a page. Note however that Wikipedia will record your IP address if you do make an edit without logging in. User accounts can be anonymous and many active Wikipedians prefer to be anonymous, particularly those who work on relatively controversial subject areas. There is a long article on user names [[14]] to help you decide how to choose a name. You can edit your user page with a brief introduction about who you are and the kinds of articles you like to edit. If you would like to identify yourself as a member of WikiProject Information Architecture, you can add the following template to your user page:

{{User WikiProject Information Architecture}}

 

Wikipedia Core Content Policies

The five pillars of Wikipedia govern how it should be treated from the perspective of editors and readers. Additional consideration for Wikipedia’s core content rules will ensure that your article will be acceptable.

  • Wikipedia is an encyclopedia.
  • Wikipedia is written from a neutral point of view.
  • Wikipedia is free content that anyone can see, use and edit.
  • Editors should treat each other with respect and civility.
  • Wikipedia has no firm rules.

 

Neutral Point of View. Are you telling all sides of the story? Wikipedia articles must be written from a neutral point of view, which can be somewhat difficult if the topic is relatively new or subject to any kind of public controversy. In information architecture, we are very familiar with the phenomenon of “defining the damn thing,” which is to say that there are strong opinions about what information architecture encompasses and what it does not. As long as you represent any and all significant views fairly, proportionately and without bias, your article should be unchallenged.

Wikipedia offers templates that allow editors to tag articles with various warnings (Figure 1). These tags often have helpful information about how to improve the article, including links to content rules and sources.

Figure 1. Wikipedia editor’s warning messages (Notability Guidelines notice). Source: http://en.wikipedia.org/wiki/Wikipedia:Template_messages/Cleanup

 

Verifiability. Can you support the statements you make in your article? All quotations, and any material that is challenged or likely to be challenged, must be attributed to a reliable, published source. Even material that is unlikely to be challenged should be verifiable. In Wikipedia, verifiability means that people reading and editing the encyclopedia can check that information comes from a reliable source and is not something that you made up.

Reliable Source. Have you provided adequate citations and references? Wikipedia editors prefer academic and peer-reviewed publications, which tend to be considered by Wikipedians as the most reliable sources. Articles in established newspapers, magazines and journals are good resources, as are college-level textbooks and books published by known and respected publishing houses. Self-published books can be considered to have less reliability as they may be seen as self-promotion, although publishing models are changing rapidly and more self-published books, particularly in industry practices, are available today.

No Original Research. Is the content you are posting new, untested or original? Then chances are it is not going to pass the editor’s review. Wikipedia is an encyclopedia, not a position paper and not a research journal. It does not publish original thought. If the material you wish to add to Wikipedia, such as a unique methodology or tool, can be attributed to a reliable, published source, then you should be fine. You may be able to reference published sources that suggest a specific position about a topic, as long as the text of the article does not advance any new theory or idea.

Notability. Does the topic warrant its own article? Wikipedia avoids the indiscriminate inclusion of topics. To be included, a topic must satisfy Wikipedia’s notability requirement. Notability is tested based on whether a topic is verifiable or not. If reliable, third-party sources are available, the topic is said to be notable. Fame, importance or popularity may enhance the acceptability of a subject, but these attributes are not required, and in fact can cause a topic to be suspect.

Andrea Resmini, author, professor, past president of the IA Institute and chief editor of the Journal of Information Architecture, told me recently that the artifacts and theory of IA should drive how information is presented in Wikipedia. People who get mentioned, he said, “should be people who contributed to the field in a way that is clearly demonstrable, either with artifacts (practice), or ‘theory’ (research)…. What before who.” Resmini’s articles, “A Brief History of IA” [[2]] from the Journal of Information Architecture and “Architectures of Information” [[15]] from Études de communication, are useful starting points for enumerating these artifacts.

The following guidelines will help to ensure that your article complies with Wikipedia’s content rules:

  • Cite sources (reliable, verifiable and published sources).
  • Do not include the full text of primary sources (paraphrase or quote).
  • Identify reliable sources (established publishers).
  • No plagiarism (not just because it’s wrong…it will be deleted).
  • Do not create hoaxes (you’d be surprised).
  • No patent nonsense (Wikipedia is supposed to be informative).

 

Biographies of Living Persons. Are you writing about someone who is living today? Wikipedia has a specific set of guidelines for writing about living persons. Since information architecture is a relatively young field, many of its practitioners are very much alive today. Wikipedia’s guidelines [[16]] help to avoid any hint of embarrassment or potentially libelous claims. They reflect U.S. law, which is very strict. Any contentious material about a living, or recently deceased, person will likely be deleted immediately without discussion. If you find an error in an article about yourself, you are allowed to correct obvious errors, but more extensive corrections or changes should be requested by posting the details to the article talk page or by adding an {{admin help}} tag to your user talk page.

In addition to content policies, Wikipedia also maintains a list of conduct policies. Most are no-brainers. Be civil. Avoid personal attacks and harassment. Try to come to a consensus or show both sides of a controversial topic. Avoid edit wars and take advantage of Wikipedia’s dispute resolution tools if a disagreement breaks out. If you revert an edit more than three times in one day, your account can be blocked. Avoid sock puppetry, the practice of creating multiple accounts or recruiting friends to create additional accounts so as to create an illusion of support. Usernames can and should be neutral, but cannot be deleted. Should you want to create a new username, however, Wikipedia’s “clean start” policy allows you to do so without penalty. Finally, understand that no one owns any work on Wikipedia. All pages that you create or edit may be edited or even deleted by other editors (often mercilessly, though typically with cause).

A full list of Wikipedia guidelines and policies can be found at https://en.wikipedia.org/wiki/Wikipedia:List_of_policies.

Dispute Resolution

Suppose you put several hours of work into an article. You have it exactly as you want it. It’s perfect, but suddenly you find someone has deleted huge chunks of your effort or they have questioned your references. Don’t panic. First off, remember that Wikipedia is a collaborative effort. You don’t own the article. Anyone can edit it.

Rather than start an edit war where you revert the changes only to have someone add them back, use the Talk tab to start a discussion. Perhaps an editor’s grammar changes inadvertently changed the meaning of a statement. It is fair to express on the Talk page why you believe certain content should be included or removed. Certainly add references where requested. It is much nicer to request a citation than to simply remove material. If you can identify the user who made a change, you can also post a question to that person’s Talk page. Sometimes civil discussion simply breaks down. In that case, Wikipedia’s dispute resolution forums can help mediate a solution.

The Information Architecture of Wikipedia

In many ways, Wikipedia is fulfilling Tim Berners-Lee’s vision of a linked, open collection of human knowledge. One of the things it does that the web does not do by itself is allow for two-way linking via category tagging. We can place a tag that identifies information architecture content so that all pages that contain this tag appear in a single page called Category:Information Architecture, thus linking all content within that category to every other page tagged the same way. That’s pretty cool.
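Because a category page is simply the list of every page carrying the same tag, that list can also be requested programmatically. The following is a minimal sketch, assuming the standard MediaWiki API on English Wikipedia and its categorymembers list; it only builds the request URL and does not fetch anything. The category title used here is an assumption for illustration and should be checked against the actual category name.

```python
from urllib.parse import urlencode

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def category_members_url(category: str, limit: int = 50) -> str:
    """Build an API URL that lists the pages tagged with a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Assumed category title, used only as an example.
url = category_members_url("Information architecture")
```

Opening the resulting URL in a browser returns the tagged pages as JSON, which is the same set of links the category page displays.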

A second and very powerful way to fulfill Berners-Lee’s vision is through linked data capabilities of DBpedia, the database that leverages content of all Wikipedia pages, as well as various community developed WikiTools.

Wikidata [[17]] is a free knowledge base that anyone can edit. It acts as central storage for the structured data of its Wikimedia sister projects, including Wikipedia, Wikisource and others. Wikidata items link to external databases through identifier properties [[18]]. These identifier properties are similar to the authority control numbers of the Library of Congress and the VIAF (Virtual International Authority File) and allow users to specify concept hierarchies and relationships. Wikidata also supports FRBR (Functional Requirements for Bibliographic Records) [[19]], a 1998 recommendation of the International Federation of Library Associations and Institutions (IFLA) to restructure catalog databases to reflect the conceptual structure of information resources and other ontologies.
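As a rough illustration of how identifier properties connect an item to outside authority files, the sketch below maps two Wikidata properties, P214 (VIAF) and P244 (Library of Congress authority ID), to external URLs. The URL patterns are assumptions to verify against each property’s documented formatter URL, and the identifier values are placeholders rather than real records.

```python
# Property ID -> (label, URL pattern). The patterns are assumptions to verify
# against the formatter URL documented on each Wikidata property page.
IDENTIFIER_PROPERTIES = {
    "P214": ("VIAF ID", "https://viaf.org/viaf/{id}"),
    "P244": ("Library of Congress authority ID", "https://id.loc.gov/authorities/{id}"),
}


def external_links(claims: dict[str, str]) -> dict[str, str]:
    """Turn an item's identifier claims into links to the external databases."""
    links = {}
    for prop, value in claims.items():
        if prop in IDENTIFIER_PROPERTIES:
            label, pattern = IDENTIFIER_PROPERTIES[prop]
            links[label] = pattern.format(id=value)
    return links


# Hypothetical item with placeholder identifier values (not real records).
# P31 ("instance of") is not an identifier property, so it is ignored.
example_claims = {"P214": "0000000", "P244": "n00000000", "P31": "Q5"}
links = external_links(example_claims)
```

The point of the design is that the item stores only the bare identifier; the link to the external database is derived from the property, so one correction to a pattern fixes every item that uses it.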

DBpedia [[20]] is another tool that allows you to extract structured content from Wikipedia and related content. DBpedia is the major hub, in fact the largest repository, of Linked Open Data. It contains information derived from Wikipedia and is connected with other linked datasets by over 50 million RDF links. The following diagram (Figure 2) shows the status of the DBpedia data cloud as of March 2014. This is a classic view of DBpedia, with different colors representing different sectors of knowledge. It is rapidly becoming an insufficient way to represent DBpedia effectively, and 3D models are available that allow you to navigate linkages between concepts.

Figure 2. Linked Open Data cloud diagram, 2014. Source: http://lod-cloud.net/

In DBpedia semantics, the database contains 4.58 million “things” and 583 million “facts.” Facts are statements that can be made by linking one Wikipedia object to another or to objects in an external database. DBpedia is most widely used for linking subject-predicate-object triples.

Important Links

Slide Deck/Tutorial:

https://drive.google.com/file/d/0By4jVfIRGeOlX25kMXZ1dTdVZlk/view?usp=sharing

WikiProject Information Architecture:

https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Information_Architecture

Guide to Contributing to Wikipedia:

https://en.wikipedia.org/wiki/Wikipedia:Contributing_to_Wikipedia

Wiki Markup Cheat Sheet:

http://en.wikipedia.org/wiki/File:Wiki_markup_cheatsheet_EN.pdf

Article Wizard:

https://en.wikipedia.org/wiki/Wikipedia:Article_wizard

Help on Creating Biographies of Living Persons:

https://en.wikipedia.org/wiki/Wikipedia:Biographies_of_living_persons

IA Resources: Books:

https://docs.google.com/a/owasp.org/document/d/1BbzaObS6gLe1VhUknLqRcU5DgOrBzakmbxOfN-8Yyp0/edit?usp=sharing

IA Resources: Published Articles:

https://docs.google.com/document/d/1YZMpHnH7mWtQ52qnjzBAZ-O7jy9nDRWJGPbNT4S7zCE/edit?usp=docslist_api

 

Each thing in the DBpedia dataset is identified by a URI (uniform resource identifier), such as http://dbpedia.org/resource/Name. In this instance, “Name” is derived from the URL of the source Wikipedia article: http://en.wikipedia.org/wiki/Name.

Examples:

http://dbpedia.org/resource/Billie_Holiday

http://en.wikipedia.org/wiki/Billie_Holiday
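The mapping between the two addresses is mechanical, as this small Python sketch suggests. It assumes an English Wikipedia article URL of the form shown above; the function name is illustrative, and other language editions are not handled.

```python
from urllib.parse import urlparse

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
WIKI_PATH_PREFIX = "/wiki/"


def wikipedia_to_dbpedia(url: str) -> str:
    """Derive the DBpedia resource URI from an English Wikipedia article URL."""
    parsed = urlparse(url)
    if parsed.netloc != "en.wikipedia.org" or not parsed.path.startswith(WIKI_PATH_PREFIX):
        raise ValueError(f"not an English Wikipedia article URL: {url}")
    # The article name is whatever follows /wiki/ in the path.
    return DBPEDIA_RESOURCE + parsed.path[len(WIKI_PATH_PREFIX):]


print(wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Billie_Holiday"))
# prints http://dbpedia.org/resource/Billie_Holiday
```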

 

“Facts” can be expressed in RDF triples, as in the following examples:

A “has name” B

Subject → Predicate → Object

S = http://dbpedia.org/resource/Richard_Saul_Wurman

P = http://xmlns.com/foaf/0.1/name

O = “Richard Saul Wurman”

S = http://dbpedia.org/resource/Richard_Saul_Wurman

P = http://dbpedia.org/ontology/occupation

O = http://dbpedia.org/ontology/Architect

 

One can then turn a set of triples such as these into an ontology of Facts and Things about a particular subject.
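To show what such a set of triples looks like in practice, here is a minimal sketch that holds the two facts about Richard Saul Wurman as Python tuples and writes them out in N-Triples syntax. It uses only the standard library; the `dbpedia-owl` prefix is written out as http://dbpedia.org/ontology/, and the tuple layout and function name are illustrative assumptions rather than a DBpedia interface.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
DBO = "http://dbpedia.org/ontology/"  # expansion of the dbpedia-owl prefix
WURMAN = "http://dbpedia.org/resource/Richard_Saul_Wurman"

# Each fact is (subject, predicate, object, object_is_literal).
triples = [
    (WURMAN, FOAF_NAME, "Richard Saul Wurman", True),
    (WURMAN, DBO + "occupation", DBO + "Architect", False),
]


def to_ntriples(facts) -> str:
    """Serialize (s, p, o, is_literal) tuples as N-Triples, one fact per line."""
    lines = []
    for subject, predicate, obj, is_literal in facts:
        if is_literal:
            # Literals are quoted; backslashes and quotes must be escaped.
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        else:
            # Resources are written as URIs in angle brackets.
            rendered = f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


ntriples = to_ntriples(triples)
```

Each output line is one fact: a subject URI, a predicate URI and either a quoted literal or another URI, followed by a period.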

In Wikipedia, the core source of DBpedia data is the infobox, which is a summary of article metadata. In the following example taken from the Wikipedia article on Richard Saul Wurman (Figure 3), the infobox template is rendered on the page as a box of information, but the data is extractable via DBpedia.

Figure 3. Wikipedia entry for Richard Saul Wurman.

 

The Infobox template contains the following code which can be extracted by DBpedia:

{{Infobox scientist
|name = Richard Saul Wurman
|image = Richard Saul Wurman2.jpg
|birth_date = {{Birth date and age|1935|03|26|mf=y}}
|birth_place = [[Philadelphia, Pennsylvania|Philadelphia, Penn.]]
|nationality = American
|religion = Jewish
|field = Architecture, information architecture, design
|work_institution = [http://www.192021.org 19.20.21]; [http://www.tedmed.com/ TEDMED]; [http://thewwwconference.com WWW Conference]; [http://www.wwwconference.com WWW Conference]}}
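To give a sense of why infobox data is so easy to extract, here is a minimal sketch that reads a line-oriented infobox into a Python dictionary. It is not how DBpedia itself performs extraction; the function name and the shortened sample are assumptions for illustration, and the parser expects each field on its own line beginning with a pipe character.

```python
def parse_infobox(wikitext: str) -> dict[str, str]:
    """Read '|field = value' lines from an infobox into a dictionary."""
    fields = {}
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skips the opening {{Infobox ...}} line and the closing }}
        # Split on the first "=" only, so nested templates such as
        # {{Birth date and age|...|mf=y}} stay intact in the value.
        key, separator, value = line[1:].partition("=")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


# Shortened copy of the infobox above, with the closing braces on their own line.
SAMPLE = """{{Infobox scientist
|name = Richard Saul Wurman
|birth_date ={{Birth date and age|1935|03|26|mf=y}}
|birth_place = [[Philadelphia, Pennsylvania|Philadelphia, Penn.]]
|nationality = American
|field = Architecture, information architecture, design
}}"""

infobox = parse_infobox(SAMPLE)
```

Each key-value pair in the resulting dictionary corresponds to a candidate “fact” about the article’s subject, which is what makes the infobox such a convenient source of structured data.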

 

There are many additional tools to explore in Wikidata:Tools [[21]]. This page includes tools, gadgets and code snippets to help you query, view and manipulate data from the Wikidata site. Some popular tools include WikiData Query, AutoList 2 and Ask WikiData. All of the tools are developed by members of the community and are available to use for free.

So What Can You Do Now?

WikiProjects survive only through the participation of committed editors. After a brief blog and social media campaign, 17 prospective Wikipedians signed on to show their support. Will you join us?

  • Find an article in need of improvement and add it to WikiProject Information Architecture or pick one from the project list.
  • Tag an article with the WikiProject Information Architecture category tag by entering the following code on its talk page: [[Category:WikiProject_Information_Architecture]].
  • Research citations on a person or topic that interests you.
  • Suggest citations or content on the talk page of an article.
  • Fix typos and grammatical errors on existing articles.
  • If you are feeling bold, create a new article. Try out your Sandbox or the Article Wizard.
  • Always remember to ask for help if you need it!

Resources 

1 Wurman, R. S. (1997). Information Architects. Zurich, Switzerland: Graphis.

2 Resmini, A. (Fall 2011). A brief history of IA. Journal of Information Architecture, 3(2). Retrieved from http://journalofia.org/volume3/issue2/03-resmini/

3 Wikipedia:WikiProject Council/Proposals/Information Architecture. (April 13, 2015). Wikipedia. Web. Retrieved from http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Council/Proposals/Information_Architecture

4 Information architecture: revision history. (February 21, 2003). Wikipedia. Web. Retrieved from http://en.wikipedia.org/w/index.php?title=Information_architecture&oldid=961689

5 Information architecture: revision history. (July 17, 2003). Wikipedia. Web. Retrieved from http://en.wikipedia.org/w/index.php?title=Information_architecture&oldid=1158837

6 Information architecture: revision history. (August 8, 2004). Wikipedia. Web. Retrieved from http://en.wikipedia.org/w/index.php?title=Information_architecture&oldid=5839445

7 Information architecture. X!’s Tools. Retrieved from http://tools.wmflabs.org/xtools-articleinfo/index.php?article=Information_architecture&lang=en&wiki=wikipedia

8 Hinton, A. (December 5, 2002). 25 Theses. Asilomar Institute for Information Architecture. Retrieved from https://web.archive.org/web/20021205120152/http://www.aifia.org/pg/25_theses.php

9 WikiProject Information Architecture. Wikipedia. Web. Retrieved from https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Information_Architecture

10 Whysel, N. Y. (n.d.). WikiProject Information Architecture tutorial [slide presentation]. Retrieved from https://drive.google.com/file/d/0By4jVfIRGeOlX25kMXZ1dTdVZlk/view?usp=sharing

11 Sandbox. (n.d.). Wikipedia. Web. Retrieved from https://en.wikipedia.org/wiki/Wikipedia:Sandbox

12 Article Wizard. (n.d.). Wikipedia. Web. Retrieved from https://en.wikipedia.org/wiki/Wikipedia:Article_wizard

13 Cheatsheet. (n.d.). Wikipedia. Web. Retrieved from http://en.wikipedia.org/wiki/Help:Cheatsheet

14 Username Policy. (n.d.). Wikipedia. Web. Retrieved from https://en.wikipedia.org/wiki/Wikipedia:Username_policy

15 Resmini, A. (2013). Architectures of information. Études de communication, 41, 31–56. Retrieved from http://edc.revues.org/5380?lang=en

16 Biographies of Living Persons. (n.d.). Wikipedia. Web. Retrieved from http://en.m.wikipedia.org/wiki/Wikipedia:Biographies_of_living_persons

17 Wikidata. (n.d.). Retrieved from http://www.wikidata.org

18 Wikidata:Database reports/List of properties/Top100. Retrieved from http://www.wikidata.org/wiki/Wikidata:Database_reports/List_of_properties/Top100

19 OCLC Research. (n.d.). OCLC’s research activities and IFLA’s Functional Requirements for Bibliographic Records. Retrieved from http://www.oclc.org/research/activities/frbr.html

20 DBpedia. (n.d.). Retrieved from http://www.dbpedia.org

21 WikiData:Tools. (n.d.). Retrieved from http://www.wikidata.org/wiki/Wikidata:Tools