As We May Learn…

In 1945, Vannevar Bush wrote a paper, published in The Atlantic Monthly, entitled “As We May Think”. In it, he described an imaginary machine of his own conception, which he called the Memex.

In essence, the Memex was an elaborate microfiche reader. The idea was that many different microfiche could be put together in a kind of portable catalogue, and retrieved and viewed mechanically on the Memex. In abstract terms, it was like a library of books or documents which could be accessed in a single desktop-centred apparatus.

The genius of the Memex came with the idea that in addition to simply viewing microfiche, you should be able to annotate and interlink them. For instance, if you were a biologist studying certain types of tree leaves, you should be able to use the Memex to write your notes on one microfiche, and link that page to other pages talking about photosynthesis — from your notes, or an encyclopaedia, perhaps. Since this was a WWII-era contraption, Bush described the machine in terms of vacuum tubes, projection screens, photocells and dry photography — all technologies dating from a very different computational epoch.

Bush’s essay went on to explore the space around the machine as well, with predictions that academic exchanges would be reinvigorated by the Memex, as trading notes and documents would be a simple matter of mailing microfiche catalogues back and forth. Additionally, he predicted (rightly so) that interlinking data would give rise to new multidisciplinary views in academic discourse.

The Memex never got built, not even in prototype form. As head of the US Office of Scientific Research and Development (OSRD) during WWII, Bush is estimated to have had two thirds of American physicists working under his direction. So clearly, he was not short of resources.

The key point of the essay was not the machine itself, however, but a new way of looking at information as sets of interlinked concepts. In content, Bush was suggesting what I would call an equivalent of the modern Wikipedia, and this roughly half a century before the latter’s inception. In form, he had outlined the basis for hypertext, a concept that only re-entered the academic consciousness 20 years later, in 1965, when Ted Nelson coined the term.

The title of the essay, As We May Think, referenced the notion that our tools for organizing information should be designed to work the way we think.

I think an extension to this line of thought is due. Now that we’ve brought Bush’s idea to fruition, what comes next? We need to start thinking about adapting computers to learn as we may learn.

Key to the human process of learning is the use of analogy. In fact, Bush believed the same, saying the brain “operates by association”. Children build up a conception of the world by observation and deduction, and by constructing a repertoire of increasingly sophisticated concepts which are used to explain further observations and deductions. Wherever we’re able to compare two concepts and understand their similarities and differences, we learn. A particularly illustrative example of this kind of analogical reasoning is seen here:

For some reason, when I was very little, I thought that TV shows “stayed inside the TV” until it was turned on. When we watched TV, the shows were “leaking out.” When I started going to school, I would come home in the afternoon, try to watch “Sesame Street” and find that it wasn’t on. I then asked Mom not to watch TV while I was at school, as it “wasted” “Sesame Street.”

From I Used to Believe – The Childhood Beliefs Site.

Here, the writer, as a child, made the erroneous analogy of TV being like a carton of milk, or something similar he or she might have seen. The child’s conceptual model evolved upon learning how television broadcasts were different. And the impact of learning here was significant and rewarding enough for the author to recall the incident many years later.

To enable this kind of learning to take place inside a computer, the notion of hypertext won’t suffice. Text, in any form (hyperlinked, digital or even handwritten), is based on language, which is too nuanced and intricate to be convenient for machine learning; I don’t think we’re quite there yet with natural language technology. Basically, it would be too complicated to build a machine that understands, say, the Wikipedia entry on the Metal umlaut and learns enough from the text to answer questions posed by an interviewer. But more importantly, we don’t think in language.

A better approach might be to reduce information down to simple concepts; learning can then occur by creating a conceptual map that explains how concepts are interrelated, and more importantly, how they are alike or dissimilar. By giving a computer a conceptual map of information, and the ability to build analogies, real human-like learning can take place inside a machine.

If television broadcasting and milk carton existed as two concepts that a computer knew about, it could make the same erroneous deduction about the nature of television broadcasting that the storyteller above made: that Sesame Street “leaks out” of the TV when it’s on. If and when the computer made an observation that called the conceptual link between television broadcasting and milk carton into question, it could revise its own conceptual map accordingly and learn something new.
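To make that concrete, here is a minimal sketch, in Python, of what such a conceptual map and its revision might look like. Everything in it, from the ConceptMap class to the relation names, is my own illustrative invention rather than a description of any existing system:

```python
from collections import defaultdict

# A toy conceptual map: concepts are plain strings, links are typed edges.
# Every class name, relation and fact below is an illustrative assumption.
class ConceptMap:
    def __init__(self):
        # concept -> relation -> set of target concepts
        self.links = defaultdict(lambda: defaultdict(set))

    def link(self, a, relation, b):
        self.links[a][relation].add(b)

    def unlink(self, a, relation, b):
        self.links[a][relation].discard(b)
        if not self.links[a][relation]:
            del self.links[a][relation]

    def analogous(self, a, b):
        """Crude analogy test: do two concepts share any relation types?"""
        return bool(set(self.links[a]) & set(self.links[b]))


world = ConceptMap()

# What the child already knows about milk cartons.
world.link("milk carton", "holds", "a finite supply")
world.link("milk carton", "is depleted by", "pouring")

# The erroneous analogy: the television is modelled like the carton.
world.link("television", "holds", "a finite supply")
world.link("television", "is depleted by", "watching")
print(world.analogous("television", "milk carton"))  # True

# A contradicting observation arrives: the show comes back regardless.
# Revise the map, and something new has been learned.
world.unlink("television", "holds", "a finite supply")
world.unlink("television", "is depleted by", "watching")
world.link("television", "receives", "a broadcast signal")
print(world.analogous("television", "milk carton"))  # False
```

The point is not the code itself but the shape of the operation: a bad analogy is cheap to make, and cheap to undo once a contradicting observation arrives.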

The appeal of such a system, to me, is the thought that unlike Wikipedia, all data has a place. Wikipedia’s editor community has an ongoing debate between two opposing camps: the inclusionists, who advocate putting as much detail into Wikipedia entries as possible so that the whole is as complete as can be, and the deletionists, who realize that Wikipedia needs to be presentable and accessible, and that there is such a thing as too much. Does an article of little general interest, like “List of Canadian female tennis players between 1969 and 1975”, really belong in an encyclopedia, or not?

With a system based on hyperdata (interlinked atomic concepts) instead of hypertext (interlinked human-readable text), every single little bit of information, no matter how mundane, can be part of the system, because now, a machine can selectively show you as much or as little detail as you want. The list of tennis players suggested above could be compiled dynamically by the computer based on what it knows, with no intervention needed by a human editor.
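As a rough sketch of what “compiled dynamically” could mean in practice, suppose the machine holds a handful of atomic facts about people; the list article then becomes a query rather than a page someone has to write. All of the names, fields and dates below are invented for illustration:

```python
# Hypothetical atomic facts about people; every name and date here is invented.
facts = [
    {"name": "Player A", "nationality": "Canadian", "sex": "female",
     "sport": "tennis", "active": (1968, 1974)},
    {"name": "Player B", "nationality": "Canadian", "sex": "female",
     "sport": "tennis", "active": (1976, 1981)},
    {"name": "Player C", "nationality": "French", "sex": "female",
     "sport": "tennis", "active": (1970, 1979)},
]

def compile_list(facts, nationality, sex, sport, start, end):
    """Build a 'List of ...' article on demand instead of storing it by hand."""
    return [
        f["name"] for f in facts
        if f["nationality"] == nationality
        and f["sex"] == sex
        and f["sport"] == sport
        # keep anyone whose career overlaps the requested period
        and f["active"][0] <= end and f["active"][1] >= start
    ]

# "List of Canadian female tennis players between 1969 and 1975"
print(compile_list(facts, "Canadian", "female", "tennis", 1969, 1975))
# -> ['Player A']
```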

I made reference to this in Keeping Memory. The shopping lists and receipts and mixtapes that you accumulated, and that were relevant to you, can fit nicely into their own corner of the hyperdata network, and no one else has to see them if they don’t want to. A receipt, for instance, would be linked to the concept of you, the concept of that In-N-Out Burger on Sunset Boulevard you went to, the concept of November 12th 2009, the concept of a #1 combo, the concept of the Visa card you used to pay for your burger, etc… Each of these concepts is further linked to other concepts: California, Hollywood, the year 2009, the billing address on your Visa card, the batch of mustard used in your order, and so on. Here, finally, we have information embedded in a true context, making the data relevant.

The deductive power of such a system is quite awesome in itself. Suppose the batch of mustard used to make my burger was tainted. By following the web of concepts interlinked in meaning and time, you could find everyone who ate some of that tainted mustard, in any restaurant, and alert them so they can seek a doctor’s counsel. No matter how strong the inclusionist argument is, I doubt you’ll ever find a Wikipedia article entitled “List of restaurant patrons who ate products containing mustard batch #878-001”.
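Here is a sketch of that deduction as a simple graph walk, in the same spirit as the receipt example above. Every identifier in it, from the batch number to the receipts, is fictional, and a real hyperdata network would of course constrain the walk by relation type and date rather than follow every edge:

```python
from collections import defaultdict, deque

# Undirected edges between concepts, mirroring the receipt example above.
# Every identifier (batch number, orders, receipts, people) is fictional.
edges = [
    ("mustard batch #878-001", "order: #1 combo, Nov 12 2009"),
    ("order: #1 combo, Nov 12 2009", "receipt #4411"),
    ("receipt #4411", "person: you"),
    ("mustard batch #878-001", "order: burger, Nov 13 2009"),
    ("order: burger, Nov 13 2009", "receipt #9001"),
    ("receipt #9001", "person: someone else"),
]

graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

def reachable(start):
    """Breadth-first walk over the web of concepts linked to `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# Everyone whose meal traces back to the tainted batch.
patrons = {n for n in reachable("mustard batch #878-001") if n.startswith("person:")}
print(patrons)  # {'person: you', 'person: someone else'}
```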

Beyond that, since concepts are separated from language, it would be possible to get the same information in any language. In fact, just as you could use specific words to represent concepts (“C-A-T” represents the concept of a cat in English), you could use any system of representation you can think of. The concept of cat could be represented by a shape and colour, a sound, a song, a smell, a note in a symphony, a move in a chess game, a unique snowflake design; anything, really… These are all just abstract ways of describing concepts, but the concepts themselves do not change.

The Separation of Concepts and Language — The same concept map can be represented in any language, or in completely abstract terms.
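In code, that separation amounts to giving each concept an identity of its own and treating words, sounds or chess moves as interchangeable labels attached to it. A small sketch, with invented identifiers:

```python
# A concept is identified by an opaque id; its representations are just labels
# attached to it. The id "concept:cat" and every label below are invented.
representations = {
    "concept:cat": {
        "en": "cat",
        "fr": "chat",
        "de": "Katze",
        "audio": "meow.ogg",   # a sound could stand for the concept too
        "chess": "Nf3",        # or, why not, a move in a chess game
    }
}

def describe(concept_id, medium):
    """Return one representation of a concept; the concept itself never changes."""
    return representations[concept_id].get(medium, concept_id)

print(describe("concept:cat", "fr"))     # chat
print(describe("concept:cat", "audio"))  # meow.ogg
```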

As the machine grows its increasingly sophisticated conception of knowledge, it would further its own ability to categorize and conceptualize new information. By having a wealth of interlinked data already at its disposal, and the ability to observe how concepts are alike or dissimilar, the computer engages in a process of learning.
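One crude way a machine could observe how concepts are alike or dissimilar is to compare their neighbourhoods in the conceptual map, for instance with a simple overlap score. A sketch, again with invented data:

```python
# Each concept is described by the set of concepts it links to (invented data).
links = {
    "cat": {"animal", "fur", "pet", "meow"},
    "dog": {"animal", "fur", "pet", "bark"},
    "car": {"machine", "wheels", "fuel"},
}

def likeness(a, b):
    """Jaccard overlap of two concepts' neighbourhoods: 1.0 alike, 0.0 unrelated."""
    na, nb = links[a], links[b]
    return len(na & nb) / len(na | nb)

print(likeness("cat", "dog"))  # 0.6 -> quite alike
print(likeness("cat", "car"))  # 0.0 -> nothing in common
```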

Clearly, hyperdata is the way forward, as it allows data to live within a context, and allows a separation between concepts and their representations. Hyperdata also provides the foundations for a machine learning system which uses simple analogical reasoning to understand and reinforce its own conceptual map of the world.