This is an extended footnote to the definition of information presented in Mowshowitz, A., On the market value of information commodities: I. The nature of information and information commodities, Journal of the American Society for Information Science, 43, 1992, pp. 225-232. Information was defined in that article as “the ability of a goal-oriented system to decide or control,” where “decide” and “control” both reduce to the verb “choose.” This definition was motivated in part by an analogy between information and energy. Energy, understood informally as the “ability to do work”, is clearly a property of a system rather than of any material substance (or fuel) from which energy may be extracted. Viewing information, like energy, as the ability of a system to do something, constitutes a radical departure from the usual definitions in which information is equated with organized collections of symbols.
The current note probes the analogy between energy and information, with a view to justifying both the definition cited above and the theory of the economic value of information erected upon it.
When coal is burned in a furnace or gasoline in an engine, energy is released. The energy may be viewed as latent in the coal or gasoline, but, clearly, these substances are fuels or sources of energy, not energy itself. We argue in this note that a similar distinction holds for information and its sources (e.g., manuscripts, computerized files, videotapes, and audio recordings).
The relationship between fuels and fuel-burning engines suggests a revealing parallel in the domain of information. When speech is processed by a human being or text is processed by a computer, information is ‘released’. Speech and text – embodied in their respective media – are analogous to fuels, i.e., they are sources, which when properly processed will give rise to information. Thus, like energy, information may be latent in sources such as text, but the source is not the information.
Information is typically said to be “contained” in a book or report, or “gotten out of” a lecture or TV program, so the idea that information is somehow latent in a source is not inconsistent with ordinary usage. A common thread in the conventional view is that information is hierarchical in character. According to this view, some information is more “organized,” contains “deeper truths,” or is more relevant or precise than other information. A typical hierarchy based on ideas of this sort is formed by the four elements data, information, knowledge, and wisdom. But these distinctions are characteristic of the relationship between the user and the source of information. Consider a book. Is the information to be obtained simply a property of the book? Some readers gain “information”, others “knowledge”, yet others “wisdom” from what they read.
This observation is obvious to anyone who has ever participated in a discussion about a book. Clearly, a source of information may give rise to very different experiences. Readers have much in common with fuel burning engines. Just as engines have different efficiency ratings and extract different amounts of energy from the same amount of fuel, different readers obtain varying amounts of usable information from the same book.
Information, like energy, is a property of a system. In some sense it comes into existence in the relationship between a “source” and a “processor”. Neither energy nor information can be apprehended directly. Both are manifest in changes in the state of a system. Energy may be measured, for example, by a change in the temperature of a gas in a vessel; information may be measured by a change in the probability distribution of a random variable whose values are decision strategies.
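The measurement suggested above can be illustrated with a small sketch. The text does not say which function of the two distributions should serve as the measure, so the use of relative entropy (Kullback-Leibler divergence, in bits) below is an assumption made for illustration only, as are the function names and the example distributions over four hypothetical decision strategies.

```python
import math


def entropy_bits(dist):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def information_gain_bits(prior, posterior):
    """Relative entropy D(posterior || prior), in bits.

    Assumption: the 'change in the probability distribution' is scored
    by KL divergence; the source text does not prescribe this choice.
    """
    if len(prior) != len(posterior):
        raise ValueError("distributions must range over the same strategies")
    total = 0.0
    for q, p in zip(prior, posterior):
        if p > 0:
            if q == 0:
                raise ValueError("posterior puts mass where the prior has none")
            total += p * math.log2(p / q)
    return total


# Hypothetical processor choosing among four decision strategies.
# Before processing the source, it has no preference among them.
prior = [0.25, 0.25, 0.25, 0.25]
# After processing, it is certain which strategy to choose.
posterior = [1.0, 0.0, 0.0, 0.0]

gain = information_gain_bits(prior, posterior)
print(f"change in state of the processor: {gain} bits")
```

In this example the processor moves from complete indecision among four strategies to a definite choice, a change of 2 bits; a source that leaves the distribution unchanged yields 0 bits for that processor, whatever symbols it contains.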
A universal measure of information comparable to that of energy may seem unachievable because of the apparent differences in what can be extracted from various sources. The superabundance and variety of material that we normally conceive to contain information is so great as to defy any sort of reduction to a common denominator. The analogy with energy seems to break down. However, one tends to overlook the role of ‘pre-processing’ and classification that makes a universal measure of energy feasible and useful. The energy to be extracted from a given mass of fuel is not determined for just any old substance. Rather, the measurement of energy is applied to a given mass of coal of a particular kind, such as anthracite coal consisting of pieces of uniform size and shape. To get this particular substance requires a number of operations on material extracted from the earth. The same can be said for fuel derived from petroleum. If comparable operations were applied to what is commonly called a source of information, it might be possible to use a universal measure and unit corresponding to the joule. Such a measure would make it possible to price information sources without reference to their intended uses, just as fuels are priced now. Gas stations charge the same amount per gallon of gasoline no matter what the customer does with it. So it could be for a unit of source information. Many books, magazines and reports are priced in this way, but non-traditional sources of information such as pages of online text and query results await the development of a suitable measure to allow for uniform pricing.
A further objection to the analogy might be raised in connection with the hierarchy “data, information, knowledge and wisdom.” Although these hierarchical distinctions are meaningful, they apply to the subjective experience of information (i.e., its meaning and value), not its objective measurement. Aspects of the meaning of information endow it with value: features such as timeliness and verisimilitude are characteristics of meaning that determine the value of information to a user. The value of information corresponds to the utility of energy. A given amount of fuel can be used to obtain energy for many different purposes, such as heating a house or powering a vehicle, and each use has its own particular economic (or subjective) value.
The analogy between information and energy holds for systems and processors, as well as for fuels and sources. Processors of information-sources, like fuel burning engines, are altered (for varying periods of time) in the course of processing. The respective system states are altered: for a fuel burning engine, the change occurs in energy level; for an information-source processor, the modification is organizational. The results of processing, however, reveal important differences between fuels and information sources. On the one hand, information sources such as books, computer programs, and databases can be read or processed without changing them – they can be reused by the same or different processors; on the other hand, oxidation chemically alters a fuel, after which it cannot readily be re-constituted in its original form. One might distinguish between hardware and software in the makeup of a processor. Hardware appears to be analogous to the usual conception of a fuel burning engine. Such an engine could be seen as having hard wired programs, corresponding to the software installed on a computer, controlling its behavior. There may also be differences having to do with degree of ‘intelligence’ of a processor. It may in some cases be useful to distinguish between processors and users of information. The human processor is usually the user, whereas the two may be different in the case of computer processors. For simplicity of argument we will assume the processor and user to be the same, subsuming the distinction in the multiple stages of processing.
Clearly there are differences in the respective etiologies of information and energy. On the one hand, fuels are transformed as energy is released, while sources of information remain unchanged as information is generated. On the other hand, fuel burning engines typically remain unchanged after the production of energy (i.e., in the absence of structural changes caused by damage or wear and tear, such an engine will behave the same way in future), while information processors may be transformed when information is created. For example, a program may be ‘structurally transformed’ in the sense that its behavior toward future inputs of the same kind may be different, e.g., adaptive systems such as game players that alter their playing strategy on the basis of the results of moves in previous games. Nevertheless, such changes do not alter the amount of information released in source-processing transactions. Computer controlled fuel burning engines introduce yet greater subtleties. Subsequent behavior of an engine with memory could be altered by burning a fuel under certain conditions.
These differences between information and energy suggest that the source-processor pair is central to the concept of information, just as the pair fuel-engine is to energy.
As noted before, a book (source of information) is normally unchanged by the human reader (information processor) in the course of reading, but the reader may be transformed (i.e., undergo a change in program) in perusing the text. Such transformations may vary greatly, but are detectable through changes in behavior. The same holds for computers. Input data remain unchanged in processing, but the computer is altered (however short-lived the change may be). Again, alteration is manifest in changes in behavior. Note that it is possible for the original input data to be altered by the computer, but it is reasonable to assume this does not occur (or at least that the data is preserved somewhere), just as we discount the possibility of the text of a book being altered by the reader.
The potential change in the processor’s behavior is the essence of information. We will take this up again in the section on processors.
by Abbe Mowshowitz
NEXT INSTALLMENT: The nature of information sources