
archivemati.ca


Information as an Object

February 5th, 2007 · 2 Comments · PhD Research, Terms & Definitions

Last week I wrote a post that looked in more detail at the concept of information as a way to help define archival materials. I am going through this process of definitions and explanations to help establish the scope and context for my PhD research on archives access systems. These software-intensive systems provide online users with contextual information about collections of archival materials, allow them to search and browse for archival materials, learn more about their context of creation, management and use, identify their storage location, and request their retrieval.

The key characteristic of archival materials is that they preserve information for future use. As discussed in the previous post, information is a set of related signals, symbols or patterns that communicate a message which is received with the requisite contextual knowledge to decode and understand it.

In order for archival materials to preserve information for future use, the message that communicates the information must be recorded so that at some point in the future it may be retrieved and re-communicated or re-experienced. This requires that the message transmission is captured and converted into an object that can be carried forward through space and time. This brings us to the concept of an information object. An information object is an entity that contains the content of a message and has the required structure and context to allow that message to be decoded and understood.
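As a minimal sketch of this definition, an information object can be modelled as a record with three parts: content, structure and context. The class, field and method names below (`InformationObject`, `is_decodable`) and the sample values are illustrative assumptions only; they are not taken from any standard or from an existing archives access system.

```python
from dataclasses import dataclass, field


@dataclass
class InformationObject:
    """Hypothetical model: a recorded message plus what is needed to decode it."""

    content: str                                    # the message itself
    structure: dict = field(default_factory=dict)   # e.g. medium, encoding, form
    context: dict = field(default_factory=dict)     # e.g. creator, language, date

    def is_decodable(self) -> bool:
        # Simplifying assumption: the message can be understood only if
        # content, structure and context are all present.
        return bool(self.content) and bool(self.structure) and bool(self.context)


# A message recorded together with its structure and context.
letter = InformationObject(
    content="Arrived safely. Will write again soon.",
    structure={"medium": "paper", "encoding": "Latin alphabet", "form": "letter"},
    context={"creator": "unknown sender", "language": "English"},
)

# The same message content with nothing else carried forward.
fragment = InformationObject(content="Arrived safely. Will write again soon.")
```

In this sketch `letter.is_decodable()` is true and `fragment.is_decodable()` is false, which mirrors the definition: content alone is not yet an information object.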

The Medium is not the Message, it’s just the Medium

Recording information requires that the message content be affixed to or inscribed on some physical carrier or medium. Examples include cave walls, tree trunks, stone tablets, papyrus scrolls, pulp paper, silver-coated glass, celluloid film, vinyl discs, magnetic tape, aluminum hard disks, etc.

By inscribing a message on a physical medium it is possible to bring its contents forward through space and time. Although McLuhan-ites might disagree, the actual message that is being communicated by the information object is found in its content, not the medium. [1] The meaning of the message is decoded and interpreted using the information object’s context. The medium, in turn, is simply one part of the information object’s overall structure.

The Structure of an Information Object

The message must be inscribed using a pattern and layout that allow it to be decoded at some point in the future. Cave paintings are drawn using forms and shapes (e.g. people and animals) that are intended to be decoded by those who view them. Similarly, written languages combine symbols (e.g. letters) from a given code (e.g. the Latin alphabet) into a form (e.g. words and sentences) that is intended to be decoded by those who read it. The symbols, forms and shapes that make up the content of a message are arranged into a certain order and layout which, in itself, communicates additional information (whether implicitly or explicitly). This combination of medium, form, layout, encoding and the relationships between them can be referred to as the structure of an information object.

The Message Has No Meaning without Context

The context of an information object is the environment and conditions in which message contents are created, transmitted and received. As discussed in the previous post, the receiver of a message requires sufficient information about the context of a message to decode and make sense of its contents. The types of contextual information that may be relevant are virtually limitless and may include information about the message creator, their physical location, nationality, mother tongue, mood, body language, tone of voice, intentions, dreams, desires, etc. The point is that the relevant context is unique to almost every communication instance. Much of the context is implicit in the environment and conditions of the message transmission and in the background knowledge and experience of both the sender and receiver.

Of course contextual information is information in and of itself. It can, therefore, be thought of as meta-information which is usually referred to in practice as metadata. The term metadata originates from the data processing and database management discipline to mean, literally, ‘data about data’ (e.g. a field label name such as ‘city’ to describe a database column that stores the names of cities). [2]

Theo Thomassen has pointed out that a distinction should be made between context, context data and archival metadata. [3] However, within archival and information sciences the terms context and metadata are used liberally and often interchangeably to refer to both contextual information and meta-information. Furthermore, the term metadata is often used without further distinction to refer to metadata elements (e.g. ‘name’), metadata values (e.g. ‘Peter Van Garderen’), and metadata profiles (e.g. the complete phonebook entry for Peter Van Garderen which, in addition to my name, includes specific metadata values for the address and phone number elements).
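The three senses of the term can be kept apart with a small sketch. The variable names below are hypothetical, and the address and phone values are deliberate placeholders rather than real data.

```python
# Hypothetical illustration of the three senses of "metadata".

# Metadata elements: the named slots that can be filled in.
ELEMENTS = ("name", "address", "phone")

# A metadata profile: one complete set of element/value pairs,
# here a phonebook entry with placeholder address and phone values.
phonebook_entry = {
    "name": "Peter Van Garderen",
    "address": "<street address>",
    "phone": "<phone number>",
}

# Metadata values: what the profile records for each element.
values = [phonebook_entry[element] for element in ELEMENTS]
```

Here `ELEMENTS` holds the elements, `values` holds the values, and `phonebook_entry` is the profile that binds the two together.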

For archival materials, contextual information is a critical characteristic of information objects. Archival materials will be referenced and used at some point in the indefinite future, likely long after the original communication of the message has taken place. Any implicit and explicit information about the context of the information object’s creation, management and use must be preserved. This is to ensure that the meaning and value of the original message survives, as much as possible, into the future.

The contextual information for archival materials is captured in metadata profiles that are usually referred to as archival description. The contextual information describes the relationship of the archival materials to specific people, organizations, events, places, dates, environments, rules, conditions, concepts, ideas and to other information objects. The types of contextual information that are captured and documented can include information about the archival materials’ social-cultural context, juridical-legal context, provenancial-organizational context, administrative-procedural context, functional-process context, documentary context, and technical context. [4]

What about Digital Information Objects?

The level of detail and the types of contextual information that are available for archival materials can vary greatly. However, if the archival materials are in digital format, information about their technological context of creation, management and use is critical to ensure their long-term preservation, access and use.

Next week I will look in more detail at some of the unique characteristics and issues related to digital information objects.

————
[1] “Human communication is concerned with meaning derived from content embedded in physical objects called symbols that serve as the base units in narratives, programs, and codes called messages.” Newhagen, John. “Interactivity, Dynamic Symbol Processing, and the Emergence of Content in Human Communication” The Information Society (20) (Taylor and Francis Inc, 2004), p. 395.

[2] Within archival science and records management, metadata is typically defined as “data describing context, content and structure of records and their management through time.” ISO 15489-1 Information and Documentation – Records Management (Part 1: General). (International Organization for Standardization, 2001), p. 3.

[3] Thomassen, Theo. “Het Begrip Context in de Archiefwetenschap” Context: Interpretatiekaders in de archivistiek. (Stichting Archiefpublicaties, 2000), p. 27.

[4] As Theo Thomassen concludes in his analysis of the concept of context in archival science, the “way that context is defined or specified is strongly dependent on the purpose for which one wants to use the concept.” The types of context that are considered relevant can vary depending on whether, for example, it is being used to standardize archival descriptions, to appraise archival materials, or to determine which metadata elements are required to protect the authenticity of electronic records. Thomassen, Theo. “Het Begrip Context in de Archiefwetenschap” Context: Interpretatiekaders in de archivistiek. (Stichting Archiefpublicaties, 2000), p. 27.

2 responses so far ↓

  • 1 Andrew // Feb 13, 2007 at 3:55 pm

    Hi Peter

    I think we met at an IS&T Conference? Anyway, I’m wondering about your use of the word ‘information’ as opposed to data in the case of digital objects. Sure, digital objects contain content (data) and structure, but often they do not contain all the contextual information necessary to understand them. In that case, does that mean they are not information objects? And they cannot be used on their own as paper records can – they must be mediated by hardware and software operating together with the data object to produce the record/information.

    In archival institutions the information object will most often be the actual digital data object plus a lot of contextual metadata that lives somewhere else. Does this mean that archival institutions are not preserving information objects? I guess you have looked at the OAIS reference model? Perhaps the concept of the archival information package (AIP) could be useful for you?

    Anyway, this is interesting reading – keep it up!

    cheers
    Andrew Wilson

  • 2 Peter Van Garderen // Feb 14, 2007 at 10:56 am

    Hi Andrew,

    Thanks for your thoughtful feedback. Yes, I remember meeting you in Ottawa last May. You did a presentation on the SHERPA-DP project, right?

    Before I launch into my comments I just want to add the disclaimer that my definitions and opinions are based within the viewpoint of my own research domain. They may not work everywhere. However, I have tried to start from scratch and work my way back up and that at least provides for some interesting reading as you so kindly noted (at least for digital preservation geeks ;-)

    1) RE: “Information”: I am trying to avoid the word data altogether. It’s become a little clouded. Instead I am using content in combination with the concept of message, signs, symbols, or patterns. I define content as “the message that is communicated by information.”

    2) RE: “Context”: I agree 100% that context must be present along with content and structure, otherwise we are not dealing with information but just an incomprehensible message or, if you like, just a bunch of data.

    I think I made that point in the section above entitled “The Message Has No Meaning Without Context” as well as in the previous post on Information in the section entitled “The Importance of Context and Knowledge”. But maybe I should clarify it more simply somewhere in there: no context = no information.

    3) RE: “Digital”: In the next post I look a little closer at Digital Information Objects. There I point out that there is no theoretical difference between an information object in analogue or digital form. However, in reality managing the content, structure, and context of digital information objects is much more complex and vulnerable to loss. On top of that we have to accept the enigma that the digital object’s structure exists but we can never be truly specific about where (see the section ‘If a Digital Information Object Falls in the Forest…’).

    4) RE: “Location of Contextual Information”: I think it is commonly accepted in the digital preservation world that contextual information does not have to be embedded with the digital object (both OAIS and ISO 15489 allow for linking to external or remote metadata repositories/databases). However, if the links to the metadata are lost then we are back to square one and we are once again not dealing theoretically with information but just with a bunch of content and possibly some structure. So sure, embedding is probably less risky from a preservation point of view, but it is not a requirement, as long as the contextual information can be brought together with the content and structure at the point of access.

    5) RE: “OAIS”: Yes, like most of our colleagues I have been guided or influenced by the OAIS concepts for quite some time now. The concept of the Information Package (whether AIP or DIP) is very useful. In the course of actually developing a physical data model that implements these concepts I have stripped down and simplified the Information Package (to the point that I can’t call it an OAIS Information Package anymore). Quite simply, when I refer to content, I am referring to the OAIS Content Data Object. When I am referring to contextual information, I am referring to the OAIS Representation Information, Preservation Descriptive Information, Packaging Information, and Descriptive Information. Some or all of this contextual information is required to give the digital information object its structure but, once again, it is not the actual structure.