archivemati.ca

The Information Model to End All Information Models

December 8th, 2006 · 4 Comments · ICA-AtoM, System Architecture, System Requirements

I am now beginning work on the second alpha iteration of the open-source ICA-AtoM software application.

I am working on a considerable upgrade to the underlying object and database model that supports the application. As the application matures and evolves, it will be relatively easy to make changes and updates to the application modules and user interfaces. However, it will be difficult to make significant changes to the database model once the application goes into active deployment and people start entering information. Therefore, I had better get this right now…

ICA-AtoM must be flexible enough to support a wide variety of potential uses. First, it must work both as an archival description package for individual archival institutions and as a union catalog application that can combine descriptions from multiple repositories (e.g. http://humanrightsarchives.org).

Beyond that, I have high hopes that ICA-AtoM can evolve to become an even more universal information cataloging tool that can be used, for example, to manage personal digital archives (e.g. family photos, home movies, music collections, etc.) as well as other reference resources that typically support archival collections (e.g. encyclopedias, dictionaries, bibliographies, reference libraries, web links, etc.).

Therefore, I want to be sure that the ICA-AtoM information model supports the International Council on Archives standards out-of-the-box. At the same time, I want it to be flexible enough to support additional information cataloging and classification needs. I want it to be open, extensible and to anticipate the future wave of semantic information organization and sharing.

Magic Mystery Tour of Information Architecture

That’s why I’ve spent some mind-bending hours revisiting a number of information architecture standards, communities and research projects.

I’ve gone everywhere from Dublin Core to Microformats to Semantic Web to modified pre-order tree traversal to object relational mapping to METS to EAD to OAIS to the Monash Project to the UBC Project to Topic Maps and back to ISAD(G) and ISAAR(CPF).

The most amusing stop on this magic mystery tour of information architecture was dropping in on the Semantic Web vs. Microformats stand-off and flame war.

The most useful stop was the Topic Map community and a surprisingly relevant article written by a Microsoft architect.

Towards a Supa-Dupa Information Model

Somewhere in the back of my brain a universal information object architecture is slowly solidifying. I have, in fact, been thinking about and working on such a model for quite a number of years, starting 10 years ago with my first job out of archives school at a local software vendor and extending into my current consulting and doctoral research.

I posted a high-level architecture diagram for an archives access system on this blog last year. Now I am getting into the nitty-gritty of the information model that would support such a system.

Some of the guiding principles or goals for this base information model are:

  1. There are five core entities: Objects, Agents, Events, Places and Concepts
    • By default, Object types include archival materials.
    • By default, Agent types include archival record creators.
    • By default, Place types include archival repositories.
  2. The user can define new category types for the five core entities (e.g. adding bibliographic materials or artifacts as Object types)
  3. Associations and Topic Occurrences are two parallel contexts in which core entities relate to each other
    • Associations are any type of relationships between core entities (e.g. ‘is a child of’, ‘is a type of’, ‘is the creator of’, ‘occurred at’, ‘is storage location for’)
      • The user can define any type of association
    • Topic Occurrences are instances when a core entity is used as an access point to describe and identify an Object as an information resource (e.g. Object XYZ is an information resource about Concept XYZ or about Place XYZ or about Agent XYZ).
  4. Objects can be used as Topic Occurrences for other Objects (e.g. A. Pallister’s book Magna Carta the Legacy of Liberty is an information resource Object that is about the Magna Carta, another information resource Object)
  5. Core Entities, Associations and Topic Occurrences can be grouped and restricted to specific scopes or contexts (e.g. so that boundaries can be established between specific collections and taxonomies)
  6. Objects can exist in more than one form (i.e. analogue, digital or multiple copies of both) and in more than one place (e.g. real-world and online)
    • Online digital objects must have addressable URIs
  7. Core entities are represented by metadata profiles
  8. All information resource Objects can be described using at least Dublin Core metadata elements
  9. All Core Entities can be described at the logical/descriptive level using more than one metadata profile
    • The administrative metadata for the storage and physical management of analogue and digital objects must be generic at the physical level
  10. The data model must be flexible enough to surface data stored in ICA-AtoM as EAD, EAC, Microformats, RDF or topic maps (e.g. as XSLT transformations or through REST APIs)
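
To make principles 1 to 5 more concrete, here is a minimal sketch in Python of how the core entities, Associations and Topic Occurrences might fit together. It is only an illustration and not the ICA-AtoM schema: the names `EntityKind`, `Entity`, `Association`, `TopicOccurrence` and `Catalog`, their fields, and the in-memory lists are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityKind(Enum):
    """The five core entities from principle 1."""
    OBJECT = "object"
    AGENT = "agent"
    EVENT = "event"
    PLACE = "place"
    CONCEPT = "concept"


@dataclass(frozen=True)
class Entity:
    """A core entity; `category` is the user-definable type (principle 2)."""
    id: str
    kind: EntityKind
    category: str  # e.g. "archival material", "bibliographic material"
    label: str


@dataclass(frozen=True)
class Association:
    """A user-definable, typed relationship between two entities (principle 3)."""
    source: Entity
    type: str  # e.g. "is the creator of", "is a child of"
    target: Entity
    scope: str = "default"  # hypothetical scope label (principle 5)


@dataclass(frozen=True)
class TopicOccurrence:
    """An Object acting as an information resource about a topic entity."""
    resource: Entity  # must be an Object
    topic: Entity     # any core entity, including another Object (principle 4)
    scope: str = "default"

    def __post_init__(self):
        if self.resource.kind is not EntityKind.OBJECT:
            raise ValueError("only Objects can be information resources")


@dataclass
class Catalog:
    """In-memory stand-in for the database; hypothetical, for illustration only."""
    associations: list = field(default_factory=list)
    occurrences: list = field(default_factory=list)

    def resources_about(self, topic, scope=None):
        """Return the Objects that are information resources about `topic`."""
        return [o.resource for o in self.occurrences
                if o.topic == topic and (scope is None or o.scope == scope)]

    def related(self, source, type, scope=None):
        """Return the targets of `source` through Associations of the given type."""
        return [a.target for a in self.associations
                if a.source == source and a.type == type
                and (scope is None or a.scope == scope)]


# The Magna Carta example from principle 4.
charter = Entity("obj-1", EntityKind.OBJECT, "archival material", "Magna Carta")
book = Entity("obj-2", EntityKind.OBJECT, "bibliographic material",
              "Magna Carta the Legacy of Liberty")
author = Entity("agt-1", EntityKind.AGENT, "author", "A. Pallister")

catalog = Catalog()
catalog.associations.append(Association(author, "is the creator of", book))
catalog.occurrences.append(TopicOccurrence(book, charter))

print([e.label for e in catalog.resources_about(charter)])
print([e.label for e in catalog.related(author, "is the creator of")])
```

Keeping Associations and Topic Occurrences as separate record types mirrors the "two parallel contexts" idea: the same two entities can be linked structurally (who created what) and topically (what is about what) without one kind of link being mistaken for the other.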

Now my challenge will be to implement these design principles and objectives in a functional and optimal web-based application. This means that the complexity has to be hidden from the user and the application can’t slow down to a crawl when processing a number of complex relationships or database queries.

This is what I’ll be working on over the next few weeks using the Symfony framework, its Propel object-relational mapping layer and the MySQL database engine.

I am sure I’ll have to make some compromises and tweaks along the way (keep in mind that this list is simply some research notes generated during my best practice analysis) but hopefully this investment will pay off in the long run to make the ICA-AtoM information model as flexible and powerful as possible.

4 responses so far ↓

  • 1 thesecretmirror.com » Blog Archive » The State of Open Source Archival Management Software // Dec 20, 2006 at 8:01 pm

    [...] ICA-AtoM didn’t have the most coverage at this year’s SAA conference. However, its developer, Peter Van Garderen of Artefactual Systems, Inc. (also a PhD candidate at the Universiteit van Amsterdam), spoke at the session entitled “Finding Aids: The Next Generation.” The development of ICA-AtoM is being sponsored by the International Council on Archives for the development of the Guide to Archival Sources on Human Rights Violations, a multi-repository guide to collections held around the world. Although ICA-AtoM is only at the alpha-stage of development, the software seems robust enough to support distributed archival description compatible with ISAD(G) and ISAAR(CPF). Obviously, it appears to be better suited for multi-institutional descriptive projects, and that in turn seems to reflect the current lack of features that the other two programs offer, such as the ability to track collections from the time of accession onwards and the creation of container lists. However, the development roadmap indicates that these and other features are on definitely planned for inclusion. I believe that Van Garderen’s decision to make these features modular — which most likely means the application will also be scalable — clearly shows sophistication in both planning and development. The collaborative aspect of ICA-AtoM’s implementation, its use of international standards, and Van Garderen’s candor regarding the development process make me believe that this package might be the best replacement for our current implementation of ICOS. [...]

  • 2 Basement Tapes » The Information Model to End All Information Models // Dec 23, 2006 at 5:04 am

    [...] archivematica — Blog Archive » The Information Model to End All Information Models [...]

  • 3 TLC News Service » Flashback (Week of 12/18/06) // Mar 5, 2007 at 9:39 pm

    [...] "The Information Model to End All Information Models" ICA-AtoM must be flexible enough to support a wide variety of potential uses. Firstly, as an archival description package for individual archival institutions as well as a union catalog application that can combine descriptions from multiple repositories (e.g. http://humanrightsarchives.org). [...]

  • 4 A Data Model of Web Data Models: Part I » AI3:::Adaptive Information // Oct 10, 2007 at 9:06 am

    [...] ICA-Atom — ICA-Atom is an open-source, archival description application that is currently in development, based on the Reference Model for an Open Archival Information System (OAIS). CCSDS 650.0-B-1, Blue Book, January 2002. Also, see Peter Van Garderen (Fri, December 8th, 2006), The Information Model to End All Information Models; see http://archivemati.ca/2006/12/08/the-information-model-to-end-all-information-models/. [...]