April 11-14, 2007
San Francisco, California

When Is A Terracotta Hut Urn Like A Sailor’s Deck-Log?
Meaning Instantiated Across Virtual Boundaries

Richard P. Smiraglia, Long Island University, USA

Abstract

Resource sharing requires the integration of scientific and cultural information, which itself requires semantic interoperability on several levels, from metadata structures to the representation of intellectual content. Repository- and resource-specific data-structures fail to acknowledge the universality of function that might be common among resources. An example is the case of instantiation. When is a terracotta hut urn like a sailor’s deck-log like a best-selling novel? When its representations are multiply instantiated within a repository, but the instantiations are not distinguishable for retrieval purposes. Instantiation is the phenomenon first denoted empirically by research into bibliographic ‘works.’ Specifically, an instantiation of a work exists whenever the work is manifest in physical form (in a book, for example). A problem arises when multiple instantiations of a work (several editions, translations, etc.) exist and their descriptions must be linked in a retrieval system with sufficient information to assist in the selection of the instantiation of interest to a searcher.

Similarly, unique artifacts are represented by metadata or by images (called representations), which exist in multiple instantiations (a photographic negative, a print, its digital descendant, etc.). The same is true of the representations of archival documents, which might exist in paper photocopies (original and carbon), digital images, and so forth. Etruscan artifacts at the University of Pennsylvania Museum of Archaeology and Anthropology and the Archives of the Class of 1942 of the United States Merchant Marine Academy provide empirical evidence of the phenomenon of instantiation. Multiplicity among informing objects is universal, and the analysis of instantiation shows this. Instantiation can be demonstrated as one example of a semantically interoperable, and empirically derived, model for the integration of scientific and cultural information.

Keywords: artifacts, documents, instantiation, metadata, ideation, semantic, knowledge sharing

Introduction

This paper is about enhancing the ability of repositories of all kinds to share information. In particular, research demonstrates that the phenomenon of “instantiation” – multiple representations – is universal among information objects. Specifically, we will see how this phenomenon can lead to ambiguous metadata for artifacts and archival documents.

1.0 Sharing Metadata and Semantic Interoperability

Global resource sharing requires the integration of scientific and cultural information. Such integration requires semantic interoperability on several levels, from metadata structures to the representation of intellectual content. However, interoperability is hampered by a lack of empirical evidence supporting the development of data structures. Museums, archives, and libraries – particularly in today’s virtual environment – share a common purpose: the dissemination of culture. But our post-modern emphasis on domain-specific ontology, which is widely seen as an improvement in knowledge organization, can lead to a form of ‘Balkanization’ (that is, isolation) in resource description. Repository- and resource-specific data-structures, generated in-house by rational or pragmatic means, often fail to acknowledge the universality of function that might be common among resources.

An example is the case of the phenomenon of instantiation. When is a terracotta hut urn like a sailor’s deck-log like a best-selling novel? When its representations are multiply instantiated within a repository, but the instantiations are not distinguishable for retrieval purposes. In this paper we shall see the results of empirical analysis of cases in both a museum and an archives. We will learn that for any given artifact, the repository holds a plethora of identically-named instantiations. One solution for semantic interoperability would be an empirically-verifiable taxonomy of artifact instantiation.

1.1 Useless Metadata Lack Differentiation

Metadata are categorical descriptors of information resources, often used to constitute strings for information retrieval. That is, resource-linkage is provided through the semantic application of content metadata. Furthermore, specific metadata structures provide semantic linkage among resource descriptions. Most metadata schemas rely on pragmatically-derived descriptive conventions, and most of these rely on describing a visible entity – “describe what you see,” or as library cataloging rules have it, “describe the item in hand.”

However, and herein lies the problem, data structures must be designed both to collocate (gather together a group of like entities) and to disambiguate (allow differentiation among identical-appearing, but distinct, entities). Unfortunately, reliance on resource description alone can frustrate both objectives. Figure 1 shows comparative screen displays, one bibliographic and one artifactual. In both, resource description is accurate and therefore succeeds at collocating descriptions that appear to be the same. And in both, varying instantiations are not disambiguated, leaving a searcher no recourse other than to wade through description after description in search of a suitable object.

Fig 1: Undisambiguated Display


In both lists the entities look identical, but are not. Rather, their metadata representations, based on pragmatic schemas, make them appear to be identical. The entities actually represent points of instantiation, points along a temporal continuum at which representations of the intellectual content have taken form. Some (like copies from the same printing, or prints made simultaneously from the same photographic negative) are simply re-presentations. Others, and this is more likely the farther we get along the continuum, represent altered intellectual content (such as translations, or digitally enhanced images). The problem, simply enough, is to create a taxonomy that can guide disambiguation by turning these displays into relational structures.
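The two objectives can be illustrated with a minimal sketch. It assumes a hypothetical `Record` structure in which each description carries a link to the entity it represents, an instantiation type, and a position on the temporal continuum; none of these field names or sample values comes from the repositories studied.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One retrieval-system record; all field names are illustrative assumptions."""
    work: str    # the entity whose content is represented
    name: str    # what a flat display shows
    kind: str    # kind of instantiation
    order: int   # position on the temporal continuum (invented for the sketch)


def collocate(records):
    """Gather the records of each work together, ordered along the continuum."""
    groups = defaultdict(list)
    for record in records:
        groups[record.work].append(record)
    return {work: sorted(items, key=lambda r: r.order)
            for work, items in groups.items()}


def disambiguate(items):
    """Label each identically named record with its instantiation type."""
    return [f"{r.name} [{r.kind}]" for r in items]


# Three records that a name-only display would show as identical.
records = [
    Record("olpe", "Etrusco-Corinthian olpe", "digital image", 3),
    Record("olpe", "Etrusco-Corinthian olpe", "photographic negative", 1),
    Record("olpe", "Etrusco-Corinthian olpe", "photographic print", 2),
]

for work, items in collocate(records).items():
    print(work)
    for label in disambiguate(items):
        print("  " + label)
```

The point of the sketch is only that collocation needs a shared link to the entity, while disambiguation needs at least one attribute that records the kind of instantiation.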

Another problem is that metadata, essentially a set of ‘names,’ cannot be neutral because the names influence the activity of facilitating (or obfuscating) use. The very act of naming – the essential activity of resource description – objectifies potential use of the items named. According to Hjørland’s appeal to activity theory (2003, 98), metadata schemas, rationally deduced, predetermine the potential use of intellectual content by limiting its retrieval. Even the simple name ‘document’ has the potential to limit use of artifacts by directing (or mis-directing) searchers. Knowledge extraction then – the act of generating taxonomic description of resources – should be based on empirical observation of the resources themselves. What I mean to suggest is that empirical research can both demonstrate the complexity of representations of artifacts and also suggest, through dynamic taxonomy, a set of practices that allow the resources to speak for themselves.

2.0 Instantiation is a Global Information Phenomenon

Instantiation is the phenomenon first denoted empirically by research into bibliographic ‘works’ (Smiraglia 2001, 2002, 2005a). Specifically, an instantiation of a work exists whenever the work is manifest in physical form (in a book, for example). A problem arises when multiple instantiations of a work (several editions, translations, etc.) exist and their descriptions must be linked in a retrieval system with sufficient information to assist in the selection of the instantiation of interest to a searcher. Similarly, unique artifacts are represented by metadata or images (called representations), which exist in multiple instantiations (a photographic negative, a print, its digital descendant, etc.). The same is true of the representations of archival documents, which might exist in paper photocopies (original and carbon), digital images, and so forth.

In this discussion we are dealing with the interaction of an entity and a phenomenon. The entity is the intellectual content of any artifact, known bibliographically as ‘the work.’ The work is the set of ideas and the manner of their communication; that is, ideational and semantic content combined. A bibliographical work is usually associated with a writing, and the writing usually has a text of some sort. Artifactually, however, the concept also applies. An object was created with some intention (ideational content) and some physical form (semantic content). A red wooden tulip sitting on my desk was created as an artistic representation of a flower in bloom – its ‘ideation’ – and it was created mechanically from wood and paint – its ‘semantic expression.’ A photograph of it is not the same as the tulip itself, although ideationally it represents the same thing. Furthermore, a black and white print of that photograph continues to represent the ideational content, but in altered form. In a retrieval system we will want all three iterations (that is, instantiations) to be connected to one another so as to identify their common representation. That is, we want the descriptions to collocate. But we also want to be able to tell the photos from the original and from each other; that is, we want the descriptions to disambiguate.

And so we have the interaction of the ‘work’ entity with the phenomenon of instantiation. Instantiation (Smiraglia 2005) is essentially a generic term for the realization of a work at a specific point in time. Other terms are often confused with this phenomenon – version, manifestation, for example – but instantiation is a simpler term that is used to signal a place in a sequence of time, but with no further implication of intellectual or physical detail. It is the generic name of a universal phenomenon that has been observed among all informing objects – resources, if you will. And it is very useful for analysis of sets of multiply-realized phenomena.

From bibliographic evidence (Smiraglia 2001) we know that substantial numbers of works evolve large instantiation networks over time (that is, have many editions, translations, screenplays, etc.). The types of instantiation can be grouped in two broad categories: derivation, which is used for subsequent iterations of the work with little change, and mutation, which is used for iterations with substantial change in ideational or semantic content. Thus a second edition of a book is a derivation, but the motion picture based on it is a mutation. And we know that the primary catalyst for evolution of instantiation is cultural acceptance, termed ‘canonicity’ in the bibliographic realm. Once a work becomes part of public consciousness, there is a tendency (one might evoke here the simple law of supply and demand) for instantiations to evolve.
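The two-way split can be written down as a small rule. The sketch below is an assumption-laden illustration, not a published algorithm: the `categorize` function and its two boolean flags are hypothetical, and the flag values assigned to each example are illustrative judgments rather than data from the studies.

```python
from enum import Enum


class Category(Enum):
    DERIVATION = "derivation"  # little or no change to the work
    MUTATION = "mutation"      # substantial ideational or semantic change


def categorize(ideational_change: bool, semantic_change: bool) -> Category:
    """Apply the two-way split: any substantial change makes a mutation."""
    if ideational_change or semantic_change:
        return Category.MUTATION
    return Category.DERIVATION


# The change flags below are illustrative judgments, not data from the studies.
examples = {
    "second edition": categorize(False, False),
    "translation": categorize(False, True),
    "motion picture": categorize(True, True),
}

for name, category in examples.items():
    print(f"{name}: {category.value}")
```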

When we turn to museums and archives, we find the same phenomenon of instantiation is abundant. But there are, of course, some differences. In the realm of artifacts, we must distinguish first between the artifact itself, metadata descriptions of the artifact, and representations, e.g., photographs, models, etc. While the artifacts are unique, we can observe instantiation among both metadata descriptions and representations. In the archives we have a similar but less clear-cut situation: we can observe instantiations of the original documents as well as their representations. In both cases, because we are making a distinction between a unique artifact and the subsequent iterations, we have termed this process ‘content genealogy’ (Smiraglia 2004, 2005b). Let us now turn to some specific examples derived from research.

3.0 Hut Urns and Deck Logs

Two recent studies consistently demonstrate the phenomenon of instantiation across resource and repository types (Smiraglia 2004 and 2005b, 2006). These papers report case-study evidence derived from Etruscan artifacts at the University of Pennsylvania Museum of Archaeology and Anthropology, and from the Archives of the Class of 1942 of the United States Merchant Marine Academy. Both the Etruscan artifacts and the merchant marine archives demonstrate the presence of both representations (images, models, etc.) and metadata sets (descriptive data) for each artifact.

3.1 Etruscan Artifacts

Case study, an appropriate first step in uncharted territory, is a useful technique for gathering exhaustive data about a few selectively chosen subjects, in order to discover as much as possible about them. We were most interested at the outset in seeing whether multiple representations of an artifact might be extant (which would indicate a need to ‘collocate’ in an information system). Therefore we sought a set of artifacts that would be easily available to the researcher and that would likely have appeared in representations in print as well as in digital form. Etruscan artifacts from the University of Pennsylvania Museum of Archaeology and Anthropology were selected to provide qualitative evidence of the content genealogy of representations of non-documentary artifacts. These artifacts are housed in a popular exhibition that is also well-represented in digital form on the Web (http://www.museum.upenn.edu/new/worlds_intertwined/etruscan/main.shtml). The exhibit is also accompanied by a published Guide (2002). Choosing artifacts that had both Web and print representations guaranteed that some degree of instantiation would be present. Eight artifacts were selected based on their prominence in the exhibit, and with attention to their diversity (see Table 1).

| Material | Artifact | Source | Date |
| --- | --- | --- | --- |
| Terracotta | Hut urn | Said to have come from the area between Albano and Genzano | 8th c. BC |
| Impasto | Kotyle | Narce, Tomb I | 7th c. BC |
| Impasto | Footed bowl | Narce, Tomb 19M | early 7th c. BC |
| | Etrusco-Corinthian Olpe | Vulci, Tomb B | 6th c. BC |
| | Nenfro Lintel | Orvieto, Crocifisso del Tufo Necropolis | ca. 550 BC |
| | Bucchero Kantharos | Vulci, Tomb B | 6th c. BC |
| Terracotta | Antefix with Satyr Head | Caere (Cerveteri) | 4th c. BC |
| Alabaster | Cinerary urn | | 3rd-2nd c. BC |

Table 1: The Eight Artifacts as Described by UPM

With the assistance of the museum’s archivist, every record in-house for each artifact was examined and noted. It was no surprise to discover that for each artifact both metadata descriptions and representations exist. In fact, instantiation was abundant. Table 2 contains the total numbers of representations and metadata sets that were found for each artifact, both in-house and outside the repository. Internal records were used to compile a bibliography of publications in which both metadata descriptions and images could be found. Note the disparity among the artifacts in the quantity of metadata and image representations. Certain of the artifacts have substantial sets of information-objects, both in-house and outside the museum. The Etrusco-Corinthian olpe has the most, and the lintel has the fewest. This suggests that presence neither on the Web nor in the popular exhibition influences instantiation directly. If we disregard the functional metadata and representations – those created in-house according to routine museum function – we still see disparity. Now the olpe and the cinerary urn have the largest sets of representations in print, followed closely by the satyr head antefix. As reported earlier, a number of factors were ruled out as contributors to instantiation.

| Artifact | Representations In-House | Representations in Publication | Metadata In-House | Metadata in Publication |
| --- | --- | --- | --- | --- |
| Etrusco-Corinthian Olpe | 9 | 4 | 6 | 2 |
| Footed bowl | 4 | 2 | 6 | 3 |
| Bucchero Kantharos | 4 | 1 | 6 | 2 |
| Hut urn | 1 | 2 | 5 | 1 |
| Antefix with Satyr Head | 5 | 3 | 6 | 3 |
| Cinerary urn | 7 | 1 | 6 | 6 |
| Kotyle | 4 | 1 | 4 | 2 |
| Nenfro Lintel | 1 | 1 | 7 | 1 |

Table 2: Representations of The Eight Artifacts

However, these two artifacts, the olpe and the cinerary urn, together with the satyr head, show the most instantiation. The common factor among these three, and the factor that distinguishes them from the rest, is appearance in published volumes soon after acquisition. As reported in Smiraglia (2005b), these artifacts are among those described by the Conservationist as ‘frequent flyers,’ a term used to denote objects that are frequently requested for loan. It seems that once an object has appeared in publication, it tends to get requested even if the museum has better exemplars in its collections. Such items spend more time on loan than in the museum, hence the moniker. And each time the artifact is loaned, a new set of before- and after-metadata and photographs is generated to help monitor the continued integrity of the artifact. While on loan, the artifact is likely to have metadata and image representations published in the loan-exhibition catalogue, and these likely will generate more attention over time. This is the parallel to the bibliographic catalyst of canonicity. That is, cultural acceptance of the entity is a catalyst for instantiation.
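The counts in Table 2 can be tallied directly. The short sketch below transcribes the table and sums each row, once across all four columns and once for the two "in publication" columns only; the function names are assumptions introduced for illustration, and the figures are only as reliable as the table as it appears here.

```python
# Counts transcribed from Table 2, in column order:
# (representations in-house, representations in publication,
#  metadata in-house, metadata in publication)
table2 = {
    "Etrusco-Corinthian Olpe": (9, 4, 6, 2),
    "Footed bowl": (4, 2, 6, 3),
    "Bucchero Kantharos": (4, 1, 6, 2),
    "Hut urn": (1, 2, 5, 1),
    "Antefix with Satyr Head": (5, 3, 6, 3),
    "Cinerary urn": (7, 1, 6, 6),
    "Kotyle": (4, 1, 4, 2),
    "Nenfro Lintel": (1, 1, 7, 1),
}


def total(counts):
    """All information-objects recorded for one artifact."""
    return sum(counts)


def in_publication(counts):
    """Only the representations and metadata found in print."""
    return counts[1] + counts[3]


# Rank artifacts by their overall number of instantiations, largest first.
ranked = sorted(table2, key=lambda name: total(table2[name]), reverse=True)

for name in ranked:
    counts = table2[name]
    print(f"{name}: total={total(counts)}, in publication={in_publication(counts)}")
```

On these figures the olpe, the cinerary urn, and the satyr-head antefix head the ranking, which matches the three artifacts singled out above.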

Types of instantiation were gathered into a taxonomy, thus:

Metadata

Representation

Many of the terms here are associated with museum functions; the very fact of holding an artifact means that a repository might multiply-instantiate various representations. Content genealogy consists of the reiteration of metadata sets from field notes to object records to database, and the sequential production of images from photographic negatives which might subsequently be digitized. The olpe had a more interesting instantiation history; for instance, for one image there is a lantern slide, a nitrate negative, a 35 mm. negative, and 3 copies of the 35 mm. negative, and a photograph published in an early museum catalog. In fact, there are two such (different) images of the olpe.

3.2 Merchant Marine Memorabilia

The U.S. Merchant Marine Academy at Kings Point, Long Island, New York, houses an extraordinary library with extensive archives of former midshipmen. The members of the Class of 1942 have used their self-awareness to constitute a rich historical record of their class. The fourteen ‘folders’ created by members of the class tell the story of young men who chose the seafaring life for a career but found themselves destined for greatness. The attack on Pearl Harbor on December 7, 1941, shocked a nation, and these young cadets soon found themselves commissioned as officers in the merchant marine and on their way to sea in the thick of war. Table 3 is a brief summary that demonstrates some of the richness of the collection via a taxonomy of forms, genres, and topics; media include documents, video-recordings and one CD-ROM compiled as an alumni project.

| Forms and Genres | Topics |
| --- | --- |
| Articles (newspaper) | Alumni |
| Binders | Battle of the Atlantic |
| Biographies | Cadets |
| Brochures | Chrysler Estate |
| Certificates | Chrysler House |
| Correspondence | Officers |
| Diplomas | P.O.W. Survival |
| Directories | Ships |
| Handbooks | Examinations |
| Interviews | Lost classmates |
| Land surveys | Officers |
| Narratives | Reunion |
| Newsletter | Training |
| Photographs | U.S. Merchant Marine Cadet Corps |
| Registers | War risk insurance |
| Stories | Engineer license |
| Videorecordings (VHS tapes) | |

Table 3: Contents of the Class of 1942 Project

Two folders – documents contributed by two former midshipmen – constitute the cases for analysis. In the first folder we found letters, envelopes, binders, photographs, ship’s deck-logs, time-sheets, scholarship applications, and so forth. A heavy, canvas-bound ring-binder dominated the folder. It contained ‘orders’ – papers ordering military personnel from place to place. A note inside indicates the binder had been issued specifically because it would sink in ocean water. In the event a ship was boarded, these binders were to be tossed overboard so as to be lost to enemy intelligence. Inside the binder was a deck log from this cadet’s time at sea; immediately behind the log was a photocopy of it. In the second folder we found a bound diary reconstituted from the written diary of another midshipman, submitted to the class project by his son. The diary is reproduced via word processing, but a substantial segment of the original hand-written material has been photocopied and tipped in at back. Quite a lot of memorabilia are inserted. Table 4 is a schematic representation of the contents of these folders.

| Folder One | Folder Two |
| --- | --- |
| Word-processed letter (2003) and photocopy; attached to photocopy of carbon of letter (1945) | Diary: volume bound with decal of USMMA on front cover; hand-written text from the original diary tipped in at back |
| Transcript | verso front cover pasted notice listing location of copies of volume |
| Application for admission | leaf pasted photo of author |
| Deck log; photocopy of deck log | narrative text 93 pages; |
| Memo (and photocopy of memo) attached to scanned copy of United States Merchant Marine Cadet Corps: A Brief History | Deck Cadet Certificate pasted in |
| 2 memos concerning Brief History | Postal Commemorative, and 2 ship photos pasted in |
| 6 letters | two photos of the subject with classmates |
| class newsletter | photocopy of discharge |
| Word-processed typescript of personal narrative; contains 6 photographs scanned in | photocopy of document |
| Scanned copy of Brief History | photocopy of photographs and text from New York newspaper |
| | two photographs pasted in |
| | photocopy of story |

Table 4: Instantiation among the Merchant Marine memorabilia

Perhaps the most interesting observation was the amount of instantiation built into the collection itself by the members of the class as they compiled their historical archives. Commonplace are photocopies, carbon copies, digitized scans of postcards containing photographs, scans of photos, photos alongside digitized scans of them, and documents together with their carbon copies and digitized scans of the originals.

4.0 Achieving the Ability to Share Knowledge

The term content genealogy describes the succession of representations that occur along a chronological continuum. Multiplicity among informing objects is universal, and the analysis of instantiation shows this. Instantiation can be demonstrated as one example of a semantically interoperable, and empirically derived, model for the integration of scientific and cultural information. Elements of instantiation that are consistent across these studies, as well as with the bibliographical research, are canonicity and nodes within instantiation networks at points of change in ideational or semantic content. In fact, we can place what we have been calling taxonomies side by side and see that they constitute a typology of instantiation (Table 5). That is, the types are not mutually exclusive, but may occur simultaneously, as for example in the case of a photocopy of a carbon copy of a typed letter.

| Bibliographic Works | Artifacts--Metadata | Artifacts--Representations | Personal Papers |
| --- | --- | --- | --- |
| simultaneous editions | -finding aids | -field photos | Photocopies |
| successive editions | -field notes | -working images | Carbon copies |
| predecessors | -letters | -exhibition color images | Photos |
| amplifications | -conservation treatment notes | -digitized exhibition images | -postcard with photo |
| extractions | -register descriptions; object cards | -conservation photos | -digitized scan of postcard with photo |
| accompanying materials | -image order invoices | -archived photographic negatives | -reprint of photo |
| musical presentation | -museum database records | -archived photographic prints | -digitized scan of photo |
| notational transcription | -catalog card records | -archived photographic transparencies | |
| persistent works | -finding aids | | |
| ---------- | ---------- | ---------- | ---------- |
| translations | | -object reproductions | |
| adaptations | | -drawings | |
| performances | | -3D models | |

Table 5: Comparative Instantiation Typologies
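The observation that types may occur simultaneously, as with a photocopy of a carbon copy of a typed letter, can be sketched as a chain in which each instantiation points to the thing it was made from. The `Instantiation` class, its field names, and the helper functions are assumptions made for illustration; they are not a structure used by either repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instantiation:
    """A node in a content genealogy; class and field names are assumptions."""
    label: str
    kind: str                                  # a term from the typology
    source: Optional["Instantiation"] = None   # what this was made from


def lineage(node):
    """Walk back from a node to the original, collecting each step."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.source
    return chain


def kinds(node):
    """Every typology term that applies to a node, nearest first."""
    return [step.kind for step in lineage(node) if step.source is not None]


letter = Instantiation("typed letter", "original")
carbon = Instantiation("carbon of letter", "carbon copy", source=letter)
photocopy = Instantiation("photocopy of carbon", "photocopy", source=carbon)

print(" <- ".join(step.label for step in lineage(photocopy)))
print(kinds(photocopy))
```

Because the types are read off the whole chain rather than stored as a single value, one object can carry several typology terms at once.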

The typologies demonstrate Hjørland’s idea (2003) that activity theory can proscribe the categorizing activity of knowledge organization. We denote categories for metadata in order to assign information objects spatial loci within our own schema. The typologies also demonstrate the epistemological parameters of derivation and mutation. Derivation denotes types or properties of instantiation in which intellectual content is unaltered; mutation denotes types or properties of instantiation in which intellectual content has been altered semantically or ideationally. In Table 5, terms listed below the solid line represent mutations, which occur in both bibliographic and artifactual typologies. According to research to date, archival records and artifactual metadata typologies identify derivations.

5.0 Virtual Boundaries?

What does all of this mean? Primarily it means that there is a need for a common language for sharing information about artifacts across repositories and repository types. In the virtual world, the potential exists for museums, archives, and libraries to constitute unified virtual repositories of cultural information. Regardless of the type of artifacts held – that is, whether they are naturally occurring, works of art, personal papers, or books, or anything else – the phenomenon of instantiation is consistent, and is consistently problematic.

One attempt at a common language for information sharing is the CIDOC-Conceptual Reference Model (known as CIDOC-CRM). This model is an empirically derived language that supplies “definitions and formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation” (http://cidoc.ics.forth.gr/). A powerful template for knowledge sharing, the CRM has been designed to conform to several metadata sets or knowledge models for cultural resources, ranging from the Dublin Core, to Encoded Archival Description, to the Functional Requirements for Bibliographic Records (wherein instantiation plays a large role), to MPEG7, and to a variety of museum specifications.

The phenomenon of instantiation has been extended from the confines of the bibliographic realm into the basic language of the CRM, incorporating not only the research reported here but also semiotic theory. Major physical entities are Creation, Production, Conceptual Object, and Information Carrier, which are the ‘things’ of information resources. Major intellectual entities are Work and Expression, the substance of which are acknowledged to act as signs. The substance of a work, always, is ideational content – ideas.

Conclusion

When is a terracotta hut urn like a sailor’s deck log? The first answer is that they are alike when both are perceived as part of the world’s cultural resources. The second answer is that they are alike when both are digitized and placed in virtual repositories for use by scholars. And the third answer is that they are alike when their multiple instantiations cause confusion in information retrieval, or when repositories fail to disseminate the entire realm of possible iterations present in the instantiation networks. This is the challenge for virtual repositories of every stripe. Efforts based on taxonomies derived from empirical analysis show promise for helping unite the world’s information repositories.

Acknowledgements

The author wishes to acknowledge the assistance of the University of Pennsylvania Museum of Archaeology and Anthropology, especially Museum Archivist Alessandro Pezzati, and the United States Merchant Marine Academy Library and its director Dr. George Billy.

References

Guide to the Etruscan and Roman Worlds (2002). Philadelphia: The University of Pennsylvania Museum of Archaeology and Anthropology.

Hjørland, Birger (2003). “Fundamentals of knowledge organization”. Knowledge organization 30: 87-111.

Smiraglia, Richard P. (2005b). “Content metadata – An analysis of Etruscan artifacts in a museum of archeology”. Cataloging & classification quarterly. 40 n3/4: 135-51.

Smiraglia, Richard P. (2006). “Empiricism as the basis for metadata categorization: expanding the case for instantiation with archival documents”. In Budin, Gerhard, Christian Swertz and Konstantin Mitgutsch eds. Knowledge Organization and the Global Learning Society; Proceedings of the 9th ISKO International Conference. Vienna, July 4-7 2006. Würzburg: Ergon, pp. 383-88.

Smiraglia, Richard P. (2005a). “Instantiation: Toward a theory.” In Vaughan, Liwen, ed. Data, information, and knowledge in a networked world; Annual conference of the Canadian Association for Information Science. London, Ontario. June 2-4 2005. Available http://www.cais-acsi.ca/2005proceedings.htm

Smiraglia, Richard P. (2004). “Knowledge sharing and content genealogy: extending the ‘works’ model as a metaphor for non-documentary artifacts with case studies of Etruscan artifacts”. In McIlwaine, Ia C., ed. Knowledge Organization and the Global Information Society; Proceedings of the Eighth International ISKO Conference 13-16 July London UK. Advances in knowledge organization v. 9. Würzburg: Ergon Verlag, pp. 309-14.

Smiraglia, Richard P. (2001). The nature of a work: implications for the organization of knowledge. Lanham, MD: Scarecrow.

Smiraglia, Richard P. (2002). “Works as signs, symbols, and canons: The epistemology of the work”. Knowledge organization 28: 192-202.

Cite as:

Smiraglia, R., When Is A Terracotta Hut Urn Like A Sailor’s Deck-Log? Meaning Instantiated Across Virtual Boundaries, in J. Trant and D. Bearman (eds.). Museums and the Web 2007: Proceedings, Toronto: Archives & Museum Informatics, published March 1, 2007 Consulted http://www.archimuse.com/mw2007/papers/smiraglia/smiraglia.html
