MULTIPLE USES OF DATA IN THE MUSEUM ENVIRONMENT
1 INTRODUCTION
The fondness museum professionals have for standards committees
and for data interchange standards is well known. From the Getty thesaurus
to the Dublin Core, and on to committees debating the 'best' graphics
format, the quest for the universal language which allows museum to
talk unto museum goes on. Standards in computing bring familiar problems:
technology renders them obsolete before they are finally
ratified; everyone, whether through benevolence, competitive advantage
or the 'not invented here' syndrome, introduces an extension or enhancement
to the standard; and too much time is spent discussing the standard
instead of finding something useful to do with the results. Apart from
these, there is surely an opportunity being missed as a result of the
mindset which urges standardisation.
The aim of allowing a universal search of every museum database
is a laudable one but what does it really offer? If I search a ship
database for vessels between certain tonnages, are those values based
on gross tonnage or on displacement? If I want to look at voyages
to modern-day Kaliningrad in the seventeenth century do I look for
the modern port, or the earlier German name of Königsberg, or,
if the data refers to trading records from Scottish mercenaries in
Russian service, Korolovets? The way in which any historical question
is framed will depend to a great extent on the questioner's background
and intellectual baggage. To make data available
to a wide audience either the data needs to be standardised, and user
thinking modified, or the users left with their modes of thought and
professional practices intact, and the data varied. If we can accept
that 'a museum can be said to offer intellectual access to its resources
if it enables people to think, for purposes they have defined themselves,
about the objects in its collection' (Orna, 1994)
why should we try to constrain users in the potentially far more flexible
world of the virtual museum or archive? Data is infinitely more malleable
than people.
Before we begin to think of universal access to museum data we need
to solve a more local problem. How can data sources within one organisation
meet the needs of different users within that institution and the
needs of visitors, physical or virtual?
2 THE COMMON DATA WELL
Every institution has multiple uses for its data. That is why many
museum Web sites are on-line versions of brochures or ugly extensions
to collections management data. Several collections management software
packages boast of their public-access add-ons. This, unfortunately,
ignores the rather obvious fact that most members of the public want
to ask questions quite different from those asked by the curators
responsible for museum collections. One collections management package offering this add-on
feature illustrated its advertisement with a screen shot of an entry
for an image of a Scarlet Macaw with a button enticingly labelled
Related. What would a related entry be? Another picture of
a macaw, a parakeet, other animals, other preliminary oil sketches,
other works by the same artist, other paintings by 19th century Belgians,
other paintings from the same museum collection, purchased with the
same bequest, other paintings on paper or with similar brush strokes
or similar colours? We can think of limitless relations: the ones
we think of first reflect our own interests. Potentially, almost all
the relations we can conceive of are in the underlying database somewhere,
in that it would be possible to construct an SQL statement (assuming
the database supported this) to reveal them. The Related button
is, however, too coarse a tool to achieve this.
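As a minimal sketch of this point, assume a hypothetical single-table schema (the table `works`, its columns and its rows are invented for illustration and are not taken from any collections management package). Each reading of 'related' then becomes a different SQL query over the same, unchanged data:

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE works (id INTEGER PRIMARY KEY, title TEXT, artist TEXT, "
    "subject TEXT, medium TEXT, bequest TEXT)"
)
conn.executemany(
    "INSERT INTO works (title, artist, subject, medium, bequest) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Scarlet Macaw", "Artist A", "macaw", "oil sketch on paper", "Bequest X"),
        ("Macaw in Flight", "Artist B", "macaw", "watercolour", "Bequest Y"),
        ("Harbour at Dusk", "Artist A", "harbour", "oil on canvas", "Bequest X"),
        ("Parakeet Study", "Artist C", "parakeet", "oil sketch on paper", "Bequest Y"),
    ],
)

# Each notion of "related" is simply a different column to match on.
RELATIONS = ("artist", "subject", "medium", "bequest")


def related(work_id, relation):
    """Return titles of other works sharing the given attribute."""
    # Column names cannot be bound as parameters, so check a whitelist.
    if relation not in RELATIONS:
        raise ValueError("unknown relation: " + relation)
    sql = (
        "SELECT b.title FROM works a JOIN works b "
        "ON a.{0} = b.{0} AND a.id <> b.id "
        "WHERE a.id = ? ORDER BY b.id"
    ).format(relation)
    return [row[0] for row in conn.execute(sql, (work_id,))]


by_artist = related(1, "artist")
by_subject = related(1, "subject")
by_medium = related(1, "medium")
```

The data never changes between the three calls; only the question does, which is the argument for adjusting the interface rather than the database.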
If the answers are in the data, leave the
data well alone and adjust the buttons instead. Develop an interface
appropriate to the user. Better still, if we are 'enabling people
to think for purposes they have defined themselves', let us, where
possible, do away with the button altogether. Wendy Hall has written
about ending the 'tyranny of the button' in multimedia systems generally
(Hall, 1994). The button in the 'standard' collections
management package may be the thing to make the museum professional
protest, 'I don't see why we should have to conform our registration
needs/practices to the computer's needs/limitations' (recent MCN-L
posting). In the public access system it takes on the role of, in
Peter Walsh's memorable phrase, 'the unassailable voice' (Walsh,
1997): this is the connection between this artefact and that which
will interest you; do not stray from the tightly policed corridors
of the virtual museum.
3 FILTERS NOT DATA STANDARDS
The authors of this paper have already described some of the mechanisms
involved in moving a database from a kiosk application to the Web (Kydd
and MacKenzie, 1997). The general principles learned from that
exercise are applicable to the porting of data between any electronic
delivery platforms in the museum: concentrate on the needs of the
users in the various environments and the data will look after itself.
So, for example, the local newspaper, which uses the system for
managing its photographic archive, is interested in an image's accession
number because it identifies the location of the original negative
or print but this is irrelevant to the Web tourist. The Webmaster
needs a database which is fast, cheap and compatible with the server
software but will do little or no editing work. The curator creating
and maintaining the database needs tools for data entry and editing,
for comparing entries, for searches and for generating reports. In
both these instances the needs of the user are met by interface design.
In the first scenario, one interface simply has (at least) one more
field than the other. In the second, creating two interfaces to the
one database may not be practicable: an affordable server-ready database
may not have the features taken for granted in most common PC database
packages; the curator may not wish all entries to appear on the Web;
the software which needs to run with the two applications will be
quite different and may require different operating environments (a
Unix server, Windows spreadsheets and word processing facilities,
for example); the computing skills of the two individuals are likely
to be quite different and, dare I suggest, the curator is less willing
to change working practices to compensate for the lack of features,
or the difficulty of using them, in a database system with which the
Webmaster is comfortable. The compromises made in choosing a system
to satisfy both needs are likely to satisfy neither, so why try? Use
two separate systems, two different databases. Provided one can quickly
transfer its data to the other, it really does not matter. Transferring
data between modern databases is a question of knowing what the tables
and fields concerned are and this is essentially a header definition.
No matter what the subject area, one field plus this header is probably
sufficient to allow two systems to share data in a meaningful way.
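A minimal sketch of such a header-plus-rows transfer follows. It uses two in-memory SQLite databases as stand-ins for the curator's and the Webmaster's systems and CSV text as the neutral carrier; the table name, field names and sample row are hypothetical.

```python
import csv
import io
import sqlite3


def export_table(conn, table):
    """Dump a table as CSV text whose first row is the header (field names)."""
    cur = conn.execute('SELECT * FROM "{}"'.format(table))
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # the header definition
    writer.writerows(cur)
    return out.getvalue()


def import_table(conn, table, text):
    """Create a table from the header row, then load the remaining rows.

    Column types are not carried across: every value arrives as text.
    """
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    cols = ", ".join('"{}"'.format(name) for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, cols))
    conn.executemany('INSERT INTO "{}" VALUES ({})'.format(table, marks), rows)


# Two independent databases; names and contents are invented.
curator_db = sqlite3.connect(":memory:")
curator_db.execute("CREATE TABLE images (accession_no TEXT, caption TEXT)")
curator_db.execute("INSERT INTO images VALUES ('1998.1', 'Harbour, looking east')")

web_db = sqlite3.connect(":memory:")
import_table(web_db, "images", export_table(curator_db, "images"))
copied = web_db.execute("SELECT accession_no, caption FROM images").fetchall()
```

Neither system needs to know anything about the other beyond the header row, which is the sense in which the transfer is a matter of definition rather than of a shared standard.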
4 WHAT IS EASY AND WHAT IS DIFFICULT
The issue of database selection for core data is essentially a non-issue.
Whatever the initial data source is, it is the content which is important
not the format. Modern databases all come with a range of import filters
or one can be populated from another by the construction of an SQL
query. Most development languages (VB, Visual C++, Smalltalk etc.)
can use ODBC drivers to talk to a wide range of databases. As has
already been argued, the challenge lies in making sense of the content.
That is an interface design question, not a database standardisation
one.
Similarly, images can easily and automatically be converted from
one format to another: perhaps they exist as TIFF images for internal
work in a networked archive and are distributed on CD-ROM to internal
and external users as 24-bit JPEG images. Moving them to the Web may
mean there is a mixture of 8- and 24-bit GIF and JPEG images. The standard
is again not the issue but bandwidth is. Users on dial-up lines may
be less than enthusiastic waiting for images of several hundred kilobytes
to load yet there will be some pictures where they consider the wait
worthwhile. To give them the choice the obvious solution is thumbnails.
There are certainly automatic thumbnail generation packages, but what
happens if we decide that all the thumbnails are to be the same size,
to fit in a pre-defined search results matrix, when the images come in a
wide range of shapes and sizes? And what about large images where shrinking
the whole image produces a thumbnail in which the key
features can no longer be identified? Manually creating thumbnails is not
an option for something like the TAMH project where around 4000 images
are involved. Creating a toolkit of small applications to ease tasks
like this is an ongoing process and a route worth following for anyone
trying to get more value from their data.
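The geometry behind the uniform-size problem can be sketched in a few lines. The function below is an illustration only, not part of any tool described in this paper; the 120-pixel cell and the image dimensions are assumed values.

```python
def fit_in_cell(width, height, cell_w, cell_h):
    """Scale an image to fit inside a fixed cell, preserving aspect ratio.

    Returns (new_w, new_h, offset_x, offset_y); the offsets centre the
    scaled image within the cell.
    """
    scale = min(cell_w / width, cell_h / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return new_w, new_h, (cell_w - new_w) // 2, (cell_h - new_h) // 2


# Assumed sizes: a conventional landscape photograph and a long panorama.
landscape = fit_in_cell(3000, 2000, 120, 120)
panorama = fit_in_cell(6000, 600, 120, 120)
```

The landscape image fills most of its cell, but the panorama shrinks to a strip only 12 pixels high, which is exactly the case where shrinking the whole image loses the key features.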
The really difficult thing, as in any computer
application, is giving the users what they want (or need; the two
are not usually the same). This is complicated here by the fact that multiple
use of data implies multiple audiences for that data, with very different
interests, backgrounds and working practices, as described above. This
has basic technical implications. The kiosk-based TAMH implementation
concentrated on letting users, ranging from young schoolchildren to
post-doctoral researchers, ask the questions they wanted to ask in
the way they wanted to ask them and so avoided fixed hypertext links
as much as possible (MacKenzie, 1995, 1996).
Current Web technology prevents a straight emulation of this approach,
and the whole Web is, after all, built on such links. An earlier paper
(Kydd and MacKenzie, 1997) described some of the
strategies we employed to implement this design philosophy on the
Web in an ad hoc way. Work since then has concentrated on developing
tools to allow the migration of data between print, collections management,
kiosk and Web applications to follow a more generic model. We would
not try to argue that the tools we have developed are the only mechanism
by which such a task can be carried out but we would argue that the
general approach is the only valid one, that it is not about standardisation
of data but rather about developing interface tools and filters which
present that data in ways appropriate to the different user groups.
5 EARLY EXAMPLES
We came to the multiple use of data on an ad hoc basis. Doing some
work on archiving material on Joseph Beuys in Scotland, we had word
processed catalogue entries on the Strategy: Get Arts exhibition
of 1970. However, in addition to printing these, we required a formatted
view of the pages in a touch-screen kiosk application. It would have
been possible to have prepared SGML versions of the files, as several
museums do with their catalogues (Light, 1995),
which would produce both on-screen and printed output. However, time
was short and the person responsible for the files was not skilled in
SGML, so we produced a very simple in-house formatting tool, called
BLURB, to serve the same function. It proved adequate for its intended
task and also allowed the catalogue entries to be included in a CD-ROM
on Joseph Beuys. Because it is simple to use, fully integrated with
in-house word-processing and has little adverse effect on the typing
speeds of secretarial staff, it also became the method used for formatting
newspaper articles in the TAMH project. BLURB will never go before
a standards committee and is quite limited in what it can do, but it was
selected for its ease of use rather than any standardisation considerations.
The relatively simple syntax means that it is simple to run it through
a filter to produce, for example, Web pages of the newspaper articles
in TAMH (www.dmcsoft.com/tamh/).
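BLURB's actual syntax is not described in this paper, so the following sketch uses an invented line-oriented markup purely to show how small such a filter can be. The '.h' and '.b' markers, the function name and the sample article are all hypothetical.

```python
import html


def blurb_like_to_html(text):
    """Convert a made-up line-oriented markup to HTML.

    Invented conventions (not BLURB's real syntax): '.h ' starts a
    headline, '.b ' a byline; other lines are body text, and a blank
    line ends a paragraph.
    """
    out, para = [], []

    def flush():
        # Emit the paragraph collected so far, if any.
        if para:
            out.append("<p>" + html.escape(" ".join(para)) + "</p>")
            para.clear()

    for line in text.splitlines():
        line = line.strip()
        if line.startswith(".h "):
            flush()
            out.append("<h2>" + html.escape(line[3:]) + "</h2>")
        elif line.startswith(".b "):
            flush()
            out.append('<p class="byline">' + html.escape(line[3:]) + "</p>")
        elif not line:
            flush()
        else:
            para.append(line)
    flush()
    return "\n".join(out)


# Invented sample article.
article = """.h Launch of a New Whaler
.b From our own correspondent
The vessel left the slip at noon
& was cheered by a large crowd.

She sails in the spring."""
page = blurb_like_to_html(article)
```

Because the source markup is so plain, a second filter for print or kiosk output would be a similarly short function over the same files.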
Similarly our Antique Golf site (www.dmcsoft.com/antiquegolf/) is simply a Web-based extension
of an existing database. The clubs are listed in an Access database
for off-line searching and for inventory purposes. Access was chosen
because the person responsible for maintaining the inventory was familiar
with the package and, again, moving this database to the MySQL database
version which drives the website is simple, just running an SQL query.
Similarly DDE/OLE allows it to be the engine for a print-based catalogue.
The point in choosing these tools is not that the data is standardised
but that it is adaptable and portable. The Access golf club database
meets the needs of the person who maintains the inventory and is
sitting with the real clubs; the Web-based version meets the needs
of the browser who wants to see what the clubs look like.
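The 'just running an SQL query' step might look like the sketch below. SQLite stands in for both the Access inventory and the server database, and the column names, the `on_web` flag and the rows are assumptions made for illustration.

```python
import sqlite3

# The main database stands in for the inventory file; the attached
# database stands in for the server-side copy. Both are SQLite here.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS web")

# Hypothetical inventory table; column names and rows are invented.
conn.execute(
    "CREATE TABLE clubs (id INTEGER PRIMARY KEY, maker TEXT, "
    "description TEXT, shelf TEXT, on_web INTEGER)"
)
conn.executemany(
    "INSERT INTO clubs (maker, description, shelf, on_web) VALUES (?, ?, ?, ?)",
    [
        ("Maker A", "Long-nose driver", "A3", 1),
        ("Maker B", "Rut iron", "B1", 0),
        ("Maker C", "Wooden putter", "A7", 1),
    ],
)

# One query populates the Web copy: only publishable rows, and only the
# fields a browser cares about (the shelf location stays behind).
conn.execute(
    "CREATE TABLE web.clubs AS "
    "SELECT id, maker, description FROM main.clubs WHERE on_web = 1"
)

web_rows = conn.execute(
    "SELECT maker, description FROM web.clubs ORDER BY id"
).fetchall()
web_columns = [row[1] for row in conn.execute("PRAGMA web.table_info(clubs)")]
```

The same query doubles as a filter: the inventory keeps fields and rows that the Web version never sees.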
6 MUSDEV: A GENERIC APPROACH TO THE PROBLEM
The early version of the TAMH data entry module was really just
the display version of the software with the database write enabled.
As the entries grew it became apparent that this was not adequate
for cross-referencing articles, for checking what was already there
or for maintaining the fields which were not seen by most users: accession
numbers, the relationship between main table entries and tables of
short biographical entries, visit sites, keywords, thesaurus entries
and the like. We also suffered from good ideas: new fields and tables
to add, which meant individual changes to the database query and
write parts of the code on a woefully uncontrolled basis. Another
realisation was that our first thought, to drive the system from a
database, was the right one, but whereas we initially held around eighty-five
per cent of the data in databases we should have moved, and ultimately did
move, everything to one database. This meant moving map data, graphing
and icon-display functions into the database and this forced the decision
to go for a more generic solution to the problem.
The result is what we call MusDev, an interface to a database which
defines the relationship between the data elements but does not attempt
to define what happens with that relationship. Nor does it care what
the database is or what it contains. Our TAMH data source was an Access
database, so we stuck with that, but it would work equally well with
any other database, local or remote, for which ODBC drivers exist.
Figure 1 below shows a Main Table (in effect, a short article)
entry on Admiral Greig. The buttons along the bottom of the screen
indicate the other tables in the MusDev source for this project. If
the images button is pressed, the images currently associated with
this entry are shown (Figure 2). If I wish to associate new
images or longer articles with this short entry, going to the Add
Links screen (Figure 3) offers the facility of browsing
and selecting other table entries and dragging and dropping them onto
the link diagram. The entry can be completed on the screen in Figure 1 by
adding descriptive abstracts and assigning it to time periods, sources
and subject area categories, either those already in the database
or ones added interactively from this screen. Keywords are used as descriptors;
those appearing in lowercase are generated automatically by the
system based on its thesaurus, which may also be edited on-screen.
In this example the alternative spelling Cronstadt and the
city for which it is the port, St Petersburg, are generated for
Kronstadt.
Figures 1, 2 and 3: Main tables, images, and adding links
This has defined the entry and its relationship to other types of
entry (images, articles, places to visit, artefacts etc) and gathered
some clues as to search terms which are relevant, but it makes no reference
to how a user of the kiosk-based system will see the relationship
between the entry and, say, the image. This is essential if we are
to leave the option of multiple use of the data open.
MusDev as Toolkit
MusDev is also an ever expanding toolkit for the administrative tasks
identified in Section 4 above. The tools we have incorporated in it
so far include the following.
Automating thumbnail generation
The Thumbnail Generator is illustrated in Figure 9. With
Autocapture it can generate thumbnails for whole sets
of images. Where an image has an awkward shape or content, as in the portrait
of the lady in Figure 9 where the important detail is the fact that
she has a characteristic head-dress and is reading a bible, a draggable
box can specify the area to be thumbnailed.
Figures 9 and 10: Thumbnail generator and image search matrix
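How a dragged box might be turned into a crop for a fixed-shape thumbnail is sketched below. This is one plausible approach under stated assumptions, not a description of the Thumbnail Generator's internals; the function name and the pixel values are invented.

```python
def crop_for_thumbnail(box, image_size, thumb_size):
    """Grow a selected box to the thumbnail's aspect ratio, inside the image.

    box is (left, top, right, bottom); sizes are (width, height) in pixels.
    If the grown box would be larger than the image it is clipped to the
    image, in which case the aspect ratio is only approximated.
    """
    left, top, right, bottom = box
    img_w, img_h = image_size
    target = thumb_size[0] / thumb_size[1]
    w, h = right - left, bottom - top
    if w / h < target:
        w = h * target  # selection too narrow: widen it
    else:
        h = w / target  # selection too wide: make it taller
    w, h = min(w, img_w), min(h, img_h)
    # Keep the selection's centre, then slide the box back inside the image.
    cx, cy = (left + right) / 2, (top + bottom) / 2
    new_left = min(max(cx - w / 2, 0), img_w - w)
    new_top = min(max(cy - h / 2, 0), img_h - h)
    return (round(new_left), round(new_top),
            round(new_left + w), round(new_top + h))


# A tall selection in the middle of an image, and one hard against a corner.
centre_crop = crop_for_thumbnail((400, 100, 600, 500), (1000, 800), (100, 100))
corner_crop = crop_for_thumbnail((0, 0, 100, 300), (1000, 800), (100, 100))
```

The operator only indicates what matters; fitting that region to the search matrix cell is left to the tool.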
The advantage of generating meaningful thumbnails
is apparent from Figure 10 which shows one of the image search
matrices. Like many other developers we have struggled with the inherent
limitations of using keywords to describe images (MacKenzie,
1995) and have followed similarity algorithms such as QBIC (Holt,
Hartwick and Vetter, 1995) and ARTISAN (Eakins,
Shields and Boardman, 1996) with interest. We have reached the conclusion
that, for now, the most efficient way of allowing people to find images
of interest to them is just to let them look. Even with very fast
flicking through pages such as the one illustrated here, users can
pick out the images they want. This is the justification for "content-rich"
thumbnails.
The Keyword Hierarchy Tool
The problems of using keyword or even free-text searching to identify
topics of interest are well documented. Michelle Kaufman (Kaufman,
1996) gives examples of a search for Passover which results
in matches on items about dietary restrictions, identity or even just
a reference to a point in time. Similarly the search would miss
references to Passover in the Hebrew form, Pesach. Examples
in this paper have already referred to the changing port names in
the TAMH project where the form used will often depend on either the
nationality of the searcher or the period of history in which he or
she is interested.
Figure 11: Keyword hierarchy
We decided on a very simple approach, a screen which allows entry
of one keyword followed by a list of acceptable synonyms (Figure
11). The types of item for which we needed to specify synonyms
were very wide: geographical names, Kurzeme for Courland; different
English idioms, railroad for railway; different terminology in cargoes,
flax, hemp and codilla used interchangeably; and abbreviations, RNLI
for Royal National Lifeboat Institution. A simple approach therefore seemed
to be the best, and no catch-all thesaurus would serve our purpose.
The point is, however, that using this tool in a different subject
area where it was appropriate we could import any commonly used thesaurus
or controlled vocabulary such as the one Michelle Kaufman describes
at the Shoah Foundation. Again this is an example of concentrating
on the interface not the standard. Any controlled vocabulary can ultimately
be decomposed to a one-to-many relationship. Different projects will
call for different thesauri so rather than deliberate as to which
is generally the best one, allow the possibility of using any. If
we can agree a simple way of relating one controlled vocabulary to
another where there is some degree of overlap, so much the better.
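The one-to-many decomposition can be sketched as a plain mapping from one keyword to its list of synonyms. The pairs below are those named in the text; which form is treated as the preferred keyword is an assumption, as are the function names.

```python
# One keyword, many accepted synonyms. The choice of preferred form
# for each pair is an assumption made for this sketch.
SYNONYMS = {
    "Courland": ["Kurzeme"],
    "railway": ["railroad"],
    "Royal National Lifeboat Institution": ["RNLI"],
    "Kronstadt": ["Cronstadt"],
}


def build_index(synonyms):
    """Map every spelling, lower-cased, to its preferred keyword."""
    index = {}
    for preferred, alternatives in synonyms.items():
        for term in [preferred] + list(alternatives):
            index[term.lower()] = preferred
    return index


def expand(query, synonyms, index):
    """Return every term a search for `query` should match."""
    preferred = index.get(query.lower())
    if preferred is None:
        return [query]  # unknown term: search for it as typed
    return [preferred] + list(synonyms[preferred])


INDEX = build_index(SYNONYMS)
lifeboat_terms = expand("rnli", SYNONYMS, INDEX)
```

Importing an external thesaurus then amounts to loading its term-to-synonym pairs into the same structure, whatever its published format.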
7 COMPONENT APPROACH
The kiosk-based version of TAMH uses the database created by MusDev
with no changes whatsoever. It merely adds the interface layers and
search tools. Figure 4 shows the act of searching for St
Petersburg (one of the system-generated keywords) and how the
entry appears to the kiosk user (Figure 5). Figure 6
shows what the user sees by touching the small image, a larger view
with magnification and other options. Returning to the main entry
display and touching (or dragging the mouse) over Battle of Hogland
brings up a Link button (Figure 7) to start searching
for all references to that phrase. Any word, or series of words, can
be selected in this way: everything is 'linkable', not just the terms
the system authors specify.
Figures 4, 5 and 6: Search results, entry record, and image
viewer
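A link component of this kind reduces to a phrase search over the entry text. The sketch below is an illustration rather than the kiosk system's code; the sample entries are invented, and matching on whole words while ignoring case is an assumed design choice.

```python
import re

# Invented sample entries, keyed by a hypothetical entry id.
ENTRIES = {
    1: "Admiral Greig commanded the fleet at the Battle of Hogland.",
    2: "Kronstadt served as the port for St Petersburg.",
    3: "After the battle of  Hogland the fleet returned to Kronstadt.",
}


def link_search(phrase, entries):
    """Return ids of entries containing the selected phrase.

    Matching ignores case and runs of whitespace, and accepts the
    phrase only at word boundaries.
    """
    words = [re.escape(word) for word in phrase.split()]
    if not words:
        return []
    pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    return [key for key, text in sorted(entries.items()) if pattern.search(text)]


hogland_hits = link_search("Battle of Hogland", ENTRIES)
```

Because the phrase comes from the user's selection rather than from a list of authored links, no entry has to anticipate the question.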
All of these display, link and search options are separate components
which use the database content as their input. New components can
be added, and existing ones modified, without a requirement to change
the underlying database. Components themselves may be edited or modified
from an administrator screen (Figure 8) so that not all displayable
fields from a particular component are on-screen, simplifying the
mariner database search and display for a school audience, for example.
MusDev is, in effect, our collections management package: the kiosk-based
version of TAMH is another department which seeks to use some of the
core data for completely different purposes and, therefore, wishes
to have something other than the collections management view of the
data.
Figures 7 and 8: Text link and administrator screen
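The idea of one display component configured differently per audience can be sketched as follows. The record, the field names and the two audience profiles are hypothetical and do not reproduce the TAMH mariner database.

```python
# Hypothetical mariner record; every field name and value is invented.
RECORD = {
    "name": "J. Smith",
    "rank": "Master",
    "vessel": "Example",
    "accession_no": "1998.1",
    "source_notes": "Crew list, folio 12",
}

# The administrator chooses which fields each audience's display shows.
PROFILES = {
    "curator": ["name", "rank", "vessel", "accession_no", "source_notes"],
    "school": ["name", "rank", "vessel"],
}


def display(record, audience, profiles=PROFILES):
    """Return (field, value) pairs for the fields visible to this audience."""
    return [(field, record[field])
            for field in profiles[audience] if field in record]


school_view = display(RECORD, "school")
curator_view = display(RECORD, "curator")
```

Simplifying the school view means editing a profile, not the record or the database behind it.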
The component-based approach extends the life of the data enormously.
We can add new components to reflect changes in technology (a user
has a machine fast enough to display the archived TIFF images rather
than the usual JPEG representations), changes in emphasis (exhibits
associated with Admiral Duncan, last year being the bicentenary of
his most famous victory at Camperdown) or changes in use (a school
asks for a facility for students to cut and paste images, multimedia
elements and text into multimedia essays). What we really wanted, though,
was a button marked Web which would take the database and publish
it to the Web not exactly the way it appears in the kiosk but in a
way which takes account of Web browsers' needs and the technology
limitations and strengths of that particular environment. To say we
are not quite there yet is something of an understatement but the
component-based approach has certainly been a step in the right direction.
8 COMPONENTS AND THE WEB
Many current website production and management tools allow site
designers to produce component-based sites. DMC uses its own tool,
WebDev (www.dmcsoft.com/webdev/)
to achieve this, allowing the integration of dynamic content from
databases into web pages. WebDev uses a component-based architecture
for sites, allowing the separation of content-producing components
from interface appearance components such as headers, footers, menus
and graphics. This approach works well with the MusDev component model,
allowing site templates for the overall look and feel of the website
to be distinguished from the components which MusDev has to generate.
Websites and their constituent web pages can be broken down into
a series of components. A web page which displayed an article record
from the TAMH database might consist of standard header and footer
components for the page, and a database query component which ran
a database search for the record and formatted the data for output.
The header and footer components would be reusable on other pages
in the website, removing the need to change all the pages on the site
when these standard elements had to be altered. An image collection
component might point to the location of an image archive on the web
server, and could use an extra parameter to refer to an individual
image within that collection for display on a web page. If the image
collection were moved to a different location, the only reference
which would need to be changed would be that in the image collection
component itself. Other components could encapsulate web page elements
such as search input forms and website menus. Breaking the web page,
and consequently the website, down into a series of such components
makes it more manageable.
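The page breakdown described above can be sketched with a handful of functions. This is not WebDev's API, which is not documented here; the class, function and path names are invented for illustration.

```python
import html
from string import Template

# Reusable appearance components, shared by every page on the site.
HEADER = Template("<html><head><title>$title</title></head><body>")
FOOTER = "</body></html>"


class ImageCollection:
    """The one place where the image archive's location is recorded."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def url(self, image_name):
        # The extra parameter selects one image within the collection.
        return self.base_url + "/" + image_name


def article_component(record, images):
    """Format one article record for output."""
    title = html.escape(record["title"])
    src = html.escape(images.url(record["image"]))
    text = html.escape(record["text"])
    return '<h1>{0}</h1><img src="{1}" alt="{0}"><p>{2}</p>'.format(title, src, text)


def render_page(record, images):
    """Assemble a page from header, content and footer components."""
    header = HEADER.substitute(title=html.escape(record["title"]))
    return header + article_component(record, images) + FOOTER


# Hypothetical record and archive locations.
record = {"title": "Admiral Greig", "image": "greig.jpg", "text": "Short article."}
page = render_page(record, ImageCollection("/archive/images/"))
moved = render_page(record, ImageCollection("/media/tamh"))
```

Moving the archive changes only the argument given to `ImageCollection`; the header, footer and article components are untouched.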
The examples given in the previous section have web equivalents.
The keyword search form in Figure 4 would translate into an
HTML form. Clicking on the button would link to a page with the keyword
query component, producing a record display equivalent to Figure
5, with the image being an inline image thumbnail from an image
collection component. This image would be linked to a full size image,
so that the user could inspect it as in Figure 6. Because of
the limitations of HTML, the equivalent of the text selection and
link searching in Figure 7 would simply be a form with text
input box for searching on a user-specified term.
9 WEBSITE PUBLISHING
WebDev allows a website to be structured so that components which
produce content (from image archives or database queries) can be separated
from other components which control site structure or page layout.
A site structure and layout components for page style could be used
in conjunction with templates of content production components to
build a generic website where all that was missing to complete the
system was the equivalent of the kiosk system's display, link and
search components.
Publishing to the web involves taking the interface layers and search
tools developed for the standalone kiosk system and creating web-based
versions of them. MusDev has provided a number of tools for managing
both database and image information, and separating this data from
the actual interface used in a user-based kiosk system or an administration-oriented
collections management system. The next stage in this evolution is
to take the user system and map this onto a web-based user interface.
MusDev makes the transfer of the underlying data a straightforward
task. The database can be migrated from the standalone system to a
web-accessible database server using a combination of MusDev tools
and standard software such as Microsoft Access. For the TAMH project,
DMC currently uses a simple set of Access queries and an ODBC link
to update the server database from the standalone version, although
this task may eventually be integrated into the MusDev toolset. The
image archive can be copied to a webserver, or shared between standalone
and web versions in a networked environment, and necessary conversions
of image formats to web-friendly versions can be achieved. Image references
in the database can be converted to point to the new locations on
the server as part of the database import routine. Thumbnail generation
can produce images which allow web users to preview images before
selecting those of interest to them to download at a better resolution
and quality, without wasting download time on images in which they have
no interest.
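The conversion of image references during import might be as simple as the sketch below. The local path, the server directory and the assumption that converted files keep their original names (lower-cased, with a new extension) are all hypothetical.

```python
from pathlib import PureWindowsPath


def web_reference(local_path, server_base, extension=".jpg"):
    """Turn a standalone-system image path into a server reference.

    Keeps only the file name, lower-cases it and swaps the extension
    for the Web format. Assumes the converted file exists on the server
    under that name.
    """
    stem = PureWindowsPath(local_path).stem
    return server_base.rstrip("/") + "/" + stem.lower() + extension


# Hypothetical archive path and server directory.
new_ref = web_reference(r"C:\TAMH\IMAGES\GREIG01.TIF", "/tamh/images/")
```

Running this over the image column as rows are imported keeps the standalone database unchanged while the server copy points at its own files.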
As mentioned previously, the kiosk-based version of TAMH uses display,
link and search components to build its interface. The aim is to have
MusDev map these components onto web-based equivalents. Display components
can be treated as generic HTML formatting of database records and
images, link components implement searches on the database from user-selected
input criteria, and search components have HTML input forms and SQL
queries to search the database. Using this component-based approach,
website templates equivalent to the kiosk system components can be
constructed and output by MusDev. Parameters for the components in
MusDev are combined with the templates to output the website equivalents
of the kiosk system.
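Combining component parameters with templates can be sketched as below for a search component: one parameter set yields both the HTML input form and the SQL query. The parameter names, the template and the sample table are assumptions; this is not MusDev's export format.

```python
import html
import sqlite3
from string import Template

FORM_TEMPLATE = Template(
    '<form action="$action" method="get">'
    '<label>$label <input type="text" name="q"></label>'
    '<input type="submit" value="Search"></form>'
)


def search_form(params):
    """Fill the form template from a component's parameters."""
    return FORM_TEMPLATE.substitute(
        action=html.escape(params["action"]),
        label=html.escape(params["label"]),
    )


def run_search(conn, params, term):
    """Run the component's query with the user's term bound as a parameter.

    Table and field names come from trusted component settings, never
    from user input.
    """
    sql = 'SELECT title FROM "{table}" WHERE "{field}" LIKE ? ORDER BY title'
    sql = sql.format(table=params["table"], field=params["field"])
    return [row[0] for row in conn.execute(sql, ("%" + term + "%",))]


# Hypothetical component parameters and sample data.
params = {"action": "/search", "label": "Keyword",
          "table": "articles", "field": "keywords"}
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, keywords TEXT)")
conn.executemany("INSERT INTO articles VALUES (?, ?)", [
    ("Admiral Greig", "Kronstadt Cronstadt St Petersburg"),
    ("Whaling", "Arctic whale oil"),
])

form = search_form(params)
hits = run_search(conn, params, "petersburg")
```

Changing the look of every search form on the site means editing one template; changing what a given form searches means editing one parameter set.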
It should be noted that because of some of the limitations in current
web-based interfaces, these components are described as equivalent,
rather than the same. For instance, the example in the previous section
of selecting text in a record and having it automatically appear as
search text is not possible in standard HTML text, so the equivalent
web component allows manual entry of the selected phrase into a text
box in an HTML form. Similar compromises can be made with other types
of components. The advantage of the component architecture is that
if a new solution is produced for a particular component, then the
application can have its display templates for that component type
changed to produce the new model for all such components.
Websites in general differ from kiosk systems in that they need
to provide the user with more context information: a user at a kiosk
system usually knows where they are (i.e. they have walked into the
museum and have a physical reference for their location). Users on
the Internet do not have the same frame of reference, as they may
have linked from anywhere on the web to reach the website location.
Therefore they need information to identify which site they
are viewing, and to help them navigate within that site. A site designer
could choose or produce templates for the overall look of the website,
provide the extra context information that the website needs (information
about the website, news and external links sections, etc) and generate
the content-specific components based on the kiosk system design
in MusDev.
The eventual goal is to have a straightforward route to publishing
the same data and similar interface and search tools on both the kiosk
and web systems. MusDev currently produces the component-based interface
for the kiosk system. WebDev currently provides website design and
management functionality. Up until now the move from kiosk to web
has involved a manual process of mapping one set of components onto
the other. The task now is to produce a flexible template-based system
to allow MusDev to export its components to a WebDev project.
10 FUTURE DIRECTIONS
The intention of MusDev is to produce a generic set of tools for
integrating museum data with a number of administration- and user-centred
systems. Currently it has been used with our own TAMH project, and
we are actively looking for partners within the museum community to
test the portability of the system to other subject areas. One of
the ways in which MusDev will be enhanced in its web integration capabilities
is by building a library of museum-like components for users to choose
from, so that we get ever closer to the "Press button for Web version"
situation.
MusDev is a set of tools which allow a single set of data sources
to be used for multiple purposes within an organisation. The emphasis
on the portability of the data and the independence of data from the
interfaces and search mechanisms which use it means that it can be
used with one interface for data management and another interface
for a kiosk system. The final step of integrating an interface to
the web into the system is well under way.
REFERENCES
Eakins, J.P., Shields, K. and Boardman, J. (1996)
ARTISAN - a shape retrieval system based on boundary family indexing.
In Sethi, I.K. and Jain, R.C. (eds) Storage and Retrieval for Still
Image and Video Databases IV, 210-215
Hall, W. (1994) Ending the tyranny of the button.
IEEE Multimedia, 1(1), 60-68
Holt, B., Hartwick, L. and Vetter, S. (1995) Query
by Image Content: the QBIC Project's Application in the University
of California at Davis's Art and Art History Departments. Visual Resources
Association Bulletin, 22(2), 61-65
Kaufman, M. (1996) Memory and Rediscovery: Using
a Controlled Vocabulary to Provide Access to Holocaust Survivor Visual
Histories. Spectra, 24(2), 26-29
Kydd, S. and MacKenzie, D. (1997) Going On-line:
Moving Multimedia Exhibits onto the Web. In Bearman, D. and Trant,
J. (eds) Museums and the Web 97: Selected Papers, AMI, Pittsburgh,
299-313
Light, R. (1995) Getting a handle on exhibition
catalogues, the Project CHIO DTD. In Bearman, D. (ed) Multimedia
Computing and Museums, AMI, Pittsburgh, 368-381
MacKenzie, D. (1996) Beyond Hypertext: Adaptive
Interfaces for Virtual Museums. Proceedings of EVA'96, Vasari Enterprises,
Aldershot
MacKenzie, D. (1995) Using Archives for Education.
Journal of Educational Multimedia and Hypermedia, 5(2), 113-128
Orna, E. (1994) In the know. Museums Journal, 94(11)
Walsh, P. (1997) The Web and the Unassailable Voice.
Archives and Museum Informatics, 11(2)
Last modified: March 23, 1998. This file can be found below http://www.archimuse.com/mw98/