April 13-17, 2010
Denver, Colorado, USA

Architecting CollectionSpace: A Web-Based Collections Management and Information System for 21st Century Museum Professionals

Carl Goodman, Museum of the Moving Image, USA; Patrick Schmitz, University of California, Berkeley, USA; Dan Sheppard, University of Cambridge, United Kingdom; Colin Clark, University of Toronto, Canada

Abstract

CollectionSpace is an ambitious multi-institutional project dedicated to the development of an open-source, Web-based software application for collections management and information access. The use of CollectionSpace software will help bring core collections management procedures and related functionality into the Web 2+ era and has the potential to redefine the ways in which collections information is captured, managed, preserved, and leveraged. In service of this goal, CollectionSpace has introduced innovative approaches to application customization and extensibility. It also conforms to Web-based technical and design standards that Museum professionals increasingly expect.

Keywords: Web Applications, Collections Management, Web 2.0, Open Source, Services Oriented Architecture

Introduction

Advances in technology and public expectations have led to a sea change in the way that collection-holding institutions interact with their constituents. Today, Internet-based access to collections in digitized form drives the need for cataloging and collections information management. With the rise of Web 2.0, museum audiences expect to be able to "create, participate in, share, refine, save, and re-use" information (IMLS, 2009). Institutions need to re-purpose and re-publish collections information and make it available to a wide variety of audiences in an array of new formats.

One only has to peruse the proceedings and publications of the Museums and the Web conference to see the promise and vitality of these kinds of projects, including social media and other forms of participatory experiences, smart-phone tours and podcasts, on-line exhibitions and e-learning experiences, gallery kiosks and games. At the same time, we haven’t abandoned, nor should we, the tried, tested, and true tasks normally associated with collections management: knowing what you have, where it came from, where it is, and how it’s being used. With this broadening of venues and contexts for collections information comes a similar broadening of the range, types, and sources of collections information itself.

Museum professionals are often limited by the inability of their collections management systems, whether homegrown or purchased, to function as “collection information systems” that are conducive to re-purposing and re-publishing of information. The result is an inordinate amount of time spent configuring “one-shot” exports of data from collections management systems into audience-facing digital applications. Once in the audience-facing application, the data’s link to its corresponding information in the collections management system is often severed. No round trip ticket is available.

Another outgrowth of the maturation of the Internet is that information- and Web-literate Museum professionals expect software tools they use internally, such as collections management systems, to be informed by their everyday, anytime, everywhere use of the Internet and the Web.

Many collections management solutions were initially designed for a specific museum or collection type (e.g. art), and to this day carry with them an embedded data model and other assumptions derived from that particular type that are difficult to alter. The result is that individual museums or communities of practice (defined as subgroups of museums and collection-holding organizations that hold certain elements in common, such as collection type or subject, museum size/scale, geographical area or region, or common funding source) have developed their own, localized technology solutions which rarely benefit the wider museum field.

The CollectionSpace project

Seeking to address these issues, in 2008 the Andrew W. Mellon Foundation funded the CollectionSpace project, with four organizations working together as partners: Museum of the Moving Image, University of Toronto, University of Cambridge, and University of California, Berkeley. Following a planning and design period in 2009, the partners commenced with their work, developing the new CollectionSpace open source collections information management software. In late 2009, two other institutions became contributors to the project as well: the Walker Art Center and the Statens Museum for Kunst (the National Gallery of Denmark).

CollectionSpace has been designed to answer the need for more effective collections information and management software. The first phase of development, to be completed in June 2010, is focusing on the development of a flexible, customizable software application that captures eight core museum procedures, as defined by the SPECTRUM procedures standard developed by the Collections Trust (formerly the Museum Documentation Association), that are critical to the management of any type of museum collection. CollectionSpace is being distributed under the open source ECL (Educational Community License) 2.0 license. A 1.0 version is due in June 2010, with incremental releases being made available on a monthly basis. The core project will continue with the development of a 2.0 version of the software from July 1, 2010 to June 30, 2011.

On its surface, CollectionSpace is a collections management system redesigned to take advantage of emerging best practices in user experience design. Underneath that surface lies a technical architecture informed and inspired at the ground level by innovations in services-oriented and enterprise information architecture, and natively conversant in the languages of plugins, APIs, Web services, and data feeds. The result is software that has complex underpinnings, in the form of a service oriented architecture, but can be installed, maintained, and customized using the Web development technologies which museum Web and technical staff both use and like using.

CollectionSpace has embraced the evolutionary move to Web applications as a strategic direction, and leverages the maturing platforms to provide a flexible, powerful application. As with other Web-based applications, no installation is needed on the user's machine, and users can access the system from any Web browser, with the same flexibility (and security) people have come to expect for applications ranging from e-mail to on-line banking.

CollectionSpace is architected as three distinct layers:

  1. the back-end services layer that manages a repository of documents, records, media, and associated metadata,
  2. the application layer that makes it easy to customize CollectionSpace for each museum, and
  3. the user interface layer that provides the end-user experience and interaction with the collections information.

This approach has a number of key advantages, and is a part of what makes CollectionSpace a completely new approach to collections management. Its architecture provides museum technology professionals with specific entry points from which they can customize, extend, draw data from, or connect to the application using standards-based approaches.

The application and user interface layers of the software are configured to reflect these extensions and customizations so that the overarching user experience is that of a completely customized application for an individual museum. However, because CollectionSpace is designed specifically for this kind of flexibility, the extensions and customizations do not require specialized programming skills or tools. The schemas and configuration files are declared in XML that can be edited with any number of standard tools, and UI customization can be done by any Web developer familiar with HTML, CSS, and JavaScript technologies. A museum’s Web master or a museum-savvy Web developer will be comfortable performing such customizations, and can hit the ground running by adopting a similar institution’s implementation of the software rather than having to start from scratch. A series of ‘themes’ for CollectionSpace, consisting of domain-specific extensions and templates, are intended to be created later in 2010, though the capabilities for the creation of these themes exist now.

CollectionSpace and Services Oriented Architecture

An important advantage of this architecture is the ease with which CollectionSpace can integrate with other information systems that support collections management, dissemination and publishing of collections information, and research on the collections. The approach borrows best practices from Enterprise Information Architecture. Unlike traditional information architecture, Enterprise Information Architecture takes as a starting point that there are many valid yet diverging methods of information classification and use, and that it is practically impossible to enforce the same information architectures across multiple divisions or departments within an enterprise. The result is a software architecture that can be customized, with relatively minimal work, for specific communities of practice within the museum domain, without each of these customizations having to be built from scratch.

The CollectionSpace schema (the fields used to represent each type of object or procedure) is extensible and customizable for each institution. This is essential to making CollectionSpace a reasonable fit for a broad range of museums and collections. For example, at development partner U.C. Berkeley, there are dozens of museums and research collections that range from Anthropology to Zoology, with collections of media, of cultural heritage objects, and specimens (both living and preserved). To efficiently support these many collections, a shared set of services provides the common functionality, but each service is extended and the application customized to fit the information management and workflow needs of the respective collection.

Configuration files describe the schema extensions, as well as local preferences for labeling and presenting information and user interface customization, to make the application look and feel specific to one museum: it reflects a particular institution.

Because CollectionSpace is a growing community as well as an application, domain-specific extension schemas and customization templates will emerge as institutions with the same focus share ideas and contribute their work back to the community. CollectionSpace supports this by dividing the schemas for each service into three distinct sections, as indicated in Figure 1, below.

Figure 1. CollectionSpace schema extension model

The Common Entity Schema includes a set of fields that are widely used across a range of museums, and are defined in the above-mentioned CollectionSpace services. Domain- or Community-specific extensions are the fields and data widely used in a domain like Anthropology, Art History, or Botany. These can be successfully shared among institutions and will eventually allow for domain-specific service extensions to support functionality across a set of similar museums.

Local, idiosyncratic extensions, which often distinguish an institution from others with similar interests and collections, are defined as deployment-specific extensions. Variances among different deployments can be relatively minor, but are often what determines whether a specific collections management application works for an institution or not.
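The three-section schema model can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `SchemaLayer` class, the `compose_schema` function, the field names, and the rule that a field name may appear in only one section are hypothetical and are not taken from the CollectionSpace code base.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaLayer:
    """One section of a service schema: a label plus its field definitions."""
    name: str
    fields: dict[str, str] = field(default_factory=dict)  # field name -> data type


def compose_schema(common: SchemaLayer, domain: SchemaLayer,
                   local: SchemaLayer) -> dict[str, dict[str, str]]:
    """Combine the three sections, recording which section defines each field.

    Assumption: a field name may be defined in only one section. The paper
    does not say how clashes are resolved, so this sketch rejects them.
    """
    composed: dict[str, dict[str, str]] = {}
    for layer in (common, domain, local):
        for field_name, data_type in layer.fields.items():
            if field_name in composed:
                owner = composed[field_name]["section"]
                raise ValueError(f"{field_name!r} is already defined in {owner}")
            composed[field_name] = {"type": data_type, "section": layer.name}
    return composed


# Hypothetical field names, for illustration only.
common = SchemaLayer("common", {"objectNumber": "string", "title": "string"})
domain = SchemaLayer("domain:botany", {"taxon": "string"})
local = SchemaLayer("local:example-museum", {"greenhouseBay": "string"})

schema = compose_schema(common, domain, local)
```

Keeping the section label next to each field is one way a deployment could later share only its domain-level fields with similar institutions while keeping local fields private.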

CollectionSpace’s services expose collections information via standard REST APIs, in much the same manner as popular services like Flickr, Google Maps, etc. The same back-end that supports our application can be accessed directly to define mashups or more formal integrations with portals, archival and preservation systems, etc. Dissemination and publishing tools have easy access to collections information, and research applications have efficient access to collections information data without compromising database security or access policies.

For example, the three URLs below indicate REST resources that respectively provide access to a list of summary information about loans, the details for a given collection object, and a list of objects associated with a given loan:

your.museum.org/cspace-services/loans
your.museum.org/cspace-services/collectionobjects/{id}
your.museum.org/cspace-services/loans/{id}/collectionobjects

The information is returned as a standard REST payload (i.e., as XML content). Common libraries for accessing REST APIs make it easy to integrate CollectionSpace services with common Web development tools and languages. CollectionSpace’s services repository is divided into two main categories: entity services (collection object, person, numbering, voca) and utility services.

More than forty services are described, along with instructions on writing a REST API to the services, in the site’s services description repository at http://wiki.collectionspace.org/display/collectionspace/Service+Description+Repository.
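As a rough illustration of how a client might address the three example resources and read an XML payload, here is a minimal Python sketch using only the standard library. The `http://` scheme, the helper function names, and the sample payload with its element names are assumptions; the real element names depend on the deployed schema and on the service descriptions in the repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder host from the text; the scheme is an assumption.
BASE_URL = "http://your.museum.org/cspace-services"


def loans_url() -> str:
    """Summary list of loans."""
    return f"{BASE_URL}/loans"


def collection_object_url(object_id: str) -> str:
    """Details for one collection object."""
    return f"{BASE_URL}/collectionobjects/{quote(object_id, safe='')}"


def loan_objects_url(loan_id: str) -> str:
    """Objects associated with one loan."""
    return f"{BASE_URL}/loans/{quote(loan_id, safe='')}/collectionobjects"


def fetch_xml(url: str) -> ET.Element:
    """Fetch a resource and parse its XML payload (needs a reachable server)."""
    with urlopen(url) as response:
        return ET.fromstring(response.read())


def summarize(root: ET.Element) -> dict[str, str]:
    """Flatten the direct children of a payload into a tag -> text mapping."""
    return {child.tag: (child.text or "").strip() for child in root}


# Hypothetical payload, used here so the sketch runs without a server.
SAMPLE = """<collectionobject>
  <objectNumber>1984.1.2</objectNumber>
  <title>Film projector</title>
</collectionobject>"""

record = summarize(ET.fromstring(SAMPLE))
```

Against a live deployment, `summarize(fetch_xml(collection_object_url(some_id)))` would take the place of the sample payload.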

The CollectionSpace 1.0 services are:

Primary Entity Services

Descriptive Metadata
Search, Selection Context, Reporting
Activities
Vocabularies, Authorities, Standards
Organization

Utility Services

Data and Metadata Management
Workflow and Rules
Security
Administration and Monitoring
Miscellaneous

CollectionSpace’s underlying services adapt automatically to a configuration defining the schemas, and the REST payloads then include common schema information for each service, as well as any custom extensions.

A full CollectionSpace service description repository is available on-line at http://wiki.collectionspace.org/display/collectionspace/Service+Description+Repository.

The CollectionSpace Application Layer

The above-mentioned suite of services brings the principles of service oriented architecture to the guts of CollectionSpace. The CollectionSpace application layer integrates these services to create application-level data feeds to the user interface, described later. The application layer is concerned with three things: session and state maintenance; the caching and rebundling of data to ensure the efficient use of services; and, most important, configuration, customization and workflow.

Configuration, Customization, and Workflow

It is essential that configurability be at the core of CollectionSpace, and that it be carefully designed. Our approach to configurability is multilayered.

First, the application is designed to be configured by a single XML file, or suite of files, representing those aspects of museum processes which impact on the Collections Management System: its records and their fields, the accompanying workflows, and data types. This not only provides instructions to the application on how to display and behave, but also becomes a record of the structure of the museum's procedures as they impact on collections management.

In a reasonably complex museum, an XML representation of workflows can fast become incomprehensible, not only in terms of complex, turgid syntax, but also in the semantics an author must keep in mind during editing.

A traditional approach to these issues is diagrammatic. While the diagrammatic approach shows promise (and is certainly visually appealing), the approach tends to break down in complex cases. After all, the electronic circuit diagram of a television is similarly diagrammatic, and yet in many situations not an optimal representation of a television. Though there is nothing ‘wrong’ with a diagrammatic approach, it does not seem as if it addresses the fundamental issue of complexity.

Tags

Instead, the CollectionSpace technical team chose a ‘composite approach,’ which is a good compromise between retaining genuinely human-readable XML and a succinct representation of complexity. The design provides for a suite of XML tags covering identified common cases. However, it also provides the facility for individual museums to define their own tags, to represent their special data and procedures. If useful, such definitions can then be shared among museums.

The mechanism for defining new tags, beyond those provided "out of the box" or by other museums, is JavaScript. Despite its many flaws, this is a simple language with a massive base of expertise beyond traditional software development. By running the JavaScript in the single, controlled environment of the CMS server, rather than in a browser, we avoid almost all of the controversy and complexity which has accompanied the history of JavaScript. The result is a language and practice which could be mastered by someone at almost any museum. It is important to note that when it’s launched, CollectionSpace will run out of the box: neither XML nor JavaScript skills will initially be necessary to adapt it. In addition, a group of communities of practice will be working to provide, with release, a number of profiles to suit a variety of institution types. Even if this is as far as a museum wishes to involve itself in this construction, the ability to choose a profile for the collections management software that closely matches that institution will offer more than many existing offerings.
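The composite approach, a set of built-in tags plus museum-defined ones, can be pictured as a registry that maps each tag name to a handler. The sketch below is written in Python purely for illustration (the paper states that CollectionSpace defines new tags in server-side JavaScript); the tag names `field` and `dimensions`, the `tag` decorator, and the shape of the expanded output are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# A handler expands one configuration element into field definitions.
TagHandler = Callable[[ET.Element], list[dict[str, str]]]

TAG_HANDLERS: dict[str, TagHandler] = {}


def tag(name: str) -> Callable[[TagHandler], TagHandler]:
    """Register a handler for a configuration tag."""
    def decorator(handler: TagHandler) -> TagHandler:
        TAG_HANDLERS[name] = handler
        return handler
    return decorator


@tag("field")  # stands in for a tag provided "out of the box"
def plain_field(element: ET.Element) -> list[dict[str, str]]:
    return [{"id": element.get("id", ""), "type": element.get("type", "string")}]


@tag("dimensions")  # stands in for a tag defined by an individual museum
def dimensions(element: ET.Element) -> list[dict[str, str]]:
    prefix = element.get("id", "")
    unit = element.get("unit", "cm")
    return [{"id": f"{prefix}.{part}", "type": f"number:{unit}"}
            for part in ("height", "width", "depth")]


def expand(config_xml: str) -> list[dict[str, str]]:
    """Expand every child of the root element using its registered handler."""
    fields: list[dict[str, str]] = []
    for element in ET.fromstring(config_xml):
        handler = TAG_HANDLERS.get(element.tag)
        if handler is None:
            raise KeyError(f"no handler registered for <{element.tag}>")
        fields.extend(handler(element))
    return fields


CONFIG = """<record id="object">
  <field id="title" type="string"/>
  <dimensions id="size" unit="mm"/>
</record>"""

fields = expand(CONFIG)
```

The point of the sketch is that one short custom tag in the XML can stand for several underlying fields, which keeps the configuration file readable while the complexity lives in a small, shareable handler.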

Plugins

There are greater levels of customizability for the more ambitious museum, enthusiast, service provider, or collaborative technology project member. CollectionSpace supports plugins - self-contained pieces of functionality which can be easily plugged into CollectionSpace by a user. These plugins can be distributed by developers to extend the functionality of CollectionSpace.

There are three areas where plugins are particularly valuable.

  • First, they assist a large museum, or a museum housed within a larger institution or part of a wider network with a shared infrastructure, with the integration of CollectionSpace with other backend institutional systems, such as single-sign-on, media and document management systems, perhaps library systems, etc.
  • Second, they will assist in the integration of CollectionSpace with systems that consume its data, such as on-line catalogues, exhibition systems, federations of museums, and so on.
  • Third, a plugin architecture allows CollectionSpace to become a platform for experimentation on novel means of exposing and manipulating a museum's data for interpretive uses outside of more traditional classificatory uses of collections information (Srinivasan et al., 2008). Such interpretive uses include use by source-communities, wiki-based techniques, or the accommodation of attributed multiple interpretations in fields, not to mention of course the oft-mentioned social media applications.

The plugin architecture of CollectionSpace is inspired by the OSGi (Open Services Gateway initiative) module system, though it is not currently based upon it. As OSGi tools mature and experience in the technology grows, we intend to migrate to using it. It is an important target for the application layer that nothing more than basic Java servlet skills be required to use, develop and contribute to the code. This lets the broadest community of developers contribute to the code, and is particularly important in the application layer given its focus on customizability issues. This approach rules out the immediate use of a number of emerging software technologies and development frameworks (primarily through stack-trace pollution and non-intuitive control flow). We have knowingly traded away these potential uses in favor of “radical simplicity,” in an effort to cultivate a healthy community of contributors.

From Hardcore to No-Core

One challenge to writing a software platform which uses plugins extensively is maintaining sufficient "hooks" at the right places to allow sufficient control to be given to the plugins to alter behaviour. To achieve this, we use a "no-core" approach, where all functionality, including the key functions of CollectionSpace itself, is placed into plugins. This ensures that all relevant data is passed through the plugin data base. The only core in the ideal system handles registration, arbitration, and routing.

This creates challenges in the rapid release cycle approach adopted by CollectionSpace to ensure feedback from the community during development: the most important part of the first plugin implementation of an area of functionality is its API, and yet it is also the first part which must be developed, in advance of experience in the implementation of the feature. Our experience is that APIs developed in a vacuum tend to be of a poor quality. Therefore we initially develop "in-core", and when a feature is sufficiently mature that its role and scope within CollectionSpace is clear, we extract the API and "spin it out" into a plugin. This allows the ‘no-core’ philosophy to be quality-controlled at the release of key versions of CollectionSpace.

Each of the features of the application layer, from custom XML and XSLT snippets to JavaScript tag implementations and custom plugins, will be packagable as single files which can be installed into CollectionSpace by dropping them into the application. They can therefore be adopted with minimal effort by museums, who need not know any of these technologies to use the provided features.
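The "no-core" idea, in which the core only handles registration, arbitration, and routing while all functionality lives in plugins, might be sketched as follows. This is a Python illustration rather than the project's Java code; the `Core` and `Plugin` names, the request dictionaries, and the priority-based arbitration rule are assumptions.

```python
from typing import Protocol


class Plugin(Protocol):
    """What the core expects of a plugin (hypothetical interface)."""
    name: str

    def handles(self, request: dict) -> bool: ...
    def handle(self, request: dict) -> dict: ...


class Core:
    """Does nothing except register plugins, arbitrate, and route requests."""

    def __init__(self) -> None:
        self._plugins: list[tuple[int, Plugin]] = []

    def register(self, plugin: Plugin, priority: int = 0) -> None:
        self._plugins.append((priority, plugin))
        # Arbitration rule (an assumption): highest priority first; the sort
        # is stable, so equal priorities keep their registration order.
        self._plugins.sort(key=lambda entry: -entry[0])

    def route(self, request: dict) -> dict:
        for _, plugin in self._plugins:
            if plugin.handles(request):
                return plugin.handle(request)
        raise LookupError(f"no plugin handles {request!r}")


class LoanPlugin:
    """Stands in for a key function that ships with the application."""
    name = "loans"

    def handles(self, request: dict) -> bool:
        return request.get("type") == "loan"

    def handle(self, request: dict) -> dict:
        return {"handled_by": self.name, "id": request.get("id")}


class AuditedLoanPlugin(LoanPlugin):
    """Stands in for a local plugin that takes over loan handling."""
    name = "audited-loans"


core = Core()
core.register(LoanPlugin())
core.register(AuditedLoanPlugin(), priority=10)
result = core.route({"type": "loan", "id": "L1"})
```

Because even the built-in loan handling is a plugin, a locally installed plugin can replace it through the same registration call, with no change to the core.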

The CollectionSpace User Interface

The CollectionSpace application layer and user interface are tied together using an open, feed-based approach. As mentioned previously, the application shares all collections data via RESTful URLs and JSON-formatted objects. JSON is particularly well suited to building performant client-side code, and is widely employed by Web developers working across a variety of programming languages. CollectionSpace's feed-based approach is also highly amenable to creating new HTML and JavaScript-based user interfaces as well as sharing collection data outside CollectionSpace, for example with content management systems or mobile interfaces such as Fluid Engage (http://fluidengage.org). The end result is a reliable yet deeply extensible architecture that will enable integration with a wide variety of museum systems and the external Web.

Designing software with as well as for museums

CollectionSpace's user interface technology is designed and developed in collaboration with the Fluid Project (http://fluidproject.org). Fluid is an open community of developers, interaction designers, and museum staff who work together to improve the user experience of software employed by museums, archives, universities, and other cultural institutions. Fluid is an unprecedented contributor to open source museum software, committed to ensuring that usability and accessibility are integral parts of the whole project, not bolted on at the end. Contrast this with other open or commercial software projects, which are usually driven by technologists. Such solutions typically consist of a bewildering array of forms and technical jargon far removed from the day-to-day goals of museum professionals.

Usability is central to CollectionSpace. However, since managing collections is complex and multifaceted, CollectionSpace was designed to accommodate this complexity without trying to "dumb down" the user interface for non-professionals.

CollectionSpace is being designed using a flexible hybrid of Participatory Design and User-Centred Design (UCD). Participatory design brings the users of software directly into the design process, allowing them to be central contributors to the resulting user interfaces and application functionality of the software. Museum experts work side-by-side with interaction designers throughout the process. We also conducted a series of workshops in 2008 in which museum staff from a wide variety of disciplines and institutions helped shape the processes, workflow, and design affordances of the system. Similar workshops have proven to be an effective design tool for other museum projects as well (Muise et al., 2008). UCD techniques such as design modeling and frequent paper prototype testing occur throughout the development process, helping the project to stay grounded in the realities of museum collections.

Building on top of the Web

The architecture of the CollectionSpace UI is built entirely with open Web standards including HTML, CSS, and JavaScript. Built to be extensible and easily modified by UI developers or Web staff, CollectionSpace's presentation technology leverages a handful of familiar client-side JavaScript technologies. At the foundation is jQuery (http://jquery.com), a popular library for creating rich and cross-browser compatible user interfaces. The overall UI layer is structured in terms of a highly flexible variation on the Model-View-Controller (MVC) design pattern (Basman & Clark, 2009), as implemented by the Fluid Infusion JavaScript application framework (http://fluidproject.org/products/engage). Infusion provides CollectionSpace with a framework that leverages the strengths of MVC's separation of data from presentational concerns, while abstracting the repetitive and vague code often associated with the Controller layer. In short, the system is very loosely-coupled yet approachable to developers.

Each discrete part of CollectionSpace is implemented as an Infusion component, enabling the reuse of common widgets and behaviour across the whole application. From an end-user perspective, this improves the overall consistency of the UI. From a technical perspective, it ensures greater stability and a more compact, modular code base. Since the Infusion framework offers a number of infrastructural features that enable flexibility and greater accessibility, each component in the CollectionSpace UI can be customized or adapted without having to modify the actual code.

CollectionSpace's user interface code has, itself, been carefully factored into three separate layers, each of which will already be familiar to most Web developers. The structural layer consists of HTML templates, while the presentation layer is defined as a set of CSS style declarations. The behavioural layer is implemented using Fluid Infusion components written in JavaScript.

Each layer is loosely coupled, and the whole architecture is woven together with the collections data through the use of a declarative configuration format called the UI Spec. The UI Spec is generated by the previously described application layer, and is formatted as JSON (JavaScript Object Notation). It provides instructions for each component on how to render and bind raw data into HTML templates. The UI Spec, along with the use of HTML and CSS, ensures that CollectionSpace avoids making hard-baked assumptions about how the user interface will look and feel to the end-user (i.e. the particular organization implementing CollectionSpace).
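To make the role of the UI Spec more concrete, the following Python sketch binds a record into an HTML template under the direction of a small JSON document. The spec format shown here (class selectors mapped to dotted data paths), the template, and the naive string-based rendering are simplifying assumptions for illustration only; they do not reproduce the actual UI Spec format or the Infusion renderer, which runs in JavaScript in the browser.

```python
import json
from html import escape

# Hypothetical UI Spec: template selector -> path into the record data.
UI_SPEC = json.loads("""
{
  ".object-title": "fields.title",
  ".object-number": "fields.objectNumber"
}
""")

# Structural layer: an HTML template with no data in it.
TEMPLATE = '<h1 class="object-title"></h1><span class="object-number"></span>'


def resolve(record: dict, path: str):
    """Follow a dotted path such as 'fields.title' into nested dictionaries."""
    value = record
    for key in path.split("."):
        value = value[key]
    return value


def render(template: str, ui_spec: dict, record: dict) -> str:
    """Insert each bound value into the element carrying the matching class.

    Deliberately naive: it assumes each selector is a single class name that
    appears once in the template as class="name">.
    """
    html = template
    for selector, path in ui_spec.items():
        class_name = selector.lstrip(".")
        marker = f'class="{class_name}">'
        value = escape(str(resolve(record, path)))
        html = html.replace(marker, marker + value, 1)
    return html


record = {"fields": {"title": "Film projector", "objectNumber": "1984.1.2"}}
page = render(TEMPLATE, UI_SPEC, record)
```

Because the binding lives in the spec rather than in the template or the code, a museum could in principle relabel or rearrange the template, or point a selector at a different field, without touching the rendering logic.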

Adapting to the data and processes of individual Museums

The user interface architecture has been designed to make few built-in assumptions about the nature of the data it is presenting. Instead, the UI reflects CollectionSpace's overall declarative approach to configuration, allowing museums to create discipline-specific profiles and schemas that are best suited to their needs. Along with the changes to schema and structure described previously, implementers can adapt the user interface itself, from cosmetic updates and label changes through to adding or removing basic user interface elements throughout the system. Much of this is achieved through the modularity and loose-coupling afforded by the Fluid Infusion component model. Again, the work of customizing the user interface is largely accomplished through the use of familiar Web technologies: museum Web teams and IT staff can add or remove aspects of the CollectionSpace UI just by changing some HTML, CSS, and JSON. The ultimate goal of using such an architecture is to make it approachable by a wide range of developers and designers, without requiring them to learn complex programming languages like Java.

Skinning and accessibility

This Web-driven approach offers another opportunity for museum Web staff to customize CollectionSpace: the entire user interface can easily be re-skinned and styled using standard CSS techniques. In cases where museums want to ensure a consistent look and feel across a range of Web-based systems, a Web designer can easily provide an individual stylesheet for CollectionSpace to customize any aspect of the layout, text styling, and branding.

As mentioned previously, accessibility is a critical concern for CollectionSpace, and has been reflected in the design process from the beginning. CollectionSpace provides a number of features intended to make the user interface more usable and approachable by a wide range of users. Addressing accessibility issues also helps to improve the overall usability of the application; for example, ubiquitous keyboard shortcuts significantly improve the efficiency of the repetitive data entry typical of collections management systems. Highlights of CollectionSpace's accessibility features include keyboard navigation, support for large print, and the enhanced UI semantics of ARIA to enable use by assistive technology such as screen readers (Schwerdtfeger et al., 2009). This support for accessibility not only ensures that compliance with various pieces of accessibility legislation is more easily achieved, but it also substantially improves the usability and longevity of the software for a broad range of users.

Conclusion

Though it will work ‘out of the box’ (that is, if there were a box) on a single mid-range server, CollectionSpace and its sophisticated technical architecture provide museum technology professionals with many points of entry into the customization and extension of a collections management software application, better enabling our collections information to play safely in the Web 2.0 playground. In addition, the open source orientation of CollectionSpace means that these customizations and extensions may be shared freely within the museum community, providing museum professionals with the same ability to “create, participate in, share, refine, save, and re-use” our own tools that we increasingly demand of the digital experiences that we create for our visitors.

References

Basman, A., & C. Clark (2009) Infusion Framework Concepts. 2009, last updated June 16, 2009. Consulted January 31, 2010. http://wiki.fluidproject.org/display/fluid/Framework+Concepts#FrameworkConcepts-MVC

Institute of Museum and Library Services (2009). Museums, Libraries, and 21st Century Skills (IMLS-2009-NAI-01). Washington, D.C.

Muise, K., K. Tanenbaum, R. Wakkary, & M. Hatala (2008). “A Report on Participatory Workshops for the Design of Adaptive Collaborative Learning (2008)”, in AH2008 Workshop on Adaptive Collaboration Support in Adaptive Hypermedia 2008, Bonn, Germany. http://www.sfu.ca/~rwakkary/papers/Muise_etal_A_Report_on_Participatory_Workshops.pdf

Schwerdtfeger, R., J. Craig, M. Cooper, L. Pappas, & L. Seeman (2009). Accessible Rich Internet Applications (WAI-ARIA) 1.0. 2009, last updated December 15, 2009. Consulted January 31, 2010. http://www.w3.org/TR/wai-aria/

Srinivasan, R., R. Boast, J. Furner, & Katherine M. Becvar (2008). “Digital museums and diverse cultural knowledges: Moving past the traditional catalog”. The Information Society, (2010). http://polaris.gseis.ucla.edu/srinivasan/Research/Proofs/SrinivasanetalTISBlobgects.pdf

Cite as:

Goodman, C. et al., Architecting CollectionSpace: A Web-Based Collections Management and Information System for 21st Century Museum Professionals. In J. Trant and D. Bearman (eds). Museums and the Web 2010: Proceedings. Toronto: Archives & Museum Informatics. Published March 31, 2010. Consulted http://www.archimuse.com/mw2010/papers/goodman/goodman.html