A Multimedia Information System for Governmental Historical Documents
Spangler de Andrade
PRODEMGE - Companhia de Processamento de Dados do Estado de Minas Gerais
Multimedia environments are transforming human-computer
interactions and allowing the creation of a new family of
products that could very well be the catalyst for launching
the second information revolution. This new generation of
products is helping not only to integrate multimedia into
existing environments, but to reengineer work processes.
At the very least, multimedia and imaging applications are
enriching existing applications by integrating images, voice,
and video. More important, they are helping us to rethink
information processing in various applications and, with
the introduction of multimedia computing, to revolutionarize
business, art, science, engineering, and manufacturing processes.
S. Khoshafian - The Multimedia Revolution, page 1,
in Multimedia and Imaging Databases, Morgan Kaufmann,
The two basic aims of historical archives (the conservation and the
dissemination of their holdings) drive the whole project: the rapid
growth in the number of researchers who repeatedly consult the papers
of the Archive has subjected the documentation to accelerated
deterioration in recent years. Likewise, the demand for more complete
and more easily retrievable information is growing today in the field
of archives just as in the other fields of information. These are two
urgent archival problems, for which the development of a computer
system can be at once a solution that curbs the serious risk caused by
the handling of the documents and a response to the researcher's need
for broader, deeper and faster information.
As a result of this project, important benefits are expected for the
conservation of the documents, since once they are digitized the need
to handle them will decrease appreciably.
At the same time, the possibilities for disseminating the documentary
holdings of the archive will grow: faster location of the documents,
direct viewing of them on screen, the possibility of remote
consultation of the textual database, and the ease of obtaining
copies of the documents on paper or on digital media will improve
their use by archivists and researchers all over the world.
Proyecto de Informatización del Archivo General
de Indias, Ministerio de Cultura, Fundación Ramón
Areces, Seville, Spain, 1992
1 - Introduction
This work describes the development of a multimedia information system
for research on collections of historical documents belonging to the
Arquivo Público Mineiro (Minas Gerais Public Archive) and for their
dissemination through the World Wide Web.
The main objective of the system is to advance the computerization
of the Arquivo Público Mineiro, which stores about 1600 linear meters
of administrative and historical documentation on the state of
Minas Gerais, Brazil.
This paper addresses the use of new computer science technologies
for the storage and processing of multimedia data such as images,
video, audio and free text (GHAFOOR, 1995), since these forms of
information representation generally make up the collections of a
public archive.
2 - Complex Data and Multimedia Information
Today we live with a range of forms of communication and information,
such as images, sound, movement and free text, that are deeply tied
to the senses, the spirit, history and human knowledge. These types
of data are extremely rich, able to express quantitative or
qualitative information in a friendly way that the user perceives
immediately. They are, most of the time, atavistic and universal, and
their understanding is within reach of any human being, regardless of
social, political, ethnic and cultural factors.
In entertainment, learning, the arts, communication, trade and the
sciences, these forms of information play important and preponderant
roles.
From the beginning, in the fifties, until today, most automated
information systems have used a restricted type of data: classic or
conventional data, composed of a limited string of alphanumeric
symbols representing names, codes, measures, amounts or values
(ELSMARI, 1994; KORTH, 1995).
Forms of information representation such as image, audio and video
have a much more complex structure than short strings of letters and
numbers. Hence the aptness of the term "complex data", generally used
in Computer Science to refer to these types of data.
Complex data are still poorly represented in the storage and
management structures and in the processing tools available on the
commercial computing market, especially for information systems that
need to manipulate large amounts of such data. The segment is still
dominated by academic research, by the construction of prototypes and
by first commercial products in the industrial area.
The inclusion, although incipient, of these "new data" in computing
has made the development of information systems less restricted and
closer to the real world, easing human interaction with the machine
and providing a better level of information, knowledge and
understanding of reality. Not for nothing does the popular saying go:
"a picture is worth a thousand words".
Multimedia is a new branch of Computer Science. Its beginning can be
dated to the late eighties, and its real technological and commercial
evolution to 1993 onwards (YOSHIDA, 1994). It is therefore premature
to demand precision in its concepts and maturity in its methods and
tools. No precise definition of multimedia exists, and a great deal
of uncertainty and confusion remains in this field (RODRIGUEZ, 1995).
A term that had no meaning until a few years ago has become too
all-embracing (SUTHERLAND, 1995). Still according to RODRIGUEZ (1995),
multimedia has expanded into a field that defies rigorous definition.
For NEWTON (1997), many people still see multimedia as a diverse
group of technologies looking for a purpose.
As the name itself reveals, multimedia involves several media types,
that is, several ways to represent and disseminate information. These
include image, graphics, animation, video, free text and audio, each
with its specific properties. However, multimedia data share a common
characteristic in their computer representation: the need for
considerable storage volume (DAVID, 1996). These data make intensive
use of primary and secondary storage: volatile memory, magnetic
disks, optical disks. As these types of data increasingly travel over
widely distributed networks such as the Internet, powerful
compression algorithms and high-performance network systems are
necessary (KHOSHAFIAN, 1996).
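To give a sense of scale for the storage requirement mentioned above,
the following minimal sketch estimates the uncompressed size of one
scanned page. The page dimensions, scanning resolution and color depth
are illustrative assumptions, not figures from this project, and the
class and method names are hypothetical.

```java
// Rough estimate of the raw (uncompressed) size of a scanned page.
// All parameters are illustrative assumptions, not project figures.
class RawImageSize {

    // Bytes needed to hold an uncompressed raster image.
    static long rawBytes(double widthInches, double heightInches,
                         int dpi, int bitsPerPixel) {
        long widthPx = Math.round(widthInches * dpi);
        long heightPx = Math.round(heightInches * dpi);
        long bits = widthPx * heightPx * bitsPerPixel;
        return (bits + 7) / 8; // round up to whole bytes
    }

    public static void main(String[] args) {
        // Assumed: 8.5 x 11 inch page, 300 dpi, 24-bit color.
        long bytes = rawBytes(8.5, 11.0, 300, 24);
        System.out.println("Raw size in bytes: " + bytes);
    }
}
```

Under these assumptions a single page occupies about 25 million bytes
before compression, which is why compression and network capacity
matter as soon as such images are served remotely.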
Regardless of whether an accurate definition exists, multimedia, as
HOLSINGER (1996) affirms, is one of the most powerful ways at
humanity's disposal to communicate ideas, to present information and
to experiment with new concepts. KHOSHAFIAN (1996) has a similar
perception: "multimedia is the richest and expressive form to
represent and interact with the information". He adds that multimedia
is an irreversible trend in Computer Science and will, in the future,
be responsible for a dramatic revolution in the interaction between
man and machine. For MORAN (1995), multimedia technology is capable
of changing our relationship with the world, our perception of
reality and the integration of time and space. Communication becomes
more sensorial, multidimensional and non-linear. There is a
re-enchantment with this technology because it allows a much more
intense interaction between the real and the virtual. RODRIGUEZ
(1995) predicts that we are destined to become a society that uses
(and, perhaps, depends on) a plethora of multimedia applications
running on personal or professional computers and on television
equipment. The views of these and other authors (DAVID, 1996;
ROSEMBORG, 1993) clearly show the importance of multimedia computing.
Multimedia information systems use several types of multimedia data
concurrently and are able to organize, synchronize and present this
complex and comprehensive body of information interactively (DAVID,
1996; KHOSHAFIAN, 1996; GROSKY, 1994, 1997). According to ADJEROH
(1997), these systems are characterized by the integration of
different types of multimedia data originating from several sources.
MARCUS (1996) emphasizes that, although a considerable volume of work
on multimedia has been produced in recent years, only a small part
refers specifically to multimedia information systems.
These information systems are not limited to any type of application
or specific area of knowledge. Multimedia applications are useful to
many kinds of users and professionals: students, educators, doctors,
economists, engineers, executives, artists, researchers, scientists,
etc. They are also important for entertainment. The evolution of
multimedia is therefore of interest to all segments.
Multimedia applications can be found wherever complex data need to be
managed. Classic examples are education (local and distance training,
digital libraries), health (medical image databases), entertainment
(games, video on demand, interactive TV) and business
(videoconferencing, electronic commerce).
The Internet and multimedia are heading toward an inseparable
partnership. A variety of tools and techniques are being developed to
support multimedia in network systems (EARNSHAW, 1997). A good
example is Internet2, an alternative network for high-speed
multimedia applications already under way in the United States.
Languages such as Java and VRML (Virtual Reality Modeling Language)
are good platforms for developing sophisticated applications based on
the World Wide Web (WWW).
The association of multimedia database management systems with the
Internet is of special interest for information systems aimed at
museums and other institutions in charge of the custody and
dissemination of art works and historical documents (LINS, 1995;
LANZELOTTE). Together, these technologies have great potential to
widen and democratize access to humanity's cultural heritage, as
BESSER (1995) asserts: "few technologies have offered as much potential to change
research and teaching in the arts and humanities as digital imaging.
The possibility of examining rare and unique objects outside the secure,
climate-controlled environments of museums and archives liberates
collections from around the world breaks down physical barriers to
access, and the potential of reaching audiences across social and
economic boundaries blurs the distinction between the privileged few
and the general public".
3 - Public Archive: Concept and Challenges
A public archive is defined as the set of documents produced or
received by government institutions in the exercise of their specific
administrative, judicial or legislative functions (Arquivo ..., 1996).
According to the same source, a document is a record of information,
independent of the physical medium that contains it.
The growing demand for complete and easily retrievable information
from large archives has led to the emergence of advanced methods and
technologies for the digitizing, storage, retrieval and presentation
of images and other types of historical documents (Arquivo
Public archives and other institutions that maintain collections face
several problems, generally due to the large accumulation of
documents and their fragility. Chief among them are the risk of
degradation of the originals through direct and frequent handling,
and the difficulty that researchers and the general public have in
accessing the information (Arquivo ..., 1995; ARAÚJO, 1992). As
GARCIA (1994) remarks on the digitizing work of the Archivo General
de Indias in Spain: "free access to the papers has led to what is
called 'user inflation' in the reading rooms, an inflation that is
visibly causing more damage to the documents than the mere passage of
time had caused until now. In the Archivo General de Indias there are
documents that may be handled more than fifty times, for different
purposes, in the course of a year. What would happen to them if
appropriate measures were not taken?"
4 - The Arquivo Público Mineiro
Data processing systems are increasingly used in the various
activities of federal, state and municipal public administration in
Brazil, with the aim of achieving efficient and effective results in
matters of administrative interest and of meeting citizens' needs and
their social and political rights.
The Arquivo Público Mineiro (APM) is a century-old institution
founded on July 11, 1895 by state law number 126. It operated in the
historical city of Ouro Preto until 1902, when it was transferred to
Belo Horizonte, the recently built capital of the state of Minas
Gerais (Arquivo ..., 1996). Today the Arquivo Público Mineiro is
attached to the Secretaria da Cultura and is housed in a building
listed by the Historical Heritage authority (figure
Its objective is "to collect, keep and conserve the documents
produced and accumulated by the bodies of the state public
administration, guaranteeing citizens full access to them"
(Arquivo ..., 1996).
It holds a collection composed of textual and special documents. The
latter comprise proceedings, clippings, posters, films, pictures,
maps and plans. The collection includes documents from the 18th, 19th
and 20th centuries (Arquivo ..., 1996).
APM currently holds about ten million document pages, classified as
follows (Arquivo ..., 1995):
Public documentation (95% of the total)
- Bound (80%)
- Loose sheets (15%)
Private documentation (3%)
Special documentation (2%)
The main services, provided exclusively on its premises, are (Arquivo
- A reading room for consulting the documents, with an average of a
thousand consultations of catalogs and documents per month;
- A support library for users;
- Reproduction of documents, subject to authorization;
- Publications about its activities and its collections.
5 - Object Model of the Arquivo Público
The "acervo" (holdings) of the Arquivo Público Mineiro is divided
into collections, or fonds. Several types of collection exist:
textual, photographic, etc.
To facilitate research, and in accordance with its cataloguing, each
collection is divided into series and subseries, the division being
used for chronological series. The collection, as well as its series
and subseries, is composed of documents. A document is formed by one
or more "document components", each of which can be a text, an image,
a video, an audio recording or any type of information, independent
of its medium.
The object diagram shown in figure 2, based on the object-oriented
methodology OMT (RUMBAUGH, 1994), illustrates this structure. The
multimedia information system implemented from this model tries to
provide great flexibility in the consultation and retrieval of
information about the collections of a public archive.
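The structure described above can be sketched in Java as follows. This
is only an illustrative reading of the prose: every class, field and
method name is hypothetical and is not taken from the implemented
system or from its Jasmine schema. As a simplifying assumption, one
class stands for all three grouping levels (collection, series and
subseries).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java rendering of the archive object model.
// Names are illustrative, not those of the implemented system.
class ArchiveModelDemo {
    public static void main(String[] args) {
        ArchiveGroup collection = new ArchiveGroup("Sample collection");
        ArchiveGroup series = new ArchiveGroup("Photographs");
        collection.children.add(series);
        series.documents.add(new Document("Sample photo",
                new DocumentComponent(MediaType.IMAGE, "photo-001")));
        System.out.println(collection.totalDocuments());
    }
}

// Media types a document component may carry.
enum MediaType { TEXT, IMAGE, VIDEO, AUDIO }

// One piece of a document: a text, an image, a video or an audio recording.
class DocumentComponent {
    final MediaType type;
    final String location; // assumed: a reference to the stored media object

    DocumentComponent(MediaType type, String location) {
        this.type = type;
        this.location = location;
    }
}

// A document is formed by one or more document components.
class Document {
    final String description;
    final List<DocumentComponent> components = new ArrayList<>();

    Document(String description, DocumentComponent first) {
        this.description = description;
        components.add(first); // guarantees at least one component
    }
}

// A grouping level: a collection, a series or a subseries.
// Each level holds its own documents and may contain sub-levels.
class ArchiveGroup {
    final String name;
    final List<ArchiveGroup> children = new ArrayList<>();
    final List<Document> documents = new ArrayList<>();

    ArchiveGroup(String name) {
        this.name = name;
    }

    // Counts the documents at this level and at every level below it.
    int totalDocuments() {
        int total = documents.size();
        for (ArchiveGroup child : children) {
            total += child.totalDocuments();
        }
        return total;
    }
}
```

Modeling the three levels with one recursive class keeps the sketch
short. A design closer to the object diagram would probably use
distinct classes for collection, series and subseries.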
6 - The Multimedia Information System
The prototype, developed to motivate a computerization project for
the Arquivo Público Mineiro, tries to reconcile new technologies in
Computer Science, mainly those based on multimedia databases and
multimedia information systems, with the specific needs of a public
archive: conserving the documentary collections in its custody while
making them available to the public.
The authors tried to model a system geared to ease of use and
flexibility in retrieving historical documents and related
information. The system should serve both the researcher with deep
knowledge of the subject and the layperson who is curious about a
certain topic.
The Internet, owing to its extraordinary reach, is considered an
excellent means of popularizing knowledge and democratizing access to
information. For this reason the application uses the WWW as the main
access path to the historical documents.
The tools for storing, managing and processing the digitized
documents were selected for being technologically up to date, for
their ability to handle complex data such as free text and images,
and for their integration with the Internet through browsers such as
Netscape from Netscape Communications Corp. and MS-Explorer from
Microsoft Corp.
Thus, after a market survey, the following set of software was
selected:
- Object oriented database management system (OODBMS) Jasmine,
supplied by Computer Associates International Inc. (CAI) and Fujitsu
Inc., used during its evaluation period (McCRIGHT, 1997). It was
chosen because it represents the state of the art in DBMSs (its
commercial version was only released in January of this year), is
completely object oriented, has Web support and can handle large
volumes of complex data (CRAIG, 1997; FRANK, 1997);
- Jasmine Development Environment (JADE), an application generator
for Jasmine, from CAI (BOOKER, 1997);
- Java from Sun Microsystems Inc., a programming language designed
primarily for writing software for the WWW (NEWTON, 1997);
- Java Proxies (JP), supplied by Technology Deployment International
Inc. A product developed in partnership with CAI, it works as the
middleware between the Jasmine DBMS and the Java interface;
- Interface written in the Java language, chosen for its portability
and web support;
- Visual Cafe Pro from Symantec Corp., a rapid application
development (RAD) tool for Java (MARTIN, 1997);
- Digital image processing system developed by the Digital Image
Processing Nucleus - NPDI of the Federal University of Minas Gerais
- Browser for Web access, such as Netscape or MS-Explorer with
Java support (DeVONEY, 1997).
The use of an OODBMS is justified because the object oriented data
model is better suited to representing complex data than other data
models, such as the relational model usually implemented in DBMSs
(FOLEY, 1996; PAZANDAK, 1997; GROSKY, 1989).
On the other hand, OODBMSs are still emerging in the commercial
software market, so it is a useful investigation to observe how such
a tool behaves in a practical application like the one shown in this
work (FRANK, 1995). Jasmine implements the basic concepts of object
orientation such as encapsulation, polymorphism, inheritance, reuse
The use of a database management system also makes the application
more dynamic, since each new document added to the database becomes
immediately available for consultation.
Several search methods were implemented using the database management
system:
- Searches through the series and subseries according to which a
collection is classified. The researcher navigates the system through
virtual catalogs, selecting the documents of interest;
- Keyword searches that guide the user to a certain subject, event,
- Textual searches for any word or expression present in one or more
document descriptions.
Besides the search methods, topics with multimedia and hypertext
support were incorporated into the information system, containing
additional information, biographies, bibliographies and a glossary on
themes related to the searched collection.
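The three search styles listed above can be pictured with the
following in-memory sketch. It is an assumption-laden illustration
only: in the system described here the searches are carried out by the
database management system, and the `Entry` record, the method names
and the sample data below are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative in-memory version of the three search styles.
// Names and data are hypothetical; the real system queries the DBMS.
class SearchSketch {

    // Minimal catalog record: classification, keywords and description.
    static class Entry {
        final String series;
        final String subseries;
        final List<String> keywords;
        final String description;

        Entry(String series, String subseries,
              List<String> keywords, String description) {
            this.series = series;
            this.subseries = subseries;
            this.keywords = keywords;
            this.description = description;
        }
    }

    // Catalog navigation: documents filed under a series and subseries.
    static List<Entry> bySeries(List<Entry> all, String series, String subseries) {
        List<Entry> hits = new ArrayList<>();
        for (Entry e : all) {
            if (e.series.equals(series) && e.subseries.equals(subseries)) {
                hits.add(e);
            }
        }
        return hits;
    }

    // Keyword search: documents indexed under a keyword (case-insensitive).
    static List<Entry> byKeyword(List<Entry> all, String keyword) {
        List<Entry> hits = new ArrayList<>();
        for (Entry e : all) {
            for (String k : e.keywords) {
                if (k.equalsIgnoreCase(keyword)) {
                    hits.add(e);
                    break; // add each document at most once
                }
            }
        }
        return hits;
    }

    // Textual search: any word or expression found in the description.
    static List<Entry> byText(List<Entry> all, String expression) {
        String needle = expression.toLowerCase(Locale.ROOT);
        List<Entry> hits = new ArrayList<>();
        for (Entry e : all) {
            if (e.description.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(e);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up catalog entries, for illustration only.
        List<Entry> catalog = new ArrayList<>();
        catalog.add(new Entry("Photographs", "1922-1926",
                Arrays.asList("presidency", "portrait"),
                "Official portrait of the president"));
        catalog.add(new Entry("Correspondence", "1918-1922",
                Arrays.asList("state government"),
                "Letter about the state budget"));

        System.out.println(bySeries(catalog, "Photographs", "1922-1926").size());
        System.out.println(byKeyword(catalog, "Portrait").size());
        System.out.println(byText(catalog, "state budget").size());
    }
}
```

A linear scan is enough to show the distinction between the three
methods. A production system would rely on the indexes and query
facilities of the DBMS instead.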
Figure 3 schematically illustrates the
browsing levels allowed by the system:
In summary, the implementation of this computerization project is
justified for several reasons:
- It preserves the original collection of a public archive, avoiding
direct handling of the documents and their misplacement;
- It facilitates the consultation of digitized documents through
different search methods, allowing simultaneous access by several
users in geographically different places;
- It makes it possible to improve the quality of the documents
presented to the user and to enhance aspects of interest with
digital image processing (DIP) techniques, such as brightness and
contrast control and edge enhancement, without altering the original
digitized document;
- It implements alternative search methods such as textual and
keyword search, in addition to the customary use of catalogs;
- It allows the use of hypertext, making the searches more dynamic
- It allows several methods of remote and local access, such as the
Internet, intranets, CD-ROM, DVD, workstations and local networks.
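The idea of adjusting brightness and contrast for display without
altering the stored image can be sketched with the standard Java 2D
class `java.awt.image.RescaleOp`. This is not the NPDI image
processing system used in the project; the class name
`DisplayEnhancer`, the parameter values and the choice of
`TYPE_INT_RGB` images are assumptions made for the example.

```java
import java.awt.image.BufferedImage;
import java.awt.image.RescaleOp;

// Sketch: brightness and contrast control on a copy of the image.
// Assumes an RGB image without alpha (BufferedImage.TYPE_INT_RGB).
class DisplayEnhancer {

    // Returns an adjusted copy; the source image is left untouched.
    // Each color sample becomes sample * contrast + brightness,
    // clipped to the valid range.
    static BufferedImage adjust(BufferedImage original,
                                float contrast, float brightness) {
        RescaleOp op = new RescaleOp(contrast, brightness, null);
        // A null destination makes filter() allocate a new image.
        return op.filter(original, null);
    }

    public static void main(String[] args) {
        BufferedImage scan = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        scan.setRGB(0, 0, 0x646464); // a mid-gray test pixel
        BufferedImage view = adjust(scan, 1.5f, 10f);
        System.out.printf("original=%06x adjusted=%06x%n",
                scan.getRGB(0, 0) & 0xFFFFFF, view.getRGB(0, 0) & 0xFFFFFF);
    }
}
```

Because the operation writes to a new image, the digitized original
held in the database stays unchanged and each user can choose
different display settings.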
7 - Arthur Bernardes Historical Collection
As a prototype, the system currently covers just one of the
historical collections of the Arquivo Público Mineiro, the
collection of Dr. Arthur da Silva Bernardes.
Arthur Bernardes, a Brazilian statesman born in 1875 who died in
1955, was President of the State of Minas Gerais from 1918 to 1922,
President of Brazil from 1922 to 1926, a senator of the Republic and
several times a federal deputy. He was one of the most important
figures of early Brazilian republican history, in the first decades
of the twentieth century (MAGALHÃES, 1973; MONTEIRO, 1994; AMORA, 1964).
The choice was due, besides the relevance of the figure, to the fact
that his collection is rich, completely classified and constantly
consulted. It comprises a variety of handwritten and printed
historical documents, such as pamphlets, clippings, pictures,
correspondence and even films, and is thus a good representative of
an archival collection.
Since the collection is very extensive, a selection was made among
the documents, choosing the most important ones so as to portray the
whole of his biography and the historical period in which he lived,
and to include all types of documents. Emphasis was given to the
photographic collection: about 300 photos, dated from 1893 to 1955,
were selected from a universe of more than one thousand two hundred
pictures.
To avoid their degradation, the selected original documents were
photographed at the Arquivo Público Mineiro (figure 4), and the
photographs were then scanned with flatbed or handheld scanners and
stored in the database system.
In figure 5, some screens of the system
are shown, illustrating the several possible options for information
The prototype of the multimedia information system is fulfilling its
main purpose of being a catalyst for a broader computerization of the
Arquivo Público Mineiro. The institution is now looking for
partnerships to implement a similar system that will cover,
initially, all its pictures and plans, with more than 20,000
documents.
The use of object oriented multimedia database technology in
conjunction with the World Wide Web proved pertinent and interesting
for this type of application, in spite of some restrictions and
incompatibilities associated with the incipient stage of many of the
tools used, which may be overcome in the future.
The authors are grateful to Capes, CNPq and Prodemge for financial
support and to Arquivo Público Mineiro for technical support.
The authors also would like to thank Leonardo Kenji Shikida and Frederico
Braga Torres Paulino, Computer Science students at UFMG, for their
participation in the generation of the multimedia information system.
ADJEROH, D. A., NWOSU, K. C. Multimedia Database Management - Requirements
and Issues. IEEE Multimedia, v. 4, n. 3, jul.-sep. 1997, p. 24-33.
AGUILAR R. et al. In Spain a Project that honors history. Think,
n. 6, 1989, p. 6-9.
AMORA P., Bernardes o Estadista de Minas na República. São
Paulo: Companhia Editora Nacional, 1964, 234 p.
ARAÚJO, A. A., LAENDER, A. H. F. et al. Um sistema de banco
de dados de imagens para auxílio ao processo de conservação
e restauração de documentos históricos. SIBGRAPI
V - Simpósio Brasileiro de Computação Gráfica
e Processamento de Imagens, p. 53-56, 1992.
Arquivo Público Mineiro: Projeto de Digitalização
de Imagens. Belo Horizonte, Prodemge, 1995.
Arquivo & Administração Pública - o moderno
papel dos Arquivos Públicos. Belo Horizonte: APM, 1996 (primeiro
BESSER, H., TRANT, J. Introduction to Imaging. Los Angeles: Anderson
Lithograph, 1995, 50 p.
BOOKER, E. Jasmine sweetens database scenario. InternetWeek, n.
695, dec. 1997, p. 29-30.
CHANG, Shi-Kuo, HSU, A. Image information systems: where do we go
from here? IEEE Trans. on Knowledge and Data Engineering, v. 4, n.
5, p. 431-442, oct. 1992.
CRAIG, S. CA's object database gets two-step release. Computerworld,
v. 31, n. 27, jun. 1997, p. 3.
DAVID, M. M. Multimedia Databases. Database Programming & Design,
v. 10, n. 5, may 1997, p. 26-35.
DeVONEY, C. Browsers go head to head. Computerworld, v. 31, n.
50, dec. 1997, p. 81-82.
EARNSHAW, R. 3D and Multimedia on the Information Superhighway. IEEE
Computer Graphics and Applications, v. 17, n. 2, mar.-apr. 1997, p.
FOLEY, John. Open the gates to objects. Information Week, n. 579,
p. 44(5), may 13, 1996.
FRANK, M. Future database technologies now. DBMS, v.8, n. 12, p.
52(6), nov. 1995.
FRANK, M. DBMS 1997 Buyers Guide. DBMS, v. 10, n. 6, sep. 1997.
GARCIA, P. G. Novas tecnologias no Arquivo Geral das Índias.
Collection - Revista do Arquivo Nacional, Rio de Janeiro, v. 7, n.
1-2, p. 75-90, jan./dez. 1994.
GHAFOOR, A. Multimedia Database Management Systems. ACM Computing
Surveys, v. 27, n. 4, dec. 1995, p. 593-598.
GROSKY, W. I., MEHROTRA, R. Image Database Management. IEEE Computer,
v. 22, n. 22, dec. 1989, p. 7-8.
GROSKY, W. I. Multimedia Information Systems. IEEE Multimedia, Spring
1994, p. 12-24.
GROSKY, W. I. Managing Multimedia. Communications of ACM, v. 40,
n. 12, dec. 1997, p. 73-80.
HOLSINGER, E. Como Funciona a Multimídia. São Paulo:
Quark do Brasil, 1995, 200 p.
KHOSHAFIAN, S., BAKER, A. B. Multimedia and Imaging Databases. San
Francisco: Morgan Kaufmann, 1996, 590 p.
KORTH, H. F., SILBERSCHATZ, A. Sistemas de Banco de Dados. 2a. ed.
rev. São Paulo: Makron Books, 1995, 750 p.
LANZELOTTE, R. S. G et al. The Portinari Project: Science and Art
team up together to help cultural projects. Rio de Janeiro: PUC, 15
LINS R. D. et al. Projeto Nabuco: Processamento de Imagens de Documentos
Históricos. Recife: Universidade Federal de Pernambuco, Departamento
de Informática, 1995, 11 p.
MAGALHÃES, B. O. Arthur Bernardes Estadista da República.
Rio de Janeiro: Livraria José Olímpio Editora, 1973,
MARTIN, H. Java comes to a boil. Windows Magazine, v. 8, n. 11, nov.
1997, p. 156.
McCRIGHT, J. S. CA take up object cause. PCWEEK, v. 14, n. 52, dec.
MONTEIRO N. G. Dicionário Biográfico de Minas Gerais
- Período Republicano 1889/1991. Belo Horizonte: Assembléia
Legislativa de Minas Gerais, 1994.
MORAN J. M. Novas tecnologias e o reencantamento do mundo. Tecnologia
Educacional, v. 23, n. 126, set./out.1995, p. 24-26.
NEWTON, H. Newton's Telecom Dictionary. New York: Flatiron Publishing,
NORMAN, M. To universally serve where no database has served before.
Database Programming & Design, v. 8, n. 7, p26(8), jul. 1996.
PAZANDAK, P., SRIVASTAVA, J. Evaluating Object DBMSs for Multimedia.
IEEE Multimedia, jul.-sep. 1997, p. 34-49.
Proyecto de Informatización del Archivo General de Indias.
Sevilha: Ministerio de Cultura, Dirección General de Bellas
Artes y Archivos, Fundacion Ramon Areces,1994.
RODRIGUEZ, A., ROWE, L. A. Multimedia Systems and Applications. IEEE
Computer, v. 28, n. 5, may 1995, p. 20-22.
ROSEMBORG, V. Guia de Multimídia. Rio de Janeiro: Berkeley
Brasil, 1993, 470 p.
RUMBAUGH, J., BLAHA, M. et al. Modelagem e projetos baseados em objetos.
Rio de Janeiro: Campus, 1994, 650 p.
SUTHERLAND, D. Computação Multimídia.
YOSHIDA, J. Multimedia is in the Chips. Electronic Engineering Times,
PRODEMGE - Companhia de Processamento
de Dados do Estado de Minas Gerais
Rua da Bahia, 2277, CEP 30160-012, Belo Horizonte
- MG, Brasil
fone 55 31 3391371
fax 55 31 3391259
UFMG - Universidade Federal de
Minas Gerais, Departamento de Ciência da Computação
Av. Antônio Carlos, 6627, CEP 31270-010, Belo
Horizonte -MG , Brasil
This file can be found below http://www.archimuse.com/mw98/