DSpace Federation
2nd User Group meeting
Descriptions of presentations
ROBERT TANSLEY
Adventures in Time and DSpace
DSpace has come a long way since its beginnings at HP Labs and MIT Libraries.
Originally, DSpace was designed to be "digital shelf space" for the volumes
of intellectual output from MIT that are in digital rather than paper
form, for example datasets and rich media, in addition to text. This goal
was clearly not limited to MIT: universities, research institutions and
many other kinds of organisation face the same challenges. Not only does
this mean that DSpace could be of value to those organisations; they also
represent a vast pool of resources. They might not be able to build such a
system from scratch by themselves, but would certainly be able to contribute
to the maintenance and enhancement of an existing system. Thus, the vision
for DSpace was born: build a "breadth-first"
system to tackle the problems in a relatively simple way; then release
the system as open source, allowing the globally dispersed community of
researchers and developers to add depth, giving the system longevity and
growth not possible from one institution working alone. This vision appears
to have been well-founded. DSpace is now used by around 100 organisations
in 28 countries worldwide, with around 40 individuals having contributed
to development. New developments are contributed on a weekly basis, and
numerous research projects with forward-looking goals are growing around
the DSpace nucleus. DSpace has made the transition to a vibrant open source
project, its future and growth assured by the size of our community. In
this talk I will give a brief history of DSpace, and give a snapshot of
our community today. I'll then describe some of the opportunities and
challenges faced by our community moving forward.
PATRICK CARMICHAEL and RICHARD PROCTER
Collaboration, Coherence and Capacity-Building: Using DSpace to Support
a Major Social Science Research Programme in the UK
In this paper we describe how the Teaching and Learning Research Programme
(TLRP), an eight-year, £30 million programme charged with supporting
and developing educational research in the United Kingdom, has implemented
and applied DSpace as a repository for project and programme outputs
including published articles, conference papers, research reports, briefings
and press releases. The DSpace repository has become a major element in
the user engagement strategy of the programme, with the OAI (Open Archives Initiative)
interface providing a basis for the development of a number of applications
and services to projects and other interested groups, including search
interfaces, dynamic web content, RSS feeds and other services. These have
also formed the basis of collaboration and communication between the TLRP
and other research programmes, indexes and resources. The TLRP is also
concerned to enable collaboration between researchers and to foster individual,
institutional and sector-wide research capacity. The OAI interface has
been used as the basis of data visualization tools, allowing the identification
of distinctive patterns of collaboration across the diverse projects of
the programme, and the exploration of similarities and differences
between education research (within and beyond the TLRP) and other disciplines.
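As an illustration of how collaboration patterns might be derived from an OAI interface, the following minimal sketch counts co-author pairs in an OAI-PMH ListRecords response carrying Dublin Core metadata. It is not the TLRP implementation: the sample response, the author names and the function name coauthor_pairs are assumptions made for this example; only the OAI-PMH and Dublin Core namespace URIs are standard.

```python
import itertools
import xml.etree.ElementTree as ET
from collections import Counter

# Standard OAI-PMH and Dublin Core namespace URIs.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response, trimmed to the elements used below.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Paper one</dc:title>
        <dc:creator>Author A</dc:creator>
        <dc:creator>Author B</dc:creator>
      </oai_dc:dc>
    </metadata></record>
    <record><metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Paper two</dc:title>
        <dc:creator>Author B</dc:creator>
        <dc:creator>Author A</dc:creator>
        <dc:creator>Author C</dc:creator>
      </oai_dc:dc>
    </metadata></record>
  </ListRecords>
</OAI-PMH>"""


def coauthor_pairs(xml_text):
    """Count how often each pair of creators appears on the same record."""
    root = ET.fromstring(xml_text)
    pairs = Counter()
    for record in root.iterfind(".//oai:record", NS):
        # Sort and de-duplicate so (A, B) and (B, A) count as the same pair.
        creators = sorted(
            {c.text.strip() for c in record.iterfind(".//dc:creator", NS) if c.text}
        )
        pairs.update(itertools.combinations(creators, 2))
    return pairs


pairs = coauthor_pairs(SAMPLE_RESPONSE)
```

The resulting pair counts could serve as edge weights in a collaboration graph, which a visualization tool would then draw.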
MIKE SIMPSON and JIM DOWNING
NSpace: Exploring Architectural Design Principles for a Next-Generation
Institutional Repository
This presentation is intended to serve as a gentle but complete introduction
to the architectural design elements of NSpace, a proof-of-concept prototype
that explores an application infrastructure for a next-generation institutional
repository.
The NSpace architecture combines a number of emerging technologies to
provide a flexible, modular framework for institutional repository development.
The NSpace prototype uses a representational state transfer (REST) model
for client/server communication, and employs a staged event-driven architecture
(SEDA) manager for dynamic resource control and adaptive overload management;
it also uses an inversion of control (IoC) container for runtime configuration
of participating components.
The NSpace framework cleanly separates client interfaces ("frontends")
from server implementation details ("backend"). Frontends communicate
with one or more backend servers using protocol-neutral transaction objects
("transactions"): clients embed transactions inside service request objects
("requests") that implement protocol details (i.e. the XML-over-HTTP "RESTful"
communication model), and then execute the request. Request execution
by the client replicates the transaction object to the backend server,
which services the transaction and replicates the completed transaction
object back to the client. All details of request execution are completely
encapsulated by the service request object: from the client's point of
view, the state of the transaction object simply changes across the call
to the execution method. Although RESTful service requests are the current
default implementation for NSpace, the architecture is designed to easily
incorporate additional protocols (web services et al.) with a minimum
of effort.
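The request/transaction split described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not NSpace code (which is Java): the class names, the "echo" action and the use of JSON and an in-process call in place of XML-over-HTTP are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Transaction:
    """Protocol-neutral unit of work (field names are hypothetical)."""
    action: str
    params: dict = field(default_factory=dict)
    result: dict = field(default_factory=dict)
    completed: bool = False


class Backend:
    """Stand-in for a backend server that services serialized transactions."""

    def service(self, payload: str) -> str:
        tx = Transaction(**json.loads(payload))
        if tx.action == "echo":  # hypothetical action, for illustration only
            tx.result = {"echo": tx.params}
        tx.completed = True
        return json.dumps(asdict(tx))


class Request:
    """Service request object: hides every protocol detail from the client."""

    def __init__(self, backend: Backend, transaction: Transaction):
        self.backend = backend
        self.transaction = transaction

    def execute(self) -> None:
        # Replicate the transaction to the backend (JSON here, not XML-over-HTTP).
        wire = json.dumps(asdict(self.transaction))
        reply = json.loads(self.backend.service(wire))
        # Copy the completed state back onto the client's own object.
        for name, value in reply.items():
            setattr(self.transaction, name, value)


tx = Transaction(action="echo", params={"title": "Example"})
Request(Backend(), tx).execute()
```

From the client's point of view, tx simply changes state across the call to execute(). Supporting another protocol would mean writing another request class, leaving Transaction untouched.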
The NSpace framework also modularizes the implementations of supporting
code ("support modules") within the backend server. Individual modules
are written as POJOs ("plain old Java objects") that include a transaction-handling
method (one that receives a transaction object, performs processing activity
on it, and returns it to the framework for further processing). Modules
are assembled into sequences of cooperating instances ("chains"), which
are then made remotely accessible through servicing endpoints ("services").
When the framework replicates a client transaction object, it is queued
to the first module in the appropriate chain. Transactions "fall through"
the chains, with each participating module given its turn to perform work;
at the end of the chain, the completed transaction object is delivered
back to the originating client. Again, the framework encapsulates and
manages all runtime queueing and dequeuing operations as transactions
move through the servicing chains: from the individual module's point
of view, its processing method is simply invoked once per transaction
passing through it.
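A chain of support modules might look like the sketch below. It is a simplified, single-threaded illustration in Python: the module names, the dictionary used as a transaction and the Chain class are assumptions, and a real staged event-driven design would typically give each stage its own queue and worker threads rather than draining the queues in a loop.

```python
from collections import deque


class TrimTitle:
    """Hypothetical module: tidies the title carried by the transaction."""

    def handle(self, tx):
        tx["title"] = tx["title"].strip()
        return tx


class CountWords:
    """Hypothetical module: records the number of words in the title."""

    def handle(self, tx):
        tx["words"] = len(tx["title"].split())
        return tx


class Chain:
    """Queues transactions to each module in turn; modules never see the queues."""

    def __init__(self, modules):
        self.stages = [(module, deque()) for module in modules]

    def submit(self, tx):
        # A new transaction is queued to the first module in the chain.
        self.stages[0][1].append(tx)

    def run(self):
        completed = []
        for index, (module, queue) in enumerate(self.stages):
            while queue:
                tx = module.handle(queue.popleft())
                if index + 1 < len(self.stages):
                    self.stages[index + 1][1].append(tx)  # fall through
                else:
                    completed.append(tx)  # end of chain: return to client
        return completed


chain = Chain([TrimTitle(), CountWords()])
chain.submit({"title": "  DSpace and the Grid "})
done = chain.run()
```

Each module exposes only a handle method, so modules can be reordered or reused in other chains without modification.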
Most usefully, the transaction objects (represented as Java interfaces
in code, or as DTDs in REST/XML representation) assume the role of the
protocol-neutral, implementation-independent fundamental units of work
done by the institutional repository. Development of frontend interfaces,
client/server communication methods, and backend functionality implementations
each become largely independent of each other, since each "end" of the
code is insulated from changes by the common ground of the transaction
objects.
The NSpace project is also exploring the current state-of-the-art in
coding techniques and practices: using Subversion as a source code repository;
JUnit for unit testing; Maven for project management; and Eclipse for
integrated code authoring.
SCOTT PHILLIPS, CODY GREEN, ALEXEY MASLOV,
ADAM MIKEAL, BRIAN SURRATT and JOHN LEGGETT
DSpace XML UI Project Technical Overview
This presentation describes the modifications to DSpace by Texas A&M
Libraries to support an XML-based user interface. The purpose is to enable
the establishment of a unique look-and-feel for each community & collection
in DSpace; this may include integrating with a community's existing web
presence outside of DSpace. The goal of the DSpace XML UI project is to
increase the adoption of DSpace by these communities.
SCOTT YEADON, PETER RAFTOS, LEO MONUS
The Australian National University: Case Study: Creating Publications
from a DSpace Repository
Abstract
We demonstrate the re-publication of a complex document (text and interactive
CD) from a DSpace repository. The document components (over two hundred
text, schematic, audio, image and video components) are transformed into
open standard formats and stored in DSpace, together with a relationship
map. Appropriate presentations (PDF, HTML) are generated dynamically using
the Apache Cocoon framework. The implications of this case study for a
more general publication framework are considered.
Brief Description
By taking the digital objects used in the creation of a Macromedia Director
combined CD/Book publication, the DSpace team at ANU has implemented a
prototype to demonstrate the framework for the ingest, preservation and
publishing of digital objects. The prototype was built using DSpace as
the digital object repository and Cocoon as the publishing/access mechanism.
While the prototype focuses on the re-creation of the CD/Book publication
in an on-line form, the implementation covers:
- a configurable ingest mechanism
XML-based configuration of item and metadata locations, pluggable XSL
stylesheets for adding semantic meaning to textual documents, independent
of repository software up until actual ingest
- creation of XML-based relationship metadata and a "product definition"
A prototype XML tag set was created to support the digital object relationships.
This information could be used to define compound objects at ingest,
create new compound objects (such as learning objects) from currently
unrelated objects in the repository and, in conjunction with the "product
definition" markup (superset of the ingest XML), publish from, or partly
from, a repository
- publishing digital objects
The re-use of digital objects for publishing and creating new compound
digital objects. Through the (semi-automatic) disaggregation of existing
compound objects, and by encouraging the ingest of digital objects at
their most granular level (i.e. individual images, text/documents, video,
audio), new compound objects and publications can be created using
relationship metadata and product definition markup.
This presentation will take the form of a demonstration of the publication
with discussions around how it was put together and the wider implications.
The ANU believes that further development of the prototype and a set of
supporting tools built over the prototype will yield significant benefits
in making depositing and re-use of digital objects easier.
RICHARD RODGERS
DSpace and the Grid
In this short presentation, we will survey current and future research
into efforts to integrate DSpace with 'grid' technologies for more scalable
storage. After a general introduction to the grid, we examine the work
surrounding DSpace and the Storage Resource Broker (SRB), a data grid
middleware component developed by the San Diego Supercomputer Center.
DSpace 1.3 will include some SRB support. Work extending this integration
in the context of DSpace 2.0 will also be described.
MIGUEL FERREIRA and ANA ALICE BAPTISTA
DSpace-Dev @ University of Minho - Development of tools and add-ons
for the DSpace platform
DSpace related activities at University of Minho (UM) started in April
2003 and since then many developments have been made. The first step was
taken by the University Documentation Services (SDUM) with the translation
of DSpace to Portuguese which was followed by the implementation of the
RepositóriUM, the university's institutional repository. The translated
version of DSpace has been downloaded and used in many other institutions
in Portugal and Brazil.
This very same version of DSpace served as a basis for the Papadocs
system - a system which aimed at providing access to all assignments developed
by the students of the Department of Information Systems. This instance
of DSpace also served as test bed for a series of add-ons created at the
department, namely, the Commenting Add-on, the Recommendation Add-on,
the Web of Communication Add-on and the Controlled Vocabulary Add-on.
This project has led to the formation of several interesting ideas.
These include the improvement of the Controlled Vocabulary Add-on to support
more complex structures, such as thesauri, and the development of automatic
metadata extractors that could fill in some of the required metadata during
the process of depositing new items.
ARD PRASAD
Using Multiple Metadata Formats in DSpace
Most commonly, DSpace is used as an institutional repository and
also as a discipline-based repository. There have been attempts to extend
DSpace's ability to host electronic theses and dissertations, such as the popular
Tapir and the attempt by the University of Manitoba to provide the etdms metadata
format. However, the user community has often expressed the requirement
for other metadata formats such as VRA Core, IMS etc. Support for many metadata
formats will greatly enhance the use of DSpace and the types of resources
that could be preserved using DSpace. The task involves:
- creating more elements in the dctyperegistry table
- incorporating new elements in the submission workflow
- making some of the newer elements searchable
- displaying new elements in search results
- most importantly, exposing the newly created metadata formats
through the OAI protocol
The recent beta version (1.2.2beta) of DSpace comes with much-needed
input forms with which one can define one's own submission forms. This
feature adds the ability to add any metadata format to DSpace
in addition to the existing OAI Dublin Core. This paper attempts to enumerate
various metadata formats and the required element sets. Further, it provides
guidelines for modifying various DSpace files for some of the popular
metadata formats in use, such as etdms and VRA Core.
DIMITRIOS KOUTSOMITROPOULOS
Enabling Multilingualism and I18N in DSpace
Two years ago the University of Patras initiated a large project to reform the study programmes of its departments. Along with departmental efforts, a series of central support actions were set up and put into development. Among these, the establishment of a central inter-departmental Institutional Repository that will facilitate the management, preservation and dissemination of educational material plays a major role in the effectiveness and success of the project. After a review process that took into account parameters such as system availability, extent of support and use of state-of-the-art technologies, DSpace was selected as the basis for the development of the University of Patras educational repository.
This presentation focuses on the efforts that have been made to upgrade DSpace into a truly internationalized system, rather than merely localizing it. Specifically, the presentation revolves around the following points:
- The methodology followed towards making the DSpace UI multilingual, recently incorporated into DSpace 1.3
- Internationalization of dynamic content in servlets and tags (like dates, headers and other text)
- Submission of metadata in more than one language (metadata language description)
- Language-dependent item view
- Dynamic change between interface languages
- Lessons learned and suggestions for further progress on DSpace i18n
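The difference between localization and internationalization in the points above can be illustrated with a message lookup that falls back from a specific locale to its base language and then to a default. This is a minimal sketch under stated assumptions, not the DSpace implementation: the message keys, the catalog contents (including the sample Greek string) and the function names are invented for the example.

```python
# Hypothetical message catalogs keyed by language; keys are illustrative only.
CATALOGS = {
    "en": {"browse.title": "Browse by title", "search.button": "Search"},
    "el": {"browse.title": "Πλοήγηση ανά τίτλο"},
}
DEFAULT_LANGUAGE = "en"


def candidate_languages(locale):
    """Return lookup order, e.g. 'el_GR' -> ['el_GR', 'el', 'en']."""
    parts = locale.replace("-", "_").split("_")
    candidates = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    if DEFAULT_LANGUAGE not in candidates:
        candidates.append(DEFAULT_LANGUAGE)
    return candidates


def message(key, locale):
    """Look up a UI string, falling back to less specific catalogs."""
    for language in candidate_languages(locale):
        text = CATALOGS.get(language, {}).get(key)
        if text is not None:
            return text
    return key  # last resort: show the key so the gap is visible


greek_title = message("browse.title", "el_GR")
fallback_button = message("search.button", "el_GR")
```

Because the interface language is just a parameter of the lookup, it can change per request, which is what a dynamic change between interface languages requires.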
JIM DOWNING and GRACE CARPENTER
Exploring Strategies for Digital Preservation for DSpace@Cambridge
Cambridge University Library and MIT Libraries submit this proposal to share the outcomes of the digital preservation research work conducted through the DSpace@Cambridge project, concentrating on two main areas: Process Automation and Preservation Planning.
Automation
Digital preservation activity in its current form commonly involves a high level of human effort. In mediated archiving the archivist's efforts do not scale well. In self-archiving situations this effort can be a barrier to the adoption of the digital preservation activity. It behoves us, therefore, to look for opportunities to make this process more efficient through automation.
The potential for sustainable preservation of a digital object can be
improved by accurate identification of the file's type, validation of
the file against type specification, and technical metadata extraction.
Recently available software (e.g. JHOVE) [1] provides identification and validation
for a number of popular types, and there are older existing technologies
(e.g. the 'file' command) that have some useful functionality in this
area. DSpace@Cambridge will evaluate the different tools, investigate
storage strategies for technical metadata, attempt to gauge the utility
of certain types of technical metadata, and provide a technique for integrating
these into institutional repositories.
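Identification by leading "magic" bytes, the basic technique behind tools such as the 'file' command, can be sketched as below. This is an illustrative assumption of how a first identification step might look, not JHOVE's approach or the DSpace@Cambridge implementation; it covers only a handful of well-known signatures and performs no validation against the format specification.

```python
# A few well-known file signatures (leading bytes) and their MIME types.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),
]
UNKNOWN = "application/octet-stream"


def identify(data: bytes) -> str:
    """Guess a file's type from its first bytes; UNKNOWN if nothing matches."""
    for magic, mime_type in SIGNATURES:
        if data.startswith(magic):
            return mime_type
    return UNKNOWN


def technical_metadata(data: bytes) -> dict:
    """Collect a minimal, hypothetical technical metadata record."""
    return {"format": identify(data), "size_bytes": len(data)}


record = technical_metadata(b"%PDF-1.4\n%example")
```

A record of this kind could be stored alongside the bitstream at ingest; validation and richer metadata extraction would need a dedicated tool.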
Preservation Planning
DSpace@Cambridge intends to develop strategy templates that will assist
institutions with the preservation planning process. Building on work
on format action plans done at the Florida Center for Library Automation
as part of the DAITSS project [2], we hope to create a system of
machine-readable preservation
strategies that can evolve to support future rendering processes, and
yet retain enough information that such processes can be human validated.
Although the initial focus will be on migration, it is hoped that the technique
can be extended to emulation and Universal Virtual Computer approaches.
Our hope is to prove this preservation strategy approach by writing a
migration tool for one or two formats capable of supporting migration
on ingest or migration on-the-fly. It should be possible to share strategy
templates between institutions.
[1] http://hul.harvard.edu/jhove/jhove.html
[2] http://www.fcla.edu/digitalArchive/pdfs/DAITSS.pdf
ELIN STANGELAND
Data exchange between the Norwegian research reporting systems and
DSpace
BORA is the institutional repository of the University of Bergen. In it
researchers from the University self-archive research publications such as
journal articles, reports, PhD theses and master's theses.
Researchers at four of the Universities in Norway are currently reporting their
research to a reporting system called FRIDA, built on the recommendations of
the Ministry of Education and Research. IR project groups from the same four
universities are collaborating with the FRIDA development group in Oslo to
harvest data from parts of FRIDA (by using the OAI-PMH) and import these into
the local IRs. They are also planning to create a solution where the researcher
can upload the full-text documents (files) as well, transmitting these to the
local IRs.
For the Bergen Open Research Archive this means that we need to adapt DSpace to
handle the imported data. They should be injected into a specialised workflow,
which pools the data until staff have time to follow them up. There must also
be a way to "submit" the data from the "temporary pool" into their final
destination, the collections of BORA, creating traditional DSpace items.
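The "temporary pool" workflow described above might be modelled along the following lines. This is a sketch under assumptions, not BORA's design: the class and field names, the identifier format and the collection name are all hypothetical, and a real implementation would sit inside the DSpace workflow system rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class PooledRecord:
    """A harvested record awaiting staff review (fields are hypothetical)."""
    identifier: str
    metadata: dict


@dataclass
class ImportPool:
    pending: dict = field(default_factory=dict)      # identifier -> record
    collections: dict = field(default_factory=dict)  # collection -> records

    def add(self, record: PooledRecord) -> None:
        # Re-harvesting the same identifier must not create a duplicate.
        self.pending.setdefault(record.identifier, record)

    def submit(self, identifier: str, collection: str) -> PooledRecord:
        """Move a reviewed record from the pool into its final collection."""
        record = self.pending.pop(identifier)
        self.collections.setdefault(collection, []).append(record)
        return record


pool = ImportPool()
pool.add(PooledRecord("frida:1", {"title": "Example report"}))
pool.add(PooledRecord("frida:1", {"title": "Example report"}))  # repeat harvest
pool.submit("frida:1", "Research reports")
```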
ANNA MARIA TAMMARO and PIETRO GOZETTI
Academic staff expectations on DSpace services: results of a survey
at the University of Parma's Art and Humanities Faculty
DSpace was implemented at the University of Parma in the second semester of 2003, with 2004 as a preparatory year of experimentation and organisation. The first Faculty to be involved in the DSpace project experimentation was the Arts and Humanities Faculty, starting with the Department of Cultural Heritage, followed by some others on a voluntary basis. A survey was carried out in March 2004, with the aim of personalising the services to the different needs and behaviours of the Faculty staff, distributed across the Departments of History, Political Science, Environment Studies, Education, Philosophy, Italian Literature, Foreign Languages and Literature, Classical and Medieval Philology, and Cultural Heritage. The objectives of the survey were:
- to measure and evaluate the use of electronic publications and the
willingness of teachers to publish online;
- to learn the behaviour and expectations of humanities teachers
about preservation, copyright, peer review and Open Access;
- to understand whether the characteristics of the different disciplines
could affect the use of Open Access archives;
- to verify which types of document Faculty teachers would most like
to store.
The response rate was 53%, with the History Department having the highest response rate and the Environment Studies Department the lowest.
The results showed a high percentage of teachers willing to deposit their publications in DSpace (66%), with differences related to specialisation, qualifications (researcher, professor, assistant), Department and preferred type of publication. Expectations of DSpace are high, both as a support for research publications and for learning and teaching. However, many teachers need skills and training to use the deposit system, and most (80%) want assurances of security and stability for the documents stored. Research reports and articles (80%) are the preferred types of document to store.
The survey was useful for defining the criteria and policy of the DSpace organisation, in particular the role of support staff in facilitating promotion, submission and editing of documents.
ELIZABETH BREAKSTONE, HEATHER BRISTON
and CAROL HIXSON
Expanding the Focus of the IR: Scholars' Bank at the University of
Oregon
DSpace installations that have targeted scholarly, academic materials have been having a difficult time acquiring content. A variety of strategies for marketing institutional repositories to faculty have been discussed in the literature and at conferences. There are many DSpace sites at campuses around the world that have only one or two hundred items, even after two or more years of work. The same was true of the University of Oregon's site, Scholars' Bank, until recently. In the past six months, however, the University of Oregon's contributions to Scholars' Bank have increased 255% over what had been submitted in the previous 18 months. Hits against the archive have increased 224% in the same time period, with searchers coming in from all over the world. The presentation will discuss problems with the assumptions inherent in the original model and focus on how the University of Oregon turned the corner.
JON FERGUSON
Digital Libraries and Evidence in the Developing World Context
"Health Information for All" has been asserted as a prerequisite for meeting the Millennium Development Goals [1]. Much work is focussed on lowering the barriers of access to published information by free access journals, publications and through collaborative web-based publication. Can open-source digital-asset stores such as DSpace provide an effective way to enhance this initiative - especially within the context of poorer African countries where internet connectivity is not yet that reliable?
The Initiative for Maternal Mortality Programme Assessment (IMMPACT) is directly involved in evaluating maternal mortality interventions in 3 developing countries: Burkina Faso, Ghana and Indonesia. As part of this work we are building an evidence base of direct and indirect causes affecting the maternal mortality rate (MMR). This will become a platform to enable our researchers in 6 countries to reason about and discover links between different data, papers and analyses. DSpace is providing an excellent basis for this project due to its low cost, out-of-the-box deployment and ease of enhancement. Thus it has the potential both to serve as a digital library within the project and among country partners, and to enable us to investigate the use of web ontologies for studying and sharing knowledge about this domain.
[1] Godlee, F., Pakenham-Walsh, N., Ncayiyana, D., Cohen, B., Packer, A. (2004)
"Can we achieve health information for all by 2015?" The Lancet 364: 295-300.
NATHAN D. SARR
The Tools of UR Research
The University of Rochester is currently engaged in several projects which it feels will add great value to the DSpace project and its community. These projects include Researcher Pages, Statistics Display and the Checksum tool, which is being jointly developed by Cambridge, MIT, and the University of Rochester.
Each DSpace enhancement was developed out of research with users with specific objectives in mind. Researcher Pages were created to expand faculty interest, enhance their experience, and showcase their research in our DSpace system with the added result of increasing the size of our digital archive. Statistics Display was added to provide real-time, up-to-date information about DSpace usage as well as help evaluate the value it is providing to an institution and its community. The Checksum tool was developed to allow, at least at some level, assurance that we would be able to uphold our promise of digital preservation over time.
These enhancements are considered an invaluable part of our DSpace installation and each supports the others. The Researcher Pages garner support and interest from faculty, while the Statistics Display proves to our researchers and community the value of archiving their information. The checksum tool provides the necessary sense of security that any digital corruption or problems can be isolated and resolved before it is too late and data is lost forever. These enhancements put together provide a very compelling reason to use DSpace by the Community on a regular basis not just for data access but also for personal preservation and promotion.
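The idea behind a checksum tool can be sketched as follows: record a digest for each bitstream at ingest, recompute it later, and report any mismatch. This is a minimal illustration, not the tool developed by Cambridge, MIT and Rochester; the choice of SHA-256, the function names and the in-memory "store" are assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest for a bitstream (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def record_checksums(bitstreams: dict) -> dict:
    """Compute the digests to store when the bitstreams are ingested."""
    return {name: checksum(data) for name, data in bitstreams.items()}


def find_corrupted(bitstreams: dict, recorded: dict) -> list:
    """List bitstreams whose current digest differs from the recorded one."""
    return sorted(
        name
        for name, data in bitstreams.items()
        if checksum(data) != recorded.get(name)
    )


# Hypothetical in-memory store standing in for the asset store.
store = {"thesis.pdf": b"original bytes", "scan.tif": b"image bytes"}
recorded = record_checksums(store)
store["scan.tif"] = b"image bytez"  # simulate silent corruption
corrupted = find_corrupted(store, recorded)
```

Run periodically, a check like this isolates a damaged file while an intact copy may still exist in a backup.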
This presentation will include a demonstration of each DSpace enhancement with a discussion of the goals, technical decisions, and trade-offs resulting from the choices made. We will also briefly review the user research that led to the development of these tools. Following this we would like to talk about the future of DSpace and development goals at the University of Rochester.
RICHARD JONES
Incorporating local developments to DSpace
Increasingly, community developments are becoming available to augment the existing DSpace functionality, and it is important that these developments both fit alongside the core system and have the potential to be incorporated into the main distribution. Open source development can be very successful if community developments are encouraged and easily added. Here we use the Tapir e-theses management tools developed at Edinburgh University Library as an example of third-party development and of how those developments can be integrated, in whole or in part, into the DSpace core. This includes: the design decisions that need to be made when first developing the additional software; determining the applicability of parts of the software for the main codebase, to aid a consistent direction of development; and finally the patch creation process, with the additional administrative code requirements that are not necessarily present in the original code.
This should provide a good introduction to developing for DSpace for those already doing so, or those preparing to do so.
ROBERT TANSLEY, SHEN XUKUN, QI YUE
The China Digital Museum Project
The China Digital Museum Project is a collaborative project between the Chinese Ministry of Education, Hewlett-Packard Company and several Chinese universities, including Beijing Normal University and Beihang University.
Many universities in China have one or more museums. In order to improve access to the artefacts in these museums, these universities are undertaking the digitisation of those artefacts. The principal aim of this project is to provide these universities with infrastructure based on DSpace to store, manage, preserve and disseminate the digitised versions of the artefacts. In the final phase of the project, there will be around 100 university museums with digital artefacts stored in federated DSpace installations.
In fulfilling this requirement, we need to address many problems associated with distributed digital asset management, including persistent identifiers, metadata standardisation, deployment processes and management, and content and metadata replication.
Another challenge that this project faces is how to consolidate the very diverse digital assets each university museum manages so that users can navigate the entire collection by subject. Hence we aim to create "virtual museums", formed by arrangements of digital assets by subject, regardless of physical location.
In this presentation we will describe the project and our current progress.
WILLIAM REILLY
CWSpace: Archiving MIT OpenCourseWare in DSpace
Charged with archiving all of MIT's OpenCourseWare content, the CWSpace project has been exploring new kinds of development on the DSpace platform in two key directions: packaging metadata and protocols. This talk reports on progress towards the support of new, standards-based ingest (and dissemination) functionality (e.g., IMS Content Packaging), as well as in the area of new "lightweight network interfaces" capabilities (e.g., Web Services protocols). Each of these development efforts is designed to support the needs of a new kind of content type that is coming from a non-traditional domain for institutional repository tools, namely, "courseware," or teaching and learning materials. A profile of the IMS Content Package for MIT's OpenCourseWare content has been developed, with hopes of wider applicability to learning management systems. Extending that profile to other packaging standards (e.g. METS) is discussed, especially in light of a general METS for DSpace SIP (Submission Information Package). Current investigation into technology options for Web Services (e.g. SOAP/WSDL, RESTful, WebDAV) is discussed, particularly in light of needs for packaging (aggregating) DSpace objects (Items, Bitstreams) to support the requirements that teaching and learning materials have for a relatively high degree of flexibility in their management and dissemination.
MATTHEW COCKERILL
Open Repository: Building a hosted repository service on DSpace
BioMed Central is using its experience as an open access publisher to provide a hosted repository service, Open Repository, that is built on top of DSpace. The goal of Open Repository is to allow organizations to operate a robust, flexible and customizable digital repository service, based on an industry standard platform, without needing a large investment in IT infrastructure and staffing. To make Open Repository possible, BioMed Central has modified DSpace code to allow for efficient hosting of multiple repositories on a single server; we plan to contribute these changes back to the DSpace project. We are also extending DSpace with modules offering extra functionality for Open Repository users, including enhanced access statistics, automatic population of repositories with open access articles, and dynamic rendering of structured XML articles.