Abstract
Like libraries and librarianship in general, the Deutsche Nationalbibliothek (DNB) has in recent years been confronted with technology-driven changes in the information environment. These changes pose a considerable challenge to the mandate of the DNB and to the way it is fulfilled. To cite one important example: how will the DNB deal with a collection mandate extended to digital publications of all kinds, including the obligation to record these publications and make them available for current and future generations? How does it manage to remain a highly visible lighthouse in the seas of data and information? This will not happen by accident, but will be the result of careful planning, determined action, and continuous monitoring, based on a clear strategy and a systematic approach, and it means that processes and functions will have to be revised, terminated or newly established. The DNB has always considered itself an innovative institution – the need to be open to recurring innovation and to initiate such developments has become more and more urgent. The DNB therefore started a strategic process in 2013 to respond to this requirement. This process is new to the institution and its members and demands a lot of learning and preparation. A major first step was the definition of strategic goals for the years 2013 to 2016, complemented by a project for organizational development. These goals help the DNB to focus and serve as a guideline for prioritizing projects and tasks. Examples of strategic priorities are a substantial increase in the collection of digital and web resources, the development and implementation of automated cataloguing processes, stepped-up digitization efforts, and the construction of an infrastructure for long-term preservation of digital content.
However, there are other areas to be attended to and other challenges to be met – the strategic process and the organizational development are, for the DNB, tools to continuously follow up on innovation. This article addresses the topic from two directions: on the one hand, we describe the process development at the DNB; on the other hand, we name examples and working areas that might be relevant to successfully mastering the future.
Keywords
Introduction
The information environment, how researchers work and exchange information, how people get the information they need, how information is created, distributed and shared on platforms for collaborative work, the characteristics of information objects – all this has changed dramatically in recent years, and the pace of change will remain fast. Traditional publication workflows are supplemented or replaced by enhanced formats, objects, and manifestations. In terms of collecting and preserving web resources (which in itself is a huge challenge for cultural heritage institutions like national libraries), dynamic publications like news websites, online magazines, etc., pose additional problems. Another kind of dynamism poses an even greater challenge: the content of a book, a journal article, a sound recording, etc., becomes dynamic in the sense that it is no longer finished and static, but subject to continuous change by various contributors.
All this highlights some fundamental changes and requirements that (national) libraries have to consider. A closer look at the developments confronting the German National Library (DNB) makes this more concrete. An amendment of the legal deposit law in 2006 1 extended the collection mandate to all German publications published in Germany in digital form, both text and music. As a result of this amendment, the number of items acquired daily rose step by step from approximately 2,500 to 3,850 (both physical and digital), an increase of about 54 percent, whereas the number of staff members did not increase. Consequently, the DNB has had to look for alternative ways of processing the objects depending on their format. New technical workflows and equipment have to be developed or implemented. Ways and methods have to be found to motivate the staff to become part of this development.
The new challenges include the obligation to record digital publications and to make them available for current and future generations on the one hand, and the requirement to continue traditional workflows for traditional material like printed books and the demand for rule-based, intellectually improved metadata on the other. The transformation associated with these changes has implications for library users, too. They do not expect the national library to collect everything that appears on the net, but at least a significant amount that reflects the national culture. In essence, national libraries like the DNB have to rethink their strategy, their workflows, and their perspective on users.
This is to be contrasted with the fact that libraries are generally confronted with frozen or even reduced budgets, so that they are forced to do more with less. Additional pressure comes from the rapid pace of technological change, which forces investment, especially in the field of information technology. But that is not all – the processes cultural heritage institutions work with and the standards they are based on are subject to change, too. To be visible to the public and to provide their services, libraries must deliver their products (data and related services) in a more modern and future-oriented way, so investment in new services is needed. One example is that libraries have to make their data web-compatible by transforming it into RDF (Resource Description Framework, see http://semanticweb.org/wiki/RDF), the language of the Semantic Web. The enormous effort of turning formerly manual and intellectual data processing into automated workflows becomes an important driver for change.
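To make the idea of web-compatible data more concrete, the following minimal sketch shows how a single bibliographic record might be expressed as RDF triples in the N-Triples syntax. The record, its field names and the `http://example.org/` identifier are invented for illustration; the Dublin Core Terms properties are a common choice but are only assumed here and are not taken from the DNB's actual data model.

```python
# Illustrative sketch only: the record, its field names and the example.org
# URI are hypothetical and not taken from the DNB's data model.
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core Terms namespace


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(subject_uri: str, record: dict) -> list:
    """Turn a flat record into one N-Triples statement per field."""
    triples = []
    for field_name, value in record.items():
        predicate = f"<{DCTERMS}{field_name}>"
        literal = f'"{escape_literal(str(value))}"'
        triples.append(f"<{subject_uri}> {predicate} {literal} .")
    return triples


# Hypothetical record used only to exercise the function.
record = {"title": "An Example Book", "creator": "Doe, Jane", "issued": "2014"}
lines = record_to_ntriples("http://example.org/record/1", record)
for line in lines:
    print(line)
```

Each printed line is a self-contained statement (subject, predicate, object), which is what allows such data to be merged with data from other sources on the web.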
All these elements require innovation in libraries. They need to renew their approach and they need to act quickly. Very often, the need for new methods and processes is accompanied by the re-organization of existing workflows, which is a much more complex process. With this in mind, the German National Library decided to start a systematic approach to reorientation, comprising a more focused strategy, clearly defined goals, and a climate of change and innovation.
From plan to actions at DNB
Starting points for change do not arise by accident, but as a result of careful planning, determined action, and continuous monitoring, based on a clear strategy. Processes and functions have to be revised, terminated or newly established to become faster or more efficient – both in terms of technical workflows and procedures for decision-making. A general goal of the DNB is to develop and implement methods and structures for future developments and to build the capacity for change and for continuous innovation. In 2012, the DNB started a strategic process to respond to this requirement. This process was new for the organization.
The initial major step in this process was the definition of strategic goals for the years 2013 to 2016. To support the actions and processes needed to reach these strategic goals, the DNB established a complementary project for organizational development in 2014 with the assistance of an external consulting company. The strategic goals help to clarify the ambitions and objectives of the institution. They serve as guidelines for prioritizing projects and tasks. The process of devising the goals was itself an important experience: they were defined by groups of experts in the library, then presented to and discussed with staff members. An important step was an Open Space event held in 2012 in Leipzig with the full staff of the DNB, where everybody could contribute his or her personal knowledge, opinions or fears.
The strategic goals were published online 2 and in print format, and are intended to provide an orientation to the library’s priorities. Thus, the strategic goals support decision-making and help to weigh projects against each other.
The main priorities until 2016 are:
The German National Library should intensify its collection activities and adapt its acquisition processes and instruments to handle new types of publications.
The German National Library should increase the use of automated data acquisition for cataloguing and indexing its holdings.
The German National Library should improve the retrieval and usability of its holdings and the data describing them.
The German National Library should extend the scope of measures aimed at preserving the long-term accessibility of its holdings and continuously optimize the relevant processes.
The German National Library staff should identify themselves with the library’s objectives.
What specific activities or objectives arise from these priorities?
In the field of collection building and handling new types of publications, we are focusing on clearly defined web harvesting activities, increasing the collection of digital monographs to 80 percent of the national output and doubling the number of e-journals collected compared to 2013. In addition to enlarging the digital collection and the collection of so-called ‘grey literature,’ sound recordings and retrospective acquisitions are being stepped up as well.
In terms of cataloguing and record creation, the DNB aims to complement and/or replace its traditional ways of generating metadata for its collections. The process of manual descriptive and subject cataloguing of each individual item is to be supported, accelerated, and enhanced by increased use of third-party data and especially by expanding the use of automated procedures for metadata creation. The DNB started automating certain cataloguing steps for digital monographs in German several years ago. The goal is to continue to develop, supplement, and improve these machine routines and to apply them to other types of publications. However, in order to have exchangeable and widely usable records and metadata, standards and rules should be followed – ideally ones agreed upon globally. The DNB will, therefore, keep up its involvement in the development and deployment of RDA (Resource Description and Access), cooperate with the BIBFRAME initiative (Bibliographic Framework Initiative, see http://www.loc.gov/bibframe/), and work on a modern concept and rule set for subject indexing. Keeping partners and customers informed through adequate communication measures is also an activity derived from this strategic area.
It is essential that all media deposited with the DNB can easily be found in the online catalogue and are accessible either digitally or locally on the premises of the library. This means that the online catalogue should be as easy to use as search engines on the Internet – very intuitive, and successful in retrieving the information requested by the reader. Text and music objects should be digitized as far as possible. For copyright reasons, most of the objects stored at the DNB can only be used in the reading rooms. Copyright-free materials should not only be offered via library catalogues and search engines, but also via virtual shop windows on the website of the DNB so that library users will know of their existence. All of the DNB’s bibliographic metadata should be offered at no cost and under a CC0 (Creative Commons Zero) license to other interested parties. 3
With regard to preservation and conservation of library materials, the German National Library has broad-based experience both with physical media and in ensuring the long-term accessibility of digital resources. It has earned national and international recognition through its innovative contributions to the mass deacidification of printed works, to the preservation of digital media, and as the driving force behind “nestor,” the German network of competence for digital preservation. In nestor, libraries, archives, museums and leading experts work together to ensure the long-term preservation and accessibility of digital sources. Led by the German National Library, nestor is a cooperative association of partners from different fields, all connected in some way with the subject of “digital preservation”. 4 Particularly in the area of physical preservation, the German National Library should extend its national and international profile and reinforce its existing network structures. Another central goal is the integration of processes, which in many cases have hitherto been conducted separately, to create a uniform work structure. Thus, a preservation-oriented curation of physical materials should be comprehensively established. To address the long-term accessibility of digital resources, long-term preservation should become an integral part of the automatic processing workflow for all digital publications included in the collection mandate by 2016. In order to facilitate the cooperative approach, the DNB is pushing forward AREDO, its cooperative long-term digital preservation service directed at partners from the cultural domain 5, which is intended to be marketed externally and is to go into operation by 2014. Another area of interest is the digitization or recopying of at-risk objects, whereby the informational content is transferred to an alternate medium. The goal is to convert a large part of the non-paper materials held on physical data carriers into digital form by 2016.
Lastly, a range of activities is planned to support and strengthen the identification of the staff members with the goals of the institution. This will require a personnel development plan, communication plans and platforms, various measures to strengthen the library’s management staff, implementation of target agreement processes, formulation of basic management principles, and clear assignment of responsibilities: one task → one person.
Most of the actions mentioned here are or will be realized as projects. Therefore, an important component of the process is project management, which provides the framework in which the work takes place. Since such a structure was implemented in 2008, the DNB now has a lot of experience in managing projects professionally. Starting with the initial planning phase, the ‘project organization’ unit is involved in all stages of project work, thus ensuring, for example, that all necessary information technology (IT) resources are provided, all stakeholders are informed, and reports and documentation are written. An essential part of the project workflow is a tiered decision-making process based on clear judgments as to the feasibility and importance of a specific project. Project management is continuously adapted to the requirements of the projects and developed further.
All these activities have to be reflected upon and adapted continuously. It often takes time to see the results of progress and change. Over time the organization becomes more experienced in terms of change management, and new topics are considered and prepared to become part of the library’s service portfolio. What follows is a detailed look at two areas as examples of what the DNB is facing.
Core areas of innovation at the DNB – Automated processes for cataloguing and indexing
As previously mentioned, part of the legal mandate of the German National Library is the creation of records for its collections and the production of the German National Bibliography. Expanding collection activities implies at the same time the necessity to expand cataloguing activities. Although the DNB was granted a number of additional staff positions as a consequence of the expansion of the legal mandate, it very soon became clear that asking for more resources in order to deal with the increasing number of publications would not be an option for the future. Therefore, the library had to look into the processes and methods of metadata generation and try to find alternative ways. In 2009, a decision was made to stop manually cataloguing online resources as of 2010 and, instead, to rely on metadata supplied by authors and publishers. It was also decided to set up a large project to develop methods for automated processing of monographic online publications. 6 This project, called Petrus 7, was conducted from 2009 to 2011, with follow-up projects in 2012-2013. 8
In various scenarios, especially the creation of subject metadata, the creation of links to authority data was taken into consideration. In terms of subject cataloguing, two of the scenarios addressed were the automated assignment of Dewey Decimal Classification (DDC) subject groups 9 and of subject headings taken from the same controlled vocabulary as used in intellectual indexing. One of the conditions of the project was that a system or software available on the market should be used to provide solutions and that homegrown software should be avoided. Therefore, the first two years of Petrus were dedicated to a market scan and thorough testing of several systems. After a public invitation to tender, the DNB decided to acquire and license the Averbis Extraction Platform, a system developed by the Averbis company located in Freiburg, Germany. The software consists of a classifying and an indexing component. The classifier (used for the assignment of DDC subject groups) has to be trained using intellectually classified material, whereas the indexing component comprises a range of software tools for linguistic analysis of texts and a dictionary providing the vocabulary to be assigned to documents.
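The role of the dictionary in the indexing component can be illustrated with a deliberately simplified sketch of dictionary-based indexing. It is not the Averbis Extraction Platform and does not reflect its API; the dictionary entries, the function name `assign_subject_headings` and the naive tokenization are assumptions made for illustration, and real linguistic analysis is far more elaborate.

```python
import re

# Hypothetical mini-dictionary: surface form (lower case) -> controlled heading.
DICTIONARY = {
    "library": "Libraries",
    "libraries": "Libraries",
    "cataloguing": "Cataloguing",
    "digital preservation": "Digital preservation",
}


def assign_subject_headings(text: str, dictionary: dict) -> list:
    """Return the sorted controlled headings whose surface forms occur in the text."""
    # Naive normalization: lower-case the text and keep only word tokens.
    normalized = " ".join(re.findall(r"\w+", text.lower()))
    padded = f" {normalized} "
    found = set()
    for surface, heading in dictionary.items():
        # Padding with spaces restricts matches to whole words or whole phrases.
        if f" {surface} " in padded:
            found.add(heading)
    return sorted(found)


sample = "Libraries invest in digital preservation and automated cataloguing."
headings = assign_subject_headings(sample, DICTIONARY)
print(headings)
```

Because several surface forms map to one heading, the output stays within the controlled vocabulary regardless of how a concept is worded in the text, which is the property that makes the results comparable with intellectual indexing.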
The other scenarios in the Petrus project were the automatic creation of authority records for authors of online monographs and the identification and linking of parallel print and online manifestations of a work in order to re-use subject information and links to authority records for persons and corporate authors. This usually means that information from the intellectually catalogued print version is transferred to the online version. The software tools for this were developed by the DNB’s IT department.
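The linking of parallel manifestations can be pictured as a matching step followed by a transfer of subject data. The sketch below is an illustration under stated assumptions, not the DNB's software: the record fields, the match key (normalized title, author and year) and the function names are hypothetical.

```python
import re


def match_key(record: dict) -> tuple:
    """Build a crude match key from title, author and year (hypothetical fields)."""

    def norm(text: str) -> str:
        # Lower-case and collapse punctuation and whitespace to single spaces.
        return re.sub(r"\W+", " ", text.lower()).strip()

    return (norm(record["title"]), norm(record["author"]), record["year"])


def transfer_subjects(print_records: list, online_records: list) -> int:
    """Copy subject headings from print records to matching online records.

    Only online records without subjects are enriched; returns their count.
    """
    by_key = {match_key(rec): rec for rec in print_records}
    enriched = 0
    for online in online_records:
        source = by_key.get(match_key(online))
        if source is not None and not online.get("subjects"):
            online["subjects"] = list(source["subjects"])
            online["linked_print_id"] = source["id"]  # keep the link for later review
            enriched += 1
    return enriched


# Invented sample data for demonstration purposes only.
print_records = [
    {"id": "P1", "title": "An Example Book", "author": "Doe, Jane",
     "year": 2012, "subjects": ["Libraries"]},
]
online_records = [
    {"id": "O1", "title": "An example book.", "author": "Doe, Jane", "year": 2012},
    {"id": "O2", "title": "Another Title", "author": "Roe, Richard", "year": 2013},
]
count = transfer_subjects(print_records, online_records)
print(count, online_records[0]["subjects"])
```

Storing the identifier of the print record alongside the copied headings keeps the transfer traceable, so that it can be checked or repeated if the matching rules are refined.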
What has been achieved so far? All of the scenarios mentioned are now in production. Automated assignment of subject groups started in 2013, and by the end of the year, over 95,000 publications had been enriched with a DDC subject group, over 40,000 online publications had been amended with links to authority data, and subject information had been transferred from a parallel print edition. The automated assignment of subject headings is, after several iterations of testing and improving the indexing software, about to start for German university publications. The automated creation of authority records is already being overhauled: too many duplicate authority records for names of persons were produced, hampering manual cataloguing and user searching. This shows that there is no obstacle-free highway to automated cataloguing. But obstacles can be overcome and problems solved with determination and patience.
All these measures have already improved the data quality and the accessibility of online monographs in the DNB’s collection. The advantage of automated cataloguing lies in the fact that it can be repeated on the same publication if improved methods result in better data.
The Petrus project was conducted by a group of staff members belonging to various departments in the DNB, namely the IT and the cataloguing departments. Their participation in the project was on a voluntary basis and meant that they had to give up or reduce their regular tasks. Supervisors and colleagues were often dissatisfied with this, the whole enterprise was regarded with mistrust, and it caused a lot of anxiety among staff. Communicating reasons and objectives over and over again was therefore essential. The course of the project also showed that it was indispensable to have a fixed group of people who were able to focus on such an ambitious project and who were determined to make it a success. It also became clear that work on the automation of cataloguing and metadata creation would eventually require a new organizational unit. Novel and permanent duties arose and had to be handled, such as the continuous monitoring and further development of software routines, quality checking and error analysis, maintenance of the dictionary used for automated indexing, improvement of training data, and construction of new workflows. Therefore, such a unit was established in 2013, comprising some of the members of the former Petrus project group as well as newly hired staff with specific knowledge and experience. Vacant posts were rededicated, meaning that intellectual subject cataloguing had to be reduced. Innovation certainly does not come without cost. The decision to establish a new unit and to reduce intellectual subject cataloguing signaled that there would be no turning back. Thus, automated cataloguing became a strategic goal.
Core areas of innovation at the DNB – Information infrastructure
To fulfill all its tasks and to support all activities the library is focusing on, DNB has to extend its infrastructure systematically. This is not only a question of money, but also of organizational development.
The IT department at the DNB serves both the internal needs of the library and, from an outside point of view, the needs and digital development activities of other libraries on a national and international level. Close cooperation with other departments of the DNB is essential. The public perception of the library depends largely on the quality of its digital services, and the IT department plays an essential, though not exclusive, role in ensuring that quality.
The individual tasks focus on providing an adequate and modern IT infrastructure for DNB staff and users at all locations of the library, including help desk facilities for staff and external partners or users, and on the ongoing operation and expansion of IT-based library applications and work processes that support the core workflows of the library. Data services are provided to the German National Bibliography for the development, operation and expansion of the object and metadata transfer processes, the acquisition of third-party data, and the matching of data from various sources with the objective of merging them. IT staff participate in the development and maintenance of metadata standards, exchange formats and interface technology, in the operation and extension of a persistent identifier system for national and international users in the cultural heritage domain, and in the operation and extension of a variety of Linked Open Data services. To comply with the new legal mandate, the library has to build, operate and develop an infrastructure that ensures the preservation of digital resources, and to carry out research and development activities in the field of digital preservation, followed by the conceptual and operational implementation of mass methods for transferring at-risk data of digital and analog objects to a secure, long-term storage and preservation environment. The automatic processing of objects requires the development and adaptation of automatic information extraction and indexing tools in cooperation with the cataloguing departments. In addition, methods for named entity recognition are being evaluated to automate the analysis and evaluation of full-text material.
Additional resources will be required to make the changes necessary to comply with the legal mandate. It is important to look continuously for ways to reduce the cost of routine activities in all functional areas. At the same time, appropriate workflows and technical tools have to be implemented to meet rising expectations regarding the quality and stability of the DNB’s services. In addition, other measures to be considered include the technical consolidation of unified platforms for projects and equipment development, the definition and provision of development, testing, approval and operating environments (staging) in a process-based framework, the extensive use of virtualization technologies, increased use of external service providers and outsourcing, new forms of organization, flexibility and adaptation of existing structures in the area of software development and quality assessment, and the introduction of process management (based on the Information Technology Infrastructure Library (ITIL) 10) for the core functions of IT.
A very important area of innovation is the process of transferring digital objects and related metadata from the publisher’s or producer’s side to the DNB. This process, which includes the validation, indexing and post-processing of these objects, has been automated as far as possible. Today, the DNB provides three interfaces for submission of digital publications to institutions, publishers, aggregators or service-providers – a web form for single objects and two automated methods, one to push and one to pull objects. Regardless of how data are ingested, the subsequent processing takes place in an integrated fully automated environment. 11
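The point that all three submission interfaces feed one common processing chain can be sketched as follows. Everything here is hypothetical: the channel names mirror the three interfaces described above, but the `Submission` type, the stage functions and the validation rule are placeholders rather than the DNB's actual interfaces.

```python
from dataclasses import dataclass, field

# The three submission routes named in the text; the identifiers are assumptions.
CHANNELS = {"web_form", "push", "pull"}


@dataclass
class Submission:
    channel: str
    metadata: dict
    payload: bytes
    log: list = field(default_factory=list)


def validate(sub: Submission) -> Submission:
    """Reject unknown channels, empty files and records without a title."""
    if sub.channel not in CHANNELS:
        raise ValueError(f"unknown channel: {sub.channel}")
    if not sub.payload or "title" not in sub.metadata:
        raise ValueError("submission needs a payload and a title")
    sub.log.append("validated")
    return sub


def index(sub: Submission) -> Submission:
    """Placeholder indexing step: derive simple index terms from the title."""
    sub.metadata["index_terms"] = sorted(set(sub.metadata["title"].lower().split()))
    sub.log.append("indexed")
    return sub


def post_process(sub: Submission) -> Submission:
    """Placeholder post-processing step: record the size of the object."""
    sub.metadata["size_bytes"] = len(sub.payload)
    sub.log.append("post-processed")
    return sub


def ingest(sub: Submission) -> Submission:
    """Run the same stages in the same order, whatever the channel."""
    for stage in (validate, index, post_process):
        sub = stage(sub)
    return sub


results = [
    ingest(Submission(channel, {"title": "An Example Book"}, b"%PDF-"))
    for channel in sorted(CHANNELS)
]
for result in results:
    print(result.channel, result.log)
```

Keeping the channel as a mere attribute of the submission means that adding a further interface would not require any change to the processing stages themselves.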
Finally, some steps towards a more systematic approach regarding the collection of online publications were taken – a dedicated, inter-departmental task force was set up for the job. The work was focused on collaborating with institutions, not on collecting individual objects. Aggregators and service-providers became involved and this group has been extended to software vendors who provide publishers with management systems.
Outlook
This article aims to give an idea of the extent to which new digital challenges and tasks can cause changes in all parts of a library, especially in national libraries, which cannot abandon their traditional tasks completely and whose only alternative is to rethink the methods by which they fulfill them. Changes in the way national libraries fulfill their tasks have to be based on clear strategic priorities and looked at from three points of view – technical aspects, organizational consequences, and staff development. None of these views should be neglected. Technical aspects are the basis for developing new methods and tools, and organizational measures have to be taken to gain the necessary resources and implement new tasks within the institution. Nothing can be done without the library staff, who need to acquire the necessary new skills and knowledge. They should also have a clear picture of the new processes and of how and where their own position fits into the new scheme.
The German National Library has carried out many useful individual activities in order to meet the challenges resulting from the extension of the legal mandate. What is required now is to bring all these activities together and fit them into an overall view of the future. The experience gained from this process can be used again when new challenges come up – and they will, for when have libraries ever had periods without change and new challenges within the last century?
