Abstract

Is there room in the market for two recently-published books covering the topic of information retrieval? The answer has to be an unhesitating ‘yes’, for these titles collectively provide an overview and deeper insights into a complex field that many students and practitioners find difficult, if not impenetrable. It is a fast-changing field, where innovation and change are a natural consequence of experiment and better understanding of the interplay between information-seeking behaviour, the technology, search procedure and matching algorithms. Hence, there is a need for regular state-of-the-art reviews. Both texts provide more: their editors and contributors have also provided a deeper perspective covering the origins and history of development and practical applications. These, then, are suitable for students interested in pursuing research in the field, for the seasoned researcher wishing to refresh memory and for the user who is interested in understanding what is going on inside the ‘engine’. If there is one common criticism, it is that neither text provides much of a view of the future – but, perhaps, the editors consider that speculation would be unhelpful and that documentation of the field is a sufficiently exciting aim.
Were one to be recommending to students a course of reading, the starting-point would certainly be the Foreword of Interactive Information Seeking, Behaviour and Retrieval (IISBR), written by Tefko Saracevic: the history of research in the field is briefly outlined. But the emphasis is on explaining that the core of interest in the information retrieval domain is information behaviour, the dissemination of research and scholarly communication – not just the technology. A more detailed study of the development of interactive information retrieval is given in the first chapter, by Colleen Cool and Nicholas Belkin. This documents the difficulties surrounding evaluation of system performance and highlights the pioneers in the field and the systems they built. The chapter also highlights the emergence of interest in the behavioural aspects of information seeking and search behaviour.
This aspect is neatly complemented by David Bawden’s introductory chapter ‘Encountering on the road to Serendip?’ in Innovations in Information Retrieval (IIR): he pursues an historical theme but from the perspective of development of an understanding of browsing in modern information systems. While earlier researchers had focused on purposive information-seeking, Bawden reminds us that the Web environment is dominated by less structured methods and that, whilst we might professionally abhor such casual approaches, they nevertheless represent the reality of how many people use information systems. In Chapter 2 of IISBR, Peiling Wang provides a useful summary of the models of information-seeking behaviour that have been developed over the last 40 years: this would be a good starting-point for appreciating the overall concepts, but further reading will be necessary if the underlying theory of each is to be understood.
The topic of classification and whether it still has importance is of perennial interest. ‘Classification revisited: A web of knowledge’, by Aida Slavic (Chapter 2 of IIR), surveys the theory and the various systems that are commonly used and considers the application of classification within a Web environment, concluding with the tantalising comment, attributed to Soergel, that ‘we have never actually seen a proper implementation of knowledge classification in a modern information retrieval system’ (p. 45). Perhaps this will prompt further research to answer the question.
Continuing to pursue the study of information behaviour, Elaine Toms reviews the newest approaches to purposive searching in ‘Task-based information retrieval’ (Chapter 3 of IISBR). Models of the search process dominate the early part, leading to a discussion of the nature and design of searches. This approach has important links with the teaching of the search process in information literacy, query negotiation and the facilitation of searching by an intermediary.
Raya Fidel, in Chapter 4 of IISBR, considers ‘Approaches to investigating information interaction and behaviour’ and reminds us that the field has a research history stretching back to the 1930s. The various types of research framework and methods of data collection are identified, leading to a summary critique. For the student this will serve as a convenient starting-point, providing an overview of an often-complex field. In Chapter 7 of the same work, Kalervo Järvelin discusses methods of evaluation of information retrieval systems, considering approaches based upon test collections, and those based upon user feedback. Critically, the point is made that, once one moves beyond the control that can be exercised in the laboratory, the range of possible approaches increases significantly and the difficulty of interpreting the results becomes much greater.
‘Information representation’ is the next chapter in IISBR, written by Mark Smucker. The significance of subject analysis and, in particular, the Cranfield experiments, is discussed: it is useful for the beginner to be reminded of these early insights if only to counter the view that ‘nothing happened before the web’. Later, Smucker discusses automatic indexing approaches with a summary of how some of the popular techniques are applied to web pages.
In Chapter 6 of IISBR (‘Access models’), Edie Rasmussen considers information retrieval systems from the user perspective, reminding us that ‘information retrieval involves a number of compromises’ (p. 95) and, also, that search engines may be subject to manipulation by ‘black hat’ approaches that seriously distort the validity of the results. This is not a topic that is often mentioned in the classroom and it is refreshing to balance the elegance of theory with the often disappointing results of searching. Rasmussen leads the reader through the various approaches to search algorithm development, using a necessary minimum of mathematical concepts to explain each: a well-written chapter that should be accessible to even the non-numerate reader.
It is often asserted that users want a Google-like view of all search engines: Max Wilson, in Chapter 8 of IISBR, discusses the types of interface available, providing a ‘framework for thinking about the elements’. This is accomplished, firstly, by examining the Google interface and generalising its features; secondly, by a short account of the history of development of interfaces for online systems; and, finally, by an examination of the features typically to be found in modern search engine interfaces. This should raise critical awareness of the importance of the interface design and prompt interest in the alternatives available and under development, such as 3D displays.
To complement this, Ryen White follows on with a chapter discussing interaction and underlining the importance that such interaction plays in every step of the query process, display and evaluation of results. The role of relevance feedback is identified and various approaches described. Following on, techniques for query formulation and enhancing decisions in searching are presented. The latter seem to offer possibilities of significant developments in search system utility and it would be useful if, in future editions, these sections could be expanded.
IIR and IISBR, in their later sections, both focus on applications, though from somewhat different orientations. IIR considers the problems of locating fiction in a most interesting chapter by Anat Vernitski and Pauline Rafferty: interesting, because the approach is unexpected and the problems not necessarily widely understood or, even, recognised. The approaches to classification are, of themselves, a valuable insight and Vernitski’s work on intertextuality-orientated classification, together with her comments on the potential of Web 2.0 approaches for affective dimension indexing, should prompt further research into this field.
Music is another field that presents unique challenges, well explored by Charles Inskip in Chapter 4 of IIR. He explains that identifying music is complex because of the many aspects that can contribute to its character and that: ‘We need to examine organized sound more deeply if we are to determine fruitful paths to follow that will lead to successful retrieval approaches’ (p. 71). Thereafter, the schemes available are briefly described, before the chapter considers the range of query types that users of music may bring to the task of retrieval.
Both texts focus, also, on the special issues that are raised by web retrieval. Jaime Teevan and Susan Dumais consider ‘Web retrieval, ranking and personalization’ in Chapter 10 of IISBR, making the opening point that the vast scale of types of web content makes ‘Web retrieval … different from other types of information retrieval’ (p. 189). The focus in the chapter is on the difficulties of ranking and interaction: the conclusion is that new approaches to evaluation are needed. Following this, David Nichols and Michael Twidale consider the social relationship aspects in Chapter 11 (‘Recommendation, collaboration and social search’), reminding the reader that the relationship amongst users of information is a critical component of this social process. Their skilful use of a case study based upon features of the Amazon.com service highlights the potential of collaboration to attract new readers and inform about the value of a potential purchase. Furthermore, they develop these ideas to demonstrate that the potential exists in other milieux. Isabella Peters in her chapter of IIR, ‘Folksonomies, social tagging and information retrieval’, takes this idea further by exploring the link between information retrieval and how knowledge is structured and the role of tagging in refining relevance ranking. She concludes that: ‘It is through the users [my emphasis] that the traditional information retrieval of libraries and the internet is being transformed into “Information Retrieval 2.0”’ (p. 107).
Audio, video and image resources now form a major part of most library collections and, through the Web, a much larger corpus of information in these forms is available. Haiming Liu, Suzanne Little and Stefan Rüger, in Chapter 12 of IISBR (‘Multimedia: Behaviour, interfaces and interaction’), provide an overview of user interaction models and highlight the nature of search behaviour, including the ‘foraging’ theory. In the final chapter, Little and Rüger are joined by Evan Brown to discuss the complementary topic of ‘Multimedia: Information representation and access’. The two chapters present a fascinating insight into the complexity of multimedia indexing and search capacity, focusing on the theory and approaches to content-based retrieval from various media types.
Back to IIR: the penultimate chapter, written by Richard Kopak, Luanne Freund and Heather L O’Brien, considers ‘Digital information interaction as semantic navigation’. This takes the reader to the forefront of research into ‘immersive information systems’ that provide an ‘experience’ rather than a ‘transaction’ (p. 117), leading to learning. The chapter conveys the conceptual difficulties well and does not fail to excite the imagination about the potential that this might offer, especially as the profession is also exploring its role in relation to the user of information.
The final chapter of IIR is provided by Mike Thelwall: in a relatively short account the use of webometrics as a tool for assessing web search engine performance is discussed, concluding with a summary of studies of search engine bias. This last should be required reading for anyone developing information literacy materials or advising users on methods of searching. Thelwall also reminds us that the various search algorithms utilised by search engines ‘are so complex that they give quite strange results in some respects’ (p. 143). Perhaps, then, our professional ‘value-added’ is still in assisting the user to understand, interpret and improve the results of searches.
Both books are supported by copious references: those of IIR are collected at chapter ends, whilst IISBR provides a consolidated list. The former arrangement is preferred by this reviewer because it places the list closer to the source of citation. Both have indexes but that of IISBR is just over five pages for 290 pages of text, whereas IIR provides 10 pages for 145 pages. Bearing in mind that both texts are complex, longer indexes for each would have been welcome. Overall, the production standard of both books is high, the layout being attractive and the texts accessible.
