Abstract

The Fourth Erich L. Lehmann Symposium on Optimality was held from 9 to 12 May 2011 in the School of Engineering on the campus of Rice University. As in the three previous Lehmann Symposia, one of the goals of the fourth symposium was to gather some of the top researchers in theoretical statistics to discuss, showcase and encourage technical developments in optimality in statistics.
A collection of plenary sessions, several invited technical sessions and a session for young investigators were masterfully brought together by the scientific committee. Graduate students, postdocs, young and senior faculty, and professionals from the financial, medical and energy industries close to the Rice campus benefited from discussions on theoretical issues of current interest. Central to many discussions was the interest in high-dimensional data, which are by now ubiquitous in many modern problems in science.
Presentations given at the Symposium have been posted and can be downloaded from the Symposium webpage
Video presentations from the second and third Lehmann Symposia are publicly available at the sites
The conference was the fourth in a series of symposia. The idea of the Symposium originated in 2000–2001 through an informal discussion on the future of statistics between the writer and Victor Pérez-Abreu, then Director of CIMAT. It was observed that in the last few years, a vigorous increase in the applications of statistics to cutting-edge scientific problems has guided and facilitated many successful advances in basic science as well as in other realms of application.
As a consequence of this success, professional meetings and to some extent professional journals have become forums for many of these applications. The attention to applied problems is well deserved, and statistics and statisticians must play an important role in solving the myriad of scientific problems that arise in the various disciplines.
Nevertheless, the significant benevolent neglect of theoretical developments, and in particular of Optimality theory, raises concerns regarding the impact that such a trend can have for Statistics as a discipline. The NSF-ASA report ‘Statistics: Challenges and Opportunities for the Twenty-First Century’, August 2004, edited by B. Lindsay, J. Kettenring and D. Siegmund, provides an interesting discussion on the role that the ‘
Following the model of the famous and now defunct Berkeley Symposia, we decided to initiate a set of Symposia whose purpose is to catalyse theoretical research in statistics. Although probability has not had a strong presence in the Lehmann Symposia, it is our goal to increase the number of probability sessions and participants in future Symposia.
Erich L. Lehmann was thought to be an excellent person to honour through the Symposia because of his many professional contributions and his almost legendary concern for his graduate students and young members of the profession. In addition, Erich had first-hand experience with the influential Berkeley Symposia.
The goal of the Symposia is to examine the role that Optimality can play, or should play, in modern statistics. Due to the advent of high-throughput data collection technology and the parallel development of computing power to analyse such data, it often happens that statistical theory gives way to raw computing power. Although most of the exciting new computational/statistical methodologies have provided tools to make headway in many important scientific problems, a need to generalise and systematise this knowledge is now quite evident. The Symposia bring together a group of experts to discuss cutting-edge research on optimality in the context of modern statistical methodologies. It is believed that, although much progress has taken place in areas such as data visualisation, data mining and knowledge discovery, among others, these subjects are ripe for the development of an optimality paradigm that allows for objective comparisons of methodologies. This new paradigm, although still to be defined, is necessary to push the research frontiers in these important areas.
The conference showcased new developments by leading researchers in an environment conducive to the development of new human resources, and an opening session showcased the work of young investigators. With the substantial contributions that statistics continues to make to the analyses of massive high-dimensional data arising in the biomedical sciences, national security, reliability of urban infrastructures, atmospheric sciences, etc., the need to synthesise this knowledge to analyse such data more efficiently and effectively has come to the forefront of the discipline.
Current statistical efforts, for example, leading to a better understanding of the stochastic behaviour of the power grid, should help in the creation of an intelligent grid that can better respond to changes in the grid’s status and thus avert cascading failures that currently cost on the order of $104 billion annually in the United States alone. The symposium provided a forum to highlight the exciting and impactful theoretical work that is being developed to better understand the behaviour of these complex systems.
Sir David Cox, Erich L. Lehmann and Juliet P. Shaffer during the second Lehmann Symposium, May 2004
Rojo (2006a) provides a brief history of the Lehmann Symposia. In addition, Rojo and Pérez-Abreu (2004) and Rojo (2006b, 2009) record some of the works presented during the first three Lehmann Symposia.
The first three Erich L. Lehmann Symposia in 2002, 2004 and 2007 had the pleasure and honour of having Erich L. Lehmann open the proceedings by giving the initial lecture. Sadly, Erich passed away at the age of 91 on 12 September 2009, two months short of his 92nd birthday, and his presence was sorely missed during the fourth Lehmann Symposium.
This special issue of Statistical Modelling: An International Journal is dedicated with deep gratitude to Erich L. Lehmann’s memory. The issue contains eight papers based on work presented during the fourth Lehmann Symposium. All articles have been reviewed by at least two referees.
The first paper by Javier Rojo provides a personal account of, and reminisces about, Erich L. Lehmann from 1978, when the author first met E. Lehmann, until Lehmann’s death in 2009.
The second paper by Kjell Doksum revisits the estimators of Hodges and Lehmann (1963), shows that these estimators may be thought of as maximising a rank likelihood, and discusses several of their optimality properties. The R software recently developed by Kloke and McKean (2012) will help in making some of these procedures mainstream.
Lehmann (1990) compared the views of Fisher and Neyman on the issue of model selection. The third paper, by Chen-Pin Wang and Booil Jo, develops a score based on the Kullback-Leibler divergence to compare models. They propose computing the divergence between the proposed (or reference) model and the model that maximises the likelihood within the null class of models. They measure the divergence by its posterior (or predictive) expectation given the data, and discuss asymptotic properties of the procedures.
The ideas of group families, and the companion concepts of equivariance and invariance, permeate much of Erich’s work (see, e.g., Lehmann, 1983, 1986). In the fourth paper, Megan M. Romer and Donald St P. Richards, in the context of two-step incomplete multivariate normal data, develop an exact stochastic representation of Hotelling’s T² statistic that allows for the construction of exact level confidence ellipsoids for the mean vector. Ideas of invariance are used to show that Hotelling’s T² statistic is invariant under affine transformations.
The fifth paper by Jeffrey S. Simonoff, motivated by longitudinal and clustered data, proposes the use of regression trees to develop goodness-of-fit tests for detecting departures from the linear mixed model structure. Simulation studies suggest that the proposed methodologies are conservative and have good power to detect various model violations.
The sixth paper by L. Tenorio, C. Lucero, V. Ball and L. Horesh discusses optimal experimental designs in the context of ill-posed linear inverse problems. The authors propose a criterion based on minimising the mean squared error of an affine estimator of the inversion parameters. As the implementation of the methodologies requires numerical solutions, the paper also discusses ways of accelerating the necessary algorithms.
K. Rister and S. N. Lahiri propose first-order bias corrections for use when predicting a spatial process that is a nonlinear transformation of a stationary Gaussian process. The proposed bootstrap methods are shown to produce asymptotically unbiased predictions under certain conditions.
Finally, Charles Lewis and Dorothy T. Thayer discuss the issue of whether certain aspects of optimal multiple testing procedures are undesirable given that they violate the precept that they should be more conservative than individual tests. They use the random effects analysis of variance to illustrate their concerns.
I would like to thank Juliet P. Shaffer for providing valuable information and support. I also want to thank the editors Jeffrey S. Simonoff, Brian Marx and Herwig Friedl for agreeing to take on this project. Jeffrey Simonoff deserves special mention since he provided the avenue for this project when other journals balked at the idea.
Financial support for the fourth Lehmann Symposium was provided by the National Science Foundation, Pfizer Pharmaceutical Company, MD Anderson Cancer Center and the University of Texas Health Science Center at Houston. The efforts by Demissie Alemayehu (Pfizer), Valen Johnson (MD Anderson Cancer Center) and Barbara Tilley (UT Health Science Center) to acquire financial support from their institutions for the fourth Lehmann Symposium are especially appreciated.
