Abstract
The problem of intermediates in the fossil record has been frequently discussed ever since Darwin. The extent of ‘gaps’ (missing transitional stages) has been used to argue against gradual evolution from a common ancestor. Traditionally, gaps have often been explained by the improbability of fossilization and the discontinuous selection of found fossils. Here we take an analytical approach and demonstrate why, under certain sampling conditions, we may not expect intermediates to be found. Using a simple null model, we show mathematically that the question of whether a taxon sampled from some time in the past is likely to be morphologically intermediate to other samples (dated earlier and later) depends on the shape and dimensions of the underlying phylogenetic tree that connects the taxa, and the times from which the fossils are sampled.
Introduction
Since Darwin's book On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life [2], there has been much debate about the evidence for continuous evolution from a universal common ancestor. Initially, Darwin only assumed the relatedness of the majority of species, not of all of them; later, however, he came to the view that because of the similarities of all existing species, there could only be one ‘root’ and one ‘tree of life’ (cf. [11]). All species are descended from this common ancestor and indications for their gradual evolution have been sought in the fossil record ever since. Usually, the improbability of fossilization or of finding existing fossils was put forward as the standard answer to the question of why there are so many ‘gaps’ in the fossil record. Such gaps have become popularly referred to as ‘missing links’, i.e. missing intermediates between taxa existing either today or as fossils.
Of course, the existence of gaps is in some sense inevitable: every new link gives rise to two new gaps, since evolution is generally a continuous process whereas fossil discovery will always remain discontinuous. Moreover, a patchy fossil record is not necessarily evidence against evolution from a common ancestor through a continuous series of intermediates. Indeed, in a recent approach, Elliott Sober (cf. [11]) applied simple probabilistic arguments to conclude that the existence of some intermediates provides stronger support for evolution than the non-existence of any (or some) intermediates could ever provide for a hypothesis of separate ancestry. In addition, some lineages appear to be densely sampled, whereas for others only a few fossiliferous horizons are known (cf. [10]). This problem has been well investigated and statistical models have been developed to address it (see e.g. [6, 7, 12]).
In this paper, we suggest a further argument that may help explain missing links in the fossil record. Suppose that three fossils can be dated back to three different times. Can we really expect that a fossil from the intermediate time will appear (morphologically) to be an ‘intermediate’ of the other two fossils? We will explore this question via a simple stochastic model.
In order to develop this model, we first state some assumptions we will make throughout this paper: firstly, we will consider that we are sampling fossil taxa of closely related organisms and which differ in a number of morphological characteristics. We assume this group of taxa has evolved in a ‘tree-like’ fashion from some common ancestor; that is, there is an underlying phylogenetic tree, and the taxa are sampled from points on the branches of this tree.
It is also necessary to say how morphological divergence might be related to time, as this is important for deciding whether a taxon is an intermediate or not. In this paper, we make the simplifying assumption that, within the limited group of taxa under consideration (and over the limited time period being considered), the expected degree of morphological divergence between two taxa is proportional to the total amount of evolutionary history separating those two taxa. This evolutionary history is simply the time obtained by adding together the two time periods from the most recent common ancestor of the two taxa until the times from which each was sampled (in the case where one taxon is ancestral to the other, this is simply the time between the two samples). This assumption on morphological diversity would be valid (in expectation) if we view morphological distance as being proportional to the number of discrete characters that two species differ on, provided that two conditions hold: (i) each character has a constant rate of character state change (substitution) over the time frame T that the fossils are sampled from, and (ii) T is short enough that the probability of a reverse or convergent change at any given character is low. We require these conditions to hold in the proofs of the following results. We will discuss other possible relations of morphological diversification and distance towards the end of this paper.
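As a compact restatement of the distance described above, the evolutionary history separating two sampled taxa a and b can be written as follows. The symbols s_a and s_b (sampling times) and t_{ab} (time of the most recent common ancestor) are notation introduced only for this sketch and do not appear elsewhere in the paper.

```latex
d(a,b) = (s_a - t_{ab}) + (s_b - t_{ab}),
\qquad\text{and}\qquad
d(a,b) = |s_b - s_a| \quad \text{if one taxon is ancestral to the other.}
```

The second expression is the special case of the first in which the most recent common ancestor coincides with the earlier of the two samples.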
The simplest scenario is the case where the three samples all lie on the same lineage, so that the evolutionary tree can be regarded as a path (cf. Fig. 1). In this case, the path distance (and hence expected morphological distance) between the outer two fossils is always larger than the distance that either of them has from the fossil sampled from an intermediate time. But for samples that straddle bifurcations in a tree, it is quite easy to imagine how this intermediacy could fail; for example, if the two outer taxa lie on one branch of the tree and the fossil from the intermediate time lies on another branch far away (cf. Fig. 2). But this example might be unlikely to occur, and indeed we will see that if sampling is uniform across the tree at any given time, in expectation the morphological distances remain intermediate even for this case (cf. Fig. 2). Yet for more complex trees, this expected outcome can fail, and perhaps most surprisingly, the distance between the earliest and latest sample can, in expectation, be the smallest of the three distances in certain extreme cases.

When the tree consists of only one lineage from which samples are taken at times T1, T2 and T3, then clearly the distance d1,3 is always larger than d1,2 and d2,3. Consequently, E1,3 > max{E1,2, E2,3}.
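For the single-lineage case, the inequality can be checked directly, since path lengths along one lineage are simply time differences:

```latex
d_{1,3} = T_3 - T_1 = (T_2 - T_1) + (T_3 - T_2) = d_{1,2} + d_{2,3} > \max\{d_{1,2},\, d_{2,3}\}.
```

With only one lineage there is nothing to average over, so each expected distance equals the corresponding path length.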

For samples taken from different lineages of a tree, the distance d1,3 of one particular sample from time T1 to the one of T3 can be smaller than the distance of either of them to the sample taken at time T2. Yet in expectation we always have E1,3 > max{E1,2, E2,3} for two-branch trees. For more complex trees this can fail, as we show in Example 2.7.
Thus, in order to make general statements, we will consider the expected degree of relatedness of fossils sampled randomly from given times. Our results will depend solely on the tree shape (including branch lengths) of the underlying tree and the chosen times.
Results
We begin with some notation. Throughout this paper, we assume a rooted binary phylogenetic tree to be given with an associated time scale 0 < T1 < T2 < T3. The number of Ti-lineages (i.e. lineages extant at time Ti) is denoted by ni. For instance, in Figure 3, the number n1 of T1-lineages is 3, whereas the numbers n2 and n3 of T2- and T3-lineages are both 5. If not stated otherwise, extinction may occur in the tree. Every bifurcation in the tree is denoted by bi, where b0 is the root. Note that in a tree without extinction, the total number of bifurcations up to time T3 (including the root) is n3 − 1. For every bi, let ti denote the time at which bifurcation bi occurs. We may assume that the root is at time t0 = 0.

A rooted binary phylogenetic tree with three times T1, T2, T3 at which taxa have been sampled. The dotted branches refer to taxa that do not contribute to the expected distances from one of these times to another and thus are not taken into account. On the other hand, bifurcation b2 at time t2 shows that extinction may have an impact on the expected values. Such branches have to be considered.
Now, for every bi, we make the following definitions:
It can be seen that bifurcations for which at least one branch of offspring dies out in the same interval where the bifurcation lies always have
Example 2.1
Consider the tree given in Figure 3. Here, the values
In the sampling, select uniformly at random one of the Ti-lineages as well as one of the Tj-lineages to get the expected length Ei,j of the path connecting a lineage at time Ti with one at time Tj in the underlying phylogenetic tree. Then, the expectation that a fossil from the intermediate time T2 will also be an intermediate taxon of two taxa taken from T1 and T3, respectively, corresponds to the assumption that E1,3 > max{E1,2, E2,3}. We will show in the following lemma that this last inequality can fail and describe the precise condition for this to occur. Moreover, we later show that E1,3 can be strictly smaller than both E1,2 and E2,3; that is, the temporally most distant samples can, on average, be more similar to each other than the temporally intermediate sample is to either of the two.
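Written out, the uniform sampling described above amounts to an average over all pairs of lineages. In the sketch below, L_i denotes the set of Ti-lineages (so that |L_i| = n_i) and d(a,b) the path length between a lineage a at time Ti and a lineage b at time Tj; both symbols are notation assumed here for illustration.

```latex
E_{i,j} = \frac{1}{n_i\, n_j} \sum_{a \in L_i} \sum_{b \in L_j} d(a,b)
```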
Note that if
In order to simplify the statement of our results, for all bifurcations bi set
Lemma 2.2
Given a rooted binary phylogenetic tree with times 0 < T1 < T2 < T3 and the root at time t0 = 0. Then, E1,3 ≤ E1,2 if and only if
In the above bracket, the three summands refer to different paths from time T1 to time T3. The first summand belongs to those paths that go directly from T1 to T3 and thus have length T3 − T1. There are n3 such paths, as every T3-lineage has an ancestor at time T1. The second summand sums up all paths going along one of the bifurcations bi for i ≠ 0. For every i, there are by definition exactly
As there are altogether n1n3 different paths from T1 to T3 in the tree, we have:
Corollary 2.3
For a given tree there exist times 0 < T1 < T2 < T3 such that E1,3 ≤ E1,2 if and only if
Proof. If,
Corollary 2.4
If either (i) n1 = 2 or (ii) no extinction occurs in the tree and n2 = n3, then E1,3 > E1,2.
Proof. (i) Note that if n1 = 2, obviously only one bifurcation, say bî (for some î such that 0 ≤ tî < T1), contributes to the number n1 of lineages at time T1; all the branches added by additional bifurcations become extinct before T1. Thus:
Therefore,
Thus, by Corollary 2.3, E1,3 > E1,2.
Lemma 2.2 essentially states that the expected degree of relatedness from taxa of time T1 to taxa of time T3 can be larger than the one to taxa of time T2, but it requires the distance from T2 to T3 to be “small enough”. Whether such a solution is feasible can be checked via Corollary 2.3. Lemma 2.2 shows already how the role of intermediates depends on the times the fossils are taken from. Corollary 2.4(i) on the other hand shows how the tree itself has an impact on the expected values: if the tree shape (including branch lengths) is such that at time T1 only two taxa exist, then the just mentioned scenario cannot happen as the condition of Corollary 2.3 is not fulfilled.
However, we can prove an even stronger result: not only is E1,3 < E1,2 possible, but E1,3 < min{E1,2, E2,3} can also be obtained for a suitable choice of times T1, T2, T3. For this, we need the following lemma.
Lemma 2.5
Given a rooted binary phylogenetic tree with times 0 < T1 < T2 < T3 and the root at time t0 = 0. Then E1,3 ≤ E2,3 if and only if
Proof. As in the proof of Lemma 2.2, we have
Thus, E1,3 ≤ E2,3 if and only if
With the help of the two lemmas we can now state the following theorem.
Theorem 2.6
Given a rooted binary phylogenetic tree with times 0 < T1 < T2 < T3 and the root at time 0. Then, E1,3 ≤ min {E1,2, E2,3} if and only if the following two conditions hold.
Proof. The theorem follows directly from Lemmas 2.2 and 2.5.
The following example demonstrates the influence of times 0 < T1 < T2 < T3 according to the above theorem.
Example 2.7
Consider again Figure 3.
(1) Assume t1 = 15, T1 = 100, t2 = 107, t3 = 109, T2 = 110, T3 = 130. Then, E1,2 = 137.33, E2,3 = 155.28 and E1,3 = 155.33. Hence, for this choice of times, we have E1,3 > max{E1,2, E2,3}.
(2) Consider the same times as in the previous case, but choose T2 = 129 instead of T2 = 110. This moves T2 further away from T1 and closer to T3. This change is enough to give completely different expected values: E1,2 = 156.33, E2,3 = 166.68 and E1,3 = 155.33. Hence, for this choice of times, we have E1,3 < min{E1,2, E2,3}.
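The values in this example can be checked numerically. The sketch below is not the authors' computation: it encodes a hypothetical tree whose shape was chosen to be consistent with the description of Figure 3 and with the times given above. The branch names, the time 129.5 used for the last bifurcation and for the extinction of one branch, and all function names are assumptions made purely for illustration.

```python
from itertools import product

INF = float("inf")

# Hypothetical tree: branch name -> (parent branch, start time, end time).
# A branch ends at a bifurcation or at an extinction; INF means still extant.
# The value 129.5 is an assumed placeholder, not a time taken from the paper.
TREE = {
    "X":    (None, 0, 15),        # root b0 at time 0 splits into X and Y
    "Y":    (None, 0, 107),
    "X1":   ("X", 15, 109),       # b1 at t1 = 15
    "X2":   ("X", 15, INF),
    "Y1":   ("Y", 107, INF),      # b2 at t2 = 107
    "Y2":   ("Y", 107, 129.5),    # this branch goes extinct (assumed time)
    "X1a":  ("X1", 109, 129.5),   # b3 at t3 = 109
    "X1b":  ("X1", 109, INF),
    "X1a1": ("X1a", 129.5, INF),  # a later bifurcation (assumed time)
    "X1a2": ("X1a", 129.5, INF),
}


def lineages_at(time):
    """Return the branches that are extant at the given time."""
    return [name for name, (_, start, end) in TREE.items() if start < time <= end]


def ancestors(name):
    """Return the branch itself followed by its ancestors, up to the root."""
    chain = []
    while name is not None:
        chain.append(name)
        name = TREE[name][0]
    return chain


def path_length(a, time_a, b, time_b):
    """Path length between lineage a at time_a and lineage b at time_b.

    Assumes time_a <= time_b.
    """
    chain_b = ancestors(b)
    if a in chain_b:
        # a is ancestral to (or the same branch as) b: the path is direct.
        return time_b - time_a
    shared = [name for name in ancestors(a) if name in chain_b]
    # The most recent shared branch ends at the bifurcation separating a and b;
    # with no shared branch the two lineages meet at the root (time 0).
    t_mrca = TREE[shared[0]][2] if shared else 0
    return (time_a - t_mrca) + (time_b - t_mrca)


def expected_distance(time_i, time_j):
    """Average path length over all pairs of lineages, sampled uniformly."""
    pairs = list(product(lineages_at(time_i), lineages_at(time_j)))
    return sum(path_length(a, time_i, b, time_j) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    T1, T3 = 100, 130
    for T2 in (110, 129):
        e12 = expected_distance(T1, T2)
        e23 = expected_distance(T2, T3)
        e13 = expected_distance(T1, T3)
        print(f"T2={T2}: E1,2={e12:.2f}  E2,3={e23:.2f}  E1,3={e13:.2f}")
```

Under these assumptions, the two scenarios give E1,2 ≈ 137.33, E2,3 ≈ 155.28, E1,3 ≈ 155.33 for T2 = 110, and E1,2 ≈ 156.33, E2,3 ≈ 166.68, E1,3 ≈ 155.33 for T2 = 129, in agreement with the values reported in the example.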
Discussion
The analysis of the fossil record provides an insight into the history of species and thus into evolutionary processes. Stochastic models can provide a useful way to infer patterns of diversification, and they form a useful link between molecular phylogenetics and paleontology [8]. Such models would greatly benefit from incorporation of potential fossil ancestors and other extinct data points to infer patterns of evolution. In this paper we have applied a simple model-based phylogenetic approach to study the expected degree of similarity between fossil taxa sampled at intermediate times.
‘Gaps’ in the fossil record are problematic [10] as they can be interpreted as ‘missing links’. Therefore, numerous studies concerning the adequacy of the fossil record have been conducted (see, for example, [3, 9, 13]), and it is frequently found that even the available fossil record is still incompletely understood. This is particularly true for ancestor-descendant relationships (see, for instance, [4, 5]). For example, Foote [5] reported that the probability that a preserved and recorded species has at least one descendant species that is also preserved and recorded is on the order of 1%–10%. This number is much higher than the number of identified ancestor-descendant pairs. Thus, it remains an important challenge to recognize such pairs [1]. This is also essential with regard to ancestor-intermediate-descendant triplets, as it is possible that there are in fact fewer ‘gaps’ than currently assumed, i.e. that intermediates are present but not yet recognized. Such issues have an important bearing on any conclusions our results might imply concerning the testing of hypotheses of continuous morphological evolution, or concerning the shape of the underlying evolutionary tree based on the non-existence of certain intermediates.
Another challenge is to investigate different phylogenetic models for describing the expected degree of morphological separation between different fossil taxa sampled at different times. Our findings strongly depend on the assumption that morphological diversification is proportional to the distance in the underlying phylogenetic tree. This is justified if morphological difference is proportional to the number of differing discrete characters, if each of these characters changes at a constant rate over the time period of sampling, and if homoplasy is rare. This last assumption requires the rate of character change to be sufficiently small in relation to the time period of the sampling, because the appearance of reverse or convergent character states will lead to a more concave (rather than linear) relationship between morphological divergence and path distance. A similar concave relationship might be expected for continuous morphological evolution as described by neutral Brownian motion.
Thus, the impact of different assumptions on the role of intermediates could be further investigated. But even if we assume that diversification is proportional to time, there may be other ways to measure ‘distance’ that could be usefully explored. For instance, one could define the distance between two taxa to be the maximum (rather than the sum) of the two divergence times of the taxa back to their most recent common ancestor. This definition of distance allows the degree of relatedness to be higher for taxa in the same clade than for other taxa. In this case, there exist analogous results to Lemmas 2.2 and 2.5 (results not shown), but the formulae are somewhat different, particularly for Lemma 2.5.
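The alternative distance mentioned above can be stated compactly. As a sketch, with s_a and s_b denoting the sampling times of two taxa a and b and t_{ab} the time of their most recent common ancestor (notation assumed here for illustration only):

```latex
d_{\max}(a,b) = \max\{\, s_a - t_{ab},\; s_b - t_{ab} \,\}
```

This replaces the sum of the two divergence times used throughout the paper by their maximum.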
Acknowledgement
We would like to thank Elliott Sober for bringing the mathematical aspects of intermediates in the fossil record to our attention, and for helpful comments. We also thank Matt Philips, David Penny and two anonymous reviewers for some helpful suggestions.
