The article is devoted to the theory of the design of experiments. It introduces a formal definition of factorial models and factorial designs. On this basis, it builds the mathematical foundations of the factorial design of experiments. The presented concept supports many important aspects of experimental design including the main one: the construction of the optimal designs.
The purpose of this article is to present a collection of results on the mathematical foundations of the factorial design of experiments obtained by the author. These results were published by the author in various editions in Russian (under the name V.Z. Brodsky) and were not available to a wide range of researchers.
In this article, all essential issues of the design of factorial experiments are considered from a single perspective. Achieving this goal required surmounting certain difficulties, primarily concerning the foundations of the theory. Hundreds of publications contain the words “factorial designs” in their titles, yet not all authors use the same definition of this concept. When I was writing my first paper on the subject, I could not find a meaningful definition of factorial designs in statistical publications. I still do not see one now, as I submit this paper.
Raghavarao (1971), in his marvelous book “Construction and Combinatorial Problems in Design of Experiments”, says that the factorial design of experiments occurs when different combinations of the factors at various levels influence a character under study. However, any multidimensional design includes different combinations of factors. How, then, do factorial designs differ from others? Apparently, Raghavarao excludes the one-dimensional case from his definition, though he does not state this outright. In many books, the multidimensional condition is explicitly included in the definition of factorial designs. For example, this is what Mukerjee and Wu (2006) do in their brilliant book “A Modern Theory of Factorial Designs”. They define a factorial experiment as an experiment involving at least two factors, each of which appears at a given number of levels. However, it is unclear from this definition how exactly the factorial design of experiments differs from any other multidimensional design. Given this definition, any multidimensional experiment has to be considered factorial. Therefore, such a definition is not very productive. The same can be said about the definition of a general factorial design (a fractional factorial design) as a part of the full factorial design; an implicit definition of this kind is used by Cheng (2013) in his wonderful book “Theory of factorial design”.
Authors of numerous works in different fields of the design of experiments are unlikely to agree with any of these approaches to the definition of factorial designs. Will authors of articles on, say, polynomial designs (including, in particular, rotatable designs) consider that they do research on factorial designs? I do not think so. In reality, none of them has ever used such a concept as “rotatable factorial designs”. No one would consider, for example, the rotatable design of second order in two variables to be a fractional factorial design $5^2$ (even though it consists of treatment combinations of two five-level factors).
What can then be said about the designs constructed numerically and satisfying, say, the criterion of $D$-optimality for different models and design spaces, as Fedorov (1972) did, developing the ideas of Kiefer and Wolfowitz (1960)? Will researchers and the readers of these papers consider such designs “$D$-optimal factorial designs” only because they are multidimensional? Of course not.
What can we say about the authors of books on the design of experiments? Do they effectively use the definition of factorial designs as multidimensional designs? Do they use such a definition, for example, when they think about the structure of their books? No, no one follows such a definition. On the contrary, authors of books on the general problem of the design of experiments (including the Raghavarao book quoted above) consider factorial designs separately from the sections devoted to other types of designs, for example, rotatable designs. On the other hand, books on factorial designs (including the Mukerjee and Wu book mentioned above) do not contain, say, a section devoted to rotatable designs. This means that the authors of these books do not follow the definition of factorial designs as multidimensional ones. They structure their books based on an intuitive understanding of what factorial designs are. And this intuitive understanding has nothing to do with multidimensionality.
So, one of my goals was to introduce the concept of factorial designs and factorial models in a way that would reflect this intuitive understanding. The most important part of it, however, was not only to introduce definitions that would not frighten those who do research in the area of the design of experiments. My main goal was to make the introduced concept productive. And it seems to me that I managed to do it. In its final version, the concept supports many important aspects of experimental design, including the effectiveness of statistical inferences and construction of the designs.
A few more words about terminology. A number of authors have started using the term “regular” to refer to designs generated from finite geometries. At the beginning of the seventies, when my first works on the issue were published, I used the term regular factorial designs for designs that had certain statistical properties. The designs generated from finite geometries I called geometric. In this article, I will continue to follow these definitions.
Factorial models and designs
Factorial design
Consider observations and variables with values corresponding to the -th observation . Assume that a mathematical expectation of is the following function of and parameters :
where is a vector of unknown parameters; is a vector of given functions.
The variable is said to be quantitative if all are numbers. The variable is said to be qualitative if at least one of the values is a symbol (even if it is represented by a number). This definition of quantitative and qualitative variables should not be regarded as strict. Rather, it can be regarded as an explanation of which model (for quantitative or qualitative factors) will be considered. In other words, quantitative variables are those for which the model for quantitative variables is considered; qualitative variables are those for which the model for qualitative variables is considered.
Each of the different values of the variable in the design matrix (or just design) is called a level. The number of different levels of the variable is denoted by . We will set up a correspondence between symbols and different levels of the variable regardless of whether the variable is quantitative or qualitative. In this case we actually deal with the factor (qualitative or quantitative) and its levels . Then the design matrix can be rewritten as
where the columns correspond to the factors, and the rows correspond to the treatment combinations (or treatments, runs) of design ; is the value of the factor in the -th treatment combination.
A design with runs for factors with levels respectively will be denoted by (or just ).
It is clear that the maximum number of different rows in the design matrix is equal to .
Definition 1.1.1. A design that consists of different rows is called a full design. A design that does not contain at least one of combinations of levels is called a fractional design.
We do not assume that all treatment combinations of a design are distinct; a design may contain identical treatment combinations.
Definition 1.1.2. A design is called symmetrical if all factors have the same number of levels. A design is called uniform if, for any given factor, all of its levels appear equally often in the design.
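Definitions 1.1.1 and 1.1.2 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: levels are coded as integers starting from 0, a design is a list of rows, and the helper names (`full_design`, `is_full`, `is_symmetrical`, `is_uniform`) are hypothetical and do not come from the original text. Since identical treatment combinations are allowed, `is_full` here only checks that every level combination occurs at least once.

```python
from collections import Counter
from itertools import product

def full_design(levels):
    """All level combinations for factors with the given numbers of levels."""
    return [tuple(row) for row in product(*(range(s) for s in levels))]

def is_full(design, levels):
    """True if every level combination occurs at least once in the design."""
    return set(full_design(levels)) <= set(design)

def is_symmetrical(levels):
    """True if all factors have the same number of levels."""
    return len(set(levels)) == 1

def is_uniform(design):
    """True if, for each factor, all of its levels occur equally often."""
    for column in zip(*design):
        if len(set(Counter(column).values())) != 1:
            return False
    return True

levels = (2, 3)                      # a two-level and a three-level factor
full = full_design(levels)           # 2 * 3 = 6 distinct rows
fraction = [(0, 0), (0, 1), (1, 1), (1, 2)]  # two combinations are missing
```

In this example `full` is a full, uniform, non-symmetrical design, while `fraction` is a fractional design that is not uniform, because the middle level of the second factor occurs twice and the other two levels once each.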
A design will be called factorial only with respect to a specific type of a model for which the design is considered (Brodsky, 1975). The types of factorial models will be listed below.
Factorial model for quantitative factors
Assume that in the design , all factors (with levels respectively) are quantitative. Then consider the following model:
In the model Eq. (1), the following notations and assumptions are used. is an observation that depends on . contains terms with products are constants, . The functions are linearly independent at points , i.e., Rg for any , where
The functions can be polynomials in of degree . In particular, for any the functions can be the Chebyshev orthogonal polynomials at points . In this case, the columns of the matrix are pairwise orthogonal. The corresponding model is called the Chebyshev model.
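The orthogonality of the columns in the Chebyshev model can be checked numerically. The sketch below is an assumption-laden illustration rather than the construction used in the cited works: it obtains polynomial columns that are orthogonal on a finite set of points by Gram-Schmidt orthogonalization of the monomials, which agrees with the orthogonal polynomials on those points only up to scaling. The function names are hypothetical.

```python
from fractions import Fraction

def dot(a, b):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def orthogonal_polynomial_columns(points):
    """Gram-Schmidt on 1, x, x^2, ... evaluated at the given distinct points.

    Column number d holds the values of a polynomial of degree d that is
    orthogonal, over the points, to all polynomials of lower degree.
    """
    columns = []
    for degree in range(len(points)):
        col = [Fraction(x) ** degree for x in points]
        for prev in columns:
            coef = dot(col, prev) / dot(prev, prev)
            col = [c - coef * p for c, p in zip(col, prev)]
        columns.append(col)
    return columns

# Three equally spaced levels of one quantitative factor (assumed values).
columns = orthogonal_polynomial_columns([0, 1, 2])
```

For the three points 0, 1, 2 the linear column is (-1, 0, 1) and the quadratic column is proportional to (1, -2, 1); together with the column of ones they are pairwise orthogonal.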
Definition 1.2.1. The model Eq. (1) is called a full factorial model for quantitative factors (or an -model) for the factorial design if contains all possible terms with the products .
Definition 1.2.2. A set of factors pairs of factors , triples of factors , etc., is called a factorial set if the following requirements are satisfied: if , then for all and
Definition 1.2.3. The model for the factorial design
is called a factorial model for quantitative factors for the factorial set (or an -model) if the following requirements are satisfied: if the model Eq. (4) includes the term for some set of then the model includes all terms for all (by definition, ).
It is evident that Definition 1.2.3 is consistent with Definition 1.2.2.
An -model Eq. (4) is a general model for quantitative factors. An -model, for example, is a special case of an -model.
Main effects and interaction effects
In an -dimensional Euclidean space we set up a correspondence between the -th coordinate of each vector and the -th treatment combination of the design .
Definition 1.3.1. A nonzero vector is defined to be a contrast if
Definition 1.3.2. The vector of the main effect of the factor of the design is a contrast with equal coordinates for the same levels of the factor in the design . The vector of the main effect is also called the vector of the interaction effect of order 0.
The definition of the vector of the interaction effect of -th order is based on the definition of the vector of the interaction effect of -th order.
Definition 1.3.3. The vector of the interaction effect of -th order (or the vector of the -factorial interaction effect) of the factors of the design is a contrast with equal coordinates for the same combinations of levels of the factors in the design , orthogonal to all vectors of interaction effects up to (-th order of the factors
We may omit the word “vector” in the above two definitions.
A linear combination of several interaction effects of -th order of factors is, obviously, an interaction effect of -th order of the same factors or the zero vector. Therefore, the set of all interaction effects of -th order of factors, together with the zero vector, is a linear subspace of the space .
Definition 1.3.4. The number of degrees of freedom carried by interaction effects of -th order for the design is the dimension of the corresponding linear subspace.
It is evident that the number of degrees of freedom carried by main effects of the factor for any design is equal to .
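The count of degrees of freedom for main effects can be verified on a small example. The following Python sketch assumes that a contrast is a nonzero vector whose coordinates sum to zero (that is, a vector orthogonal to the unit vector) and builds main-effect vectors from centered level indicators. This is one convenient choice of basis, not necessarily the one used elsewhere in the article, and all names are hypothetical.

```python
from fractions import Fraction

def main_effect_vectors(column):
    """Contrasts that are constant on each level of one factor.

    For every level except the first, take the level indicator and subtract
    its mean, so that the coordinates of the vector sum to zero.
    """
    n = len(column)
    levels = sorted(set(column))
    vectors = []
    for level in levels[1:]:
        share = Fraction(column.count(level), n)
        vectors.append([Fraction(int(x == level)) - share for x in column])
    return vectors

def rank(vectors):
    """Rank of a list of row vectors by exact Gaussian elimination."""
    rows = [list(v) for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

column = [0, 0, 1, 2, 2, 2]   # hypothetical three-level factor in six runs
effects = main_effect_vectors(column)
```

Here the factor has three levels with unequal frequencies, and the two vectors in `effects` are linearly independent contrasts, so the dimension of the subspace of main effects is two, one less than the number of levels.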
The requirement of orthogonality of the -th order interaction effects to all interaction effects up to -th order of the same factors is obviously equivalent to the requirement of orthogonality to a maximal linearly independent subset of the corresponding interaction effects.
Definition 1.3.5. The matrix composed of the maximum subset of independent vectors of main effects of the factor is called a matrix of main effects of the factor . The matrix composed of the maximum subset of independent vectors of interaction effects of the factors and is called a matrix of interaction effects of the factors and , etc.
Introduce the following notation:
where is a unit vector (with the elements equal to 1).
We will assume that any matrix in is normalized in such a way that the sum of squares of the elements of any of its columns is equal to . In the matrix , for each subset of identical rows, we delete all but one row and add a left column consisting of ones. Denote the resulting matrix by .
Matrices can be used as matrices Eq. (104) for the -model Eq. (103), since
The main theorem for full design
Definition 1.4.1. For two vectors and introduce operation called multiplication, such that product
Let columns of the ()-matrix be , and columns of the ()-matrix be . Then, by definition,
Theorem 1.4.1. For a full factorial design, any interaction effect of a set of factors is orthogonal to any interaction effect of another set of factors, and the number of columns of the matrix is equal to .
Proof. Consider any two rows of the matrix of the full design . It can be shown that for these two rows, there exists a column corresponding to some factor such that the factor has different levels in the two selected rows. Without loss of generality, it can be assumed that the first two rows and the last column are considered. Select the columns in the matrices to make them pairwise orthogonal. It is evident that in the full design, all levels of the given factor occur equally often. Therefore, the columns of the matrix will be pairwise orthogonal. Besides, for any and
Define . The number of columns of the matrix equals
Define by the following recurrence relation:
The number of the columns of the matrix and the number of the columns of the matrix are connected by the following obvious relation:
Consider the matrix . The number of the columns of the matrix , by Eqs (8) and (9), equals . Hence, is a square matrix.
We will prove that the selected two first rows of the matrix are orthogonal. Let the first two rows of the matrix be and . Then the first two rows of the matrix are
Their scalar product
Hence, any two rows of the matrix are orthogonal.
To prove that any two columns of the matrix are orthogonal, we need to show that the sum of squares of the elements of any row of the matrix is constant:
where is the element in the -th row and the -th column of the matrix .
Matrices of the form , included in the matrix , contain columns. For each of these columns, its elements are equal for all treatments of the design with the same combinations of levels of the factors . By what we have already proved, each of these columns is orthogonal to all other columns. Hence, it can be proved by induction that these columns are interaction effects of the factors .
The number of different combinations of levels of the factors equals . All vectors of main effects and interaction effects of these factors belong to an -dimensional subspace () of an -dimensional space , because the elements of each vector of main effects or interaction effects are equal for all treatments with the same combinations of levels of the factors and all vectors of these effects are orthogonal to the unit vector.
Since
the number of linearly independent -factorial interaction effects of the factors may not exceed
By using matrices with pairwise orthogonal columns, we get sets of orthogonal interaction effects. It can be shown that by using linearly independent (not necessarily pairwise orthogonal) main effects of the matrix , we get linearly independent interaction effects. To prove that, consider the following lemma.
Lemma 1.4.1. Let , and be matrices of size , and respectively and
where is a nonsingular square matrix of order . Then the matrices and are related by a nonsingular linear transformation.
Proof Let be the -th column of the matrix . Then, by Eq. (12),
Therefore, for any , the matrices and are related by a nonsingular linear transformation. This proves the lemma.
Matrices and for any are related by a nonsingular linear transformation. Therefore, using Lemma 1.4.1 repeatedly, we get that is related by a nonsingular linear transformation with . Hence, consists of linearly independent interaction effects of the factors .
This completes the proof of Theorem 1.4.1.
Definition 1.4.2. A set of linearly independent interaction effects of the factors is called full if the number of those effects of the set is given by Eq. (11).
Note 1 to Theorem 1.4.1. The proof of Theorem 1.4.1 gives us a method of construction of interaction effects for the design as a product of main effects of the factors. By using a full set of orthogonal main effects of the factors, we get a full set of orthogonal interaction effects. By using a full set of linearly independent main effects of the factors, we get a full set of linearly independent interaction effects.
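The construction described in Note 1 can be tried on a small full design. The Python sketch below assumes that the multiplication of Definition 1.4.1 is the coordinate-wise product of two vectors; the design, the chosen main effects, and all names are hypothetical, and the columns are left unnormalized for readability.

```python
from itertools import product

def dot(a, b):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def times(a, b):
    """Coordinate-wise product of two vectors (assumed multiplication)."""
    return [x * y for x, y in zip(a, b)]

def expand(values_by_level, column):
    """Vector over the runs that takes a fixed value on each level."""
    return [values_by_level[level] for level in column]

# Full design for a two-level factor A and a three-level factor B.
design = list(product(range(2), range(3)))
col_a = [row[0] for row in design]
col_b = [row[1] for row in design]

# Orthogonal main effects, specified by their values on the levels.
a1 = expand([-1, 1], col_a)
b1 = expand([-1, 0, 1], col_b)
b2 = expand([1, -2, 1], col_b)

# Interaction effects of A and B as products of main effects.
ab1 = times(a1, b1)
ab2 = times(a1, b2)

unit = [1] * len(design)
columns = [unit, a1, b1, b2, ab1, ab2]
off_diagonal = [dot(columns[i], columns[j])
                for i in range(len(columns)) for j in range(i)]
```

All entries of `off_diagonal` are zero, so the six columns are pairwise orthogonal and form a square matrix for the six runs. The two product columns match the expected (2 - 1)(3 - 1) = 2 degrees of freedom for the interaction of A and B.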
Note 2 to Theorem 1.4.1. If for any , the functions of the -model Eq. (103) are chosen in such a way that
all columns of the matrix Eq. (104) except the first are vectors of main effects of the factor . If, in addition,
then, by the proof of Theorem 1.4.1, a scalar square of any column equals . Therefore, by Theorem 1.4.1 and Note 1 to the Theorem 1.4.1, the coefficient matrix Eq. (105) for a full factorial model is the matrix of main effects and interaction effects of the factors for the design .
A model of true effects for quantitative factors
Hereafter, we will consider the Chebyshev model only if the structure of the design leads to orthogonality of all effects. Otherwise, we will consider the so-called model of true effects for quantitative factors.
Consider a full design with runs for all factors of the design . Define a vector of true values for the design as follows:
To define a vector of true effects for quantitative factors, form the following matrix for the design :
where all matrices have pairwise orthogonal columns (scalar squares of the columns of the matrix equal . Then define
at the points of the design with the factors where is the vector of true effects Eq. (14).
By the definition of the matrix of main effects and Note 2 to Theorem 1.4.1, the model Eqs (15) and (16) is a special case of the -model and, therefore, a special case of the general factorial -model for quantitative factors. We will call Eqs (15) and (16) the -model of true effects.
Denote the parts of the matrix and the vector for the factorial set by and respectively. If elements of the vector that do not correspond to the factorial set equal zero, Eq. (16) will be as follows:
It is evident that the model Eq. (17) is also a special case of the factorial -model. We will call Eq. (17) the -model of true effects. If it does not matter or if it is clear which type of a model for quantitative factors we consider, we will omit the words “true effects”.
The -model Eq. (17) can be extended to a wider domain:
where is the -th row of the matrix .
Full rank theorem
The following three sections are devoted to finding a condition under which it is possible to construct an orthogonal design (Brodsky, 1971).
Let in the design , the number of different combinations of levels of factors
Theorem 1.6.1. I. The condition Eq. (19) is necessary and sufficient for the number of degrees of freedom carried by any -factor interaction effects of factors of to be determined by Eq. (11). II. If Eq. (19) holds, is a matrix of full rank.
Proof. The necessity of the condition Eq. (19) is evident. We show the sufficiency of the condition Eq. (19) and that statement II of the theorem is true.
By the hypothesis of the theorem, the design contains a subset forming a full design for the factors . For the design , we generate matrices of effects up to order of the factors and the matrix
Theorem 1.4.1 implies that for the design , any interaction effect of a given subset of factors is orthogonal to any interaction effect of a different subset of factors. Hence, the columns of the matrix taken one from each matrix of effects are pairwise orthogonal. Therefore, is a matrix of full rank.
For each combination of levels of the factors in the design , select the row that corresponds to this combination in the matrix . Denote the resulting matrix by
where the number of columns of the matrix equals the number of columns of the matrix . It is evident that is also a matrix of full rank.
Now we will construct the matrix
that contains vectors of main effects and interaction effects of the factors of the design , and the vector . The number of columns of the matrix is equal to the number of columns of the matrix . Denote the -th columns of the matrices and by and respectively. The first column of the matrix is the first column of the matrix . We will construct the next columns recurrently. Assume that we have constructed the first independent columns of the matrix in such a way that the -th column of the matrix is a linear combination of the first columns of the matrix . Also assume that any of the columns that belong to the matrix are linearly independent interaction effects of the factors . Then the method of construction of the -th column is the following.
Let the -th column of the matrix be . Then make the following assignment:
where
The first columns of the matrix are independent, therefore is nonsingular, its inverse exists, and
is a nonzero column. is a linear combination of columns of and ; therefore, we get independent columns. The elements of for the same combinations of levels of the factors are equal. It is evident that
Hence, is an interaction effect of the factors .
Therefore, the matrix contains linearly independent columns of main effects, interaction effects of the factors , and .
By Theorem 1.4.1, for the design the numbers of degrees of freedom carried by main effects of the factor and interaction effects of the factors are equal to and respectively. Therefore, each of the matrices and contains independent columns; each of the matrices and contains independent columns. The number of linearly independent columns in equals ; therefore, contains a maximum set of linearly independent main effects of the factor for the design .
Suppose that and columns of the matrices constitute a set of linearly independent vectors. Since a nontrivial linear combination of independent main effects of the factors is a main effect of the factor , there exists a nontrivial linear combination of the vectors , that equals zero (where are vectors of main effects). On the other hand, can be expressed as nontrivial linear combinations of the columns of the matrix . It follows that there exists a nontrivial linear combination of and columns of matrices that equals zero, which is a contradiction. Therefore, and all columns of matrices are linearly independent. By using simple algebraic operations, we can get that the number of degrees of freedom carried by interaction effects of the factors and equals . Therefore, any matrix of interaction effects of first order in contains a maximum set of linearly independent interaction effects of first order.
By using the same type of argument, we can get the following. If any of matrices of linearly independent interaction effects of order in contains a maximum set of vectors, then any of matrices of linearly independent interaction effects of order in contains a maximum set of vectors.
This completes the proof of Theorem 1.6.1.
Consider the matrix and the matrix Eq. (104). It is evident that these matrices are related by a linear nonsingular transformation. It follows that the matrices and are related by a linear nonsingular transformation as well. Therefore, we get a simple corollary.
Corollary to Theorem 1.6.1. If the condition Eq. (19) is satisfied, then is a matrix of full rank; coefficient matrices and of the design for any two -models are related by nonsingular linear transformation.
When considering -factorial interaction effects, we will assume that the condition Eq. (19) is satisfied.
When considering a factorial set as a set of factors and their subsets in accordance with Definition 1.2.2, we will also consider a factorial set as a set of main effects and interaction effects in accordance with the following definition.
Definition 1.6.1. A set of main effects and interaction effects of the factors is called a factorial set if the following condition holds. If an interaction effect of the factors belongs to the set , then the full set of interaction effects of the factors belongs to the set for all and
Definition 1.2.2 is consistent with Definition 1.6.1, because of obvious one-to-one correspondence between subsets of factors and subsets of main effects and interaction effects of these factors.
The condition of proportional frequencies
This section is devoted to the fundamental concept introduced by Plackett (1946): the condition of proportional frequencies.
Let the -th level of the factor occur times and the -th level of the factor occur times in the design . Let the -th level of the factor occur times together with the -th level of the factor . Consider the -matrix . It is evident that
Consider the -dimensional vector and the -matrix . Matrix has rows, including different rows (corresponding to different combinations of levels of the factors ), and columns that are linearly independent.
For each subset of the identical rows of , select only one. For the corresponding elements of vector , calculate their average. Denote the resulting matrix and the column by and respectively.
Any column of main effect is orthogonal to the unit vector. Hence,
Therefore, Theorem 1.7.1 presents a necessary condition of pairwise orthogonality of vectors of main effects (one from each factor). This condition – the condition of proportional frequencies – states that the levels of one factor occur with each of the levels of other factor with proportional frequencies.
Definition 1.7.1. Let be the number of the appearances of the combination of levels of the factors respectively in the design . Then the set of requirements
is called the condition of proportional frequencies for the factors .
Definition 1.7.2. The condition of proportional frequencies Eq. (23) is said to be satisfied for a factorial set if Eq. (23) is satisfied for each group of factors of any two elements of the set .
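For two factors, the condition of proportional frequencies is easy to check by counting. The Python sketch below assumes the usual two-factor form of the condition: the number of runs times the frequency of a pair of levels equals the product of the frequencies of the two levels. The function name and the example designs are hypothetical.

```python
from collections import Counter

def proportional_frequencies(design, i, j):
    """Check N * n_ab == n_a * n_b for all level pairs of factors i and j.

    N is the number of runs, n_a and n_b are the frequencies of level a of
    factor i and level b of factor j, and n_ab is their joint frequency.
    """
    n = len(design)
    count_i = Counter(row[i] for row in design)
    count_j = Counter(row[j] for row in design)
    count_ij = Counter((row[i], row[j]) for row in design)
    return all(n * count_ij[(a, b)] == count_i[a] * count_j[b]
               for a in count_i for b in count_j)

# Level 0 of the first factor occurs twice as often as level 1,
# and it does so within every level of the second factor.
proportional = [(0, 0), (0, 0), (0, 1), (0, 1), (0, 2), (0, 2),
                (1, 0), (1, 1), (1, 2)]

# A fraction in which two level combinations never occur.
not_proportional = [(0, 0), (0, 1), (1, 1), (1, 2)]
```

The first design satisfies the condition although its levels are not equally frequent; the second does not, since some pairs of levels are missing altogether.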
Let the design include all level combinations of the factors and . Then the number of independent first-order interaction effects of these factors will be determined by Eq. (11). In this and only this case, the matrix will be square. Assume that the condition
is satisfied for the factor . As in the proof of Theorem 1.7.1, we get that the levels of the factor occur with each combination of the levels of the factors and with proportional frequencies:
If the condition Eq. (22) is satisfied for the factors and , then, by Theorem 1.7.1,
Therefore
A similar conclusion can be made for any number of factors. For this purpose, we will consider the following partitioning of a set of the factors . The first partition splits the factors into two sets. The second partition splits each of the sets of the first partition (if it contains more than one factor) into two subsets. And so on. The resulting partition of the factors is called a full partition if each subset of the last partition contains only one factor.
Theorem 1.7.2. Suppose that for factors there exists a full partition such that the following condition holds. Any subset of the -th partition is split by the -th partition into two subsets and such that
Then
The combinations of the levels of the factors occur with each combination of the levels of the factors with proportional frequencies;
The condition of proportional frequencies is satisfied for the factors .
Now we are going to prove a sufficiency of the condition Eq. (23) for pairwise orthogonality of main effects and interaction effects (one from each matrix of effects). The proof will follow from the following lemma.
Lemma 1.7.1. Suppose that for the factors and the column ,
Then the sum of the elements of corresponding to any combination of the levels of the factors equals zero.
Proof, and, by Theorem 1.6.1, is a square nonsingular matrix. is orthogonal to all columns of the matrix . Hence, all elements of equal zero, which was to be proved.
It follows from Lemma 1.7.1 that orthogonality of the interaction effect of -th order of the factors to all interaction effects of -th order of these factors implies that the sum of the elements of corresponding to any combination of the levels of any factors of the factors is equal to zero.
Theorem 1.7.3. If the condition of proportional frequencies Eq. (23) is satisfied for factors, all main effects and interaction effects of these factors (one from each set of effects) are pairwise orthogonal.
Proof. Let and be two arbitrary subsets (of and factors respectively) of the set of factors such that the condition of proportional frequencies Eq. (23) is satisfied. Summing up both parts of Eq. (23) for all levels of certain factors, we get that the condition of proportional frequencies Eq. (23) is satisfied for any subset of the factors of the given set of factors. In particular, the condition of proportional frequencies is satisfied for the set of factors belonging to the union . Denote by and arbitrary interaction effects of ()-th and ()-th order respectively of the factors belonging to and . Consider two cases:
The set is empty;
The set is not empty.
is a contrast, therefore, the sum of its elements equals zero. The condition of proportional frequencies is satisfied for the factors of the set . Therefore, the combinations of the levels of the factors from the set occur with each combination of the levels of the factors from the set with proportional frequencies. The elements of are equal for the same combinations of the factors from the set . Hence, the sum of elements of corresponding to any combination of the levels of the factors from equals zero. Therefore, and are orthogonal.
Any effect is, by the definition of interaction effects, orthogonal to any effects of the factors from the set . By Lemma 1.7.1, for the part of the design corresponding to any combination of the levels of the factors from , the sum of elements of equals zero. The combinations of the levels of the factors from the set occur with each combination of the levels of the factors from the set with proportional frequencies. In particular, the same is true for the combinations of the levels of the factors from for the part of the design. Hence, the sum of elements of equals zero for any combination of the levels of the factors from . Therefore, is orthogonal to .
This completes the proof of the theorem.
It is evident that for any matrix of effects, all effects can be chosen to be pairwise orthogonal. In this case, if the condition of proportional frequencies is satisfied for any factors of the design, all main effects and interaction effects up to order are pairwise orthogonal.
Construction of interaction effects
Theorem 1.8.1. For any factors for which the condition of proportional frequencies is satisfied, the product of the vectors of main effects of the factors respectively is a vector of interaction effects of the factors .
Proof. It is evident that elements of depend only on the combinations of the levels of the factors . Hence, we need only to prove that is orthogonal to and any main effects and interaction effects up to order of the factors .
For two factors and , orthogonality of and ( is any vector of main effects of the factor , perhaps, identical to ) is equivalent to orthogonality of and . By the hypothesis of the theorem, is orthogonal to all vectors of main effects of the factor . Therefore, by Lemma 1.7.1, the sum of the elements of corresponding to any level of the factor equals zero. Since has equal elements for the same levels of the factors , is orthogonal to .
Continue the proof by induction. At the ()-th step , we get the column . Its orthogonality to any columns of main effects and interaction effects up to order is equivalent to orthogonality of two columns and . By the induction hypothesis, is an interaction effect of -th order and, therefore, is orthogonal to all main effects and interaction effects of the factors . Hence, by Lemma 1.7.1, the sum of elements of corresponding to any combination of the levels of the factors equals zero.
Thus, the proof is complete.
Theorem 1.8.2. Suppose that the condition of proportional frequencies Eq. (23) is satisfied for a given set of the factors . Let all matrices of main effects contain pairwise orthogonal columns. Then all possible products of the columns (one for each factor) produce the full set of pairwise orthogonal interaction effects of the factors .
Proof. Consider two different products of the columns (one for each factor): and . For these two sets of columns, at least one pair (let it be and ) contains different columns. The elements of the column depend only on the levels of the factors . Now we have to prove that the sum of the elements for any combination of the levels of the factors equals zero. Indeed, by Theorem 1.7.2, for any combination of the levels of the factors , the levels of occur with proportional frequencies. Hence, for all combinations of the levels of the factors , the sums of the elements of have the same sign. By the hypothesis of the theorem, and are orthogonal. Hence, these sums equal zero.
Thus, the proof is complete.
Theorem 1.8.2 can be generalized to the case of matrices of main effects that do not necessarily contain orthogonal columns.
Theorem 1.8.3. Suppose that the condition of proportional frequencies Eq. (23) is satisfied for a given set of the factors . Then all possible products of the columns (one for each factor) produce the full set of linearly independent interaction effects of the factors .
The proof of the theorem is similar to the proof of the corresponding part of Theorem 1.4.1.
Effects of levels and interaction effects of levels
For the sake of simplicity, without loss of generality, consider a design with three factors.
Let , where is an observation that corresponds to the point of a full design with the -th level of the factor , the -th level of the factor , and the -th level of the factor . An asterisk instead of an index means that we take the average over all levels of the corresponding factor. For example,
Definition 1.9.1. The number is called a true average; the number is called an effect of the -th level of the factor .
Definition 1.9.2. The difference between the effect of the -th level of the factor for a subset with the -th level of the factor and the effect of the -th level of the factor is called an interaction effect of the -th level of the factor and the -th level of the factor and denoted by .
Definition 1.9.3. The difference between the interaction effect of the -th level of the factor and the -th level of the factor for a subset with the -th level of the factor and the interaction effect of the -th level of the factor and the -th level of the factor is called an interaction effect of the -th level of the factor , the -th level of the factor , and the -th level of the factor and denoted by .
It is evident that Definitions 1.9.2 and 1.9.3 are well defined, since they are symmetric with respect to the factors , and . For example, for the design with three factors,
Other effects of levels and interaction effects of levels are defined analogously.
Any effect of the level or any interaction effect of the levels is a linear combination of the mathematical expectations of observations for . The coefficients of such linear combinations form the vectors that we will call vectors of effects of levels and vectors of interaction effects of levels. Denote by the vector of the effect of the -th level of the factor and denote by the vector of the interaction effect of the -th level of the factor and the -th level of the factor , etc.
In Eq. (24), coefficient for the treatment combination with the levels , , and of the factors , and respectively is .
Coefficient for the treatment combination in which the factors and appear at the levels and respectively and the factor appears at the level other than , equals .
Coefficient for the treatment combination in which the factor appears at the level and the factors and appear at the levels other than and respectively, equals .
Coefficient for the treatment combination in which the factors , and appear at the levels other than and respectively equals .
The summary for all elements of the vector of the interaction effects of the levels , , and is given in the following table.
Table. Elements of the vector of the interaction effect of the levels , , and of the factors , , and , respectively.
Each of rows of the table corresponds to a set of treatments. If, for example, the factor appears at the level in the set, the corresponding cell of the table has index . If the factor appears at the level other than , the corresponding cell is left empty. The table cells for the factors and are filled analogously.
Let
We will also use the similar notations in the similar cases.
The element of the vector of the interaction effect of the levels , , and for the -th observation is
We can easily prove it if we take into account the following equalities:
Analogously, in the general case, the element of the vector of the interaction effect of the levels of the factors is
In particular, the element of the vector of the effect of the level of the factor is
Hence,
or
Therefore, the following theorem has been proved.
Theorem 1.9.1. The vector of the interaction effect of the levels of the factors respectively is, apart from a proportionality factor, the product of the vectors of the effects of the levels of the factors respectively.
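Theorem 1.9.1 can be illustrated on a small full design. The following sketch assumes a hypothetical full design with two factors at 3 and 2 levels and reads Definitions 1.9.1 and 1.9.2 as averages of the expectations over subsets of points: the effect of a level is the mean over that level minus the overall mean, and the interaction effect of two levels is the conditional effect minus the unconditional one. The helper names are not from the article. Under these assumptions the proportionality factor turns out to be the number of points of the full design.

```python
from fractions import Fraction
from itertools import product

levels = (3, 2)  # hypothetical numbers of levels of the two factors
points = list(product(*(range(s) for s in levels)))
N = len(points)


def mean_vector(fixed):
    """Coefficients of the average of the expectations over matching points.

    `fixed` maps a factor index to a level; factors that are not listed are
    averaged out (the asterisk in the notation of the text).
    """
    hits = [all(p[f] == lv for f, lv in fixed.items()) for p in points]
    size = sum(hits)
    return [Fraction(int(h), size) for h in hits]


def minus(u, v):
    return [x - y for x, y in zip(u, v)]


def level_effect(f, lv):
    """Vector of the effect of level `lv` of factor `f`."""
    return minus(mean_vector({f: lv}), mean_vector({}))


def level_interaction(i, j):
    """Vector of the interaction effect of level i of factor 0 and level j of factor 1."""
    conditional = minus(mean_vector({0: i, 1: j}), mean_vector({1: j}))
    return minus(conditional, level_effect(0, i))


def scaled_product(i, j):
    """N times the elementwise product of the two vectors of level effects."""
    return [N * a * b for a, b in zip(level_effect(0, i), level_effect(1, j))]


theorem_holds = all(level_interaction(i, j) == scaled_product(i, j) for i, j in points)
print(theorem_holds)
```

Exact rational arithmetic is used so that the comparison of the two vectors is not affected by rounding.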
It is evident that are the vectors of main effects of the factors respectively. Hence, the vector of the interaction effect of levels , by Note 1 to Theorem 1.4.1, is the vector of the interaction effect of the factors for the design .
It is easy to verify that any vectors of all vectors of main effects of the factor , with the elements (, form the set of linearly independent vectors. A similar statement holds for the factors . All possible products of the selected independent main effects (one from each factor) form interaction effects of levels of the factors . By Note 1 to Theorem 1.4.1, these interaction effects of levels form a set of linearly independent vectors. Therefore, the following theorem has been proved.
Theorem 1.9.2. Any vector of the interaction effect of levels of the factors is a vector of an interaction effect of the factors . A maximum linearly independent subset of vectors of interaction effects of levels of the factors contains exactly vectors.
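The size of a maximum linearly independent subset can be examined numerically. The sketch below assumes a hypothetical full design with three factors at 3, 2, and 2 levels and builds every vector of interaction effects of levels as a product of vectors of level effects, as in Theorem 1.9.1, taking the entries of a level-effect vector to be proportional to s - 1 at the chosen level and -1 elsewhere (s being the number of levels). The count in the theorem is not legible in this text, so the comparison with the product of the numbers (s - 1) over the factors is an assumption based on the standard count of interaction degrees of freedom.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a list of vectors by exact forward Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        if rk == len(m):
            break
        pivot = next((k for k in range(rk, len(m)) if m[k][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for k in range(rk + 1, len(m)):
            factor = m[k][col] / m[rk][col]
            m[k] = [x - factor * y for x, y in zip(m[k], m[rk])]
        rk += 1
    return rk


levels = (3, 2, 2)  # hypothetical numbers of levels
points = list(product(*(range(s) for s in levels)))


def interaction_of_levels(target):
    """Vector of the interaction effect of the levels `target`, up to a factor."""
    vector = []
    for point in points:
        entry = 1
        for s, wanted, actual in zip(levels, target, point):
            # Entry of the level-effect vector, up to a positive factor.
            entry *= (s - 1) if actual == wanted else -1
        vector.append(entry)
    return vector


vectors = [interaction_of_levels(t) for t in points]

expected = 1
for s in levels:
    expected *= s - 1

print(len(vectors), rank(vectors), expected)
```

For this example there are 12 vectors of interaction effects of levels, but only 2 of them are linearly independent.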
A model of true effects for qualitative factors
Denote the matrix of all vectors of effects of levels of the factor by , denote the matrix of all interaction effects of levels of the factors and by , etc.
Proof For the sake of simplicity, without loss of generality, consider a design with three factors.
In the matrix , consider the row corresponding to the -, -, and -th levels of the factors and respectively. In the matrix , consider the row corresponding to some combination of levels of the factors , and . Then the scalar product of these two rows of the matrices and equals
If the given row of corresponds to the levels , , and of the factors and respectively, then
Hence, it has been proved that the diagonal elements of are equal to 1.
Assume that in the given row of , at least one of the factors and appears at the level other than , and respectively. Without loss of generality, assume that the factor appears at the level other than . Then Eq. (2.10) becomes
where
This proves the theorem.
Now we will define the vector of true effects for qualitative factors:
where , as before, is the vector of mathematical expectations at the points of full design. By Eqs (29) and (31), the following equality holds:
We can consider Eq. (32) as a model that is true for all points of .
Denote by and the parts of the matrix and the vector respectively corresponding to the factorial set . Assume that the elements of the vector that do not correspond to the factorial set are equal to zero. Then Eq. (32) becomes
The coefficients of the model Eq. (32) and, therefore, the model Eq. (33) are easy to interpret. This interpretation becomes evident if we recall the definition of the effects of levels and interaction effects of levels.
The coefficient matrix for the model Eq. (32) for the full design is not a full rank matrix. For example, the sum of the columns belonging to is . Therefore, the solution of the normal equations of the method of least squares for the parameters of the model is not unique. However, there exists a system of linear equalities for these parameters
such that the matrix
has full rank and no row of is represented by a linear combination of rows of the matrix . In this case, for the matrix design , i.e., for the full design with the restriction Eq. (34) on the parameters , there exists a unique solution for LS estimates of the parameters (Scheffé, 1959).
Consider the -th row of the matrix of the vector of effects of levels of the factor .
Split the matrix into submatrices to correspond to the partitions of and :
Then for the full design , for example, the matrix is
Denote the columns of the matrices and by and respectively, adding the indices corresponding to the indices of . The columns of the matrix are .
Lemma 1.10.1. If for the vector
then is the vector of the interaction effect of the factors and for any vector of the interaction effect of these factors there exists the vector , such that Eq. (39) holds and
Proof It is evident that any column of the matrix , as any other vector of the interaction effect of the factors , can be represented by a linear combination of columns of the matrix , namely
Let
By the definition of a vector of an interaction effect, is orthogonal to all columns of the matrix
for any factors of . Therefore, by Lemma 1.7.1, the sum of the elements of corresponding to any combination of levels of the factors is equal to zero, i.e.,
Now we have to prove that if the condition Eq. (39) holds, is the vector of the interaction effect of the factors .
Summing up, for example, the first equality of Eq. (40), we get
That means that is a contrast. It follows, by Eq. (39), that Eq. (41) holds, and, therefore, Eq. (40) holds as well. Hence, the sum of elements of corresponding to any combination of levels of the factors is equal to zero.
This completes the proof of the lemma.
Theorem 1.10.1. The matrix Eq. (35) is a full rank matrix and no row of can be represented by a linear combination of the rows of the matrix .
Proof The fact that no row of can be represented by a linear combination of the rows of the matrix is obvious. Now we have to prove that there is no nonzero vector such that
The model Eq. (32) with the restrictions Eq. (38) is called a full factorial model of true effects for qualitative factors (or a -model of true effects).
Let be a submatrix of the matrix corresponding to the factorial set . Then the model Eq. (33) with the restrictions is called a factorial model of true effects for the factorial set for qualitative factors (or a -model of true effects). Hereafter, we will not necessarily keep the words “true effects” in the notation of these models.
For the -model, obviously, the following two conditions are satisfied:
The -model Eq. (33) of true effects contains an absolute term and terms with all effects of levels for any factor.
If the model contains at least one term with an interaction effect of levels of the factors then the model contains all terms with interaction effects of levels of any factors of the factors .
A mixed model
Consider the full design together with the design for the case when the factors are qualitative and the factors are quantitative.
For the qualitative factors, as in Section 2.10, we will use the matrices of all vectors of effects of levels of the factors . For quantitative factors, as in Section 2.5, we will use the matrices of vectors of main effects of the factors for the design . For vectors of interaction effects of qualitative factors we will apply the matrix of all vectors of interaction effects of levels of the factors :
For quantitative factors we will apply the matrix of interaction effects of the factors :
For the qualitative factors and the quantitative factors we will use the matrix :
By using the line of proof of Theorem 1.9.2, we get the following theorem.
Theorem 1.11.1. Any vector of the matrix Eq. (45) is a vector of an interaction effect of the factors . A maximum linearly independent subset of vectors of interaction effects of the matrix Eq. (45) contains exactly vectors.
Denote
Theorem 1.11.2.
Proof Consider the full design with runs for the only factor . By Theorems 1.4.1 and 1.10.1, we can easily see that
a vector of true effects of the mixed model. Theorem 1.11.2 implies that
There exist equalities similar to equalities Eq. (2.10), for the parameters Eq. (49) of the mixed model Eq. (50) with the summation indices . Let denote the matrix of coefficients of the corresponding system. Then
Using methods similar to the methods of Section 2.10, we can show that the following theorem holds.
Theorem 1.11.3.
is a full rank matrix, and no row of is represented by a linear combination of the rows of .
The model Eq. (50) with the restriction Eq. (51) will be called the mixed full factorial model of true effects (or the -model of true effects).
Denote by , and the parts of the matrices , and the vector respectively corresponding to the factorial set . Assume that elements of the vector that do not correspond to the set are equal to zero. Then the following model (which will be called the mixed factorial model of true effects for the factorial set , or the -model of true effects) holds:
We may omit the words “true effects” in the notation of the model.
The model Eq. (52) can be extended to a wider domain. In this case, we get the following model:
where is the -th row of the matrix
Equivalence of factorial models
We now focus on equivalence of factorial models in the sense of properties of related regression. The -model and the -model of true effects are special cases of the -model of true effects. Hence, the only model (of the considered factorial models) that is not a special case of the -model of true effects is the general -model. Therefore, to prove equivalence of all types of factorial models for the factorial set , we have to prove equivalence of any two -models (i.e., any -model of true effects and any -model of true effects) and equivalence of -models of true effects and the general -model.
Consider a set that consists of the vector and a full set of linearly independent effects for the factorial set for the full design . For the fractional design (i.e., for the design that does not include some combinations of the levels), consider a set of vectors with the following property: its coordinates corresponding to some combination of levels of the factors are equal to the elements of vectors of the set corresponding to the same combination of levels for the design . We will call the vectors of effects of the set the vectors of effects generated by the design and the factorial set and denote them by the upper index . Let be the coefficient matrix of the design for the -model Eq. (52).
We now focus on the problem of estimability of the parameters of the model Eq. (52) for the fractional design that includes some treatment combinations (not necessarily different) of the full design .
Lemma 1.12.1. The matrix
is a matrix of full rank if and only if vectors of effects generated by the design and the factorial set are linearly independent.
Proof Let be a nonzero vector such that
Then , and, by Lemma 1.10.1,
where is the vector of the main effects of the factor , and is the vector of the interaction effect of the factors .
Therefore,
where includes those and only those rows of that correspond to treatment combinations of the fractional design , i.e., (the vectors of effects generated by the design and the factorial set ). It follows from Eq. (55) that these vectors of effects generated by the design and the factorial set are linearly dependent. By Lemma 1.10.1, any vector of the interaction effect of the factors can be represented as . Then the corresponding vector of the effect generated by the design is
By virtue of the assumption, there exist (not all equal to zero) such that
Consider three sets of factors corresponding to the fractional design : and the quantitative factors such that . For the factors , consider the -model of true effects
with the restrictions on the parameters
For the factors , consider the -model of true effects
with the restrictions on parameters
For the factors consider the general -model Eq. (106) with the coefficient matrix .
Theorem 1.12.1. If for the design one of the matrices
is a full rank matrix, then any of them is a full rank matrix.
The proof of the theorem follows from Lemma 1.12.1 and the Corollary to Theorem 1.6.1.
Theorem 1.12.1 and Lemma 1.12.1 imply that the existence of a unique solution of the normal equations of the method of least squares does not depend on whether the factors are qualitative or quantitative. It depends only on whether the vectors of effects generated by the design and the factorial set are linearly independent or not. The design is called nonsingular if these vectors of effects (generated by the design and the factorial set ) are linearly independent.
Theorem 1.12.2. For a nonsingular factorial design , all factorial models for the same factorial set are equivalent in the sense of properties of related regression (for any point, estimates of the regression function are equal and variances of these estimates are equal).
Proof First, we will prove that the general -model and any -model of true effects are equivalent. Second, we will prove that any -model of true effects and any -model of true effects are equivalent.
Consider the -model of true effects
with a domain that does not necessarily coincide with such that
Then it is evident that
The coefficient matrix of the design for the model Eq. (58) is .
Consider the general -model with the domain :
Denote the coefficient matrix of the design for the model Eq. (59) by . The submatrix of the matrix has the size and the rank . The submatrix
of the matrix has the same size and rank . By the Corollary to Theorem 1.6.1, these submatrices are related by the nonsingular linear transformation :
The matrices and are also related by the nonsingular linear transformation:
Besides, for the treatment combinations of the following equality holds:
Then for the point , LS estimate for the model Eq. (58), by Eqs (61) and (62), coincides with LS estimate for the model Eq. (59):
The variance of the estimate at the point , by Eqs (61) and (62), coincides with the variance of the estimate :
Assume that the model Eq. (58) is defined on such that the equality similar to Eq. (60) holds over the domain , i.e., that
Then Eq. (62) and, therefore, Eqs (63) and (64) are satisfied for all points .
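The mechanism behind this part of the proof, that two coefficient matrices related by a nonsingular linear transformation give the same estimates of the regression function and the same variances, can be seen on a small numerical case. The sketch below is an illustration only: the one-factor design, the observations, and the two codings (polynomial functions versus contrasts among levels) are hypothetical and are not the models Eq. (58) and Eq. (59) themselves. Variances are reported in units of the error variance.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square system a x = b by exact Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(k for k in range(col, n) if m[k][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for k in range(n):
            if k != col:
                factor = m[k][col]
                m[k] = [x - factor * y for x, y in zip(m[k], m[col])]
    return [row[n] for row in m]


def gram(X):
    """The matrix X'X."""
    cols = list(zip(*X))
    return [[sum(p * q for p, q in zip(u, v)) for v in cols] for u in cols]


def ls_fit(X, y):
    """Least-squares estimates from the normal equations X'X b = X'y."""
    xty = [sum(p * q for p, q in zip(u, y)) for u in zip(*X)]
    return solve(gram(X), xty)


def variance_factor(X, f):
    """f'(X'X)^{-1} f, the variance of the fitted value in units of sigma^2."""
    return sum(p * q for p, q in zip(f, solve(gram(X), f)))


# Hypothetical one-factor design (levels -1, 0, 1) with made-up observations.
x_levels = [-1, -1, 0, 0, 1, 1, 1]
y = [3, 5, 4, 4, 9, 8, 10]


def poly(x):
    """Quantitative view: functions 1, x, x^2."""
    return [1, x, x * x]


CONTRASTS = {-1: [1, 1, 1], 0: [1, -1, 1], 1: [1, 0, -2]}


def contrast(x):
    """Qualitative view: absolute term plus two contrasts among the levels."""
    return CONTRASTS[x]


results = {}
for name, coding in (("polynomial", poly), ("contrast", contrast)):
    X = [coding(x) for x in x_levels]
    beta = ls_fit(X, y)
    results[name] = [
        (sum(f * b for f, b in zip(coding(x), beta)), variance_factor(X, coding(x)))
        for x in (-1, 0, 1)
    ]

print(results["polynomial"] == results["contrast"])
```

The parameter estimates differ between the two codings, but the fitted values and their variances at every level coincide.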
To prove equivalence of the -model and -model, we will show the following: A reduction of the model Eqs (56)–(57) to the model without restrictions on parameters, with a coefficient matrix of a full rank, leads to the -model of true effects
where contains the vectors of effects of the factorial set for the design .
Consider the model Eq. (56) with the restrictions on the parameters Eq. (57). These restrictions are split into the following partial restrictions:
where is the corresponding part of the vector , i.e., . Let be one of the solutions of Eq. (66). Then, by Lemma 1.10.1, the vector corresponds to the vector of the interaction effect of the factors
Lemma 1.12.2. A set of linearly independent vectors such that corresponds to a set of the linearly independent vectors of the interaction effects of the factors .
Proof The equality
implies that
Since ,
It is easy to see that Eq. (68) implies Eq. (67). This proves the lemma.
For the given design, the LS estimates possess some optimal properties. Hereafter, we will fix a method of statistical estimation (the LS method) and will solve the task of finding effective designs for factorial models. We will distinguish between two shades of meaning of the concept of an effective design: the criteria of optimality and the desirable properties of the design. There is no clear distinction between these two concepts. However, they can be described as follows.
The criteria of optimality are mathematically clear requirements for the design. These requirements in the majority of cases may be seen as an expansion of the concept of the best linear estimates. The desirable properties are those properties that are not very clear from the point of view of a mathematician but natural for the practitioners involved in experiments.
Hereafter we assume that all designs are nonsingular. By Theorem 1.12.1, whether the design is singular or nonsingular does not depend on the type of the factorial model for the factorial set . Therefore, we introduce the types of nonsingular designs in accordance with the following definition.
Definition 2.0.1. A nonsingular design for the factorial model for the factorial set containing all possible elements of factors is called the design of resolution . A nonsingular design for the factorial model for the set containing all possible elements of factors is called the design of resolution if all effects of the set are estimated with no bias in the model for the set containing all possible elements of factors . The design of resolution 3 is also called the design of main effects and the corresponding model is called the model of main effects.
Optimality criteria of designs
If for the model Eq. (75) and the design Eq. (76), the information matrix is a full rank matrix, the covariance matrix of the vector of estimates is
The matrix is called a normalized information matrix and the matrix is called a normalized covariance matrix.
The first three criteria of optimality will be introduced for the model without restrictions on parameters. They allow an interpretation that is associated with the size of the dispersion ellipsoid of the parameter estimates.
Definition 2.1.1. The design is called D-optimal on the set of designs if
The dispersion ellipsoid of the parameter estimates of the D-optimal design has minimal volume.
The criterion of D-optimality (also called Mood’s criterion) is the most popular one. It will also be used for the model Eq. (77) with the restrictions Eq. (100) on parameters. In this case, the matrix in Eq. (73) will correspond to the information matrix for the reduced model Eq. (80). The property of D-optimality of a design is invariant under any nonsingular linear transformation of the parameter vector. From this it is easy to prove that D-optimality of a design does not depend on the selection of the vector of new parameters of the reduced model.
Definition 2.1.2. The design is called A-optimal on the set of designs if
The circumscribed parallelepiped of the dispersion ellipsoid of the parameter estimates of the A-optimal design has a diagonal of minimal length.
The criterion of A-optimality is also called Kishen’s criterion.
Definition 2.1.3. The design is called E-optimal on the set of the designs if
where is an eigenvalue of the matrix .
The maximum axis of the dispersion ellipsoid of the parameter estimates of an E-optimal design has minimal length.
The criterion of E-optimality is also called Ehrenfeld’s criterion.
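The three criteria above can be compared on a toy case. The sketch below is not the article's construction: it assumes a hypothetical two-parameter model with functions 1 and x on the interval from -1 to 1, and it uses the standard quantities that match the geometric descriptions given above, namely the determinant (volume), the trace (diagonal of the circumscribed parallelepiped), and the largest eigenvalue (longest axis) of the normalized covariance matrix. Smaller values are better in all three cases.

```python
import math


def normalized_information(points):
    """M = X'X / N for the assumed model with functions (1, x)."""
    n = len(points)
    s1 = sum(points) / n
    s2 = sum(x * x for x in points) / n
    return [[1.0, s1], [s1, s2]]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def criteria(points):
    """Determinant, trace, and largest eigenvalue of the normalized covariance matrix."""
    cov = inverse_2x2(normalized_information(points))
    (a, b), (_, d) = cov
    return {
        "determinant": a * d - b * b,
        "trace": a + d,
        # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
        "largest_eigenvalue": (a + d) / 2 + math.sqrt((a - d) ** 2 / 4 + b * b),
    }


two_point = criteria([-1, 1])        # all runs at the endpoints
three_point = criteria([-1, 0, 1])   # one third of the runs at the center

print(two_point)
print(three_point)
```

For this model the two-point design is better than the three-point design under all three criteria, although in general the criteria need not agree.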
The following two criteria are related to the properties of the regression function in the domain.
Definition 2.1.4. The design is called -optimal in the domain on the set of the designs if
where is the variance of the estimate of the regression function at the point .
Definition 2.1.5. The design is called -optimal in the domain on the set of the designs if
The following two criteria (orthogonality and regularity) will often be used here, although, at first glance, they have no such statistical justification as the previous criteria of this paragraph. However, at the end of this chapter, we will show why these criteria are important for applications.
Definition 2.1.6. A design is called orthogonal for the given model if the covariance matrix of the parameter vector of estimates for this model and for the design is diagonal.
Definition 2.1.7. A factorial design is called regular of strength if the condition of proportional frequencies is satisfied for any factors.
The following theorem is a corollary to Theorem 1.7.3.
Theorem 2.1.1. A regular factorial design of strength allows obtaining a set of pairwise orthogonal main effects and interaction effects up to the order . A regular factorial design of strength allows obtaining a set of pairwise orthogonal main effects and interaction effects up to the order such that each of them is orthogonal to all interaction effects of the order .
Theorem 2.1.1 implies that a regular factorial design of strength is a special case of the design of resolution .
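Regularity of a given strength can be tested directly from the run list. The sketch below assumes that the condition of proportional frequencies, Eq. (23), which is not reproduced in this section, has the usual form: the number of runs with a given combination of levels of the chosen factors equals the total number of runs multiplied by the product of the marginal relative frequencies of those levels. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product


def proportional_frequencies(design, factors):
    """Check the (assumed) condition of proportional frequencies for `factors`."""
    n = len(design)
    joint = Counter(tuple(run[f] for f in factors) for run in design)
    marginals = [Counter(run[f] for run in design) for f in factors]
    # Go over all combinations of observed levels, including combinations
    # that never occur together in the design.
    for combo in product(*(m.keys() for m in marginals)):
        expected = Fraction(n)
        for marginal, level in zip(marginals, combo):
            expected *= Fraction(marginal[level], n)
        if joint.get(combo, 0) != expected:
            return False
    return True


def is_regular_of_strength(design, t):
    """True if the condition holds for every set of t factors."""
    k = len(design[0])
    return all(proportional_frequencies(design, fs) for fs in combinations(range(k), t))


# Half fraction of the 2^3 design, third column equal to the product of the first two.
half_fraction = [(-1, -1, 1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]
# A small unbalanced design that violates the condition.
unbalanced = [(0, 0), (0, 1), (1, 0)]

print(is_regular_of_strength(half_fraction, 2))
print(is_regular_of_strength(half_fraction, 3))
print(is_regular_of_strength(unbalanced, 2))
```

The half fraction passes the check for any two factors but not for all three, so under this reading it is regular of strength 2 only.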
Definition 2.1.8. A factorial design is called regular for the factorial set if there exists a factorial model for the factorial set for which this design is orthogonal.
By Definition 2.1.8, a regular factorial design of strength is a special case of the regular factorial design for the factorial set , containing all possible elements of factors (
Note that regularity of the design for the factorial set does not imply orthogonality of the design for any model for the set .
Theorem 2.1.2. The following three statements are equivalent:
The design is regular for the factorial set ;
For the design all main effects and interaction effects corresponding to the factorial set (one from each set of effects) are pairwise orthogonal;
In the design the condition of proportional frequencies is satisfied for the factorial set .
Proof Equivalence of the statements 2 and 3 follows from Theorems 1.7.2 and 1.7.3. Now we will show that the statement 1 of the theorem implies the statement 2. Indeed, it follows from the statement 1 that the coefficient matrix has pairwise orthogonal columns. Hence, the columns corresponding to the functions
are orthogonal to the unit vector and, therefore, are main effects of the factors .
The column corresponding to the product has equal elements for the given combination of levels of the factors . Hence, it follows from the statement 1 that this column is orthogonal to all main effects of the factors and all interaction effects of these factors of the order . Therefore, the column is an interaction effect of the factors . Hence, the statement 2 holds.
It is easy to see from Theorem 1.8.2 that the statement 2 implies the statement 1.
This completes the proof of the theorem.
Very often, it is not easy to construct a design that satisfies all or even some of the optimality criteria. Therefore, designs that satisfy only one of the criteria are also important and useful.
Desirable properties of designs
We start considering the desirable properties of designs with the property related to the number of treatments of the design, which is very important for practitioners involved in experiment.
Definition 2.2.1. A design is called saturated for the factorial -model if the number of runs of the design is equal to the number of parameters of the model.
We also apply Definition 2.2.1 to models that include qualitative factors (with the restrictions on parameters). In this case, we reduce the number of parameters in Definition 2.2.1 by the number of linearly independent restrictions.
Among other desirable properties of designs, we note the following two:
Simplicity of calculations and interpretation of the results of observations;
Possibility to split the design into blocks when all experiments cannot be carried out under homogeneous conditions.
In the next chapters, we will address issues of construction of optimal designs with desirable properties.
Equivalence of D- and G-optimal designs
Let be a set of all designs with the domain that is closed and bounded. Then the following theorem of Kiefer-Wolfowitz holds:
Theorem 2.3.1 (Kiefer & Wolfowitz, 1960). The following statements are equivalent:
The design is D-optimal on ;
The design is G-optimal on ;
.
Theorem 2.3.2 (Kiefer & Wolfowitz, 1960). The information matrix of a D-(G-)optimal design is unique on . The maximum of the variance of the estimate of the regression function on is reached at points of the design.
Theorem 2.3.1, generally speaking, does not hold if is a subset of the set of all designs. For example, for the subset of designs with a fixed number of treatments, D- and G-optimal designs are not equivalent. In this case, statement 3 of Theorem 2.3.1 holds neither for the D-optimal nor for the G-optimal design.
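A classical one-dimensional case illustrates the role of the maximum of the normalized variance in Theorems 2.3.1 and 2.3.2. The sketch below assumes a hypothetical model with functions 1 and x on the interval from -1 to 1 and evaluates the normalized variance d(x) = f(x)' M^{-1} f(x), with M the normalized information matrix, on a grid. For the design with equal weights at the two endpoints, which is the known D-optimal design for this model, the maximum equals the number of parameters and is attained at the design points; for a design that also places runs at the center, the maximum is larger.

```python
import math


def normalized_variance(design, x):
    """d(x) = f(x)' M^{-1} f(x) for f(x) = (1, x) and M = X'X / N."""
    n = len(design)
    m1 = sum(design) / n
    m2 = sum(v * v for v in design) / n
    det = m2 - m1 * m1
    # The inverse of [[1, m1], [m1, m2]] applied to (1, x), in closed form.
    return (m2 - 2 * m1 * x + x * x) / det


grid = [i / 100 for i in range(-100, 101)]

two_point = [-1, 1]
three_point = [-1, 0, 1]

max_two = max(normalized_variance(two_point, x) for x in grid)
max_three = max(normalized_variance(three_point, x) for x in grid)
argmax_two = [x for x in grid if math.isclose(normalized_variance(two_point, x), max_two)]

print(max_two, argmax_two)
print(max_three)
```

Here the number of parameters is 2, so only the two-point design attains the value required by the third statement of Theorem 2.3.1.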
Criterion of average variance
Let be a factorial design for the -model Eq. (106). The variance of the estimate of the regression function at point
is equal to
By Theorem 1.4.1 and Note 2 to Theorem 1.4.1, functions in the -model Eq. (106) can be selected in such a way that
where .
Then the average variance over is
where is the number of parameters in the model and . By the results of Section 2.12, it follows that the variance of the estimate of the regression function depends neither on the type of the model for the factorial set nor on the choice of the functions in the model. Therefore, the following theorem holds.
Theorem 2.4.1. The factorial design is -optimal on for any factorial model for the factorial set if and only if it is -optimal for the -model satisfying the condition Eq. (78).
Theorem 2.4.1 can also be obtained as a consequence of Theorem 2.12.1 of the book by Fedorov (1972).
Let the levels of the factor occur times respectively in the design . In this case, obviously,
Definition 2.4.1. The number
is called the coefficient of uniformity of the -th level of the factor .
It is evident that if the -th level of the factor occurs in the design more than times, if the -th level of the factor occurs in the design less than times, if the -th level of the factor occurs in the design exactly times, .
The last equality holds, in particular, for uniform designs for any level of any factor.
Definition 2.4.2. The average of the coefficients over all levels of the factor
is called the coefficient of uniformity of the factor .
Definition 2.4.3. A factor is called uniform if all its levels occur in the design with equal frequency. Otherwise, a factor is called nonuniform.
It is evident that for uniform factors for nonuniform factors .
Consider the regular design of main effects for the factors , i.e., the regular design for the factorial set that consists of only the factors . Then all functions in the model
can be selected in such a way that the design will be orthogonal for the model.
Let the values of the functions at the points of the design form the matrix of main effects of the factor (Section 2.3). Hence, the coefficient matrix of the design for the model Eq. (81) is
All columns of are pairwise orthogonal and the scalar square of each of them equals .
The covariance matrix of the vector of parameter estimates of the model Eq. (81) is
Let be the vector of functions corresponding to the levels of the factors respectively. The variance at this point is
The normalized (per one treatment combination) variance at this point is .
Consider the matrix
Then
Now introduce the following matrix:
is a square matrix of order . It follows from Eq. (83) that
It is evident that in uniform regular designs, where is the number of parameters of the model; in nonuniform regular designs, .
Consider the following efficiency function related to the criterion of the average variance:
Then for uniform regular designs, . For nonuniform regular designs, . Hence,
Therefore, it makes sense to express the efficiency function related to the criterion of the average variance as 100%.
We emphasize that Eq. (90) holds only for factorial models and designs. In connection with the last comment, consider the following example.
In the domain
consider the design
for the model
It can be verified that this design is -optimal. The lack of an absolute term makes the model “nonfactorial”; therefore, we should not expect Eq. (90) to hold. Now calculate the variances of the estimates of the regression function (divided by ) at the 8 points of . They are 1, 1, 1, 3/4, 3/4, 3/4, 3/4, 0. The average variance is 6/8 = 3/4. Hence, the efficiency of the design equals 133%.
D-optimality of regular factorial designs
In this section we obtain conditions under which regular factorial designs are D-optimal (Brodsky, 1975) and establish the relationship with other optimality criteria.
First, consider the regular design of main effects for the model Eq. (81). It follows from Eq. (85) that for a uniform design, the normalized variance at any point (in particular, at any point of the design ) is equal to the number of parameters to be estimated:
i.e., the maximum of the normalized variance over is reached at the points of the design . Hence, the design is -optimal on .
Assume that in the design , at least one factor, say , is nonuniform. For each factor, select the level with the maximum value of the coefficient of uniformity. Let these levels be . Since
it follows from Eq. (85) that . Therefore, a nonuniform regular design cannot be -optimal for the model Eq. (81) even on . Hence, the following theorem has been proved.
Theorem 2.5.1. A regular factorial design of main effects is -optimal for the model Eq. (81) on if and only if it is uniform.
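The statement of Theorem 2.5.1 can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the two-factor designs, the indicator coding in `f`, and the helper names `solve` and `max_normalized_variance` are hypothetical and do not come from the text, and we assume that a design with proportional level frequencies counts as regular in the sense used here. It computes the normalized variance N f(x)' (X'X)^{-1} f(x) exactly and compares its maximum over the grid with the number of parameters.

```python
from fractions import Fraction
from itertools import product

def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def f(point):
    """Main-effects regressors for two 2-level factors (hypothetical coding)."""
    a, b = point
    return [1, a, b]

def max_normalized_variance(design):
    """Maximum over the 2 x 2 grid of N * f(x)' (X'X)^{-1} f(x)."""
    rows = [f(p) for p in design]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    best = Fraction(0)
    for point in product((0, 1), repeat=2):
        fx = f(point)
        d = len(design) * sum(u * v for u, v in zip(fx, solve(xtx, fx)))
        best = max(best, d)
    return best

uniform = list(product((0, 1), repeat=2))                  # each level equally often
nonuniform = [(a, b) for a in (0, 0, 1) for b in (0, 1)]   # first factor in ratio 2:1

print(max_normalized_variance(uniform))     # 3, the number of parameters
print(max_normalized_variance(nonuniform))  # 4, exceeds the number of parameters
```

In this toy case the uniform design attains the bound given by the number of parameters, while the nonuniform design exceeds it at the rarer level, in line with the theorem.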
Suppose that the functions in the model Eq. (81) form a set of orthogonal polynomials in at the points of the design such that Eq. (82) holds. Denote by and the minimal and maximal values of the variable respectively. The property of D-optimality of the design on , by Theorem 2.5.1, holds for any choice of values of for each of levels of the factor . We will now try to answer the question of how to select the values of to ensure that the resulting design is optimal on the cube . Hereafter, without loss of generality, we will consider a design on the cube .
Let be the regular uniform design for the quantitative factors for the model of main effects. Let the values of the variable that appear in the design be the following: the endpoints of the interval and the roots of the first derivative of the -th Legendre polynomial. It is well known (Hoel, 1958; Guest, 1958) that the one-dimensional design on the interval that consists of these points is -optimal for the model
That implies the following:
The maximum in Eq. (92) is reached on the interval at the points . For the design the normalized variance at the point
Therefore, the design is -optimal. The design is also -optimal for the model obtained from the model Eq. (81) by a linear transformation of its functions. Hence, the following theorem has been proved.
Theorem 2.5.2. Consider a regular design of main effects for the factors with the levels respectively for the model Eq. (81), where is a polynomial in of degree . Suppose that the variables take values at the endpoints of the interval and at the roots of the first derivative of the -th Legendre polynomial. Then the design is -optimal for the model Eq. (81) on the cube if and only if it is uniform.
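The one-dimensional fact behind Theorem 2.5.2 can be illustrated numerically. The sketch below is a check under assumptions, not the author's computation: we take the interval to be [-1, 1] and polynomial degrees 2 and 3, for which the roots of the derivative of the Legendre polynomial are 0 and ±1/√5; the function names and the grid are hypothetical. It evaluates the normalized variance of the equally weighted design on a grid and reports its maximum.

```python
import math

def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting (floating point)."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def normalized_variance(points, degree, x):
    """N * f(x)' (X'X)^{-1} f(x) for the model 1, x, ..., x^degree."""
    def f(t):
        return [t ** j for j in range(degree + 1)]
    rows = [f(p) for p in points]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    fx = f(x)
    return len(points) * sum(u * v for u, v in zip(fx, solve(xtx, fx)))

designs = {
    2: [-1.0, 0.0, 1.0],                                   # P_2'(x) = 3x
    3: [-1.0, -1 / math.sqrt(5), 1 / math.sqrt(5), 1.0],   # P_3'(x) = (15x^2 - 3)/2
}
grid = [i / 500 - 1 for i in range(1001)]                  # includes -1, 0 and 1
worst_by_degree = {
    degree: max(normalized_variance(points, degree, x) for x in grid)
    for degree, points in designs.items()
}
for degree, worst in worst_by_degree.items():
    print(degree, round(worst, 6))
```

Up to rounding, the maximum should coincide with the number of parameters (degree plus one) and be reached at the design points, which is the property used in the proof above.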
Consider now the regular design for the factorial set for the -model:
It follows from Eq. (89) that for the uniform design, the normalized variance at any point (in particular, at any point of the design ) is equal to the number of parameters to be estimated. Therefore, the design is -optimal in the domain . On the other hand, by Eq. (89), if at least one factor of the design is nonuniform, there exists a point for which the normalized variance exceeds the number of parameters. Hence, a nonuniform design cannot be -optimal for the model Eq. (93) even on . Therefore, the following theorem has been proved.
Theorem 2.5.3. A regular factorial design for the factorial set is -optimal for the factorial -model on if and only if it is uniform.
Let the functions form a set of orthogonal polynomials in at the points of the design, where is defined on the interval . The property of -optimality of a regular design for the factorial set on does not depend on the values of for each of the levels of the factor. We will find the set of values of that makes the design -optimal on the cube
Consider a regular uniform design for the factorial set . Assume that different values of variable in the design are the endpoints of the interval and the roots of the first derivative of the -th Legendre polynomial on this interval.
For the design for -model Eq. (93), the normalized variance at the point
Therefore, the design is -optimal. The design is also -optimal for the model obtained from the model Eq. (93) by a nonsingular linear transformation of the set of its functions. Hence, the following theorem has been proved.
Theorem 2.5.4. Let in the -model Eq. (93) be a polynomial in of degree . Then the regular factorial design for the factorial set with the variables that have different values at the endpoints of the interval and at the roots of the first derivative of the -th Legendre polynomial is -optimal for the -model Eq. (93) on the cube if and only if the design is uniform.
Let be a -optimal uniform design with runs on the interval for the model
The following theorem is a generalization of Theorem 2.5.4 and can be proved analogously.
Theorem 2.5.5. A regular factorial design for the factorial set , where for each the levels of the variables match the levels of the variables of the design , is -optimal for the -model Eq. (93) on the cube if and only if the design is uniform.
Theorem 2.5.6. The -optimal regular uniform design from Theorem 2.5.5 is -optimal and -optimal if the set of the functions satisfies the condition Eq. (78).
Proof The theorem is a corollary to Theorem 2.11.1 of the book of Fedorov (1972) and the results of this chapter.
Now consider the mixed factorial -model for the factorial set for the qualitative factors and the quantitative factors :
The domain is defined as follows:
It follows from Section 2.11 that for a -dimensional vector of parameters of the -model, the following equality holds:
where is a matrix; is the vector of elements, which are the new parameters. The value can be calculated based on Theorem 1.9.2. After the reparametrization we will have the following model:
For different new parameters , we will have different matrices related to each other by nonsingular linear transformations. This is equivalent to nonsingular linear transformations of the set of functions of the model Eq. (97). The property of -optimality is invariant under such transformations. In view of the last remark, we give the following definition.
Definition 2.5.1. The design is called -optimal for the -model with the restrictions Eq. (96) on Eq. (95) if it is -optimal for any model Eq. (97) that is the result of reparameterization of the model Eq. (94).
Let be a regular uniform design for the factorial set for the qualitative factors and quantitative factors . Assume that the functions (which are included in the -model for quantitative factors) are pairwise orthogonal so that condition Eq. (78) holds. Then, by using methods similar to those in Sections 3.4 and 3.5, we can show that the normalized (per one treatment combination) variance of the estimate of the regression function at the point is equal to
If the variables take values at the points of the design , then, by Eq. (3.5) and Theorem 1.9.2,
The multiplier in Eq. (99) coincides with the number of parameters of the model Eq. (97). It is now evident that, following the line of the proof of Theorem 2.5.4, we can prove the following generalization of Theorem 2.5.5.
Theorem 2.5.7. Let the design be a regular factorial design for the factorial set for the quantitative variables and the qualitative factors . For each , let the levels of the variables match the levels of the variables of the design . Then the design is -optimal for the -model Eq. (94) with the restrictions Eq. (96) on Eq. (95) if and only if it is uniform.
Note to Theorem 2.5.7. It is easy to prove that for the design from Theorem 2.5.7, a statement similar to Theorem 2.5.6 holds.
The results of this chapter provide a justification for the applicability of the criterion of regularity in the design of experiments.
BG-criterion
A geometric interpretation of -optimal (and close to -optimal) designs is based on the volume of a multi-dimensional ellipsoid. This interpretation does not give a clear understanding of the relative effectiveness of designs when we want to compare them. In this section we will consider a transformation that reduces the multi-dimensional characteristic of -optimality to a linear one. The transformation was introduced by Brodsky and Golikova (1981). It is applicable not only to factorial models but also to other polynomial models.
Consider one preliminary example (Brodsky & Golikova, 1981) for the design of second order. Let be the -optimal design on the cube
for the polynomial model of second order. Let be the information matrix of the design .
For the same model, consider another design obtained by multiplication of all coordinates of any point of by 0.99. Practically, these two designs can be regarded as “almost the same”.
It is easy to calculate that
where is an information matrix for the design .
Hence, the design is “three times worse” than the design (based on the determinant of the information matrix). Therefore, it is obvious that it does not make much sense to compare designs based on the determinant of the information matrix. This is not surprising: the clarity of the comparison is lost in moving to multidimensional characteristics. Therefore, all criteria are usually reduced to linear characteristics.
It is for this reason that a number of authors perform certain transformations on this criterion (the determinant of the information matrix). The most popular transformation is a root of degree . Some authors assume that , where is the number of parameters of the model. However, more often, they assume . Now we will see what this yields for the example above.
It is easy to calculate that
i.e., the design that ought to have a 99% efficiency is interpreted as 96.5%-optimal (for ) or 98.3%-optimal (for ). The difference is not so big, especially for . So, it may seem that the transformation meets the goal. However, one more example will show that it does not.
In our example, replace the design with the design obtained by multiplication of all coordinates of any point of by 0.90. It is easy to calculate that for it will be interpreted as 83%-optimal (instead of 90%-optimal) and for , as 69%-optimal. It is possible to give even stronger examples.
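The percentages quoted in these two examples can be reproduced with a few lines of arithmetic. The sketch below rests on assumptions that are ours rather than stated in the text: the model is the full second-order polynomial in 7 variables (which gives the 36 parameters mentioned later), the two roots are of degree 36 and 72, and scaling all coordinates by c multiplies the determinant by c raised to the power 2n(n + 2), because each linear term contributes a factor c^2 and each second-order term a factor c^4. The name `apparent_efficiency` is hypothetical.

```python
n = 7                                # assumed number of variables
k = (n + 1) * (n + 2) // 2           # parameters of the full second-order model: 36
# Assumed scaling law: n linear terms (factor c**2 each) and n*(n+1)/2
# second-order terms (factor c**4 each) give the exponent 2*n*(n+2) = 126.
exponent = 2 * n * (n + 2)

def apparent_efficiency(c, root):
    """Root of the given degree of the determinant ratio after scaling by c."""
    return (c ** exponent) ** (1.0 / root)

for c in (0.99, 0.90):
    print(c,
          round(c ** exponent, 4),                         # determinant ratio
          round(100 * apparent_efficiency(c, k), 1),       # root of degree 36
          round(100 * apparent_efficiency(c, 2 * k), 1))   # root of degree 72
```

Under these assumptions the determinant ratio for c = 0.99 is about 0.28, and the two roots give roughly 96.5% and 98.3% for c = 0.99 and roughly 69% and 83% for c = 0.90, which agrees with the figures in the text.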
It turns out that the use of these transformations only aggravates the situation. Indeed, since determinants of information matrices usually differ by several orders of magnitude, nobody wants to compare them. It is simply stated, instead, which is the greater. The transformations give the impression of comparability of criteria. The researcher might choose a not quite appropriate design (e.g., one with a large number of experiments) just because it has a “significantly” better characteristic than others, while in reality, the characteristics of all designs might be very close to each other.
Is it possible to find a transformation that never distorts (in the sense mentioned above) the characteristic of -optimality? Such a transformation for polynomial models is a root of degree
where is the number of terms of order in the model .
We will call a corresponding criterion the BG-criterion.
Denote the number of variables by and the number of parameters of the model, by . Then for the designs of the first order (for example, two-level factorial design of main effects), Eq. (100) will be as follows:
For designs of second order,
For the example under consideration, (instead of the commonly used 36 and 72).
Therefore, the BG-criterion of optimality of the design (related to the criterion of -optimality and expressed through the determinant of the information matrix of the design ) is:
Using Eq. (101), we get that the value of BG-criterion equals 99% for the design and equals 90% for the design .
A geometric interpretation of the BG-criterion is obvious. If for the given design , the value of the BG-criterion equals, say, 95%, then a -optimal design with the coordinates of all its points multiplied by 0.95 has the same determinant of the information matrix as the design .
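The scaling property behind this interpretation can be verified exactly on a small case. The sketch below assumes that the degree of the root in Eq. (100) is twice the sum, over orders j, of j times the number of terms of order j; this form is our reading of the surrounding text, not a quotation. The 3 × 3 design in two variables is a hypothetical base design chosen only because it is easy to enumerate (it is not claimed to be optimal), and the helper names are ours.

```python
from fractions import Fraction
from itertools import product

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n, result = len(m), Fraction(1)
    for col in range(n):
        piv = next((r for r in range(col, n) if m[r][col] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != col:
            m[col], m[piv] = m[piv], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return result

def info_det(points):
    """det(X'X) for the full second-order model in two variables."""
    rows = [[1, x, y, x * y, x * x, y * y] for x, y in points]
    k = len(rows[0])
    return det([[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)])

base = list(product((-1, 0, 1), repeat=2))       # hypothetical base design
c = Fraction(9, 10)
shrunk = [(c * x, c * y) for x, y in base]       # all coordinates multiplied by 0.9

n = 2
root = 2 * (1 * n + 2 * (n * (n + 1) // 2))      # assumed degree of the root: 16
ratio = info_det(shrunk) / info_det(base)
print(ratio == c ** root)                        # True
print(float(ratio) ** (1 / root))                # close to 0.9
```

With this choice of root the criterion returns the shrink factor itself, which is exactly the geometric reading given above.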
Geometric designs
Splitting of degrees of freedom
This section and the next are devoted to the fundamental concept introduced by Bose (1947): the nature of the splitting of degrees of freedom in a full symmetrical design , where is prime.
Definition 3.1.1. Let and be two sets of observations. Then a vector of coefficients of the linear function of observations
is called a contrast between these two sets of observations.
It is evident that Eq. (5) is satisfied for the vector of coefficients of the linear function Eq. (102).
Assume that all observations are divided into sets of observations each in such a way that no observation belongs to two sets. Then there exist different contrasts between these sets. It is evident that the maximum number of linearly independent contrasts equals . An example of linearly independent contrasts is the set of contrasts between any fixed set and each of the other sets. The contrasts between these sets are said to carry degrees of freedom.
The following lemma is evident.
Lemma 3.1.1 (Bose, 1947). Suppose that all observations are divided into subsets of observations each in one way, and into sets of observations each in another way so that for every split, each of the observations belongs to one and only one subset. Then, if for any subset of the first split exactly observations belong to each subset of the second split, a contrast between any two subsets of the first split is orthogonal to a contrast between any two subsets of the second split.
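A minimal illustration of Lemma 3.1.1 follows. It assumes that a contrast between two sets has coefficient +1 on the observations of the first set, −1 on those of the second, and 0 elsewhere; the 3 × 3 arrangement of observations and the function names are hypothetical. The first split is by rows and the second by columns, so each row subset shares exactly one observation with each column subset.

```python
from itertools import combinations

# Nine observations arranged in a 3 x 3 grid: split 1 = rows, split 2 = columns.
observations = [(r, c) for r in range(3) for c in range(3)]
rows = [[o for o in observations if o[0] == r] for r in range(3)]
cols = [[o for o in observations if o[1] == c] for c in range(3)]

def contrast(set_a, set_b):
    """Coefficient vector: +1 on set_a, -1 on set_b, 0 elsewhere."""
    return [(o in set_a) - (o in set_b) for o in observations]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

row_contrasts = [contrast(a, b) for a, b in combinations(rows, 2)]
col_contrasts = [contrast(a, b) for a, b in combinations(cols, 2)]
all_orthogonal = all(dot(u, v) == 0 for u in row_contrasts for v in col_contrasts)
print(all_orthogonal)  # True
```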
Consider the full symmetrical design where is prime and is an integer. In the design , every level of a factor corresponds to an element of the Galois field . Then any treatment combination of the design with the factors at levels can be represented by the point of an -dimensional finite Euclidean space
Let be the pencil of parallel flats in . By this pencil, all treatments are divided into subsets of treatments (each subset corresponds to one flat of the pencil). Different flats of the pencil have no points in common, and exactly one flat passes through each point of . Hence, each treatment belongs to one and only one subset. Therefore, the maximal number of linearly independent contrasts between these subsets is . In this case the pencil of parallel flats is said to carry degrees of freedom.
Consider two different pencils and of parallel flats.
Theorem 3.1.1 (Bose, 1947). A contrast between any two subsets generated by the pencil is orthogonal to a contrast between any two subsets generated by the pencil
Proof Any given flat of the pencil intersects different flats of the pencil in different -flats. No two of these -flats have any point in common (otherwise two different flats of the pencil would have a point in common). Any -flat contains exactly points. Therefore, each -flat of the pencil contains exactly of the points belonging to the given -flat of the pencil . The statement of the theorem now follows from Lemma 3.1.1.
The number of different pencils of parallel flats is equal to . Each pencil carries degrees of freedom. Hence, all degrees of freedom carried by all contrasts can be split into sets (generated by pencils of parallel flats) of degrees of freedom each so that any contrast corresponding to one set is orthogonal to any contrast corresponding to another set.
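The splitting described above can be enumerated for the smallest nontrivial case. The sketch below assumes 3 levels and 2 factors, represents each pencil by a coefficient vector whose first nonzero coordinate is 1, and takes contrasts with coefficients +1 and −1 on two flats of the pencil; these choices and all names are ours, made for illustration only.

```python
from itertools import product, combinations

s, m = 3, 2                                    # assumed: 3 levels, 2 factors
points = list(product(range(s), repeat=m))

# One representative per pencil: nonzero coefficient vectors whose first
# nonzero coordinate equals 1.
pencils = [a for a in product(range(s), repeat=m)
           if any(a) and next(v for v in a if v) == 1]

def contrasts(a):
    """Contrasts between flat 0 and each other flat of the pencil a.x = const."""
    flat = [sum(ai * xi for ai, xi in zip(a, x)) % s for x in points]
    return [[(f == 0) - (f == c) for f in flat] for c in range(1, s)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

orthogonal = all(dot(u, v) == 0
                 for a, b in combinations(pencils, 2)
                 for u in contrasts(a) for v in contrasts(b))
print(len(pencils), (s ** m - 1) // (s - 1))        # 4 4
print(len(pencils) * (s - 1), len(points) - 1)      # 8 8
print(orthogonal)                                   # True
```

In this case four pencils carry two degrees of freedom each, which accounts for all eight degrees of freedom among nine treatments, and contrasts from different pencils are orthogonal.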
Nature of degrees of freedom carried by parallel pencils
Following Bose (1947), consider the nature of degrees of freedom carried by the pencil
Suppose that of the coordinates of the pencil Eq. (103), coordinates are nonzero (without loss of generality, ) and the rest of them ( ) are equal to zero. Any -flat of the pencil Eq. (103) is
where is one of the elements of . Consider two points such that the -th coordinate of one of them equals the -th coordinate of the other point for all . It is evident that these two points either simultaneously satisfy Eq. (104) or do not. Therefore, coordinates of a contrast between any two flats of the pencil Eq. (103) are the same for the same combinations of .
When , coordinates of a contrast between any two flats of the pencil
depend only on levels of the factor and, by the definition, form a main effect of the factor. Since the pencil of parallel flats carries degrees of freedom, the pencil Eq. (105) generates a full set of linearly independent main effects.
When , coordinates of a contrast between any two flats of the pencil
depend only on levels of the factors and . By Theorem 3.1.1, this contrast is orthogonal to all main effects of the factors and . Therefore, it is an interaction effect of the factors and . The number of different pencils of type Eq. (106) equals . Each of them carries degrees of freedom, and these degrees of freedom for one pencil of type Eq. (106), by Theorem 3.1.1, are orthogonal to the degrees of freedom of any other pencil of type Eq. (106). Therefore, all pencils of type Eq. (106) produce the full set of linearly independent interaction effects of the factors and .
Increasing we get the following theorem.
Theorem 3.2.1 (Bose, 1947). If coordinates of the pencil Eq. (103) are nonzero and the rest of them are zero, a contrast between any flats of the pencil Eq. (103) is an interaction effect of the factors . The pencil Eq. (103) carries degrees of freedom. The number of different pencils generating interaction effects of the factors equals .
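The counts in Theorem 3.2.1 can be checked by enumeration. The sketch below assumes 3 levels and 3 factors and normalizes each pencil so that its first nonzero coordinate is 1; it groups pencils by the set of factors with nonzero coordinates and compares the number of pencils in each group with (s − 1) raised to the power r − 1, where r is the number of factors involved. The variable names are hypothetical.

```python
from itertools import product
from collections import Counter

s, m = 3, 3                                    # assumed: 3 levels, 3 factors
pencils = [a for a in product(range(s), repeat=m)
           if any(a) and next(v for v in a if v) == 1]

# Group pencils by the set of factors whose coordinates are nonzero.
by_support = Counter(tuple(i for i, v in enumerate(a) if v) for a in pencils)

for support, count in sorted(by_support.items(), key=lambda kv: (len(kv[0]), kv[0])):
    r = len(support)
    # factors involved, pencils found, (s-1)^(r-1), degrees of freedom carried
    print(support, count, (s - 1) ** (r - 1), count * (s - 1))
```

For each group the degrees of freedom carried by its pencils add up to (s − 1) to the power r, the usual count for an interaction of r factors.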
Let be the vertex of the fundamental simplex. Consider the bundle of the parallel flats in . Assume that the coordinates of this bundle are nonzero, and the remaining coordinates are equal to zero. Then the vertex of the bundle is
The vertex Eq. (107) passes through all vertices of the fundamental simplex, except the vertices . The following theorem then follows from Theorem 3.2.1.
Theorem 3.2.2 (Bose, 1947). The pencil of parallel flats corresponds to interaction effects of order of the factors if and only if the vertex of the corresponding bundle passes through all the vertices of the fundamental simplex other than
Hypercubes of strength t
Consider a full symmetrical design with each factor at levels.
Set up a one-to-one correspondence between the levels of the factors and the elements of Galois field denoted by . The full design corresponds to the points of a finite Euclidean space . Denote coordinates of points of this space as . Then the system of independent equations
with coefficients quite obviously defines the subset of points of or the subset of the full design
Definition 3.3.1. The design consisting of points satisfying the system of linearly independent Eq. (4.3) is called a geometric design.
We will also use Definition 3.3.1 of geometric designs when right-hand sides of Eq. (4.3) have any elements (not necessarily zero). However, hereafter, except Section 4.6, we will assume that right-hand sides of Eq. (4.3) are zero.
Theorem 3.3.1 (Rao, 1950). The points satisfying the system Eq. (4.3) form a hypercube of strength if and only if there is no nontrivial linear combination of Eq. (4.3) that contains fewer than nonzero coefficients.
Proof Assume that any nontrivial linear combination of Eq. (4.3) contains at least nonzero coefficients. Then any combination of any set of factors satisfies the system Eq. (4.3). Indeed, without loss of generality, we can select the factors as a set of factors. Fix them at certain levels. Then Eq. (4.3) is transformed into the following system:
Each equation of Eq. (4.3) contains at least one nonzero coefficient. Without loss of generality, assume that . Sum up the -th equation and the -th equation with appropriate multiplier. Then we get the equation with . This equation also has at least one nonzero coefficient. We can assume that . Going further with this process of diagonalization, transform the system Eq. (4.3) to a semi-diagonal type. Then we get that the system Eq. (4.3) always has at least one solution, and the number of different solutions is constant and equals . Hence, any combination of levels of any factors occurs exactly times. Therefore, the design is a hypercube of strength .
Now suppose that there exists a linear combination of Eq. (4.3) that forms the equation
where . Then it is evident that not all combinations of levels of the factors satisfy Eq. (4.3). This proves the theorem.
Theorem 3.3.2 (Bose & Bush, 1952). Suppose that there exists a matrix of size ( is prime) such that any of its submatrices of size has rank . Then there exists an orthogonal array
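Theorem 3.3.1 can be illustrated on a small geometric design. In the sketch below the two generating equations over the field with 3 elements are hypothetical, chosen so that every nontrivial linear combination has three nonzero coefficients; the relation "strength equals minimum number of nonzero coefficients minus one" is our reading of the theorem, and the function names are ours.

```python
from itertools import product, combinations
from collections import Counter

s, m = 3, 4
# Hypothetical generating equations over GF(3); rows hold coefficients of x1..x4.
equations = [(1, 1, 2, 0),    # x1 + x2 + 2*x3 = 0
             (1, 2, 0, 2)]    # x1 + 2*x2 + 2*x4 = 0

design = [x for x in product(range(s), repeat=m)
          if all(sum(a * v for a, v in zip(eq, x)) % s == 0 for eq in equations)]

def min_weight(eqs):
    """Smallest number of nonzero coefficients over nontrivial combinations."""
    weights = []
    for lam in product(range(s), repeat=len(eqs)):
        if any(lam):
            combo = [sum(l * e[j] for l, e in zip(lam, eqs)) % s for j in range(m)]
            weights.append(sum(1 for v in combo if v))
    return min(weights)

def has_strength(points, t):
    """True if every t columns show each level combination equally often."""
    for cols in combinations(range(m), t):
        counts = Counter(tuple(p[c] for c in cols) for p in points)
        if len(counts) != s ** t or len(set(counts.values())) != 1:
            return False
    return True

w = min_weight(equations)
print(len(design), w)                                          # 9 3
print(has_strength(design, w - 1), has_strength(design, w))    # True False
```

The nine points form a hypercube of strength 2 but not of strength 3, in agreement with the minimum of three nonzero coefficients.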
Proof Consider the matrix of the full design of size . We will show that the matrix of size is the orthogonal array .
Let be a submatrix of size of the matrix , and let be the submatrix of size of the matrix corresponding to . Each row of is a combination of different rows of the matrix . Hence, each row in occurs times. Therefore, is an orthogonal array of strength and index
Elements of rows of the matrix can be interpreted as coordinates of points in a finite projective space such that no of them belong to a subspace of dimension or less. Therefore, this condition is equivalent to the condition of Theorem 3.3.2.
We will show that two conditions of Theorem 3.3.2 are equivalent to the condition of Theorem 3.3.1.
Theorem 3.3.3. The following three statements are equivalent:
1. There exists a matrix of size ( is prime) such that any of its submatrices of size has rank .
2. There exist points in projective space such that no of them belong to a subspace of dimension or less.
3. There exists a system of Eq. (4.3) such that there is no nontrivial linear combination of the equations that contains fewer than nonzero coefficients.
Proof Suppose that statement 3 of the theorem holds. We will use the following transformations of a matrix: addition of a multiple of one row to another and permutation of rows or columns; we call them elementary transformations. It is evident that the coefficient matrix of the system Eq. (4.3) can be converted by elementary transformations to the following matrix:
Each row of the matrix Eq. (111) is a nontrivial linear combination of rows of the matrix . Hence, each row of the matrix Eq. (111) contains at least nonzero elements. That is, for any there are at least nonzero elements among the numbers . A similar statement can be made for any nontrivial combination of rows of the matrix Eq. (111). Namely, a nontrivial linear combination with nonzero coefficients of rows of the matrix Eq. (111) contains at least nonzero elements, and at least of them occur in the first columns.
Consider the matrix
and some nontrivial linear combination of its rows with rows selected from the first rows and rows selected from the remaining rows. It is evident that the first elements of this linear combination contain at least nonzero elements. That means that any submatrix of size of the matrix
of size has rank .
Suppose that statement 1 of the theorem holds and the rank of the matrix equals . Select independent rows in the matrix (without loss of generality, assume that these rows are the last rows of ). Then each row of the matrix can be represented by a linear combination of rows of the matrix:
Consider the matrix . The number of nonzero elements in the -th row of the matrix cannot be less than . Otherwise, by Eq. (4.3), could be represented by a linear combination of these nonzero elements and therefore, there would exist linearly independent rows of the matrix . Thus, we have arrived at a contradiction.
For the same reason, a nontrivial linear combination of rows of the matrix cannot contain fewer than nonzero elements.
Form the following matrix of size :
Using the properties of the matrix , we can get that any nontrivial linear combination of any rows of the matrix Eq. (115) contains at least nonzero elements. Since , the matrix of required size can be generated from the matrix Eq. (115) by deleting any rows.
This completes the proof of the theorem.
We will say that we are using the geometric method of construction of hypercubes of strength when we construct them based on Theorem 3.3.1 or Theorem 3.3.2. In this case the points of the hypercube satisfy a system of type Eq. (4.3).
Alias sets of pencils of parallel flats
In this section, we will concentrate on the nature of pencils of parallel flats in fractional geometric designs (Brodsky, 1972).
Consider the full symmetrical design (the design ) and all pencils of parallel flats
The nature of the contrasts generated by these pencils is defined by Theorem 3.2.1. Let be the subset of points of the design generated by independent equations Eq. (4.3).
Equations (4.3) are called the generating relations of the design . The pencils are called generators of the design . Note that for a given design , the selection of generators is not unique. The pencils
( are not simultaneously equal to zero) are called defining pencils of the design . It is evident that we can get a unique representation of the defining pencils Eq. (117) if we set the first nonzero coordinate to 1. Therefore, the total number of different defining pencils of the design generated by Eq. (4.3) equals .
For the design consider the vector with coordinates equal to the coordinates of the contrast generated by the pencil in . In this case we will say that the contrast in generates in . If is a defining pencil of the design , then all points of belong to one of the flats of Eq. (116) (for ) and no points of belong to the other flats of Eq. (116) (for ). In this case generates the zero vector in . It is evident that the vector is generated by those and only those contrasts that correspond to defining pencils of the design . In this case we shall say that the vector and the vectors generated by defining pencils of the design belong to the same alias set (for the design ). We shall also say that defining pencils of the design belong to the same alias set.
If is not a defining pencil of the design , each flat of the pencil Eq. (116) intersects in the points of a -flat, which we denote by . In this case each point of the design belongs to one and only one flat of the pencil Eq. (116) and, therefore, to one and only one flat . Hence, a pencil of parallel flats in generates a pencil of parallel -flats in the design which we denote by .
Consider the flat
of the pencil
It is evident that the flat Eq. (118) intersects Eq. (4.3) in the same points as Eq. (116). Besides, all flats intersecting Eq. (4.3) in the same points as Eq. (116) are represented by Eq. (118).
Since the pencils are linearly independent, different sets Eq. (119) correspond to different pencils. Therefore, the total number of different pencils of type Eq. (119) equals . Hence, for any pencil that is not a defining pencil of the design there exist pencils, including , that generate the same pencil in the design . We shall say that such pencils belong to the same alias set of the design . We shall also say that contrasts generated by these pencils belong to one alias set.
The total number of pencils in equals . The alias set of defining pencils of the design consists of pencils. Therefore, the number of different alias sets of nondefining pencils equals
Consider two different pencils and that are generated by pencils of two different alias sets for the design . The rows of the matrix
are linearly independent. Hence, any given -flat of the pencil intersects different -flats of the pencil in different -flats. Any two of these -flats have no point in common, because two different -flats of the pencil have no point in common. Any -flat contains exactly points. Therefore, each -flat of the pencil contains exactly of the points belonging to the given -flat of the pencil . Therefore, by Lemma 3.1.1, the degrees of freedom carried by the pencil are orthogonal to the degrees of freedom carried by the pencil . Therefore, the following theorem has been proved.
Theorem 3.4.1. All pencils of parallel flats in are split into alias sets with pencils in each and one alias set with defining pencils. The pencils belonging to the same alias set generate identical pencils of parallel flats in the design . The pencils from different alias sets generate pencils of parallel flats in the design with orthogonal degrees of freedom.
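The alias structure described in Theorem 3.4.1 can be enumerated for a small fraction. The sketch below assumes 3 levels, 3 factors, and one hypothetical generating relation x1 + x2 + x3 = 0; it groups pencils of the full design by the partition they induce on the fraction, treating two pencils as aliased when these partitions coincide. The names `induced_partition` and `alias_sets` are ours.

```python
from itertools import product
from collections import defaultdict

s, m = 3, 3
generators = [(1, 1, 1)]          # hypothetical generating relation x1 + x2 + x3 = 0
points = list(product(range(s), repeat=m))
design = [x for x in points
          if all(sum(a * v for a, v in zip(g, x)) % s == 0 for g in generators)]

# One representative per pencil of the full design.
pencils = [a for a in product(range(s), repeat=m)
           if any(a) and next(v for v in a if v) == 1]

def induced_partition(a):
    """Partition of the fraction into the flats of the pencil a.x = const."""
    flats = defaultdict(set)
    for x in design:
        flats[sum(ai * xi for ai, xi in zip(a, x)) % s].add(x)
    return frozenset(frozenset(f) for f in flats.values())

alias_sets = defaultdict(list)
for a in pencils:
    alias_sets[induced_partition(a)].append(a)

for partition, members in alias_sets.items():
    kind = "defining" if len(partition) == 1 else "alias set"
    print(kind, members)
```

For this fraction the 13 pencils fall into one set containing the single defining pencil and four alias sets of three pencils each.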
It follows from the proof of Theorem 3.3.1 that if the design contains all combinations of levels of factors , no defining pencil has all coordinates other than simultaneously equal to zero. A pencil that has some of the coordinates (namely, ) not equal to zero and the rest of the coordinates equal to zero cannot be a defining pencil and cannot be in the same alias set as a pencil that has all coordinates other than simultaneously equal to zero.
Therefore, if the design contains all levels of the factor , the pencil corresponding to main effects of this factor in cannot be a defining pencil of the design . Hence, the pencil in generated by the pencil forms contrasts orthogonal to the vector , with coordinates that depend only on levels of the factor . Therefore, the pencil also forms a full set of main effects of the factor in the design .
If the design contains all combinations of levels of two factors and , any pencil corresponding to interaction effects of these factors in cannot be a defining pencil of the design . Hence, the pencil in generated by the pencil forms contrasts orthogonal to the vector and all main effects of the factors and (because the pencils corresponding to main effects of factors and cannot be in the alias set together with the pencil ). Since the pencil forms contrasts with coordinates that depend only on levels of the factors and , these contrasts are interaction effects of the factors and . Contrasts corresponding to all pencils form a full set of interaction effects of the factors and .
Continuing this reasoning by induction, we get the following theorem.
Theorem 3.4.2. If the design contains all combinations of levels of the factors or (which is the same) no defining pencil has all coordinates other than simultaneously equal to zero, all pencils corresponding to interaction effects of the factors in generate pencils of parallel flats in the design corresponding to a full set of interaction effects of these factors in the design .
We will get independent effects if we select not more than one pencil from each alias set.
Defining relation
Consider a geometric design generated by independent Eq. (4.3). We will call the following relationship a defining relation of the design :
The coefficients in Eq. (120) match the coordinates of the defining pencils. A so-called standard defining relation is derived from the defining relation Eq. (120) by multiplying each of its sides by an element such that the first nonzero coefficient of that side equals 1.
By Theorem 3.3.1, the geometric design is a hypercube of strength if and only if no side of the defining relation Eq. (120) contains fewer than nonzero coefficients.
Two-level designs
Any geometric two-level design (as any geometric design) is uniform. I.e., any level of any factor occurs in the design exactly times ( is the number of treatment combinations in the design). Hence, the Chebyshev model will be the same as the -model of true effects. Then a full factorial model is
where for one of two levels of the factor and for another level.
Consider two matrices of the design : and , where , if the factor occurs in the -th treatment of the design at level 0, and , if the factor occurs in the -th treatment of the design at level 1. Then, obviously, the following two equalities are equivalent:
Therefore, the system of the generating relations Eq. (4.3) for the geometric two-level design is transformed as follows:
where ( or 1), .
The system Eq. (4.6) in accordance with this section corresponds to a subset of points of the full design .
Expressions of the form or are called interactions (as opposed to interaction effects), or -letter interactions if exactly of the numbers equal 1. We will use the concepts of generating, defining, and independent interactions by analogy with the concepts of generating, defining, and independent flats, i.e., equalities of type Eq. (4.3). Generating interactions will also be called generators.
An elementary transformation of a set of interactions is multiplication of one of the interactions of the set by other interactions of the set.
Let be the coefficient matrix of the full design (the design ) for the full factorial model Eq. (121). As a simple consequence of Theorem 1.4.1 and Note 1 to Theorem 1.4.1 we get the following theorem.
Theorem 3.6.1. The matrix is a square matrix with elements 1 and 1; all columns of the matrix are pairwise orthogonal; the values and at the points of the design form the vector of the main effect of the factor and the vector of the interaction effect of the factors respectively.
Sometimes, we will use notations and in the design matrix and in the coefficient matrix instead of 1 and 1 respectively.
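The properties stated in Theorem 3.6.1 can be checked numerically for a small case. The sketch below assumes the usual -1/+1 coding of the two levels and three factors; the function names are hypothetical.

```python
from itertools import combinations, product


def full_design(n):
    """All 2**n treatment combinations of n two-level factors in -1/+1 coding."""
    return [row for row in product((-1, 1), repeat=n)]


def coefficient_matrix(design, n):
    """One column per subset of factors; the empty subset gives the unit column."""
    subsets = [s for r in range(n + 1) for s in combinations(range(n), r)]
    matrix = []
    for row in design:
        coded = []
        for s in subsets:
            value = 1
            for i in s:
                value *= row[i]
            coded.append(value)
        matrix.append(coded)
    return matrix, subsets


n = 3
X, subsets = coefficient_matrix(full_design(n), n)
N = len(X)
# Gram matrix X'X: orthogonal columns give N on the diagonal and 0 elsewhere.
gram = [[sum(X[r][a] * X[r][b] for r in range(N)) for b in range(len(subsets))]
        for a in range(len(subsets))]
print(N, len(subsets))                                   # 8 8
print(all(gram[a][b] == (N if a == b else 0)
          for a in range(N) for b in range(N)))          # True
```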
Consider the matrix of the full design (the design ) and its submatrix defined by the following generating relations:
where are independent interactions .
In the matrix select rows corresponding to the design . Denote the resulting matrix by . Denote by the matrix composed of one representative from each set of identical columns of the matrix .
Theorem 3.6.2. For the design defined by the generating relations Eq. (123) the following statements hold:
1. The matrix has size ; the columns of the matrix are split into alias sets, so that each alias set has identical columns and columns from different alias sets are orthogonal.
2. There exist columns of the design that form the full design (the design ). Among the selected columns there are no columns whose product (in the sense of Definition 1.4.1) corresponds to a defining interaction.
3. The matrix is identical to the coefficient matrix of the design for the full factorial model.
Proof Statement 1 of the theorem follows from Theorem 3.4.1. We will prove statements 2 and 3 by induction. Let . Assume that and are the matrices that contain those rows of the matrices and respectively that satisfy the first generating relation .
If belongs to the interaction delete the column corresponding to from columns of . Then the remaining columns form the full design (the design ), because there are no two identical rows in the . Indeed, if two such rows exist (say, the -th and the -th rows), the -th and the -th elements of the deleted column have different signs (otherwise we get two identical rows in , which is a contradiction). Therefore, the -th and the -th elements of the column corresponding to the interaction have different signs (which is also a contradiction).
Statement 3 of the theorem for is obvious, because contains a unit column, all columns of the full design , and all possible products of columns of the full design .
Now assume that the theorem is valid for for the design defined by generating relations
We will prove that the theorem is valid for for the design defined by generating relations
Let
be the defining interactions of the design . Then it is evident that
are the defining interactions of the design . By the induction hypothesis, among selected columns of the design forming the full design there are no columns whose product produces the defining interactions Eq. (124). Therefore, there are no columns among of them whose product produces two or more defining interactions Eq. (125). Indeed, if two such interactions exist, they are interactions of type and . Their product is an interaction of type , which is a contradiction. Now select the columns forming the full design and consider only those columns whose product forms an interaction from Eq. (125). Delete any of these columns. The remaining columns, obviously, form the full design (the design ). Among of them, there are no columns whose product forms an interaction of the defining relation Eq. (125).
This completes the proof of the theorem.
By Theorem 3.6.2, the columns of the matrix split into alias sets; each alias set has identical columns; columns from different alias sets are orthogonal. It is evident that the alias set containing the interaction can be found by multiplying all interactions of the defining relation
by . Therefore, the alias set that contains the interaction is
We cannot find unique LS estimates of the parameters of the full factorial model Eq. (121) for the fractional design (the design ), because the coefficient matrix , obviously, contains identical columns and, therefore, the matrix is singular. However, if the model contains only one interaction from each alias set, we can find unique LS estimates. Indeed, in this case the matrix with orthogonal columns is the coefficient matrix of the design . The same is valid for the model that has no more than one representative from each alias set of interactions. Hence, the LS estimate of the vector of parameters of the model is
Assume that we are using the design for the postulated model
that contains one interaction for each alias set, but the real model
is the full factorial model Eq. (121) (the matrix is derived from the matrix by deleting the columns included in ). Then the LS estimates Eq. (127) are biased. By Eq. (92),
where is the bias matrix.
Put together the identical columns in . Then the matrix of size is
where each row contains elements equal . Hence, the bias matrix is:
Using Eq. (128), we get the system of scalar equalities. Considering any of them, we get the following theorem.
Theorem 3.6.3. Suppose that we are using the design to estimate coefficients of the postulated model that contains one interaction for each alias set, but the real model is the full factorial model Eq. (121). Then the estimate Eq. (127) of the coefficient corresponding to the interaction is biased. The bias is equal to the sum of effects corresponding to all interactions (excluding ) that belong to the alias set containing .
Hence, any estimate Eq. (127) is an unbiased estimate of the sum of effects corresponding to the interactions from one alias set. Such effects are called confounded.
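The confounding described by Theorem 3.6.3 can be illustrated on a small numerical case. The sketch below is an assumption-laden example, not taken from the article: it uses the half fraction of three two-level factors with the defining interaction x1x2x3 = 1, noise-free observations, and made-up effect values, and it fits the main effect model by least squares (for orthogonal columns the estimate is the column-by-observation sum divided by the number of runs).

```python
from itertools import product
from math import prod

# Hypothetical "true" effects of the full model; keys are tuples of 0-based factor indices.
TRUE_EFFECTS = {(): 10, (0,): 2, (1,): -3, (2,): 1,
                (0, 1): 4, (0, 2): -1, (1, 2): 5, (0, 1, 2): 6}


def response(row):
    """Noise-free observation generated by the full factorial model."""
    return sum(beta * prod(row[i] for i in term) for term, beta in TRUE_EFFECTS.items())


def ls_estimate(design, y, term):
    """LS estimate of one coefficient when the model columns are orthogonal."""
    return sum(prod(r[i] for i in term) * obs for r, obs in zip(design, y)) / len(design)


# Half fraction defined by x1*x2*x3 = +1 (4 runs).
half = [row for row in product((-1, 1), repeat=3) if prod(row) == 1]
y = [response(row) for row in half]

# Postulated model: absolute term and three main effects, one term per alias set.
estimates = {term: ls_estimate(half, y, term) for term in [(), (0,), (1,), (2,)]}
print(estimates)
# {(): 16.0, (0,): 7.0, (1,): -4.0, (2,): 5.0}
```

Each estimate equals the sum of the effects in its alias set: for example, the estimate for x1 is 2 + 5, the effect of x1 plus the effect of x2x3.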
In this paragraph we will discuss estimates of effects for a geometric design assuming that the model contains not more than one interaction from each alias set.
Now we focus on the technique for constructing a family of geometric designs introduced by Box and Hunter (1961) and give a series of theorems based on their ideas.
Consider generators of the geometric design . They can be treated as independent interactions. Change the signs of some of them. It is evident that the resulting interactions are also independent. They correspond to a geometric design that is different from the initial design. There are ways of allocating signs plus and minus to generators. All corresponding designs are said to belong to the same family.
Definition 3.6.1. Generators of one of the designs of the family are called principal generators if they have only positive signs; the corresponding defining relation is called a principal defining relation; the corresponding design is called a principal design of the family.
The whole set of defining relations of the same family can be represented by the following formal relation:
where are the principal generators.
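A family of designs can be enumerated directly by running through the sign allocations. The sketch below assumes five two-level factors in -1/+1 coding and two hypothetical principal generators, x1x2x4 and x1x3x5; the function name `fraction` is illustrative.

```python
from itertools import product
from math import prod


def fraction(n, generators, signs):
    """Runs of the full design for which every generator equals its assigned sign."""
    return [row for row in product((-1, 1), repeat=n)
            if all(prod(row[i] for i in g) == s for g, s in zip(generators, signs))]


# Generators as tuples of 0-based factor indices: x1*x2*x4 and x1*x3*x5.
GENERATORS = [(0, 1, 3), (0, 2, 4)]

# One design per allocation of plus and minus signs to the generators.
family = {signs: fraction(5, GENERATORS, signs) for signs in product((1, -1), repeat=2)}
principal = family[(1, 1)]          # all generators taken with positive signs

print(len(family), [len(d) for d in family.values()])    # 4 [8, 8, 8, 8]
all_runs = {row for design in family.values() for row in design}
print(len(all_runs))                                     # 32
```

In this example the four designs of the family are disjoint and together give all 32 treatment combinations of the full design.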
Lemma 3.6.1. If the interactions
are independent, the interactions
are also independent.
Proof Suppose, to the contrary, that the following relationship is valid:
where , , and equal either or 1.
Then we get
Since the interactions Eq. (129) are independent, we arrive at a contradiction. This proves the lemma.
Lemma 3.6.1 can be reformulated as follows.
Lemma 3.6.2. If Eq. (129) are generators of the design , Eq. (130) are also generators of the design.
Lemma 3.6.3. For two designs of the same family, we can select generators in such a way that all of them are pairwise identical except one pair with the generators that have different signs.
Proof Consider two designs of the same family. The first design has the generators Eq. (129). The generators of the second design are
By Lemma 3.6.2, the interactions Eq. (130) are the generators of the first design, and the interactions
are the generators of the second design, which was to be proved.
Definition 3.6.2. The design that contains all treatments of the designs is called an aggregated design.
Note that an aggregated design is not the same as a union in set theory. For example, if each of two designs contains the same treatment combination, the aggregated design includes this treatment twice.
Theorem 3.6.4. The aggregated design of two geometric designs with the generators Eqs (129) and (131) is also a geometric design with the generators
Proof Since the interactions Eqs (130) and (132) are generators of the first and the second designs respectively, the aggregated design satisfies the relations
It is evident that the interactions of Eq. (133) are independent. Since the number of interactions in Eq. (133) is , these interactions are the generators of the aggregated design.
Thus, the proof is complete.
Since defining relations of two geometric designs of the same family differ only by signs, each alias set of interactions of one design corresponds to some alias set of the other design with interactions that differ only by signs.
Theorem 3.6.5. Let and be two geometric designs of the same family. Then estimates of the aggregated design are half-sums and half-differences of pairs of the unbiased estimates of sums of effects in corresponding alias sets of the designs and .
Proof Since the defining relation contains all possible products of the generators, half of the interactions (including 1) of the defining relation of the design , by Lemma 3.6.2, are identical to half of the interactions of the defining relation of the design ; the other interactions of the defining relation of the design differ in sign from the corresponding interactions of the defining relation of the design . Therefore, in any pair of corresponding alias sets, half of the interactions ( in and ) are identical, and half of the interactions ( in and in ) differ in sign. It is evident that the interactions belong to one alias set in , and the interactions belong to another alias set. Denote by the column of the coefficient matrix of the design corresponding to the interactions . Denote by the column of the coefficient matrix of the design corresponding to the interactions . Then the column of the coefficient matrix of the design corresponding to the interactions is
The column of the coefficient matrix of the design corresponding to the interactions is
The estimate corresponding to the alias set in is
where , and are vector-columns of observations in the designs , and respectively.
The estimate corresponding to alias set is
which was to be proved.
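The half-sum and half-difference relation of Theorem 3.6.5 can be checked on the smallest case. The sketch below assumes two half fractions of three two-level factors from the same family (x1x2x3 = +1 and x1x2x3 = -1), noise-free observations, and made-up effect values; all names are hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical true effects; keys are tuples of 0-based factor indices.
TRUE_EFFECTS = {(): 10, (0,): 2, (1,): -3, (2,): 1,
                (0, 1): 4, (0, 2): -1, (1, 2): 5, (0, 1, 2): 6}


def response(row):
    return sum(beta * prod(row[i] for i in term) for term, beta in TRUE_EFFECTS.items())


def estimate(design, y, term):
    """Column-by-observation sum divided by the number of runs."""
    return sum(prod(r[i] for i in term) * obs for r, obs in zip(design, y)) / len(design)


def half_fraction(sign):
    return [row for row in product((-1, 1), repeat=3) if prod(row) == sign]


d1, d2 = half_fraction(1), half_fraction(-1)
y1, y2 = [response(r) for r in d1], [response(r) for r in d2]

# Aggregated design: all treatments of both designs (list concatenation, not a set union).
aggregated, y = d1 + d2, y1 + y2

e1 = estimate(d1, y1, (0,))   # estimates effect(x1) + effect(x2 x3)
e2 = estimate(d2, y2, (0,))   # estimates effect(x1) - effect(x2 x3)
print(estimate(aggregated, y, (0,)), (e1 + e2) / 2)      # 2.0 2.0
print(estimate(aggregated, y, (1, 2)), (e1 - e2) / 2)    # 5.0 5.0
```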
Let be the geometric design with the generators . Assume that the first generators do not contain the variable . Let be the design derived from by deleting the column . Then the following theorem holds.
Theorem 3.6.6. is a geometric design with the generators
Proof For the design obviously, the following relations hold:
because these relations hold for the design and interactions Eq. (134) do not contain . By Lemma 3.6.1, the interactions Eq. (134) are independent. Their number equals . Hence, they are the generators.
This completes the proof of the theorem.
Next, we will present a few more theorems for two-level geometric designs from the article by Brodsky and Brodsky (1977). The theorems will be accompanied by examples from the same article. In part, these theorems are consequences of the results presented in this article for the general case of geometric designs with factors at levels. However, in the article of Brodsky and Brodsky (1977) one can find direct proofs of the theorems for 2, and we will present some of them below.
Theorem 3.6.7. points satisfying the system Eq. (4.3) contain all combinations of the levels of the factors with equal frequency if and only if any nontrivial linear combination of Eq. (4.3) contains at least one nonzero coefficient other than the -th-th.
The proof of Theorem 3.6.7 is similar to the proof of Theorem 3.3.1.
The set of the factors , by Theorem 3.6.7, contains all combinations of the levels with equal frequency if and only if no defining pencil has simultaneously all nonzero coordinates other than the . Therefore, the following theorem holds.
Theorem 3.6.8. The design corresponding to the generating relations Eq. (4.6) contains all combinations of the levels of the factors , …, with equal frequency if and only if is not a defining interaction for any 0 or 1.
The following statement, due to Rao (1950), is a special case of Theorem 3.6.8 and a restatement of Theorem 3.3.1.
Theorem 3.6.9 (Rao, 1950). The design corresponding to the generating relations Eq. (4.6) is a hypercube of strength if and only if all defining interactions contain more than letters.
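Rao's criterion can be compared with a brute-force check on a small design. The sketch below reads the theorem as "strength t if and only if every defining interaction has more than t letters" (this reading, the two example generators x1x2x4 and x1x3x5, and all function names are assumptions made for illustration).

```python
from collections import Counter
from itertools import combinations, product
from math import prod

N_FACTORS = 5
GENERATORS = [(0, 1, 3), (0, 2, 4)]     # x1*x2*x4 = 1 and x1*x3*x5 = 1, 0-based indices


def defining_interactions(generators):
    """All nontrivial products of the (independent) generators, as sets of indices."""
    words = {frozenset()}
    for g in generators:
        words |= {w ^ frozenset(g) for w in words}
    words.discard(frozenset())
    return words


def strength(design, n):
    """Largest t such that every t columns contain all 2**t level combinations equally often."""
    t = 0
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            counts = Counter(tuple(row[c] for c in cols) for row in design)
            if len(counts) != 2 ** size or len(set(counts.values())) != 1:
                return t
        t = size
    return t


design = [row for row in product((-1, 1), repeat=N_FACTORS)
          if all(prod(row[i] for i in g) == 1 for g in GENERATORS)]
shortest = min(len(w) for w in defining_interactions(GENERATORS))
print(shortest, strength(design, N_FACTORS))   # 3 2
```

Here the shortest defining interaction has three letters, and the brute-force check gives strength 2, in agreement with the criterion.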
The following theorem is a consequence of Theorem 3.4.1 for two-level designs.
Theorem 3.6.10. For the design corresponding to generating relations Eq. (4.6), all effects of the design are split into alias sets (). One of them (defining) contains effects. Each of the remaining alias sets contains effects. Effects from different alias sets (one from each set) generate pairwise orthogonal effects in the design . Effects of the same alias set generate identical effects in the design .
The nature of the effects of alias sets is defined by the following theorem that is a consequence of Theorem 3.4.2.
Theorem 3.6.11. If is not a defining interaction of the design for any 0 or 1, all main effects and interaction effects of the factors , …, in the design generate main effects and interaction effects of the same factors in the design .
Let be the generators of the design . Then the defining relation of the design is
Let be some interaction. Then it follows from Eq. (135) that
All interactions in Eq. (136) are different and their number is equal to (possibly including 1). Hence, the following theorem holds.
Theorem 3.6.12. An alias set that includes an interaction can be represented by the interactions of Eq. (136). If no interaction of Eq. (136) equals 1, the interactions Eq. (136), and only they, form the alias set including . If one of the interactions Eq. (136) equals 1, the remaining interactions, and only they, form the defining alias set including .
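The construction of alias sets by multiplying an interaction through the defining relation can be sketched as follows. The example assumes five two-level factors and two hypothetical generators, x1x2x4 and x1x3x5, ignores signs, and represents an interaction as a set of factor labels; the function names are illustrative.

```python
from itertools import combinations

N_FACTORS = 5
GENERATORS = [{1, 2, 4}, {1, 3, 5}]     # x1 x2 x4 and x1 x3 x5, 1-based labels


def defining_group(generators):
    """1 (the empty set) together with all products of the independent generators."""
    group = {frozenset()}
    for g in generators:
        group |= {w ^ frozenset(g) for w in group}
    return group


def alias_set(word, group):
    """Multiply the interaction `word` by every member of the defining relation."""
    return frozenset(frozenset(word) ^ d for d in group)


def label(word):
    return "".join(f"x{i}" for i in sorted(word)) or "1"


group = defining_group(GENERATORS)
all_words = [frozenset(c) for r in range(N_FACTORS + 1)
             for c in combinations(range(1, N_FACTORS + 1), r)]
alias_sets = {alias_set(w, group) for w in all_words}

print(len(alias_sets), {len(s) for s in alias_sets})       # 8 {4}
print(sorted(label(w) for w in alias_set({1}, group)))
# ['x1', 'x1x2x3x4x5', 'x2x4', 'x3x5']
```

The alias set of 1 is the defining relation itself, and the product of any two members of one alias set is a defining interaction or 1, which is the content of Theorem 3.6.14.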
Theorem 3.6.13. Let
be the pairs of confounded effects in the design . Then is confounded with (they belong to the same alias set).
Proof It is evident that
where is a defining interaction of the design
Hence,
where is, obviously, a defining interaction. This proves the theorem.
Theorem 3.6.14. Let be all interactions of the same alias set. Then and only they form all defining interactions.
Proof The alias set of interactions , by Theorem 3.6.12, can be represented as , , , , …, …. This proves the theorem.
If the model (part of the model Eq. (121)) contains at least two interactions that belong to the same alias set of the design , the coefficient matrix of the design has identical columns. Therefore, the information matrix is singular and the solution of the normal equations of the method of least squares for the parameters of the model is not unique. If the model contains no more than one interaction from each alias set, the solution of the normal equations is unique.
For a special case of the model of main effects that includes only 1-letter interactions, the solution of the normal equations is unique if and only if for the design there is no alias set that contains more than one 1-letter interaction. The last condition, by Theorem 3.6.9, is equivalent to the condition that the design is a hypercube of strength 2.
The case in which the model contains, besides all 1-letter interactions , also some interactions can be reduced to the main effect model as follows. Instead of the nonsingular design for the model
consider the nonsingular geometric main effect design for the model
where correspond to additional factors .
Assume that for the design the following condition holds: there exist the generators . Then and belong to the same alias set for any . However, the interactions belong to the different alias sets, because is a nonsingular main effect design. Therefore, the interactions belong to the different alias sets. Therefore, the following theorem holds.
Theorem 3.6.15. If there exists a nonsingular geometric main effect design in runs for the model Eq. (138) with the generators including then there exists a nonsingular geometric design in runs for the model Eq. (137).
Theorem 3.6.16. For the design corresponding to Eq. (4.6), there exist such and nondefining interactions (unique for the given and one from each alias set) that no interaction of contains any of the letters .
Proof It is evident that there exist such that the set of the generators Eq. (4.6) can be converted to the set of generators by elementary transformations and the following condition holds: the generator contains and does not contain .
Consider now some interaction that belongs to the alias set . Assume that contains and does not contain . Then, by Theorem 3.6.12, and has the property specified by Theorem 3.6.14. To prove that the interaction is the only one for the given , assume to the contrary that there exist two such interactions and . By Theorem 3.6.14, their product is a defining interaction (not containing ). On the other hand, a defining interaction should contain some of as a product of some generators. This contradiction proves the theorem.
Theorem 3.6.17. In the design corresponding to the generating relations Eq. (4.6), there exist factors (columns) forming the full design . The elements of any of remaining columns are the products of the corresponding elements of some columns of (fixed for the given ).
Proof As in the proof of Theorem 3.6.16, find such and generators of the design that the generator contains and does not contain . Then, obviously, any defining interaction contains at least one letter of or (which is the same) any interaction that contains no letter of is not defining. I.e., the interaction is not defining for any or 1. By Theorem 3.6.8, the design contains all combinations of the levels of the factors . These factors, obviously, form the full design. We can get the elements of columns from the generating relations by using the following formula:
where the generators do not contain .
Theorem 3.6.18. There exist such representatives of all nondefining alias sets (one from each set) that their product equals either 1 or any given defining interaction.
Proof By Theorem 3.6.16, we can find such that there exist nondefining interactions (one from each alias set) containing only letters . The number of nondefining alias sets equals . The number of different interactions containing only also equals . Therefore, selected nondefining interactions include all different interactions containing only . It is evident that the number of the interactions including the given letter equals . Therefore, the product of selected nondefining interactions equals 1.
To make this product equal to the given defining interaction replace the interaction with the interaction (the interaction belongs to the alias set with the interaction ).
This proves the theorem.
The following theorem is a simple consequence of Theorem 3.6.18.
Theorem 3.6.19. There exist such representatives of all alias sets (one from each set) that their product equals either 1 or any given defining interaction.
Theorem 3.6.20. The product of the interactions from different alias sets equals either 1 or a defining interaction.
Proof By Theorem 3.6.18, we can select the representatives of alias sets so that . However, for any representative of the -th alias set, , where is a defining interaction. Hence, , which was to be proved.
Application: Program algorithm of factorial designs
The introduced theory of factorial designs can be useful not only for future theoretical studies but also for applications. One example of such an application is a computer algorithm (Brodsky et al., 1978; Brodsky, 2014) for constructing optimal and near-optimal factorial designs. The algorithm includes two basic modules and is based on a combination of analytical methods, a catalog of basic designs, and numerical procedures.
The application of general numerical procedures for constructing an optimal design seems appropriate only for relatively small dimensions. This is not because they lead to time-consuming calculations for large dimensions, but mostly because they do not give the factorial structure of the designs, which is essential for a clear interpretation of the results of experiments. That is why the described algorithm was designed as a combination of analytical techniques and numerical procedures. The algorithm generates designs that are obtained as transformations of some class of prebuilt regular uniform (nongeometric) designs and as transformations of several types of geometric designs.
We used three types of transformations. The first type of transformation, collapsing of the factors, was introduced by Chakravarti (1956) and developed by Addelman (1962). An example of this type of transformation is given in the diagram below.
Three-level factor    Two-level factor
0                     0
1                     1
2                     0
The second type of transformation is called the splitting of factors. It was introduced by Addelman (1962). An example of this type of transformation is given below.
Four-level factor    Two-level factors
0                    0 0 0
1                    1 0 1
2                    0 1 1
3                    1 1 0
The third type of transformation, replacement of factors, is also a technique due to Addelman (1962). This method is the inverse of the splitting procedure. It transforms three two-level factors into one four-level factor.
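The three transformations can be written as simple lookup tables applied to a design column. The sketch below uses the level mappings from the two diagrams above; the constant and function names are hypothetical, and replacement is implemented simply as the inverse lookup of splitting.

```python
# Level mappings taken from the collapsing and splitting diagrams.
COLLAPSE_3_TO_2 = {0: 0, 1: 1, 2: 0}
SPLIT_4_TO_2X3 = {0: (0, 0, 0), 1: (1, 0, 1), 2: (0, 1, 1), 3: (1, 1, 0)}
# Replacement is the inverse of splitting: three two-level columns -> one four-level column.
REPLACE_2X3_TO_4 = {levels: k for k, levels in SPLIT_4_TO_2X3.items()}


def collapse(column):
    """Collapse a three-level column into a two-level column."""
    return [COLLAPSE_3_TO_2[level] for level in column]


def split(column):
    """Split a four-level column into three two-level columns (one tuple per run)."""
    return [SPLIT_4_TO_2X3[level] for level in column]


def replace(triples):
    """Replace three two-level columns by one four-level column."""
    return [REPLACE_2X3_TO_4[tuple(t)] for t in triples]


print(collapse([0, 1, 2, 2, 1, 0]))      # [0, 1, 0, 0, 1, 0]
print(split([0, 1, 2, 3]))               # [(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)]
print(replace(split([3, 0, 2, 1])))      # [3, 0, 2, 1]
```

In the splitting table the third two-level column is the sum modulo 2 of the first two, so `replace` accepts only triples of that form and raises `KeyError` for any other triple.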
The algorithm works for the factorial models that include main effects of quantitative and/or qualitative factors, any set of two-factor interaction effects of two-level factors, and also all interaction effects of sets of three two-level factors.
Input data are supposed to include the following information:
The number of factors and the numbers of their levels.
Required interactions of two-level factors.
The maximal number of experiments.
The maximal block size.
There exist many methods leading to the construction of effective designs for different types of factorial models. It is relatively easy to construct a design using a given method, but it is much more difficult to solve the inverse problem, namely, to find a method of construction corresponding to the requested input data. To illustrate this point, consider the following example. Let
be the generators of a geometric design for 14 two-level factors in 32 runs.
Anyone familiar with the methodology of geometric designs can easily construct the design and the alias sets for the given generators Eq. (139). An analysis of the alias sets shows that the design is nonsingular (and therefore has a wide range of optimal properties) for the model that includes an absolute term, main effects of all 14 factors , all two-factor interaction effects of factors and three two-factor interaction effects of the following pairs of factors: , and . A pair of main effects of two-level factors together with the interaction effect of these factors is equivalent to the main effects of a four-level factor. Therefore, instead of 14 two-level factors we can construct a design for 8 two-level factors and 3 four-level factors. One of these four-level factors can be treated as a block factor (with the block size 8), and the two other four-level factors can be treated as two qualitative factors. Therefore, the geometric design defined by the generators Eq. (139) can be transformed into an effective nonsingular design in four blocks (size 8 each). The model will include an absolute term, effects of levels of two qualitative four-level factors, effects of levels of a block factor, main effects of eight two-level factors and all two-factor interaction effects between the first five of them.
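As a rough check of the example above, the parameters of the transformed model can be counted against the 32 runs. The count below assumes that each four-level factor (including the block factor) contributes three independent effects, which is the usual convention and is consistent with replacing it by two main effects and one interaction effect.

```latex
\underbrace{1}_{\text{absolute term}}
+ \underbrace{2\cdot(4-1)}_{\text{two qualitative factors}}
+ \underbrace{(4-1)}_{\text{block factor}}
+ \underbrace{8}_{\text{main effects}}
+ \underbrace{\binom{5}{2}}_{\text{two-factor interactions}}
= 1 + 6 + 3 + 8 + 10 = 28 \le 32 .
```

The same total is obtained for the original model in 14 two-level factors: 1 + 14 + 10 + 3 = 28, if the two-factor interactions there are those of five of the factors.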
Now consider the inverse problem, which actually occurs in practice. Suppose that we need to estimate parameters of the model that includes an absolute term, effects of levels of two qualitative four-level factors, main effects of eight two-level factors and all two-factor interaction effects between the first five of them. In addition, the number of experiments that have to be conducted in a homogeneous environment must not exceed 10. Finding generators that lead to a nonsingular design for the model specified above is a very laborious task. It presents a significant challenge even for experts in the theory of factorial designs.
In other cases, we are also faced with the problem of constructing an optimal factorial design with the required properties while navigating through a huge number of construction methods. That is why the most sensible approach in this situation is an elaborate computer algorithm for the construction of effective designs.
The described algorithm is based on three prebuilt components (Brodsky et al., 1978; Brodsky, 2014): a catalog of regular uniform (nongeometric) designs, a catalog of optimal transformations, and a set of geometric designs. Numerical procedures have three goals: to find generators of the geometric design corresponding to the program input, to find optimal combinations of transformations, and to split the resulting design into groups of experiments performed in a homogeneous environment. The algorithm is fast and finds an optimal solution for very complicated tasks. In simple cases, this algorithm can also be used manually with the help of all three prebuilt catalogs.
Conclusion
The article introduces the mathematical foundations of the theory of the factorial design of experiments. It presents a formal concept of factorial models and designs. The most significant results based on the introduced concept are obtained in the following areas. The first series of results is represented by theorems developing the fundamental concept of the frequency proportionality condition originally introduced by Plackett (1946). The second area is represented by the results on the equivalence of various factorial models, whether they belong to traditional regression models, models for analysis of variance, or mixed models. The third area of important results contains many powerful theorems on the optimality of factorial designs for various types of factorial models. In particular, these results include theorems based both on the properties of regular factorial designs and on the ideas of Hoel (1958) and Guest (1958) for polynomial regression. Another series of results represents further developments of the fundamental idea of Bose (1947) on the splitting of degrees of freedom in a full symmetrical design. In accordance with the introduced concept, this idea is developed for fractional factorial designs. One more series of theorems is associated with the factorial designs obtained as a solution of a system of linear equations in finite Euclidean space. Many of these results relate to two-level designs, which are the most widely applied.
Thus, the introduced concept of factorial designs and factorial models supports many important aspects of experimental design, including the effectiveness of statistical inferences and construction of the designs, thereby opening the way for advancements in the theory of experiments.
Footnotes
For a discrete domain of points .
References
1. Addelman, S. (1962). Orthogonal main effect plans for asymmetrical factorial experiments. Technometrics, 4, 21-46.
2. Bose, R. C. (1947). Mathematical theory of the symmetrical factorial design. Sankhyā, 8, 107-166.
3. Bose, R. C., & Bush, K. A. (1952). Orthogonal arrays of strength two and three. Annals of Mathematical Statistics, 23, 508-524.
4. Box, G. E. P., & Hunter, J. S. (1961). The 2^{k-p} fractional factorial designs, Part 1. Technometrics, 3, 311-351; Part 2. Technometrics, 3, 449-458.
5. Brodsky, L. I., & Brodsky, V. Z. (1977). Properties of geometric designs 2^k (in Russian). In Nalimov, V. V. (ed.), Regression experiments. Design and analysis (85-102). The Moscow University Press.
6. Brodsky, S. (2014). An introduction to the factorial design of experiments. Manhattan Academia.
7. Brodsky, V. Z. (1971). On orthogonal designs (in Russian). The Moscow University Press.
8. Brodsky, V. Z. (1972). Multifactorial regular designs (in Russian). The Moscow University Press.
9. Brodsky, V. Z. (1975). Factorial experiments: models, designs, optimality (in Russian). In Nalimov, V. V. (ed.), Design of optimal experiments (51-105). The Moscow University Press.
10. Brodsky, V. Z., Brodsky, L. I., Maloletkin, G. N., & Melnikov, N. N. (1978). On computer catalog of factorial designs of experiment (in Russian). In Markova, E. V. (ed.), Problems of Cybernetics, Issue 47, Mathematical-statistical methods of analysis and design of experiments (6-24). USSR Academy of Sciences.
11. Brodsky, V. Z., & Golikova, T. I. (1981). On compatibility of optimality criteria of designs (in Russian). In Markova, E. V. (ed.), Problems of Cybernetics, Non-traditional approach to the design of experiments (158-160). USSR Academy of Sciences.
12. Chakravarti, I. M. (1956). Fractional replication in asymmetrical factorial designs and partially balanced arrays. Sankhyā, 17, 143-164.
13. Cheng, C.-S. (2013). Theory of factorial design: Single- and multi-stratum experiments. Chapman and Hall/CRC Press.
14. Fedorov, V. V. (1972). Theory of optimal experiments. Academic Press.
15. Guest, P. G. (1958). The spacing of observations in polynomial regression. Annals of Mathematical Statistics, 29, 294-299.
16. Hoel, P. G. (1958). Efficiency problems in polynomial estimation. Annals of Mathematical Statistics, 29, 1134-1145.
17. Kiefer, J., & Wolfowitz, J. (1960). The equivalence of two extremum problems. Canadian Journal of Mathematics, 12, 363-366.
18. Mukerjee, R., & Wu, C. F. J. (2006). A modern theory of factorial designs. New York, NY: Springer.
19. Plackett, R. L. (1946). Some generalizations in the multifactorial design. Biometrika, 33, 328-332.
20. Raghavarao, D. (1971). Construction and combinatorial problems in design of experiments. New York, NY: John Wiley.
21. Rao, C. R. (1950). The theory of fractional replication in factorial experiments. Sankhyā, 10, 81-86.
22. Scheffé, H. (1959). The analysis of variance. Oxford, England: Wiley.