Mathematical foundations of the factorial design of experiments

Abstract

The article is devoted to the theory of the design of experiments. It introduces a formal definition of factorial models and factorial designs. On this basis, it builds the mathematical foundations of the factorial design of experiments. The presented concept supports many important aspects of experimental design including the main one: the construction of the optimal designs.

Keywords

Factorial design; factorial models; optimality of factorial designs; main effects; effects of interactions; condition of proportional frequencies; regular designs; finite Euclidean space; geometric designs; definition relation; alias sets

1. Introduction

The purpose of this article is to present a collection of results on the mathematical foundations of the factorial design of experiments obtained by the author. These results were published by the author in various editions in Russian (under the name V.Z. Brodsky) and were not available to a wide range of researchers.

In this article, all essential issues of the design of factorial experiments are considered from a single perspective. Achieving this goal required surmounting certain difficulties, primarily in the foundations of the theory. Hundreds of publications contain the words “factorial designs” in their titles. Yet not all authors use the same definition of this concept. When I was writing my first paper on the subject, I could not find a meaningful definition of factorial designs in statistical publications. I do not see one even now, as I submit this paper.

Raghavarao (1971), in his marvelous book “Construction and Combinatorial Problems in Design of Experiments”, says that the factorial design of experiments occurs when different combinations of the factors at various levels influence a character under study. However, any multidimensional design includes different combinations of factors. How, then, do factorial designs differ from others? Apparently, Raghavarao excludes the one-dimensional case from his definition, though he does not state this outright. In many books, the multidimensional condition is explicitly included in the definition of factorial designs. For example, this is what Mukerjee and Wu (2006) do in their brilliant book “A Modern Theory of Factorial Designs”. They define a factorial experiment as an experiment involving $n$ ($\geqslant 2$) factors that appear at $s_{1},\ldots,s_{n}$ ($\geqslant 2$) levels. However, it is unclear from this definition how exactly the factorial design of experiments differs from any other multidimensional design. Given this definition, any multidimensional experiment has to be considered factorial. Therefore, such a definition is not very productive. The same can be said about defining a general factorial design, that is, a fractional factorial design, as a part of the full factorial design (an implicit definition of this kind is used by Cheng (2013) in his wonderful book “Theory of factorial design”).

Authors of numerous works in different fields of the design of experiments are unlikely to agree with any of these approaches to the definition of factorial designs. Will authors of articles on, say, polynomial designs (including, in particular, rotatable designs) consider that they do research on factorial designs? I do not think so. In reality, none of them has ever used such a concept as “rotatable factorial designs”. No one would consider, for example, the rotatable design of second order in two variables as a fractional factorial design $5^{2}$ (even though it consists of treatment combinations of two five-level factors).

What can then be said about the designs constructed numerically and satisfying, say, the criterion of $D$ -optimality for different models and design spaces, as Fedorov (1972) did, developing the ideas of Kiefer and Wolfowitz (1960)? Will researchers and readers of these papers consider such designs “ $D$ -optimal factorial designs” only because they are multidimensional? Of course not.

What can we say about the authors of books on the design of experiments? Do they effectively use the definition of factorial designs as multidimensional designs? Do they use such a definition, for example, when they think about a structure for their books? No, no one follows such a definition. On the contrary, authors of books on the general problem of the design of experiments (including the Raghavarao book quoted above) consider factorial designs separately from the sections devoted to other types of designs, for example, rotatable designs. On the other hand, books on factorial designs (including the Mukerjee and Wu book mentioned above) do not contain, say, a section devoted to rotatable designs. That means that the authors of these books do not follow the definition of factorial designs as multidimensional ones. They structure their books based on an intuitive understanding of what factorial designs are. And this intuitive understanding has nothing to do with multidimensionality.

So, one of my goals was to introduce the concept of factorial designs and factorial models in a way that would reflect this intuitive understanding. The most important part of it, however, was not only to introduce definitions that would not frighten those who do research in the area of the design of experiments. My main goal was to make the introduced concept productive. And it seems to me that I managed to do it. In its final version, the concept supports many important aspects of experimental design, including the effectiveness of statistical inferences and construction of the designs.

A few more words about terminology. A number of authors started using the term “regular” to refer to the designs generated from finite geometries. In the beginning of the seventies, when my first works on the issue were published, I considered the concept of regular factorial designs as the designs that had certain statistical properties. The designs generated from finite geometries I called geometric. In this article, I will continue to follow these definitions.

2. Factorial models and designs

2.1 Factorial design

Consider $N$ observations $y_{1},\ldots,y_{N}$ and $m$ variables $X_{i}\left(i=1,\ldots,m\right)$ with values $X_{iu}$ corresponding to the $u$ -th observation $\left(u=1,\ldots,N\right)$ . Assume that a mathematical expectation of $y_{u}$ is the following function of $X_{iu}$ and parameters $\theta_{1},\ldots,\theta_{k}$ :

$\displaystyle Ey_{u}=\bm{\Theta}^{T}\bm{f}\left(X_{1u},\ldots,X_{mu}\right),$

where $\bm{\Theta}^{T}=\left(\theta_{1},\ldots,\theta_{k}\right)$ is a vector of unknown parameters; $\bm{f}=\left(f_{1},\ldots,f_{k}\right)^{T}$ is a vector of given functions.

The variable $X_{i}$ is said to be quantitative if all $X_{iu}$ are numbers. The variable $X_{i}$ is said to be qualitative if at least one of the values $X_{iu}$ is a symbol (even if it is represented by a number). This definition of quantitative and qualitative variables should not be regarded as strict. Rather, it can be regarded as an explanation of which model (for quantitative or qualitative factors) will be considered. In other words, quantitative variables are those for which the model for quantitative variables is considered; qualitative variables are those for which the model for qualitative variables is considered.

Each of the different values of the variable $X_{i}$ in the design matrix (or just design) $\bm{D}_{X}=\left\{X_{iu}\right\}$ $(i=1,\ldots,m;$ $u=1,\ldots,N)$ is called a level. The number of different levels of the variable $X_{i}$ is denoted by $s_{i}$ . We will set up a correspondence between the symbols $0,1,\ldots,s_{i}-1$ and the different levels of the variable $X_{i}$ regardless of whether the variable $X_{i}$ is quantitative or qualitative. In this case we actually deal with the factor $F_{i}$ (qualitative or quantitative) and its levels $0,1,\ldots,s_{i}-1$ . Then the design matrix can be rewritten as

$\displaystyle\bm{D}=\begin{Vmatrix}F_{11}&\ldots&F_{m1}\\ \ldots&\ddots&\ldots\\ F_{1N}&\ldots&F_{mN}\\ \end{Vmatrix},$

where the columns correspond to the factors, and the rows correspond to the treatment combinations (or treatments, runs) of design $\bm{D}$ ; $F_{iu}$ is the value of the factor $F_{i}$ in the $u$ -th treatment combination.

A design with $N$ runs for factors $F_{1},\ldots,F_{m}$ with $s_{1},\ldots,s_{m}$ levels respectively will be denoted by $s_{1}\times\ldots\times s_{m}//N$ (or just $s_{1}\times\ldots\times s_{m}$ ).

It is clear that the maximum number of different rows in the design matrix is equal to $s_{1}\ldots s_{m}$ .

Definition 1.1.1. A design $s_{1}\times\ldots\times s_{m}//N$ that consists of $N=s_{1}\ldots s_{m}$ different rows is called a full design. A design that does not contain at least one of $s_{1}\ldots s_{m}$ combinations of levels is called a fractional design.

We do not assume that the treatment combinations of a design are all different: a design may contain identical treatment combinations.

Definition 1.1.2. A design is called symmetrical if all factors have the same number of levels. A design is called uniform if for any given factor, its levels appear equally in the design.
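Definitions 1.1.1 and 1.1.2 can be illustrated by a minimal Python sketch. The function names and the small $2\times 3$ example are hypothetical and serve only as an illustration; designs are assumed to be stored as lists of rows with levels coded $0,1,\ldots,s_{i}-1$.

```python
from itertools import product
from collections import Counter

def full_design(levels):
    """All s_1 * ... * s_m level combinations, one row per treatment combination."""
    return [tuple(row) for row in product(*(range(s) for s in levels))]

def is_fractional(design, levels):
    """True if at least one combination of levels is absent from the design."""
    return len(set(design)) < len(full_design(levels))

def is_symmetrical(levels):
    """True if all factors have the same number of levels."""
    return len(set(levels)) == 1

def is_uniform(design, levels):
    """True if, for every factor, each of its levels occurs equally often."""
    for i, s in enumerate(levels):
        counts = Counter(row[i] for row in design)
        if len({counts.get(level, 0) for level in range(s)}) != 1:
            return False
    return True

levels = (2, 3)
d_full = full_design(levels)                  # 6 different rows
d_part = [(0, 0), (1, 1), (0, 2), (1, 0)]     # 4 of the 6 combinations
```

Here `d_full` is a full, uniform, non-symmetrical design, while `d_part` is fractional and not uniform, because the levels of the second factor do not occur equally often.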

A design will be called factorial only with respect to a specific type of a model for which the design is considered (Brodsky, 1975). The types of factorial models will be listed below.

2.2 Factorial model for quantitative factors

Assume that in the design $\bm{D}$ , all $m$ factors $F_{1},\ldots,F_{m}$ (with $s_{1},\ldots,s_{m}$ levels respectively) are quantitative. Then consider the following model:

$\displaystyle Ey\left(X_{1},\ldots,X_{m}\right)=b_{0}+b_{1}^{(1)}f_{1}^{(1)}\left(X_{1}\right)+\ldots+b_{1}^{\left(s_{1}-1\right)}f_{1}^{\left(s_{1}-1\right)}\left(X_{1}\right)+\ldots+b_{m}^{(1)}f_{m}^{(1)}\left(X_{m}\right)+\ldots+b_{m}^{\left(s_{m}-1\right)}f_{m}^{\left(s_{m}-1\right)}\left(X_{m}\right)+\Pi.$ (1)

In the model Eq. (1), the following notations and assumptions are used. $y\left(X_{1},\ldots,X_{m}\right)$ is an observation that depends on $X_{1},\ldots,X_{m}$ . $\Pi$ contains terms with products $k_{i_{1}\ldots i_{r}}^{\left(q_{1}\ldots q_{r}\right)}f_{i_{1}}^{\left(q_{1}\right)}\left(X_{i_{1}}\right)\ldots f_{i_{r}}^{\left(q_{r}\right)}\left(X_{i_{r}}\right)$ , where $k_{i_{1}\ldots i_{r}}^{\left(q_{1}\ldots q_{r}\right)}$ are constants and $i_{1}\neq\ldots\neq i_{r}$ . The functions $1,f_{i}^{(1)}\left(X_{i}\right),\ldots,f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right)$ are linearly independent at points $X_{i1},\ldots,X_{iN}$ , i.e., Rg ${\bm{G}}_{i}=s_{i}$ for any $i=1,\ldots,m$ , where

$\displaystyle{\bm{G}}_{i}=\begin{Vmatrix}1&f_{i}^{(1)}\left(X_{i1}\right)&\ldots&f_{i}^{\left(s_{i}-1\right)}\left(X_{i1}\right)\\ 1&f_{i}^{(1)}\left(X_{i2}\right)&\ldots&f_{i}^{\left(s_{i}-1\right)}\left(X_{i2}\right)\\ \vdots&\vdots&\ddots&\vdots\\ 1&f_{i}^{(1)}\left(X_{iN}\right)&\ldots&f_{i}^{\left(s_{i}-1\right)}\left(X_{iN}\right)\\ \end{Vmatrix}$ (2)

The functions $f_{i}^{(q)}\left(X_{i}\right)$ can be polynomials in $X_{i}$ of degree $q$ . In particular, for any $i=1,\ldots,m$ the functions $f_{i}^{(1)}\left(X_{i}\right),\ldots,f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right)$ can be the Chebyshev orthogonal polynomials at points $X_{i1},\ldots,X_{iN}$ . In this case, the columns of the matrix $\bm{G}_{i}$ are pairwise orthogonal. The corresponding model is called the Chebyshev model.
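As an illustration of the Chebyshev model, the following sketch builds columns of a matrix $\bm{G}_{i}$ by applying the Gram-Schmidt process to the monomials $1,X_{i},\ldots,X_{i}^{s_{i}-1}$ evaluated at the design points. This is only one possible way to obtain polynomials orthogonal at the given points; the function names and the example points are hypothetical, and the columns are left unnormalized.

```python
from fractions import Fraction

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def orthogonal_polynomial_columns(points, s):
    """Gram-Schmidt applied to 1, x, ..., x^(s-1) evaluated at the points.

    Returns s pairwise orthogonal columns; column q holds the values of a
    polynomial of degree q at the design points.
    """
    columns = []
    for degree in range(s):
        v = [Fraction(x) ** degree for x in points]
        for c in columns:
            coef = dot(v, c) / dot(c, c)
            v = [vi - coef * ci for vi, ci in zip(v, c)]
        columns.append(v)
    return columns

points = [0, 1, 2, 0, 1, 2]   # a three-level factor observed in N = 6 runs
G = orthogonal_polynomial_columns(points, 3)
```

For these points the linear column is proportional to $(-1,0,1)$ and the quadratic column to $(1,-2,1)$ on the three levels, which are the familiar orthogonal polynomial contrasts for three equally spaced levels.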

Definition 1.2.1. The model Eq. (1) is called a full factorial model for quantitative factors (or an $A^{f}$ -model) for the factorial design $\bm{D}$ if $\Pi$ contains all possible terms with the products $k_{i_{1}\ldots i_{r}}^{\left(q_{1}\ldots q_{r}\right)}f_{i_{1}}^{\left(q_{1}\right)}\left(X_{i_{1}}\right)\ldots f_{i_{r}}^{\left(q_{r}\right)}\left(X_{i_{r}}\right)$ $(i_{1}\neq\ldots\neq i_{r})$ .

The coefficient matrix for the $A^{f}$ -model Eq. (1) is

$\displaystyle\bm{X}=\begin{Vmatrix}1&f_{1}^{(1)}\left(X_{11}\right)&\ldots&f_{m}^{\left(s_{m}-1\right)}\left(X_{m1}\right)&\ldots\\ 1&f_{1}^{(1)}\left(X_{12}\right)&\ldots&f_{m}^{\left(s_{m}-1\right)}\left(X_{m2}\right)&\ldots\\ \vdots&\vdots&\ddots&\vdots&\ddots\\ 1&f_{1}^{(1)}\left(X_{1N}\right)&\ldots&f_{m}^{\left(s_{m}-1\right)}\left(X_{mN}\right)&\ldots\\ \end{Vmatrix}$ (3)

Definition 1.2.2. A set of factors $F_{1},\ldots,F_{m}$ , pairs of factors $F_{i_{1}}F_{i_{2}}\left(i_{1}\neq i_{2}\right)$ , triples of factors $F_{j_{1}}F_{j_{2}}F_{j_{3}}\left(j_{1}\neq j_{2}\neq j_{3}\right)$ , etc., is called a factorial set $\Omega$ if the following requirement is satisfied: if $F_{n_{1}}\ldots F_{n_{r}}\in\Omega$ , then $F_{l_{1}}\ldots F_{l_{v}}\in\Omega$ for all $v=1,\ldots,r-1$ and all $l_{1},\ldots,l_{v}\in\left\{n_{1},\ldots,n_{r}\right\}$ with $l_{1}\neq\ldots\neq l_{v}$ .
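The closure requirement of Definition 1.2.2 can be checked mechanically. The sketch below is only an illustration: the function name is hypothetical, each member of $\Omega$ is assumed to be given as a collection of factor indices, and only the closure requirement is tested (not whether all single factors are present).

```python
from itertools import combinations

def is_factorial_set(omega):
    """Check that every nonempty proper subset of each member is also a member."""
    members = {frozenset(term) for term in omega}
    for term in members:
        for size in range(1, len(term)):
            for sub in combinations(sorted(term), size):
                if frozenset(sub) not in members:
                    return False
    return True

# F1, F2, F3 and the pair F1F2: closed under taking subsets
omega_ok = [(1,), (2,), (3,), (1, 2)]
# The triple F1F2F3 is present, but the pairs F1F2, F1F3, F2F3 are missing
omega_bad = [(1,), (2,), (3,), (1, 2, 3)]
```

In the second example the set is not factorial, because the triple $F_{1}F_{2}F_{3}$ is included while the pairs of these factors are not.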

Definition 1.2.3. The model for the factorial design $\bm{D}$

$\displaystyle Ey\left(X_{1},\ldots,X_{m}\right)=b_{0}+\sum\limits_{i=1}^{m}\left[b_{i}^{(1)}f_{i}^{(1)}\left(X_{i}\right)+\ldots+b_{i}^{\left(s_{i}-1\right)}f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right)\right]+\sum\limits_{i_{1}i_{2}}\Big[b_{i_{1}i_{2}}^{1,1}k_{i_{1}i_{2}}^{1,1}f_{i_{1}}^{(1)}\left(X_{i_{1}}\right)f_{i_{2}}^{(1)}\left(X_{i_{2}}\right)+\ldots+b_{i_{1}i_{2}}^{s_{i_{1}}-1,s_{i_{2}}-1}k_{i_{1}i_{2}}^{s_{i_{1}}-1,s_{i_{2}}-1}f_{i_{1}}^{\left(s_{i_{1}}-1\right)}\left(X_{i_{1}}\right)f_{i_{2}}^{\left(s_{i_{2}}-1\right)}\left(X_{i_{2}}\right)\Big]+\ldots$ (4)

is called a factorial model for quantitative factors for the factorial set $\Omega$ (or an $A^{\Omega}$ -model) if the following requirement is satisfied: if the model Eq. (4) includes the term $k_{i_{1}\ldots i_{r}}^{q_{1}\ldots q_{r}}f_{i_{1}}^{\left(q_{1}\right)}\left(X_{i_{1}}\right)\ldots f_{i_{r}}^{\left(q_{r}\right)}\left(X_{i_{r}}\right)$ for some set of $q_{1},\ldots,q_{r}$ , then the model includes the corresponding terms for all $q_{1}=0,\ldots,s_{i_{1}}-1,\ldots,q_{r}=0,\ldots,s_{i_{r}}-1$ (by definition, $f_{i}^{(0)}\left(X_{i}\right)=1$ ).

It is evident that Definition 1.2.3 is consistent with Definition 1.2.2.

An $A^{\Omega}$ -model Eq. (4) is a general model for quantitative factors. An $A^{f}$ -model, for example, is a special case of an $A^{\Omega}$ -model.

2.3 Main effects and interaction effects

In an $N$ -dimensional Euclidean space $E_{N}$ we set up a correspondence between the $u$ -th coordinate of each vector and the $u$ -th treatment combination of the design $\bm{D}$ .

Definition 1.3.1. A nonzero vector $\bm{z}^{T}=\left(z_{1},\ldots,z_{N}\right)\in E_{N}$ is defined to be a contrast if

$\displaystyle\sum\limits_{u=1}^{N}{z_{u}=0}.$ (5)

Definition 1.3.2. The vector of the main effect of the factor $F_{i}$ of the design $\bm{D}$ is a contrast with equal coordinates for the same levels of the factor $F_{i}$ in the design $\bm{D}$ . The vector of the main effect is also called the vector of the interaction effect of order 0.

The definition of the vector of the interaction effect of $(r-1)$ -th order $(r\geqslant 2)$ is based on the definition of the vector of the interaction effect of $(r-2)$ -th order.

Definition 1.3.3. The vector of the interaction effect of $(r-1)$ -th order (or the vector of the $r$ -factorial interaction effect) of the factors $F_{1},\ldots,F_{r}$ of the design $\bm{D}$ is a contrast with equal coordinates for the same combinations of levels of the factors $F_{1},\ldots,F_{r}$ in the design $\bm{D}$ , orthogonal to all vectors of interaction effects up to $(r-2)$ -th order of the factors $F_{1},\ldots,F_{r}$ .

We may omit the word “vector” in the above two definitions.

A linear combination of several interaction effects of $(r-1)$ -th order $(r\geqslant 1)$ of $r$ factors is, obviously, an interaction effect of $(r-1)$ -th order of the same factors or the zero vector. Therefore, the set of all interaction effects of $(r-1)$ -th order of $r$ factors, together with the zero vector, is a linear subspace of the space $E_{N}$ .

Definition 1.3.4. The number of degrees of freedom carried by interaction effects of $(r-1)$ -th order for the design $\bm{D}$ is the dimension of the corresponding linear subspace.

It is evident that the number of degrees of freedom carried by main effects of the factor $F_{i}$ for any design is equal to $s_{i}-1$ .
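To make this count concrete, the following sketch constructs $s_{i}-1$ main-effect vectors for a single factor column with unequal level frequencies. The construction (comparing each level with level 0) is just one convenient choice and the names are hypothetical; the vectors are linearly independent because each one is nonzero on a level where all the others vanish.

```python
from collections import Counter

def is_contrast(z):
    """Definition 1.3.1: a nonzero vector whose coordinates sum to zero."""
    return any(z) and sum(z) == 0

def main_effect_vectors(column):
    """s - 1 main-effect contrasts for one factor column.

    Assumes the levels are coded 0..s-1 and each level occurs at least once.
    Vector k equals +n_0 on runs at level k, -n_k on runs at level 0 and 0
    elsewhere, where n_k is the number of runs at level k.
    """
    n = Counter(column)
    s = len(n)
    return [
        [n[0] if x == k else (-n[k] if x == 0 else 0) for x in column]
        for k in range(1, s)
    ]

column = [0, 0, 1, 2, 2, 2]      # three levels with frequencies 2, 1, 3
Z = main_effect_vectors(column)
```

Each vector in `Z` sums to zero and takes equal values on runs with the same level, so it satisfies Definition 1.3.2, and there are exactly $s_{i}-1=2$ of them.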

The requirement of orthogonality of the $(r-1)$ -th order interaction effects to all interaction effects up to $(r-2)$ -th order of the same factors is obviously equivalent to the requirement of orthogonality to a maximal linearly independent subset of the corresponding interaction effects.

Definition 1.3.5. The matrix $\bm{F}_{i}$ composed of the maximum subset of $s_{i}-1$ independent vectors of main effects of the factor $F_{i}$ is called a matrix of main effects of the factor $F_{i}$ . The matrix $\bm{F}_{ij}$ composed of the maximum subset of independent vectors of interaction effects of the factors $F_{i}$ and $F_{j}$ is called a matrix of interaction effects of the factors $F_{i}$ and $F_{j}$ , etc.

Introduce the following notation:

$\displaystyle\bm{\Phi}_{1\ldots r}=\left\|\bm{I},\bm{F}_{1},\ldots,\bm{F}_{r},\bm{F}_{12},\ldots,\bm{F}_{1\ldots r}\right\|,$

where $\bm{I}$ is a unit vector (with all elements equal to 1).

We will assume that any matrix in $\bm{\Phi}_{1\ldots r}$ is normalized in such a way that the sum of squares of the elements of each of its columns is equal to $N$ . In the matrix $\bm{F}_{i}$ , for each subset of identical rows, we delete all but one row and add a left column consisting of ones. Denote the resulting matrix by $\bar{\bm{\Phi}}_{i}$ .

Matrices $\bm{F}_{i}$ can be used as matrices $\bm{G}_{i}$ Eq. (2) for the $A^{f}$ -model Eq. (1), since

$\displaystyle Rg\left\|\bm{I},\bm{F}_{i}\right\|=Rg{\bar{\bm{\Phi}}}_{i}=s_{i}.$

2.4 The main theorem for full design

Definition 1.4.1. For two vectors $\bm{a}=\left(a_{1},\ldots,a_{N}\right)^{T}$ and $\bm{c}=\left(c_{1},\ldots,c_{N}\right)^{T}$ , introduce the operation $\otimes$ , called multiplication, whose product is

$\displaystyle\bm{a}\otimes\bm{c}=\left(a_{1}c_{1},\ldots,a_{N}c_{N}\right)^{T}.$

Let columns of the ( $N\times n$ )-matrix $\bm{A}$ be $\bm{a}_{1},\ldots,\bm{a}_{n}$ , and columns of the ( $N\times l$ )-matrix $\bm{C}$ be $\bm{c}_{1},\ldots,\bm{c}_{l}$ . Then, by definition,

$\displaystyle\bm{A}\otimes\bm{C}=\left\|\left(\bm{a}_{1}\otimes\bm{c}_{1}\right),\left(\bm{a}_{1}\otimes\bm{c}_{2}\right),\ldots,\left(\bm{a}_{n}\otimes\bm{c}_{l}\right)\right\|.$
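The operation $\otimes$ is a coordinatewise product of columns. A minimal sketch, assuming that matrices are stored as lists of columns (the function names are hypothetical):

```python
def vec_mul(a, c):
    """a (x) c: coordinatewise product of two vectors of equal length."""
    return [x * y for x, y in zip(a, c)]

def mat_mul_columns(A, C):
    """A (x) C for matrices given as lists of columns.

    The result lists a_1 (x) c_1, a_1 (x) c_2, ..., a_n (x) c_l in this
    order, so it has n * l columns of length N.
    """
    return [vec_mul(a, c) for a in A for c in C]

A = [[1, 2, 3], [1, 0, -1]]     # N = 3, n = 2 columns
C = [[2, 2, 2], [0, 1, 0]]      # N = 3, l = 2 columns
P = mat_mul_columns(A, C)
```

The product of an $(N\times n)$ -matrix and an $(N\times l)$ -matrix therefore has $nl$ columns, which is the column count used in the recurrence for $\bm{R}_{i}$ in the proof of Theorem 1.4.1.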

Theorem 1.4.1. For a full factorial design, any interaction effect of a set of factors is orthogonal to any interaction effect of another set of factors, and the number of columns of the matrix $\bm{F}_{i_{1}\ldots i_{r}}$ is equal to $\left(s_{i_{1}}-1\right)\ldots\left(s_{i_{r}}-1\right)$ .

Proof. Consider any two rows of the matrix of the full design $\bm{D}^{f}$ . Since all rows of a full design are different, there exists a column corresponding to some factor such that the factor has different levels in the two selected rows. Without loss of generality, it can be assumed that the first two rows and the last column are considered. Select the columns in the matrices $\bm{F}_{i}=\{F_{i}^{pq}\}$ to make them pairwise orthogonal. It is evident that in the full design, all levels of the given factor occur equally often. Therefore, the columns of the matrix $\bar{\bm{\Phi}}_{i}=\{\bar{\Phi}_{i}^{jl}\}$ will be pairwise orthogonal. Besides, for any $j$ and $l,$

$\displaystyle\sum\limits_{j}\left(\bar{\Phi}_{i}^{jl}\right)^{2}=\sum\limits_{l}\left(\bar{\Phi}_{i}^{jl}\right)^{2}.$ (6)

It follows from Eq. (6) that for any $p$

$\displaystyle\sum\limits_{q}\left(F_{i}^{pq}\right)^{2}=const.$ (7)

Define $\bm{R}_{1}=\left\|\bm{I},\bm{F}_{1}\right\|$ . The number of columns of the matrix $\bm{R}_{1}$ equals

$\displaystyle\lambda_{R_{1}}=s_{1}.$ (8)

Define $\bm{R}_{i}\left(i=2,\ldots,m\right)$ by the following recurrence relation:

$\displaystyle\bm{R}_{i}=\left\|\bm{R}_{i-1},\left(\bm{R}_{i-1}\otimes\bm{F}_{i}\right)\right\|.$

The number of the columns $\lambda_{R_{i}}$ of the matrix $\bm{R}_{i}$ and the number of the columns $\lambda_{R_{i-1}}$ of the matrix $\bm{R}_{i-1}$ are connected by the following obvious relation:

$\displaystyle\lambda_{R_{i}}=\lambda_{R_{i-1}}s_{i}.$ (9)

Consider the matrix $\bm{R}_{m}=\left\|\bm{R}_{m-1},\left(\bm{R}_{m-1}\otimes\bm{F}_{m}\right)\right\|$ . The number of the columns $\lambda_{R_{m}}$ of the matrix $\bm{R}_{m}$ , by Eqs (8) and (9), equals $\lambda_{R_{m}}=s_{1}\ldots s_{m}=N$ . Hence, $\bm{R}_{m}$ is a square matrix.

We will prove that the selected two first rows of the matrix $\bm{R}_{m}$ are orthogonal. Let the first two rows of the matrix $\bm{R}_{m-1}$ be $\bm{a}^{T}$ and $\bm{c}^{T}$ . Then the first two rows of the matrix $\bm{R}_{m}$ are

$\displaystyle\left(\bm{a}\bar{\Phi}_{m}^{11}\right)^{T}\ldots\left(\bm{a}\bar{\Phi}_{m}^{1s_{m}}\right)^{T}\text{ and }\left(\bm{c}\bar{\Phi}_{m}^{21}\right)^{T}\ldots\left(\bm{c}\bar{\Phi}_{m}^{2s_{m}}\right)^{T}.$

Their scalar product is

$\displaystyle\left(\bm{a},\bm{c}\right)\sum\limits_{j=1}^{s_{m}}{\bar{\Phi}_{m}^{1j}\bar{\Phi}_{m}^{2j}}=0.$

Hence, any two rows of the matrix $\bm{R}_{m}$ are orthogonal.

To prove that any two columns of the matrix $\bm{R}_{m}$ are orthogonal, we need to show that the sum of squares of the elements of any row of the matrix $\bm{R}_{m}$ is constant:

$\displaystyle\sum\limits_{l=1}^{s_{1}\ldots s_{m}}\left(r_{m}^{jl}\right)^{2}=const,$ (10)

where $r_{i}^{jl}$ is the element of the $j$ -th row and the $l$ -th column of the matrix $\bm{R}_{i}$ .

By the definition of $\bm{R}_{i}$ ,

$\displaystyle\sum\limits_{l=1}^{s_{1}\ldots s_{i}}\left(r_{i}^{jl}\right)^{2}=\sum\limits_{l=1}^{s_{1}\ldots s_{i-1}}\left(r_{i-1}^{jl}\right)^{2}\left[1+\sum\limits_{n=1}^{s_{i}-1}\left(F_{i}^{jn}\right)^{2}\right],\quad\sum\limits_{l=1}^{s_{1}}\left(r_{1}^{jl}\right)^{2}=1+\sum\limits_{n=1}^{s_{1}-1}\left(F_{1}^{jn}\right)^{2}.$

Therefore,

$\displaystyle\sum\limits_{l=1}^{s_{1}\ldots s_{m}}\left(r_{m}^{jl}\right)^{2}=\prod\limits_{i=1}^{m}\left[1+\sum\limits_{n=1}^{s_{i}-1}\left(F_{i}^{jn}\right)^{2}\right].$

Hence, by Eq. (7), we get Eq. (10).

Matrices of the form $\bm{F}_{i_{1}}\otimes\ldots\otimes\bm{F}_{i_{r}}$ , included in the matrix $\bm{R}_{m}$ , contain $\left(s_{i_{1}}-1\right)\ldots\left(s_{i_{r}}-1\right)$ columns. For each of these columns, its elements are equal for all treatments of the design $\bm{D}^{f}$ with the same combinations of levels of the factors $F_{i_{1}},\ldots,F_{i_{r}}$ . By what we have already proved, each of these columns is orthogonal to all other columns. Hence, it can be proved by induction that these columns are interaction effects of the factors $F_{i_{1}},\ldots,F_{i_{r}}$ .

The number of different combinations of levels of the factors $F_{i_{1}},\ldots,F_{i_{m}}$ equals $s_{i_{1}}\ldots s_{i_{m}}$ . All vectors of main effects and interaction effects of these factors belong to an $l$ -dimensional subspace $E_{l}$ ( $l=s_{1}\ldots s_{m}-1$ ) of an $N$ -dimensional space $E_{N}$ , because the elements of each vector of main effects or interaction effects are equal for all treatments with the same combinations of levels of the factors and all vectors of these effects are orthogonal to the unit vector.

Since

$\displaystyle\sum\limits_{i=1}^{m}\left(s_{i}-1\right)+\sum\limits_{i<j}\left(s_{i}-1\right)\left(s_{j}-1\right)+\ldots+\left(s_{1}-1\right)\ldots\left(s_{m}-1\right)=s_{1}\ldots s_{m}-1,$

the number of linearly independent $r$ -factorial interaction effects of the factors $F_{i_{1}},\ldots,F_{i_{r}}$ may not exceed

$\displaystyle\left(s_{i_{1}}-1\right)\ldots\left(s_{i_{r}}-1\right)$ (11)

and, therefore, is equal to Eq. (11).
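The counting identity used in this step follows from expanding $\prod_{i}\left[1+\left(s_{i}-1\right)\right]$ , with each set of factors counted once. It can also be checked numerically; the sketch below is only an illustration, with a hypothetical function name.

```python
from itertools import combinations
from math import prod

def total_effect_count(levels):
    """Sum of (s_i1 - 1)...(s_ir - 1) over all nonempty sets of factors."""
    total = 0
    for r in range(1, len(levels) + 1):
        for subset in combinations(levels, r):
            total += prod(s - 1 for s in subset)
    return total

levels = (2, 3, 4)
count = total_effect_count(levels)   # 6 + 11 + 6
```

For the $2\times 3\times 4$ design the main effects carry 6 degrees of freedom, the two-factor interactions 11, and the three-factor interaction 6, which add up to $2\cdot 3\cdot 4-1=23$ .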

By using matrices $\bm{F}_{i}$ with pairwise orthogonal columns, we get sets of orthogonal interaction effects. It can be shown that by using matrices $\bm{F}_{i}^{\prime}$ of linearly independent (not necessarily pairwise orthogonal) main effects, we get linearly independent interaction effects. To prove that, consider the following lemma.

Lemma 1.4.1. Let $\bm{A}$ , $\bm{A}^{\prime}$ , and $\bm{C}$ be matrices of size $N\times p$ , $N\times p$ , and $N\times q$ respectively, and let

$\displaystyle\bm{A}=\bm{A}^{\prime}\bm{\Lambda},$ (12)

where $\bm{\Lambda}$ is a nonsingular square matrix of order $p$ . Then the matrices $\bm{A}\otimes\bm{C}$ and $\bm{A}^{\prime}\otimes\bm{C}$ are related by a nonsingular linear transformation.

Proof. Let $\bm{c}_{i}$ be the $i$ -th column of the matrix $\bm{C}$ . Then, by Eq. (12),

$\displaystyle\bm{A}\otimes\bm{c}_{i}=\left(\bm{A}^{\prime}\bm{\Lambda}\right)\otimes\bm{c}_{i}=\left(\bm{A}^{\prime}\otimes\bm{c}_{i}\right)\bm{\Lambda}.$

Therefore, for any $i\left(i=1,\ldots,q\right)$ , the matrices $\bm{A}\otimes\bm{c}_{i}$ and $\bm{A}^{\prime}\otimes\bm{c}_{i}$ are related by a nonsingular linear transformation. This proves the lemma.
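The key identity in the proof of Lemma 1.4.1 holds because multiplying by a column $\bm{c}_{i}$ scales the rows of a matrix, and row scaling commutes with multiplication by $\bm{\Lambda}$ on the right. A small numerical check, with hypothetical names and example matrices stored as lists of rows:

```python
def matmul(X, Y):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def scale_rows(X, c):
    """X (x) c for a single column c: row u of X is multiplied by c_u."""
    return [[c_u * x for x in row] for row, c_u in zip(X, c)]

A_prime = [[1, 0], [1, 1], [1, 2]]   # N = 3, p = 2
Lam = [[1, 1], [0, 2]]               # nonsingular, determinant 2
c = [2, -1, 3]

lhs = scale_rows(matmul(A_prime, Lam), c)   # (A' Lambda) (x) c
rhs = matmul(scale_rows(A_prime, c), Lam)   # (A' (x) c) Lambda
```

Both sides give the same matrix, in agreement with the displayed chain of equalities.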

Matrices $\bm{F}_{i}$ and $\bm{F}^{\prime}_{i}$ for any $i$ are related by a nonsingular linear transformation. Therefore, using Lemma 1.4.1 repeatedly, we get that $\bm{F}_{i_{1}}\otimes\ldots\otimes\bm{F}_{i_{r}}$ is related by a nonsingular linear transformation with $\bm{F}^{\prime}_{i_{1}}\otimes\ldots\otimes\bm{F}^{\prime}_{i_{r}}$ . Hence, $\bm{F}^{\prime}_{i_{1}}\otimes\ldots\otimes\bm{F}^{\prime}_{i_{r}}$ consists of linearly independent interaction effects of the factors $F_{i_{1}},\ldots,F_{i_{r}}$ .

This completes the proof of Theorem 1.4.1.

Definition 1.4.2. A set of linearly independent interaction effects of the factors $F_{i_{1}},\ldots,F_{i_{r}}$ is called full if the number of those effects of the set is given by Eq. (11).

Note 1 to Theorem 1.4.1. The proof of Theorem 1.4.1 gives us a method of construction of interaction effects for the design $\bm{D}^{f}$ as a product of main effects of the factors. By using a full set of orthogonal main effects of the factors, we get a full set of orthogonal interaction effects. By using a full set of linearly independent main effects of the factors, we get a full set of linearly independent interaction effects.
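The construction described in Note 1 can be sketched for a small full design. The code below follows the recurrence $\bm{R}_{i}=\left\|\bm{R}_{i-1},\left(\bm{R}_{i-1}\otimes\bm{F}_{i}\right)\right\|$ from the proof; the contrast table, the function name, and the restriction to two- and three-level factors are assumptions made only for this illustration, and the columns are not normalized.

```python
from itertools import product

# Pairwise orthogonal main-effect contrasts by number of levels
# (an illustrative, unnormalized choice).
CONTRASTS = {2: [[-1, 1]], 3: [[-1, 0, 1], [1, -2, 1]]}

def effect_matrix_full(levels):
    """Columns of R_m for the full design with the given numbers of levels."""
    design = list(product(*(range(s) for s in levels)))
    R = [[1] * len(design)]                      # the vector I
    for i, s in enumerate(levels):
        # Main-effect columns of factor i evaluated on the design rows
        F_i = [[c[row[i]] for row in design] for c in CONTRASTS[s]]
        # R_i = || R_{i-1}, R_{i-1} (x) F_i ||
        R = R + [[x * y for x, y in zip(r, f)] for r in R for f in F_i]
    return design, R

design, R = effect_matrix_full((2, 3))
```

For the $2\times 3$ design this yields $N=6$ columns: the vector $\bm{I}$ , one main effect of the first factor, two main effects of the second factor, and $\left(2-1\right)\left(3-1\right)=2$ interaction effects, all pairwise orthogonal.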

Note 2 to Theorem 1.4.1. If for any $i=1,\ldots,m$ , the functions $f_{i}^{(q)}\left(X_{i}\right)$ of the $A^{f}$ -model Eq. (1) are chosen in such a way that

$\displaystyle\sum\limits_{u=1}^{N}{f_{i}^{(q)}\left(X_{iu}\right)}=0,$

all columns of the matrix Eq. (2) except the first are vectors of main effects of the factor $F_{i}$ . If, in addition,

$\displaystyle\sum\limits_{u=1}^{N}\left\{f_{i}^{(q)}\left(X_{iu}\right)\right\}^{2}=N,$

then, by the proof of Theorem 1.4.1, the scalar square of any column of $\bm{F}_{i_{1}}\otimes\ldots\otimes\bm{F}_{i_{r}}$ equals $N$ . Therefore, by Theorem 1.4.1 and Note 1 to Theorem 1.4.1, the coefficient matrix Eq. (3) for a full factorial model is the matrix $\bm{\Phi}_{1\ldots m}$ of main effects and interaction effects of the factors $F_{1},\ldots,F_{m}$ for the design $\bm{D}^{f}$ .

2.5 A model of true effects for quantitative factors

Hereafter, we will consider the Chebyshev model only if the structure of the design $\bm{D}$ leads to orthogonality of all effects. Otherwise, we will consider the so-called model of true effects for quantitative factors.

Consider a full design $\bm{D}^{f}$ with $N^{f}$ runs for all factors of the design $\bm{D}$ . Define a vector of true values $\bm{\eta}^{f}$ for the design $\bm{D}^{f}$ as follows:

$\displaystyle\bm{\eta}^{f}=E\bm{y}^{f}=E\left(y_{1},\ldots,y_{N^{f}}\right)^{T}.$

To define a vector of true effects $\bm{B}$ for quantitative factors, form the following matrix for the design $\bm{D}^{f}$ :

$\displaystyle\bm{\Phi}_{1\ldots m}^{f}=\left\|\bm{I},\bm{F}_{1}^{f},\ldots,\bm{F}_{m}^{f},\bm{F}_{12}^{f},\ldots,\bm{F}_{1\ldots m}^{f}\right\|,$ (13)

where all matrices $\bm{F}^{f}$ have pairwise orthogonal columns (scalar squares of the columns of the matrix $\bm{\Phi}_{1\ldots m}^{f}$ equal $N^{f}$ ). Then define

$\displaystyle\bm{B}=\frac{1}{N^{f}}\bm{\Phi}_{1\ldots m}^{f^{T}}\bm{\eta}^{f}.$ (14)

It is evident that

$\displaystyle\bm{\eta}^{f}=\bm{\Phi}_{1\ldots m}^{f}\bm{B},$ (15)

because, by Eq. (14) and Theorem 1.4.1,

$\displaystyle\bm{\Phi}_{1\ldots m}^{f}\bm{B}=\frac{1}{N^{f}}\bm{\Phi}_{1\ldots m}^{f}\bm{\Phi}_{1\ldots m}^{f^{T}}\bm{\eta}^{f}=\bm{\eta}^{f}.$

Therefore, the following theorem has been proved.

Theorem 1.5.1. For the vector of observations $\bm{y}^{f}=\left(y_{1},\ldots,y_{N^{f}}\right)^{T}$ ,

$\displaystyle E\bm{y}^{f}=\bm{\Phi}_{1\ldots m}^{f}\bm{B}$ (16)

at the points of the design $\bm{D}^{f}$ with the factors $F_{1},\ldots,F_{m}$ where $\bm{B}$ is the vector of true effects Eq. (14).
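Eqs (14) and (15) can be illustrated on the full $2\times 2$ design. In the sketch below the matrix of effects has entries $\pm 1$ , so that the scalar square of each column equals $N^{f}=4$ ; the vector of true values and the function names are hypothetical and chosen only for the example.

```python
from fractions import Fraction

# Columns I, F_1, F_2, F_12 for the full 2 x 2 design with rows
# (0,0), (0,1), (1,0), (1,1); each column has scalar square N^f = 4.
PHI = [
    [1, 1, 1, 1],      # I
    [-1, -1, 1, 1],    # main effect of F_1
    [-1, 1, -1, 1],    # main effect of F_2
    [1, -1, -1, 1],    # interaction effect of F_1 and F_2
]

def true_effects(phi_columns, eta):
    """B = (1 / N^f) * Phi^T * eta, as in Eq. (14)."""
    n = len(eta)
    return [Fraction(sum(p * e for p, e in zip(col, eta)), n) for col in phi_columns]

def reconstruct(phi_columns, B):
    """eta = Phi * B, as in Eq. (15)."""
    n = len(phi_columns[0])
    return [sum(col[u] * b for col, b in zip(phi_columns, B)) for u in range(n)]

eta = [10, 12, 14, 20]           # hypothetical true values
B = true_effects(PHI, eta)
eta_back = reconstruct(PHI, B)
```

Here the vector of true effects is $(14,3,2,1)^{T}$ , and multiplying it by the matrix of effects restores the original vector of true values, as Theorem 1.5.1 states.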

By the definition of the matrix of main effects and Note 2 to Theorem 1.4.1, the model Eqs (15) and (16) is a special case of the $A^{f}$ -model and, therefore, a special case of the general factorial $A^{\Omega}$ -model for quantitative factors. We will call Eqs (15) and (16) the $A^{f}$ -model of true effects.

Denote the parts of the matrix $\bm{\Phi}_{1\ldots m}^{f}$ and the vector $\bm{B}$ for the factorial set $\Omega$ by $\bm{\Phi}^{\Omega}$ and $\bm{B}^{\Omega}$ respectively. If elements of the vector $\bm{B}$ that do not correspond to the factorial set $\Omega$ equal zero, Eq. (16) will be as follows:

$\displaystyle E\bm{y}^{f}=\bm{\Phi}^{\Omega}\bm{B}^{\Omega}.$ (17)

It is evident that the model Eq. (17) is also a special case of the factorial $A^{\Omega}$ -model. We will call Eq. (17) the $A^{\Omega}$ -model of true effects. If it does not matter or if it is clear which type of a model for quantitative factors we consider, we will omit the words “true effects”.

The $A^{\Omega}$ -model Eq. (17) can be extended to a wider domain:

$\displaystyle Ey\left(X_{1},\ldots,X_{m}\right)=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{B}^{\Omega},$ (18)

where $\bm{f}^{T}\left(X_{1u},\ldots,X_{mu}\right)$ is the $u$ -th row of the matrix $\bm{\Phi}^{\Omega}$ .

2.6 Full rank theorem

The following three paragraphs are devoted to finding a condition under which it is possible to construct an orthogonal design (Brodsky, 1971).

Suppose that in the design $\bm{D}$ the number of different combinations of levels of the factors $F_{1},\ldots,F_{r}$ is

$\displaystyle C^{1\ldots r}=s_{1}\ldots s_{r}.$ (19)

Theorem 1.6.1. I. The condition Eq. (19) is necessary and sufficient for the number of degrees of freedom carried by the $n$ -factor interaction effects $(n\leqslant r)$ of any $n$ factors $F_{i_{1}},\ldots,F_{i_{n}}$ among $F_{1},\ldots,F_{r}$ to be given by Eq. (11). II. If Eq. (19) holds, $\bm{\Phi}_{1\ldots r}$ is a matrix of full rank.

Proof. The necessity of the condition Eq. (19) is evident. We show the sufficiency of the condition Eq. (19) and that statement II of the theorem is true.

By the hypothesis of the theorem, the design $\bm{D}$ contains a subset $\bm{D}^{f}$ forming a full design for the factors $F_{1},\ldots,F_{r}$ . For the design $\bm{D}^{f}$ , we generate matrices of effects up to order $(r-1)$ of the factors $F_{1},\ldots,F_{r}$ and the matrix

$\displaystyle\bm{\Phi}_{1\ldots r}^{f}=\left\|\bm{I},\bm{F}_{1}^{f},\ldots,\bm{F}_{r}^{f},\bm{F}_{12}^{f},\ldots,\bm{F}_{1\ldots r}^{f}\right\|.$

Theorem 1.4.1 implies that for $\bm{D}^{f}$ any interaction effect of a given subset of factors is orthogonal to any interaction effect of a different subset of factors. Hence, the columns of the matrix $\bm{\Phi}_{1\ldots r}^{f}$ taken one from each matrix of effects are pairwise orthogonal. Therefore, $\bm{\Phi}_{1\ldots r}^{f}$ is a matrix of full rank.

For each combination of levels of the factors $F_{1},\ldots,F_{r}$ in the design $\bm{D}$ , select the row that corresponds to this combination in the matrix $\bm{\Phi}_{1\ldots r}^{f}$ . Denote the resulting matrix by

$\displaystyle\bm{\Phi}_{1\ldots r}^{\bm{D}}=\left\|\bm{I},\bm{F}_{1}^{\bm{D}},\ldots,\bm{F}_{r}^{\bm{D}},\bm{F}_{12}^{\bm{D}},\ldots,\bm{F}_{1\ldots r}^{\bm{D}}\right\|,$

where the number of columns of the matrix $\bm{F}_{i_{1}\ldots i_{n}}^{\bm{D}}$ equals the number of columns of the matrix $\bm{F}_{i_{1}\ldots i_{n}}^{f}$ . It is evident that $\bm{\Phi}_{1\ldots r}^{\bm{D}}$ is also a matrix of full rank.

Now we will construct the matrix

$\displaystyle\bm{\Phi}_{1\ldots r}=\left\|\bm{I},\bm{F}_{1},\ldots,\bm{F}_{r},% \bm{F}_{12},\ldots,\bm{F}_{1\ldots r}\right\|$

that contains the vectors of main effects and interaction effects of the factors $F_{1},\ldots,F_{r}$ of the design $\bm{D}$ , and the vector ${\bm{I}}$ . The number of columns of the matrix $\bm{F}_{i_{1}\ldots i_{n}}$ is equal to the number of columns of the matrix $\bm{F}_{i_{1}\ldots i_{n}}^{\bm{D}}$ . Denote the $j$ -th columns of the matrices $\bm{F}_{i_{1}\ldots i_{n}}$ and $\bm{F}_{i_{1}\ldots i_{n}}^{\bm{D}}$ by $\bm{F}_{i_{1}\ldots i_{n}}^{j}$ and $\bm{F}_{i_{1}\ldots i_{n}}^{\bm{D}_{j}}$ respectively. The first column of the matrix $\bm{\Phi}_{1\ldots r}$ is the first column of the matrix $\bm{\Phi}_{1\ldots r}^{\bm{D}}$ . We construct the remaining columns recursively. Assume that we have constructed the first $p$ independent columns of the matrix $\bm{\Phi}_{1\ldots r}$ in such a way that the $l$ -th column of the matrix $\bm{\Phi}_{1\ldots r}$ is a linear combination of the first $l$ columns of the matrix $\bm{\Phi}_{1\ldots r}^{\bm{D}}\left(l=1,\ldots,p\right)$ . Also assume that the columns already placed in any matrix $\bm{F}_{j_{1}\ldots j_{l}}$ are linearly independent interaction effects of the factors $F_{j_{1}},\ldots,F_{j_{l}}$ . Then the $(p+1)$ -th column is constructed as follows.

Let the $(p+1)$ -th column of the matrix $\bm{\Phi}_{1\ldots r}$ be $\bm{F}_{i_{1}\ldots i_{n}}^{j}$ . Then set

$\displaystyle\bm{F}_{i_{1}\ldots i_{n}}^{j}=\bm{A}_{i_{1}\ldots i_{n}}^{j}% \left(\bm{A}_{i_{1}\ldots i_{n}}^{jT}\bm{A}_{i_{1}\ldots i_{n}}^{j}\right)^{-1% }\bm{A}_{i_{1}\ldots i_{n}}^{jT}\bm{F}_{i_{1}\ldots i_{n}}^{\bm{D}_{j}}-\bm{F}% _{i_{1}\ldots i_{n}}^{\bm{D}_{j}},$

where

$\displaystyle\bm{A}_{i_{1}\ldots i_{n}}^{j}=\left\|\bm{I},\bm{F}_{i_{1}},% \ldots,\bm{F}_{i_{n}},\bm{F}_{i_{1}i_{2}},\ldots,\bm{F}_{i_{1}\ldots i_{n}},% \bm{F}_{i_{1}\ldots i_{n}}^{1},\ldots,\bm{F}_{i_{1}\ldots i_{n}}^{j-1}\right\|.$

The first $p$ columns of the matrix $\bm{\Phi}_{1\ldots r}$ are independent, therefore $\bm{A}_{i_{1}\ldots i_{n}}^{jT}\bm{A}_{i_{1}\ldots i_{n}}^{j}$ is nonsingular, its inverse exists, and

$\displaystyle\bm{A}_{i_{1}\ldots i_{n}}^{j}\left(\bm{A}_{i_{1}\ldots i_{n}}^{% jT}\bm{A}_{i_{1}\ldots i_{n}}^{j}\right)^{-1}\bm{A}_{i_{1}\ldots i_{n}}^{jT}% \bm{F}_{i_{1}\ldots i_{n}}^{\bm{D}_{j}}\neq\bm{F}_{i_{1}\ldots i_{n}}^{\bm{D}_% {j}},$

so $\bm{F}_{i_{1}\ldots i_{n}}^{j}$ is a nonzero column. Since $\bm{F}_{i_{1}\ldots i_{n}}^{j}$ is a linear combination of the columns of $\bm{A}_{i_{1}\ldots i_{n}}^{j}$ and of $\bm{F}_{i_{1}\ldots i_{n}}^{\bm{D}_{j}}$ , we get $p+1$ independent columns. The elements of $\bm{F}_{i_{1}\ldots i_{n}}^{j}$ for the same combinations of levels of the factors $F_{i_{1}},\ldots,F_{i_{n}}$ are equal. It is evident that

$\displaystyle\bm{A}_{i_{1}\ldots i_{n}}^{jT}\bm{F}_{i_{1}\ldots i_{n}}^{j}=0.$

Hence, $\bm{F}_{i_{1}\ldots i_{n}}^{j}$ is an interaction effect of the factors $F_{i_{1}},\ldots,F_{i_{n}}$ .

Therefore, the matrix $\bm{\Phi}_{1\ldots r}$ contains linearly independent columns of main effects, interaction effects of the factors $F_{1},\ldots,F_{r}$ , and $\bm{I}$ .

By Theorem 1.4.1, for the design $\bm{D}^{f}$ the numbers of degrees of freedom carried by the main effects of the factor $F_{i}$ and by the interaction effects of the factors $F_{i_{1}},\ldots,F_{i_{n}}$ are equal to $\left(s_{i}-1\right)$ and $\left(s_{i_{1}}-1\right)\ldots\left(s_{i_{n}}-1\right)$ respectively. Therefore, each of the matrices $\bm{F}_{i}^{f}$ and $\bm{F}_{i}$ contains $s_{i}-1$ independent columns, and each of the matrices $\bm{F}_{i_{1}\ldots i_{n}}^{f}$ and $\bm{F}_{i_{1}\ldots i_{n}}$ contains $\left(s_{i_{1}}-1\right)\ldots\left(s_{i_{n}}-1\right)$ independent columns. Since the number of linearly independent columns in $\bm{F}_{i}$ equals $s_{i}-1$ , the matrix $\bm{F}_{i}$ contains a maximum set of linearly independent main effects of the factor $F_{i}$ for the design $\bm{D}$ .

Suppose that $\bm{I}$ and the columns of the matrices $\bm{F}_{1},\ldots,\bm{F}_{r}$ constitute a set of linearly dependent vectors. Since a nontrivial linear combination of independent main effects of the factor $F_{i}$ is a main effect of the factor $F_{i}$ , there exists a nontrivial linear combination of the vectors $\bm{I}$ , $\bm{\xi}_{1},\bm{\xi}_{2},\ldots,\bm{\xi}_{r}$ that equals zero (where $\bm{\xi}_{i}$ are vectors of main effects). On the other hand, each $\bm{\xi}_{i}$ can be expressed as a nontrivial linear combination of the columns of the matrix $\bm{F}_{i}$ . It follows that there exists a nontrivial linear combination of $\bm{I}$ and the columns of the matrices $\bm{F}_{1},\ldots,\bm{F}_{r}$ that equals zero, which is a contradiction. Therefore, $\bm{I}$ and all columns of the matrices $\bm{F}_{1},\ldots,\bm{F}_{r}$ are linearly independent. By simple algebraic operations, we get that the number of degrees of freedom carried by the interaction effects of the factors $F_{i}$ and $F_{j}$ equals $\left(s_{i}-1\right)\left(s_{j}-1\right)$ . Therefore, any matrix of interaction effects of first order in $\bm{\Phi}_{1\ldots r}$ contains a maximum set of linearly independent interaction effects of first order.

The same type of argument gives the induction step. If each matrix of linearly independent interaction effects of order $(n-1)$ in $\bm{\Phi}_{1\ldots r}$ contains a maximum set of vectors, then each matrix of linearly independent interaction effects of order $n$ in $\bm{\Phi}_{1\ldots r}$ contains a maximum set of vectors.

This completes the proof of Theorem 1.6.1.
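The condition Eq. (19) is easy to check mechanically for a concrete design. The following Python sketch is only an illustration: the function name `satisfies_condition_19` and the representation of a design as a list of level tuples are assumptions made here, not notation from the article.

```python
from math import prod


def satisfies_condition_19(design, levels, factors):
    """Check Eq. (19): the chosen factors show all s_1...s_r level combinations.

    design  -- list of runs, each run a tuple of factor levels (assumed layout)
    levels  -- levels[f] is the number of levels s_f of factor f
    factors -- indices of the factors F_1, ..., F_r under consideration
    """
    distinct = {tuple(run[f] for f in factors) for run in design}
    return len(distinct) == prod(levels[f] for f in factors)


# Hypothetical 4-run design with three two-level factors (a half fraction).
design = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
levels = [2, 2, 2]

print(satisfies_condition_19(design, levels, [0, 1]))     # two factors: 4 of 4 combinations
print(satisfies_condition_19(design, levels, [0, 1, 2]))  # three factors: 4 of 8 combinations
```

In this example Eq. (19) holds for any pair of factors but not for all three, so Theorem 1.6.1 applies to two-factor interaction effects only.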

Consider the matrix $\bm{\Phi}_{i}^{f\bm{D}}=\|\bm{I},\bm{F}_{i}^{f\bm{D}}\|$ and the matrix $\bm{G}_{i}$ Eq. (104). It is evident that these matrices are related by a linear nonsingular transformation. It follows that the matrices $\bm{\Phi}_{1\ldots r}^{f\bm{D}}=\|\bm{I},\bm{F}_{1}^{f\bm{D}}\|\otimes\ldots% \otimes\|\bm{I},\bm{F}_{r}^{f\bm{D}}\|$ and $\bm{G}_{1\ldots r}=\bm{G}_{1}\otimes\ldots\otimes\bm{G}_{r}$ are related by a linear nonsingular transformation as well. Therefore, we get a simple corollary.

Corollary to Theorem 1.6.1. If the condition Eq. (19) is satisfied, then $\bm{G}_{1\ldots r}$ is a matrix of full rank, and the coefficient matrices $\bm{X}_{1}$ and $\bm{X}_{2}$ of the design $\bm{D}$ for any two $A^{\Omega}$ -models are related by a nonsingular linear transformation.

When considering $r$ -factorial interaction effects, we will assume that the condition Eq. (19) is satisfied.

When considering a factorial set $\Omega$ as a set of factors and their subsets in accordance with Definition 1.2.2, we will also consider a factorial set $\Omega$ as a set of main effects and interaction effects in accordance with the following definition.

Definition 1.6.1. A set of main effects and interaction effects of the factors $F_{1},\ldots,F_{m}$ is called a factorial set $\Omega$ if the following condition holds: if an interaction effect of the factors $F_{n_{1}},\ldots,F_{n_{r}}$ belongs to the set $\Omega$ , then the full set of interaction effects of the factors $F_{l_{1}},\ldots,F_{l_{v}}$ belongs to the set $\Omega$ for all $v=1,\ldots,r$ and all pairwise distinct $l_{1},\ldots,l_{v}\in\left\{n_{1},\ldots,n_{r}\right\}$ .

Definition 1.2.2 is consistent with Definition 1.6.1 because of the obvious one-to-one correspondence between subsets of factors and subsets of main effects and interaction effects of these factors.

2.7 The condition of proportional frequencies

This paragraph is devoted to the fundamental concept introduced by Plackett (1946): the condition of proportional frequencies.

Let the $l$ -th level of the factor $F_{i}$ occur $w_{i}^{l}$ times and the $n$ -th level of the factor $F_{j}$ occur $w_{j}^{n}$ times in the design $\bm{D}$ . Let the $l$ -th level of the factor $F_{i}$ occur $w_{ij}^{ln}$ times together with the $n$ -th level of the factor $F_{j}$ . Consider the $\left(s_{i}\times s_{j}\right)$ -matrix $\bm{W}=\left\{w_{ij}^{ln}\right\}$ . It is evident that

$\displaystyle\bm{W}_{i}=\begin{Vmatrix}w_{i}^{0}\\ w_{i}^{1}\\ \vdots\\ w_{i}^{s_{i}-1}\\ \end{Vmatrix}=\bm{WI},\bm{W}_{j}=\begin{Vmatrix}w_{j}^{0}\\ w_{j}^{1}\\ \vdots\\ w_{j}^{s_{j}-1}\\ \end{Vmatrix}=\bm{W}^{T}\bm{I}.$

Consider the $N$ -dimensional vector $\bm{S}$ and the ( $N\times s_{1}\ldots s_{r})$ -matrix $\bm{\Phi}_{1\ldots r}$ . The matrix $\bm{\Phi}_{1\ldots r}$ has $N$ rows, among which there are $s_{1}\ldots s_{r}$ different rows (corresponding to the different combinations of levels of the factors $F_{1},\ldots,F_{r})$ , and $s_{1}\ldots s_{r}$ columns that are linearly independent.

For each subset of the identical rows of $\bm{\Phi}_{1\ldots r}$ , select only one. For the corresponding elements of vector $\bm{S}$ , calculate their average. Denote the resulting matrix and the column by $\bar{\bm{\Phi}}_{1\ldots r}$ and $\bar{\bm{S}}$ respectively.
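The reduction from $\bm{\Phi}_{1\ldots r}$ and $\bm{S}$ to $\bar{\bm{\Phi}}_{1\ldots r}$ and $\bar{\bm{S}}$ can be sketched in a few lines of Python. The function name `reduce_rows`, the list-of-rows layout, and the restriction to integer entries of $\bm{S}$ (so that exact fractions can be used) are assumptions of this illustration.

```python
from fractions import Fraction


def reduce_rows(phi_rows, s):
    """Keep one copy of each distinct row of Phi and average the matching entries of S.

    phi_rows -- rows of the matrix Phi, one per observation
    s        -- integer entries of the vector S, in the same order
    """
    groups = {}
    for row, value in zip(phi_rows, s):
        # Identical rows of Phi share one key; their S-values are collected together.
        groups.setdefault(tuple(row), []).append(value)
    phi_bar = [list(row) for row in groups]
    s_bar = [Fraction(sum(values), len(values)) for values in groups.values()]
    return phi_bar, s_bar


# Hypothetical example: five observations, two distinct rows.
phi_rows = [(1, 1), (1, -1), (1, 1), (1, -1), (1, 1)]
s = [2, 5, 4, 7, 6]
phi_bar, s_bar = reduce_rows(phi_rows, s)
print(phi_bar, s_bar)
```

The distinct rows are kept in order of first appearance, so the $k$ -th entry of the averaged column corresponds to the $k$ -th retained row.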

Any column of main effect is orthogonal to the unit vector. Hence,

$\displaystyle\bar{\bm{\Phi}}_{i}^{T}\bm{W}_{i}=\begin{Vmatrix}N\\ 0\\ \vdots\\ 0\\ \end{Vmatrix}=\bm{\Delta}_{i},$

$\displaystyle\bm{W}_{i}=\left(\bar{\bm{\Phi}}_{i}^{T}\right)^{-1}\bm{\Delta}_{i}.$ (20)

Analogously,

$\displaystyle\bm{W}_{j}=\left(\bar{\bm{\Phi}}_{j}^{T}\right)^{-1}\bm{\Delta}_{j}.$ (21)

Theorem 1.7.1 (Plackett, 1946). If

$\displaystyle\bm{F}_{i}^{T}\bm{F}_{j}=0,$ (22)

then

$\displaystyle N\bm{W}=\bm{W}_{i}\bm{W}_{j}^{T}.$

Proof Rewrite Eq. (22) as follows:

$\displaystyle\bar{\bm{\Phi}}_{i}^{T}\bm{W}\bar{\bm{\Phi}}_{j}=\begin{Vmatrix}N% &0&\ldots&0\\ 0&0&\ldots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\ldots&0\\ \end{Vmatrix}=\frac{1}{N}\bm{\Delta}_{i}\bm{\Delta}_{j}^{T}.$

Then, by Eqs (20) and (21),

$\displaystyle N\bm{W}=\left(\bar{\bm{\Phi}}_{i}^{T}\right)^{-1}\bm{\Delta}_{i}\bm{\Delta}_{j}^{T}\bar{\bm{\Phi}}_{j}^{-1}=\bm{W}_{i}\bm{W}_{j}^{T}.$

This proves the theorem.

Therefore, Theorem 1.7.1 presents a necessary condition for pairwise orthogonality of the vectors of main effects (one from each factor). This condition, the condition of proportional frequencies, states that the levels of one factor occur with each of the levels of the other factor with proportional frequencies.
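As a small numerical illustration (the frequencies below are chosen for this example and are not taken from the source), consider $s_{i}=2$ , $s_{j}=3$ , and $N=8$ :

```latex
\[
\bm{W}=\begin{Vmatrix}1&2&1\\ 1&2&1\end{Vmatrix},\qquad
\bm{W}_{i}=\bm{W}\bm{I}=\begin{Vmatrix}4\\ 4\end{Vmatrix},\qquad
\bm{W}_{j}=\bm{W}^{T}\bm{I}=\begin{Vmatrix}2\\ 4\\ 2\end{Vmatrix},
\]
\[
\bm{W}_{i}\bm{W}_{j}^{T}=\begin{Vmatrix}8&16&8\\ 8&16&8\end{Vmatrix}=8\,\bm{W}=N\bm{W}.
\]
```

Here the levels of $F_{j}$ occur with unequal frequencies 2, 4, 2, yet each of them is split between the two levels of $F_{i}$ in the same ratio, so the relation $N\bm{W}=\bm{W}_{i}\bm{W}_{j}^{T}$ holds.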

Definition 1.7.1. Let $w_{i_{1}\ldots i_{t}}^{j_{1}\ldots j_{t}}$ be the number of the appearances of the combination of levels $j_{1},\ldots,j_{t}$ of the factors $F_{i_{1}},\ldots,F_{i_{t}}$ respectively in the design $\bm{D}$ . Then the set of requirements

$\displaystyle N^{t-1}w_{i_{1}\ldots i_{t}}^{j_{1}\ldots j_{t}}=w_{i_{1}}^{j_{1% }}\ldots w_{i_{t}}^{j_{t}}\left(\text{for any }j_{1},\ldots,j_{t}\right)$ (23)

is called the condition of proportional frequencies for the factors $F_{i_{1}},\ldots,F_{i_{t}}$ .

Definition 1.7.2. The condition of proportional frequencies Eq. (23) is said to be satisfied for a factorial set $\Omega$ if Eq. (23) is satisfied for each group of factors of any two elements of the set $\Omega$ .
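Eq. (23) can be checked directly from the run list of a design. The sketch below is an illustration under stated assumptions: the name `proportional_frequencies` and the encoding of a design as a list of level tuples are introduced here and do not come from the article.

```python
from collections import Counter
from itertools import product


def proportional_frequencies(design, factors):
    """Return True if Eq. (23) holds for the given factor indices.

    design  -- list of runs, each run a tuple of factor levels (assumed layout)
    factors -- indices of the factors F_{i_1}, ..., F_{i_t}
    """
    n_runs = len(design)
    t = len(factors)
    # Joint frequencies w_{i_1...i_t}^{j_1...j_t} of the level combinations.
    joint = Counter(tuple(run[f] for f in factors) for run in design)
    # Marginal frequencies w_i^j of the levels of each single factor.
    marginal = [Counter(run[f] for run in design) for f in factors]
    for combo in product(*(sorted(counts) for counts in marginal)):
        rhs = 1
        for counts, level in zip(marginal, combo):
            rhs *= counts[level]
        if n_runs ** (t - 1) * joint[combo] != rhs:
            return False
    return True


# Hypothetical 8-run design: factor 0 has 2 levels, factor 1 has 3 levels.
balanced = [(0, 0), (0, 1), (0, 1), (0, 2), (1, 0), (1, 1), (1, 1), (1, 2)]
# Hypothetical 3-run design in which the frequencies are not proportional.
unbalanced = [(0, 0), (0, 1), (1, 1)]

print(proportional_frequencies(balanced, [0, 1]))
print(proportional_frequencies(unbalanced, [0, 1]))
```

To test the condition for a factorial set $\Omega$ in the sense of Definition 1.7.2, the same function would be called for the union of the factor groups of every pair of elements of $\Omega$ .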

Let the design include all level combinations of the factors $F_{i}$ and $F_{j}$ . Then the number of independent first-order interaction effects of these factors is determined by Eq. (11). In this and only this case, the matrix $\bar{\bm{\Phi}}_{ij}$ is square. Assume that the condition

$\displaystyle\bm{\Phi}_{ij}^{T}\bm{F}_{n}=0$

is satisfied for the factor $F_{n}.$ As in the proof of Theorem 1.7.1, we get that the levels of the factor $F_{n}$ occur with each combination of the levels of the factors $F_{i}$ and $F_{j}$ with proportional frequencies:

$\displaystyle Nw_{nij}^{k_{n}k_{i}k_{j}}=w_{n}^{k_{n}}w_{ij}^{k_{i}k_{j}}.$

If the condition Eq. (22) is satisfied for the factors $F_{i}$ and ${F}_{j}$ , then, by Theorem 1.7.1,

$\displaystyle Nw_{ij}^{k_{i}k_{j}}=w_{i}^{k_{i}}w_{j}^{k_{j}}.$

Therefore

$\displaystyle N^{2}w_{nij}^{k_{n}k_{i}k_{j}}=w_{n}^{k_{n}}w_{i}^{k_{i}}w_{j}^{% k_{j}}.$

A similar conclusion can be made for any number of factors. For this purpose, we will consider the following partitioning of a set of the factors $F_{1},\ldots,F_{r}$ . The first partition splits the factors $F_{1},\ldots,F_{r}$ into two sets. The second partition splits each of the sets of the first partition (if it contains more than one factor) into two subsets. And so on. The resulting partition of the factors is called a full partition if each subset of the last partition contains only one factor.

Theorem 1.7.2. Suppose that for $t$ factors $F_{1},\ldots,F_{t}$ there exists a full partition such that the following condition holds. Any subset of the $l$ -th partition is split by the $(l+1)$ -th partition into two subsets $F_{i_{1}},\ldots,F_{i_{p}}$ and $F_{i_{p+1}},\ldots,F_{i_{q}}$ such that

$\displaystyle\bm{\Phi}_{i_{1}\ldots i_{p}}^{T}\bm{\Phi}_{i_{p+1}\ldots i_{q}}=\begin{Vmatrix}N&0&\ldots&0\\ 0&0&\ldots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\ldots&0\\ \end{Vmatrix}.$

Then

a) The combinations of the levels of the factors $F_{i_{1}},\ldots,F_{i_{p}}$ occur with each combination of the levels of the factors $F_{i_{p+1}},\ldots,F_{i_{q}}$ with proportional frequencies;

b) The condition of proportional frequencies is satisfied for the factors $F_{1},\ldots,F_{t}$ .

Now we are going to prove the sufficiency of the condition Eq. (23) for pairwise orthogonality of main effects and interaction effects (one from each matrix of effects). The proof will follow from the following lemma.

Lemma 1.7.1. Suppose that for the factors $F_{1},\ldots,F_{r}$ and the column $\bm{S}$ ,

$\displaystyle\bm{S}^{T}\bm{\Phi}_{1\ldots r}={0}.$

Then the sum of the elements of $\bm{S}$ corresponding to any combination of the levels of the factors $F_{1},\ldots,F_{r}$ equals zero.

Proof $Rg\left(\bar{\bm{\Phi}}_{1\ldots r}\right)=Rg\left(\bm{\Phi}_{1\ldots r}\right)$ , and, by Theorem 1.6.1, $\bar{\bm{\Phi}}_{1\ldots r}$ is a square nonsingular matrix. $\bar{\bm{S}}$ is orthogonal to all columns of the matrix $\bar{\bm{\Phi}}_{1\ldots r}$ . Hence, all elements of $\bar{\bm{S}}$ equal zero, which was to be proved.

It follows from Lemma 1.7.1 that orthogonality of the interaction effect $\bm{\xi}_{1\ldots r+1}$ of the $r$ -th order of the factors $F_{1},\ldots,F_{r+1}$ to all interaction effects of the $n$ -th order $\left(n<r\right)$ of these factors implies that the sum of the elements of $\bm{\xi}_{1\ldots r+1}$ corresponding to any combination of the levels of any $n$ factors of the factors $F_{1},\ldots,F_{r+1}$ equals zero.

Theorem 1.7.3. If the condition of proportional frequencies Eq. (23) is satisfied for $t$ factors, all main effects and interaction effects of these factors (one from each set of effects) are pairwise orthogonal.

Proof Let $P$ and $R$ be two arbitrary subsets (of $p$ and $r$ factors respectively) of the set of $t$ factors for which the condition of proportional frequencies Eq. (23) is satisfied. Summing up both parts of Eq. (23) over all levels of certain factors, we get that the condition of proportional frequencies Eq. (23) is satisfied for any subset of the given set of $t$ factors. In particular, the condition of proportional frequencies is satisfied for the set of factors belonging to the union $T=P\cup R$ . Denote by $\bm{\xi}_{P}$ and $\bm{\xi}_{R}$ arbitrary interaction effects of the ( $p-1$ )-th and ( $r-1$ )-th order respectively of the factors belonging to $P$ and $R$ . Consider two cases:

1) The set $Q=P\cap R$ is empty;

2) The set $Q$ is not empty.

Case 1.

$\bm{\xi}_{P}$ is a contrast, therefore, the sum of its elements equals zero. The condition of proportional frequencies is satisfied for the factors of the set $T$ . Therefore, the combinations of the levels of the factors from the set $R$ occur with each combination of the levels of the factors from the set $P$ with proportional frequencies. The elements of $\bm{\xi}_{P}$ are equal for the same combinations of the factors from the set $P$ . Hence, the sum of elements of $\bm{\xi}_{P}$ corresponding to any combination of the levels of the factors from $R$ equals zero. Therefore, $\bm{\xi}_{P}$ and $\bm{\xi}_{R}$ are orthogonal.

Case 2.

Any effect $\bm{\xi}_{P}$ is, by the definition of interaction effects, orthogonal to any effects of the factors from the set $Q$ . By Lemma 1.7.1, for the part $\bm{D}_{Q}$ of the design corresponding to any combination of the levels of the factors from $Q$ , the sum of the elements of $\bm{\xi}_{P}$ equals zero. The combinations of the levels of the factors from the set $P$ occur with each combination of the levels of the factors from the set $R\setminus Q$ with proportional frequencies. In particular, the same is true for the combinations of the levels of the factors from $P$ within the part $\bm{D}_{Q}$ of the design. Hence, the sum of the elements of $\bm{\xi}_{P}$ equals zero for any combination of the levels of the factors from $R$ . Therefore, $\bm{\xi}_{P}$ is orthogonal to $\bm{\xi}_{R}$ .

This completes the proof of the theorem.

It is evident that within any matrix of effects the effects can be chosen to be pairwise orthogonal. In this case, if the condition of proportional frequencies is satisfied for any $t$ factors of the design, all main effects and interaction effects up to order $t-1$ are pairwise orthogonal.

2.8 Construction of interaction effects

Theorem 1.8.1. For any $t$ factors $F_{1},\ldots,F_{t}$ for which the condition of proportional frequencies is satisfied, the product $\bm{S}_{1}\otimes\ldots\otimes\bm{S}_{t}$ of the vectors $\bm{S}_{1},\ldots,\bm{S}_{t}$ of main effects of the factors $F_{1},\ldots,F_{t}$ respectively is a vector $\bm{S}_{1\ldots t}$ of interaction effects of the factors $F_{1},\ldots,F_{t}$ .

Proof It is evident that elements of $\bm{S}_{1\ldots t}$ depend only on the combinations of the levels of the factors $F_{1},\ldots,F_{t}$ . Hence, we need only to prove that $\bm{S}_{1\ldots t}$ is orthogonal to $\bm{I}$ and any main effects and interaction effects up to order $t-2$ of the factors $F_{1},\ldots,F_{t}$ .

For two factors $F_{1}$ and $F_{2}$ , orthogonality of $\bm{S}_{1}\otimes\bm{S}_{2}$ and $\bm{S}^{\prime}_{1}$ ( $\bm{S}^{\prime}_{1}$ is any vector of main effects of the factor $F_{1}$ , possibly identical to $\bm{S}_{1}$ ) is equivalent to orthogonality of $\bm{S}_{1}\otimes\bm{S}^{\prime}_{1}$ and $\bm{S}_{2}$ . By the hypothesis of the theorem, $\bm{S}_{2}$ is orthogonal to all vectors of main effects of the factor $F_{1}$ . Therefore, by Lemma 1.7.1, the sum of the elements of $\bm{S}_{2}$ corresponding to any level of the factor $F_{1}$ equals zero. Since $\bm{S}_{1}\otimes\bm{S}^{\prime}_{1}$ has equal elements for the same levels of the factor $F_{1}$ , $\bm{S}_{1}\otimes\bm{S}^{\prime}_{1}$ is orthogonal to $\bm{S}_{2}$ .

We continue the proof by induction. On the ( $n-1$ )-th step $(n\leqslant t)$ , we get the column $\bm{S}_{1}\otimes\bm{S}_{2}\otimes\ldots\otimes\bm{S}_{n}$ . Its orthogonality to any column $\bm{S}_{1\ldots l}$ $(l\leqslant n-1)$ of main effects and interaction effects up to order $(l-1)$ is equivalent to orthogonality of the two columns $\bm{S}_{1}\otimes\ldots\otimes\bm{S}_{l}\otimes\bm{S}_{1\ldots l}$ and $\bm{S}_{l+1}\otimes\ldots\otimes\bm{S}_{n}$ . By the induction hypothesis, $\bm{S}_{l+1}\otimes\ldots\otimes\bm{S}_{n}$ is an interaction effect of the $(n-l-1)$ -th order and, therefore, is orthogonal to all main effects and interaction effects of the factors $F_{1},\ldots,F_{l}$ . Hence, by Lemma 1.7.1, the sum of the elements of $\bm{S}_{l+1}\otimes\ldots\otimes\bm{S}_{n}$ corresponding to any combination of the levels of the factors $F_{1},\ldots,F_{l}$ equals zero.

Thus, the proof is complete.

Theorem 1.8.2. Suppose that the condition of proportional frequencies Eq. (23) is satisfied for a given set of the factors $F_{1},\ldots,F_{t}$ . Let all matrices of main effects contain pairwise orthogonal columns. Then all possible products of the columns $\bm{S}_{1},\ldots,\bm{S}_{t}$ (one for each factor) produce the full set of $\left(s_{1}-1\right)\times\ldots\times\left(s_{t}-1\right)$ pairwise orthogonal interaction effects of the factors $F_{1},\ldots,F_{t}$ .

Proof Consider two different products of the columns (one for each factor): $\bm{S}_{1}\otimes\ldots\otimes\bm{S}_{t}$ and $\bm{S}^{\prime}_{1}\otimes\ldots\otimes\bm{S}^{\prime}_{t}$ . For these two sets of columns, at least one pair (let it be $\bm{S}_{1}$ and $\bm{S}^{\prime}_{1}$ ) contains different columns. The elements of the column $\bm{S}_{2}\otimes\ldots\otimes\bm{S}_{t}\otimes\bm{S}^{\prime}_{2}\otimes\ldots\otimes\bm{S}^{\prime}_{t}$ depend only on the levels of the factors $F_{2},\ldots,F_{t}$ . Now we have to prove that the sum of the elements of $\bm{S}_{1}\otimes\bm{S}^{\prime}_{1}$ for any combination of the levels of the factors $F_{2},\ldots,F_{t}$ equals zero. Indeed, by Theorem 1.7.2, for any combination of the levels of the factors $F_{2},\ldots,F_{t}$ , the levels of $F_{1}$ occur with proportional frequencies. Hence, for all combinations of the levels of the factors $F_{2},\ldots,F_{t}$ , the sums of the elements of $\bm{S}_{1}\otimes\bm{S}^{\prime}_{1}$ have the same sign. By the hypothesis of the theorem, $\bm{S}_{1}$ and $\bm{S}^{\prime}_{1}$ are orthogonal. Hence, these sums equal zero.

Thus, the proof is complete.
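Theorems 1.8.1 and 1.8.2 can be illustrated on a small design. The sketch below reads the product $\otimes$ of two columns as their element-wise product, which is how the proofs above use it; this reading, the 8-run design, and the helper names `column`, `times`, and `dot` are assumptions of the illustration.

```python
from itertools import combinations

# Hypothetical 8-run design with proportional frequencies:
# factor F_1 has 2 levels, factor F_2 has 3 levels (frequencies 2, 4, 2).
design = [(0, 0), (0, 1), (0, 1), (0, 2), (1, 0), (1, 1), (1, 1), (1, 2)]


def column(factor, values):
    """Vector whose u-th element depends only on the level of `factor` in run u."""
    return [values[run[factor]] for run in design]


def times(u, v):
    """Element-wise product of two columns (assumed meaning of the otimes sign)."""
    return [a * b for a, b in zip(u, v)]


def dot(u, v):
    """Scalar product of two columns."""
    return sum(a * b for a, b in zip(u, v))


ones = [1] * len(design)
s1 = column(0, {0: 1, 1: -1})          # main effect of F_1
s2a = column(1, {0: 1, 1: 0, 2: -1})   # two orthogonal main effects of F_2
s2b = column(1, {0: 1, 1: -1, 2: 1})

interactions = [times(s1, s2a), times(s1, s2b)]  # (2 - 1)(3 - 1) = 2 products
lower_order = [ones, s1, s2a, s2b]

# Theorem 1.8.1: each product is orthogonal to I and to all main effects.
orthogonal_to_lower = all(dot(x, y) == 0 for x in interactions for y in lower_order)
# Theorem 1.8.2: the products are pairwise orthogonal.
mutually_orthogonal = all(dot(x, y) == 0 for x, y in combinations(interactions, 2))
print(orthogonal_to_lower, mutually_orthogonal)
```

Both checks succeed for this design; replacing it by a design that violates Eq. (23) would in general break the first check.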

Theorem 1.8.2 can be generalized to the case of matrices of main effects that do not necessarily contain orthogonal columns.

Theorem 1.8.3. Suppose that the condition of proportional frequencies Eq. (23) is satisfied for a given set of the factors $F_{1},\ldots,F_{t}$ . Then all possible products of the columns $\bm{S}_{1},\ldots,\bm{S}_{t}$ (one for each factor) produce the full set of $\left(s_{1}-1\right)\times\ldots\times\left(s_{t}-1\right)$ linearly independent interaction effects of the factors $F_{1},\ldots,F_{t}$ .

The proof of the theorem is similar to the proof of the corresponding part of Theorem 1.4.1.

2.9 Effects of levels and interaction effects of levels

For the sake of simplicity, without loss of generality, consider a design with three factors.

Let $\eta_{ijn}=Ey_{ijn}$ , where $y_{ijn}$ is an observation that corresponds to the point of a full design $\bm{D}^{f}$ with the $i$ -th level of the factor $F_{1}$ , the $j$ -th level of the factor $F_{2}$ , and the $n$ -th level of the factor $F_{3}$ . An asterisk instead of an index means that we take the average over all levels of the corresponding factor. For example,

$\displaystyle\eta_{*jn}=\frac{1}{s_{1}}\sum\limits_{i=0}^{s_{1}-1}{\eta_{ijn}.}$

Definition 1.9.1. The number $\beta_{0}=\eta_{***}$ is called a true average; the number $\beta_{1}^{i}=\eta_{i**}-\eta_{***}$ is called an effect of the $i$ -th level of the factor $F_{1}$ .

Definition 1.9.2. The difference between the effect of the $i$ -th level of the factor $F_{1}$ for a subset with the $j$ -th level of the factor $F_{2}$ and the effect of the $i$ -th level of the factor $F_{1}$ is called an interaction effect of the $i$ -th level of the factor $F_{1}$ and the $j$ -th level of the factor $F_{2}$ and denoted by $\beta_{12}^{ij}$ .

Definition 1.9.3. The difference between the interaction effect of the $i$ -th level of the factor $F_{1}$ and the $j$ -th level of the factor $F_{2}$ for a subset with the $n$ -th level of the factor $F_{3}$ and the interaction effect of the $i$ -th level of the factor $F_{1}$ and the $j$ -th level of the factor $F_{2}$ is called an interaction effect of the $i$ -th level of the factor $F_{1}$ , the $j$ -th level of the factor $F_{2}$ , and the $n$ -th level of the factor $F_{3}$ and denoted by $\beta_{123}^{ijn}$ .

It is evident that Definitions 1.9.2 and 1.9.3 are correct, since they are symmetrical for the factors $F_{1},F_{2}$ , and $F_{3}$ . For example, for the design with three factors,

$\displaystyle\beta_{123}^{ijn}=\left(\eta_{ijn}-\eta_{*jn}-\eta_{i*n}+\eta_{**% n}\right)-\left(\eta_{ij*}-\eta_{*j*}-\eta_{i**}+\eta_{***}\right)=\eta_{ijn}-% \eta_{ij*}-\eta_{i*n}-\eta_{*jn}+\eta_{i**}+\eta_{*j*}+\eta_{**n}-\eta_{***}.$ (24)

Other effects of levels and interaction effects of levels are defined analogously.
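Definitions 1.9.1 to 1.9.3 and Eq. (24) can be checked numerically. In the sketch below the table of expectations `eta`, the numbers of levels in `shape`, and the helper names are hypothetical; exact fractions are used so that equalities can be tested without rounding.

```python
from fractions import Fraction
from itertools import product

shape = (2, 3, 2)  # s_1, s_2, s_3 (illustrative choice)
# Hypothetical expectations eta[i][j][n]; the formula is arbitrary.
eta = [[[(i + 1) * (j + 2) ** 2 + 3 * n * (i + j) + 5 * i * j * n
         for n in range(shape[2])]
        for j in range(shape[1])]
       for i in range(shape[0])]


def avg(i=None, j=None, n=None):
    """eta with an asterisk (average) in every position passed as None."""
    ranges = [range(s) if k is None else [k] for k, s in zip((i, j, n), shape)]
    cells = [eta[a][b][c] for a, b, c in product(*ranges)]
    return Fraction(sum(cells), len(cells))


def beta_123(i, j, n):
    """Definition 1.9.3: interaction of levels i, j within level n of F_3 minus the overall one."""
    within_n = avg(i, j, n) - avg(None, j, n) - avg(i, None, n) + avg(None, None, n)
    overall = avg(i, j) - avg(None, j) - avg(i) + avg()
    return within_n - overall


def beta_123_expanded(i, j, n):
    """Right-hand side of Eq. (24)."""
    return (avg(i, j, n) - avg(i, j) - avg(i, None, n) - avg(None, j, n)
            + avg(i) + avg(None, j) + avg(None, None, n) - avg())


cells = list(product(*(range(s) for s in shape)))
identity_holds = all(beta_123(*c) == beta_123_expanded(*c) for c in cells)
# The interaction effects of levels sum to zero over the levels of F_1.
sums_vanish = all(sum(beta_123(i, j, n) for i in range(shape[0])) == 0
                  for j in range(shape[1]) for n in range(shape[2]))
print(identity_holds, sums_vanish)
```

The second check reflects the fact that averaging Eq. (24) over any one index gives zero, which is the sense in which these quantities are contrasts.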

Any effect of the level or any interaction effect of the levels is a linear combination of the mathematical expectations of observations for $\bm{D}^{f}$ . The coefficients of such linear combinations form the vectors that we will call vectors of effects of levels and vectors of interaction effects of levels. Denote by $\bm{\psi}_{1}^{(i)}$ the vector of the effect of the $i$ -th level of the factor $F_{1}$ and denote by $\bm{\psi}_{12}^{ij}$ the vector of the interaction effect of the $i$ -th level of the factor $F_{1}$ and the $j$ -th level of the factor $F_{2}$ , etc.

In Eq. (24), the coefficient for the treatment combination with the levels $i$ , $j$ , and $n$ of the factors $F_{1},F_{2}$ , and $F_{3}$ respectively is $\frac{1}{N}\left(s_{1}s_{2}s_{3}-s_{1}s_{2}-s_{1}s_{3}-s_{2}s_{3}+s_{1}+s_{2}+s_{3}-1\right)=\frac{1}{N}\left(s_{1}-1\right)\left(s_{2}-1\right)\left(s_{3}-1\right)$ .

The coefficient for a treatment combination in which the factors $F_{1}$ and $F_{2}$ appear at the levels $i$ and $j$ respectively and the factor $F_{3}$ appears at a level other than $n$ equals $\frac{1}{N}\left(-s_{1}s_{2}+s_{1}+s_{2}-1\right)=-\frac{1}{N}\left(s_{1}-1\right)\left(s_{2}-1\right)$ .

The coefficient for a treatment combination in which the factor $F_{1}$ appears at the level $i$ and the factors $F_{2}$ and $F_{3}$ appear at levels other than $j$ and $n$ respectively equals $\frac{1}{N}\left(s_{1}-1\right)$ .

The coefficient for a treatment combination in which the factors $F_{1},F_{2}$ , and $F_{3}$ appear at levels other than $i, j,$ and $n$ respectively equals $-\frac{1}{N}$ .

The summary for all elements of the vector of the interaction effects of the levels $i$ , $j$ , and $n$ is given in the following table.

Table 1

Vector of Interaction Effect of Levels ${i}$ , ${j}$ , and ${n}$

$F_{1}$	$F_{2}$	$F_{3}$	Elements of the vector of the interaction effect of levels $i$ , $j$ , and $n$ of the factors $F_{1}$ , $F_{2}$ , and $F_{3}$ respectively
$i$	$j$	$n$	$1/N\left(s_{1}-1\right)\left(s_{2}-1\right)\left(s_{3}-1\right)$
$i$	$j$		${-1}/N\left(s_{1}-1\right)\left(s_{2}-1\right)$
$i$		$n$	${-1}/N\left(s_{1}-1\right)\left(s_{3}-1\right)$
	$j$	$n$	${-1}/N\left(s_{2}-1\right)\left(s_{3}-1\right)$
$i$			$1/N\left(s_{1}-1\right)$
	$j$		$1/N\left(s_{2}-1\right)$
		$n$	$1/N\left(s_{3}-1\right)$
			$-1/N$

Each row of the table corresponds to a set of treatments. If, for example, the factor $F_{1}$ appears at the level $i$ in the set, the corresponding cell of the table contains the index $i$ . If the factor $F_{1}$ appears at a level other than $i$ , the corresponding cell is left empty. The table cells for the factors $F_{2}$ and $F_{3}$ are filled analogously.

Let

$\displaystyle\Delta_{ij*}^{+-}=\begin{cases}1&\text{if for the $u$-th observation the factor $F_{1}$ appears at the level $i$}\\ &\text{and the factor $F_{2}$ appears at a level other than $j$,}\\ 0&\text{otherwise}.\\ \end{cases}$

We will use similar notation in similar cases.

The element of the vector of the interaction effect of the levels $i$ , $j$ , and $n$ for the $u$ -th observation is

$\displaystyle\psi_{123}^{ijn}(u)=\frac{1}{N}\left\{\Delta_{ijn}^{+++}\left(s_{1}-1\right)\left(s_{2}-1\right)\left(s_{3}-1\right)-\Delta_{ijn}^{++-}\left(s_{1}-1\right)\left(s_{2}-1\right)-\Delta_{ijn}^{+-+}\left(s_{1}-1\right)\left(s_{3}-1\right)-\Delta_{ijn}^{-++}\left(s_{2}-1\right)\left(s_{3}-1\right)+\Delta_{ijn}^{+--}\left(s_{1}-1\right)+\Delta_{ijn}^{-+-}\left(s_{2}-1\right)+\Delta_{ijn}^{--+}\left(s_{3}-1\right)-\Delta_{ijn}^{---}\right\}=\frac{1}{N}\left\{\Delta_{i**}^{+}\left(s_{1}-1\right)-\Delta_{i**}^{-}\right\}\left\{\Delta_{*j*}^{+}\left(s_{2}-1\right)-\Delta_{*j*}^{-}\right\}\left\{\Delta_{**n}^{+}\left(s_{3}-1\right)-\Delta_{**n}^{-}\right\}.$

We can easily prove it if we take into account the following equalities:

$\displaystyle\Delta_{i**}^{+}\Delta_{*j*}^{+}\Delta_{**n}^{+}=\Delta_{ijn}^{+++};\quad\Delta_{i**}^{+}\Delta_{*j*}^{+}\Delta_{**n}^{-}=\Delta_{ijn}^{++-};$ $\displaystyle\Delta_{i**}^{+}\Delta_{*j*}^{-}\Delta_{**n}^{-}=\Delta_{ijn}^{+--};\quad\Delta_{i**}^{-}\Delta_{*j*}^{-}\Delta_{**n}^{-}=\Delta_{ijn}^{---}.$
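The factorization can also be confirmed by direct computation. The following sketch expands the averages in Eq. (24) for a full design with $N=s_{1}s_{2}s_{3}$ points and compares every coefficient with the product form; the numbers of levels and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

shape = (2, 3, 4)                      # s_1, s_2, s_3 (illustrative choice)
N = shape[0] * shape[1] * shape[2]     # number of points of the full design


def coeff_from_eq24(target, cell):
    """Coefficient of eta at `cell` in the interaction effect of the levels `target`."""
    total = Fraction(0)
    # Each term of Eq. (24) keeps some indices fixed and averages over the others.
    for keep in product((True, False), repeat=3):
        # The term involves this cell only if the cell agrees with the target on kept indices.
        if all(c == t for c, t, k in zip(cell, target, keep) if k):
            averaged = [s for s, k in zip(shape, keep) if not k]
            weight = Fraction(1)
            for s in averaged:
                weight /= s
            # The sign alternates with the number of asterisks in the term.
            total += (-1) ** len(averaged) * weight
    return total


def coeff_from_product(target, cell):
    """Product form: (1/N) times (s - 1) for each matching level and -1 otherwise."""
    result = Fraction(1, N)
    for c, t, s in zip(cell, target, shape):
        result *= (s - 1) if c == t else -1
    return result


cells = list(product(*(range(s) for s in shape)))
agree = all(coeff_from_eq24(t, c) == coeff_from_product(t, c)
            for t in cells for c in cells)
print(agree)
```

The comparison runs over all target level combinations and all cells, so it covers every row of Table 1.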

Analogously, in the general case, the element of the vector $\bm{\psi}_{1\ldots r}^{i_{1}\ldots i_{r}}$ of the interaction effect of the levels $i_{j}\left(j=1,\ldots,r\right)$ of the factors $F_{1},\ldots,F_{r}$ is

$\displaystyle\psi_{1\ldots r}^{i_{1}\ldots i_{r}}(u)=\frac{1}{N}\left\{\Delta_% {i_{1}*\ldots*}^{+}\left(s_{1}-1\right)-\Delta_{i_{1}*\ldots*}^{-}\right\}% \times\ldots\times\left\{\Delta_{*\ldots*i_{r}*\ldots*}^{+}\left(s_{r}-1\right% )-\Delta_{*\ldots*i_{r}*\ldots*}^{-}\right\}.$

In particular, the element of the vector of the effect of the level $i_{1}$ of the factor $F_{1}$ is

$\displaystyle\psi_{1}^{i_{1}}(u)=\frac{1}{N}\left\{\Delta_{i_{1}*\ldots*}^{+}% \left(s_{1}-1\right)-\Delta_{i_{1}*\ldots*}^{-}\right\}.$ (25)

Hence,

$\displaystyle N\psi_{1\ldots r}^{i_{1}\ldots i_{r}}(u)=N\psi_{1}^{i_{1}}(u)% \ldots N\psi_{r}^{i_{r}}(u),$

$\displaystyle N\bm{\psi}_{1\ldots r}^{i_{1}\ldots i_{r}}=N\bm{\psi}_{1}^{i_{1}% }\otimes\ldots\otimes N\bm{\psi}_{r}^{i_{r}}.$ (26)

Therefore, the following theorem has been proved.

Theorem 1.9.1. The vector of the interaction effect of the levels $i_{1},\ldots,i_{r}$ of the factors $F_{1},\ldots,F_{r}$ respectively is, apart from a proportionality factor, the product of the vectors of the effects of the levels $i_{1},\ldots,i_{r}$ of the factors $F_{1},\ldots,F_{r}$ respectively.

It is evident that $\bm{\psi}_{1}^{i_{1}},\ldots,\bm{\psi}_{r}^{i_{r}}$ are the vectors of main effects of the factors $F_{1},\ldots,F_{r}$ respectively. Hence, the vector of the interaction effect of levels $\bm{\psi}_{1\ldots r}^{i_{1}\ldots i_{r}}$ , by Note 1 to Theorem 1.4.1, is the vector of the interaction effect of the factors $F_{1},\ldots,F_{r}$ for the design $\bm{D}^{f}$ .

It is easy to verify that any $s_{1}-1$ of the $s_{1}$ vectors of main effects of the factor $F_{1}$ with the elements $\Delta_{i*\ldots*}^{+}\left(s_{1}-1\right)-\Delta_{i*\ldots*}^{-}$ ( $i=0,1,\ldots,s_{1}-1)$ form a set of linearly independent vectors. A similar statement holds for the factors $F_{2},\ldots,F_{r}$ . All possible products of the selected independent main effects (one from each factor) form $\left(s_{1}-1\right)\ldots\left(s_{r}-1\right)$ interaction effects of levels of the factors $F_{1},\ldots,F_{r}$ . By Note 1 to Theorem 1.4.1, these $\left(s_{1}-1\right)\ldots\left(s_{r}-1\right)$ interaction effects of levels form a set of linearly independent vectors. Therefore, the following theorem has been proved.

Theorem 1.9.2. Any vector of the interaction effect of levels of the factors $F_{1},\ldots,F_{r}$ is a vector of an interaction effect of the factors $F_{1},\ldots,F_{r}$ . A maximal linearly independent subset of the vectors of interaction effects of levels of the factors $F_{1},\ldots,F_{r}$ contains exactly $\left(s_{1}-1\right)\ldots\left(s_{r}-1\right)$ vectors.

2.10 A model of true effects for qualitative factors

Denote the matrix of all vectors of effects of levels of the factor $F_{i}$ by $\bm{\psi}_{i}=\left\|\bm{\psi}_{i}^{0},\bm{\psi}_{i}^{1},\ldots,\bm{\psi}_{i}^% {s_{i}-1}\right\|,$ denote the matrix of all interaction effects of levels of the factors $F_{i}$ and $F_{j}$ by $\bm{\psi}_{ij}$ , etc.

Then, by Eq. (26), the following equality holds:

$\displaystyle N\bm{\psi}_{i_{1}\ldots i_{r}}=N\bm{\psi}_{i_{1}}\otimes\ldots% \otimes N\bm{\psi}_{i_{r}}.$

Denote

$\displaystyle\bm{\Psi}_{1\ldots m}=\left\|\frac{1}{N}\bm{I},\bm{\psi}_{1},% \ldots,\bm{\psi}_{m},\bm{\psi}_{12},\ldots,\bm{\psi}_{1\ldots m}\right\|.$

Let

$\displaystyle x_{i}^{(j)}(u)=\begin{cases}1&\text{if the factor $F_{i}$ appears at the level $j$ in the $u$-th observation,}\\ 0&\text{otherwise}.\\ \end{cases}$ (27)

We will also use the following notation:

$\displaystyle\bm{x}_{i}^{jT}=\left(x_{i}^{(j)}(1),\ldots,x_{i}^{(j)}(N)\right)% ,\bm{x}_{i}=\left\|\bm{x}_{i}^{0},\ldots,\bm{x}_{i}^{s_{i}-1}\right\|,$ $\displaystyle\bm{x}_{i_{1}\ldots i_{r}}=\bm{x}_{i_{1}}\otimes\ldots\otimes\bm{% x}_{i_{r}},$ $\displaystyle\bm{X}_{1\ldots m}=\left\|\bm{I},\bm{x}_{1},\ldots,\bm{x}_{m},\bm% {x}_{12},\ldots,\bm{x}_{1\ldots m}\right\|.$ (28)

Theorem 1.10.1.

$\displaystyle\bm{X}_{1\ldots m}\bm{\Psi}_{1\ldots m}^{T}=\bm{E}_{N}.$ (29)

Proof For the sake of simplicity, without loss of generality, consider a design with three factors.

In the matrix $\bm{X}_{123}$, consider the row corresponding to the $i$-, $j$-, and $n$-th levels of the factors $F_{1}$, $F_{2}$, and $F_{3}$ respectively. In the matrix $\bm{\Psi}_{123}$, consider the row corresponding to some combination of levels of the factors $F_{1},F_{2}$, and $F_{3}$. Then the scalar product of these two rows of the matrices $\bm{X}_{123}$ and $\bm{\Psi}_{123}$ equals

$\displaystyle\begin{aligned}\frac{1}{N}\Big\{1&+\left[\Delta_{i**}^{+}\left(s_{1}-1\right)-\Delta_{i**}^{-}\right]+\left[\Delta_{*j*}^{+}\left(s_{2}-1\right)-\Delta_{*j*}^{-}\right]+\left[\Delta_{**n}^{+}\left(s_{3}-1\right)-\Delta_{**n}^{-}\right]\\ &+\left[\Delta_{i**}^{+}\left(s_{1}-1\right)-\Delta_{i**}^{-}\right]\left[\Delta_{*j*}^{+}\left(s_{2}-1\right)-\Delta_{*j*}^{-}\right]\\ &+\left[\Delta_{i**}^{+}\left(s_{1}-1\right)-\Delta_{i**}^{-}\right]\left[\Delta_{**n}^{+}\left(s_{3}-1\right)-\Delta_{**n}^{-}\right]\\ &+\left[\Delta_{*j*}^{+}\left(s_{2}-1\right)-\Delta_{*j*}^{-}\right]\left[\Delta_{**n}^{+}\left(s_{3}-1\right)-\Delta_{**n}^{-}\right]\\ &+\left[\Delta_{i**}^{+}\left(s_{1}-1\right)-\Delta_{i**}^{-}\right]\left[\Delta_{*j*}^{+}\left(s_{2}-1\right)-\Delta_{*j*}^{-}\right]\left[\Delta_{**n}^{+}\left(s_{3}-1\right)-\Delta_{**n}^{-}\right]\Big\}.\end{aligned}$ (30)

If the given row of $\bm{\Psi}_{123}$ corresponds to the levels $i$, $j$, and $n$ of the factors $F_{1},F_{2}$, and $F_{3}$ respectively, then

$\displaystyle\Delta_{i**}^{+}=\Delta_{*j*}^{+}=\Delta_{**n}^{+}=1,\quad\Delta_{i**}^{-}=\Delta_{*j*}^{-}=\Delta_{**n}^{-}=0.$

Therefore, Eq. (30) becomes

$\displaystyle\begin{aligned}\frac{1}{N}\Big\{1&+\left(s_{1}-1\right)+\left(s_{2}-1\right)+\left(s_{3}-1\right)+\left(s_{1}-1\right)\left(s_{2}-1\right)+\left(s_{1}-1\right)\left(s_{3}-1\right)\\ &+\left(s_{2}-1\right)\left(s_{3}-1\right)+\left(s_{1}-1\right)\left(s_{2}-1\right)\left(s_{3}-1\right)\Big\}=\frac{1}{N}s_{1}s_{2}s_{3}=1.\end{aligned}$

Hence, it has been proved that the diagonal elements of $\bm{X}_{123}\bm{\Psi}_{123}^{T}$ are equal to 1.

Assume that in the given row of $\bm{\Psi}_{123}$, at least one of the factors $F_{1},F_{2}$, and $F_{3}$ appears at a level other than $i$, $j$, and $n$ respectively. Without loss of generality, assume that the factor $F_{3}$ appears at a level other than $n$. Then $\Delta_{**n}^{+}=0$ and $\Delta_{**n}^{-}=1$, and Eq. (30) becomes

$\displaystyle\frac{1}{N}\left\{A+A\left[\Delta_{**n}^{+}\left(s_{3}-1\right)-\Delta_{**n}^{-}\right]\right\}=\frac{1}{N}\left(A-A\right)=0,$

where

$\displaystyle A=1+\left[\Delta_{i**}^{+}(s_{1}-1)-\Delta_{i**}^{-}\right]+\left[\Delta_{*j*}^{+}(s_{2}-1)-\Delta_{*j*}^{-}\right]+\left[\Delta_{i**}^{+}(s_{1}-1)-\Delta_{i**}^{-}\right]\left[\Delta_{*j*}^{+}(s_{2}-1)-\Delta_{*j*}^{-}\right].$

This proves the theorem.
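The identity Eq. (29) can be checked numerically on a small full design. The following is a minimal sketch, assuming that the elements of $\bm{\psi}$ are the quantities $\frac{1}{N}\left[\Delta^{+}(s-1)-\Delta^{-}\right]$ appearing in Eq. (30) and that $\otimes$ acts row by row; the function names (`factor_subsets`, `x_block`, `psi_block`, `build`) are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import combinations, product

def factor_subsets(m):
    """Empty set (absolute term) followed by all non-empty subsets of m factors."""
    subsets = [()]
    for r in range(1, m + 1):
        subsets.extend(combinations(range(m), r))
    return subsets

def x_block(run, levels, subset):
    """Indicator block x_subset for one run: 1 at the run's own level combination."""
    cells = product(*[range(levels[i]) for i in subset])
    return [Fraction(int(all(run[i] == lv[k] for k, i in enumerate(subset))))
            for lv in cells]

def psi_block(run, levels, subset, N):
    """Block psi_subset for one run: (1/N) * prod_i [Delta+ (s_i - 1) - Delta-]."""
    row = []
    for lv in product(*[range(levels[i]) for i in subset]):
        value = Fraction(1, N)
        for k, i in enumerate(subset):
            value *= (levels[i] - 1) if run[i] == lv[k] else -1
        row.append(value)
    return row

def build(levels):
    """Return the matrices X and Psi (as lists of rows) for the full design."""
    runs = list(product(*[range(s) for s in levels]))
    N = len(runs)
    subsets = factor_subsets(len(levels))
    X = [sum((x_block(u, levels, S) for S in subsets), []) for u in runs]
    Psi = [sum((psi_block(u, levels, S, N) for S in subsets), []) for u in runs]
    return X, Psi

def times_transpose(A, B):
    """Matrix product A * B^T for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(ra, rb)) for rb in B] for ra in A]

X, Psi = build((2, 3))                      # illustrative 2 x 3 full design
P = times_transpose(X, Psi)
identity = [[Fraction(int(i == j)) for j in range(len(P))] for i in range(len(P))]
print(P == identity)
```

Exact rational arithmetic is used so that the comparison with the identity matrix involves no rounding tolerance.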

Now we will define the vector $\mathfrak{B}$ of true effects for qualitative factors:

$\displaystyle\mathfrak{B}=\bm{\Psi}_{1\ldots m}^{T}\bm{\eta}^{f},$ (31)

where $\bm{\eta}^{f}$ , as before, is the vector of mathematical expectations at the points of full design. By Eqs (29) and (31), the following equality holds:

$\displaystyle\bm{\eta}^{f}=\bm{X}_{1\ldots m}\mathfrak{B}.$ (32)

We can consider Eq. (32) as a model that holds at all points of $\bm{D}^{f}$.

Denote by $\bm{X}^{\Omega}$ and $\mathfrak{B}^{\Omega}$ the parts of the matrix $\bm{X}_{1\ldots m}$ and the vector $\mathfrak{B}$ respectively corresponding to the factorial set $\Omega$ . Assume that the elements of the vector $\mathfrak{B}$ that do not correspond to the factorial set $\Omega$ are equal to zero. Then Eq. (32) becomes

$\displaystyle\bm{\eta}^{f}=\bm{X}^{\Omega}\mathfrak{B}^{\Omega}.$ (33)

The coefficients of the model Eq. (32) and, therefore, the model Eq. (33) are easy to interpret. This interpretation becomes evident if we recall the definition of the effects of levels and interaction effects of levels.

The coefficient matrix $\bm{X}_{1\ldots m}$ of the model Eq. (32) for the full design is not a full rank matrix. For example, the sum of the columns belonging to $\bm{x}_{1}$ is $\bm{I}$. Therefore, the solution of the normal equations of the method of least squares for the parameters $\mathfrak{B}$ of the model is not unique. However, there exists a system of linear equalities for these parameters

$\displaystyle\bm{H}\mathfrak{B}=0$ (34)

such that the matrix

$\displaystyle\begin{Vmatrix}\bm{X}_{1\ldots m}\\ \bm{H}\\ \end{Vmatrix}$ (35)

has full rank and no row of $\bm{H}$ can be represented by a linear combination of rows of the matrix $\bm{X}_{1\ldots m}$. In this case, for the design matrix $\bm{X}_{1\ldots m}$, i.e., for the full design, with the restriction Eq. (34) on the parameters $\mathfrak{B}$, the LS estimates of the parameters are unique (Scheffé, 1959).

Consider the $u$-th row $[\psi_{1}^{(0)}(u),\psi_{1}^{(1)}(u),\ldots,\psi_{1}^{(s_{1}-1)}(u)]$ of the matrix $\bm{\psi}_{1}$ of the vectors of effects of levels of the factor $F_{1}$. Since

$\displaystyle\sum\limits_{n=0}^{s_{1}-1}\Delta_{n*\ldots*}^{+}=1,\quad\sum\limits_{n=0}^{s_{1}-1}\Delta_{n*\ldots*}^{-}=s_{1}-1,$

we get, by Eq. (25), that

$\displaystyle\sum\limits_{n=0}^{s_{1}-1}\psi_{1}^{(n)}(u)=\sum\limits_{n=0}^{s_{1}-1}\left\{\Delta_{n*\ldots*}^{+}\left(s_{1}-1\right)-\Delta_{n*\ldots*}^{-}\right\}=0.$

Hence, for any factor $F_{i}$

$\displaystyle\bm{\psi}_{i}\bm{I}_{s_{i}}=\bm{0}\quad\left(i=1,\ldots,m\right).$ (36)

It follows that

$\displaystyle\begin{gathered}\bm{\psi}_{i}\otimes\bm{\psi}_{j}^{n_{j}}\bm{I}_{s_{j}}=\bm{0},\\ \ldots\\ \bm{\psi}_{1}\otimes\bm{\psi}_{2}^{n_{2}}\otimes\ldots\otimes\bm{\psi}_{m}^{n_{m}}\bm{I}_{s_{1}}=\bm{0},\\ \ldots\\ \bm{\psi}_{1}^{n_{1}}\otimes\bm{\psi}_{2}^{n_{2}}\otimes\ldots\otimes\bm{\psi}_{m}\bm{I}_{s_{m}}=\bm{0}\\ \left(i,j=1,\ldots,m;\;i\neq j;\;n_{l}=0,\ldots,s_{l}-1\right).\end{gathered}$

Therefore, by Eq. (36), we get

$\displaystyle\begin{gathered}\sum\limits_{n_{i}=0}^{s_{i}-1}\beta_{i}^{\left(n_{i}\right)}=0,\quad\sum\limits_{n_{i}=0}^{s_{i}-1}\beta_{ij}^{\left(n_{i}n_{j}\right)}=0,\quad\ldots,\\ \sum\limits_{n_{1}=0}^{s_{1}-1}\beta_{12\ldots m}^{\left(n_{1}n_{2}\ldots n_{m}\right)}=0,\quad\ldots,\quad\sum\limits_{n_{m}=0}^{s_{m}-1}\beta_{12\ldots m}^{\left(n_{1}n_{2}\ldots n_{m}\right)}=0,\\ i,j=1,\ldots,m;\;i\neq j;\;n_{l}=0,\ldots,s_{l}-1.\end{gathered}$ (37)

Then Eq. (37) becomes

$\displaystyle\bm{H}\mathfrak{B}=\bm{0},$ (38)

where $\bm{H}$ is the coefficient matrix of Eq. (37).
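The restrictions on the true effects can be illustrated on the full design $3^{2}$. The sketch below assumes the same elements of $\bm{\psi}$ as in Eq. (30), computes the true effects $\mathfrak{B}=\bm{\Psi}^{T}\bm{\eta}^{f}$ of Eq. (31), and checks both the sum-to-zero equalities and the model Eq. (32). The numerical values in `eta` are arbitrary and serve only as an illustration; `beta` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

levels = (3, 3)
runs = list(product(*[range(s) for s in levels]))
N = len(runs)
# Arbitrary illustrative expectations at the nine points of the full design.
eta = [Fraction(v) for v in (5, 1, 4, 2, 8, 3, 7, 6, 9)]

def beta(subset, lv):
    """True effect: sum over runs u of psi_subset^lv(u) * eta(u)."""
    total = Fraction(0)
    for run, e in zip(runs, eta):
        weight = Fraction(1, N)
        for k, i in enumerate(subset):
            weight *= (levels[i] - 1) if run[i] == lv[k] else -1
        total += weight * e
    return total

beta0 = beta((), ())                                   # absolute term (overall mean)
main1 = [beta((0,), (a,)) for a in range(3)]           # effects of levels of F1
main2 = [beta((1,), (b,)) for b in range(3)]           # effects of levels of F2
inter = [[beta((0, 1), (a, b)) for b in range(3)] for a in range(3)]

# Sum-to-zero restrictions on the effects.
row_sums = [sum(inter[a]) for a in range(3)]
col_sums = [sum(inter[a][b] for a in range(3)) for b in range(3)]

# Model Eq. (32): every expectation is recovered from the effects.
rebuilt = [beta0 + main1[a] + main2[b] + inter[a][b] for a, b in runs]

print(sum(main1), sum(main2), row_sums, col_sums, rebuilt == eta)
```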

Split the matrix $\bm{H}$ into submatrices to correspond to the partitions of $\bm{X}_{1\ldots m}$ and $\bm{\Psi}_{1\ldots m}$ :

$\displaystyle\bm{H}=\left\|\bm{H}_{0},\bm{H}_{1},\ldots,\bm{H}_{m},\bm{H}_{12}% ,\ldots,\bm{H}_{1\ldots m}\right\|.$

Then for the full design $3^{2}$ , for example, the matrix $\bm{H}$ is

$\displaystyle\begin{array}{l}\qquad\;\;\bm{H}_{0}\quad\;\;\bm{H}_{1}\qquad\;\;\bm{H}_{2}\qquad\qquad\qquad\bm{H}_{12}\\ \bm{H}=\begin{Vmatrix}0|&1&1&1|&0&0&0|&0&0&0&0&0&0&0&0&0\\ 0|&0&0&0|&1&1&1|&0&0&0&0&0&0&0&0&0\\ 0|&0&0&0|&0&0&0|&1&1&1&0&0&0&0&0&0\\ 0|&0&0&0|&0&0&0|&0&0&0&1&1&1&0&0&0\\ 0|&0&0&0|&0&0&0|&0&0&0&0&0&0&1&1&1\\ 0|&0&0&0|&0&0&0|&1&0&0&1&0&0&1&0&0\\ 0|&0&0&0|&0&0&0|&0&1&0&0&1&0&0&1&0\\ 0|&0&0&0|&0&0&0|&0&0&1&0&0&1&0&0&1\end{Vmatrix}.\end{array}$

Denote the columns of the matrices $\bm{H}$ and $\bm{X}_{1\ldots m}$ by $\bm{h}$ and $\bm{x}$ respectively, adding the indices corresponding to the indices of $\bm{\psi}$. In this example, the columns of the matrix $\bm{H}$ are $\bm{h}_{0},\bm{h}_{1}^{0},\bm{h}_{1}^{1},\bm{h}_{1}^{2},\bm{h}_{2}^{0},\bm{h}_{2}^{1},\bm{h}_{2}^{2},\bm{h}_{12}^{00},\bm{h}_{12}^{01},\bm{h}_{12}^{02},\bm{h}_{12}^{10},\bm{h}_{12}^{11},\bm{h}_{12}^{12},\bm{h}_{12}^{20},\bm{h}_{12}^{21},\bm{h}_{12}^{22}$.

Lemma 1.10.1. If for the vector $\bm{\gamma}$

$\displaystyle\bm{H}_{1\ldots r}\bm{\gamma}=\bm{0},\quad\bm{\gamma}^{T}\bm{\gamma}\neq 0,$ (39)

then $\bm{x}_{1\ldots r}\bm{\gamma}$ is a vector of the interaction effect of the factors $F_{1},\ldots,F_{r}$, and for any vector of the interaction effect $\bm{\psi}$ of these factors there exists a vector $\bm{\gamma}$ such that Eq. (39) holds and $\bm{\psi}=\bm{x}_{1\ldots r}\bm{\gamma}.$

Proof It is evident that any column of the matrix $\bm{\psi}_{1\ldots r}$ , as any other vector of the interaction effect of the factors $F_{1},\ldots,F_{r}$ , can be represented by a linear combination of columns of the matrix $\bm{x}_{1\ldots r}$ , namely $\bm{x}_{1\ldots r}\bm{\gamma}.$

Let

$\displaystyle\bm{x}_{1\ldots r}\bm{\gamma}=\sum\limits_{i_{1}=0}^{s_{1}-1}\ldots\sum\limits_{i_{r}=0}^{s_{r}-1}\bm{x}_{1\ldots r}^{i_{1}\ldots i_{r}}\gamma_{1\ldots r}^{\left(i_{1}\ldots i_{r}\right)}.$

By the definition of a vector of an interaction effect, $\bm{x}_{1\ldots r}\bm{\gamma}$ is orthogonal to all columns of the matrix

$\displaystyle\bm{\Phi}_{i_{1}\ldots i_{r-1}}=\left\|\bm{I},\bm{F}_{i_{1}},\ldots,\bm{F}_{i_{r-1}},\bm{F}_{i_{1}i_{2}},\ldots,\bm{F}_{i_{1}\ldots i_{r-1}}\right\|$

for any factors $F_{i_{1}},\ldots,F_{i_{r-1}}$ of $F_{1},\ldots,F_{r}$ . Therefore, by Lemma 1.7.1, the sum of the elements of $\bm{x}_{1\ldots r}\bm{\gamma}$ corresponding to any combination of levels of the factors $F_{i_{1}},\ldots,F_{i_{r-1}}$ is equal to zero, i.e.,

$\displaystyle\sum\limits_{i_{1}=0}^{s_{1}-1}\gamma_{1\ldots r}^{\left(i_{1}\ldots i_{r}\right)}=0,\quad\ldots,\quad\sum\limits_{i_{r}=0}^{s_{r}-1}\gamma_{1\ldots r}^{\left(i_{1}\ldots i_{r}\right)}=0.$ (40)

Hence,

$\displaystyle\bm{H}_{1\ldots r}\bm{\gamma}=\begin{Vmatrix}\left\{\sum\limits_{i_{1}=0}^{s_{1}-1}\gamma_{1\ldots r}^{\left(i_{1}\ldots i_{r}\right)}\right\}\\ \vdots\\ \left\{\sum\limits_{i_{r}=0}^{s_{r}-1}\gamma_{1\ldots r}^{\left(i_{1}\ldots i_{r}\right)}\right\}\\ \end{Vmatrix}=\bm{0}.$ (41)

Therefore, the condition Eq. (39) holds.

Now we have to prove that if the condition Eq. (39) holds, $\bm{x}_{1\ldots r}\bm{\gamma}$ is the vector of the interaction effect of the factors $F_{1},\ldots,F_{r}$ .

Summing up, for example, the first equality of Eq. (40), we get

$\displaystyle\sum\limits_{i_{1}=0}^{s_{1}-1}\ldots\sum\limits_{i_{r}=0}^{s_{r}-1}\gamma_{1\ldots r}^{\left(i_{1}\ldots i_{r}\right)}=0.$

That means that $\bm{x}_{1\ldots r}\bm{\gamma}$ is a contrast. It follows, by Eq. (39), that Eq. (41) holds, and, therefore, Eq. (40) holds as well. Hence, the sum of the elements of $\bm{x}_{1\ldots r}\bm{\gamma}$ corresponding to any combination of levels of the factors $F_{i_{1}},\ldots,F_{i_{r-1}}$ of the factors $F_{1},\ldots,F_{r}$ is equal to zero.

This completes the proof of the lemma.

Theorem 1.10.2. The matrix Eq. (35) is a full rank matrix and no row of $\bm{H}$ can be represented by a linear combination of the rows of the matrix $\bm{X}_{1\ldots m}$.

Proof The fact that no row of $\bm{H}$ can be represented by a linear combination of the rows of the matrix $\bm{X}_{1\ldots m}$ is obvious. Now we have to prove that there is no nonzero vector $\bm{\gamma}$ such that

$\displaystyle\begin{Vmatrix}\bm{X}_{1\ldots m}\\ \bm{H}\\ \end{Vmatrix}\bm{\gamma}=\bm{0}.$ (42)

Indeed, Eq. (42) implies that

$\displaystyle\bm{H}\bm{\gamma}=\bm{0}.$ (43)

By Lemma 1.10.1 and Eq. (43), it follows that

$\displaystyle\bm{\gamma}^{T}=\left(\gamma_{0},\bm{\gamma}_{1}^{T},\ldots,\bm{\gamma}_{m}^{T},\bm{\gamma}_{12}^{T},\ldots,\bm{\gamma}_{1\ldots m}^{T}\right),$

where $\bm{x}_{i}\bm{\gamma}_{i}$ is the vector of the main effect of the factor $F_{i}$; $\bm{x}_{i_{1}\ldots i_{r}}\bm{\gamma}_{i_{1}\ldots i_{r}}$ is the vector of the interaction effect of the factors $F_{i_{1}},\ldots,F_{i_{r}}$. Hence,

$\displaystyle\bm{X}_{1\ldots m}\bm{\gamma}=\bm{I}\gamma_{0}+\bm{x}_{1}\bm{\gamma}_{1}+\ldots+\bm{x}_{m}\bm{\gamma}_{m}+\bm{x}_{12}\bm{\gamma}_{12}+\ldots+\bm{x}_{1\ldots m}\bm{\gamma}_{1\ldots m}.$ (44)

By Theorem 1.4.1, all vectors in the right-hand side of Eq. (44) are orthogonal. Therefore,

$\displaystyle\bm{X}_{1\ldots m}\bm{\gamma}\neq\bm{0},$

which contradicts Eq. (42).

This completes the proof of Theorem 1.10.2.
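For the full design $3^{2}$ the rank statement can be verified directly. The following sketch builds the coefficient matrix of the model Eq. (32) with the column order $\bm{I},\bm{x}_{1},\bm{x}_{2},\bm{x}_{12}$, copies the matrix $\bm{H}$ displayed above, and computes ranks by exact Gaussian elimination. The helper `rank` is written here only for illustration and is not taken from the article.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Rank of a matrix (list of rows) by Gauss-Jordan elimination over fractions."""
    M = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(rk, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for i in range(len(M)):
            if i != rk and M[i][col] != 0:
                factor = M[i][col] / M[rk][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[rk])]
        rk += 1
        if rk == len(M):
            break
    return rk

# Coefficient matrix of the full 3^2 design: columns I | x_1 | x_2 | x_12.
X = []
for a, b in product(range(3), repeat=2):
    row = [1]
    row += [int(a == j) for j in range(3)]
    row += [int(b == k) for k in range(3)]
    row += [int((a, b) == (j, k)) for j in range(3) for k in range(3)]
    X.append(row)

# Restriction matrix H for the 3^2 design, row by row as displayed in the text.
H = [
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1],
]

rank_X, rank_H, rank_stacked = rank(X), rank(H), rank(X + H)
print(rank_X, rank_H, rank_stacked)
```

The stacked matrix has 16 columns, so it has full rank exactly when `rank_stacked` equals 16; the equality `rank_X + rank_H == rank_stacked` reflects the fact that the row spaces of $\bm{X}_{12}$ and $\bm{H}$ have only the zero vector in common.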

The model Eq. (32) with the restrictions Eq. (38) is called a full factorial model of true effects for qualitative factors (or a $C^{f}$ -model of true effects).

Let $\bm{H}^{\Omega}$ be a submatrix of the matrix $\bm{H}$ corresponding to the factorial set $\Omega$. Then the model Eq. (33) with the restrictions $\bm{H}^{\Omega}\mathfrak{B}^{\Omega}=\bm{0}$ is called a factorial model of true effects for the factorial set $\Omega$ for qualitative factors (or a $C^{\Omega}$-model of true effects). Hereafter, we will not necessarily keep the words “true effects” in the notation of these models.

For the $C^{\Omega}$-model, obviously, the following two conditions are satisfied:

The $C^{\Omega}$ -model Eq. (33) of true effects contains an absolute term and terms with all effects of levels for any factor.

If the model contains at least one term with an interaction effect of levels of the factors $F_{i_{1}},\ldots,F_{i_{r}},$ then the model contains all terms with interaction effects of levels of any $n(n\leqslant r)$ factors of the factors $F_{i_{1}},\ldots,F_{i_{r}}$ .

2.11 A mixed model

Consider the full design $\bm{D}^{f}$ together with the design $\bm{D}$ for the case when the factors $F_{1},\ldots,F_{n}$ are qualitative and the factors $F_{n+1},\ldots,F_{m}$ are quantitative.

For the qualitative factors, as in Section 2.10, we will use the matrices $\bm{\rho}_{i}=\bm{\psi}_{i}=\left\|\bm{\psi}_{i}^{0},\ldots,\bm{\psi}_{i}^{s_{% i}-1}\right\|$ of all vectors of effects of levels of the factors $F_{i}\left(i=1,\ldots,n\right)$ . For quantitative factors, as in Section 2.5, we will use the matrices $\bm{\rho}_{j}=(1/N^{f})\bm{F}_{j}^{f}$ of vectors of main effects of the factors $F_{j}(j=n+1,\ldots,m)$ for the design $\bm{D}^{f}$ . For vectors of interaction effects of qualitative factors $F_{i_{1}},\ldots,F_{i_{r}}\left(i_{1},\ldots,i_{r}\leqslant n\right)$ we will apply the matrix $\bm{\rho}_{i_{1}\ldots i_{r}}$ of all vectors of interaction effects of levels of the factors $F_{i_{1}},\ldots,F_{i_{r}}$ :

$\displaystyle N^{f}\bm{\rho}_{i_{1}\ldots i_{r}}=N^{f}\bm{\psi}_{i_{1}\ldots i_{r}}=N^{f}\bm{\psi}_{i_{1}}\otimes\ldots\otimes N^{f}\bm{\psi}_{i_{r}}.$

For quantitative factors $F_{j_{1}},\ldots,F_{j_{l}}\left(j_{1},\ldots,j_{l}\geqslant n+1\right),$ we will apply the matrix $\bm{\rho}_{j_{1}\ldots j_{l}}$ of interaction effects of the factors $F_{j_{1}},\ldots,F_{j_{l}}$ :

$\displaystyle N^{f}\bm{\rho}_{j_{1}\ldots j_{l}}=\bm{F}_{j_{1}\ldots j_{l}}^{f}.$

For the qualitative factors $F_{i_{1}},\ldots,F_{i_{r}}\left(i_{1},\ldots,i_{r}\leqslant n\right)$ and the quantitative factors $F_{j_{1}},\ldots,F_{j_{l}}\left(j_{1},\ldots,j_{l}\geqslant n+1\right)$ we will use the matrix $\bm{\rho}_{i_{1}\ldots i_{r}j_{1}\ldots j_{l}}$:

$\displaystyle N^{f}\bm{\rho}_{i_{1}\ldots i_{r}j_{1}\ldots j_{l}}=N^{f}\bm{\psi}_{i_{1}\ldots i_{r}}\otimes\bm{F}_{j_{1}\ldots j_{l}}^{f}.$ (45)

By using the line of proof of Theorem 1.9.2, we get the following theorem.

Theorem 1.11.1. Any vector of the matrix Eq. (45) is a vector of an interaction effect of the factors $F_{i_{1}},\ldots,F_{i_{r}},F_{j_{1}},\ldots,F_{j_{l}}$. A maximum linearly independent subset of vectors of interaction effects of the matrix Eq. (45) contains exactly $\left(s_{i_{1}}-1\right)\ldots\left(s_{i_{r}}-1\right)\left(s_{j_{1}}-1\right)\ldots\left(s_{j_{l}}-1\right)$ vectors.

Denote

$\displaystyle\begin{aligned}\bm{P}_{1\ldots m}&=\left\|\frac{1}{N^{f}}\bm{I},\bm{\rho}_{1},\ldots,\bm{\rho}_{m},\bm{\rho}_{12},\ldots,\bm{\rho}_{1\ldots m}\right\|,\\ \bm{z}_{i}&=\bm{x}_{i}\quad\left(i=1,\ldots,n\right),\\ \bm{z}_{j}&=\bm{F}_{j}^{f}\quad\left(j=n+1,\ldots,m\right),\\ \bm{z}_{i_{1}\ldots i_{r}j_{1}\ldots j_{l}}&=\bm{x}_{i_{1}\ldots i_{r}}\otimes\bm{F}_{j_{1}\ldots j_{l}}^{f}\quad\left(i_{1},\ldots,i_{r}\leqslant n;\;j_{1},\ldots,j_{l}\geqslant n+1\right),\\ \bm{Z}_{1\ldots m}&=\left\|\bm{I},\bm{z}_{1},\ldots,\bm{z}_{m},\bm{z}_{12},\ldots,\bm{z}_{1\ldots m}\right\|.\end{aligned}$ (46)

Theorem 1.11.2.

$\displaystyle\bm{Z}_{1\ldots m}\bm{P}_{1\ldots m}^{T}=\bm{E}_{N^{f}}.$ (47)

Proof Consider the full design $\bm{D}^{\prime}$ with $s_{i}$ runs for the single factor $F_{i}$. By Theorems 1.4.1 and 1.10.1, we can easily see that

$\displaystyle\bm{X}^{\prime}_{i}\bm{\Psi}_{i}^{\prime T}=\frac{1}{s_{i}}\bm{\Phi}^{\prime}_{i}\bm{\Phi}_{i}^{\prime T}=\bm{E}_{s_{i}}.$

Besides,

$\displaystyle\begin{Vmatrix}\bm{X}^{\prime}_{i}\\ \vdots\\ \bm{X}^{\prime}_{i}\\ \end{Vmatrix}=\left\|\bm{I},\bm{x}_{i}\right\|;\quad\begin{Vmatrix}\bm{\Psi}^{\prime}_{i}\\ \vdots\\ \bm{\Psi}^{\prime}_{i}\\ \end{Vmatrix}=\frac{N^{f}}{s_{i}}\left\|\frac{1}{N^{f}}\bm{I},\bm{\psi}_{i}\right\|;\quad\begin{Vmatrix}\bm{\Phi}^{\prime}_{i}\\ \vdots\\ \bm{\Phi}^{\prime}_{i}\\ \end{Vmatrix}=\left\|\bm{I},\bm{F}_{i}^{f}\right\|.$

Therefore,

$\displaystyle\frac{N^{f}}{s_{i}}\left\|\bm{I},\bm{x}_{i}\right\|\cdot\left\|\frac{1}{N^{f}}\bm{I},\bm{\psi}_{i}\right\|^{T}=\frac{1}{s_{i}}\left\|\bm{I},\bm{F}_{i}^{f}\right\|\cdot\left\|\bm{I},\bm{F}_{i}^{f}\right\|^{T},$

and

$\displaystyle\bm{x}_{i}\bm{\psi}_{i}^{T}=\frac{1}{N^{f}}\bm{F}_{i}^{f}\bm{F}_{i}^{fT}=\bm{z}_{i}\bm{\rho}_{i}^{T}.$ (48)

Then, by Eqs (2.11) and (48), we get for $i_{1},\ldots,i_{r}\leqslant n$ and $j_{1},\ldots,j_{l}\geqslant n+1$ that

$\displaystyle\begin{aligned}\bm{z}_{i_{1}\ldots i_{r}j_{1}\ldots j_{l}}\bm{\rho}_{i_{1}\ldots i_{r}j_{1}\ldots j_{l}}^{T}&=\left(\bm{x}_{i_{1}\ldots i_{r}}\otimes\bm{F}_{j_{1}\ldots j_{l}}^{f}\right)\left(\bm{\psi}_{i_{1}\ldots i_{r}}\otimes\bm{F}_{j_{1}\ldots j_{l}}^{f}\right)^{T}\\ &=\left(N^{f}\right)^{r+l-1}\left(\bm{x}_{i_{1}}\otimes\ldots\otimes\bm{x}_{i_{r}}\otimes\bm{F}_{j_{1}}^{f}\otimes\ldots\otimes\bm{F}_{j_{l}}^{f}\right)\times\left(\bm{\psi}_{i_{1}}\otimes\ldots\otimes\bm{\psi}_{i_{r}}\otimes\frac{1}{N^{f}}\bm{F}_{j_{1}}^{f}\otimes\ldots\otimes\bm{F}_{j_{l}}^{f}\right)^{T}\\ &=\left(N^{f}\right)^{r+l-1}\left(\bm{x}_{i_{1}}\bm{\psi}_{i_{1}}^{T}\right)*\ldots*\left(\bm{x}_{i_{r}}\bm{\psi}_{i_{r}}^{T}\right)*\left(\frac{1}{N^{f}}\bm{F}_{j_{1}}^{f}\bm{F}_{j_{1}}^{fT}\right)*\ldots*\left(\frac{1}{N^{f}}\bm{F}_{j_{l}}^{f}\bm{F}_{j_{l}}^{fT}\right)\\ &=\left(N^{f}\right)^{r+l-1}\left(\bm{x}_{i_{1}}\bm{\psi}_{i_{1}}^{T}\right)*\ldots*\left(\bm{x}_{j_{l}}\bm{\psi}_{j_{l}}^{T}\right)=\bm{x}_{i_{1}\ldots i_{r}j_{1}\ldots j_{l}}\bm{\psi}_{i_{1}\ldots i_{r}j_{1}\ldots j_{l}}^{T},\end{aligned}$

where $*$ denotes term by term multiplication of matrices.

Then Eq. (29) implies Eq. (47).
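The key step Eq. (48) can be illustrated for a single factor at three levels, where $N^{f}=s_{i}=3$. The sketch below assumes, in line with the relation $\frac{1}{s_{i}}\bm{\Phi}^{\prime}_{i}\bm{\Phi}_{i}^{\prime T}=\bm{E}_{s_{i}}$ used in the proof, that the columns of $\bm{F}_{i}^{f}$ are orthogonal contrasts scaled to squared length $s_{i}$; the particular choice of the linear and quadratic orthogonal polynomials is an assumption made only for this example.

```python
import math

s = 3  # number of levels; for a single-factor full design N^f = s

# Indicator matrix x_i: one run per level, so x_i is the identity matrix.
x = [[1.0 if u == j else 0.0 for j in range(s)] for u in range(s)]

# Effects of levels: psi_i^j(u) = (1/s) * (s - 1 if u == j else -1).
psi = [[((s - 1) if u == j else -1) / s for j in range(s)] for u in range(s)]

# Assumed contrast columns of F_i^f, scaled so that each has squared length s.
linear = [math.sqrt(3 / 2) * v for v in (-1, 0, 1)]
quadratic = [math.sqrt(1 / 2) * v for v in (1, -2, 1)]
F = [[linear[u], quadratic[u]] for u in range(s)]

def times_transpose(A, B):
    """Matrix product A * B^T for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(ra, rb)) for rb in B] for ra in A]

lhs = times_transpose(x, psi)                                   # x_i psi_i^T
rhs = [[v / s for v in row] for row in times_transpose(F, F)]   # (1/N^f) F F^T

max_diff = max(abs(lhs[u][v] - rhs[u][v]) for u in range(s) for v in range(s))
print(max_diff < 1e-12)
```

Both sides equal the matrix with $2/3$ on the diagonal and $-1/3$ elsewhere, i.e., the projector onto the contrasts of the three levels, which is why the qualitative and quantitative descriptions of a factor can be interchanged in Eq. (47).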

Denote by

$\displaystyle\bm{\Theta}=\bm{P}_{1\ldots m}^{T}\bm{\eta}^{f}$ (49)

a vector of true effects of the mixed model. Theorem 1.11.2 implies that

$\displaystyle\bm{\eta}^{f}=\bm{Z}_{1\ldots m}\bm{\Theta}.$ (50)

There exist equalities similar to Eq. (37) for the parameters Eq. (49) of the mixed model Eq. (50), with the summation indices $i,j\leqslant n$. Let $\bm{V}$ denote the coefficient matrix of the corresponding system. Then

$\displaystyle\bm{V\Theta}=\bm{0}.$ (51)

Using methods similar to the methods of Section 2.10, we can show that the following theorem holds.

Theorem 1.11.3.

$\displaystyle\begin{Vmatrix}\bm{Z}_{1\ldots m}\\ \bm{V}\\ \end{Vmatrix}$

is a full rank matrix, and no row of $\bm{V}$ is represented by a linear combination of the rows of $\bm{Z}_{1\ldots m}$ .

The model Eq. (50) with the restriction Eq. (51) will be called the mixed full factorial model of true effects (or the $G^{f}$ -model of true effects).

Denote by $\bm{Z}^{\Omega},\bm{V}^{\Omega}$, and $\bm{\Theta}^{\Omega}$ the parts of the matrices $\bm{Z}_{1\ldots m},\bm{V}$, and the vector $\bm{\Theta}$ respectively corresponding to the factorial set $\Omega$. Assume that the elements of the vector $\bm{\Theta}$ that do not correspond to the set $\Omega$ are equal to zero. Then the following model (which will be called the mixed factorial model of true effects for the factorial set $\Omega$, or the $G^{\Omega}$-model of true effects) holds:

$\displaystyle\bm{\eta}^{f}=\bm{Z}^{\Omega}\bm{\Theta}^{\Omega}\quad\left(\bm{V}^{\Omega}\bm{\Theta}^{\Omega}=\bm{0}\right).$ (52)

We may omit the words “true effects” in the notation of the model.

The model Eq. (52) can be extended to a wider domain. In this case, we get the following model:

$\displaystyle Ey\left(X_{1},\ldots,X_{m}\right)=\bm{f}_{g}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{\Theta}^{\Omega}\quad\left(\bm{V}^{\Omega}\bm{\Theta}^{\Omega}=\bm{0}\right),$

where $\bm{f}_{g}^{T}\left(X_{1u},\ldots,X_{mu}\right)$ is the $u$ -th row of the matrix $\bm{Z}^{\Omega}.$

2.12 Equivalence of factorial models

We now focus on the equivalence of factorial models in the sense of the properties of the related regression. The $A^{\Omega}$-model of true effects and the $C^{\Omega}$-model of true effects are special cases of the $G^{\Omega}$-model of true effects. Hence, the only model (among the factorial models considered) that is not a special case of the $G^{\Omega}$-model of true effects is the general $A^{\Omega}$-model. Therefore, to prove the equivalence of all types of factorial models for the factorial set $\Omega$, we have to prove the equivalence of any two $G^{\Omega}$-models (i.e., of any $G^{\Omega}$-model of true effects and any $A^{\Omega}$-model of true effects) and the equivalence of the $A^{\Omega}$-models of true effects and the general $A^{\Omega}$-model.

Consider a set $S^{\Omega f}$ that consists of the vector $\bm{I}$ and a full set of linearly independent effects for the factorial set $\Omega$ for the full design $\bm{D}^{f}$. For the fractional design $\bm{D}$ (i.e., for a design that does not include some combinations of the levels), consider a set $S^{\Omega D}$ of vectors with the following property: their coordinates corresponding to some combination of levels of the factors are equal to the elements of the vectors of the set $S^{\Omega f}$ corresponding to the same combination of levels for the design $\bm{D}^{f}$. We will call the vectors of the set $S^{\Omega D}$ the vectors of effects generated by the design $\bm{D}$ and the factorial set $\Omega$, and denote them by the superscript $\bm{D}$. Let $\bm{Z}^{\Omega D}$ be the coefficient matrix of the design $\bm{D}$ for the $G^{\Omega}$-model Eq. (52).

We now focus on the problem of estimability of the parameters of the model Eq. (52) for the fractional design $\bm{D}$ that includes some treatment combinations (not necessarily different) of the full design $\bm{D}^{f}$ .

Lemma 1.12.1. The matrix

$\displaystyle\begin{Vmatrix}\bm{Z}^{\Omega D}\\ \bm{V}^{\Omega}\\ \end{Vmatrix}$ (53)

is a matrix of full rank if and only if vectors of effects generated by the design $\bm{D}$ and the factorial set $\Omega$ are linearly independent.

Proof Let $\bm{\gamma}$ be a nonzero vector such that

$\displaystyle\begin{Vmatrix}\bm{Z}^{\Omega D}\\ \bm{V}^{\Omega}\\ \end{Vmatrix}\bm{\gamma}=\bm{0}.$ (54)

Then $\bm{V}^{\Omega}\bm{\gamma}=\bm{0}$ , and, by Lemma 1.10.1,

$\displaystyle\bm{\gamma}^{T}=\left(\gamma_{0},\bm{\gamma}_{1}^{T},\ldots,\bm{\gamma}_{m}^{T},\bm{\gamma}_{i_{1}i_{2}}^{T},\ldots\right),$

where $\bm{z}_{i}\bm{\gamma}_{i}$ is the vector of the main effects of the factor $F_{i};\bm{x}_{i_{1}\ldots i_{r}}\bm{\gamma}_{i_{1}\ldots i_{r}}$ is the vector of the interaction effect of the factors $F_{i_{1}},\ldots,F_{i_{r}}.$

Therefore,

$\displaystyle\bm{Z}^{\Omega D}\bm{\gamma}=\gamma_{0}\bm{I}+\sum\limits_{i=1}^{m}\bm{z}_{i}^{D}\bm{\gamma}_{i}+\sum\limits_{i_{1},i_{2}}\bm{z}_{i_{1}i_{2}}^{D}\bm{\gamma}_{i_{1}i_{2}}+\ldots=\bm{0},$ (55)

where $\bm{z}^{D}$ includes those and only those rows of $\bm{z}$ that correspond to treatment combinations of the fractional design $\bm{D}$, so that $\bm{I},\bm{z}_{i}^{D}\bm{\gamma}_{i},\bm{z}_{i_{1}i_{2}}^{D}\bm{\gamma}_{i_{1}i_{2}},\ldots$ are vectors of effects generated by the design $\bm{D}$ and the factorial set $\Omega$. It follows from Eq. (55) that these vectors of effects generated by the design $\bm{D}$ and the factorial set $\Omega$ are linearly dependent. Conversely, assume that the vectors of effects generated by the design $\bm{D}$ and the factorial set $\Omega$ are linearly dependent. By Lemma 1.10.1, any vector of the interaction effect of the factors $F_{i_{1}},\ldots,F_{i_{r}}$ can be represented as $\bm{z}_{i_{1}\ldots i_{r}}\bm{\gamma}_{i_{1}\ldots i_{r}}\left(\bm{V}_{i_{1}\ldots i_{r}}\bm{\gamma}_{i_{1}\ldots i_{r}}=\bm{0}\right)$. Then the corresponding vector of the effect generated by the design $\bm{D}$ is

$\displaystyle\bm{z}_{i_{1}\ldots i_{r}}^{D}\bm{\gamma}_{i_{1}\ldots i_{r}}\quad\left(\bm{V}_{i_{1}\ldots i_{r}}\bm{\gamma}_{i_{1}\ldots i_{r}}=\bm{0}\right).$

By virtue of the assumption, there exist $\lambda_{0},\lambda_{1},\ldots,\lambda_{m},\lambda_{i_{1}i_{2}},\ldots$ (not all equal to zero) such that

$\displaystyle\lambda_{0}\bm{I}+\sum\limits_{i=1}^{m}\lambda_{i}\bm{z}_{i}^{D}\bm{\gamma}_{i}+\sum\limits_{i_{1},i_{2}}\lambda_{i_{1}i_{2}}\bm{z}_{i_{1}i_{2}}^{D}\bm{\gamma}_{i_{1}i_{2}}+\ldots=\bm{0}.$

Then for the vector $\bm{\gamma}^{T}=\left(\lambda_{0},\lambda_{1}\bm{\gamma}_{1}^{T},\ldots,\lambda_{m}\bm{\gamma}_{m}^{T},\lambda_{i_{1}i_{2}}\bm{\gamma}_{i_{1}i_{2}}^{T},\ldots\right)$,

$\displaystyle\bm{Z}^{\Omega D}\bm{\gamma}=\bm{0}\text{ and }\bm{V}^{\Omega}\bm{\gamma}=\bm{0},$

i.e., Eq. (54) is satisfied.

Thus, the proof is complete.

Consider three sets of factors corresponding to the fractional design $\bm{D}$: $F_{1},\ldots,F_{m}$; $F_{1}^{\prime},\ldots,F_{m}^{\prime}$; and the quantitative factors $F_{1}^{\prime\prime},\ldots,F_{m}^{\prime\prime}$, such that $s_{i}=s_{i}^{\prime}=s_{i}^{\prime\prime}$. For the factors $F_{1},\ldots,F_{m}$, consider the $G^{\Omega}$-model of true effects

$\displaystyle\bm{\eta}^{f}=\bm{Z}^{\Omega}\bm{\Theta}^{\Omega}$ (56)

with the restrictions on the parameters

$\displaystyle\bm{V}^{\Omega}\bm{\Theta}^{\Omega}=\bm{0}.$ (57)

For the factors $F_{1}^{\prime},\ldots,F_{m}^{\prime}$ , consider the $G^{\Omega}$ -model of true effects

$\displaystyle\bm{\eta}^{f}=\bm{Z}^{\prime\Omega}\bm{\Theta}^{\prime\Omega}$

with the restrictions on parameters

$\displaystyle\bm{V}^{\prime\Omega}\bm{\Theta}^{\prime\Omega}=\bm{0}.$

For the quantitative factors $F_{1}^{\prime\prime},\ldots,F_{m}^{\prime\prime}$, consider the general $A^{\Omega}$-model Eq. (106) with the coefficient matrix $\bm{X}$.

Theorem 1.12.1. If for the design $\bm{D}$ one of the matrices

$\displaystyle\begin{Vmatrix}\bm{Z}^{\Omega D}\\ \bm{V}^{\Omega}\\ \end{Vmatrix},\quad\begin{Vmatrix}\bm{Z}^{\prime\Omega D}\\ \bm{V}^{\prime\Omega}\\ \end{Vmatrix},\text{ and }\bm{X}$

is a full rank matrix, then each of them is a full rank matrix.

The proof of the theorem follows from Lemma 1.12.1 and the Corollary to Theorem 1.6.1.

Theorem 1.12.1 and Lemma 1.12.1 imply that the existence of a unique solution of the normal equations of the method of least squares does not depend on whether the factors are qualitative or quantitative. It depends only on whether the vectors of effects generated by the design $\bm{D}$ and the factorial set $\Omega$ are linearly independent or not. The design $\bm{D}$ is called nonsingular if these vectors of effects (generated by the design $\bm{D}$ and the factorial set $\Omega$) are linearly independent.
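The definition of a nonsingular design can be illustrated with two-level factors, for which each main effect has a single independent vector that may be coded by $\pm 1$, and interaction effects are coordinate-wise products. The following sketch, with the hypothetical helpers `effect_vectors` and `is_nonsingular`, restricts these vectors to the runs of a design and tests their linear independence for a half fraction of the $2^{3}$ design. The choice of the fraction and of the factorial sets is an assumption made only for this example.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Rank of a matrix (list of rows) by Gauss-Jordan elimination over fractions."""
    M = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(rk, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for i in range(len(M)):
            if i != rk and M[i][col] != 0:
                factor = M[i][col] / M[rk][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[rk])]
        rk += 1
        if rk == len(M):
            break
    return rk

def effect_vectors(design, omega):
    """Rows of the matrix with columns I and one +-1 effect vector per set in omega."""
    rows = []
    for run in design:
        row = [1]
        for subset in omega:
            value = 1
            for i in subset:
                value *= run[i]
            row.append(value)
        rows.append(row)
    return rows

def is_nonsingular(design, omega):
    """True if the vectors of effects generated by the design are independent."""
    return rank(effect_vectors(design, omega)) == len(omega) + 1

full = list(product((-1, 1), repeat=3))
half = [run for run in full if run[0] * run[1] * run[2] == 1]  # 4 of the 8 runs

main_effects = [(0,), (1,), (2,)]
with_interaction = main_effects + [(0, 1)]

checks = (is_nonsingular(half, main_effects),
          is_nonsingular(half, with_interaction),
          is_nonsingular(full, with_interaction))
print(checks)
```

In this example the half fraction is nonsingular for the main-effects factorial set but singular once the interaction of the first two factors is added, because five vectors cannot be independent in a space of four runs.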

Theorem 1.12.2. For a nonsingular factorial design $\bm{D}$, all factorial models for the same factorial set $\Omega$ are equivalent in the sense of the properties of the related regression (for any point, the estimates of the regression function are equal and the variances of these estimates are equal).

Proof First, we will prove that the general $A^{\Omega}$ -model and any $A^{\Omega}$ -model of true effects are equivalent. Second, we will prove that any $G^{\Omega}$ -model of true effects and any $A^{\Omega}$ -model of true effects are equivalent.

Consider the $A^{\Omega}$ -model of true effects

$\displaystyle Ey_{t}\left(X_{1},\ldots,X_{m}\right)=\bm{f}_{t}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{B}_{t}^{\Omega}=B_{t0}+\sum\limits_{i}\bm{f}_{ti}^{T}\left(X_{i}\right)\bm{B}_{ti}+\sum\limits_{i_{1},i_{2}}\left[\bm{f}_{ti_{1}}\left(X_{i_{1}}\right)\otimes\bm{f}_{ti_{2}}\left(X_{i_{2}}\right)\right]^{T}\bm{B}_{ti_{1}i_{2}}+\ldots,$ (58)

with the domain $Z_{t}$, which does not necessarily coincide with $\bm{D}^{f}$, such that

$\displaystyle\bm{F}_{i}^{f}=\begin{Vmatrix}\bm{f}_{ti}^{T}\left(X_{i1}\right)\\ \vdots\\ \bm{f}_{ti}^{T}\left(X_{iN^{f}}\right)\\ \end{Vmatrix}.$

Then it is evident that

$\displaystyle\bm{\Phi}^{\Omega}=\begin{Vmatrix}\bm{f}_{t}^{T}\left(X_{11},\ldots,X_{m1}\right)\\ \vdots\\ \bm{f}_{t}^{T}\left(X_{1N^{f}},\ldots,X_{mN^{f}}\right)\\ \end{Vmatrix}.$

The coefficient matrix of the design $\bm{D}$ for the model Eq. (58) is $\bm{X}_{t}=\bm{\Phi}^{\Omega D}$.

Consider the general $A^{\Omega}$ -model with the domain $Z_{g}$ :

$\displaystyle Ey\left(X_{1},\ldots,X_{m}\right)=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{B}^{\Omega}=B_{0}+\sum\limits_{i}\bm{f}_{i}^{T}\left(X_{i}\right)\bm{B}_{i}+\sum\limits_{i_{1},i_{2}}\left[\bm{f}_{i_{1}}\left(X_{i_{1}}\right)\otimes\bm{f}_{i_{2}}\left(X_{i_{2}}\right)\right]^{T}\bm{B}_{i_{1}i_{2}}+\ldots$ (59)

Denote the coefficient matrix of the design $\bm{D}$ for the model Eq. (59) by $\bm{X}$ . The submatrix $\bm{\Phi}_{i}^{fD}=\|\bm{I},\bm{F}_{i}^{fD}\|$ of the matrix $\bm{X}_{t}=\bm{\Phi}^{\Omega D}$ has the size $N\times s_{i}$ and the rank $s_{i}$ . The submatrix

$\displaystyle\bm{G}_{i}=\begin{Vmatrix}1&\bm{f}_{i}^{T}\left(X_{i1}\right)\\ &\vdots\\ 1&\bm{f}_{i}^{T}\left(X_{iN}\right)\\ \end{Vmatrix}$

of the matrix $\bm{X}$ has the same size $N\times s_{i}$ and rank $s_{i}$ . By the Corollary to Theorem 1.6.1, these submatrices are related by the nonsingular linear transformation $\bm{A}_{i}$ :

$\displaystyle\bm{\Phi}_{i}^{fD}=\bm{G}_{i}\bm{A}_{i}.$ (60)

The matrices $\bm{X}_{t}$ and $\bm{X}$ are also related by the nonsingular linear transformation:

$\displaystyle\bm{X}_{t}=\bm{XA}.$ (61)

Besides, for the treatment combinations of $\bm{D}^{f}$ the following equality holds:

$\displaystyle\bm{f}_{t}^{T}\left(X_{1},\ldots,X_{m}\right)=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{A}.$ (62)

Then for the point $\left(X_{1},\ldots,X_{m}\right)\in\bm{D}^{f}$, the LS estimate for the model Eq. (58), by Eqs (61) and (62), coincides with the LS estimate for the model Eq. (59):

$\displaystyle\begin{aligned}\hat{y}_{t}\left(X_{1},\ldots,X_{m}\right)&=\bm{f}_{t}^{T}\left(X_{1},\ldots,X_{m}\right)\hat{B}_{t}=\bm{f}_{t}^{T}\left(X_{1},\ldots,X_{m}\right)\left(\bm{X}_{t}^{T}\bm{X}_{t}\right)^{-1}\bm{X}_{t}^{T}\bm{y}\\ &=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{A}\left(\bm{A}^{T}\bm{X}^{T}\bm{XA}\right)^{-1}\bm{A}^{T}\bm{X}^{T}\bm{y}\\ &=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\left(\bm{X}^{T}\bm{X}\right)^{-1}\bm{X}^{T}\bm{y}=\hat{y}\left(X_{1},\ldots,X_{m}\right).\end{aligned}$ (63)

The variance of the estimate $\hat{y}_{t}\left(X_{1},\ldots,X_{m}\right)$ at the point $\left(X_{1},\ldots,X_{m}\right)\in\bm{D}^{f}$, by Eqs (61) and (62), coincides with the variance of the estimate $\hat{y}\left(X_{1},\ldots,X_{m}\right)$:

$\displaystyle\begin{aligned}\frac{\sigma^{2}\left(\hat{y}_{t}\right)}{\sigma^{2}}&=\bm{f}_{t}^{T}\left(X_{1},\ldots,X_{m}\right)\left(\bm{X}_{t}^{T}\bm{X}_{t}\right)^{-1}\bm{f}_{t}\left(X_{1},\ldots,X_{m}\right)\\ &=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{A}\left(\bm{A}^{T}\bm{X}^{T}\bm{XA}\right)^{-1}\bm{A}^{T}\bm{f}\left(X_{1},\ldots,X_{m}\right)\\ &=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\left(\bm{X}^{T}\bm{X}\right)^{-1}\bm{f}\left(X_{1},\ldots,X_{m}\right)=\frac{\sigma^{2}\left(\hat{y}\right)}{\sigma^{2}}.\end{aligned}$ (64)

Assume that the model Eq. (58) is defined on $Z_{t}$ such that the equality similar to Eq. (60) holds over the domain $Z=Z_{t}\cap Z_{g}$ , i.e., that

$\displaystyle\bm{f}_{ti}^{T}\left(X_{i}\right)=\bm{f}_{i}^{T}\left(X_{i}\right)\bm{A}_{i}.$

Then Eq. (62) and, therefore, Eqs (63) and (64) are satisfied for all points $\left(X_{1},\ldots,X_{m}\right)\in Z$ .
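The invariance expressed by Eqs (63) and (64) can be checked on a small example with one quantitative factor at three levels. The sketch below compares the regressors $1,x,x^{2}$ with the regressors $1,x,3x^{2}-2$, which are related by a nonsingular matrix $\bm{A}$ as in Eq. (62). The design, the observations `y`, and both sets of regressors are assumptions chosen only for illustration.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A z = b for a square nonsingular A by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for c in range(n):
        p = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [u - factor * v for u, v in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]

def predict_and_variance(X, y, f):
    """Return f^T (X^T X)^{-1} X^T y and the variance factor f^T (X^T X)^{-1} f."""
    k = len(X[0])
    gram = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    coef = solve(gram, xty)
    weights = solve(gram, f)
    return (sum(a * b for a, b in zip(f, coef)),
            sum(a * b for a, b in zip(f, weights)))

def f_general(x):
    """Regressors 1, x, x^2 of the general model."""
    return [Fraction(1), Fraction(x), Fraction(x) ** 2]

def f_true(x):
    """Regressors 1, x, 3x^2 - 2; equal to f_general(x) times a nonsingular A."""
    return [Fraction(1), Fraction(x), 3 * Fraction(x) ** 2 - 2]

design = (-1, 0, 1, 1)          # illustrative runs; the level 1 is repeated
y = [2, 3, 7, 5]                # illustrative observations
X = [f_general(x) for x in design]
Xt = [f_true(x) for x in design]

results = [(predict_and_variance(X, y, f_general(x)),
            predict_and_variance(Xt, y, f_true(x))) for x in (-1, 0, 1)]
print(all(general == true for general, true in results))
```

Since the relation between the two sets of regressors holds for every $x$, the same comparison also succeeds at points outside the design, in agreement with the remark on the domain $Z$.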

To prove equivalence of the $G^{\Omega}$ -model and $A^{\Omega}$ -model, we will show the following: A reduction of the model Eqs (56)–(57) to the model without restrictions on parameters, with a coefficient matrix of a full rank, leads to the $A^{\Omega}$ -model of true effects

$\displaystyle\bm{\eta}^{f}=\bm{\Phi}^{\Omega}\bm{B}^{\Omega},$ (65)

where $\bm{\Phi}^{\Omega}$ contains the vectors of effects of the factorial set $\Omega$ for the design $\bm{D}$ .

Consider the model Eq. (56) with the restrictions on the parameters Eq. (57). These restrictions are split into the following partial restrictions:

$\displaystyle\bm{V}_{1\ldots r}\bm{\Theta}_{1\ldots r}=\bm{0},$ (66)

where $\bm{\Theta}_{1\ldots r}$ is the corresponding part of the vector $\bm{\Theta}^{\Omega}$ , i.e., $\bm{\Theta}_{1\ldots r}=\bm{\rho}_{1\ldots r}^{T}\bm{\eta}^{f}$ . Let $\bm{\gamma}_{i}$ be one of the solutions of Eq. (66). Then, by Lemma 1.10.1, the vector $\bm{\gamma}_{i}$ corresponds to the vector $\bm{z}_{1\ldots r}\bm{\gamma}_{i}$ of the interaction effect of the factors $F_{1},\ldots,F_{r}.$

Lemma 1.12.2. A set of linearly independent vectors $\bm{\gamma}_{i}$ such that $\bm{V}_{1\ldots r}\bm{\gamma}_{i}=\bm{0}$ corresponds to a set of the linearly independent vectors $\bm{z}_{1\ldots r}\bm{\gamma}_{i}\left(i=1,\ldots,l\right)$ of the interaction effects of the factors $F_{1},\ldots,F_{r}$ .

Proof The equality

$\displaystyle\sum\limits_{i=1}^{l}{\lambda_{i}\bm{z}_{1\ldots r}\bm{\gamma}_{i% }=0}$ (67)

implies that

$\displaystyle\bm{z}_{1\ldots r}^{T}\bm{z}_{1\ldots r}\sum\limits_{i=1}^{l}{% \lambda_{i}\bm{\gamma}_{i}}=\bm{0}.$

Since $\bm{z}_{1\ldots r}^{T}\bm{z}_{1\ldots r}=n\bm{E}$ ,

$\displaystyle\sum\limits_{i=1}^{l}{\lambda_{i}\bm{\gamma}_{i}}=\bm{0}.$ (68)

It is easy to see that Eq. (68) implies Eq. (67). This proves the lemma.

Since $Rg{\bm{V}}_{1\ldots r}=s_{1}\ldots s_{r}-\left(s_{1}-1\right)\ldots\left(s_{r}% -1\right)$ , the general solution of Eq. (66) is

$\displaystyle\bm{\Theta}_{1\ldots r}=\bm{\Gamma}_{1\ldots r}\bm{B}_{1\ldots r},$ (69)

where $\bm{\Gamma}_{1\ldots r}$ is the $\left(s_{1}\ldots s_{r}\right)\times\left[\left(s_{1}-1\right)\ldots\left(s_{r% }-1\right)\right]$ matrix such that

$\displaystyle\bm{V}_{1\ldots r}\bm{\Gamma}_{1\ldots r}=\bm{0},\quad Rg{\bm{\Gamma}}_{1\ldots r}=\left(s_{1}-1\right)\ldots\left(s_{r}-1\right),$ (70)

and $\bm{B}_{1\ldots r}$ is an arbitrary $\left[\left(s_{1}-1\right)\ldots\left(s_{r}-1\right)\right]$ -dimensional vector. Now transform Eq. (56) to

$\displaystyle\bm{\eta}^{f}=\Theta_{0}+\sum\limits_{i=1}^{m}{\bm{z}_{i}\bm{% \Theta}_{i}+\sum\limits_{i_{1},i_{2}}{\bm{z}_{i_{1}i_{2}}\bm{\Theta}_{i_{1}i_{% 2}}}}+\ldots$ (71)

Substituting Eq. (69), we get the following relationship:

$\displaystyle\bm{z}_{1\ldots r}\bm{\Theta}_{1\ldots r}=\bm{z}_{1\ldots r}\bm{% \Gamma}_{1\ldots r}\bm{B}_{1\ldots r}=\bm{X}_{1\ldots r}\bm{B}_{1\ldots r},$ (72)

where $\bm{X}_{1\ldots r}$ , by Eq. (70) and Lemma 1.12.2, contains the vectors of the interaction effects of the factors $F_{1},\ldots,F_{r}.$

Similar substitutions can be made for all terms of Eq. (55). With the notations

$\displaystyle\bm{X}=\left\|\bm{I},\bm{X}_{1},\ldots,\bm{X}_{m},\bm{X}_{12},% \ldots\right\|,$ $\displaystyle\bm{B}^{\Omega T}=\left\|\frac{1}{N}\bm{I},\bm{B}_{1}^{T},\ldots,% \bm{B}_{m}^{T},\bm{B}_{12}^{T},\ldots\right\|,$

we get the required model Eq. (65).

3. The effectiveness of designs

For the given design, the LS estimates possess some optimal properties. Hereafter, we will fix a method of statistical estimation (the LS method) and will solve the task of finding effective designs for factorial models. We will distinguish between two shades of meaning of the concept of an effective design: the criteria of optimality and the desirable properties of the design. There is no clear distinction between these two concepts. However, they can be described as follows.

The criteria of optimality are mathematically clear requirements for the design. These requirements in the majority of cases may be seen as an expansion of the concept of the best linear estimates. The desirable properties are those properties that are not very clear from the point of view of a mathematician but natural for the practitioners involved in experiments.

Hereafter we assume that all designs are nonsingular. By Theorem 1.12.1, whether the design is singular or nonsingular does not depend on the type of the factorial model for the factorial set $\Omega$ . Therefore, we introduce the types of nonsingular designs in accordance with the following definition.

Definition 2.0.1. A nonsingular design for the factorial model for the factorial set $\Omega$ containing all possible elements of $n$ factors $(n\leqslant r-1)$ is called the design of resolution $2r-1.$ A nonsingular design for the factorial model for the set $\Omega$ containing all possible elements of $n$ factors $(n\leqslant r-1)$ is called the design of resolution $2r$ if all effects of the set $\Omega$ are estimated with no bias in the model for the set $\Omega^{\prime}$ containing all possible elements of $l$ factors $(l\leqslant r)$ . The design of resolution 3 is also called the design of main effects and the corresponding model is called the model of main effects.

3.1 Optimality criteria of designs

If for the model Eq. (75) and the design Eq. (76), the information matrix $\bm{M}$ is a full rank matrix, the covariance matrix of the vector $\hat{\Theta}$ of estimates is

$\displaystyle\bm{\Gamma}(\hat{\bm{\Theta}})=\bm{M}^{-1}\sigma^{2}.$

The matrix $\bar{\bm{M}}=\frac{1}{N}\bm{M}$ is called a normalized information matrix and the matrix $\bar{\bm{\Gamma}}=\bar{\bm{M}}^{-1}\sigma^{2}$ is called a normalized covariance matrix.

The first three criteria of optimality will be introduced for the model without restrictions on parameters. They allow an interpretation that is associated with the size of the dispersion ellipsoid of the parameter estimates.

Definition 2.1.1. The design $\bm{D}^{*}$ is called $D$ -optimal on the set of designs $\cal D$ if

$\displaystyle\det\bar{\bm{M}}\left(\bm{D}^{*}\right)=\max_{\bm{D}\in\cal D}\det\bar{\bm{M}}\left(\bm{D}\right).$ (73)

The dispersion ellipsoid of the parameter estimates of the $D$ -optimal design has minimal volume.

The criterion of $D$ -optimality (also called Mood’s criterion) is the most popular one. It will also be used for the model Eq. (77) with the restrictions Eq. (100) on parameters. In this case, the matrix $\bar{\bm{M}}$ in Eq. (73) will correspond to the information matrix $\bm{M}=\bm{Q}^{T}\bm{X}^{T}\bm{XQ}$ for the reduced model Eq. (80). The property of $D$ -optimality of a design is invariant under any nonsingular linear transformation of the parameter vector. Based on that, it is easy to prove that $D$ -optimality of a design does not depend on the selection of the vector of new parameters $\bm{Q}_{n}$ of the reduced model.

Definition 2.1.2. The design $\bm{D}^{*}$ is called $A$ -optimal on the set of designs $\cal D$ if

$\displaystyle\operatorname{Tr}\bm{\Gamma}\left(\bm{D}^{*}\right)=\min_{\bm{D}\in\cal D}\operatorname{Tr}\bm{\Gamma}\left(\bm{D}\right).$ (74)

The dispersion ellipsoid of the parameter estimates of the $A$ -optimal design has minimal length of a diagonal of the circumscribed parallelepiped.

The criterion of $A$ -optimality is also called Kishen’s criterion.

Definition 2.1.3. The design $\bm{D}^{*}$ is called $E$ -optimal on the set of the designs $\cal D$ if

$\displaystyle\max_{q}q\left\{\bm{\Gamma}\left(\bm{D}^{*}\right)\right\}=\min_{\bm{D}\in\cal D}\max_{q}q\left\{\bm{\Gamma}\left(\bm{D}\right)\right\},$ (75)

where $q\left\{\bm{\Gamma}\right\}$ is an eigenvalue of the matrix $\bm{\Gamma}$ .

The longest axis of the dispersion ellipsoid of the parameter estimates of an $E$ -optimal design has minimal length.

The criterion of $E$ -optimality is also called Ehrenfeld’s criterion.
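
As an illustration of Definitions 2.1.1–2.1.3, the following minimal Python sketch compares two candidate designs under the three criteria. Everything in it is an assumption made for illustration and does not come from the text above: the straight-line model $Ey=b_{0}+b_{1}X$ on $[-1,1]$, the two designs, and the function names. The normalized matrices are used, with $\sigma^{2}=1$, so that designs with different numbers of runs are comparable.

```python
import math

def normalized_information(points):
    """Normalized information matrix (1/N) X^T X for the assumed model b0 + b1*X."""
    n = len(points)
    s1 = sum(points) / n
    s2 = sum(x * x for x in points) / n
    return [[1.0, s1], [s1, s2]]

def criteria(points):
    """Return (det M_bar, trace of M_bar^{-1}, largest eigenvalue of M_bar^{-1})."""
    (a, b), (_, d) = normalized_information(points)
    det = a * d - b * b
    # Diagonal of the inverse of the symmetric 2x2 matrix [[a, b], [b, d]].
    trace = d / det + a / det
    det_inv = 1.0 / det
    # Largest eigenvalue of a symmetric 2x2 matrix from its trace and determinant.
    disc = max(0.0, trace * trace - 4.0 * det_inv)
    max_eig = (trace + math.sqrt(disc)) / 2.0
    return det, trace, max_eig

two_point = criteria([-1.0, 1.0])          # runs at the endpoints only
three_point = criteria([-1.0, 0.0, 1.0])   # endpoints plus the centre
```

Under these assumptions the two-point design gives determinant 1, trace 2 and largest eigenvalue 1, against 2/3, 5/2 and 3/2 for the three-point design, so it is preferred by the $D$-, $A$- and $E$-criteria alike.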

The following two criteria are related to the properties of the regression function in the domain.

Definition 2.1.4. The design $\bm{D}^{*}$ is called $G$ -optimal in the domain $Z$ on the set of the designs $\cal D$ if

$\displaystyle\max_{Z}N^{*}d\left(\bm{D}^{*},X_{1},\ldots,X_{m}\right)=\min_{\bm{D}\in\cal D}\max_{Z}Nd\left(\bm{D},X_{1},\ldots,X_{m}\right),$ (76)

where $d\left(\bm{D},X_{1},\ldots,X_{m}\right)$ is the variance of the estimate of the regression function at the point $\left(X_{1},\ldots,X_{m}\right)\in Z$ .

The value

$\displaystyle\bar{d}\left(\bm{D},X_{1},\ldots,X_{m}\right)=\int\mathop{\ldots}% \limits_{Z}{\int{d\left(\bm{D},X_{1},\ldots,X_{m}\right)dX_{1}\ldots}}dX_{m}$

is called an average variance over the domain $Z$ .

Definition 2.1.5. The design $\bm{D}^{*}$ is called $Q$ -optimal in the domain $Z$ on the set of the designs $\cal D$ if

$\displaystyle\bar{d}\left(\bm{D}^{*},X_{1},\ldots,X_{m}\right)=\min_{\bm{D}\in\cal D}\bar{d}\left(\bm{D},X_{1},\ldots,X_{m}\right).$ (77)

The following two criteria (orthogonality and regularity) will often be used here, although, at first glance, they have no statistical justification comparable to that of the previous criteria of this section. However, at the end of this chapter, we will show why these criteria are important for applications.

Definition 2.1.6. A design is called orthogonal for the given model if the covariance matrix of the vector of parameter estimates for this model and this design is diagonal.

Definition 2.1.7. A factorial design is called regular of strength $t$ if the condition of proportional frequencies is satisfied for any $t$ factors.

The following theorem is a corollary to Theorem 1.7.3.

Theorem 2.1.1. A regular factorial design of strength $t=2n$ allows obtaining a set of pairwise orthogonal main effects and interaction effects up to the order $n-1$ . A regular factorial design of strength $t=2n+1$ allows obtaining a set of pairwise orthogonal main effects and interaction effects up to the order $n-1$ such that each of them is orthogonal to all interaction effects of the order $n$ .

Theorem 2.1.1 implies that a regular factorial design of strength $t$ is a special case of the design of resolution $t+1$ .

Definition 2.1.8. A factorial design is called regular for the factorial set $\Omega$ if there exists a factorial model for the factorial set $\Omega$ for which this design is orthogonal.

By the Definition 2.1.8, a regular factorial design of strength $t=2n$ is a special case of the regular factorial design for the factorial set $\Omega$ , containing all possible elements of $l$ factors ( $l\leqslant n).$

Note that regularity of the design for the factorial set $\Omega$ does not imply orthogonality of the design for any model for the set $\Omega$ .

Theorem 2.1.2. The following three statements are equivalent:

1) The design $\bm{D}$ is regular for the factorial set $\Omega$ ;

2) For the design $\bm{D}$ all main effects and interaction effects corresponding to the factorial set $\Omega$ (one from each set of effects) are pairwise orthogonal;

3) In the design $\bm{D},$ the condition of proportional frequencies is satisfied for the factorial set $\Omega$ .

Proof Equivalence of the statements 2 and 3 follows from Theorems 1.7.2 and 1.7.3. Now we will show that the statement 1 of the theorem implies the statement 2. Indeed, it follows from the statement 1 that the coefficient matrix $\bm{X}$ has pairwise orthogonal columns. Hence, the columns corresponding to the functions

$\displaystyle f_{1}^{(1)}\left(X_{1}\right),\ldots,f_{1}^{\left(s_{1}-1\right)% }\left(X_{1}\right),\ldots,f_{m}^{(1)}\left(X_{m}\right),\ldots,f_{m}^{\left(s% _{m}-1\right)}\left(X_{m}\right)$

are orthogonal to the unit vector and, therefore, are main effects of the factors $F_{1},\ldots,F_{m}$ .

The column $\bm{f}_{i_{1}\ldots i_{n}}$ corresponding to the product $f_{i_{1}}^{\left(j_{1}\right)}\left(X_{i_{1}}\right)\times\ldots\times f_{i_{n% }}^{\left(j_{n}\right)}\left(X_{i_{n}}\right)$ has equal elements for the given combination of levels of the factors $F_{i_{1}},\ldots,F_{i_{n}}$ . Hence, it follows from the statement 1 that this column is orthogonal to all main effects of the factors $F_{i_{1}},\ldots,F_{i_{n}}$ and all interaction effects of these factors of the order $l<n-1$ . Therefore, the column $\bm{f}_{i_{1}\ldots i_{n}}$ is an interaction effect of the factors $F_{i_{1}},\ldots,F_{i_{n}}$ . Hence, the statement 2 holds.

It is easy to see from Theorem 1.8.2 that the statement 2 implies the statement 1.

This completes the proof of the theorem.

Very often, it is not easy to construct a design that satisfies all or even some of the optimality criteria. So the designs that satisfy only one of the criteria are also important and useful.

3.2 Desirable properties of designs

We begin with the property related to the number of treatments of the design, which is very important for practitioners involved in experiments.

Definition 2.2.1. A design is called saturated for the factorial $A^{\Omega}$ -model if the number of runs of the design is equal to the number of parameters of the model $A^{\Omega}.$

We also apply the Definition 2.2.1 to models that include qualitative factors (with the restrictions on parameters). In this case we reduce the number of parameters in the Definition 2.2.1 by the number of linearly independent restrictions.

Among other desirable properties of designs we note the following two:

• Simplicity of calculations and interpretation of the results of observations;
• Possibility to split the design into blocks when all experiments cannot be carried out in homogeneous conditions.

In the next chapters, we will address issues of construction of optimal designs with desirable properties.
3.3 Equivalence of D- and G-optimal designs

Let $\bm{D}$ be a set of all designs with the domain $Z$ that is closed and bounded. Then the following theorem of Kiefer-Wolfowitz holds:

Theorem 2.3.1 (Kiefer & Wolfowitz, 1960). The following statements are equivalent:

1) The design $\bm{D}^{*}$ is $D$ -optimal on $\bm{D}$ ;
2) The design $\bm{D}^{*}$ is $G$ -optimal on $\bm{D}$ ;
3) $\max_{Z}N^{*}d\left(\bm{D}^{*},X_{1},\ldots,X_{m}\right)=k$ , where $k$ is the number of parameters of the model.

Theorem 2.3.2 (Kiefer & Wolfowitz, 1960). The information matrix of $D$ -( $G$ -)optimal design is unique on $\bm{D}$ . The maximum of the variance of the estimate of the regression function on $Z$ is reached at points of the design.

Theorem 2.3.1, generally speaking, does not hold if $\bm{D}$ is a subset of the set of all designs. For example, for the subset of designs with the fixed number of treatments, $D$ - and $G$ -optimal designs are not equivalent. In this case the statement 3 of Theorem 2.3.1 holds neither for $D$ - nor for $G$ -optimal design.
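
Statement 3 of Theorem 2.3.1 can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: the straight-line model $Ey=b_{0}+b_{1}X$ on $Z=[-1,1]$ (so $k=2$), the two designs, the evaluation grid and the function name are all hypothetical choices, not taken from the text.

```python
def normalized_variance(points, x):
    """N * d(D, x) / sigma^2 for the assumed model b0 + b1*X and the design `points`."""
    n = len(points)
    s1 = sum(points) / n
    s2 = sum(p * p for p in points) / n
    det = s2 - s1 * s1
    # f(x)^T M_bar^{-1} f(x) with f(x) = (1, x) and M_bar = [[1, s1], [s1, s2]].
    return (s2 - 2.0 * s1 * x + x * x) / det

# Grid on Z = [-1, 1]; it contains both endpoints exactly.
grid = [i / 100.0 for i in range(-100, 101)]
max_two_point = max(normalized_variance([-1.0, 1.0], x) for x in grid)
max_three_point = max(normalized_variance([-1.0, 0.0, 1.0], x) for x in grid)
```

For the equally weighted two-point design the maximum equals $k=2$ and is attained at the design points $\pm 1$, as the theorem requires of a $D$-optimal design; for the three-point design the maximum is $5/2>k$.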
3.4 Criterion of average variance

Let $\bm{D}$ be a factorial design for the $A^{\Omega}$ -model Eq. (106). The variance of the estimate of the regression function at point

$\displaystyle\bm{x}^{T}=\left[1,f_{1}^{(1)}\left(X_{1}\right),\ldots,f_{1}^{% \left(s_{1}-1\right)}\left(X_{1}\right),\ldots,k_{i_{1}i_{2}}^{1,1}f_{i_{1}}^{% (1)}\left(X_{i_{1}}\right)f_{i_{2}}^{(1)}\left(X_{i_{2}}\right),\right.\ldots,% \left.k_{i_{1}i_{2}}^{s_{i_{1}}-1,s_{i_{2}}-1}f_{i_{1}}^{\left(s_{i_{1}}-1% \right)}\left(X_{i_{1}}\right)f_{i_{2}}^{\left(s_{i_{2}}-1\right)}\left(X_{i_{% 2}}\right),\ldots\right]=\left\{\bm{x}^{T}\left(X_{1},\ldots,X_{m}\right)\right\}$

is equal to

$\displaystyle\sigma_{\bm{x}}^{2}=\sigma^{2}\bm{x}^{T}\left(\bm{X}^{T}\bm{X}% \right)^{-1}\bm{x}.$

By Theorem 1.4.1 and Note 2 to Theorem 1.4.1, functions in the $A^{\Omega}$ -model Eq. (106) can be selected in such a way that

$\displaystyle\sum\limits_{u=1}^{N^{f}}{x_{u}^{(j)}x_{u}^{(l)}}=N^{f}\delta_{jl}$ (78)

where $x_{u}^{(n)}$ is the $n$ -th element of the vector $\bm{x}^{T}\left(X_{1u},\ldots,X_{mu}\right)$ $(n=j,l)$ .

Then the average variance over $\bm{D}^{f}$ is

$\displaystyle\sigma_{a}^{2}=\sigma^{2}/N^{f}\sum\limits_{x\in\bm{D}^{f}}{\bm{x% }^{T}\left(\bm{X}^{T}\bm{X}\right)^{-1}\bm{x}=\sigma^{2}/N^{f}\sum\limits_{u=1% }^{N^{f}}\sum\limits_{j=1}^{k}\sum\limits_{l=1}^{k}{x_{u}^{(j)}x_{u}^{(l)}c_{% jl}}}=\sigma^{2}/N^{f}\sum\limits_{j=1}^{k}\sum\limits_{l=1}^{k}c_{jl}\sum% \limits_{u=1}^{N^{f}}{x_{u}^{(j)}x_{u}^{(l)}}=\sigma^{2}\sum\limits_{j=1}^{k}% \sum\limits_{l=1}^{k}{c_{jl}\delta_{jl}}=\sigma^{2}\sum\limits_{j=1}^{k}{c_{jj% }=\sigma^{2}Tr\left\{\left(\bm{X}^{T}\bm{X}\right)^{-1}\right\}},$

where $k$ is the number of parameters in the model and $\left(\bm{X}^{T}\bm{X}\right)^{-1}=\left\{c_{jl}\right\}$ . By the results of Section 2.12, it follows that the variance of the estimate of the regression function depends neither on the type of the model for the factorial set $\Omega$ nor on the choice of the functions in the model. Therefore, the following theorem holds.

Theorem 2.4.1. The factorial design is $Q$ -optimal on $\bm{D}^{f}$ for any factorial model for the factorial set $\Omega$ if and only if it is $A$ -optimal for the $A^{\Omega}$ -model satisfying the condition Eq. (78).

Theorem 2.4.1 can also be obtained as a consequence of Theorem 2.12.1 of the book by Fedorov (1972).

Let the levels $0,1,\ldots,\left(s_{i}-1\right)$ of the factor $F_{i}$ occur $n_{i}^{(0)},n_{i}^{(1)},\ldots,n_{i}^{\left(s_{i}-1\right)}$ times respectively in the design $\bm{D}$ . In this case, obviously,

$\displaystyle\sum\limits_{j=0}^{s_{i}-1}n_{i}^{(j)}=N.$

Definition 2.4.1. The number

$\displaystyle U_{i}^{j_{i}}=\frac{N-n_{i}^{\left(j_{i}\right)}}{n_{i}^{\left(j% _{i}\right)}\left(s_{i}-1\right)}$ (79)

is called the coefficient of uniformity of the $j_{i}$ -th level of the factor $F_{i}$ .

It is evident that if the $j_{i}$ -th level of the factor $F_{i}$ occurs in the design $\bm{D}$ more than $N/s_{i}$ times, $U_{i}^{j_{i}}<1;$ if the $j_{i}$ -th level of the factor $F_{i}$ occurs in the design $\bm{D}$ less than $N/s_{i}$ times, $U_{i}^{j_{i}}>1;$ if the $j_{i}$ -th level of the factor $F_{i}$ occurs in the design $\bm{D}$ exactly $N/s_{i}$ times, $U_{i}^{j_{i}}=1$ .

The last equality holds, in particular, for uniform designs for any level of any factor.

Definition 2.4.2. The average of the coefficients $U_{i}^{j_{i}}$ over all levels of the factor $F_{i}$

$\displaystyle U_{i}=\sum_{j_{i}=0}^{s_{i}-1}\frac{U_{i}^{j_{i}}}{s_{i}}=\frac{N\sum_{j_{i}=0}^{s_{i}-1}\left(1/n_{i}^{\left(j_{i}\right)}\right)-s_{i}}{s_{i}\left(s_{i}-1\right)}$ (80)

is called the coefficient of uniformity of the factor $F_{i}$ .

Definition 2.4.3. A factor is called uniform if all its levels occur in the design with equal frequency. Otherwise, a factor is called nonuniform.

It is evident that for uniform factors $U_{i}=1,$ for nonuniform factors $U_{i}>1$ .
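
The coefficients of Definitions 2.4.1 and 2.4.2 are easy to compute from the level frequencies alone. The following Python sketch does this exactly with rational arithmetic; the function names and the two sets of level counts are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

def level_uniformity(counts):
    """Coefficients U_i^j of Eq. (79) for one factor with the given level counts."""
    n_total = sum(counts)   # N, the number of runs
    s = len(counts)         # s_i, the number of levels
    return [Fraction(n_total - n, n * (s - 1)) for n in counts]

def factor_uniformity(counts):
    """Coefficient U_i of Definition 2.4.2: the average of U_i^j over the levels."""
    per_level = level_uniformity(counts)
    return sum(per_level) / len(per_level)

uniform = factor_uniformity([4, 4, 4])           # every level occurs N/s_i times
nonuniform_levels = level_uniformity([2, 1, 1])  # level 0 is over-represented
nonuniform = factor_uniformity([2, 1, 1])
```

For the counts (4, 4, 4) the coefficient is exactly 1. For the counts (2, 1, 1) the level coefficients are 1/2, 3/2, 3/2, and their average is $7/6>1$, in agreement with the remark above.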

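Consider the regular design $\bm{D}$ of main effects for the factors $F_{1},\ldots,F_{m}$ , i.e., the regular design for the factorial set $\Omega$ that consists of only the factors $F_{1},\ldots,F_{m}$ . Then all functions in the model

$\displaystyle Ey=b_{0}+b_{1}^{(1)}f_{1}^{(1)}\left(X_{1}\right)+\ldots+b_{1}^{\left(s_{1}-1\right)}f_{1}^{\left(s_{1}-1\right)}\left(X_{1}\right)+\ldots+b_{m}^{(1)}f_{m}^{(1)}\left(X_{m}\right)+\ldots+b_{m}^{\left(s_{m}-1\right)}f_{m}^{\left(s_{m}-1\right)}\left(X_{m}\right)$ (81)

can be selected in such a way that the design $\bm{D}$ will be orthogonal for the model.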

Let the values of the functions $f_{i}^{(1)}\left(X_{i}\right),\ldots,f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right)$ at the points of the design $\bm{D}$ form the matrix $\bm{F}_{i}$ of main effects of the factor $F_{i}$ (Section 2.3). Hence, the coefficient matrix $\bm{X}$ of the design $\bm{D}$ for the model Eq. (81) is

$\displaystyle\bm{X}=\left\|\bm{I},\bm{F}_{1},\ldots,\bm{F}_{m}\right\|.$

All columns of $\bm{X}$ are pairwise orthogonal and scalar square of any of them equals $N$ .

The covariance matrix of the vector of parameter estimates of the model Eq. (81) is

$\displaystyle\left(\bm{X}^{T}\bm{X}\right)^{-1}\sigma^{2}=\frac{\sigma^{2}}{N}% \bm{E}_{k}\left(k=\sum\limits_{i=1}^{m}{\left(s_{i}-1\right)+1}\right).$ (82)

Let $\bm{x}(j_{1},\ldots,j_{m})=[1,f_{1}^{(1)}(X_{1}^{(j_{1})}),\ldots,f_{1}^{(s_{1}-1)}(X_{1}^{(j_{1})}),\ldots,f_{m}^{(1)}(X_{m}^{(j_{m})}),\ldots,f_{m}^{(s_{m}-1)}(X_{m}^{(j_{m})})]$ be the vector of functions corresponding to the levels $j_{1},\ldots,j_{m}$ of the factors $F_{1},\ldots,F_{m}$ respectively. The variance at this point is

$\displaystyle\sigma_{\bm{x}}^{2}=\bm{x}^{T}\left(\bm{X}^{T}\bm{X}\right)^{-1}% \bm{x}\sigma^{2}=\frac{\sigma^{2}}{N}\bm{x}^{T}\bm{x}.$

The normalized (per one treatment combination) variance at this point is $\bar{\sigma}_{\bm{x}}^{2}=\sigma^{2}\bm{x}^{T}\bm{x}$ .

Consider the matrix

$\displaystyle\bm{\Phi}_{i}=\left\|\bm{I},\bm{F}_{i}\right\|=\begin{Vmatrix}\bm{a}_{i}^{0T}\\ \vdots\\ \bm{a}_{i}^{0T}\\ \vdots\\ \bm{a}_{i}^{(s_{i}-1)T}\\ \vdots\\ \bm{a}_{i}^{(s_{i}-1)T}\\ \end{Vmatrix}=\begin{Vmatrix}1&\bm{c}_{i}^{0T}\\ \vdots&\vdots\\ 1&\bm{c}_{i}^{0T}\\ \vdots&\vdots\\ 1&\bm{c}_{i}^{(s_{i}-1)T}\\ \vdots&\vdots\\ 1&\bm{c}_{i}^{(s_{i}-1)T}\\ \end{Vmatrix},$
where the row $\bm{a}_{i}^{jT}=\left[1,\bm{c}_{i}^{jT}\right]$ is repeated $n_{i}^{(j)}$ times $\left(j=0,\ldots,s_{i}-1\right)$ .

Then

$\displaystyle\bm{\Phi}_{i}^{T}\bm{\Phi}_{i}=N\bm{E}_{s_{i}}.$ (83)

Now introduce the following matrix:

$\displaystyle\tilde{\bm{\Phi}}_{i}=\begin{Vmatrix}\sqrt{n_{i}^{(0)}}\bm{a}_{i}% ^{0^{T}}\\ \vdots\\ \sqrt{n_{i}^{\left(s_{i}-1\right)}}\bm{a}_{i}^{s_{i}-1^{T}}\\ \end{Vmatrix}.$

$\tilde{\bm{\Phi}}_{i}$ is a square matrix of order $s_{i}$ . It follows from Eq. (83) that

$\displaystyle\tilde{\bm{\Phi}}_{i}^{T}\tilde{\bm{\Phi}}_{i}=N\bm{E}_{s_{i}}.$

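Since $\tilde{\bm{\Phi}}_{i}$ is square, this also gives $\tilde{\bm{\Phi}}_{i}\tilde{\bm{\Phi}}_{i}^{T}=N\bm{E}_{s_{i}}$ , whose diagonal elements are $n_{i}^{\left(j_{i}\right)}\bm{a}_{i}^{j_{i}T}\bm{a}_{i}^{j_{i}}=N$ . Therefore,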

$\displaystyle\bm{a}_{i}^{j_{i}T}\bm{a}_{i}^{j_{i}}=\frac{N}{n_{i}^{\left(j_{i}% \right)}},\bm{c}_{i}^{j_{i}T}\bm{c}_{i}^{j_{i}}=\frac{N}{n_{i}^{\left(j_{i}% \right)}}-1.$ (84)

By Eqs (79) and (84),

$\displaystyle\frac{\bar{\sigma}_{\bm{x}}^{2}}{\sigma^{2}}=\bm{x}^{T}\bm{x}=1+% \bm{c}_{1}^{j_{1}T}\bm{c}_{1}^{j_{1}}+\ldots+\bm{c}_{m}^{j_{m}T}\bm{c}_{m}^{j_% {m}}=1+\left(\frac{N}{n_{1}^{\left(j_{1}\right)}}-1\right)+\ldots+\left(\frac{% N}{n_{m}^{\left(j_{m}\right)}}-1\right)=1+\sum\limits_{i=1}^{m}{\left(s_{i}-1% \right)}U_{i}^{j_{i}}.$ (85)

Therefore, the normalized variance at any point $\bm{x}\left(j_{1},\ldots,j_{m}\right)$ of $\bm{D}^{f}$ can be expressed via the coefficient of uniformity of the levels $j_{1},\ldots,j_{m}$ of the factors $F_{1},\ldots,F_{m}$ respectively.
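
Eq. (85) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: the nine-run design with two 2-level factors (cell frequencies 4, 2, 2, 1, which satisfy the condition of proportional frequencies), the particular choice of main-effect functions, and all names are hypothetical and not taken from the text.

```python
import math

# Hypothetical regular design: two 2-level factors, N = 9, proportional frequencies.
design = [(0, 0)] * 4 + [(0, 1)] * 2 + [(1, 0)] * 2 + [(1, 1)]
N = len(design)

def level_counts(factor):
    """Numbers of runs at levels 0 and 1 of the given factor."""
    return [sum(1 for run in design if run[factor] == lev) for lev in (0, 1)]

def main_effect(factor, level):
    """Two-level main-effect function: orthogonal to the constant, squared norm N."""
    n0, n1 = level_counts(factor)
    return math.sqrt(n1 / n0) if level == 0 else -math.sqrt(n0 / n1)

def row(point):
    """Vector x(j1, j2) = [1, f_1(j1), f_2(j2)] for the main-effects model."""
    return [1.0, main_effect(0, point[0]), main_effect(1, point[1])]

X = [row(run) for run in design]
# Gram matrix X^T X; for this design it equals N * E_3 up to rounding.
gram = [[sum(r[a] * r[b] for r in X) for b in range(3)] for a in range(3)]

def normalized_variance(point):
    """N * x^T (X^T X)^{-1} x / sigma^2, which reduces to x^T x since X^T X = N * E."""
    return sum(v * v for v in row(point))

def rhs_eq_85(point):
    """1 + sum_i (N / n_i^{(j_i)} - 1), the right-hand side of Eq. (85)."""
    return 1.0 + sum(N / level_counts(i)[point[i]] - 1.0 for i in (0, 1))
```

In this example the normalized variance is 2 at the point (0, 0) and 5 at the point (1, 1), where both factors are at their rarer level; the latter exceeds the number of parameters $k=3$.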

Calculate the sum of the normalized variances over all points of $\bm{D}^{f}$ :

$\displaystyle\sigma^{2}\left\{\sum\limits_{\bm{D}^{f}}{\left(\frac{N}{n_{1}^{% \left(j_{1}\right)}}-1\right)+\ldots+\sum\limits_{\bm{D}^{f}}{\left(\frac{N}{n% _{m}^{\left(j_{m}\right)}}-1\right)+s_{1}\ldots s_{m}}}\right\}$ $\displaystyle=\sigma^{2}\left\{s_{2}\ldots s_{m}\sum\limits_{j_{1}=0}^{s_{1}-1% }\left(\frac{N}{n_{1}^{\left(j_{1}\right)}}-1\right)\right.+\ldots\left.+s_{1}% \ldots s_{m-1}\sum\limits_{j_{m}=0}^{s_{m}-1}\left(\frac{N}{n_{m}^{\left(j_{m}% \right)}}-1\right)+s_{1}\ldots s_{m}\right\}.$

The average normalized variance over $\bm{D}^{f}$

$\displaystyle\bar{\sigma}_{a}^{2}=\sigma^{2}\sum\limits_{\bm{D}^{f}}\frac{\bm{% x}^{T}\bm{x}}{s_{1}\ldots s_{m}}=\sigma^{2}\left\{\frac{1}{s_{1}}\sum\limits_{% j_{1}=0}^{s_{1}-1}\left(\frac{N}{n_{1}^{\left(j_{1}\right)}}-1\right)\right.+% \ldots+\frac{1}{s_{m}}\left.\sum_{j_{m}=0}^{s_{m}-1}{\left(\frac{N}{n_{m}^{% \left(j_{m}\right)}}-1\right)+1}\right\}.$ (86)

Denote

$\displaystyle A_{i}=\frac{\sigma^{2}}{s_{i}}\left(N\sum\limits_{j_{i}=0}^{s_{i}-1}\frac{1}{n_{i}^{\left(j_{i}\right)}}-s_{i}\right)$

and call it an average (over $\bm{D}^{f})$ normalized (per one treatment combination) variance for the factor $F_{i}$ . Then Eq. (86) implies that

$\displaystyle\bar{\sigma}_{a}^{2}=\sigma^{2}\sum\limits_{\bm{D}^{f}}{\bm{x}^{T}\bm{x}}/\left(s_{1}\ldots s_{m}\right)=\sigma^{2}+\sum\limits_{i=1}^{m}A_{i}.$

It is evident that

$\displaystyle A_{i}=\left(s_{i}-1\right)U_{i}\sigma^{2}.$

Hence,

$\displaystyle\bar{\sigma}_{a}^{2}=\sigma^{2}\left\{1+\sum\limits_{i=1}^{m}{\left(s_{i}-1\right)U_{i}}\right\}.$ (87)

Therefore, the average normalized variance is expressed via the coefficients of uniformity of factors.

In uniform regular designs, $U_{i}=1\left(i=1,\ldots,m\right),$ therefore, for such designs

$\displaystyle\bar{\sigma}_{a}^{2}=\sigma^{2}\left\{1+\sum\limits_{i=1}^{m}% \left(s_{i}-1\right)\right\}=k\sigma^{2}.$

In nonuniform regular designs, $U_{i}>1$ for some $i$ ; therefore, for such designs $\bar{\sigma}_{a}^{2}>k\sigma^{2}.$

Consider a regular design $\bm{D}$ for a factorial model for a factorial set $\Omega$ .

Let

$\displaystyle\bm{x}\left(j_{1},\ldots,j_{m}\right)=\left[1,f_{1}^{(1)}\left(X_% {1}^{\left(j_{1}\right)}\right)\right.,\ldots,f_{1}^{\left(s_{1}-1\right)}% \left(X_{1}^{\left(j_{1}\right)}\right),\ldots,f_{m}^{(1)}\left(X_{m}^{\left(j% _{m}\right)}\right),\ldots,f_{m}^{\left(s_{m}-1\right)}\left(X_{m}^{\left(j_{m% }\right)}\right),f_{i_{1}}^{(1)}\left(X_{i_{1}}^{\left(j_{i_{1}}\right)}\right% )f_{i_{2}}^{(1)}\left(X_{i_{2}}^{\left(j_{i_{2}}\right)}\right),\ldots,\left.f% _{i_{1}}^{\left(s_{i_{1}}-1\right)}\left(X_{i_{1}}^{\left(j_{i_{1}}\right)}% \right)f_{i_{2}}^{\left(s_{i_{2}}-1\right)}\left(X_{i_{2}}^{\left(j_{i_{2}}% \right)}\right),\ldots\right]^{T}.$

Then

$\displaystyle\sigma_{\bm{x}}^{2}=\frac{\sigma^{2}}{N}\bm{x}^{T}\bm{x}.$ (88)

Similar to Eq. (85), we can get the following:

$\displaystyle\bar{\sigma}_{\bm{x}}^{2}=\sigma^{2}\left\{1+\sum\limits_{i=1}^{m}\bm{c}_{i}^{j_{i}T}\bm{c}_{i}^{j_{i}}+\sum\limits_{i_{1},i_{2}}\left(\bm{c}_{i_{1}}^{j_{i_{1}}}\otimes\bm{c}_{i_{2}}^{j_{i_{2}}}\right)^{T}\left(\bm{c}_{i_{1}}^{j_{i_{1}}}\otimes\bm{c}_{i_{2}}^{j_{i_{2}}}\right)+\ldots\right\}=\sigma^{2}\left\{1+\sum\limits_{i=1}^{m}\bm{c}_{i}^{j_{i}T}\bm{c}_{i}^{j_{i}}+\sum\limits_{i_{1},i_{2}}\left(\bm{c}_{i_{1}}^{j_{i_{1}}T}\bm{c}_{i_{1}}^{j_{i_{1}}}\right)\left(\bm{c}_{i_{2}}^{j_{i_{2}}T}\bm{c}_{i_{2}}^{j_{i_{2}}}\right)+\ldots\right\}=\sigma^{2}\left\{1+\sum\limits_{i=1}^{m}\left(s_{i}-1\right)U_{i}^{j_{i}}+\sum\limits_{i_{1},i_{2}}\left(s_{i_{1}}-1\right)\left(s_{i_{2}}-1\right)U_{i_{1}}^{j_{i_{1}}}U_{i_{2}}^{j_{i_{2}}}+\ldots\right\}.$ (89)

Similar to Eq. (86), for the design under consideration, we can calculate the average (over $\bm{D}^{f})$ normalized variance:

$\displaystyle\bar{\sigma}_{a}^{2}=\sigma^{2}\left\{1+\frac{1}{s_{1}}\sum% \limits_{j_{1}=0}^{s_{1}-1}\left(\frac{N}{n_{1}^{\left(j_{1}\right)}}-1\right)% +\ldots+\frac{1}{s_{m}}\sum\limits_{j_{m}=0}^{s_{m}-1}\left(\frac{N}{n_{m}^{% \left(j_{m}\right)}}-1\right)\right.+\left.\sum\limits_{i_{1},i_{2}}\frac{1}{s% _{i_{1}}s_{i_{2}}}\sum\limits_{j_{i_{1}}=0}^{s_{i_{1}}-1}\left(\frac{N}{n_{i_{% 1}}^{\left(j_{i_{1}}\right)}}-1\right)\sum\limits_{j_{i_{2}}=0}^{s_{i_{2}}-1}% \left(\frac{N}{n_{i_{2}}^{\left(j_{i_{2}}\right)}}-1\right)+\ldots\right\}.$

Taking into account Eq. (80), we get

$\displaystyle\bar{\sigma}_{a}^{2}=\sigma^{2}\left\{1+\sum\limits_{i=1}^{m}{% \left(s_{i}-1\right)U_{i}}+\sum\limits_{i_{1},i_{2}}{\left(s_{i_{1}}-1\right)% \left(s_{i_{2}}-1\right)U_{i_{1}}U_{i_{2}}+\ldots}\right\}$

It is evident that in uniform regular designs, $\bar{\sigma}_{a}^{2}=\sigma^{2}k,$ where $k$ is the number of parameters of the model; in nonuniform regular designs, $\bar{\sigma}_{a}^{2}>\sigma^{2}k$ .

Consider the following efficiency function related to the criterion of the average variance:

$\displaystyle\varphi=\frac{k\sigma^{2}}{\bar{\sigma}_{a}^{2}}.$

Then for uniform regular designs, $\varphi=1$ . For nonuniform regular designs, $\varphi<1$ . Hence,

$\displaystyle\varphi\leqslant 1.$ (90)

Therefore, it makes sense to express the efficiency function related to the criterion of the average variance as a percentage, $\varphi\cdot 100\%$ .

We emphasize that Eq. (90) holds only for factorial models and designs. In connection with the last comment, consider the following example.
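
For a regular design of main effects, the efficiency $\varphi$ follows from Eq. (87) and the level frequencies alone. The Python sketch below computes it exactly; the function names and the two sets of level counts are illustrative assumptions (each set is chosen so that a design with proportional cell frequencies exists).

```python
from fractions import Fraction

def factor_uniformity(counts):
    """Coefficient of uniformity U_i of Eq. (80) from the level counts of one factor."""
    n_total, s = sum(counts), len(counts)
    return (n_total * sum(Fraction(1, n) for n in counts) - s) / (s * (s - 1))

def efficiency_main_effects(all_counts):
    """phi = k / (1 + sum_i (s_i - 1) U_i) for a regular main-effects design."""
    k = 1 + sum(len(c) - 1 for c in all_counts)
    # Average normalized variance divided by sigma^2, as in Eq. (87).
    avg_var = 1 + sum((len(c) - 1) * factor_uniformity(c) for c in all_counts)
    return Fraction(k) / avg_var

phi_uniform = efficiency_main_effects([[4, 4], [4, 4]])     # N = 8, both factors uniform
phi_nonuniform = efficiency_main_effects([[6, 3], [6, 3]])  # N = 9, both factors nonuniform
```

With these counts the uniform design has $\varphi=1$, while the nonuniform one has $U_{1}=U_{2}=5/4$ and $\varphi=3/(7/2)=6/7$, i.e., about 86%.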

In the domain

$\displaystyle 0\leqslant X_{i}\leqslant 1\quad(i=1,2,3),$

consider the design

$\displaystyle X_{1}\quad X_{2}\quad X_{3}$ $\displaystyle\begin{Vmatrix}1&1&0\\ 1&0&1\\ 0&1&1\\ \end{Vmatrix}$

for the model

$\displaystyle Ey=b_{1}X_{1}+b_{2}X_{2}+b_{3}X_{3}.$ (91)

It can be verified that this design is $D$ -optimal. The lack of an absolute term makes the model “nonfactorial”, and therefore we should not expect that Eq. (90) still holds. Calculate now the variances of estimates of the regression function (divided by $\sigma^{2})$ at the 8 points of $\bm{D}^{f}$ . They are 1, 1, 1, 3/4, 3/4, 3/4, 3/4, 0. The average variance is 6 $\sigma^{2}$ /8, so with $N=3$ the average normalized variance is $\bar{\sigma}_{a}^{2}=9\sigma^{2}/4$ and, with $k=3$ , $\varphi=4/3$ . Hence, the efficiency of the design equals 133%.
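
The numbers in this example can be reproduced with exact rational arithmetic. The following Python sketch is one possible check; the helper names and the Gauss-Jordan routine are illustrative assumptions, and only the design matrix and the model Eq. (91) come from the text.

```python
from fractions import Fraction
from itertools import product

# Design matrix of the example: rows are runs, columns are X1, X2, X3.
X = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
N, k = len(X), 3

def inverse(matrix):
    """Exact Gauss-Jordan inverse of a small nonsingular matrix."""
    n = len(matrix)
    aug = [[Fraction(v) for v in r] + [Fraction(int(i == j)) for j in range(n)]
           for i, r in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [r[n:] for r in aug]

gram = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
C = inverse(gram)  # (X^T X)^{-1}

def variance(x):
    """x^T (X^T X)^{-1} x: variance of the fitted value divided by sigma^2."""
    return sum(x[a] * C[a][b] * x[b] for a in range(k) for b in range(k))

# Variances at the 8 vertices of the cube, i.e., at the points of D^f.
variances = sorted((variance(x) for x in product((0, 1), repeat=3)), reverse=True)
average = sum(variances) / len(variances)
phi = Fraction(k) / (N * average)  # efficiency k / (average normalized variance)
```

The sorted variances are 1, 1, 1, 3/4, 3/4, 3/4, 3/4, 0, their average is 3/4, and $\varphi=4/3$, which matches the 133% stated above.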

3.5 D-optimality of regular factorial designs

In this section we will obtain conditions under which regular factorial designs are D-optimal (Brodsky, 1975) and establish the relationship with other criteria of optimality.

First, consider the regular design of main effects for the model Eq. (81). It follows from Eq. (85) that for a uniform design, a normalized variance at any point $\bm{x}$ (in particular, at any point of the design $\bm{D})$ is equal to the number of parameters to be estimated:

$\displaystyle k=1+\sum\limits_{i=1}^{m}\left(s_{i}-1\right),$

i.e., maximum of a normalized variance over $\bm{D}^{f}$ is reached at points of the design $\bm{D}$ . Hence, the design $\bm{D}$ is $D$ -optimal on $\bm{D}^{f}$ .

Assume that in the design $\bm{D}$ , at least one factor, say $F_{1}$ , is nonuniform. For each factor, select the level with the maximum value of coefficient of uniformity. Let these levels be $j_{1},\ldots,j_{m}$ . Since

$\displaystyle U_{1}^{j_{1}}>1,\quad U_{i}^{j_{i}}\geqslant 1\quad\left(i=2,\ldots,m\right),$

it follows from Eq. (85) that $\bar{\sigma}_{\bm{x}\left(j_{1},\ldots,j_{m}\right)}^{2}>k\sigma^{2}$ . Therefore, a nonuniform regular design cannot be $D$ -optimal for the model Eq. (81) even on $\bm{D}^{f}$ . Hence, the following theorem has been proved.

Theorem 2.5.1. A regular factorial design of main effects is $D$ -optimal for the model Eq. (81) on $\bm{D}^{f}$ if and only if it is uniform.

Suppose that the functions $f_{i}^{(1)}\left(X_{i}\right),\ldots,f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right)$ in the model Eq. (81) form a set of orthogonal polynomials in $X_{i}$ at the points of the design $\bm{D}$ such that Eq. (82) holds. Denote by $a_{i}$ and $b_{i}$ minimal and maximum values of the variable $X_{i}$ respectively. The property of D-optimality of the design on $\bm{D}^{f}$ , by Theorem 2.5.1, holds for any choice of values of $X_{i}$ for each of the levels of the factor $F_{i}$ . We will now try to answer the question of how to select the values of $X_{i}$ to ensure that the resulting design is optimal on the cube $a_{i}\leqslant X_{i}\leqslant b_{i}$ . Hereafter, without loss of generality, we will consider a design on the cube $-1\leqslant X_{i}\leqslant 1$ .

Let $\bm{D}$ be the regular uniform design for the quantitative factors $F_{1},\ldots,F_{m}$ for the model of main effects. Let $s_{i}$ values $X_{i}^{(0)},\ldots,X_{i}^{\left(s_{i}-1\right)}$ of the variable $X_{i}$ that appear in the design $\bm{D}$ be the following: endpoints of the interval $[-1,1]$ and roots of the first derivative of the $\left(s_{i}-1\right)$ -th Legendre polynomial. It is well known (Hoel, 1958; Guest, 1958) that the one-dimensional design on the interval $[-1,1]$ that consists of these $s_{i}$ points is $D$ -optimal for the model

$\displaystyle Ey=b_{0}+b_{i}^{(1)}f_{i}^{(1)}\left(X_{i}\right)+\ldots+b_{i}^{% \left(s_{i}-1\right)}f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right).$

That implies the following:

$\displaystyle\max_{X_{i}\in[-1,+1]}\sum\limits_{j=1}^{s_{i}-1}\left[f_{i}^{(j)}\left(X_{i}\right)\right]^{2}=s_{i}-1.$ (92)

The maximum in Eq. (92) is reached on interval $[-1,1]$ at the points $X_{i}^{(0)},\ldots,X_{i}^{\left(s_{i}-1\right)}$ . For the design $\bm{D}$ the normalized variance at the point

$\displaystyle\bm{x}^{T}=\left[1,f_{1}^{(1)}\left(X_{1}\right),\ldots,f_{1}^{% \left(s_{1}-1\right)}\left(X_{1}\right),\ldots,f_{m}^{(1)}\left(X_{m}\right),% \ldots,f_{m}^{\left(s_{m}-1\right)}\left(X_{m}\right)\right],$

by Eq. (82), is equal to

$\displaystyle\bar{\sigma}_{\bm{x}}^{2}=\left\{1+\sum\limits_{j=1}^{s_{1}-1}{% \left[f_{1}^{(j)}\left(X_{1}\right)\right]^{2}+\ldots+\sum\limits_{j=1}^{s_{m}% -1}\left[f_{m}^{(j)}\left(X_{m}\right)\right]^{2}}\right\}\sigma^{2}.$

Taking into account Eq. (92), we get that

$\displaystyle\max_{X_{i}\in[-1,+1]}\bar{\sigma}_{\bm{x}}^{2}=\sigma^{2}\left\{% 1+\sum\limits_{i=1}^{m}\left(s_{i}-1\right)\right\}=\sigma^{2}k.$

Therefore, the design $\bm{D}$ is $D$-optimal. The design $\bm{D}$ is also $D$-optimal for any model obtained from the model Eq. (81) by a nonsingular linear transformation of its functions. Hence, the following theorem has been proved.

Theorem 2.5.2. Consider a regular design $\bm{D}$ of main effects for the factors $F_{1},\ldots,F_{m}$ with the $s_{1},\ldots,s_{m}$ levels respectively for the model Eq. (81), where $f_{i}^{(j)}\left(X_{i}\right)$ is polynomial in $X_{i}$ of degree $j$ . Suppose that the variables $X_{i}$ take $s_{i}$ values at the endpoints of the interval $[-1,+1]$ and at roots of the first derivative of the $\left(s_{i}-1\right)$ -th Legendre polynomial. Then the design $\bm{D}$ is $D$ -optimal for the model Eq. (81) on the cube $-1\leqslant X_{i}\leqslant+1$ if and only if it is uniform.
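The one-dimensional fact behind Theorem 2.5.2 can be checked numerically. The following Python sketch is an illustration only, not part of the original derivation. It assumes NumPy and uses the monomial basis $1,X,\ldots,X^{s-1}$, which is admissible because the normalized variance is invariant under nonsingular linear transformations of the basis. The function names `support_points` and `normalized_variance` are hypothetical. Since the intercept is counted, the bound checked is $s$, which corresponds to $1+(s-1)$ in Eq. (92).

```python
import numpy as np
from numpy.polynomial.legendre import Legendre


def support_points(s):
    """Endpoints of [-1, 1] plus the roots of the derivative of the Legendre polynomial P_{s-1}."""
    interior = np.real(Legendre.basis(s - 1).deriv().roots())
    return np.sort(np.concatenate(([-1.0], interior, [1.0])))


def normalized_variance(points, x):
    """d(x) = f(x)^T M^{-1} f(x), in units of sigma^2, for the uniform design on `points`."""
    s = len(points)
    F = np.vander(points, s, increasing=True)        # rows are (1, X, ..., X^{s-1})
    M = F.T @ F / s                                  # normalized information matrix
    fx = np.vander(np.atleast_1d(x), s, increasing=True)
    return np.einsum("ij,jk,ik->i", fx, np.linalg.inv(M), fx)


if __name__ == "__main__":
    grid = np.linspace(-1.0, 1.0, 2001)
    for s in (3, 4, 5):
        pts = support_points(s)
        d = normalized_variance(pts, grid)
        # The maximum over the grid should not exceed s (up to rounding).
        print(s, np.round(pts, 4), float(d.max()))
```

For $s=4$ the interior points are $\pm 1/\sqrt{5}$, the roots of $P_{3}^{\prime}$. The normalized variance equals $s$ at every support point and does not exceed $s$ elsewhere on the grid.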

Consider now the regular design $\bm{D}$ for the factorial set $\Omega$ for the $A^{\Omega}$ -model:

$\displaystyle Ey=b_{0}+b_{1}^{(1)}f_{1}^{(1)}\left(X_{1}\right)+\ldots+b_{1}^{\left(s_{1}-1\right)}f_{1}^{\left(s_{1}-1\right)}\left(X_{1}\right)+\ldots+b_{m}^{(1)}f_{m}^{(1)}\left(X_{m}\right)+\ldots+b_{m}^{\left(s_{m}-1\right)}f_{m}^{\left(s_{m}-1\right)}\left(X_{m}\right)+\sum\limits_{i_{1},i_{2}}\left[b_{i_{1}i_{2}}^{(1,1)}f_{i_{1}}^{(1)}\left(X_{i_{1}}\right)f_{i_{2}}^{(1)}\left(X_{i_{2}}\right)+\ldots+b_{i_{1}i_{2}}^{\left(s_{i_{1}}-1,s_{i_{2}}-1\right)}f_{i_{1}}^{\left(s_{i_{1}}-1\right)}\left(X_{i_{1}}\right)f_{i_{2}}^{\left(s_{i_{2}}-1\right)}\left(X_{i_{2}}\right)\right]+\ldots$ (93)

It follows from Eq. (89) that for the uniform design, the normalized variance at any point $\bm{x}\in\bm{D}^{f}$ (in particular, at any point of the design $\bm{D}$) is equal to the number of parameters to be estimated. Therefore, the design $\bm{D}$ is $D$-optimal in the domain $\bm{D}^{f}$. On the other hand, by Eq. (89), if at least one factor of the design $\bm{D}$ is nonuniform, there exists a point $\bm{x}\in\bm{D}^{f}$ for which the normalized variance exceeds the number of parameters. Hence, a nonuniform design cannot be $D$-optimal for the model Eq. (93) even on $\bm{D}^{f}$. Therefore, the following theorem has been proved.

Theorem 2.5.3. A regular factorial design for the factorial set $\Omega$ is $D$ -optimal for the factorial $A^{\Omega}$ -model on $\bm{D}^{f}$ if and only if it is uniform.

Let the functions $f_{i}^{(1)}\left(X_{i}\right),\ldots,f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right)$ be a set of orthogonal polynomials in $X_{i}$ at the points of the design $\bm{D}$, where $X_{i}$ is defined on the interval $[-1,+1]$. The $D$-optimality of a regular design for the factorial set $\Omega$ on $\bm{D}^{f}$ does not depend on the values of $X_{i}$ assigned to the $s_{i}$ levels of the factor. We will find the set of values of $X_{i}$ that makes the design $D$-optimal on the cube $-1\leqslant X_{i}\leqslant+1.$

Consider a regular uniform design for the factorial set $\Omega$ . Assume that $s_{i}$ different values $X_{i}^{(0)},\ldots,X_{i}^{\left(s_{i}-1\right)}$ of variable $X_{i}$ in the design $\bm{D}$ are the endpoints of the interval $[-1,+1]$ and the roots of the first derivative of the $\left(s_{i}-1\right)$ -th Legendre polynomial on this interval.

For the design $\bm{D}$ for $A^{\Omega}$ -model Eq. (93), the normalized variance at the point

$\displaystyle\bm{x}^{T}=\left[1,f_{1}^{(1)}\left(X_{1}\right),\ldots,f_{1}^{\left(s_{1}-1\right)}\left(X_{1}\right),\ldots,f_{m}^{(1)}\left(X_{m}\right),\ldots,f_{m}^{\left(s_{m}-1\right)}\left(X_{m}\right),f_{i_{1}}^{(1)}\left(X_{i_{1}}\right)f_{i_{2}}^{(1)}\left(X_{i_{2}}\right),\ldots,f_{i_{1}}^{\left(s_{i_{1}}-1\right)}\left(X_{i_{1}}\right)f_{i_{2}}^{\left(s_{i_{2}}-1\right)}\left(X_{i_{2}}\right)\right]$

equals, by Eq. (88),

$\displaystyle\bar{\sigma}_{\bm{x}}^{2}=\sigma^{2}\left\{1+\sum\limits_{j=1}^{s_{1}-1}\left[f_{1}^{(j)}\left(X_{1}\right)\right]^{2}+\ldots+\sum\limits_{j=1}^{s_{m}-1}\left[f_{m}^{(j)}\left(X_{m}\right)\right]^{2}+\sum\limits_{i_{1},i_{2}}\left\{\sum\limits_{j=1}^{s_{i_{1}}-1}\left[f_{i_{1}}^{(j)}\left(X_{i_{1}}\right)\right]^{2}\sum\limits_{j=1}^{s_{i_{2}}-1}\left[f_{i_{2}}^{(j)}\left(X_{i_{2}}\right)\right]^{2}\right\}+\ldots\right\}.$

Taking into account Eq. (92), we get

$\displaystyle\bar{\sigma}_{\bm{x}}^{2}=\sigma^{2}\left\{1+\sum\limits_{i=1}^{m}\left(s_{i}-1\right)+\sum\limits_{i_{1},i_{2}}\left(s_{i_{1}}-1\right)\left(s_{i_{2}}-1\right)+\ldots\right\}=k\sigma^{2}.$

Therefore, the design $\bm{D}$ is $D$-optimal. The design $\bm{D}$ is also $D$-optimal for any model obtained from the model Eq. (93) by a nonsingular linear transformation of its set of functions. Hence, the following theorem has been proved.

Theorem 2.5.4. Let $f_{i}^{(j)}\left(X_{i}\right)$ in the $A^{\Omega}$ -model Eq. (93) be a polynomial in $X_{i}$ of degree $j$ . Then the regular factorial design $\bm{D}$ for the factorial set $\Omega$ with the variables $X_{i}$ that have $s_{i}$ different values at the endpoints of the interval $[-1,+1]$ and at the roots of the first derivative of the $\left(s_{i}-1\right)$ -th Legendre polynomial is $D$ -optimal for the $A^{\Omega}$ -model Eq. (93) on the cube $-1\leqslant X_{i}\leqslant+1$ if and only if the design $\bm{D}$ is uniform.
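The constant $k$ that appears in the proof is the number of parameters of the $A^{\Omega}$-model: one for $b_{0}$ plus $\prod(s_{i}-1)$ for every main effect or interaction included in the model. A minimal Python sketch of this count follows. It assumes that the effects are supplied as a list of tuples of factor indices, one tuple per main effect or interaction; this representation and the name `num_parameters` are illustrative assumptions, not notation from the text.

```python
from math import prod


def num_parameters(levels, effects):
    """k = 1 + sum over effects of prod_{i in effect} (s_i - 1).

    levels  : dict mapping a factor index i to its number of levels s_i
    effects : iterable of tuples of factor indices (hypothetical encoding
              of the main effects and interactions present in the model)
    """
    return 1 + sum(prod(levels[i] - 1 for i in effect) for effect in effects)


if __name__ == "__main__":
    levels = {1: 3, 2: 3, 3: 2}
    effects = [(1,), (2,), (3,), (1, 2)]      # three main effects and the F1 x F2 interaction
    print(num_parameters(levels, effects))    # 1 + 2 + 2 + 1 + 4 = 10
```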

Let $\bm{D}_{i}$ be a $D$ -optimal uniform design with $s_{i}$ runs on the interval $[-1,+1]$ for the model

$\displaystyle Ey=b_{0}+b_{i}^{(1)}f_{i}^{(1)}\left(X_{i}\right)+\ldots+b_{i}^{\left(s_{i}-1\right)}f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right).$

The following theorem is a generalization of Theorem 2.5.4 and can be proved analogously.

Theorem 2.5.5. A regular factorial design $\bm{D}$ for the factorial set $\Omega$ in which, for each $i$ $(i=1,\ldots,m)$, the $s_{i}$ levels of the variable $X_{i}$ coincide with the $s_{i}$ points of the design $\bm{D}_{i}$ is $D$-optimal for the $A^{\Omega}$-model Eq. (93) on the cube $-1\leqslant X_{i}\leqslant+1$ if and only if the design $\bm{D}$ is uniform.

Theorem 2.5.6. The $D$ -optimal regular uniform design $\bm{D}$ from Theorem 2.5.5 is $Q$ -optimal and $A$ -optimal if the set of the functions $f_{i}^{(1)}\left(X_{i}\right),\ldots,f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right)$ satisfies the condition Eq. (78).

Proof The theorem is a corollary of Theorem 2.11.1 in the book by Fedorov (1972) and the results of this chapter.

Now consider the mixed factorial $G^{\Omega}$ -model for the factorial set $\Omega$ for the qualitative factors $F_{1},\ldots,F_{n}$ and the quantitative factors $F_{n+1},\ldots,F_{m}$ :

$\displaystyle Ey=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{\Theta}.$ (94)

The domain is defined as follows:

$\displaystyle F_{j}=0,1,\ldots,s_{j}-1\ \left(j=1,\ldots,n\right),\quad-1\leqslant X_{i}\leqslant 1\ \left(i=n+1,\ldots,m\right).$ (95)

It follows from Section 2.11 that for a $k$ -dimensional vector $\bm{\Theta}$ of parameters of the $G^{\Omega}$ -model, the following equality holds:

$\displaystyle\bm{T\Theta}=\bm{0}\left(Rg\bm{T}=q\right).$ (96)

Let the general solution of Eq. (96) be

$\displaystyle\bm{\Theta}=\bm{Q}\bm{\theta}_{m},$

where $\bm{Q}$ is a $k\times(k-q)$ matrix with $Rg\,\bm{Q}=k-q$ and $\bm{TQ}=\bm{0}$, and $\bm{\theta}_{m}$ is the vector of $k-q$ elements, which are the new parameters. The value $k-q$ can be calculated using Theorem 1.9.2. After the reparametrization, we obtain the following model:

$\displaystyle Ey=\bm{f}^{T}\left(X_{1},\ldots,X_{m}\right)\bm{Q}\bm{\theta}_{m}.$ (97)

For different choices of the new parameters $\bm{\theta}_{m}$, we obtain different matrices $\bm{Q}$ related to each other by nonsingular linear transformations. This is equivalent to nonsingular linear transformations of the set of functions of the model Eq. (97). The property of $D$-optimality is invariant under such transformations. In view of the last remark, we give the following definition.

Definition 2.5.1. The design $\bm{D}$ is called $D$-optimal for the $G^{\Omega}$-model with the restrictions Eq. (96) on the domain Eq. (95) if it is $D$-optimal for any model Eq. (97) that results from a reparametrization of the model Eq. (94).

Let $\bm{D}$ be a regular uniform design for the factorial set $\Omega$ for the qualitative factors $F_{1},\ldots,F_{n}$ and the quantitative factors $F_{n+1},\ldots,F_{m}$. Assume that the functions $f_{i}^{(1)}\left(X_{i}\right),\ldots,f_{i}^{\left(s_{i}-1\right)}\left(X_{i}\right)$ (which enter the $G^{\Omega}$-model for the quantitative factors) are pairwise orthogonal so that condition Eq. (78) holds. Then, using methods similar to those in Sections 3.4 and 3.5, we can show that the normalized (per one treatment combination) variance of the estimate of the regression function at the point $\left(j_{1},\ldots,j_{n},X_{n+1},\ldots,X_{m}\right)$ is equal to

$\displaystyle\sigma_{a}^{2}\left(j_{1},\ldots,j_{n},X_{n+1},\ldots,X_{m}\right)=\sigma^{2}\left\{1+\sum_{j=1}^{s_{n+1}-1}\left[f_{n+1}^{(j)}(X_{n+1})\right]^{2}+\ldots+\sum_{j=1}^{s_{m}-1}\left[f_{m}^{(j)}(X_{m})\right]^{2}+\left(s_{1}-1\right)+\ldots+\left(s_{n}-1\right)+\sum\limits_{i_{1},i_{2}}(s_{i_{1}}-1)\sum_{j=1}^{s_{i_{2}}-1}\left[f_{i_{2}}^{(j)}(X_{i_{2}})\right]^{2}+\ldots\right\}$ (98)

If the variables $X_{i}$ take $s_{i}$ values at the points of the design $\bm{D}_{i}$ , then, by Eq. (3.5) and Theorem 1.9.2,

$\displaystyle\sigma_{a}^{2}(j_{1},\ldots,j_{n},X_{n+1},\ldots,X_{m})=\sigma^{2}\left[1+\sum_{i=1}^{m}(s_{i}-1)+\sum_{i_{1},i_{2}}(s_{i_{1}}-1)(s_{i_{2}}-1)+\ldots\right]=\sigma^{2}(k-q).$ (99)

The multiplier $k-q$ in Eq. (99) equals the number of parameters of the model Eq. (97). It is now evident that, following the line of the proof of Theorem 2.5.4, we can prove the following generalization of Theorem 2.5.5.

Theorem 2.5.7. Let the design $\bm{D}$ be a regular factorial design for the factorial set $\Omega$ for the qualitative factors $F_{1},\ldots,F_{n}$ and the quantitative variables $X_{n+1},\ldots,X_{m}$. For each $i$ $\left(i=n+1,\ldots,m\right)$, let the $s_{i}$ levels of the variable $X_{i}$ coincide with the $s_{i}$ points of the design $\bm{D}_{i}$. Then the design $\bm{D}$ is $D$-optimal for the $G^{\Omega}$-model Eq. (94) with the restrictions Eq. (96) on the domain Eq. (95) if and only if it is uniform.

Note to Theorem 2.5.7. It is easy to prove that for the design $\bm{D}$ from Theorem 2.5.7, a statement similar to Theorem 2.5.6 holds.

The results of this chapter provide a justification for the applicability of the criterion of regularity in the design of experiments.

3.6 BG-criterion

A geometric interpretation of $D$-optimal (and close to $D$-optimal) designs is based on the volume of a multidimensional ellipsoid. This interpretation does not give a clear picture of the relative efficiency of designs when we want to compare them. In this section, we consider a transformation that reduces the multidimensional characteristic of $D$-optimality to a linear one. The transformation was introduced by Brodsky and Golikova (1981). It is applicable not only to factorial models but also to other polynomial models.

Consider one preliminary example (Brodsky & Golikova, 1981) for the design of second order. Let $\bm{D}^{*}$ be the $D$ -optimal design on the cube

$\displaystyle-1\leqslant X_{i}\leqslant 1(i=1,\ldots,7)$

for the polynomial model of second order. Let $\bm{M}^{*}$ be the information matrix of the design $\bm{D}^{*}$ .

For the same model, consider another design $\bm{D}_{1}$ obtained by multiplying all coordinates of every point of $\bm{D}^{*}$ by 0.99. In practice, these two designs can be regarded as “almost the same”.

It is easy to calculate that

$\displaystyle\det\bm{M}_{1}=0.28\cdot\det\bm{M}^{*},$

where $\bm{M}_{1}$ is the information matrix of the design $\bm{D}_{1}$.

Hence, the design $\bm{D}_{1}$ is “three times worse” than the design $\bm{D}^{*}$ (based on determinant of information matrix). Therefore, it is obvious that it does not make much sense to compare designs based on the determinant of the information matrix. This is not surprising: the clarity of the comparison is lost in moving to multidimensional characteristics. Therefore, all criteria are usually reduced to linear characteristics.

It is for this reason that a number of authors perform certain transformations of this criterion (the determinant of the information matrix). The most popular transformation is a root of degree $q$. Some authors take $q=k$, where $k$ is the number of parameters of the model. More often, however, they take $q=2k$. Let us see what this yields for the example above.

It is easy to calculate that

$\displaystyle\frac{(\det\bm{M}_{1})^{1/k}}{(\det\bm{M}^{*})^{1/k}}\cdot 100\%=96.5\%,\qquad\frac{(\det\bm{M}_{1})^{1/(2k)}}{(\det\bm{M}^{*})^{1/(2k)}}\cdot 100\%=98.3\%,$

i.e., the design that should have 99% efficiency is interpreted as 96.5%-optimal (for $q=k$) or 98.3%-optimal (for $q=2k$). The difference is not large, especially for $q=2k$. So, it may seem that the transformation meets the goal. However, one more example will show that it does not.

In our example, replace the design $\bm{D}_{1}$ with the design $\bm{D}_{2}$ obtained by multiplying all coordinates of every point of $\bm{D}^{*}$ by 0.90. It is easy to calculate that for $q=2k$ it will be interpreted as 83%-optimal (instead of 90%-optimal) and for $q=k$, as 69%-optimal. It is possible to give even stronger examples.

It turns out that the use of these transformations only aggravates the situation. Indeed, since determinants of information matrices usually differ by several orders of magnitude, nobody wants to compare them. It is simply stated, instead, which is the greater. The transformations give the impression of comparability of criteria. The researcher might choose a not quite appropriate design (e.g., one with a large number of experiments) just because it has a “significantly” better characteristic than others, while in reality, the characteristics of all designs might be very close to each other.

Is it possible to find a transformation that never distorts (in the sense mentioned above) the characteristic of $D$-optimality? For polynomial models, such a transformation is a root of degree

$\displaystyle q=2n_{1}+4n_{2}+\ldots+2ln_{l},$ (100)

where $n_{i}$ is the number of terms of order $i$ in the model $\left(i=1,\ldots,l\right)$ .

We will call a corresponding criterion the BG-criterion.

Denote the number of variables by $m$ and the number of parameters of the model by $k$. Then for designs of the first order (for example, a two-level factorial design of main effects), $n_{1}=k-1$ and Eq. (100) becomes

$\displaystyle q=2k-2.$

For designs of the second order, $n_{1}=m$ and $n_{2}=k-1-m$, so that

$\displaystyle q=4k-2m-4.$

For the example under consideration ($m=7$, $k=36$), $q=126$ (instead of the commonly used 36 and 72).

Therefore, the BG-criterion of optimality of the design $\bm{D}$ (related to the criterion of $D$-optimality and expressed through the determinant of the information matrix $\bm{M}$ of the design $\bm{D}$) is:

$\displaystyle\frac{(\det\bm{M})^{1/q}}{(\det\bm{M}^{*})^{1/q}}\cdot 100\%,$ (101)

where $q$ is defined by Eq. (100).

Using Eq. (101), we get that the value of BG-criterion equals 99% for the design $\bm{D}_{1}$ and equals 90% for the design $\bm{D}_{2}$ .

The geometric interpretation of the BG-criterion is straightforward. If, for a given design $\bm{D}$, the value of the BG-criterion equals, say, 95%, then a $D$-optimal design with the coordinates of all its points multiplied by 0.95 has the same determinant of the information matrix as the design $\bm{D}$.
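The figures quoted in this section can be reproduced with a few lines of Python. The sketch below is illustrative and assumes NumPy; the names `bg_degree`, `efficiency` and `quadratic_model_matrix` are hypothetical. It relies on the fact that shrinking all coordinates by a factor $c$ multiplies each column of order $i$ of the model matrix by $c^{i}$, so the determinant of the information matrix is multiplied by $c^{q}$ with $q$ from Eq. (100). The last lines check this scaling on a small assumed example, a $3^{2}$ grid with a two-variable second-order model, which is not the seven-variable design of the text.

```python
import numpy as np


def bg_degree(term_counts):
    """q = 2*n_1 + 4*n_2 + ... + 2*l*n_l, where term_counts[i-1] = n_i (Eq. (100))."""
    return sum(2 * order * n for order, n in enumerate(term_counts, start=1))


def efficiency(det_ratio, root):
    """(det M / det M*)^(1/root), expressed in percent."""
    return 100.0 * det_ratio ** (1.0 / root)


def quadratic_model_matrix(points):
    """Rows (1, x1, x2, x1^2, x2^2, x1*x2) of a two-variable second-order model."""
    x1, x2 = points[:, 0], points[:, 1]
    return np.column_stack([np.ones(len(points)), x1, x2, x1**2, x2**2, x1 * x2])


if __name__ == "__main__":
    m = 7
    n1, n2 = m, m + m * (m - 1) // 2        # 7 linear and 28 second-order terms
    k = 1 + n1 + n2                         # 36 parameters
    q = bg_degree([n1, n2])                 # 126
    for c in (0.99, 0.90):
        ratio = c ** q                      # det M_c / det M* after shrinking by c
        print(c, [efficiency(ratio, r) for r in (k, 2 * k, q)])

    # Numerical check of the scaling law on an assumed small example.
    grid = np.array([(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)], dtype=float)
    F, Fc = quadratic_model_matrix(grid), quadratic_model_matrix(0.9 * grid)
    print(np.linalg.det(Fc.T @ Fc) / np.linalg.det(F.T @ F), 0.9 ** bg_degree([2, 3]))
```

Only the root of degree $q$ returns the shrinking factor itself (99% and 90%); the roots of degree $k$ and $2k$ give the distorted values discussed above.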

4. Geometric designs

4.1 Splitting of degrees of freedom

This section and the next are devoted to the fundamental concept introduced by Bose (1947): the way the degrees of freedom split in a full symmetrical design $s^{m}$, where $s=p^{h}$ and $p$ is prime.

Definition 3.1.1. Let $y_{1},\ldots,y_{l}$ and $y_{l+1},\ldots,y_{2l}$ be two sets of observations. Then a vector of coefficients of the linear function of observations

$\displaystyle y_{1}+\ldots+y_{l}-y_{l+1}-\ldots-y_{2l}$ (102)

is called a contrast between these two sets of observations.

It is evident that Eq. (5) is satisfied for the vector of coefficients of the linear function Eq. (102).

Assume that all $N$ observations are divided into $q$ sets of $N_{1}=N/q$ observations each in such a way that no observation belongs to two sets. Then there exist $\binom{q}{2}=q(q-1)/2$ different contrasts between these sets. It is evident that the maximum number of linearly independent contrasts equals $q-1$. An example of $q-1$ linearly independent contrasts is given by the contrasts between any fixed set and each of the other sets. The contrasts between these sets are said to carry $q-1$ degrees of freedom.

The following lemma is evident.

Lemma 3.1.1 (Bose, 1947). Suppose that all $N$ observations are divided into $q_{1}$ subsets of $N_{1}=N/q_{1}$ observations each in one way, and into $q_{2}$ subsets of $N_{2}=N/q_{2}$ observations each in another way, so that for every split each of the $N$ observations belongs to one and only one subset. If every subset of the first split shares exactly $N_{1}/q_{2}$ observations with every subset of the second split, then a contrast between any two subsets of the first split is orthogonal to a contrast between any two subsets of the second split.

Consider the full symmetrical design $s^{m}$, where $s=p^{h}$, $p$ is prime, and $h$ is a positive integer. In the design $s^{m}$, every level of a factor corresponds to an element of the Galois field $GF(s)$. Then any treatment combination of the design with the factors $F_{1},\ldots,F_{m}$ at levels $\chi_{1},\ldots,\chi_{m}$ can be represented by a point of the $m$-dimensional finite Euclidean space $EG(m,s).$

Let $P\left(a_{1},\ldots,a_{m}\right)$ be a pencil of parallel flats in $EG(m,s)$. This pencil divides all $s^{m}$ treatments into $s$ subsets of $s^{m-1}$ treatments (each subset corresponds to one flat of the pencil). Different flats of the pencil have no points in common, and exactly one flat of the pencil passes through each point of $EG(m,s)$. Hence, each treatment belongs to one and only one subset. Therefore, the maximal number of linearly independent contrasts between these subsets is $s-1$. In this case the pencil $P\left(a_{1},\ldots,a_{m}\right)$ of parallel flats is said to carry $s-1$ degrees of freedom.

Consider two different pencils $P_{1}$ and $P_{2}$ of parallel flats.

Theorem 3.1.1 (Bose, 1947). A contrast between any two subsets generated by the pencil $P_{1}$ is orthogonal to a contrast between any two subsets generated by the pencil $P_{2}.$

Proof Any given flat of the pencil $P_{1}$ intersects the $s$ different flats of the pencil $P_{2}$ in $s$ different $(m-2)$-flats. No two of these $(m-2)$-flats have a point in common (otherwise two different flats of the pencil $P_{2}$ would have a point in common). Any $(m-2)$-flat contains exactly $s^{m-2}$ points. Therefore, each $(m-1)$-flat of the pencil $P_{2}$ contains exactly $s^{m-2}$ of the $s^{m-1}$ points belonging to the given $(m-1)$-flat of the pencil $P_{1}$. The statement of the theorem now follows from Lemma 3.1.1.

The number of different pencils of parallel flats is equal to $\left(s^{m}-1\right)/(s-1)$. Each pencil carries $s-1$ degrees of freedom. Hence, all $s^{m}-1$ degrees of freedom carried by all contrasts can be split into $\left(s^{m}-1\right)/(s-1)$ sets (generated by pencils of parallel flats) of $s-1$ degrees of freedom each so that any contrast corresponding to one set is orthogonal to any contrast corresponding to another set.
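The splitting described above can be verified by direct enumeration for small cases. The following Python sketch is illustrative only. It assumes that $s$ is prime, so that arithmetic in $GF(s)$ reduces to arithmetic modulo $s$ (the case $s=p^{h}$ with $h>1$ would need genuine field arithmetic), and it uses NumPy. The function names are hypothetical.

```python
from itertools import product

import numpy as np


def pencil_representatives(m, s):
    """One coefficient vector (a_1, ..., a_m) per pencil: the first nonzero entry equals 1."""
    reps = []
    for a in product(range(s), repeat=m):
        nonzero = [c for c in a if c]
        if nonzero and nonzero[0] == 1:
            reps.append(a)
    return reps


def flat_contrasts(a, m, s):
    """The s - 1 contrasts between flat 0 and flats 1, ..., s-1 of the pencil P(a)."""
    points = np.array(list(product(range(s), repeat=m)))
    flat = (points @ np.array(a)) % s          # label of the flat through each point
    base = (flat == 0).astype(int)
    return np.array([base - (flat == c).astype(int) for c in range(1, s)])


if __name__ == "__main__":
    m, s = 2, 3
    reps = pencil_representatives(m, s)
    blocks = [flat_contrasts(a, m, s) for a in reps]
    print(len(reps), (s**m - 1) // (s - 1))    # number of pencils
    orthogonal = all(not (blocks[i] @ blocks[j].T).any()
                     for i in range(len(blocks)) for j in range(i + 1, len(blocks)))
    print(orthogonal)                          # contrasts of different pencils are orthogonal
    print(np.linalg.matrix_rank(np.vstack(blocks)), s**m - 1)
```

The rank of the stacked contrasts equals $s^{m}-1$, so the pencils together account for all degrees of freedom.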

4.2 Nature of degrees of freedom carried by parallel pencils

Following Bose (1947), consider the nature of $s-1$ degrees of freedom carried by the pencil

$\displaystyle P\left(a_{1},\ldots,a_{m}\right).$ (103)

Suppose that, of the $m$ coordinates $a_{1},\ldots,a_{m}$ of the pencil Eq. (103), $n$ coordinates are nonzero (without loss of generality, $a_{1},\ldots,a_{n}$) and the rest ($a_{n+1},\ldots,a_{m}$) are equal to zero. Any $(m-1)$-flat of the pencil Eq. (103) is

$\displaystyle a_{0}+a_{1}\chi_{1}+\ldots+a_{n}\chi_{n}=0,$ (104)

where $a_{0}$ is one of the $s$ elements of $GF(s)$. Consider two points such that the $i$-th coordinate of one of them equals the $i$-th coordinate of the other for all $i=1,\ldots,n$. It is evident that these two points either both satisfy Eq. (104) or both do not. Therefore, the coordinates of a contrast between any two flats of the pencil Eq. (103) are the same for the same combinations of $\chi_{1},\ldots,\chi_{n}$.

When $n=1$ , coordinates of a contrast between any two flats of the pencil

$\displaystyle P(a,0,\ldots,0)$ (105)

depend only on levels of the factor $F_{1}$ and, by the definition, form a main effect of the factor. Since the pencil of parallel flats carries $s-1$ degrees of freedom, the pencil Eq. (105) generates a full set of linearly independent main effects.

When $n=2$ , coordinates of a contrast between any two flats of the pencil

$\displaystyle P\left(a_{1},a_{2},0,\ldots,0\right)$ (106)

depend only on levels of the factors $F_{1}$ and $F_{2}$ . By Theorem 3.1.1, this contrast is orthogonal to all main effects of the factors $F_{1}$ and $F_{2}$ . Therefore, it is an interaction effect of the factors $F_{1}$ and $F_{2}$ . The number of different pencils of type Eq. (106) equals $s-1$ . Each of them carries $s-1$ degrees of freedom, and these degrees of freedom for one pencil of type Eq. (106), by Theorem 3.1.1, are orthogonal to degrees of freedom of other pencil of type Eq. (106). Therefore, all pencils of type Eq. (106) produce the full set of $(s-1)^{2}$ linearly independent interaction effects of the factors $F_{1}$ and $F_{2}$ .

Increasing $n$ we get the following theorem.

Theorem 3.2.1 (Bose, 1947). If $n$ coordinates $a_{i_{1}},\ldots,a_{i_{n}}$ of the pencil Eq. (103) are nonzero and the rest of them are zero, a contrast between any two flats of the pencil Eq. (103) is an interaction effect of the factors $F_{i_{1}},\ldots,F_{i_{n}}$. The pencil Eq. (103) carries $s-1$ degrees of freedom. The number of different pencils generating interaction effects of the factors $F_{i_{1}},\ldots,F_{i_{n}}$ equals $(s-1)^{n-1}$.
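The count $(s-1)^{n-1}$ in Theorem 3.2.1 can be checked by grouping pencils according to the positions of their nonzero coordinates. The Python sketch below is illustrative; the function name `pencils_by_support` is hypothetical. One coefficient vector per pencil is selected by requiring the first nonzero coordinate to be 1, which presumes that proportional vectors define the same pencil.

```python
from collections import Counter
from itertools import product


def pencils_by_support(m, s):
    """Count pencils P(a_1, ..., a_m) by the set of (1-based) positions with nonzero a_i."""
    counts = Counter()
    for a in product(range(s), repeat=m):
        support = tuple(i + 1 for i, c in enumerate(a) if c)
        # Normalizing the first nonzero coordinate to 1 keeps one vector per pencil.
        if support and a[support[0] - 1] == 1:
            counts[support] += 1
    return counts


if __name__ == "__main__":
    m, s = 3, 3
    counts = pencils_by_support(m, s)
    for support, number in sorted(counts.items(), key=lambda kv: (len(kv[0]), kv[0])):
        # number == (s-1)^(n-1); each pencil carries s-1 degrees of freedom
        print(support, number, number * (s - 1))
    print(sum(counts.values()), (s**m - 1) // (s - 1))
```

Multiplying the number of pencils for $n$ given factors by $s-1$ gives $(s-1)^{n}$, the number of degrees of freedom of the corresponding interaction.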

Let $X_{i}\in PG(m,s)$ be the vertices of the fundamental simplex. Consider the bundle of the parallel flats $P\left(a_{1},\ldots,a_{m}\right)$ in $PG(m,s)$. Assume that the coordinates $a_{i_{1}},\ldots,a_{i_{n}}$ of this bundle are nonzero, and the remaining coordinates are equal to zero. Then the vertex of the bundle is

$\displaystyle\chi_{0}=0,a_{i_{1}}\chi_{i_{1}}+\ldots+a_{i_{n}}\chi_{i_{n}}=0.$ (107)

The vertex Eq. (107) passes through all vertices of the fundamental simplex, except the vertices $X_{i_{1}},\ldots,X_{i_{n}}$. Then the following theorem follows from Theorem 3.2.1.

Theorem 3.2.2 (Bose, 1947). The pencil $P\left(a_{1},\ldots,a_{m}\right)$ of parallel flats corresponds to interaction effects of order $(n-1)$ of the factors $F_{i_{1}},\ldots,F_{i_{n}}$ if and only if the vertex of the corresponding bundle passes through all the vertices of the fundamental simplex other than $X_{i_{1}},\ldots,X_{i_{n}}.$

4.3 Hypercubes of strength t

Consider a full symmetrical design with each factor at $s$ levels.

Set up a one-to-one correspondence between the levels of the factors $F_{i}\left(i=1,\ldots,m\right)$ and the elements of Galois field $GF(s)$ denoted by $0,1,\ldots,s-1$ . The full design corresponds to the points of a finite Euclidean space $EG(m,s)$ . Denote coordinates of points of this space as $\left(\chi_{1},\ldots,\chi_{m}\right)$ . Then the system of $l$ independent equations

$\displaystyle\begin{aligned}a_{11}\chi_{1}+\ldots+a_{m1}\chi_{m}&=0,\\ &\ldots\\ a_{1l}\chi_{1}+\ldots+a_{ml}\chi_{m}&=0\end{aligned}$ (108)

with coefficients $a_{ij}\in GF(s)$ quite obviously defines the subset of $s^{m-l}$ points of $EG(m,s)$ or the subset of the full design $s^{m}.$

Definition 3.3.1. The design consisting of the $s^{m-l}$ points satisfying the system of $l$ linearly independent equations Eq. (108) is called a geometric design.

We will also use Definition 3.3.1 of geometric designs when the right-hand sides of Eq. (108) are arbitrary elements (not necessarily zero). However, hereafter, except in Section 4.6, we will assume that the right-hand sides of Eq. (108) are zero.

Theorem 3.3.1 (Rao, 1950). The $s^{m-l}$ points satisfying the system Eq. (108) form a hypercube of strength $t$ if and only if there is no nontrivial linear combination of the equations Eq. (108) that contains fewer than $t+1$ nonzero coefficients.

Proof Assume that any nontrivial linear combination of the equations Eq. (108) contains at least $(t+1)$ nonzero coefficients. Then any combination of levels of any $t$ factors is compatible with the system Eq. (108). Indeed, without loss of generality, we can select the factors $F_{1},\ldots,F_{t}$ as the set of $t$ factors. Fix them at certain levels. Then Eq. (108) is transformed into the following system:

$\displaystyle\begin{aligned}a_{1}&=a_{t+1,1}\chi_{t+1}+\ldots+a_{m1}\chi_{m},\\ &\ldots\\ a_{l}&=a_{t+1,l}\chi_{t+1}+\ldots+a_{ml}\chi_{m}.\end{aligned}$ (109)

Each equation of Eq. (109) contains at least one nonzero coefficient. Without loss of generality, assume that $a_{ml}\neq 0$. Add to the $(l-1)$-th equation the $l$-th equation multiplied by an appropriate factor. Then we get an equation with $a_{m,l-1}=0$. This equation also has at least one nonzero coefficient, because it comes from a nontrivial linear combination of the equations Eq. (108), which has at least $t+1$ nonzero coefficients, at most $t$ of which belong to the fixed factors. We can assume that $a_{m-1,l-1}\neq 0$. Continuing this process of diagonalization, we transform the system Eq. (109) to a semi-diagonal form. It follows that the system Eq. (109) always has at least one solution, and that the number of different solutions is constant and equals $s^{m-l-t}$. Hence, any combination of levels of any $t$ factors occurs exactly $s^{m-l-t}$ times. Therefore, the design is a hypercube of strength $t$.

Now suppose that there exists a linear combination of the equations Eq. (108) that forms the equation

$\displaystyle a_{1}\chi_{i_{1}}+\ldots+a_{n}\chi_{i_{n}}=0,$ (110)

where $n<t+1$. Then it is evident that not all combinations of levels of the $n$ factors $F_{i_{1}},\ldots,F_{i_{n}}$ are compatible with Eq. (108). This proves the theorem.
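Theorem 3.3.1 can be illustrated by brute force on small examples. The Python sketch below assumes a prime $s$ (arithmetic modulo $s$) and linearly independent generating equations with zero right-hand sides. It compares the largest strength of the design, found by enumeration, with the smallest number of nonzero coefficients in a nontrivial linear combination of the equations. All function names and the sample equations are hypothetical choices made for the illustration.

```python
from collections import Counter
from itertools import combinations, product


def geometric_design(eqs, m, s):
    """All points of EG(m, s) satisfying sum_i a_i * chi_i = 0 (mod s) for each equation."""
    return [x for x in product(range(s), repeat=m)
            if all(sum(a * c for a, c in zip(eq, x)) % s == 0 for eq in eqs)]


def strength(design, m, s):
    """Largest t such that every t columns contain all s^t level combinations equally often."""
    t = 0
    for size in range(1, m + 1):
        for cols in combinations(range(m), size):
            counts = Counter(tuple(x[c] for c in cols) for x in design)
            if len(counts) != s**size or len(set(counts.values())) != 1:
                return t
        t = size
    return t


def min_weight(eqs, m, s):
    """Smallest number of nonzero coefficients in a nontrivial combination of the equations."""
    best = m
    for coef in product(range(s), repeat=len(eqs)):
        if any(coef):
            comb = [sum(c * eq[i] for c, eq in zip(coef, eqs)) % s for i in range(m)]
            best = min(best, sum(v != 0 for v in comb))
    return best


if __name__ == "__main__":
    eqs = [(1, 1, 1, 0), (1, 2, 0, 1)]     # two equations over GF(3), m = 4
    design = geometric_design(eqs, 4, 3)
    print(len(design), strength(design, 4, 3), min_weight(eqs, 4, 3))
```

In this example the design has $3^{4-2}=9$ points, every nontrivial combination of the two equations has three nonzero coefficients, and the design has strength 2, in agreement with the theorem.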

Theorem 3.3.2 (Bose & Bush, 1952). Suppose that there exists a matrix $\bm{C}=\left\{c_{ij}\right\}$ of size $m\times n$ ($c_{ij}\in GF(s)$, $s=p^{h}$, $p$ is prime) such that any of its submatrices of size $t\times n$ has rank $t$. Then there exists an orthogonal array $\left(s^{n},m,s,t\right).$

Proof Consider the matrix $\bm{Q}=\left\{q_{ij}\right\}$ of the full design $s^{n}$ of size $s^{n}\times n$ $\left(q_{ij}\in GF(s)\right)$. We will show that the matrix $\bm{A}=\bm{Q}\bm{C}^{T}$ of size $s^{n}\times m$ is the orthogonal array $\left(s^{n},m,s,t\right)$.

Let $\bm{A}^{\prime}$ be a submatrix of size $s^{n}\times t$ of the matrix $\bm{A}$, and let $\bm{C}^{\prime}$ be the submatrix of size $t\times n$ of the matrix $\bm{C}$ corresponding to $\bm{A}^{\prime}$. Since $Rg\bm{C}^{\prime}=t,$ each possible row of $\bm{A}^{\prime}$ is obtained from exactly $s^{n-t}$ different rows of the matrix $\bm{Q}$. Hence, each row in $\bm{A}^{\prime}$ occurs $s^{n-t}$ times. Therefore, $\bm{A}$ is an orthogonal array of strength $t$ and index $\lambda=s^{n-t}.$
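The construction $\bm{A}=\bm{Q}\bm{C}^{T}$ used in the proof is easy to carry out for a prime $s$. The Python sketch below is an illustration under that assumption (arithmetic modulo $s$; prime powers would require genuine $GF(p^{h})$ arithmetic). It uses NumPy, and the function names and the sample matrices $\bm{C}$ are hypothetical choices.

```python
from collections import Counter
from itertools import combinations, product

import numpy as np


def orthogonal_array(C, s):
    """A = Q C^T over GF(s), s prime, where Q lists all s^n rows of the full design."""
    C = np.asarray(C)
    n = C.shape[1]
    Q = np.array(list(product(range(s), repeat=n)))
    return (Q @ C.T) % s


def has_strength(A, s, t):
    """True if every t columns of A contain each of the s^t combinations equally often."""
    runs, m = A.shape
    for cols in combinations(range(m), t):
        counts = Counter(map(tuple, A[:, list(cols)]))
        if len(counts) != s**t or set(counts.values()) != {runs // s**t}:
            return False
    return True


if __name__ == "__main__":
    # Any two rows of C are linearly independent over GF(3), so t = 2.
    C = [[1, 0], [0, 1], [1, 1], [1, 2]]
    A = orthogonal_array(C, 3)
    print(A.shape, has_strength(A, 3, 2))      # a 9 x 4 array of strength 2, index 1
```

With the seven nonzero binary vectors of length 3 as the rows of $\bm{C}$ and $s=2$, the same construction gives an $8\times 7$ array of strength 2 and index $\lambda=2^{3-2}=2$; it does not have strength 3 because some triples of rows of $\bm{C}$ are linearly dependent.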

Elements of rows of the matrix $\bm{C}$ can be interpreted as coordinates of points in a finite projective space $PG\left(n-1,p^{h}\right)$ such that no $t$ of them belong to a subspace of dimension $t-2$ or less. Therefore, this condition is equivalent to the condition of Theorem 3.3.2.

We will show that two conditions of Theorem 3.3.2 are equivalent to the condition of Theorem 3.3.1.

Theorem 3.3.3. The following three statements are equivalent:

1. There exists a matrix $\bm{C}=\left\{c_{ij}\right\}$ of size $m\times n$ ($c_{ij}\in GF\left(p^{h}\right)$, $p$ is prime) such that any of its submatrices of size $t\times n$ has rank $t$.

2. There exist $m$ points in the projective space $PG\left(n-1,p^{h}\right)$ such that no $t$ of them belong to a subspace of dimension $t-2$ or less.

3. There exists a system of $l=m-n$ equations Eq. (108) such that there is no nontrivial linear combination of the equations that contains fewer than $t+1$ nonzero coefficients.

Proof Suppose that statement 3 of the theorem holds. We will use the following transformations of a matrix: addition of a multiple of one row to another and permutation of rows or columns; we call them elementary transformations. It is evident that the matrix $\bm{V}=\left\{v_{ij}\right\}$ $\left(i=1,\ldots,m;\,j=1,\ldots,l\right)$ of coefficients of the system Eq. (108) can be converted by elementary transformations to the following matrix:

$\displaystyle\begin{Vmatrix}g_{11}&{g}_{21}&\ldots&g_{n1}&0&0&\ldots&1\\ \vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\ddots&\vdots\\ g_{1l}&{g}_{2l}&\ldots&g_{nl}&1&0&\ldots&0\\ \end{Vmatrix}.$ (111)

Each row of the matrix Eq. (111) is a nontrivial linear combination of rows of the matrix $\bm{V}$. Hence, each row of the matrix Eq. (111) contains at least $t+1$ nonzero elements. That is, for any $j$ there are at least $t$ nonzero elements among the numbers $g_{1j},\ldots,g_{nj}$. A similar statement can be made for any nontrivial combination of rows of the matrix Eq. (111). Namely, a nontrivial linear combination of rows of the matrix Eq. (111) with $a\left(>0\right)$ nonzero coefficients contains at least $t+1$ nonzero elements, and at least $t+1-a$ of them occur in the first $n$ columns.

Consider the matrix

$\displaystyle\begin{Vmatrix}g_{11}&g_{21}&\ldots&g_{n1}&0&0&\ldots&1\\ \vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\ddots&\vdots\\ g_{1l}&g_{2l}&\ldots&g_{nl}&1&0&\ldots&0\\ 0&0&\ldots&1&0&0&\ldots&0\\ \vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\ddots&\vdots\\ 0&1&\ldots&0&0&0&\ldots&0\\ 1&0&\ldots&0&0&0&\ldots&0\\ \end{Vmatrix}$ (112)

and some nontrivial linear combination of $t$ of its rows, with $a$ rows selected from the first $l$ rows and $(t-a)$ rows selected from the remaining rows. It is evident that the first $n$ elements of this linear combination contain at least $(t+1-a)-(t-a)=1$ nonzero element. This means that any submatrix of size $t\times n$ $(t\leqslant n)$ of the matrix

$\displaystyle\begin{Vmatrix}g_{11}&g_{21}&\ldots&g_{n1}\\ \vdots&\vdots&\ddots&\vdots\\ g_{1l}&g_{2l}&\ldots&g_{nl}\\ 0&0&\ldots&1\\ \vdots&\vdots&\ddots&\vdots\\ 0&1&\ldots&0\\ 1&0&\ldots&0\\ \end{Vmatrix}$ (113)

of size $m\times n$ has rank $t$.

Suppose that the statement 1 of the theorem holds and rank of the matrix $\bm{C}=\left\{c_{ij}\right\}$ equals $n^{\prime}\left(t\leqslant n^{\prime}\leqslant n\right)$ . Select $n^{\prime}$ independent rows in the matrix $\bm{C}$ (without loss of generality, assume that these rows are the last $n^{\prime}$ rows of $\bm{C})$ . Then each row $\bm{c}_{i}\left(i=1,\ldots,m-n^{\prime}\right)$ of the matrix $\bm{C}$ can be represented by a linear combination of rows $\bm{c}_{u}\left(u=m-n^{\prime}+1,\ldots,m\right)$ of the matrix:

$\displaystyle\begin{aligned}\bm{c}_{1}&=\lambda_{11}\bm{c}_{m-n^{\prime}+1}+\ldots+\lambda_{1n^{\prime}}\bm{c}_{m},\\ &\ldots\\ \bm{c}_{m-n^{\prime}}&=\lambda_{\left(m-n^{\prime}\right)1}\bm{c}_{m-n^{\prime}+1}+\ldots+\lambda_{\left(m-n^{\prime}\right)n^{\prime}}\bm{c}_{m}.\end{aligned}$ (114)

Consider the matrix $\bm{\Lambda}=\left\{\lambda_{ij}\right\}$ $\left(i=1,\ldots,m-n^{\prime};\,j=1,\ldots,n^{\prime}\right)$. The number of nonzero elements in the $i$-th row of the matrix $\bm{\Lambda}$ cannot be less than $t$. Otherwise, by Eq. (114), $\bm{c}_{i}$ would be a linear combination of fewer than $t$ of the rows $\bm{c}_{u}$, and therefore there would exist $t$ linearly dependent rows of the matrix $\bm{C}$, which contradicts statement 1.

For the same reason, a nontrivial linear combination of $r$ $\left(r\leqslant m-n^{\prime}\right)$ rows of the matrix $\bm{\Lambda}$ cannot contain fewer than $t-r+1$ nonzero elements.

Form the following matrix of size $\left(m-n^{\prime}\right)\times m$ :

$\displaystyle\begin{Vmatrix}\lambda_{11}&\lambda_{12}&\ldots&\lambda_{1n^{\prime}}&1&0&\ldots&0\\ \lambda_{21}&\lambda_{22}&\ldots&\lambda_{2n^{\prime}}&0&1&\ldots&0\\ \vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\ddots&\vdots\\ \lambda_{\left(m-n^{\prime}\right)1}&\lambda_{\left(m-n^{\prime}\right)2}&\ldots&\lambda_{\left(m-n^{\prime}\right)n^{\prime}}&0&0&\ldots&1\\ \end{Vmatrix}.$ (115)

Using the properties of the matrix $\bm{\Lambda}$, we find that any nontrivial linear combination of rows of the matrix Eq. (115) contains at least $t+1$ nonzero elements. Since $n^{\prime}\leqslant n$, a matrix of the required size can be obtained from the matrix Eq. (115) by deleting any $n-n^{\prime}$ rows.

This completes the proof of the theorem.

We will say that we are using the geometric method of construction of hypercubes of strength $t$ when we construct them based on Theorem 3.3.1 or Theorem 3.3.2. In this case the $s^{m-l}$ points of the hypercube satisfy a system of type Eq. (4.3).

4.4 Alias sets of pencils of parallel flats

In this paragraph, we will concentrate on the nature of pencils of parallel flats in fractional geometric designs (Brodsky, 1972).

Consider the full symmetrical design $s^{m}$ (the design $\bm{D}^{f})$ and all pencils $P\left(a_{1},\ldots,a_{m}\right)$ of parallel flats

$\displaystyle a_{0}+a_{1}\chi_{1}+\ldots+a_{m}\chi_{m}=0.$ (116)

The nature of the contrasts generated by these pencils is defined by Theorem 3.2.1. Let $\bm{D}$ be the subset of $s^{m-l}$ points of the design $\bm{D}^{f}$ generated by $l$ independent equations Eq. (4.3).

The equations Eq. (4.3) are called the generating relations of the design $\bm{D}$. The pencils $P\left(a_{11},\ldots,a_{m1}\right),\ldots,P\left(a_{1l},\ldots,a_{ml}\right)$ are called generators of the design $\bm{D}$. Note that for a given design $\bm{D}$, the selection of the generators is not unique. The pencils

$\displaystyle P\left(\lambda_{1}a_{11}+\ldots+\lambda_{l}a_{1l},\ldots,\lambda_{1}a_{m1}+\ldots+\lambda_{l}a_{ml}\right)$ (117)

(where the $\lambda_{i}$ are not simultaneously equal to zero) are called defining pencils of the design $\bm{D}$. It is evident that we obtain a unique representation of the defining pencils Eq. (117) if we set the first nonzero coordinate to 1. Therefore, the total number of different defining pencils of the design $\bm{D}$ generated by Eq. (4.3) equals $\left(s^{l}-1\right)/(s-1)$.
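The count $\left(s^{l}-1\right)/(s-1)$ can be checked by direct enumeration. The following is a minimal sketch, assuming that $s$ is prime (so that arithmetic in $GF(s)$ is arithmetic modulo $s$); the function names and the two generators with $s=3$, $m=4$, $l=2$ are hypothetical and chosen only for illustration.

```python
from itertools import product

def normalize(vec, s):
    """Scale vec over GF(s), s prime, so that its first nonzero coordinate is 1."""
    for v in vec:
        if v % s:
            inv = pow(v, -1, s)  # modular inverse (Python 3.8+)
            return tuple((inv * x) % s for x in vec)
    return tuple(vec)  # the zero vector is returned unchanged

def defining_pencils(generators, s):
    """Normalized nonzero GF(s)-combinations of the generator coordinate vectors."""
    l, m = len(generators), len(generators[0])
    pencils = set()
    for lam in product(range(s), repeat=l):
        if not any(lam):
            continue  # skip the trivial combination
        combo = tuple(sum(lam[j] * generators[j][i] for j in range(l)) % s
                      for i in range(m))
        pencils.add(normalize(combo, s))
    return pencils

# Hypothetical example: two independent generators over GF(3).
generators = [(1, 1, 1, 0), (0, 1, 2, 1)]
pencils = defining_pencils(generators, 3)
print(sorted(pencils))
print(len(pencils) == (3**2 - 1) // (3 - 1))
```

Normalizing the first nonzero coordinate to 1 is what collapses the $s-1$ proportional coordinate vectors of a pencil into a single representative.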

For the design $\bm{D}$, consider the vector $\bm{\xi}$ whose coordinates are equal to the corresponding coordinates of the contrast $\bm{\xi}^{f}$ generated by the pencil $P\left(a_{1},\ldots,a_{m}\right)$ in $\bm{D}^{f}$. In this case we will say that the contrast $\bm{\xi}^{f}$ in $\bm{D}^{f}$ generates $\bm{\xi}$ in $\bm{D}$. If $P\left(a_{1},\ldots,a_{m}\right)$ is a defining pencil of the design $\bm{D}$, then all points of $\bm{D}$ belong to one of the flats of Eq. (116) (for $a_{0}=0$) and no points of $\bm{D}$ belong to the other flats of Eq. (116) (for $a_{0}\neq 0$). In this case $\bm{\xi}^{f}$ generates the zero vector $\bm{0}$ in $\bm{D}$. It is evident that the vector $\bm{0}$ is generated by those and only those contrasts $\bm{\xi}^{f}$ that correspond to defining pencils of the design $\bm{D}$. In this case we shall say that the vector $\bm{0}$ and the vectors generated by defining pencils of the design $\bm{D}$ belong to the same alias set (for the design $\bm{D}$). We shall also say that the defining pencils of the design $\bm{D}$ belong to the same alias set.

If $P\left(a_{1},\ldots,a_{m}\right)$ is not a defining pencil of the design $\bm{D}$, each flat of the pencil Eq. (116) intersects $\bm{D}$ in the $s^{m-l-1}$ points of an $(m-l-1)$-flat, which we denote by $P\left(a_{0},a_{1},\ldots,a_{m}\right)$. In this case each point of the design $\bm{D}$ belongs to one and only one flat of the pencil Eq. (116) and, therefore, to one and only one flat $P\left(a_{0},a_{1},\ldots,a_{m}\right)$. Hence, a pencil of parallel flats in $\bm{D}^{f}$ generates a pencil of parallel $(m-l-1)$-flats $P\left(a_{0},a_{1},\ldots,a_{m}\right)$ in the design $\bm{D}$, which we denote by $P^{\prime}\left(a_{1},\ldots,a_{m}\right)$.

Consider the flat

$\displaystyle a_{0}+\left(\lambda_{0}a_{1}+\lambda_{1}a_{11}+\ldots+\lambda_{l}a_{1l}\right)\chi_{1}+\ldots+\left(\lambda_{0}a_{m}+\lambda_{1}a_{m1}+\ldots+\lambda_{l}a_{ml}\right)\chi_{m}=0\quad\left(\lambda_{0}\neq 0\right)$ (118)

of the pencil

$\displaystyle P\left(\lambda_{0}a_{1}+\lambda_{1}a_{11}+\ldots+\lambda_{l}a_{1l},\ldots,\lambda_{0}a_{m}+\lambda_{1}a_{m1}+\ldots+\lambda_{l}a_{ml}\right).$ (119)

It is evident that the flat Eq. (118) intersects Eq. (4.3) in the same points as Eq. (116). Besides, all flats intersecting Eq. (4.3) in the same points as Eq. (116) are represented by Eq. (118).

Since the pencils $P\left(a_{1},\ldots,a_{m}\right),P\left(a_{11},\ldots,a_{m1}\right),\ldots,P\left(a_{1l},\ldots,a_{ml}\right)$ are linearly independent, different sets Eq. (119) correspond to different pencils. Therefore, the total number of different pencils of type Eq. (119) equals $s^{l}$. Hence, for any pencil $P\left(a_{1},\ldots,a_{m}\right)$ that is not a defining pencil of the design $\bm{D}$ there exist $s^{l}$ pencils, including $P\left(a_{1},\ldots,a_{m}\right)$, that generate the same pencil $P^{\prime}\left(a_{1},\ldots,a_{m}\right)$ in the design $\bm{D}$. We shall say that such $s^{l}$ pencils belong to the same alias set of the design $\bm{D}$. We shall also say that the contrasts generated by these pencils belong to one alias set.

The total number of pencils in $\bm{D}^{f}$ equals $\left(s^{m}-1\right)/(s-1)$. The alias set of defining pencils of the design $\bm{D}$ consists of $\left(s^{l}-1\right)/(s-1)$ pencils. Therefore, the number of different alias sets of nondefining pencils equals

$\displaystyle\frac{\left(s^{m}-1\right)/(s-1)-\left(s^{l}-1\right)/(s-1)}{s^{l}}=\frac{s^{m-l}-1}{s-1}.$

Consider two different pencils $P^{\prime}\left(a_{1},\ldots,a_{m}\right)$ and $P^{\prime}\left(g_{1},\ldots,g_{m}\right)$ that are generated by pencils of two different alias sets for the design $\bm{D}$ . The rows of the matrix

$\displaystyle\begin{Vmatrix}a_{1}&\ldots&a_{m}\\ g_{1}&\ldots&g_{m}\\ a_{11}&\ldots&a_{m1}\\ \vdots&\ddots&\vdots\\ a_{1l}&\ldots&a_{ml}\\ \end{Vmatrix}$

are linearly independent. Hence, any given $(m-l-1)$-flat of the pencil $P^{\prime}\left(a_{1},\ldots,a_{m}\right)$ intersects the $s$ different $(m-l-1)$-flats of the pencil $P^{\prime}\left(g_{1},\ldots,g_{m}\right)$ in $s$ different $(m-l-2)$-flats. Any two of these $(m-l-2)$-flats have no point in common, because two different $(m-l-1)$-flats of the pencil $P^{\prime}\left(g_{1},\ldots,g_{m}\right)$ have no point in common. Any $(m-l-2)$-flat contains exactly $s^{m-l-2}$ points. Therefore, each $(m-l-1)$-flat of the pencil $P^{\prime}\left(g_{1},\ldots,g_{m}\right)$ contains exactly $s^{m-l-2}$ of the $s^{m-l-1}$ points belonging to the given $(m-l-1)$-flat of the pencil $P^{\prime}\left(a_{1},\ldots,a_{m}\right)$. Therefore, by Lemma 3.1.1, the degrees of freedom carried by the pencil $P^{\prime}\left(a_{1},\ldots,a_{m}\right)$ are orthogonal to the degrees of freedom carried by the pencil $P^{\prime}\left(g_{1},\ldots,g_{m}\right)$. Thus, the following theorem has been proved.

Theorem 3.4.1. All $\left(s^{m}-1\right)/(s-1)$ pencils of parallel flats in $\bm{D}^{f}$ are split into $\left(s^{m-l}-1\right)/(s-1)$ alias sets with $s^{l}$ pencils in each and one alias set with $\left(s^{l}-1\right)/(s-1)$ defining pencils. The pencils belonging to the same alias set generate identical pencils of parallel flats in the design $\bm{D}$ . The pencils from different alias sets generate pencils of parallel flats in the design $\bm{D}$ with orthogonal degrees of freedom.
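The partition described in Theorem 3.4.1 can be reproduced numerically for a small case. The sketch below again assumes a prime $s$ and uses hypothetical generators with $s=3$, $m=4$, $l=2$; it groups every pencil of $\bm{D}^{f}$ with all pencils whose coordinates differ from a multiple of its own by a combination of the generators, and then tallies the sizes of the resulting alias sets.

```python
from collections import Counter
from itertools import product

def normalize(vec, s):
    """Scale vec over GF(s), s prime, so that its first nonzero coordinate is 1."""
    for v in vec:
        if v % s:
            inv = pow(v, -1, s)  # modular inverse (Python 3.8+)
            return tuple((inv * x) % s for x in vec)
    return tuple(vec)

def alias_sets(generators, s):
    """Partition the pencils of the full design s^m into alias sets."""
    l, m = len(generators), len(generators[0])
    # All s**l combinations of the generators, including the zero vector.
    span = [tuple(sum(lam[j] * generators[j][i] for j in range(l)) % s
                  for i in range(m))
            for lam in product(range(s), repeat=l)]
    sets = set()
    for a in product(range(s), repeat=m):
        if not any(a):
            continue  # the zero vector is not a pencil
        shifted = (tuple((x + y) % s for x, y in zip(a, g)) for g in span)
        members = frozenset(normalize(v, s) for v in shifted if any(v))
        sets.add(members)
    return sets

# Hypothetical example over GF(3): 40 pencils in total.
sizes = Counter(len(members)
                for members in alias_sets([(1, 1, 1, 0), (0, 1, 2, 1)], 3))
print(sizes)  # four alias sets of 9 pencils and one defining set of 4 pencils
```

Here $\left(s^{m-l}-1\right)/(s-1)=4$ alias sets contain $s^{l}=9$ pencils each, and the defining alias set contains $\left(s^{l}-1\right)/(s-1)=4$ pencils, which accounts for all $40$ pencils.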

It follows from the proof of Theorem 3.3.1 that if the design contains all combinations of levels of $t$ factors $F_{i_{1}},\ldots,F_{i_{t}}$, no defining pencil has all coordinates other than $a_{i_{1}},\ldots,a_{i_{t}}$ simultaneously equal to zero. A pencil that has some of the coordinates $a_{i_{1}},\ldots,a_{i_{t}}$ (namely, $a_{j_{1}},\ldots,a_{j_{t}}$) not equal to zero and the remaining coordinates equal to zero cannot be a defining pencil and cannot be in the same alias set with another pencil that has all coordinates other than $a_{i_{1}},\ldots,a_{i_{t}}$ simultaneously equal to zero.

Therefore, if the design $\bm{D}$ contains all levels of the factor $F_{i}$ , the pencil $P_{i}$ corresponding to main effects of this factor in $\bm{D}^{f}$ cannot be a defining pencil of the design $\bm{D}$ . Hence, the pencil $P_{i}^{\prime}$ in $\bm{D}$ generated by the pencil $P_{i}$ forms $s-1$ contrasts orthogonal to the vector $\bm{I}$ , with coordinates that depend only on levels of the factor $F_{i}$ . Therefore, the pencil $P_{i}^{\prime}$ also forms a full set of main effects of the factor $F_{i}$ in the design $\bm{D}$ .

If the design $\bm{D}$ contains all combinations of levels of two factors $F_{i}$ and $F_{j}$, any pencil $P_{ij}$ corresponding to interaction effects of these factors in $\bm{D}^{f}$ cannot be a defining pencil of the design $\bm{D}$. Hence, the pencil $P_{ij}^{\prime}$ in $\bm{D}$ generated by the pencil $P_{ij}$ forms $s-1$ contrasts orthogonal to the vector $\bm{I}$ and to all main effects of the factors $F_{i}$ and $F_{j}$ (because the pencils corresponding to main effects of the factors $F_{i}$ and $F_{j}$ cannot be in the same alias set as the pencil $P_{ij}$). Since the pencil $P_{ij}^{\prime}$ forms contrasts with coordinates that depend only on levels of the factors $F_{i}$ and $F_{j}$, these contrasts are interaction effects of the factors $F_{i}$ and $F_{j}$. Contrasts corresponding to all pencils $P_{ij}$ form a full set of $(s-1)^{2}$ interaction effects of the factors $F_{i}$ and $F_{j}$.

Continuing this reasoning by induction, we get the following theorem.

Theorem 3.4.2. If the design contains all combinations of levels of the factors $F_{i_{1}},\ldots,F_{i_{t}}$ or (which is the same) no defining pencil has all coordinates other than $a_{i_{1}},\ldots,a_{i_{t}}$ simultaneously equal to zero, all pencils corresponding to interaction effects of the factors $F_{i_{1}},\ldots,F_{i_{t}}$ in $\bm{D}^{f}$ generate pencils of parallel flats in the design $\bm{D}$ corresponding to a full set of interaction effects of these factors in the design $\bm{D}$ .

We will get independent effects if we select not more than one pencil from each alias set.

4.5 Defining relation

Consider a geometric design $\bm{D}$ generated by $l$ independent equations Eq. (4.3). We will call the following relationship a defining relation of the design $\bm{D}$:

$\displaystyle\begin{aligned}0&=a_{11}\chi_{1}+\ldots+a_{m1}\chi_{m}=a_{12}\chi_{1}+\ldots+a_{m2}\chi_{m}=\ldots=a_{1l}\chi_{1}+\ldots+a_{ml}\chi_{m}\\ &=\left(a_{11}+a_{12}\right)\chi_{1}+\ldots+\left(a_{m1}+a_{m2}\right)\chi_{m}=\ldots=\left[a_{11}+(s-1)a_{12}\right]\chi_{1}+\ldots+\left[a_{m1}+(s-1)a_{m2}\right]\chi_{m}=\ldots\\ &=\left[a_{11}+\ldots+a_{1l}\right]\chi_{1}+\ldots+\left[a_{m1}+\ldots+a_{ml}\right]\chi_{m}=\ldots\\ &=\left[a_{11}+(s-1)a_{12}+\ldots+(s-1)a_{1l}\right]\chi_{1}+\ldots+\left[a_{m1}+(s-1)a_{m2}+\ldots+(s-1)a_{ml}\right]\chi_{m}.\end{aligned}$ (120)

The coefficients in Eq. (120) coincide with the coordinates of the defining pencils. A so-called standard defining relation is derived from the defining relation Eq. (120) by multiplying each of its sides by an element $\lambda\in GF(s)$ chosen so that the first nonzero coefficient of that side equals 1.

By Theorem 3.3.1, the geometric design $\bm{D}$ is a hypercube of strength $t$ if and only if no side of the defining relation Eq. (120) contains fewer than $t+1$ nonzero coefficients.

4.6 Two-level designs

Any geometric two-level design $\bm{D}$ (like any geometric design) is uniform, i.e., any level of any factor occurs in the design exactly $N/2$ times ($N$ is the number of treatment combinations in the design). Hence, the Chebyshev model will be the same as the $A^{\Omega}$-model of true effects. Then a full factorial model is

$\displaystyle Ey=b_{0}+b_{1}x_{1}+\ldots+b_{m}x_{m}+b_{12}x_{1}x_{2}+\ldots+b_{1\ldots m}x_{1}\ldots x_{m},$ (121)

where $x_{i}=1$ for one of the two levels of the factor and $x_{i}=-1$ for the other level.

Consider two matrices of the design $\bm{D}$ : $\bm{D}_{F}=\left\{\chi_{iu}\right\}\left[\chi_{iu}\in GF(2)\right]$ and $\bm{D}=\left\{x_{iu}\right\}$ , where $\chi_{iu}=0$ , $x_{iu}=1$ if the factor $F_{i}$ occurs in the $u$ -th treatment of the design at level 0, and $\chi_{iu}=1$ , $x_{iu}=-1$ if the factor $F_{i}$ occurs in the $u$ -th treatment of the design at level 1. Then, obviously, the following two equalities are equivalent:

$\displaystyle\chi_{i_{1}u}+\ldots+\chi_{i_{r}u}=0,$ $\displaystyle x_{i_{1}u}\ldots x_{i_{r}u}=1.$
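The equivalence can be verified in one step. As a sketch, note that the coding above amounts to $x_{iu}=(-1)^{\chi_{iu}}$ when $\chi_{iu}$ is read as the integer 0 or 1, so that

$\displaystyle x_{i_{1}u}\ldots x_{i_{r}u}=(-1)^{\chi_{i_{1}u}+\ldots+\chi_{i_{r}u}}.$

The product therefore equals 1 exactly when the integer sum $\chi_{i_{1}u}+\ldots+\chi_{i_{r}u}$ is even, that is, when the sum is 0 in $GF(2)$.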

Therefore, the system of the generating relations Eq. (4.3) for the geometric two-level design $\bm{D}$ is transformed as follows:

$\displaystyle\begin{aligned}x_{1}^{a_{11}}\ldots x_{m}^{a_{m1}}&=1,\\ &\;\;\vdots\\ x_{1}^{a_{1l}}\ldots x_{m}^{a_{ml}}&=1,\end{aligned}$ (122)

where $a_{ij}\in GF(2)$ ( $a_{ij}=0$ or 1), $l=m-k$ .

The system Eq. (4.6) in accordance with this section corresponds to a subset of $2^{k}$ points $(k=m-l)$ of the full design $2^{m}$ .

An expression of the form $x_{1}^{a_{1}}\ldots x_{m}^{a_{m}}$ ($a_{1},\ldots,a_{m}=0$ or $1$) is called an interaction (as opposed to an interaction effect), or an $r$-letter interaction if exactly $r$ of the numbers $a_{i}$ $(i=1,\ldots,m)$ equal 1. We will use the concepts of generating, defining, and independent interactions by analogy with the concepts of generating, defining, and independent flats, i.e., equalities of type Eq. (4.3). Generating interactions will also be called generators.

An elementary transformation of a set of interactions is multiplication of one of the interactions of the set by other interactions of the set.

Let $\bm{X}^{f}$ be the coefficient matrix of the full design $2^{m}$ (the design $\bm{D}^{f})$ for a full factorial model Eq. (121). As a simple consequence of Theorem 1.4.1 and Note 1 to the Theorem 1.4.1 we get the following theorem.

Theorem 3.6.1. The matrix $\bm{X}^{f}$ is a square matrix with elements $+1$ and $-1$; all columns of the matrix $\bm{X}^{f}$ are pairwise orthogonal; the values $x_{i}$ and $x_{i_{1}}\ldots x_{i_{r}}$ at the points of the design $\bm{D}^{f}$ form the vector of the main effect of the factor $F_{i}$ and the vector of the interaction effect of the factors $F_{i_{1}},\ldots,F_{i_{r}}$, respectively.

Sometimes, we will use the notations $+$ and $-$ in the design matrix and in the coefficient matrix instead of $+1$ and $-1$, respectively.

Consider the matrix of the full design $2^{m}$ (the design $\bm{D}^{f})$ and its submatrix $\bm{D}=\left\{x_{iu}\right\}$ defined by the following generating relations:

$\displaystyle 1=R_{1},\ldots,1=R_{l},$ (123)

where $R_{1},\ldots,R_{l}$ are $l$ independent interactions $\left(l<m\right)$ .

In the matrix $\bm{X}^{f}$ select rows corresponding to the design $\bm{D}$ . Denote the resulting matrix by $\tilde{\bm{X}}$ . Denote by $\bm{X}$ the matrix composed of one representative from each set of identical columns of the matrix $\tilde{\bm{X}}$ .

Theorem 3.6.2. For the design $\bm{D}$ defined by the generating relations Eq. (123) the following statements hold:

1. The matrix $\tilde{\bm{X}}$ has size $2^{m-l}\times 2^{m}$; the $2^{m}$ columns of the matrix $\tilde{\bm{X}}$ are split into $2^{m-l}$ alias sets such that the columns within each alias set are identical and columns from different alias sets are orthogonal.

2. There exist $m-l$ columns of the design $\bm{D}$ that form the full design $2^{m-l}$ (the design $\bm{D}_{l}^{f}$). No subset of these $m-l$ columns has a product (in the sense of Definition 1.4.1) that corresponds to a defining interaction.

3. The matrix $\bm{X}$ is identical to the coefficient matrix of the design $\bm{D}_{l}^{f}$ for the full factorial model.

Proof. Statement 1 of the theorem follows from Theorem 3.4.1. We will prove statements 2 and 3 by induction. Let $l=1$. Assume that $\bm{D}_{1}$ and $\bm{X}_{1}$ are the matrices that contain those rows of the matrices $\bm{D}^{f}$ and $\bm{X}^{f}$, respectively, that satisfy the first generating relation $1=R_{1}$.

If $x_{i}$ belongs to the interaction $R_{1}$, delete the column corresponding to $x_{i}$ from the $m$ columns of $\bm{D}_{1}$. Then the remaining $m-1$ columns form the full design $2^{m-1}$ (the design $\bm{D}_{1}^{f}$), because there are no two identical rows in $\bm{D}_{1}^{f}$. Indeed, if two such rows exist (say, the $i$-th and the $j$-th rows), the $i$-th and the $j$-th elements of the deleted column have different signs (otherwise we get two identical rows in $\bm{D}^{f}$, which is a contradiction). Therefore, the $i$-th and the $j$-th elements of the column corresponding to the interaction $R_{1}$ have different signs (which is also a contradiction).

Statement 3 of the theorem for $l=1$ is obvious, because $\bm{X}_{1}$ contains a unit column, all columns of the full design $2^{m-1}$ , and all possible products of $2,\ldots,m-1$ columns of the full design $2^{m-1}$ .

Now assume that the theorem is valid for $l=n$ for the design $\bm{D}_{n}$ defined by generating relations

$\displaystyle 1=R_{1},\ldots,1=R_{n}.$

We will prove that the theorem is valid for $l=n+1$ for the design $\bm{D}_{n+1}$ defined by generating relations

$\displaystyle 1=R_{1},\ldots,1=R_{n},1=R_{n+1}.$

Let

$\displaystyle Q_{1},Q_{2},\ldots$ (124)

be the defining interactions of the design $\bm{D}_{n}$ . Then it is evident that

$\displaystyle Q_{1},Q_{2},\ldots,R_{n+1},Q_{1}R_{n+1},Q_{2}R_{n+1},\ldots$ (125)

are the defining interactions of the design $\bm{D}_{n+1}$. By the induction hypothesis, among the selected $m-n$ columns of the design $\bm{D}_{n}$ forming the full design $\bm{D}_{n}^{f}$ there are no columns whose product produces one of the defining interactions Eq. (124). Therefore, there are no columns among them whose products produce two or more of the defining interactions Eq. (125). Indeed, if two such interactions exist, they are interactions of type $Q_{i}R_{n+1}$ and $Q_{j}R_{n+1}$. Their product is an interaction of type $Q_{q}$, which is a contradiction. Now select the columns forming the full design $\bm{D}_{n}^{f}$ and consider only those columns whose product forms an interaction from Eq. (125). Delete any one of these columns. The remaining $m-n-1$ columns obviously form the full design $2^{m-n-1}$ (the design $\bm{D}_{n+1}^{f}$). Among them, there are no columns whose product forms an interaction of the defining relation Eq. (125).

This completes the proof of the theorem.

By Theorem 3.6.2, the columns of the matrix $\tilde{\bm{X}}$ split into $2^{m-l}$ alias sets; each alias set has identical columns; columns from different alias sets are orthogonal. It is evident that the alias set containing the interaction $S$ can be found by multiplying all interactions of the defining relation

$\displaystyle 1=R_{1}=R_{2}=R_{1}R_{2}=\ldots=R_{1}R_{2}\ldots R_{l}$

by $S$ . Therefore, the alias set that contains the interaction $S$ is

$\displaystyle S,R_{1}S,R_{2}S,R_{1}R_{2}S,\ldots,R_{1}R_{2}\ldots R_{l}S.$ (126)
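The rule Eq. (126) is easy to mechanize. In the minimal sketch below, an interaction is represented (as an assumption of this illustration) by the set of indices of the letters it contains, so that the product of two interactions is the symmetric difference of their index sets, because $x_{i}^{2}=1$. The generators $R_{1}=x_{1}x_{2}x_{4}$ and $R_{2}=x_{1}x_{3}x_{5}$ of a design $2^{5}//2^{3}$ are hypothetical.

```python
from itertools import combinations

def multiply(p, q):
    """Product of two interactions; repeated letters cancel since x_i**2 = 1."""
    return p ^ q  # symmetric difference of frozensets

def defining_interactions(generators):
    """All products of nonempty subsets of the generators."""
    result = []
    for r in range(1, len(generators) + 1):
        for subset in combinations(generators, r):
            word = frozenset()
            for g in subset:
                word = multiply(word, g)
            result.append(word)
    return result

def alias_set(s, generators):
    """The interaction s together with its products with every defining interaction."""
    return [s] + [multiply(s, d) for d in defining_interactions(generators)]

# Hypothetical generators R1 = x1 x2 x4 and R2 = x1 x3 x5.
generators = [frozenset({1, 2, 4}), frozenset({1, 3, 5})]
for word in alias_set(frozenset({1}), generators):
    print("".join(f"x{i}" for i in sorted(word)))
```

For the main effect $x_{1}$ this lists $x_{1}$, $x_{2}x_{4}$, $x_{3}x_{5}$, and $x_{1}x_{2}x_{3}x_{4}x_{5}$, that is, $2^{l}=4$ aliased interactions.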

We cannot find unique LS estimates of the parameters of the full factorial model Eq. (121) for the fractional design $2^{m-l}$ (the design $\bm{D}$), because the coefficient matrix $\tilde{\bm{X}}$ obviously contains identical columns and, therefore, the matrix $\tilde{\bm{X}}^{T}\tilde{\bm{X}}$ is singular. However, if the model contains only one interaction from each alias set, we can find unique LS estimates. Indeed, in this case the matrix $\bm{X}$ with orthogonal columns is the coefficient matrix of the design $\bm{D}$. The same is valid for a model that has not more than one representative from each alias set of interactions. Hence, the LS estimate $\hat{\bm{B}}$ of the vector of parameters of the model is

$\displaystyle\hat{\bm{B}}=\frac{1}{2^{m-l}}\bm{X}^{T}\bm{y}.$ (127)

Assume that we are using the design $\bm{D}$ for the postulated model

$\displaystyle E\bm{y}=\bm{XB}$

that contains one interaction for each alias set, but the real model

$\displaystyle E\bm{y}=\bm{XB}+\bm{X}_{0}\bm{B}_{0}$

is the full factorial model Eq. (121) (the matrix $\bm{X}_{0}$ is derived from the matrix $\tilde{\bm{X}}$ by deleting the columns included in $\bm{X}$). Then the LS estimates Eq. (127) are biased. By Eq. (92),

$\displaystyle E\hat{\bm{B}}=\bm{B}+\bm{A}\bm{B}_{0},$ (128)

where $\bm{A}=\left(\bm{X}^{T}\bm{X}\right)^{-1}\bm{X}^{T}\bm{X}_{0}$ is the bias matrix.

Put together the identical columns in $\bm{X}_{0}$ . Then the matrix $\bm{X}^{T}\bm{X}_{0}$ of size $2^{m-l}\times 2^{m-l}(2^{l}-1)$ is

$\displaystyle\bm{X}^{T}\bm{X}_{0}=\begin{Vmatrix}N&\ldots&N&0&\ldots&0&\ldots&0&\ldots&0\\ 0&\ldots&0&N&\ldots&N&\ldots&0&\ldots&0\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots&\ddots&\vdots&\ddots&\vdots\\ 0&\ldots&0&0&\ldots&0&\ldots&N&\ldots&N\\ \end{Vmatrix},$

where each row contains $2^{l}-1$ elements equal to $N$. Hence, the bias matrix is:

$\displaystyle\begin{Vmatrix}1&\ldots&1&0&\ldots&0&\ldots&0&\ldots&0\\ 0&\ldots&0&1&\ldots&1&\ldots&0&\ldots&0\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots&\ddots&\vdots&\ddots&\vdots\\ 0&\ldots&0&0&\ldots&0&\ldots&1&\ldots&1\\ \end{Vmatrix}.$

Using Eq. (128), we get the system of scalar equalities. Considering any of them, we get the following theorem.

Theorem 3.6.3. Suppose that we are using the design $\bm{D}$ to estimate coefficients of the postulated model that contains one interaction for each alias set, but the real model is the full factorial model Eq. (121). Then the estimate Eq. (127) of the coefficient corresponding to the interaction $S$ is biased. The bias is equal to the sum of effects corresponding to all interactions (excluding $S$ ) that belong to the alias set containing $S$ .

Hence, any estimate Eq. (127) is an unbiased estimate of the sum of effects corresponding to the interactions from one alias set. Such effects are called confounded.
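Theorem 3.6.3 can be illustrated on the smallest nontrivial case. The sketch below is an assumption-laden toy example: it takes the half-fraction of the design $2^{3}$ with the hypothetical generating relation $1=x_{1}x_{2}x_{3}$, invents noise-free coefficients for the full factorial model, and applies the estimate Eq. (127) to the interactions retained in the postulated model. Factor indices are zero-based in the code.

```python
from itertools import product
from math import prod

def interaction_value(point, letters):
    """Value of the interaction with the given (0-based) letters at a design point."""
    return prod(point[i] for i in letters)  # the empty product is 1

# Half-fraction of 2^3 defined by 1 = x1 x2 x3 (hypothetical example).
design = [p for p in product((1, -1), repeat=3) if p[0] * p[1] * p[2] == 1]

# Invented coefficients of the full factorial model, keyed by letter tuples.
true_b = {(): 10.0, (0,): 3.0, (1,): -2.0, (2,): 0.5,
          (0, 1): 1.5, (0, 2): -1.0, (1, 2): 4.0, (0, 1, 2): 0.25}

# Noise-free responses at the points of the fraction.
y = [sum(b * interaction_value(p, t) for t, b in true_b.items()) for p in design]

def estimate(letters):
    """Estimate of one coefficient: column times response, divided by N."""
    n_points = len(design)
    return sum(interaction_value(p, letters) * yu
               for p, yu in zip(design, y)) / n_points

print(estimate((0,)), true_b[(0,)] + true_b[(1, 2)])   # x1 is aliased with x2 x3
print(estimate(()), true_b[()] + true_b[(0, 1, 2)])    # 1 is aliased with x1 x2 x3
```

In each printed pair the two numbers agree: the estimate of the coefficient of $x_{1}$ recovers $b_{1}+b_{23}$ rather than $b_{1}$, so the bias equals the effect of the aliased interaction $x_{2}x_{3}$.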

In this paragraph we will discuss estimates of effects for a geometric design $\bm{D}$ assuming that the model contains not more than one interaction from each alias set.

Now we will focus on a construction technique for a family of geometric designs introduced by Box and Hunter (1961) and will give a series of theorems based on their ideas.

Consider $l$ generators of the geometric design $2^{m}//2^{m-l}$ . They can be treated as $l$ independent interactions. Change signs of some of them. It is evident that the resulting interactions are also independent. They correspond to a geometric design that is different from the initial design. There are $2^{l}$ ways of allocating signs plus and minus to $l$ generators. All corresponding designs are said to belong to the same family.

Definition 3.6.1. Generators of one of the designs of the family are called principal generators if they have only positive signs; the corresponding defining relation is called a principal defining relation; the corresponding design is called a principal design of the family.

The whole set of defining relations of the same family can be represented by the following formal relation:

$\displaystyle 1=\left(1\pm R_{1}\right)\ldots\left(1\pm R_{l}\right),$

where $R_{1},\ldots,R_{l}$ are the principal generators.

Lemma 3.6.1. If the interactions

$\displaystyle R_{1},\ldots,R_{n},R_{n+1},\ldots,R_{l}$ (129)

are independent, the interactions

$\displaystyle R_{1},R_{1}R_{2},\ldots,R_{1}R_{n},R_{n+1},\ldots,R_{l}$ (130)

are also independent.

Proof. Suppose, on the contrary, that the following relationship is valid:

$\displaystyle\pm 1=R_{n+i_{1}}\ldots R_{n+i_{p}}R_{1}R_{j_{1}}\ldots R_{1}R_{j_{r}}\bar{R}_{1},$

where $0<i_{q}\leqslant l-n$, $0<j_{v}\leqslant n$, $q=1,\ldots,p$, $v=1,\ldots,r$, and $\bar{R}_{1}$ equals either $R_{1}$ or 1.

Then we get

$\displaystyle\pm 1=R_{j_{1}}\ldots R_{j_{r}}R_{n+i_{1}}\ldots R_{n+i_{p}}\bar{R}_{1}.$

Since the interactions Eq. (129) are independent, we have arrived at a contradiction. This proves the lemma.

Lemma 3.6.1 can be reformulated as follows.

Lemma 3.6.2. If Eq. (129) are generators of the design $2^{m}//2^{m-l}$ , Eq. (130) are also generators of the design.

Lemma 3.6.3. For two designs of the same family, we can select generators in such a way that all of them are pairwise identical except one pair with the generators that have different signs.

Proof. Consider two designs $2^{m}//2^{m-l}$ of the same family. The first design has the generators Eq. (129). The generators of the second design are

$\displaystyle-R_{1},-R_{2},\ldots,-R_{n},R_{n+1},\ldots,R_{l}.$ (131)

By Lemma 3.6.2, the interactions Eq. (130) are the generators of the first design, and the interactions

$\displaystyle-R_{1},R_{1}R_{2},\ldots,R_{1}R_{n},R_{n+1},\ldots,R_{l}$ (132)

are the generators of the second design, which was to be proved.

Definition 3.6.2. The design that contains all treatments of the designs $\bm{D}_{1},\ldots,\bm{D}_{r}$ is called an aggregated design.

Note that an aggregated design is not the same as a union in set theory. For example, if each of two designs contains the same treatment combination, the aggregated design includes this treatment twice.

Theorem 3.6.4. The aggregated design of two geometric designs $2^{m}//2^{m-l}$ with the generators Eqs (129) and (131) is also a geometric design $2^{m}//2^{m-l+1}$ with the generators

$\displaystyle R_{1}R_{2},\ldots,R_{1}R_{n},R_{n+1},\ldots,R_{l}.$

Proof. Since the interactions Eqs (130) and (132) are generators of the first and the second designs respectively, the aggregated design satisfies the relations

$\displaystyle 1=R_{1}R_{2},\ldots,1=R_{1}R_{n},1=R_{n+1},\ldots,1=R_{l}.$ (133)

It is evident that the interactions of Eq. (133) are independent. Since the number of interactions in Eq. (133) is $l-1$ , these interactions are the generators of the aggregated design.

Thus, the proof is complete.
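Theorem 3.6.4 can be checked by brute force on a small family. The sketch below is only an illustration: it uses the hypothetical principal generators $R_{1}=x_{1}x_{2}x_{4}$ and $R_{2}=x_{1}x_{3}x_{5}$ of a design $2^{5}//2^{3}$ (zero-based indices in the code), takes the member of the family with both signs reversed (the case $n=l=2$), and compares the aggregated design with the design generated by $R_{1}R_{2}$ alone.

```python
from itertools import product
from math import prod

def fraction(m, signed_generators):
    """Points of {+1, -1}^m at which every interaction takes its prescribed sign.

    Each generator is a pair (sign, letters) with 0-based letter indices.
    """
    return [p for p in product((1, -1), repeat=m)
            if all(prod(p[i] for i in letters) == sign
                   for sign, letters in signed_generators)]

R1, R2 = (0, 1, 3), (0, 2, 4)                  # x1 x2 x4 and x1 x3 x5
d1 = fraction(5, [(1, R1), (1, R2)])           # principal design of the family
d2 = fraction(5, [(-1, R1), (-1, R2)])         # both generator signs reversed
aggregated = d1 + d2                           # concatenation keeps repeats

R1R2 = tuple(sorted(set(R1) ^ set(R2)))        # product R1 R2 = x2 x3 x4 x5
expected = fraction(5, [(1, R1R2)])            # design 2^5 // 2^4

print(len(d1), len(d2), len(aggregated))
print(sorted(aggregated) == sorted(expected))
```

The list concatenation mirrors the remark after Definition 3.6.2: a treatment occurring in both designs would be kept twice, although in this example the two fractions are disjoint.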

Since defining relations of two geometric designs of the same family differ only by signs, each alias set of interactions of one design corresponds to some alias set of the other design with interactions that differ only by signs.

Theorem 3.6.5. Let $\bm{D}_{1}$ and $\bm{D}_{2}$ be two geometric designs $2^{m}//2^{m-l}$ of the same family. Then the $2^{m-l+1}$ estimates of the aggregated design $\bm{D}$ are half-sums and half-differences of the $2^{m-l}$ pairs of unbiased estimates of sums of effects in corresponding alias sets of the designs $\bm{D}_{1}$ and $\bm{D}_{2}$.

Proof. Since the defining relation contains all possible products of the generators, a half of the interactions (including 1) of the defining relation of the design $\bm{D}_{1}$ are, by Lemma 3.6.2, identical to a half of the interactions of the defining relation of the design $\bm{D}_{2}$; the other interactions of the defining relation of the design $\bm{D}_{1}$ differ in sign from the corresponding interactions of the defining relation of the design $\bm{D}_{2}$. Therefore, in any pair of corresponding alias sets, a half of the interactions ($T_{1},\ldots,T_{2^{l-1}}$ in $\bm{D}_{1}$ and $\bm{D}_{2}$) are identical, and a half of the interactions ($T_{2^{l-1}+1},\ldots,T_{2^{l}}$ in $\bm{D}_{1}$ and $-T_{2^{l-1}+1},\ldots,-T_{2^{l}}$ in $\bm{D}_{2}$) differ in sign. It is evident that the interactions $T_{1},\ldots,T_{2^{l-1}}$ belong to one alias set in $\bm{D}$, and the interactions $T_{2^{l-1}+1},\ldots,T_{2^{l}}$ belong to another alias set. Denote by $\bm{S}_{1}$ the column of the coefficient matrix of the design $\bm{D}_{1}$ corresponding to the interactions $T_{1},\ldots,T_{2^{l-1}},T_{2^{l-1}+1},\ldots,T_{2^{l}}$. Denote by $\bm{S}_{2}$ the column of the coefficient matrix of the design $\bm{D}_{2}$ corresponding to the interactions $T_{1},\ldots,T_{2^{l-1}},-T_{2^{l-1}+1},\ldots,-T_{2^{l}}$. Then the column of the coefficient matrix of the design $\bm{D}$ corresponding to the interactions $T_{1},\ldots,T_{2^{l-1}}$ is

$\displaystyle\bm{S}^{\prime}=\begin{Vmatrix}\bm{S}_{1}\\ \bm{S}_{2}\\ \end{Vmatrix}.$

The column of the coefficient matrix of the design $\bm{D}$ corresponding to the interactions $T_{2^{l-1}+1},\ldots,T_{2^{l}}$ is

$\displaystyle\bm{S}^{\prime\prime}=\begin{Vmatrix}\bm{S}_{1}\\ -\bm{S}_{2}\\ \end{Vmatrix}.$

The estimate corresponding to the alias set $T_{1},\ldots,T_{2^{l-1}}$ in $\bm{D}$ is

$\displaystyle\frac{1}{2^{m-l+1}}\bm{y}_{D}^{T}\bm{S}^{\prime}=\frac{1}{2}\left(\frac{1}{2^{m-l}}\bm{y}_{D_{1}}^{T}\bm{S}_{1}+\frac{1}{2^{m-l}}\bm{y}_{D_{2}}^{T}\bm{S}_{2}\right),$

where $\bm{y}_{D},\bm{y}_{D_{1}}$ , and $\bm{y}_{D_{2}}$ are vector-columns of observations in the designs $\bm{D},\bm{D}_{1}$ , and $\bm{D}_{2}$ respectively.

The estimate corresponding to alias set $T_{2^{l-1}+1},\ldots,T_{2^{l}}$ is

$\displaystyle\frac{1}{2^{m-l+1}}\bm{y}_{D}^{T}\bm{S}^{\prime\prime}=\frac{1}{2}\left(\frac{1}{2^{m-l}}\bm{y}_{D_{1}}^{T}\bm{S}_{1}-\frac{1}{2^{m-l}}\bm{y}_{D_{2}}^{T}\bm{S}_{2}\right),$

which was to be proved.

Let $\bm{D}_{l}$ be the geometric design $2^{m}//2^{m-l}$ with the generators $R_{1},\ldots,R_{l}$. Assume that the first $n$ generators $(0\leqslant n\leqslant l)$ do not contain the variable $x_{r}$. Let $\bm{D}_{l-1}$ be the design $2^{m-1}//2^{m-l}$ derived from $\bm{D}_{l}$ by deleting the column $x_{r}$. Then the following theorem holds.

Theorem 3.6.6. $\bm{D}_{l-1}$ is a geometric design with the generators

$\displaystyle R_{1},R_{2},\ldots,R_{n},R_{n+1}R_{n+2},\ldots,R_{n+1}R_{l}.$ (134)

Proof. For the design $\bm{D}_{l-1}$, the following relations obviously hold:

$\displaystyle 1=R_{1},1=R_{2},\ldots,1=R_{n},1=R_{n+1}R_{n+2},\ldots,1=R_{n+1}R_{l},$

because these relations hold for the design $\bm{D}_{l}$ and interactions Eq. (134) do not contain $x_{r}$ . By Lemma 3.6.1, the interactions Eq. (134) are independent. Their number equals $l-1$ . Hence, they are the generators.

This completes the proof of the theorem.
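A small numerical check of Theorem 3.6.6 is sketched below for the case $n=0$, in which every generator contains the deleted variable. The generators $R_{1}=x_{1}x_{2}x_{4}$ and $R_{2}=x_{1}x_{3}x_{5}$ are hypothetical, indices in the code are zero-based, and the deleted column is $x_{1}$; by Eq. (134) the only generator of the reduced design should then be $R_{1}R_{2}=x_{2}x_{3}x_{4}x_{5}$.

```python
from itertools import product
from math import prod

def fraction(m, generators):
    """Points of {+1, -1}^m at which every generator (0-based letters) equals +1."""
    return [p for p in product((1, -1), repeat=m)
            if all(prod(p[i] for i in g) == 1 for g in generators)]

R1, R2 = (0, 1, 3), (0, 2, 4)          # x1 x2 x4 and x1 x3 x5, both contain x1
d_l = fraction(5, [R1, R2])            # design 2^5 // 2^3

r = 0                                  # delete the column of x1
projected = [p[:r] + p[r + 1:] for p in d_l]

# R1 R2 = x2 x3 x4 x5; after deleting column 0 its letters shift to (0, 1, 2, 3).
expected = fraction(4, [(0, 1, 2, 3)])  # design 2^4 // 2^3

print(len(projected), len(set(projected)))
print(sorted(projected) == sorted(expected))
```

The eight projected points are distinct and coincide with the half-fraction of $2^{4}$ defined by the single relation $1=x_{2}x_{3}x_{4}x_{5}$.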

Next, we will present a few more theorems for two-level geometric designs from the article by Brodsky and Brodsky (1977). The theorems will be accompanied by examples from the same article. In part, these theorems are consequences of the results presented in this article for the general case of geometric designs with factors at $s$ levels. However, the article by Brodsky and Brodsky (1977) contains direct proofs of the theorems for $s=2$, and we will present some of them below.

Theorem 3.6.7. The $2^{m-l}$ points satisfying the system Eq. (4.3) contain all combinations of the levels of the factors $F_{i_{1}},\ldots,F_{i_{t}}$ with equal frequency if and only if any nontrivial linear combination of Eq. (4.3) contains at least one nonzero coefficient other than the $i_{1}$-th, $\ldots$, $i_{t}$-th.

The proof of Theorem 3.6.7 is similar to the proof of Theorem 3.3.1.

By Theorem 3.6.7, the design contains all combinations of the levels of the factors $F_{i_{1}},\ldots,F_{i_{t}}$ with equal frequency if and only if no defining pencil has all coordinates other than the $i_{1}$-th, $\ldots$, $i_{t}$-th simultaneously equal to zero. Therefore, the following theorem holds.

Theorem 3.6.8. The design $\bm{D}$ corresponding to the generating relations Eq. (4.6) contains all combinations of the levels of the factors $F_{i_{1}},\ldots,F_{i_{t}}$ with equal frequency if and only if $x_{i_{1}}^{a_{1}}\ldots x_{i_{t}}^{a_{t}}$ is not a defining interaction for any $a_{1},\ldots,a_{t}=0$ or $1$.

The following statement, due to Rao (1950), is a special case of Theorem 3.6.8 and a restatement of Theorem 3.3.1.

Theorem 3.6.9 (Rao, 1950). The design $\bm{D}$ corresponding to the generating relations Eq. (4.6) is a hypercube of strength $t$ if and only if all defining interactions contain more than $t$ letters.
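Rao's criterion can be compared with a brute-force check of the strength. The sketch below assumes the hypothetical generators $R_{1}=x_{1}x_{2}x_{4}$ and $R_{2}=x_{1}x_{3}x_{5}$ (zero-based indices in the code); it computes the shortest defining interaction and, independently, tests whether every set of $t$ columns of the design contains all $2^{t}$ level combinations equally often.

```python
from collections import Counter
from itertools import combinations, product
from math import prod

def fraction(m, generators):
    """Points of {+1, -1}^m at which every generator (0-based letters) equals +1."""
    return [p for p in product((1, -1), repeat=m)
            if all(prod(p[i] for i in g) == 1 for g in generators)]

def min_word_length(generators):
    """Smallest number of letters among all defining interactions."""
    lengths = []
    for r in range(1, len(generators) + 1):
        for subset in combinations(generators, r):
            word = set()
            for g in subset:
                word ^= set(g)  # multiply interactions: repeated letters cancel
            lengths.append(len(word))
    return min(lengths)

def has_strength(points, m, t):
    """True if every t columns show all 2**t level combinations equally often."""
    for cols in combinations(range(m), t):
        counts = Counter(tuple(p[c] for c in cols) for p in points)
        if len(counts) != 2 ** t or len(set(counts.values())) != 1:
            return False
    return True

generators = [(0, 1, 3), (0, 2, 4)]
points = fraction(5, generators)
print(min_word_length(generators))          # shortest defining interaction
print(has_strength(points, 5, 2), has_strength(points, 5, 3))
```

The shortest defining interaction has three letters, so by the theorem the design is a hypercube of strength 2 but not of strength 3, which agrees with the two brute-force checks.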

The following theorem is a consequence of Theorem 3.4.1 for two-level designs.

Theorem 3.6.10. For the design $\bm{D}$ corresponding to $l$ generating relations Eq. (4.6), all $2^{m}-1$ effects of the design $\bm{D}^{f}$ are split into $2^{k}$ alias sets ($k=m-l$). One of them (the defining alias set) contains $2^{m-k}-1$ effects. Each of the remaining $2^{k}-1$ alias sets contains $2^{m-k}$ effects. Effects from different alias sets (one from each set) generate pairwise orthogonal effects in the design $\bm{D}$. Effects of the same alias set generate identical effects in the design $\bm{D}$.

The nature of the effects of alias sets is defined by the following theorem that is a consequence of Theorem 3.4.2.

Theorem 3.6.11. If $x_{i_{1}}^{a_{1}}\ldots x_{i_{t}}^{a_{t}}$ is not a defining interaction of the design $\bm{D}$ for any $a_{1},\ldots,a_{t}=$ 0 or 1, all main effects and interaction effects of the factors $F_{i_{1}}$ , …, ${F}_{i_{t}}$ in the design $\bm{D}^{f}$ generate main effects and interaction effects of the same factors in the design $\bm{D}$ .

Let $R_{1},R_{2},\ldots,R_{m-k}$ be the generators of the design $\bm{D}$ . Then the defining relation of the design $\bm{D}$ is

$\displaystyle 1=R_{1}=R_{2}=R_{1}R_{2}=\ldots=R_{1}R_{2}\ldots R_{m-k}.$ (135)

Let $S$ be some interaction. Then it follows from Eq. (135) that

$\displaystyle S=SR_{1}=SR_{2}=SR_{1}R_{2}=\ldots=SR_{1}R_{2}\ldots R_{m-k}.$ (136)

All interactions in Eq. (136) are different and their number is equal to $2^{m-k}$ (possibly including 1). Hence, the following theorem holds.

Theorem 3.6.12. An alias set that includes an interaction $S$ can be represented by the interactions of Eq. (136). If no interaction of Eq. (136) equals 1, the interactions Eq. (136) and only they form the alias set including $S$. If one of the interactions Eq. (136) equals 1, the remaining $2^{m-k}-1$ interactions and only they form the defining alias set including $S$.
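The construction of Eq. (136) and the counts of Theorem 3.6.10 can be sketched in Python as follows. The sketch assumes an illustrative design with $m=5$ and generators $x_{1}x_{2}x_{4}$, $x_{1}x_{3}x_{5}$ (not an example from the source). Interactions are stored as bit masks, so that the product of two interactions is a bitwise XOR; the function names are hypothetical.

```python
def word_name(mask, m):
    """Render a bit mask as a product of letters; mask 0 is the interaction 1."""
    return "".join(f"x{i + 1}" for i in range(m) if mask >> i & 1) or "1"

def defining_group(generators):
    """All products of the generators, including the empty product 1 (mask 0)."""
    group = {0}
    for g in generators:
        group |= {w ^ g for w in group}
    return group

def alias_sets(m, generators):
    """Split the 2^m - 1 interactions into alias sets, following Eq. (136)."""
    group = defining_group(generators)
    sets = {}
    for s in range(1, 2 ** m):
        coset = {s ^ w for w in group}      # S, SR_1, SR_2, SR_1R_2, ...
        key = min(coset)                    # key 0 marks the defining alias set
        sets[key] = sorted(coset - {0})     # the interaction 1 is not an effect
    return sets

# Assumed example: bit i-1 of a mask is set when the letter x_i is present.
m = 5
generators = [0b01011, 0b10101]             # x1x2x4 and x1x3x5
k = m - len(generators)
sets = alias_sets(m, generators)

for key in sorted(sets):
    print(" = ".join(word_name(w, m) for w in sets[key]))
```

For this example the dictionary `sets` holds $2^{k}=8$ alias sets: the defining one under key 0 with $2^{m-k}-1=3$ interactions and seven others with $2^{m-k}=4$ interactions each.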

Theorem 3.6.13. Let

$\displaystyle P_{i_{1}},P_{i_{1}}^{\prime};\ldots;P_{i_{r}},P_{i_{r}}^{\prime}$

be the pairs of confounded effects in the design $\bm{D}$ . Then $P_{i_{1}\ldots i_{r}}^{\prime}=P_{i_{1}}^{\prime}\ldots P_{i_{r}}^{\prime}$ is confounded with $P_{i_{1}\ldots i_{r}}=P_{i_{1}}\ldots P_{i_{r}}$ (they belong to the same alias set).

Proof It is evident that

$\displaystyle P_{i_{1}}=P_{i_{1}}^{\prime}P_{i_{1}}^{o},\ldots,P_{i_{r}}=P_{i_{r}}^{\prime}P_{i_{r}}^{o},$

where $P_{i_{k}}^{o}$ is a defining interaction of the design $\bm{D}$ $(k=1,\ldots,r)$.

Hence,

$\displaystyle P_{i_{1}\ldots i_{r}}=P_{i_{1}\ldots i_{r}}^{\prime}P^{o},$

where $P^{o}=P_{i_{1}}^{o}\ldots P_{i_{r}}^{o}$ is a product of defining interactions and is therefore itself a defining interaction (or 1). This proves the theorem.

Theorem 3.6.14. Let $P_{1},P_{2},\ldots,P_{2^{m-k}}$ be all interactions of the same alias set. Then $P_{1}P_{2},\ldots,P_{1}P_{2^{m-k}}$ and only they form all $2^{m-k}-1$ defining interactions.

Proof The alias set of interactions $P_{1},P_{2},\ldots,P_{2^{m-k}}$, by Theorem 3.6.12, can be represented as $P_{1}$, $P_{1}R_{1}$, $P_{1}R_{2}$, $P_{1}R_{1}R_{2}$, …, $P_{1}R_{1}\ldots R_{m-k}$. Multiplying each of these interactions except $P_{1}$ by $P_{1}$ and using $P_{1}^{2}=1$ gives $R_{1}$, $R_{2}$, $R_{1}R_{2}$, …, $R_{1}\ldots R_{m-k}$, that is, all defining interactions. This proves the theorem.

If the model (part of the model Eq. (121)) contains at least two interactions that belong to the same alias set of the design $\bm{D}$, the coefficient matrix $\bm{X}$ of the design has identical columns. Therefore, the information matrix $\bm{X}^{\prime}\bm{X}$ is singular and the least-squares normal equations for the parameters of the model do not have a unique solution. If the model contains not more than one interaction from each alias set, the solution of the normal equations is unique.

For a special case of the model of main effects that includes only 1-letter interactions, the solution of the normal equations is unique if and only if for the design $\bm{D}$ there is no alias set that contains more than one 1-letter interaction. The last condition, by Theorem 3.6.9, is equivalent to the condition that the design $\bm{D}$ is a hypercube of strength 2.
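The effect of aliasing on the information matrix can be seen in a minimal Python sketch. It assumes the same illustrative design as before (five factors, generators $x_{1}x_{2}x_{4}$ and $x_{1}x_{3}x_{5}$, levels coded $+1$ and $-1$), which is our assumption and not an example from the source; the function names are hypothetical.

```python
from itertools import product

def word_value(row, word):
    """Value of an interaction (tuple of letter indices) at a design point.
    The empty tuple stands for the absolute term and always gives 1."""
    value = 1
    for i in word:
        value *= row[i - 1]
    return value

def design_points(m, generators):
    """Points of the full 2^m design at which every generator equals +1."""
    return [row for row in product((1, -1), repeat=m)
            if all(word_value(row, g) == 1 for g in generators)]

def information_matrix(rows, model):
    """X'X for the model whose columns are the listed interactions."""
    columns = [[word_value(row, word) for row in rows] for word in model]
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]

# Assumed example design: 8 runs, strength 2.
rows = design_points(5, [(1, 2, 4), (1, 3, 5)])

# Main effect model: one interaction per alias set, so X'X = 8 * I.
main_effects = [(), (1,), (2,), (3,), (4,), (5,)]
info_main = information_matrix(rows, main_effects)

# Adding x1x2, which lies in the alias set of x4, duplicates a column,
# so two rows of X'X coincide and the matrix is singular.
info_alias = information_matrix(rows, main_effects + [(1, 2)])
rows_coincide = info_alias[4] == info_alias[6]
print(info_main[0], rows_coincide)
```

In the first model the information matrix is diagonal, which reflects the orthogonality statement of Theorem 3.6.10; in the second the rows for $x_{4}$ and $x_{1}x_{2}$ coincide.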

The case when the model contains, in addition to all 1-letter interactions $x_{1},\ldots,x_{m}$, some interactions $S_{1},\ldots,S_{l}$ can be reduced to the main effect model as follows. Instead of the nonsingular design $\bm{D}$ for the model

$\displaystyle Ey=b_{0}+b_{1}x_{1}+\ldots+b_{m}x_{m}+b_{m+1}S_{1}+\ldots+b_{m+l}S_{l}$ (137)

consider the nonsingular geometric main effect design $\bm{D}^{\prime}$ for the model

$\displaystyle Ey=b_{0}+b_{1}x_{1}+\ldots+b_{m}x_{m}+b_{m+1}x_{m+1}+\ldots+b_{m+l}x_{m+l},$ (138)

where $x_{m+1},\ldots,x_{m+l}$ correspond to additional factors $F_{m+1},\ldots,F_{m+l}$ .

Assume that for the design $\bm{D}^{\prime}$ the following condition holds: its generators include $x_{m+1}S_{1},\ldots,x_{m+l}S_{l}$. Then $x_{m+i}$ and $S_{i}$ belong to the same alias set for any $i=1,\ldots,l$. However, the interactions $x_{1},\ldots,x_{m},x_{m+1},\ldots,x_{m+l}$ belong to different alias sets, because $\bm{D}^{\prime}$ is a nonsingular main effect design. Therefore, the interactions $x_{1},\ldots,x_{m},S_{1},\ldots,S_{l}$ also belong to different alias sets, and the following theorem holds.

Theorem 3.6.15. If there exists a nonsingular geometric main effect design in $N$ runs for the model Eq. (138) with the generators including $x_{m+1}S_{1},\ldots,x_{m+l}S_{l},$ then there exists a nonsingular geometric design in $N$ runs for the model Eq. (137).

Theorem 3.6.16. For the design corresponding to Eq. (4.6), there exist such $i_{1},\ldots,i_{m-k}$ and nondefining interactions $P_{1},\ldots,P_{2^{k}-1}$ (unique for the given $i_{1},\ldots,i_{m-k}$ and one from each alias set) that no interaction of $P_{1},\ldots,P_{2^{k}-1}$ contains any of the letters $x_{i_{1}},\ldots,x_{i_{m-k}}$.

Proof It is evident that there exist such $i_{1},\ldots,i_{m-k}$ that the set of the generators Eq. (4.6) can be converted to the set of generators $R_{1},\ldots,R_{m-k}$ by elementary transformations and the following condition holds: the generator $R_{j}$ contains $x_{i_{j}}$ and does not contain $x_{i_{u}}$ $(j=1,\ldots,m-k,u\neq j)$ .

Consider now some interaction $S$ that belongs to the alias set ${\cal L}_{S}$. Assume that $S$ contains $x_{i_{j}}$ $(j=l_{1},\ldots,l_{p};$ $l_{1},\ldots,l_{p}=i_{1},\ldots,i_{m-k})$ and does not contain $x_{i_{u}}$ $(u\neq l_{1},\ldots,l_{p})$. Then, by Theorem 3.6.12, $SR_{l_{1}}\ldots R_{l_{p}}\in{\cal L}_{S}$, and this interaction has the property required by the theorem: it contains none of the letters $x_{i_{1}},\ldots,x_{i_{m-k}}$. To prove that the interaction $SR_{l_{1}}\ldots R_{l_{p}}$ is the only one for the given $i_{1},i_{2},\ldots,i_{m-k}$, assume, to the contrary, that there exist two such interactions $P_{1}$ and $P_{2}$. By Theorem 3.6.14, their product $P_{1}P_{2}$ is a defining interaction (not containing $x_{i_{1}},\ldots,x_{i_{m-k}}$). On the other hand, a defining interaction must contain some of $x_{i_{1}},\ldots,x_{i_{m-k}}$ as a product of some generators. This contradiction proves the theorem.

Theorem 3.6.17. In the design $\bm{D}$ corresponding to the $l$ generating relations Eq. (4.6), there exist $k=m-l$ factors (columns) forming the full design $\bm{D}^{f}$. The elements of any of the remaining columns $\bm{\xi}$ are the products of the corresponding elements of some columns of $\bm{D}^{f}$ (fixed for the given $\bm{\xi}$).

Proof Similarly to the proof of Theorem 3.6.16, find such $i_{1},\ldots,i_{m-k}$ and generators $R_{1},\ldots,R_{m-k}$ of the design that the generator $R_{j}$ contains $x_{i_{j}}$ and does not contain $x_{i_{u}}$ $(j=1,\ldots,m-k;\ u\neq j)$. Then, obviously, any defining interaction contains at least one of the letters $x_{i_{j}}$ $(j=1,\ldots,m-k)$, or (which is the same) any interaction that contains none of the letters $x_{i_{j}}$ $(j=1,\ldots,m-k)$ is not defining. That is, the interaction $x_{j_{1}}^{a_{1}}\ldots x_{j_{k}}^{a_{k}}$ $(j_{1},\ldots,j_{k}\neq i_{1},\ldots,i_{m-k})$ is not defining for any $a_{1},\ldots,a_{k}=0$ or 1. By Theorem 3.6.8, the design $\bm{D}$ contains all combinations of the levels of the factors $F_{j_{1}},\ldots,F_{j_{k}}$ $(j_{1},\ldots,j_{k}\neq i_{1},\ldots,i_{m-k})$ with equal frequency. Since the design has $2^{k}$ points, these factors form the full design. We can get the elements of the columns $x_{i_{j}}$ $(j=1,\ldots,m-k)$ from the generating relations $1=R_{i_{1}},\ldots,1=R_{i_{m-k}}$ by using the following formula:

$\displaystyle x_{i_{1}}=R_{i_{1}}x_{i_{1}},\ldots,x_{i_{m-k}}=R_{i_{m-k}}x_{i_{m-k}},$

where the products $R_{i_{j}}x_{i_{j}}$ on the right-hand sides do not contain $x_{i_{1}},\ldots,x_{i_{m-k}}$.
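The construction used in this proof can be sketched in Python. The sketch assumes the illustrative generators $x_{1}x_{2}x_{4}$ and $x_{1}x_{3}x_{5}$ in $\pm 1$ coding (our assumption, not an example from the source), so that $x_{4}=x_{1}x_{2}$ and $x_{5}=x_{1}x_{3}$. It builds the design from the full design on the basic factors and compares the result with the set of solutions of the generating relations. The function names are hypothetical.

```python
from itertools import product

def build_from_full_design(m, dependent):
    """Full design on the basic factors plus product columns.
    `dependent` maps a letter index to the basic letters whose product gives it."""
    basic = [i for i in range(1, m + 1) if i not in dependent]
    rows = []
    for levels in product((1, -1), repeat=len(basic)):
        row = dict(zip(basic, levels))
        for letter, letters in dependent.items():
            value = 1
            for i in letters:
                value *= row[i]          # product of basic columns only
            row[letter] = value
        rows.append(tuple(row[i] for i in range(1, m + 1)))
    return rows

def solve_generating_relations(m, generators):
    """All points of the full 2^m design at which every generator equals +1."""
    rows = []
    for row in product((1, -1), repeat=m):
        values = []
        for g in generators:
            value = 1
            for i in g:
                value *= row[i - 1]
            values.append(value)
        if all(v == 1 for v in values):
            rows.append(row)
    return rows

# Assumed example: x4 = x1x2 and x5 = x1x3.
built = build_from_full_design(5, {4: (1, 2), 5: (1, 3)})
solved = solve_generating_relations(5, [(1, 2, 4), (1, 3, 5)])
same_design = sorted(built) == sorted(solved)
print(len(built), same_design)
```

Both routes give the same 8 points, and the first three columns of `built` run through the full $2^{3}$ design.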

Theorem 3.6.18. There exist such representatives of all nondefining alias sets (one from each set) that their product equals either 1 or any given defining interaction.

Proof By Theorem 3.6.16, we can find such $i_{1},\ldots,i_{k}$ that there exist nondefining interactions $P_{1},\ldots,P_{2^{k}-1}$ (one from each alias set) containing only the letters $x_{i_{1}},\ldots,x_{i_{k}}$. The number of nondefining alias sets equals $2^{k}-1$. The number of different interactions containing only $x_{i_{1}},\ldots,x_{i_{k}}$ also equals $2^{k}-1$. Therefore, the selected nondefining interactions $P_{1},\ldots,P_{2^{k}-1}$ include all different interactions containing only $x_{i_{1}},\ldots,x_{i_{k}}$. It is evident that the number of these interactions that include a given letter $x_{i}$ $(i=i_{1},\ldots,i_{k})$ equals $2^{k-1}$, which is even for $k\geqslant 2$. Therefore, the product of the selected nondefining interactions equals 1.

To make this product equal to the given defining interaction $P_{0}$, replace the interaction $P_{1}$ with the interaction $P_{1}P_{0}$ (the interaction $P_{1}P_{0}$ belongs to the same alias set as the interaction $P_{1}$).

This proves the theorem.

The following theorem is a simple consequence of Theorem 3.6.18.

Theorem 3.6.19. There exist such representatives of all alias sets (one from each set) that their product equals either 1 or any given defining interaction.

Theorem 3.6.20. The product of the interactions $P_{1},\ldots,P_{2^{k}}$ from different alias sets equals either 1 or a defining interaction.

Proof By Theorem 3.6.19, we can select the representatives $P_{1}^{\prime},\ldots,P_{2^{k}}^{\prime}$ of the alias sets so that $P_{1}^{\prime}\ldots P_{2^{k}}^{\prime}=1$. However, for any representative $P_{i}$ of the $i$-th alias set, $P_{i}=P_{i}^{\prime}P_{i_{0}}$, where $P_{i_{0}}$ is a defining interaction or 1. Hence, $P_{1}\ldots P_{2^{k}}=P_{1}^{\prime}\ldots P_{2^{k}}^{\prime}P_{1_{0}}\ldots P_{2^{k}_{0}}=P_{1_{0}}\ldots P_{2^{k}_{0}}$, which is either 1 or a defining interaction, as was to be proved.
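Theorems 3.6.18 and 3.6.20 can be checked numerically on a small case. The Python sketch below assumes the illustrative design with $m=5$ and generators $x_{1}x_{2}x_{4}$, $x_{1}x_{3}x_{5}$ (so $k=3$), which is our assumption and not an example from the source. Interactions are bit masks and multiplication is XOR. Representatives are drawn at random with a fixed seed, and the names are hypothetical.

```python
import random
from functools import reduce
from operator import xor

def defining_group(generators):
    """All products of the generators, including the interaction 1 (mask 0)."""
    group = {0}
    for g in generators:
        group |= {w ^ g for w in group}
    return group

def cosets(m, group):
    """Partition of all 2^m masks into cosets; the first one is the group itself."""
    seen, result = set(), []
    for s in range(2 ** m):
        if s not in seen:
            coset = {s ^ w for w in group}
            seen |= coset
            result.append(coset)
    return result

m = 5
generators = [0b01011, 0b10101]          # x1x2x4 and x1x3x5 (assumed example)
group = defining_group(generators)
all_cosets = cosets(m, group)

# Theorem 3.6.18: the interactions in the basic letters x1, x2, x3 (masks 1..7)
# represent the seven nondefining alias sets, and their product is 1 (mask 0).
basic_product = reduce(xor, range(1, 2 ** 3))

# Theorem 3.6.20: any choice of one interaction from each alias set
# (the interaction 1 itself is excluded) multiplies to 1 or to a defining one.
rng = random.Random(0)
products = []
for _ in range(1000):
    representatives = [rng.choice(sorted(c - {0})) for c in all_cosets]
    products.append(reduce(xor, representatives))
always_in_group = all(p in group for p in products)
print(basic_product, always_in_group)
```

A random check of this kind is not a proof; it only confirms the statements on the sampled choices of representatives.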

5. Application: Program algorithm of factorial designs

The introduced theory of factorial designs can be useful not only for future theoretical studies but also for applications. One example of such an application is a computer algorithm (Brodsky et al., 1978; Brodsky, 2014) for constructing optimal and near-optimal factorial designs. The algorithm includes two basic modules and is based on a combination of analytical methods, a catalog of basic designs, and numerical procedures.

The application of general numerical procedures for constructing an optimal design seems appropriate only for relatively small dimensions. This is true not because they lead to time-consuming calculations for large dimensions but mostly because they do not give the factorial structure of the designs, which is essential for a clear interpretation of the results of experiments. That is why the described algorithm was designed as a combination of analytical techniques and numerical procedures. The algorithm generates designs that are obtained as transformations of some class of prebuilt regular uniform (nongeometric) designs and as transformations of several types of geometric designs.

We used three types of transformations. The first type of transformation, collapsing of the factors, was introduced by Chakravarti (1956) and developed by Addelman (1962). An example of this type of transformation is given below in the diagram.

Three-level Factor		Two-level Factor
0	$\longrightarrow$	0
1	$\longrightarrow$	1
2	$\longrightarrow$	0

The second type of transformation is called the splitting of factors. It was introduced by Addelman (1962). An example of this type of transformation is given below.

Four-level Factor		Two-level Factors
0	$\longrightarrow$	0 0 0
1	$\longrightarrow$	1 0 1
2	$\longrightarrow$	0 1 1
3	$\longrightarrow$	1 1 0

The third type of transformation, replacement of factors, is also due to Addelman (1962). This method is the inverse of the splitting procedure. It transforms three two-level factors into one four-level factor.
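The three transformations can be written as simple level mappings. The following Python sketch uses exactly the correspondences shown in the two diagrams above; the dictionary and function names are hypothetical and are not part of the algorithm described in Brodsky et al. (1978).

```python
# Collapsing: a three-level factor becomes a two-level factor.
COLLAPSE_3_TO_2 = {0: 0, 1: 1, 2: 0}

# Splitting: a four-level factor becomes three two-level factors.
SPLIT_4_TO_2X3 = {0: (0, 0, 0), 1: (1, 0, 1), 2: (0, 1, 1), 3: (1, 1, 0)}

# Replacement: the inverse of splitting.
REPLACE_2X3_TO_4 = {levels: level for level, levels in SPLIT_4_TO_2X3.items()}

def collapse(column):
    """Apply the collapsing map to a column of three-level factor values."""
    return [COLLAPSE_3_TO_2[v] for v in column]

def split(column):
    """Turn a four-level column into a list of triples of two-level values."""
    return [SPLIT_4_TO_2X3[v] for v in column]

def replace(triples):
    """Turn triples of two-level values back into a four-level column."""
    return [REPLACE_2X3_TO_4[tuple(t)] for t in triples]

three_level_column = [0, 1, 2, 2, 1, 0]
four_level_column = [0, 1, 2, 3, 3, 2, 1, 0]

collapsed = collapse(three_level_column)
round_trip = replace(split(four_level_column))
print(collapsed, round_trip == four_level_column)
```

In the splitting table the third two-level column is the sum modulo 2 of the first two, so it plays the role of their interaction; replacement is defined only for the four triples that appear in that table.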

The algorithm works for the factorial models that include main effects of quantitative and/or qualitative factors, any set of two-factor interaction effects of two-level factors, and also all interaction effects of sets of three two-level factors.

Input data are supposed to include the following information:

1. The number of factors and the numbers of their levels.
2. Required interactions of two-level factors.
3. The maximal number of experiments.
4. The size of the maximal block.

Many methods exist for constructing effective designs for different types of factorial models. It is relatively easy to construct a design using a given method; it is much more difficult to solve the inverse problem, namely, to find a method of construction that corresponds to the requested input data. To illustrate this point, consider the following example. Let

$\displaystyle x_{1}x_{3}x_{4}x_{6},x_{2}x_{3}x_{5}x_{7},x_{1}x_{2}x_{3}x_{8},x_{1}x_{2}x_{4}x_{9},x_{1}x_{2}x_{5}x_{10},$ $\displaystyle x_{1}x_{5}x_{11},x_{2}x_{3}x_{4}x_{12},x_{1}x_{3}x_{5}x_{13},x_{2}x_{4}x_{5}x_{14}$ (139)

be the generators of a geometric design $2^{14}//32$ for 14 two-level factors in 32 runs.

Anyone familiar with the methodology of geometric designs can easily construct the design and the alias sets for the given generators Eq. (139). An analysis of the alias sets shows that the design is nonsingular (and therefore has a wide range of optimal properties) for the model that includes the absolute term, main effects of all 14 factors $F_{1},\ldots,F_{14}$, all two-factor interaction effects of the factors $F_{3},\ldots,F_{7}$, and three two-factor interaction effects of the following pairs of factors: $F_{1}-F_{2}$, $F_{11}-F_{12}$, and $F_{13}-F_{14}$. A pair of main effects of two-level factors together with the interaction effect of these factors is equivalent to the main effects of a four-level factor. Therefore, instead of 14 two-level factors we can construct a design for 8 two-level factors and 3 four-level factors. One of these four-level factors can be treated as a block factor (with block size 8), and the other two four-level factors can be treated as two qualitative factors. Therefore, the geometric design defined by the generators Eq. (139) can be transformed into an effective nonsingular design in four blocks (of size 8 each). The model will include the absolute term, effects of levels of two qualitative four-level factors, effects of levels of a block factor, main effects of eight two-level factors, and all two-factor interaction effects between the first five of them.
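The analysis of the alias sets for the generators Eq. (139) can be reproduced with a short Python sketch. It checks that the 27 effects of the model named above fall into 27 different nondefining alias sets, which is the nonsingularity condition discussed after Theorem 3.6.14. The bit-mask representation and the names `to_mask` and `alias_key` are our own illustrative choices, not part of the algorithm of Brodsky et al. (1978).

```python
from itertools import combinations

# Generators of Eq. (139), written as tuples of letter indices.
GENERATORS = [
    (1, 3, 4, 6), (2, 3, 5, 7), (1, 2, 3, 8), (1, 2, 4, 9), (1, 2, 5, 10),
    (1, 5, 11), (2, 3, 4, 12), (1, 3, 5, 13), (2, 4, 5, 14),
]

def to_mask(letters):
    """Bit mask of an interaction: bit i-1 is set when x_i is present."""
    mask = 0
    for i in letters:
        mask |= 1 << (i - 1)
    return mask

# Defining group: all products of the generators (product = XOR of masks).
group = {0}
for g in GENERATORS:
    group |= {w ^ to_mask(g) for w in group}

def alias_key(letters):
    """Canonical label of the alias set of an interaction (0 = defining set)."""
    return min(to_mask(letters) ^ w for w in group)

# Model: 14 main effects, all two-factor interactions of F3..F7,
# and the interactions F1-F2, F11-F12, F13-F14.
model = [(i,) for i in range(1, 15)]
model += list(combinations(range(3, 8), 2))
model += [(1, 2), (11, 12), (13, 14)]

keys = [alias_key(effect) for effect in model]
runs = 2 ** 14 // len(group)
nonsingular = 0 not in keys and len(set(keys)) == len(keys)
print(runs, len(model), nonsingular)
```

The check confirms the direct problem only. It does not show how the generators Eq. (139) were found, which is the inverse problem considered next.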

Now consider the inverse problem, which actually occurs in practice. Suppose that we need to estimate the parameters of the model that includes the absolute term, effects of levels of two qualitative four-level factors, main effects of eight two-level factors, and all two-factor interaction effects between the first five of them. In addition, the number of experiments that have to be conducted in a homogeneous environment must not exceed 10. The problem of finding generators that lead to a nonsingular design for the model specified above is a very laborious task. It presents a significant challenge even for experts in the theory of factorial designs.

In other cases, we also face the problem of constructing an optimal factorial design with the required properties while navigating through a huge number of methods of construction. That is why the most sensible approach in this situation is an elaborate computer algorithm for the construction of effective designs.

The described algorithm is based on three prebuilt components (Brodsky et al., 1978; Brodsky, 2014): a catalog of regular uniform (nongeometric) designs, a catalog of optimal transformations, and a set of geometric designs. The numerical procedures have three goals: to find generators of the geometric design corresponding to the program input, to find optimal combinations of transformations, and to split the resulting design into groups of experiments performed in a homogeneous environment. The algorithm is fast and finds optimal solutions for very complicated tasks. In simple cases, this algorithm can also be used for manual operation with the help of all three prebuilt catalogs.

6. Conclusion

The article introduces the mathematical foundations of the theory of the factorial design of experiments. It presents a formal concept of factorial models and designs. The most significant results that are based on the introduced concept are obtained in the following areas. The first series of results is represented by theorems developing the fundamental concept of the frequency proportionality condition originally introduced by Plackett (1946). The second area is represented by the results on the equivalence of various factorial models, whether they belong to traditional regression models, models for analysis of variance, or mixed models. The third area of important results contains many powerful theorems on the optimality of factorial designs for various types of factorial models. In particular, these results include theorems based both on the properties of regular factorial designs and on the ideas of Hoel (1958) and Guest (1958) for polynomial regression. Another series of results represents further developments of the fundamental idea of Bose (1947) on the splitting of degrees of freedom in a full symmetrical design. In accordance with the introduced concept, this idea is developed for fractional factorial designs. One more series of theorems is associated with the factorial designs obtained as a solution of a system of linear equations in finite Euclidean space. Many of these results relate to two-level designs, which are the most widely used in applications.

Thus, the introduced concept of factorial designs and factorial models supports many important aspects of experimental design, including the effectiveness of statistical inferences and construction of the designs, thereby opening the way for advancements in the theory of experiments.

Footnotes

For a discrete domain $Z$ of $N$ points, $\bar{d}\left(\bm{D},X_{1},\ldots,X_{m}\right)=\frac{1}{N}\sum_{Z}d(\bm{D},X_{1},\ldots,X_{m})$.

References

1. Addelman (1962). Orthogonal main effect plans for asymmetrical factorial experiments. Technometrics, 4, 21-46.
2. Bose, R. C. (1947). Mathematical theory of the symmetrical factorial design. Sankhyā, 8, 107-166.
3. Bose, R. C., & Bush, K. A. (1952). Orthogonal arrays of strength two and three. Annals of Mathematical Statistics, 23, 508-524.
4. Box, G. E. P., & Hunter, J. S. (1961). The $2^{k-p}$ fractional factorial designs, Part 1. Technometrics, 3, 311-351; Part 2. Technometrics, 3, 449-458.
5. Brodsky, L. I., & Brodsky, V. Z. (1977). Properties of geometric designs $2^{k}$ (in Russian). In Nalimov, V. V. (ed.) Regression experiments. Design and analysis (85-102): The Moscow University Press.
6. Brodsky (2014). An introduction to the factorial design of experiments: Manhattan Academia.
7. Brodsky, V. Z. (1971). On orthogonal designs (in Russian): The Moscow University Press.
8. Brodsky, V. Z. (1972). Multifactorial regular designs (in Russian): The Moscow University Press.
9. Brodsky, V. Z. (1975). Factorial experiments: models, designs, optimality (in Russian). In Nalimov, V. V. (ed.) Design of optimal experiments (51-105): The Moscow University Press.
10. Brodsky, V. Z., Brodsky, L. I., Maloletkin, G. N., & Melnikov, N. N. (1978). On computer catalog of factorial designs of experiment (in Russian). In Markova, E. V. (ed.) Problems of Cybernetics, Issue 47, Mathematical-statistical methods of analysis and design of experiments (6-24): USSR Academy of Sciences.
11. Brodsky, V. Z., & Golikova, T. I. (1981). On compatibility of optimality criteria of designs (in Russian). In Markova, E. V. (ed.) Problems of Cybernetics, Non-traditional approach to the design of experiments (158-160): USSR Academy of Sciences.
12. Chakravarti, I. M. (1956). Fractional replication in asymmetrical factorial designs and partially balanced arrays. Sankhyā, 17, 143-164.
13. Cheng, C.-S. (2013). Theory of factorial design: Single- and multi-stratum experiments: Chapman and Hall, CRC Press.
14. Fedorov, V. V. (1972). Theory of optimal experiments: Academic Press.
15. Guest, P. G. (1958). The spacing of observations in polynomial regression. Annals of Mathematical Statistics, 29, 294-299.
16. Hoel, P. G. (1958). Efficiency problems in polynomial estimation. Annals of Mathematical Statistics, 29, 1134-1145.
17. Kiefer & Wolfowitz (1960). The equivalence of two extremum problems. Canadian Journal of Mathematics, 12, 363-366.
18. Mukerjee & Wu, C. F. J. (2006). A modern theory of factorial designs. New York, NY: Springer.
19. Plackett, R. L. (1946). Some generalizations in the multifactorial design. Biometrika, 33, 328-332.
20. Raghavarao (1971). Construction and combinatorial problems in design of experiments. New York, NY: John Wiley.
21. Rao, C. R. (1950). The theory of fractional replication in factorial experiments. Sankhyā, 10, 81-86.
22. Scheffé (1959). The analysis of variance. Oxford, England: Wiley.