Cumulative conditional expectation index

Abstract

In this paper we introduce approximations of the cumulative conditional expectation function (CCEF) based on Bernstein polynomial copulas. These approximations converge pointwise and uniformly to the CCEF. In addition, some examples of the behavior of these approximations are shown. We also build estimators for the CCEF, constructed through Bernstein polynomial estimators for copulas. It is shown that the estimators are asymptotically normal and asymptotically biased for the CCEF. We exemplify the use of these estimators through real data from the educational field.

Keywords

Conditional tail expectation; copula distributions; cumulative conditional expectation; educational research.

AMS 2010 subject classifications: primary 62H20, 62G05; secondary 62P25.

1. Introduction

In this paper we study an extension of the concept of directional dependence in the regression setting, which was designated by Sungur [15] as the copula regression function. Formally, let $(U,V)$ be a random pair with uniform marginals on $[0,1]$ and associated 2-copula $C$. The expected value $\mathbb{E}[V|U=u]$, which is a function of $C$, is referred to as the copula regression function of $V$ on $U$. It is proposed in Sungur [15] as a way of summarizing the relation between response and explanatory variables by separating the marginal and the joint behavior. The copula regression function is used to identify the directional dependence in joint behavior of the random pair $(U,V)$, as introduced in Sungur [14] (see also Muddapur [11]). In addition, the directional dependence describes the likely direction of influence between two variables, from the joint behavior between them. Advances in this line [14, 15] have been applied to real data by several authors. Here we highlight the recent work of D. Kim and J. Kim [9], which proposes a generalized multiple-step procedure for the full inference of the directional dependence in joint behavior based on some asymmetric copula families.

We explore $\mathbb{E}[V|U\leqslant u]$ as a function of $u\in(0,1)$ and call this quantity the cumulative conditional expectation function. One motivation for working with this quantity is its use for making decisions in real problems. In addition, as noted by Hua and Joe [6], the conditional expectation can capture the strength of dependence of a bivariate random vector and helps in developing graphical techniques for visualizing the dependence in a region of $[0,1]^{2}$.

Our aim is to give the foundation for the construction of approximations and estimators for the cumulative conditional expectation function. We present the rate of approximation in the first case and, in the second, the asymptotic distribution of the empirical process related to the estimator.

In Section 2 we show a tractable representation of the cumulative conditional expectation function. We also show how to take advantage of certain structures of copula functions to calculate the cumulative conditional expectation function, for example in the cases of convex mixtures of copulas and copulas with polynomial cross sections. In Section 3, we assume the copula model is known and propose a polynomial approximation for the corresponding cumulative conditional expectation function. We also obtain a rate of convergence for the approximation. In Section 4, by means of the Bernstein polynomial estimator for copula functions, we propose estimators for the cumulative conditional expectation function and show their asymptotic normality. We conclude the section with an application to real data from the educational field. Previous works applying copula models to educational data are recent, see Fernández and González-López [2] and Fernández et al. [4]. Finally, the conclusions are presented in Section 5 and the proofs of our results are given in the Appendix.

2. Cumulative conditional expectation

For some applied problems, it is helpful to know the mean performance of one variable conditioned on another variable restricted to a region, and not just to a single observation. This concept is formally presented in the next definition.

Definition 1.

Let $(U,V)$ be a random vector with associated 2-copula $C$ . Then the cumulative conditional expectation function (CCEF) of $V$ on $U$ is given by $R_{C}(u)=\mathbb{E}[V|U\leqslant u],\forall u\in(0,1)$ .

Remark.

The cases $u=1$ and $u=0$ are excluded because they are trivial or reduce to the copula regression function. Indeed, $\mathbb{E}[V|U\leqslant 1]=\mathbb{E}[V]=0.5$ is trivial and $\mathbb{E}[V|U\leqslant 0]=\mathbb{E}[V|U=0]$ has been studied in Sungur [15].

We note that the previous concept was explored by Fernández and González-López [3] with focus on some bounds of the measure $R_{C}$. In addition, an equivalent concept was studied by Hua and Joe [6] for the purpose of investigating levels of strength of dependence in the tails: asymptotically linear, sub-linear and constant. The following theorem gives a representation of $R_{C}(u)$ that simplifies its computation.

Theorem 1.

Let $(U,V)$ be a random vector with associated 2-copula $C$ . Then $R_{C}(u)=1-\frac{1}{u}\int_{0}^{1}C(u,v)dv$ , $\forall u\in(0,1)$ .

Proof. See the Appendix. $\hfill\square$

Then, it can be seen that the conditional tail expectation, $\mathbb{E}[V|U\geqslant u]$, can be written as $\mathbb{E}[V|U\geqslant u]=\frac{1}{2(1-u)}-\frac{u}{1-u}R_{C}(u)=1-R_{\hat{C}}(1-u)$, for $0<u<1$, where $\hat{C}$ is the survival copula of $U$ and $V$, $\hat{C}(u,v)=u+v-1+C(1-u,1-v)$.
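As an illustration of how Theorem 1 and the tail identity above might be evaluated in practice, the following Python sketch approximates $R_{C}(u)$ by a midpoint rule. It is only a minimal sketch: the function names (`ccef`, `tail_expectation`, `independence`, `fgm`) and the number of nodes are our own illustrative choices. The independence copula should return $0.5$ for every $u$, and the FGM copula $C(u,v)=uv+\theta uv(1-u)(1-v)$ should return $(3+\theta(u-1))/6$.

```python
def ccef(copula, u, n=2000):
    """R_C(u) = 1 - (1/u) * int_0^1 C(u, v) dv, approximated by a midpoint rule."""
    h = 1.0 / n
    integral = sum(copula(u, (j + 0.5) * h) for j in range(n)) * h
    return 1.0 - integral / u


def tail_expectation(copula, u, n=2000):
    """E[V | U >= u] = 1/(2(1-u)) - u/(1-u) * R_C(u)."""
    return 1.0 / (2.0 * (1.0 - u)) - u / (1.0 - u) * ccef(copula, u, n)


def independence(u, v):
    return u * v


def fgm(theta):
    """Farlie-Gumbel-Morgenstern copula with parameter theta in [-1, 1]."""
    def c(u, v):
        return u * v + theta * u * v * (1.0 - u) * (1.0 - v)
    return c


if __name__ == "__main__":
    u, theta = 0.3, 0.5
    print("independence:", ccef(independence, u))
    print("FGM numeric :", ccef(fgm(theta), u))
    print("FGM closed  :", (3.0 + theta * (u - 1.0)) / 6.0)
    print("FGM tail    :", tail_expectation(fgm(theta), u))
```

Any other copula given as a Python function of `(u, v)` can be passed to `ccef` in the same way, although copulas with kinks may need a finer grid.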

In order to simplify the notation, hereafter we denote the partial derivatives as in Janssen et al. [8], that is, $C^{(1)}(u,v)=\frac{\partial C(u,v)}{\partial u}$, $C^{(2)}(u,v)=\frac{\partial C(u,v)}{\partial v}$, $C^{(1,1)}(u,v)=\frac{\partial^{2}C(u,v)}{\partial u^{2}}$ and $C^{(2,2)}(u,v)=\frac{\partial^{2}C(u,v)}{\partial v^{2}}$. It can be seen that $\mathbb{E}[V|U\leqslant u]$ is the average over $(0,u)$ of the copula regression function $\mathbb{E}[V|U=w]$.

Corollary 1.

Under the hypotheses of Theorem 1, $R_{C}(u)=\frac{1}{u}\int_{0}^{u}\mathbb{E}[V|U=w]dw,\forall u\in(0,1)$ .

Proof. By Theorem 1, $R_{C}(u)=1-\frac{1}{u}\int_{0}^{1}C(u,v)dv$ , then

$\displaystyle R_{C}(u)=\frac{1}{u}\int_{0}^{u}dw-\frac{1}{u}\int_{0}^{1}\int_{0}^{u}C^{(1)}(w,v)dwdv=\frac{1}{u}\int_{0}^{u}\left(1-\int_{0}^{1}C^{(1)}(w,v)dv\right)dw.$

The last term is equal to $\frac{1}{u}\int_{0}^{u}\mathbb{E}[V|U=w]dw$ according to Sungur [15]. $\hfill\square$
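As a quick check of Corollary 1, consider the FGM copula $C(u,v)=uv+\theta uv(1-u)(1-v)$ (our choice of illustration). Here $C^{(1)}(w,v)=v+\theta v(1-v)(1-2w)$, so that $\mathbb{E}[V|U=w]=1-\int_{0}^{1}C^{(1)}(w,v)dv=\frac{1}{2}-\frac{\theta}{6}(1-2w)$, and averaging over $(0,u)$ gives

$\displaystyle\frac{1}{u}\int_{0}^{u}\left(\frac{1}{2}-\frac{\theta}{6}(1-2w)\right)dw=\frac{1}{2}-\frac{\theta}{6}(1-u)=\frac{3+\theta(u-1)}{6},$

which is the same expression that Theorem 1 yields directly, since $\int_{0}^{1}C(u,v)dv=\frac{u}{2}+\frac{\theta}{6}u(1-u)$.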

By means of Theorem 1 we can compute the cumulative conditional expectation function for several cases, as explored in Fernández and González-López [3]. We now show how to take advantage of certain structures of copula functions to calculate the function $R_{C}(\cdot)$.

The following result shows that the cumulative conditional expectation of a convex mixture of copulas is the convex mixture of the cumulative conditional expectations. It is clear that the mixture $\sum_{\gamma=1}^{\Gamma}p_{\gamma}C_{\gamma}(u,v)$ of copulas $C_{1},\ldots,C_{\Gamma}$ is also a copula when $\sum_{\gamma=1}^{\Gamma}p_{\gamma}=1$ and $0\leqslant p_{\gamma}\leqslant 1$. The usefulness of this result is supported by the applications of convex combinations of copulas found in the literature, for example, in financial markets [5], in dynamical clustering [17] and to represent different experts’ information about the parameters under a Bayesian perspective [1]. A complete explanation of this last, recent approach can be found in Fernández et al. [4].

Corollary 2.

Under the hypotheses of Theorem 1, let $C$ be the convex combination copula, $C(u,v)=\sum_{\gamma=1}^{\Gamma}p_{\gamma}C_{\gamma}(u,v)$ where $C_{\gamma}$ is a copula function, $0\leqslant p_{\gamma}\leqslant 1$ for $1\leqslant\gamma\leqslant\Gamma$ and $\sum_{\gamma=1}^{\Gamma}p_{\gamma}=1$ . Then, $R_{C}(u)=\sum_{\gamma=1}^{\Gamma}p_{\gamma}R_{C_{\gamma}}(u)$ .

Proof. It is a direct consequence of Theorem 1, since

$\displaystyle R_{C}(u)=1-\frac{1}{u}\int_{0}^{1}\left[\sum_{\gamma=1}^{\Gamma}p_{\gamma}C_{\gamma}(u,v)\right]dv=\sum_{\gamma=1}^{\Gamma}p_{\gamma}-\sum_{\gamma=1}^{\Gamma}\left[p_{\gamma}\frac{1}{u}\int_{0}^{1}C_{\gamma}(u,v)dv\right]=\sum_{\gamma=1}^{\Gamma}p_{\gamma}R_{C_{\gamma}}(u).$

$\hfill\square$
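The linearity stated in Corollary 2 can be checked numerically. The sketch below is only illustrative: the three components (the Fréchet upper bound $\min(u,v)$, the Fréchet lower bound $\max(u+v-1,0)$ and the independence copula), the weights and all function names are our own choices, not taken from the text. For these components Theorem 1 gives $u/2$, $1-u/2$ and $1/2$, respectively.

```python
def ccef(copula, u, n=20000):
    """R_C(u) = 1 - (1/u) * int_0^1 C(u, v) dv, approximated by a midpoint rule."""
    h = 1.0 / n
    return 1.0 - sum(copula(u, (j + 0.5) * h) for j in range(n)) * h / u


def upper_bound(u, v):
    # Frechet upper bound; Theorem 1 gives R(u) = u / 2
    return min(u, v)


def lower_bound(u, v):
    # Frechet lower bound; Theorem 1 gives R(u) = 1 - u / 2
    return max(u + v - 1.0, 0.0)


def independence(u, v):
    # Independence copula; R(u) = 1 / 2
    return u * v


def mixture(weights, copulas):
    """Convex combination sum_gamma p_gamma * C_gamma(u, v)."""
    def c(u, v):
        return sum(p * ck(u, v) for p, ck in zip(weights, copulas))
    return c


weights = (0.5, 0.3, 0.2)
components = (upper_bound, lower_bound, independence)
u = 0.4

lhs = ccef(mixture(weights, components), u)                        # R of the mixture
rhs = sum(p * ccef(ck, u) for p, ck in zip(weights, components))   # mixture of the R's
print(lhs, rhs)
```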

In the next result we study a specific class of copulas, the polynomial ones. Recall that a copula has polynomial cross sections in $u$ if it can be written as $C(u,v)=\sum_{i=1}^{k}\alpha_{i}(v)u^{i}$ for each $u\in[0,1]$, for suitable functions $\alpha_{i}$, $i=1,\ldots,k$. More details about these copulas can be found in Nelsen [12], Subsection 3.2. The next corollary follows as an immediate consequence of Theorem 1.

Corollary 3.

Under the hypotheses of Theorem 1, if $C(u,v)=\sum_{i=1}^{k}\alpha_{i}(v)u^{i}$, then $R_{C}(u)$ is a polynomial of degree $k-1$ given by $R_{C}(u)=1-\sum_{i=0}^{k-1}u^{i}\int_{0}^{1}\alpha_{i+1}(v)dv$.

Example.

Quadratic and cubic cross sections copulas.

1. The Farlie-Gumbel-Morgenstern (FGM) family contains all the copulas with quadratic cross sections in both variables. It is given by $C(u,v)=uv+\theta uv(1-u)(1-v)$ with $\theta\in[-1,1]$, and then $R_{C}(u)=(3+\theta(u-1))/6$.
2. Lin’s iterated FGM family is defined by $C(u,v)=uv+\theta uv(1-u)(1-v)(1+\varphi(1-u)(1-v))$ with $\theta$ and $\varphi$ real numbers satisfying $\theta\in[-1,1]$ and $-1-\theta\leqslant\theta(1+\varphi)\leqslant(3-\theta+(9-6\theta-3\theta^{2})^{1/2})/2$. $C$ has cubic cross sections in both variables. For $u\in[0,1]$, $C(u,v)=\sum_{i=1}^{3}\alpha_{i}(v)u^{i}$ where

$\displaystyle\alpha_{1}(v)=(1+\theta+\varphi\theta)v+(-\theta-2\varphi\theta)v^{2}+\varphi\theta v^{3},$
$\displaystyle\alpha_{2}(v)=(-\theta-2\varphi\theta)v+(\theta+4\varphi\theta)v^{2}-2\varphi\theta v^{3},$
$\displaystyle\alpha_{3}(v)=\varphi\theta v-2\varphi\theta v^{2}+\varphi\theta v^{3}.$

Then,

$\displaystyle R_{C}(u)=\left(\frac{1}{2}-\frac{\theta}{6}-\frac{\varphi\theta}{12}\right)+\left(\frac{\theta}{6}+\frac{\theta\varphi}{6}\right)u-\frac{\varphi\theta}{12}u^{2}.$
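The computation in Corollary 3 is mechanical and can be sketched with exact rational arithmetic. In the sketch below, each $\alpha_{i}$ is stored as a list of coefficients in powers of $v$; the helper names (`integrate_poly_01`, `ccef_coefficients`, `lin_alphas`) and the parameter values $\theta=\varphi=1/2$ are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def integrate_poly_01(coeffs):
    """Integral over [0, 1] of sum_j coeffs[j] * v**j."""
    return sum(Fraction(c) / (j + 1) for j, c in enumerate(coeffs))


def ccef_coefficients(alphas):
    """Coefficients r_0, ..., r_{k-1} of R_C(u) = sum_i r_i * u**i (Corollary 3).

    alphas[i - 1] holds the coefficients, in powers of v, of alpha_i(v).
    """
    r = [-integrate_poly_01(a) for a in alphas]
    r[0] += 1  # the leading "1 -" in R_C(u) = 1 - sum_i u**i * int alpha_{i+1}
    return r


def lin_alphas(theta, phi):
    """alpha_1, alpha_2, alpha_3 of Lin's iterated FGM family, as listed in the text."""
    t, p = Fraction(theta), Fraction(phi)
    return [
        [0, 1 + t + p * t, -t - 2 * p * t, p * t],
        [0, -t - 2 * p * t, t + 4 * p * t, -2 * p * t],
        [0, p * t, -2 * p * t, p * t],
    ]


theta, phi = Fraction(1, 2), Fraction(1, 2)
coefficients = ccef_coefficients(lin_alphas(theta, phi))
print(coefficients)

# Closed form from the example, for comparison
closed_form = [
    Fraction(1, 2) - theta / 6 - phi * theta / 12,
    theta / 6 + theta * phi / 6,
    -phi * theta / 12,
]
print(closed_form)
```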

Table 1
The difference $\mathbb{E}[U|V\leqslant w]-\mathbb{E}[V|U\leqslant w]$ for the cubic sections family of copulas when $b=0.5$ and the parameter $a$ takes different values

$a$	Spearman’s rho	$w=0.1$	$w=0.2$	$w=0.4$	$w=0.6$	$w=0.8$
$-2$	$-0.0416$	$-0.1500$	$-0.1000$	$-0.0250$	0.0160	0.0250
$-1$	0.0416	$-0.0900$	$-0.0600$	$-0.0150$	0.0100	0.0150
0	0.1250	$-0.0300$	$-0.0200$	$-0.0050$	0.0030	0.0050
0.25	0.1458	$-0.0150$	$-0.0100$	$-0.0020$	0.0010	0.0020
0.5	0.1666	0	0	0	0	0
0.75	0.1875	0.0150	0.0100	0.0020	$-0.0010$	$-0.0020$
1	0.2083	0.0300	0.0200	0.0050	$-0.0030$	$-0.0050$

We then explore the behavior of a concept of directional dependence that can be constructed from the curves $R_{C}$. For this we use a straightforward extension of the directional dependence defined in Sungur [14]. In Sungur [14] the author compares this concept with another, showing in several scenarios that their performance is strongly linked to the underlying copula. In particular, its adequacy is observed in the presence of asymmetry, that is, when $C(u,v)$ is different from $C(v,u)$. For this reason we chose for our exercise a biparametric family of asymmetric copulas, which includes symmetry, $C(u,v)=C(v,u)$, among the possible cases. D. Kim and J. Kim [9] propose a procedure for the inference of the directional dependence in joint behavior introduced in Sungur [14], and emphasize the way of choosing the asymmetric copula used in the modeling, since this choice is fundamental for the interpretability of the results on the directionality of the dependence.

Corollary 3 lets us easily compute the CCEF for the asymmetric cubic sections family $C(u,v)=uv+uv(1-u)(1-v)((a-b)v(1-u)+b)$, where $|b|\leqslant 1$, $\frac{b-3-\sqrt{9+6b-3b^{2}}}{2}\leqslant a\leqslant 1$ and $a\neq b$. We get $\mathbb{E}[V|U\leqslant u]=1+\frac{1}{12}(-6-b-a(-1+u)^{2}+bu^{2})$ and $\mathbb{E}[U|V\leqslant v]=1+\frac{1}{12}(-6-(v-1)(b(v-2)-av))$. Recall that Sungur [14] defines a pair $(U,V)$ as directionally dependent in joint behavior if $\mathbb{E}[V|U=w]\neq\mathbb{E}[U|V=w]$. Since $R_{C}(u)$ is functionally related to $\mathbb{E}[V|U=w]$ through Corollary 1, it is natural to explore the relation of $R_{C}(\cdot)$ with directionality through the difference $\mathbb{E}[U|V\leqslant w]-\mathbb{E}[V|U\leqslant w]$. The parameters $a$ and $b$ of the asymmetric cubic sections family determine the asymmetry: when $a=b$ the copula is symmetric, and it is asymmetric in any other case.

In Table 1 we compute this difference for $b=0.5$ and $a=-2,-1,0,0.25,0.5,0.75,1$. When $a=b$ there is no difference, but the other cases show directional dependence. For instance, for $a=0$ and $a=1$ the Spearman’s rho is not the same, yet the values of the difference are equal in absolute value, with opposite signs, showing its directional aspect. The same happens when $a=0.25$ and $a=0.75$. In fact, for $a_{1}=b+\delta$ and $a_{2}=b-\delta$ we have copulas $C_{1}$ and $C_{2}$, respectively, such that the Spearman’s rho coefficients and the functions $\mathbb{E}[U|V\leqslant w]$ and $\mathbb{E}[V|U\leqslant w]$ are different, but the difference between the cumulative conditional expectations has the same absolute value and is governed by the asymmetry given by the parameters, i.e. $\mathbb{E}_{C_{i}}[U|V\leqslant w]-\mathbb{E}_{C_{i}}[V|U\leqslant w]=\frac{1}{12}(-1)^{i+1}\delta(1-3w+2w^{2})$, $i=1,2$. Besides, the case $a=-2$ has the same Spearman’s rho, in absolute value, as $a=-1$, but the differences are not equal. Also, for this polynomial family, $\mathbb{E}[U|V\leqslant w]=\mathbb{E}[V|U\leqslant w]$ for $w=0.5$ and $w=1$. Table 1 shows the sign change of the difference at $w=0.5$.
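The entries of Table 1 can be recomputed from the two closed-form expressions for the asymmetric cubic sections family. The following Python sketch does so; the function names are our own, and the helper `factorized_difference` encodes a factorized form, $(a-b)(1-3w+2w^{2})/12$, that we obtained by expanding the two expressions and that is included only as a cross-check.

```python
def ccef_v_given_u(u, a, b):
    """E[V | U <= u] for the asymmetric cubic sections family."""
    return 1.0 + (-6.0 - b - a * (u - 1.0) ** 2 + b * u ** 2) / 12.0


def ccef_u_given_v(v, a, b):
    """E[U | V <= v] for the asymmetric cubic sections family."""
    return 1.0 + (-6.0 - (v - 1.0) * (b * (v - 2.0) - a * v)) / 12.0


def directional_difference(w, a, b):
    """E[U | V <= w] - E[V | U <= w]."""
    return ccef_u_given_v(w, a, b) - ccef_v_given_u(w, a, b)


def factorized_difference(w, a, b):
    """Cross-check: expanded and factorized form of the same difference."""
    return (a - b) * (1.0 - 3.0 * w + 2.0 * w ** 2) / 12.0


b = 0.5
for a in (-2, -1, 0, 0.25, 0.5, 0.75, 1):
    row = [directional_difference(w, a, b) for w in (0.1, 0.2, 0.4, 0.6, 0.8)]
    print(a, ["{:+.4f}".format(x) for x in row])
```

The printed values are rounded to four decimals, so a few of them may differ from the table in the last displayed digit.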

In the next section we introduce the concept of Bernstein copula, which allows us to derive approximations and estimates of $R_{C}$.

3. Bernstein copulas

Hereafter we explore the properties of the cumulative conditional expectation function for Bernstein copulas (BC). These copulas belong to the family of polynomial copulas. The bivariate BC family, introduced by Li et al. [10], is defined in terms of Bernstein polynomials. Li et al. [10] defined a stronger concept of convergence between copulas, through Markov operators, and showed that Bernstein copulas approximate any copula $C$. We will use this approximation to obtain an approximate analytical expression of the function $R_{C}$ for any copula $C$.

Definition 2.

Given a 2-copula $C$ and a grid of points $(u_{k},\ v_{l})\in[0,1]^{2}$ with $k,l=1,\dots,m$ , the Bernstein copula approximation of $C$ of order $m$ is given by

$\displaystyle B_{m}C(u,v)=\sum_{k=1}^{m}\sum_{l=1}^{m}C\left(\frac{k}{m},\frac{l}{m}\right)P_{k,m}(u)P_{l,m}(v),\forall(u,v)\in[0,1]^{2},$

where $P_{j,m}(x)=\binom{m}{j}x^{j}(1-x)^{m-j},\,\,x\in[0,1]$ .

Even though our work focuses on bivariate copulas, it is relevant to cite the work of Sancetta and Satchell [13]. In that paper the concept of Bernstein copulas is extended to the multivariate case and explored as an approximation to known copulas and as an estimator of unknown copulas.

The function $B_{m}C$ approximates the copula $C$ and is a copula itself. Now we concentrate on the approximation of known copulas, and in the next section of unknown ones, as a tool for approximating the cumulative conditional expectation of any copula. The usefulness of the next theorem is to numerically simplify the computation of $R_{C}$ , through solving polynomial integrals, when $C$ is a parametric copula described by a complex analytical expression.

Definition 3.

Consider a 2-copula $C$ , and a grid of points $(u_{k},\ v_{l})\in[0,1]^{2}$ with $k,l=1,\dots,m$ . Set the Bernstein approximation of $R_{C}$ of order $m$ as

$\displaystyle R_{B_{m}C}(u)=1-\frac{1}{(m+1)u}\sum_{k=1}^{m}\sum_{l=1}^{m}C\left(\frac{k}{m},\frac{l}{m}\right)P_{k,m}(u),\forall u\in(0,1),$

where $P_{k,m}(x)=\binom{m}{k}x^{k}(1-x)^{m-k},x\in[0,1]$ .

Remark.

Definition 3 follows from Theorem 1 applied to the approximation of $C$ given by Definition 2, that is,

$\displaystyle R_{B_{m}C}(u)=1-\frac{1}{u}\int_{0}^{1}B_{m}C(u,v)dv.$

For each $m$ and $l$ with $l\leqslant m$ , $\int_{0}^{1}P_{l,m}(v)dv=\binom{m}{l}\textit{Beta}(l+1,m-l+1)=\frac{1}{m+1}$ . Then, $R_{B_{m}C}(u)$ corresponds to Definition 3.
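Definition 3 translates almost directly into code. The following Python sketch evaluates $R_{B_{m}C}(u)$ for a copula supplied as a function; the names `bernstein_basis`, `ccef_bernstein` and `fgm`, and the choice of the FGM copula with $\theta=1$ as a test case, are assumptions made for illustration only. For the FGM copula the exact value $(3+\theta(u-1))/6$ is available, so the approximation error can be inspected for several orders $m$.

```python
from math import comb


def bernstein_basis(k, m, x):
    """P_{k,m}(x) = binom(m, k) * x**k * (1 - x)**(m - k)."""
    return comb(m, k) * x ** k * (1.0 - x) ** (m - k)


def ccef_bernstein(copula, u, m):
    """R_{B_m C}(u) as in Definition 3 (only the grid values C(k/m, l/m) are used)."""
    total = 0.0
    for k in range(1, m + 1):
        row_sum = sum(copula(k / m, l / m) for l in range(1, m + 1))
        total += row_sum * bernstein_basis(k, m, u)
    return 1.0 - total / ((m + 1) * u)


def fgm(theta):
    """Farlie-Gumbel-Morgenstern copula, used here only as a test case."""
    def c(u, v):
        return u * v + theta * u * v * (1.0 - u) * (1.0 - v)
    return c


theta, u = 1.0, 0.3
exact = (3.0 + theta * (u - 1.0)) / 6.0
for m in (10, 50, 200):
    approx = ccef_bernstein(fgm(theta), u, m)
    print(m, approx, abs(approx - exact))
```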

Theorem 2.

Under the hypotheses of Theorem 1, $R_{B_{m}C}(u)$ converges to $R_{C}(u)$ pointwise, when $m\to\infty$ .

Proof. See the Appendix. $\hfill\square$

For suitable copulas it is possible to give a rate of approximation for $R_{C}$, and we also observe that this rate improves as $u$ increases.

Theorem 3.

Let $C$ be a continuous copula with first order partial derivatives being Lipschitz and $u_{0}\in(0,1)$ . For $u\in[u_{0},1)$ it holds $|R_{C}(u)-R_{B_{m}C}(u)|\leqslant\frac{7M}{12u_{0}m}$ for a constant $M$ . Then, $R_{B_{m}C}$ converges uniformly to $R_{C}$ in $[u_{0},1)$ , when $m\to\infty$ .

Proof. By Theorem 1 we know that $R_{C}(u)-R_{B_{m}C}(u)=\frac{1}{u}\int_{0}^{1}(B_{m}C(u,v)-C(u,v))dv$. In addition, in Sancetta and Satchell [13] (Theorem 2), it is proved that a continuous copula $C$ with Lipschitz partial derivatives satisfies $|C(u,v)-B_{m}C(u,v)|\leqslant\frac{M}{2m}(u(1-u)+v(1-v))$ for a constant $M$. Then, it can be seen that

$\displaystyle|R_{C}(u)-R_{B_{m}C}(u)|\leqslant\frac{M}{2mu}\int_{0}^{1}(u(1-u)+v(1-v))dv\leqslant\frac{7M}{12u_{0}m},$

for $u\geqslant u_{0}$. The last bound tends to $0$ when $m$ goes to infinity, uniformly in $[u_{0},1)$. $\hfill\square$
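The last inequality in the proof can be spelled out as follows. Evaluating the integral and using $u_{0}\leqslant u<1$,

$\displaystyle\frac{M}{2mu}\int_{0}^{1}(u(1-u)+v(1-v))dv=\frac{M}{2m}\left((1-u)+\frac{1}{6u}\right)\leqslant\frac{M}{2m}\left(1+\frac{1}{6u_{0}}\right)\leqslant\frac{M}{2m}\cdot\frac{7}{6u_{0}}=\frac{7M}{12u_{0}m},$

where the last step uses $1\leqslant 1/u_{0}$. The middle expression $(1-u)+\frac{1}{6u}$ is decreasing in $u$ on $(0,1)$, which is consistent with the observation that the rate improves as $u$ increases.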

We exemplify the approximation of $R_{C}(u)$ through $R_{B_{m}C}(u)$ by exploring the case reported in Sancetta and Satchell [13] (Table 1).

Example.

Consider the Kimeldorf and Sampson (KS) copula given by $C(u,v)=(u^{-\theta}+v^{-\theta}-1)^{-1/\theta},\theta\geqslant 0.$

In Figs 1 and 2 we show the function $R_{C}$ for $C$ being the KS copula and its approximation by Bernstein copulas with $m=10,30,50,100,200,300$, for several values of $\theta$. It can be seen that, for all values of $\theta$, the approximation improves for higher values of $u$. This is a consequence of Theorem 3, since the KS copula is a continuous copula with first order partial derivatives satisfying a Lipschitz condition.

We note that in the example, the evaluation of $R_{C}$ was done by solving numerically the integration given in Theorem 1 through the “NIntegrate” function of Mathematica 9.0 software.
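A comparison of this kind can be sketched in Python without specialized software. The code below is not the computation used for the figures: it replaces the numerical integrator by a simple midpoint rule, and the function names, the number of nodes, the point $u=0.5$ and the subset of orders $m$ are all our own illustrative choices. The value $\theta=1.06$ is one of those shown in Fig. 1.

```python
from math import comb


def ks_copula(theta):
    """Kimeldorf and Sampson copula, evaluated here for theta > 0."""
    def c(u, v):
        return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)
    return c


def ccef_numeric(copula, u, n=20000):
    """R_C(u) from Theorem 1, with the integral replaced by a midpoint rule."""
    h = 1.0 / n
    return 1.0 - sum(copula(u, (j + 0.5) * h) for j in range(n)) * h / u


def ccef_bernstein(copula, u, m):
    """R_{B_m C}(u) from Definition 3."""
    total = 0.0
    for k in range(1, m + 1):
        weight = comb(m, k) * u ** k * (1.0 - u) ** (m - k)
        total += weight * sum(copula(k / m, l / m) for l in range(1, m + 1))
    return 1.0 - total / ((m + 1) * u)


theta, u = 1.06, 0.5
copula = ks_copula(theta)
reference = ccef_numeric(copula, u)
errors = {m: abs(ccef_bernstein(copula, u, m) - reference) for m in (10, 30, 50, 100)}
for m, err in errors.items():
    print(m, round(err, 5))
```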

Figure 1.

Approximation of the cumulative conditional expectation function of the Kimeldorf and Sampson (KS) copula by $R_{B_{m}C}$ (Definition 3) for $\theta=0.14,0.31,0.51,0.76,1.06$ .

Figure 2.

Approximation of the cumulative conditional expectation function of the Kimeldorf and Sampson (KS) copula by $R_{B_{m}C}$ (Definition 3) for $\theta=1.51,2.14,3.19,5.56$ .

4. Estimation

Consider a bivariate random sample $\{(U_{j},V_{j})\}_{j=1}^{n}$ of the vector $(U,V)$ with associated unknown 2-copula $C$. Let $R_{U_{j}}$ be the rank of $U_{j}$ in $\left\{U_{1},\ldots,U_{n}\right\}$ and $R_{V_{j}}$ the rank of $V_{j}$ in $\left\{V_{1},\ldots,V_{n}\right\}$. Denote the empirical copula as

$\displaystyle C_{n}(u,v)=\frac{1}{n}\sum_{j=1}^{n}I\left(\frac{R_{U_{j}}}{n}\leqslant u\right)I\left(\frac{R_{V_{j}}}{n}\leqslant v\right),\forall u,v\in[0,1].$

By using the Bernstein estimator of the copula function $C$ proposed in Sancetta and Satchell [13], we build consistent estimators of the cumulative conditional expectation function.

Definition 4.

Given a 2-copula $C,$ the estimator of $R_{C}(u)$ is

$\displaystyle\hat{R}_{B_{m}C_{n}}(u)=1-\frac{1}{(m+1)u}\sum_{k=0}^{m}\sum_{l=0}^{m}C_{n}\left(\frac{k}{m},\frac{l}{m}\right)P_{k,m}(u),\forall u\in(0,1),$

where $C_{n}$ is the empirical copula.

Remark.

The estimator of $R_{C}$ is derived from the Bernstein estimator of order $m>0$, $B_{m}C_{n}(u,v)=\sum_{k=0}^{m}\sum_{l=0}^{m}C_{n}(\frac{k}{m},\frac{l}{m})P_{k,m}(u)P_{l,m}(v)$, by applying the transformation $1-\frac{1}{u}\int_{0}^{1}B_{m}C_{n}(u,v)dv,\forall u\in(0,1)$. Note that the order $m$ depends on $n$, $m=m(n)$, and $m\to\infty$ as $n\to\infty$.
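A minimal Python sketch of the estimator in Definition 4 is given below. It is not the implementation used for the application: the function names, the tie-breaking rule in `ranks` (order of appearance) and the simulated data are assumptions made only for illustration. The comparisons $R/n\leqslant k/m$ are carried out with integers to avoid rounding issues on the grid.

```python
import random
from math import comb


def ranks(x):
    """Ranks 1..n of the entries of x (ties broken by order of appearance)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for position, i in enumerate(order, start=1):
        r[i] = position
    return r


def ccef_bernstein_estimator(xs, ys, u, m):
    """Estimator of R_C(u) from Definition 4, with C_n the empirical copula."""
    n = len(xs)
    ru, rv = ranks(xs), ranks(ys)
    total = 0.0
    for k in range(m + 1):
        count = 0
        for a, b in zip(ru, rv):
            if a * m <= k * n:  # R_U / n <= k / m
                # number of l in {0, ..., m} with R_V / n <= l / m
                count += m - (-(-b * m // n)) + 1
        # count / n equals sum over l of C_n(k/m, l/m)
        total += (count / n) * comb(m, k) * u ** k * (1.0 - u) ** (m - k)
    return 1.0 - total / ((m + 1) * u)


# Simulated, positively dependent sample (illustrative data only)
rng = random.Random(0)
xs = [rng.random() for _ in range(300)]
ys = [x + 0.5 * rng.random() for x in xs]
for u in (0.2, 0.5, 0.8):
    print(u, ccef_bernstein_estimator(xs, ys, u, m=18))
```

Only the ranks enter the computation, so any strictly increasing transformation of `xs` or `ys` leaves the estimate unchanged.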

In order to explore the asymptotic behavior of the estimator $\hat{R}_{B_{m}C_{n}}(u)$ we make use of some known results about the Bernstein estimator. Janssen et al. [7] (Theorem 2 and Remark 3) proved that for a copula $C$ with bounded third order partial derivatives on $(0,1)^{2}$ , if $\sqrt{n}/m\rightarrow d$ , $0\leqslant d<\infty$ , when $n\to\infty$ , then the process $\sqrt{n}(B_{m}C_{n}(u,v)-C(u,v))\leadsto{\mathcal{G}}_{C}(u,v)$ in the space $l^{\infty}((0,1)^{2})$ of bounded functions, where $\leadsto$ denotes weak convergence. In addition, the limiting process ${\mathcal{G}}_{C}(u,v)$ is a tight Gaussian process with mean function $db(u,v)$ where

$b(u,v)=\frac{1}{2}(u(1-u)C^{(1,1)}(u,v)+v(1-v)C^{(2,2)}(u,v))$ (1)

and covariance function given by $E[h(u,v)h(u^{\prime},v^{\prime})]$ for $0<u,u^{\prime},v,v^{\prime}<1$ with

$\displaystyle h(u,v)=I(U\leqslant u,V\leqslant v)-C(u,v)-C^{(1)}(u,v)(I(U\leqslant u)-u)-C^{(2)}(u,v)(I(V\leqslant v)-v).$ (2)

This time we consider the process $\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv$ , with $u\in(0,1)$ , which is the process that defines the asymptotic behavior of the proposed estimator of $R_{C}$ .

Theorem 4.

Let $C$ be a 2-copula with bounded third order partial derivatives on $(0,1)^{2}$ . If $n$ and $m$ are positive integers such that $\sqrt{n}/m\rightarrow d$ with $0\leqslant d<\infty$ when $n,m\to\infty$ , then for $u\in(0,1)$ ,

$\displaystyle\sqrt{n}(\hat{R}_{B_{m}C_{n}}(u)-R_{C}(u))\leadsto-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv.$

The limit $-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv$ is a Gaussian process with mean function $d\left(\frac{1}{2}-R_{C}(u)\right)+\frac{d(u-1)}{2}\int_{0}^{1}C^{(1,1)}(u,v)dv$ and variance function

$\displaystyle Var\left[-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv\right]=-4R_{C}^{2}(u)+\left(4-\frac{1}{u}\right)R_{C}(u)-4\left(2-\frac{1}{u}\right)R_{C}(u)\mathcal{H}_{1}(u,u)+2\left(3-\frac{2}{u}\right)\mathcal{H}_{1}(u,u)$
$\displaystyle\quad{}-4\left(1-\frac{1}{u}\right)\mathcal{H}_{1}^{2}(u,u)-2+\frac{1}{u}+\frac{1}{u^{2}}\int_{0}^{1}\left(C^{(2)}(u,v)\right)^{2}vdv,$

where $2\mathcal{H}_{1}(u,u)=\int_{0}^{1}C^{(1)}(u,v)dv$ .

Proof. See the Appendix. $\hfill\square$

Remark.

The covariance function of the process $-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv$ is

$\displaystyle Cov\left[-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv,-\frac{1}{u^{\prime}}\int_{0}^{1}{\mathcal{G}}_{C}(u^{\prime},v^{\prime})dv^{\prime}\right]$
$\displaystyle\quad{}=\frac{1}{uu^{\prime}}\int_{0}^{1}\int_{0}^{1}C(u\wedge u^{\prime},v\wedge v^{\prime})dv^{\prime}dv+(1-R_{C}(u))(R_{C}(u^{\prime})-1)$
$\displaystyle\quad{}+\mathcal{H}_{3}(u,u^{\prime})+\mathcal{H}_{3}(u^{\prime},u)+\left(\frac{1}{u\vee u^{\prime}}-1\right)\int_{0}^{1}C^{(1)}(u^{\prime},v)dv\int_{0}^{1}C^{(1)}(u,v)dv$
$\displaystyle\quad{}-\mathcal{H}_{2}(u,u^{\prime})-\mathcal{H}_{2}(u^{\prime},u)+\frac{1}{uu^{\prime}}\int_{0}^{1}\int_{0}^{1}C^{(2)}(u^{\prime},v^{\prime})C^{(2)}(u,v)(v\wedge v^{\prime})dvdv^{\prime}$
$\displaystyle\quad{}-R_{C}(u)R_{C}(u^{\prime})+\mathcal{H}_{1}(u,u^{\prime})+\mathcal{H}_{1}(u^{\prime},u),$

with

$\displaystyle\mathcal{H}_{1}(w_{1},w_{2})=\frac{1}{w_{1}w_{2}}\int_{0}^{1}C^{(1)}(w_{1},v)dv\int_{0}^{1}C(w_{1},v)C^{(2)}(w_{2},v)dv,$
$\displaystyle\mathcal{H}_{2}(w_{1},w_{2})=\frac{1}{w_{1}w_{2}}\int_{0}^{1}\int_{0}^{1}C^{(2)}(w_{2},v)C(w_{1},v\wedge v^{\prime})dvdv^{\prime}-(1-R_{C}(w_{1}))R_{C}(w_{2}),$
$\displaystyle\mathcal{H}_{3}(w_{1},w_{2})=\left(\frac{1}{w_{1}\vee w_{2}}(R_{C}(w_{1}\wedge w_{2})-1)+1-2R_{C}(w_{1})\right)\int_{0}^{1}C^{(1)}(w_{2},v)dv,$

for $w_{1}\vee w_{2}=\max\{w_{1},w_{2}\}$ and $w_{1}\wedge w_{2}=\min\{w_{1},w_{2}\}$. See the Appendix for details of the calculation.

Remark.

The variance of the process $-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv$ can also be written in terms of $R_{C}$ and $r_{C}$, where $r_{C}$ is the copula regression function, that is, $r_{C}(u)=\mathbb{E}[V|U=u]$. By using the relations $1-r_{C}(u)=\int_{0}^{1}C^{(1)}(u,v)dv=2\mathcal{H}_{1}(u,u)$, the variance is

$\displaystyle Var\left[-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv\right]=-4R_{C}^{2}(u)+\frac{1}{u}R_{C}(u)+2\left(2-\frac{1}{u}\right)R_{C}(u)r_{C}(u)-r_{C}(u)-\left(1-\frac{1}{u}\right)r_{C}^{2}(u)+\frac{1}{u^{2}}\int_{0}^{1}(C^{(2)}(u,v))^{2}vdv.$

Table 2

For the sample of each course, the pairwise Spearman’s rho coefficient, the sample size $n$ and the Bernstein polynomial degree $m$ used in Fig. 3 are reported

Engineering course	Spearman’s rho	Sample size ( $n$ )	$m$
Agricultural	0.261	586	25
Civil	0.203	699	27
Computer	0.284	771	28
Electrical	0.353	597	25
Mechanical	0.288	1204	35

Figure 3.

Cumulative conditional expectation function for each of the Engineering courses where $R_{C}$ was obtained through $\hat{R}_{B_{m}C_{n}}$ (Definition 4), with $m$ taken as reported in Table 2 (left), and through the empirical copula (right).

Figure 4.

Cumulative conditional expectation function for each of the Engineering courses where $R_{C}$ was obtained through $\hat{R}_{B_{m}C_{n}}$ Definition 4, with $m=5,10,15,20$ . The legends are: $\Box$ for agricultural, $\ast$ for civil, $\circ$ for computer, $\blacktriangleright$ for electrical and $\bullet$ for mechanical course.

4.1 Application

We estimate $R_{C}$ to quantify the impact of the Mathematics admission test score on the Calculus performance of undergraduate students. In this application we work with the students of the undergraduate courses of Agricultural, Civil, Computer, Electrical and Mechanical Engineering of the University of Campinas, Brazil. We considered the students admitted to the university from 2003 to 2011. For each student we took the Mathematics admission score, denoted by $X$, and the Calculus I score obtained after admission, denoted by $Y$. An annual standardization per course was used to avoid the effect of different tests applied each year. Some characteristics of the data are displayed in Table 2. Regarding the polynomial degree $m$, we follow a reasoning analogous to that of Janssen et al. [8] (Remark 1), but for the copula function. In Janssen et al. [7] it is shown that $B_{m}C_{n}-C$ has mean function $m^{-1}b(u,v)+o(m^{-1})$ and variance $\sigma^{2}(u,v)-m^{-1/2}V(u,v)$, where $\sigma^{2}(u,v)=Var(h(u,v))$ and

$\displaystyle V(u,v)=\left(C^{(1)}(u,v)(1-C^{(1)}(u,v))\left(\frac{u(1-u)}{\pi}\right)^{1/2}\right)+\left(C^{(2)}(u,v)(1-C^{(2)}(u,v))\left(\frac{v(1-v)}{\pi}\right)^{1/2}\right).$

Now, let us assume $m=n^{\alpha}$. Because of the condition $\sqrt{n}/m\rightarrow d$ of Theorem 4 and the reduction of the bias, it should be $\alpha\geqslant 1/2$. On the other hand, when $m$ diminishes, so does the variance. Then, an ideal degree could be $m=\lceil n^{1/2}\rceil$. In Fig. 3 the estimator $\hat{R}_{B_{m}C_{n}}$ is shown for each engineering course, where the vector $(U,V)$ is obtained by transforming $X$ and $Y$ by scaling ranks to $[0,1]$. As a benchmark we also plot the CCEF for the empirical copula of each course, given by $\hat{R}_{C_{n}}(u)=1-\frac{1}{un}\sum_{j=1}^{n}\left(1-\frac{R_{V_{j}}}{n}\right)I\left(\frac{R_{U_{j}}}{n}\leqslant u\right)$.
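The rule $m=\lceil n^{1/2}\rceil$ and the benchmark $\hat{R}_{C_{n}}$ are simple enough to sketch in Python. The sample sizes below are those of Table 2; the function name `ccef_empirical` and the toy ranks used to exercise it are our own illustrative choices, since the student data are not reproduced here.

```python
from math import ceil, sqrt

# Sample sizes n per course, as reported in Table 2
sample_sizes = {
    "Agricultural": 586,
    "Civil": 699,
    "Computer": 771,
    "Electrical": 597,
    "Mechanical": 1204,
}

# Degree rule m = ceil(sqrt(n))
degrees = {course: ceil(sqrt(n)) for course, n in sample_sizes.items()}
print(degrees)


def ccef_empirical(ru, rv, u):
    """Empirical-copula CCEF: 1 - (1/(u n)) * sum_j (1 - R_Vj/n) * I(R_Uj/n <= u)."""
    n = len(ru)
    total = sum(1.0 - b / n for a, b in zip(ru, rv) if a <= u * n)
    return 1.0 - total / (u * n)


# Toy ranks (illustrative only): perfectly concordant pairs
toy_ranks = list(range(1, 11))
print(ccef_empirical(toy_ranks, toy_ranks, 0.5))
```

The computed degrees can be compared directly with the column $m$ of Table 2.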

Figure 3 shows that there are some differences between the courses, which is valuable information for academic management. For instance, suppose that the university wants to offer extra lessons to improve the performance in math of the young students. Then, it would be more efficient to use different rules for defining the group for extra lessons, because there may be several types of behavior of $R_{C}$. For example, in our data we observe two types of situations: (i) Civil, Electrical and Mechanical Engineering and (ii) Agricultural and Computer Engineering. In the first case, the behavior of $R_{C}$ is approximately linear, and in the second case $R_{C}$ tends to behave as a second order polynomial. Quadratic forms of type (ii) require strategies to efficiently define the groups of students who would need assistance in math. For example, if we set $R_{C}=0.43$ in computer engineering, there are two points that could be of interest, $u=0.207$ and $u=0.440$, since $\hat{R}_{B_{m}C_{n}}(0.207)=0.43$ and $\hat{R}_{B_{m}C_{n}}(0.440)=0.43$.

Finally, we empirically explore the performance of $\hat{R}_{B_{m}C_{n}}$ for different degrees $m$, since the $m$ listed in Table 2 is quite high and this could over-fit the model. In Fig. 4 we can see that the degree does not affect the classification of the courses into two groups: one almost linear and the other almost quadratic. In Table 3 we exemplify these two behaviors, showing some values for the electrical course, with linear CCEF, and for the computer course, with quadratic CCEF. In addition, it can be seen that, for each $u$, $\hat{R}_{B_{m}C_{n}}(u)$ decreases when $m$ gets higher and gets closer to $\hat{R}_{C_{n}}(u)$ in regions where $\hat{R}_{B_{m}C_{n}}$ is increasing, that is, $u=0.1,0.2,0.4,0.6$ for electrical and $u=0.2,0.4,0.6$ for computer. This suggests a possible criterion for choosing the parameter $m$ by analysing the monotonic behavior of $\hat{R}_{C_{n}}(u)$. In any case, more study is needed regarding the performance of several criteria for setting $m$, in the context of the cumulative function as well as in the more general context of Bernstein copulas.

Table 3
Electrical and computer engineering courses values of $\hat{R}_{B_{m}C_{n}}$ and $\hat{R}_{C_{n}}$

	Electrical				Computer
	$u=0.1$	$u=0.2$	$u=0.4$	$u=0.6$	$u=0.1$	$u=0.2$	$u=0.4$	$u=0.6$
Empirical	0.3290	0.3408	0.3990	0.4403	0.4380	0.4286	0.4122	0.4463
$m=20$	0.3413	0.3626	0.4089	0.4441	0.4452	0.4288	0.4292	0.4507
$m=15$	0.3461	0.3706	0.4138	0.4459	0.4452	0.4311	0.4332	0.4526
$m=10$	0.3591	0.3798	0.4176	0.4490	0.4372	0.4319	0.4377	0.4564
$m=5$	0.3832	0.4046	0.4335	0.4580	0.4374	0.4392	0.4496	0.4648

5. Conclusion

From the results derived in Section 2, the shapes and properties of the function $R_{C}$ can be studied under the assumption of stochastic structures. Some progress in this line can be seen in Fernández and González-López [3]. Moreover, Proposition 1 of Fernández and González-López [3] allows us to understand theoretically why the cumulative conditional function of each course (in our application) converges to $0.5$ when $u$ gets closer to 1, and this also explains why the difference between the functions diminishes when $u$ increases (see Fig. 3).

Theorems 2 and 3 show how, under the assumption that the copula is known on a given grid of points, the Bernstein copula can be used to obtain approximations $R_{B_{m}C}$ of $R_{C}$ which converge pointwise and uniformly to $R_{C}$. Illustrations of those results are shown in the case of the Kimeldorf and Sampson copula family. Our results may contribute to the study of convergence properties in particular cases of copulas. For situations where the underlying copula is unknown, we propose an estimator of $R_{C}$ based on the Bernstein estimator of the copula function. Theorem 4 shows that the estimator $\hat{R}_{B_{m}C_{n}}$ is an asymptotically biased normal estimator of $R_{C}$, with a variance that vanishes, when $\sqrt{n}/m\to d$ and $n$ goes to infinity. We exemplify the use of this estimator using real data, revealing the practical potential of using $\hat{R}_{B_{m}C_{n}}$ to make decisions that could impact educational policies.

Acknowledgments

The authors gratefully acknowledge the financial support for this research provided by FAPESP Post-doctoral Grant 2011/18285-6 to M. Fernández. Also, we wish to thank three referees and an associate editor for their many helpful comments and suggestions on an earlier draft of this paper. Special thanks to Prof. Marcelo Knobel for making the data used in this paper available to us.

Appendix

Proof of Theorem 1

Let us take $u\in(0,1)$. We recall that $\mathbb{P}(V\leqslant v|U\leqslant u)=\frac{C(u,v)}{u}$ and that, for every $u\in[0,1]$, $C^{(2)}(u,v)$ exists for almost every $v\in[0,1]$ (Nelsen [12], Theorem 2.2.7). Here “almost every” is in the sense of Lebesgue measure. So $\mathbb{P}(V\leqslant v|U\leqslant u)=\int_{0}^{v}\frac{C^{(2)}(u,t)}{u}dt$ for each $v\in[0,1]$. Let us denote the set of points where $C(u,\cdot)$ is not differentiable by $\{v_{j}\}_{j=1}^{\infty}$ and assume it is arranged in nondecreasing order. To simplify the notation we set $v_{0}=0$. In addition, if $C(u,\cdot)$ fails to be differentiable only on a finite set $\{v_{1},\dots,v_{k}\}$, we define $v_{n}=1$ for all $n>k$. Integrating by parts on each interval $[v_{j},v_{j+1}]$, we obtain the stated relation:

$\displaystyle\mathbb{E}[V|U\leqslant u]=\int_{0}^{1}v\frac{C^{(2)}(u,v)}{u}dv=\frac{1}{u}\sum_{j=0}^{\infty}\int_{v_{j}}^{v_{j+1}}vC^{(2)}(u,v)dv$
$\displaystyle\quad=\frac{1}{u}\sum_{j=0}^{\infty}\left[C(u,v)v\Big|_{v=v_{j}}^{v=v_{j+1}}-\int_{v_{j}}^{v_{j+1}}C(u,v)dv\right]$
$\displaystyle\quad=\frac{1}{u}\left[C(u,1)-\sum_{j=0}^{\infty}\int_{v_{j}}^{v_{j+1}}C(u,v)dv\right]=1-\frac{1}{u}\int_{0}^{1}C(u,v)dv.$

$\hfill\square$
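As a numerical illustration of the identity just proved, the following sketch evaluates $R_{C}(u)=1-\frac{1}{u}\int_{0}^{1}C(u,v)dv$ with a midpoint rule. The function names (`ccef`, `independence`, `comonotone`) and the grid size are illustrative assumptions and not part of the results above. The two test copulas have closed forms to compare against: $C(u,v)=uv$ gives $R_{C}(u)=1/2$, and $C(u,v)=\min(u,v)$ gives $R_{C}(u)=u/2$.

```python
def ccef(copula, u, n_grid=10_000):
    """Approximate R_C(u) = 1 - (1/u) * int_0^1 C(u, v) dv by the midpoint rule."""
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    h = 1.0 / n_grid
    # Midpoints (k + 1/2) * h of the n_grid subintervals of [0, 1].
    integral = h * sum(copula(u, (k + 0.5) * h) for k in range(n_grid))
    return 1.0 - integral / u


def independence(u, v):
    """Independence copula C(u, v) = u * v."""
    return u * v


def comonotone(u, v):
    """Comonotone (upper Frechet bound) copula C(u, v) = min(u, v)."""
    return min(u, v)


if __name__ == "__main__":
    for u in (0.1, 0.4, 0.8, 1.0):
        print(u, ccef(independence, u), ccef(comonotone, u))
```

For both copulas the value at $u=1$ is $1/2$, as it must be for any copula, since $C(1,v)=v$ and hence $R_{C}(1)=\mathbb{E}[V]=1/2$.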

Proof of Theorem 2

For each $u\in[0,1]$ let us define the functions $f_{m}(v)=B_{m}C(u,v)$, where $B_{m}C$ is the Bernstein copula approximation of $C$ of order $m$ given by Definition 2, and $f(v)=C(u,v)$ for $v\in[0,1]$. As shown by Li et al. [10], $f_{m}$ converges strongly to $f$; they also show that strong convergence of copulas is strictly stronger than uniform convergence. In addition, we have the bound $|f_{m}|\leqslant 1$ for all $m=1,2,\ldots$. Then, by the Lebesgue Dominated Convergence Theorem, $\underset{m\rightarrow\infty}{\lim}\int_{0}^{1}f_{m}(v)dv=\int_{0}^{1}f(v)dv$. Hence, recalling that $R_{B_{m}C}(u)=1-\frac{1}{u}\int_{0}^{1}f_{m}(v)dv$ and $R_{C}(u)=1-\frac{1}{u}\int_{0}^{1}f(v)dv$, for each $u\in(0,1)$ we get $R_{B_{m}C}(u)\longrightarrow R_{C}(u)$ as $m\to\infty$. $\hfill\square$
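The convergence can be observed numerically with a minimal sketch, assuming the usual form of the Bernstein copula, $B_{m}C(u,v)=\sum_{i=0}^{m}\sum_{j=0}^{m}C(i/m,j/m)P_{m,i}(u)P_{m,j}(v)$ with $P_{m,i}(u)=\binom{m}{i}u^{i}(1-u)^{m-i}$; Definition 2 should be consulted for the exact form used in the paper. Since $\int_{0}^{1}P_{m,j}(v)dv=1/(m+1)$ for every $j$, the integral in $R_{B_{m}C}(u)$ reduces to a finite sum. The names `bernstein_weight` and `ccef_bernstein` are hypothetical.

```python
from math import comb


def bernstein_weight(m, i, u):
    """Bernstein basis polynomial P_{m,i}(u) = C(m, i) * u^i * (1 - u)^(m - i)."""
    return comb(m, i) * u**i * (1.0 - u) ** (m - i)


def ccef_bernstein(copula, u, m):
    """R_{B_m C}(u) = 1 - (1/u) * int_0^1 B_m C(u, v) dv for the assumed Bernstein form."""
    total = 0.0
    for i in range(m + 1):
        # Sum of the copula over the grid points (i/m, j/m), j = 0..m.
        row_sum = sum(copula(i / m, j / m) for j in range(m + 1))
        total += bernstein_weight(m, i, u) * row_sum
    # Each P_{m,j} integrates to 1/(m+1) over [0, 1].
    return 1.0 - total / (u * (m + 1))


def comonotone(u, v):
    """Comonotone copula C(u, v) = min(u, v), for which R_C(u) = u / 2."""
    return min(u, v)


if __name__ == "__main__":
    u = 0.4
    for m in (5, 10, 20, 50):
        print(m, ccef_bernstein(comonotone, u, m) - u / 2)
```

For the comonotone copula a direct calculation with this formula gives $R_{B_{m}C}(u)-R_{C}(u)=(1-u)/(m+1)$, so the printed errors are positive and shrink at rate $1/m$.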

Proof of Theorem 4

In order to simplify the notation, let us introduce the following two processes

$\displaystyle Z_{m,n}(u,v)=\sqrt{n}(B_{m}C_{n}(u,v)-C(u,v)),$
$\displaystyle W_{m,n}(u)=\sqrt{n}(\hat{R}_{B_{m}C_{n}}(u)-R_{C}(u))=-\frac{1}{u}\int_{0}^{1}Z_{m,n}(u,v)dv.$

From Janssen et al. [7], $Z_{m,n}(u,v)\leadsto{\mathcal{G}}_{C}(u,v)$ in the space $l^{\infty}((0,1)^{2})$ of bounded functions. The limiting process ${\mathcal{G}}_{C}(u,v)$ is a tight Gaussian process with mean function $db(u,v)$, where $d$ is a constant and $b$ is the function given in Eq. (1), and with covariance function $\mathbb{E}[h(u,v)h(u^{\prime},v^{\prime})]$, where $0<u,u^{\prime},v,v^{\prime}<1$ and $h$ is the function given in Eq. (2). The continuous mapping theorem (see, for example, Van der Vaart [16], Theorem 18.11) then yields the convergence

(3) $W_{m,n}(u)=-\frac{1}{u}\int_{0}^{1}Z_{m,n}(u,v)dv\leadsto-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv$

for every $u\in(0,1)$. Since a continuous linear transformation of a tight Gaussian process is normally distributed (see Van der Vaart [16], Example 20.11), $-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv$ has a normal distribution for each $u\in(0,1)$.

Integrating the function $b$ by parts, we obtain the expectation

$\displaystyle\mathbb{E}\left[-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv\right]=-\frac{1}{u}\int_{0}^{1}\mathbb{E}[{\mathcal{G}}_{C}(u,v)]dv=-\frac{1}{u}\int_{0}^{1}db(u,v)dv$
$\displaystyle\quad=\frac{d}{u}\int_{0}^{1}C(u,v)dv+\frac{d(u-1)}{2}\int_{0}^{1}C^{(1,1)}(u,v)dv-\frac{d}{2}$
$\displaystyle\quad=d(1-R_{C}(u))+\frac{d(u-1)}{2}\int_{0}^{1}C^{(1,1)}(u,v)dv-\frac{d}{2}.$

Now we compute the covariance function.

$\displaystyle\mathrm{Cov}\left[-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv,-\frac{1}{u^{\prime}}\int_{0}^{1}{\mathcal{G}}_{C}(u^{\prime},v^{\prime})dv^{\prime}\right]$
$\displaystyle\quad=\frac{1}{uu^{\prime}}\mathbb{E}\left[\int_{0}^{1}({\mathcal{G}}_{C}(u,v)-db(u,v))dv\int_{0}^{1}({\mathcal{G}}_{C}(u^{\prime},v^{\prime})-db(u^{\prime},v^{\prime}))dv^{\prime}\right]$
$\displaystyle\quad=\frac{1}{uu^{\prime}}\mathbb{E}\left[\int_{0}^{1}\int_{0}^{1}({\mathcal{G}}_{C}(u,v)-db(u,v))({\mathcal{G}}_{C}(u^{\prime},v^{\prime})-db(u^{\prime},v^{\prime}))dv^{\prime}dv\right]$
$\displaystyle\quad=\frac{1}{uu^{\prime}}\int_{0}^{1}\int_{0}^{1}\mathbb{E}[h(u,v)h(u^{\prime},v^{\prime})]dv^{\prime}dv.$

Expanding the integrand, we get

$\displaystyle\mathbb{E}[h(u,v)h(u^{\prime},v^{\prime})]=C(u\wedge u^{\prime},v\wedge v^{\prime})-C(u,v)C(u^{\prime},v^{\prime})$
$\displaystyle\quad{}-C^{(1)}(u^{\prime},v^{\prime})(C(u\wedge u^{\prime},v)-C(u,v)u^{\prime})$
$\displaystyle\quad{}-C^{(1)}(u,v)(C(u\wedge u^{\prime},v^{\prime})-C(u^{\prime},v^{\prime})u)$
$\displaystyle\quad{}+C^{(1)}(u,v)(u\wedge u^{\prime}-uu^{\prime})C^{(1)}(u^{\prime},v^{\prime})$
$\displaystyle\quad{}-C^{(2)}(u^{\prime},v^{\prime})(C(u,v\wedge v^{\prime})-C(u,v)v^{\prime})$
$\displaystyle\quad{}-C^{(2)}(u,v)(C(u^{\prime},v\wedge v^{\prime})-C(u^{\prime},v^{\prime})v)$
$\displaystyle\quad{}+C^{(2)}(u,v)(v\wedge v^{\prime}-vv^{\prime})C^{(2)}(u^{\prime},v^{\prime})$
$\displaystyle\quad{}+C^{(1)}(u,v)(C(u,v^{\prime})-uv^{\prime})C^{(2)}(u^{\prime},v^{\prime})$
$\displaystyle\quad{}+C^{(2)}(u,v)(C(u^{\prime},v)-u^{\prime}v)C^{(1)}(u^{\prime},v^{\prime}).$

Then, recalling that $R_{C}(u)=\frac{1}{u}\int_{0}^{1}vC^{(2)}(u,v)dv$ and $1-R_{C}(u)=\frac{1}{u}\int_{0}^{1}C(u,v)dv$ , we can find that

$\displaystyle\frac{1}{uu^{\prime}}\int_{0}^{1}\int_{0}^{1}\mathbb{E}[h(u,v)h(u^{\prime},v^{\prime})]dv^{\prime}dv$
$\displaystyle\quad=\frac{1}{uu^{\prime}}\int_{0}^{1}\int_{0}^{1}C(u\wedge u^{\prime},v\wedge v^{\prime})dv^{\prime}dv+(1-R_{C}(u))(R_{C}(u^{\prime})-1)$
$\displaystyle\quad{}+\left(\frac{1}{u\vee u^{\prime}}(R_{C}(u\wedge u^{\prime})-1)+1-R_{C}(u)\right)\int_{0}^{1}C^{(1)}(u^{\prime},v)dv$
$\displaystyle\quad{}+\left(\frac{1}{u\vee u^{\prime}}(R_{C}(u\wedge u^{\prime})-1)+1-R_{C}(u^{\prime})\right)\int_{0}^{1}C^{(1)}(u,v)dv$
$\displaystyle\quad{}+\int_{0}^{1}C^{(1)}(u^{\prime},v)dv\int_{0}^{1}C^{(1)}(u,v)dv\left(\frac{1}{u\vee u^{\prime}}-1\right)$
$\displaystyle\quad{}-\frac{1}{uu^{\prime}}\int_{0}^{1}\int_{0}^{1}C^{(2)}(u^{\prime},v^{\prime})C(u,v\wedge v^{\prime})dvdv^{\prime}+(1-R_{C}(u))R_{C}(u^{\prime})$
$\displaystyle\quad{}-\frac{1}{uu^{\prime}}\int_{0}^{1}\int_{0}^{1}C^{(2)}(u,v)C(u^{\prime},v\wedge v^{\prime})dvdv^{\prime}+(1-R_{C}(u^{\prime}))R_{C}(u)$
$\displaystyle\quad{}+\frac{1}{uu^{\prime}}\int_{0}^{1}\int_{0}^{1}C^{(2)}(u^{\prime},v^{\prime})C^{(2)}(u,v)(v\wedge v^{\prime})dvdv^{\prime}-R_{C}(u)R_{C}(u^{\prime})$
$\displaystyle\quad{}+\frac{1}{uu^{\prime}}\int_{0}^{1}C^{(1)}(u,v)dv\int_{0}^{1}C(u,v^{\prime})C^{(2)}(u^{\prime},v^{\prime})dv^{\prime}$
$\displaystyle\quad{}-R_{C}(u^{\prime})\int_{0}^{1}C^{(1)}(u,v)dv$
$\displaystyle\quad{}+\frac{1}{uu^{\prime}}\int_{0}^{1}C^{(1)}(u^{\prime},v)dv\int_{0}^{1}C(u^{\prime},v)C^{(2)}(u,v)dv$
$\displaystyle\quad{}-R_{C}(u)\int_{0}^{1}C^{(1)}(u^{\prime},v)dv.$

The expression above can be simplified by introducing the functions $\mathcal{H}_{1}$ , $\mathcal{H}_{2}$ and $\mathcal{H}_{3}$ , defined in Remark 4, as follows

Finally, taking $u=u^{\prime}$ and $v=v^{\prime}$ in the covariance expression, we get the variance

$\displaystyle\mathrm{Var}\left[-\frac{1}{u}\int_{0}^{1}{\mathcal{G}}_{C}(u,v)dv\right]=\frac{1}{u}(1-R_{C}(u))-(1-R_{C}(u))^{2}+2\mathcal{H}_{3}(u,u)+\left(\frac{1}{u}-1\right)\left(\int_{0}^{1}C^{(1)}(u,v)dv\right)^{2}$
$\displaystyle\quad{}-2\mathcal{H}_{2}(u,u)+\frac{1}{u^{2}}\int_{0}^{1}(C^{(2)}(u,v))^{2}vdv-R_{C}(u)^{2}+2\mathcal{H}_{1}(u,u).$

By means of the relation $\frac{2}{u^{2}}\int_{0}^{1}C^{(2)}(u,v)C(u,v)dv=1$, we get $2\mathcal{H}_{2}(u,u)=1-2R_{C}(u)+2R_{C}^{2}(u)$ and $2\mathcal{H}_{1}(u,u)=\int_{0}^{1}C^{(1)}(u,v)dv$. Then $2\mathcal{H}_{3}(u,u)=4\left(\frac{1}{u}-2\right)\mathcal{H}_{1}(u,u)R_{C}(u)+4\mathcal{H}_{1}(u,u)\left(1-\frac{1}{u}\right)$, and the variance can be reduced to

$\hfill\square$

References

1. Fernández and González-López V.A., A Bayesian approach for convex combination of two Gumbel-Barnett copulas, in: AIP Conference Proceedings, 1558, 2013, pp. 1491–1494.

2. Fernández and González-López V.A., A copula model to analyze minimum admission scores, in: AIP Conference Proceedings, 1558, 2013, pp. 1479–1482.

3. Fernández and González-López V.A., Bounds for the cumulative conditional expectation function, in: AIP Conference Proceedings, 1648, 2015, 060002.

4. Fernández et al., A note on conjugate distributions for copulas, Mathematical Methods in the Applied Sciences 38 (2014), 4797–4803.

5. Dependence patterns across financial markets: a mixed copula approach, Applied Financial Economics 16 (2006), 717–729.

6. Hua and Joe, Strength of tail dependence based on conditional tail expectation, Journal of Multivariate Analysis 123 (2014), 143–159.

7. Janssen et al., Large sample behavior of the Bernstein copula estimator, Journal of Statistical Planning and Inference 142 (2012), 1189–1197.

8. Janssen et al., A note on the asymptotic behavior of the Bernstein estimator of the copula density, Journal of Multivariate Analysis 124 (2014), 480–487.

9. Kim and Kim J.M., Analysis of directional dependence using asymmetric copula-based regression models, Journal of Statistical Computation and Simulation 84(9) (2013), 1990–2010.

10. Li et al., Strong approximations of copulas, Journal of Mathematical Analysis and Applications 225 (1998), 608–623.

11. Muddapur M.V., A Remark on the Paper “A Note on Directional Dependence in Regression Setting” By Engin A. Sungur, Communications in Statistics – Theory and Methods 37(3) (2008), 386–387.

12. Nelsen, An Introduction to Copulas, Lecture Notes in Statistics 139, Springer-Verlag, 2006.

13. Sancetta and Satchell, The Bernstein copula and its applications to modeling and approximations of multivariate distributions, Econometric Theory 20 (2004), 535–562.

14. Sungur E.A., A note on directional dependence in regression setting, Communications in Statistics – Theory and Methods 34(9–10) (2005), 1957–1965.

15. Sungur E.A., Some Observations on Copula Regression Functions, Communications in Statistics – Theory and Methods 34(9–10) (2005), 1967–1978.

16. Van der Vaart A.W., Asymptotic Statistics, Cambridge University Press, 2000.

17. Vrac et al., Copula analysis of mixture models, Computational Statistics 27 (2012), 427–457.
