Shape-constrained nonparametric estimation of the term structure of interest rates

Abstract

This paper studies nonparametric estimation of the discount curve, which should be decreasing and positive over the entire maturity domain. Very few papers explicitly impose these shape requirements to rule out shape-violating estimates. No matter how small the approximation error is, a shape-violating discount curve cannot be accepted by the financial industry. Because these shape requirements hold on a continuum and therefore involve an infinite number of inequality constraints, it is hard to provide a necessary and sufficient implementation that is computationally tractable. Existing parametric and nonparametric methods fail to achieve universal flexibility and shape compliance simultaneously. This paper proposes a nonparametric method that approximates the discount curve with algebraic polynomials and ensures that the discount function is decreasing and positive over the entire domain. The estimation problem can be reformulated equivalently as a semidefinite program that is convex and computationally tractable. The proposed method is the first that both has asymptotic universal fitting flexibility and fully complies with the shape requirements. Experimental results on one artificial dataset, one UK Gilt STRIPS dataset, and one US Treasury bond dataset demonstrate its superiority over state-of-the-art methods in terms of both shape compliance and out-of-sample fitting measures.

Keywords

Curve fitting; term structure of interest rates; shape restriction; nonparametric regression; function approximation

1 Introduction

As an important topic in finance, refining term structure estimation has received considerable attention for many decades [1]. The term structure of interest rates is the relationship between interest rates and maturities. The term structure of interest rates can be expressed in the discount curve d (t), or the spot rate curve r (t), or the instantaneous forward rate curve f (t). They are related to each other as follows $d (t) = exp [- r (t) \times t] = exp [- \int_{0}^{t} f (s) d s] .$ (1) Hence, provided that one curve is obtained, the other two curves can be calculated immediately.
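The relation in Eq. (1) can be illustrated with a minimal sketch. The flat 3% curve, the grid and the function names below are assumptions made only for this illustration; the forward rate is approximated by finite differences of ln d(t).

```python
import numpy as np

def spot_from_discount(d, t):
    """Spot rate r(t) = -ln d(t) / t, valid for t > 0."""
    return -np.log(d) / t

def forward_from_discount(d, t):
    """Finite-difference approximation of f(t) = -d ln d(t) / dt."""
    return -np.gradient(np.log(d), t)

# Hypothetical flat curve with a 3% continuously compounded rate.
t = np.linspace(0.01, 30.0, 3000)
d = np.exp(-0.03 * t)

r = spot_from_discount(d, t)
f = forward_from_discount(d, t)
```

For a flat curve all three representations coincide, so both r and f should equal 0.03 up to rounding.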

In this paper the maturity domain is a finite interval [0, T] ⊂ [0, + ∞). A discount function is required to satisfy the following shape requirements:

Positive: d (t) ≥0, ∀ t ∈ [0, T]. For any maturity, the present value of $1 in the future cannot be negative.

Decreasing: d (t₁) ≥ d (t₂) , ∀0 ≤ t₁ < t₂ ≤ T. The present value of $1 in the future decreases with maturity. This requirement is equivalent to the positive requirement of the instantaneous forward rate function: f (t) ≥0, ∀ t ∈ [0, T].

Unity at time 0: d (0) =1. It means that the limit of the present value of $1 in the future is 1 as the tenor approaches 0.

Continuous: d (t) is continuous on [0, T]. A discontinuity of d (t) at any t ∈ [0, T] implies an instantaneous forward rate of +∞ or -∞ at t.

Given that the decreasingness of d (t) holds, the requirement of positiveness can be replaced with d (T) ≥0.

However, most of the existing models, including parametric and nonparametric, have the possibility of violating the above shape requirements. If one estimated discount curve $\hat{d} (\cdot)$ violates any one of the above shape restrictions at any maturity, it is useless in practice, no matter how small the approximating error is. Given that an estimated discount curve fails to satisfy these qualitative requirements, it is meaningless to measure its approximating error.

Three attempts have explicitly imposed shape requirements on cross-sectional term structure models: shape-restricted B-spline regression [2, 3] and shape-restricted Gaussian process regression [4]. The first two attempts approximate the discount function with B-splines, but they differ in implementing arbitrage-free requirements. In [2], the requirement of monotonicity over the entire domain is simplified as the requirement of monotonicity over the knots. [3] implements monotonicity with the gradual diminution of the weight sequence of B-splines. [4] estimates the term structure with a generalization of kriging models with linear equality constraints (market-fit requirement) and inequality constraints (monotone-shape requirement). Section 2 gives a detailed introduction on these three attempts.

All the above attempts, however, leave room for improvement. If the order of B-splines is higher than 3, [2] provides a necessary, but not sufficient, implementation of these qualitative requirements, while [3] provides a sufficient, but not necessary, implementation. An insufficient implementation leads to under-restriction that fails to guarantee shape preservation, while an unnecessary implementation leads to over-restriction that is harmful to fitting flexibility. Although the knots-constrained spline regression (order m = 3) [2], the parameters-constrained spline regression (order m = 3) [3] and the shape-restricted Gaussian process regression [4] provide a necessary and sufficient implementation of these qualitative requirements, the discount curve estimated by these models has a piece-wise linear derivative, which implies zigzag-like forward rate curves.

This paper proposes a sieve estimator of the discount function. This estimation approximates the discount function with algebraic polynomials under explicit shape requirements. The polynomial for the discount function is required to be unity at maturity 0, positive and decreasing over the maturity domain. The requirement of monotonicity involves an infinite number of inequality constraints and makes this estimation become a semi-infinite program. But it can be equivalently transformed into a semidefinite program, which is convex and computationally tractable.

The proposed model has five apparent advantages.

It has universal flexibility and can approximate any continuous discount function with an arbitrary accuracy as the polynomial degree increases.

It is a sufficient and necessary implementation of shape requirements. For any training data and any polynomial degree, the estimated term structure cannot include any negative discount factor or any negative forward rate.

It can be solved with a semidefinite program, which is convex and computationally tractable.

It can estimate the term structure of interest rates when the data include coupon bonds. This is a great advantage over many interpolation-based models that can only estimate the par-yield curve when the data include coupon bonds. In practice, spot rates with tenors longer than one year are not directly observable. Commonly bonds with maturities longer than one year are coupon bonds.

The estimated discount curve is infinitely differentiable.

Table 1 summarizes the advantages of the proposed method over others.

Table 1

Summary of various methods

Category	Method	Sufficiency	Necessity	Smoothness
Parametric regression	Nelson-Siegel [5]			C ^∞
	Nelson-Siegel-Svensson [6]			C ^∞
Nonparametric regression without shape restriction	Spline regression (order m) [7 –15]			C ^m-1
	Polynomial regression [16 –19]			C ^∞
Nonparametric regression with shape restriction	Knots-constrained spline regression (order m = 3) [2]	√	√	C ²
	Knots-constrained spline regression (order m > 3) [2]		√	C ^m-1
	Parameters-constrained spline regression (order m = 3) [3]	√	√	C ²
	Parameters-constrained spline regression (order m > 3) [3]	√		C ^m-1
	Shape-restricted Gaussian Process regression [4]	√	√	C ²
	Shape-restricted polynomial regression (this paper)	√	√	C ^∞

A function f is said to be of C^m if the derivatives f′, f″, ⋯, f^(m) exist and are continuous. For example, class C⁰ consists of all continuous functions, the class C¹ consists of all differentiable functions whose derivative is continuous. A function f is said to be infinitely differentiable, or of C^∞, if it has derivatives of all orders. Therefore, C⁰⊇ C¹ ⊇ C² ⊇ ⋯. In this table, for three methods, i.e., the knots-constrained spline regression (order m = 3) [2], the parameters-constrained spline regression (order m = 3) [3] and the shape-restricted Gaussian process regression [4], the function family is C², the estimated discount function is of C², and the estimated forward rate is piece-wise linear.

All vectors are column vectors written in boldface lowercase letters, whereas all matrices are written in boldface uppercase letters. All elements are written in plain letters. For example, the i-th element of vector q is q_i and the row-i column-j entry of matrix X is X_ij. For any symmetric square matrix A, A ⪰ 0 means that A is positive semidefinite. $q \in ℝ_{+}^{N}$ means that q is a dimension-N real vector with positive elements, while $X \in ℝ_{+}^{k \times k}$ means that X is a dimension-k positive semidefinite real symmetric matrix. For two matrices $X, Y \in ℝ^{k \times k}$ , 〈X, Y〉 is the inner product of X and Y, i.e., $〈 X, Y 〉 = \sum_{i = 1}^{k} \sum_{j = 1}^{k} X_{ij} Y_{ij}$ . $𝕀_{\{\cdot\}}$ is the indicator function. C [a, b] denotes the set of real continuous functions on [a, b].

The paper proceeds as follows. Section 2 introduces three closely related methods in detail. The estimation model with shape-restricted polynomial regression is presented in Section 3. Section 4 illustrates the novelty of the proposed model with artificial and real datasets. This paper is concluded by Section 5.

2 Related models

One can estimate the term structure of interest rates by fitting bond prices with parametric models. Popular parametric models are the Nelson-Siegel model [5] and the Svensson model [6]. It is easy to extend parametric models to time series of interest rates [20, 21]. The major shortcoming of parametric models is the lack of universal flexibility. Nonparametric models for the term structure can be classified into two categories. One category is regression splines, including quadratic and cubic splines [7, 8], cubic ℓ₁ splines [9], exponential splines [10], smooth splines [11 –13], tension splines [14] and Bayesian splines [15]. The other category is polynomials, including algebraic polynomials [16], Bernstein polynomials [17] and Chebyshev polynomials [18, 19].

However, due to four shape restrictions, the estimation of the term structure of interest rates should be regarded as shape-restricted regression. Shape-restricted regression has a long history in statistical literature with seminal works dating back to half a century, such as [22] and [23]. Common shapes analyzed in nonparametric econometrics are monotone, convex (concave), supermodular and homogeneous [24 –31]. Recently one special issue on nonparametric inference under shape constraints was published by Statistical Science [32].

The decreasing shape requirement is imposed on the entire maturity domain, which involves an infinite number of inequality constraints. Hence, the estimation problem usually becomes a semi-infinite program [33 –35], which involves finitely many decision variables and infinitely many equality or inequality constraints. Generally, semi-infinite programs are computationally intractable, except for very few special structures. One can refer to [36] for a comprehensive survey on semi-infinite programs.

This shape requirement was neglected by many classical term structure models, including the most popular one [5]. Several attempts have been made to improve in this direction. Among them, some models provide only a necessary, but not sufficient, implementation, in which infinitely many inequality constraints corresponding to uncountably many points t ∈ [0, T] are reduced to finitely many inequality constraints corresponding to finitely many tenors. For example, in [37], an extended Nelson-Siegel model, the decreasing requirement is replaced with $d (t_{1}) \geq d (t_{2}), \forall t_{1} < t_{2}, t_{1}, t_{2} \in \mathcal{T}$ where $\mathcal{T}$ is the set of maturities of the training data. This discretization-based scheme, which replaces the requirement of monotonicity with inequality constraints at finitely many points, provides a necessary, but not sufficient, implementation: compliance at finitely many points does not imply compliance at all points in the domain. Worse still, one example in [36] shows that for semi-infinite optimization compliance at some points cannot guarantee compliance at all points, even when the number of constrained points is countably infinite.

2.1 Knots-constrained spline regression [2]

Assume the bond data consist of N zero-coupon bonds $\{(t_{n}, q_{n})\}_{n = 1}^{N}$ , where t_n is the maturity and q_n is the price. Using order-m B-spline regression with K - 2 internal knots in the mesh $\{x_{j}\}_{j = 1}^{K + 2 m - 2}$ such that $0 = x_{1} = \dots = x_{m}, x_{m + 1}, \dots, x_{K + m - 2}, x_{K + m - 1} = \dots = x_{K + 2 m - 2} = T$ , the discount function is represented with K + m - 2 basis functions as $d (t) = \sum_{j = 1}^{K + m - 2} a_{j} B_{j} (t)$ . Based on the zero-coupon bond data, the coefficients $a ≜ (a_{1}, \dots, a_{K + m - 2})$ can be obtained from the following optimization $min_{a} fidelity + λ \cdot roughness$ (2a) $\begin{matrix} subject to \\ \sum_{j = 1}^{K + m - 2} a_{j} B_{j}^{'} (x_{i}) \leq 0, \forall i = m, \dots, K + m - 1 \end{matrix}$ (2b) $a_{1} = 1$ (2c) $a_{K + m - 2} \geq 0$ (2d) where $λ \in ℝ_{+}$ is a trade-off parameter. Constraint (2b) implements the requirement of monotonicity, constraint (2c) is equivalent to d (0) = 1, and constraint (2d) is equivalent to d (T) ≥ 0. Two commonly used measures for fidelity are L₁ and L₂. Five commonly used measures for roughness are L₁, L_{2,∞}, L_{2,2}, ℓ₂ and ℓ_EM.

When the B-spline is quadratic, i.e., m = 3, constraint (2b) is a necessary and sufficient condition for the requirement of monotonicity. For any x ∈ [x_i, x_{i+1}], i = m, ⋯ , K + m - 2, only three basis functions are nonzero, so $\begin{matrix} d (x_{i}) = a_{i - 2} B_{i - 2} (x_{i}) + a_{i - 1} B_{i - 1} (x_{i}) + a_{i} B_{i} (x_{i}) \\ d (x_{i + 1}) = a_{i - 2} B_{i - 2} (x_{i + 1}) + a_{i - 1} B_{i - 1} (x_{i + 1}) + a_{i} B_{i} (x_{i + 1}) \\ d (x) = a_{i - 2} B_{i - 2} (x) + a_{i - 1} B_{i - 1} (x) + a_{i} B_{i} (x) . \end{matrix}$ Because each basis function is quadratic, d′ (x) is linear for x ∈ [x_i, x_{i+1}]. Therefore $d^{'} (x) = \frac{x_{i + 1} - x}{x_{i + 1} - x_{i}} d^{'} (x_{i}) + \frac{x - x_{i}}{x_{i + 1} - x_{i}} d^{'} (x_{i + 1}) .$ (3) Hence, if d′ (x_i) ≤ 0 and d′ (x_{i+1}) ≤ 0, then d′ (x) ≤ 0 for all x ∈ [x_i, x_{i+1}]. This establishes the sufficiency of constraint (2b).

However, when m ≥ 4, Eq. (3) does not hold, and thus constraint (2b) is not sufficient for the requirement of monotonicity. In other words, provided that m ≥ 4, monotonicity at finitely many knots does not imply monotonicity over the entire interval. Hence [2] may fail to obtain a universally decreasing discount curve, even though its derivative at all knots is nonpositive.
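A small numerical counterexample makes the insufficiency for cubic pieces concrete. The polynomial below is a hypothetical single cubic segment on the knot interval [0, 1], chosen only for illustration and not taken from [2]: its slope is negative at both knots, yet positive in the interior.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical cubic piece on one knot interval [0, 1]:
# d(x) = 1 - 0.1 x + x^2 / 2 - x^3 / 3, so d'(x) = -0.1 + x - x^2.
d = Polynomial([1.0, -0.1, 0.5, -1.0 / 3.0])
d_prime = d.deriv()

knots = np.array([0.0, 1.0])
grid = np.linspace(0.0, 1.0, 101)

slope_at_knots = d_prime(knots)           # slope at the two knots
max_slope_inside = d_prime(grid).max()    # largest slope on the grid
violates = bool(np.any(np.diff(d(grid)) > 0))  # True if d increases somewhere
```

Since d′ is quadratic on the segment, its sign at the two endpoints says nothing about its sign in between, which is exactly why a knot-only constraint is insufficient for order 4 and above.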

2.2 Parameters-restricted spline regression [3]

[3] also applies spline regression to estimate the discount function. Provided that the set of knots is the same as in [2], the coefficients of the estimator $d (t) = \sum_{j = 1}^{K + m - 2} a_{j} B_{j} (t)$ can be obtained by minimizing the objective (2a) with constraints $1 = a_{1} \geq a_{2} \geq \dots \geq a_{K + m - 2} \geq 0 .$ (4)

The monotonicity of the estimated curve is ensured by the variation-diminishing property of B-splines [54, pp. 138-142]: the number of sign changes in d (·) is, at most, as large as that in the sequence a₁, a₂, ⋯, a_{K+m-2}. When m ≤ 3, this constraint is also necessary for the requirement of monotonicity. When m > 3, this constraint is no longer necessary, which implies over-restriction and loss of flexibility.

2.3 Gaussian process regression [4]

In this method, the domain [0, T] is discretized into a regular subdivision 0 = x₀, x₁, ⋯, x_K = T with a constant mesh size δ = T/K. An associated set of basis functions φ_j is defined as $φ_{j} (x) : = \int_{0}^{x} h_{j} (u) du$ , j = 0, ⋯ , K where h_j (u) : = max {1 - |u - x_j|/δ, 0} is a hat function centered at the j-th knot x_j of the input subdivision. The discount function is represented with $d (t) = η + \sum_{j = 0}^{K} ξ_{j} φ_{j} (t)$ . Its K + 2 coefficients (η, ξ₀, ⋯ , ξ_K) can be obtained by Gaussian process regression with

Equality constraints related to market consistency: $η + \sum_{j = 0}^{K} ξ_{j} φ_{j} (t_{n}) = q_{n}$ , ∀n = 1, ⋯ , N.

Inequality constraints related to the requirement of monotonicity ξ₀, ⋯ , ξ_K ≤ 0.

In the original [4], only the requirement of monotonicity is explicitly imposed. To rule out arbitrage opportunities, the discount function is further required to satisfy d (0) = 1 and d (T) ≥ 0. This paper extends the original method to include these two requirements. To achieve d (0) = 1, one can impose the additional equality constraint η = 1. To achieve d (T) ≥ 0, one can impose the additional inequality constraint η + δξ₀/2 + δξ₁ + ⋯ + δξ_{K-1} + δξ_K/2 ≥ 0.
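The basis construction above can be checked numerically with a minimal sketch. The values of T, K, η and ξ below are hypothetical, and the closed-form integral of the hat function inside `hat_integral` is derived here for illustration rather than taken from [4].

```python
import numpy as np

def hat_integral(x, knots, delta):
    """phi_j(x) = integral_0^x h_j(u) du, with shape (len(x), K + 1)."""
    def H(s):
        # Integral of the hat 1 - |v| / delta from -delta to s (s is clipped).
        s = np.clip(s, -delta, delta)
        return delta / 2 + s - s * np.abs(s) / (2 * delta)
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]
    return H(x - knots[None, :]) - H(0.0 - knots[None, :])

T, K = 10.0, 5
delta = T / K
knots = np.linspace(0.0, T, K + 1)

eta = 1.0                                                   # enforces d(0) = 1
xi = np.array([-0.05, -0.04, -0.03, -0.03, -0.02, -0.01])   # all <= 0

grid = np.linspace(0.0, T, 201)
d = eta + hat_integral(grid, knots, delta) @ xi
phi_T = hat_integral(T, knots, delta)[0]
d_T_closed_form = eta + delta * (xi[0] / 2 + xi[1:-1].sum() + xi[-1] / 2)
```

Because d′ (x_j) = ξ_j and d′ is piece-wise linear between knots, nonpositive ξ_j yield a decreasing curve, and `phi_T` reproduces the weights δ/2, δ, ⋯, δ, δ/2 that appear in the constraint on d (T).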

3 Methodology

Assume that the current time is 0 and the term structure should be estimated from quoted prices of N bonds. Let q_n be the n-th bond’s dirty (full) price, i.e., the sum of the quoted price and the accrued interest. Assume the n-th bond has M_n determined future cash flows up to maturity: payment c_nm occurring at time t_nm, m = 1, 2, ⋯ , M_n. For a given discount function d (t), the price of the n-th bond should be $q_{n} = c_{n 1} d (t_{n 1}) + c_{n 2} d (t_{n 2}) + \dots + c_{{nM}_{n}} d (t_{{nM}_{n}}) + ɛ_{n}$ where ɛ_n is the pricing error arising from brokerage costs, taxation, liquidity spreads, etc. The errors ɛ_n are independently distributed with $𝔼 ɛ_{n} = 0$ . They are heteroscedastic because prices of bonds with shorter maturities are less sensitive to the fluctuation of interest rates. If the Macaulay duration of the n-th bond is D_n and the yield is continuously compounded, the bond price q_n and the yield-to-maturity y_n satisfy ∂q_n/∂y_n = - D_n q_n. Therefore, this paper assumes that the standard deviation of ɛ_n is proportional to D_n.

This paper estimates the discount function d (t) in the following framework $min_{d \in \mathcal{D}} \sum_{n = 1}^{N} w_{n}^{2} {(q_{n} - \sum_{m = 1}^{M_{n}} c_{nm} d (t_{nm}))}^{2} + λ R (d)$ (5) where w_n is the weight defined in terms of D_n $w_{n} = (1 / D_{n}) / \sum_{i = 1}^{N} (1 / D_{i}) .$ (6) $\mathcal{D}$ is the set of functions d : [0, T] → [0, 1] that satisfy the four shape requirements. The first term in the objective function (5) is the training error. The second term R (d) is a regularization term that represents the complexity of d (t). When the feature space is high-dimensional, regularization helps to learn simpler and smoother models and thereby improves the generalizability of the learned model. The regularization term in ridge regression is a well-known example. The parameter λ > 0 controls the trade-off between the fitting measure and the model complexity: the larger λ is, the more heavily the complexity of the estimator is penalized.
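The weighting scheme in Eq. (6) amounts to normalized inverse durations. The sketch below uses hypothetical durations and a helper name introduced only for illustration.

```python
import numpy as np

def duration_weights(durations):
    """Weights w_n = (1 / D_n) / sum_i (1 / D_i) from Eq. (6)."""
    inv = 1.0 / np.asarray(durations, dtype=float)
    return inv / inv.sum()

# Hypothetical Macaulay durations (in years) of three bonds.
D = np.array([0.5, 2.0, 10.0])
w = duration_weights(D)
```

Bonds with short durations receive larger weights, which offsets the smaller price errors expected for them under the assumption that the error standard deviation is proportional to D_n.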

This paper approximates the discount function with algebraic polynomials $d (t) = a_{0} + a_{1} t + a_{2} t^{2} + \dots + a_{k} t^{k} = \sum_{ℓ = 0}^{k} a_{ℓ} t^{ℓ} .$ (7) The fitted price for the n-th bond with cash flows ${(t_{nm}, c_{nm})}_{m = 1}^{M_{n}}$ is $q_{n} = \sum_{m = 1}^{M_{n}} c_{nm} (\sum_{ℓ = 0}^{k} a_{ℓ} t_{nm}^{ℓ}) + ɛ_{n} .$ (8) This approximation has two apparent advantages. First, it is expected to achieve universal flexibility, because the Weierstrass approximation theorem indicates that polynomials have the capability of approximating any continuous function over an interval with an arbitrarily small error as the degree k increases. Second, as shown by the following lemma, this polynomial allows for a sufficient and necessary implementation of the requirement of monotonicity. The flexibility of polynomials in nonparametric regression under shape-restriction has been shown in classification probability calibration [29, 30]. The linear representation (8) of the coefficients a = (a₀, a₁, ⋯ , a_k) $q_{n} = \sum_{ℓ = 0}^{k} a_{ℓ} (\sum_{m = 1}^{M_{n}} c_{nm} t_{nm}^{ℓ}) + ɛ_{n}$ (9) makes the ℓ₂-norm fitting measure same as that of weighted linear regression. Apparently the proposed model has computational advantage over [16], which approximates the spot rate function with algebraic polynomials and involves a non-convex optimization. Thus, the discount function d : [0, T] → [0, 1] can be estimated by the following semi-infinite program
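The linear representation in Eq. (9) can be sketched as a design matrix whose entry in row n and column ℓ is the sum of c_nm t_nm^ℓ over the cash flows of bond n. The two bonds and the coefficient vector below are hypothetical, and `design_matrix` is a name introduced only for this illustration.

```python
import numpy as np

def design_matrix(cash_flows, k):
    """Row n holds z_{n,l} = sum_m c_nm * t_nm**l for l = 0, ..., k."""
    Z = np.zeros((len(cash_flows), k + 1))
    for n, (times, amounts) in enumerate(cash_flows):
        times = np.asarray(times, dtype=float)
        amounts = np.asarray(amounts, dtype=float)
        # Vandermonde columns are t**0, t**1, ..., t**k.
        Z[n] = amounts @ np.vander(times, k + 1, increasing=True)
    return Z

# Two hypothetical bonds: a 1-year zero and a 2-year 5% annual coupon bond.
bonds = [([1.0], [100.0]), ([1.0, 2.0], [5.0, 105.0])]
Z = design_matrix(bonds, k=2)

a = np.array([1.0, -0.03, 0.0004])   # hypothetical coefficients with a_0 = 1
fitted = Z @ a                       # fitted prices, i.e. Eq. (9) without noise
```

With this matrix the fitting term of the objective is an ordinary weighted least-squares expression in a, so only the shape constraints make the problem nonstandard.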

$\begin{matrix} min_{a \in ℝ^{k + 1}} \sum_{n = 1}^{N} w_{n}^{2} {\left(q_{n} - \sum_{ℓ = 0}^{k} a_{ℓ} \left(\sum_{m = 1}^{M_{n}} c_{nm} t_{nm}^{ℓ}\right)\right)}^{2} + λ a^{'} a \end{matrix}$ (10a) $\begin{matrix} subject to \\ - \sum_{ℓ = 1}^{k} ℓ a_{ℓ} t^{ℓ - 1} \geq 0, \forall t \in [0, T] \end{matrix}$ (10b) $\sum_{ℓ = 0}^{k} a_{ℓ} T^{ℓ} \geq 0$ (10c) $a_{0} = 1 .$ (10d)

Constraint (10b) corresponds to the monotonicity of the functional family $\mathcal{D}$ . Because d (t) is differentiable, the requirement of monotonicity is replaced with -d′ (t) ≥ 0, ∀t ∈ [0, T]. As in ridge regression, this model uses the ℓ₂-norm of the coefficients a as the regularization term. It is straightforward to replace this ℓ₂ regularization term with the ℓ₁-norm regularization $\sum_{ℓ = 0}^{k} | a_{ℓ} |$ , or with other roughness measures.

Instead of solving the semi-infinite program by discretization-based heuristic algorithms, this paper proposes an equivalent semidefinite representation of constraint (10b). Thanks to the following lemma [39, Theorem 9, Theorem 10], this program can be transformed into a semidefinite program with finitely many decision variables and finitely many constraints. Let H_{n,ℓ} be the n × n Hankel matrix with row-i column-j element $H_{n, ℓ}^{ij} : = \begin{cases} 1, & i + j = ℓ \\ 0, & otherwise . \end{cases}$ (11)
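The Hankel matrix in Eq. (11) can be sketched in a few lines. The helper name `hankel_indicator` and the test matrix X are assumptions for illustration; indices are 1-based as in the text.

```python
import numpy as np

def hankel_indicator(n, ell):
    """n x n matrix with entry 1 where i + j == ell (1-based i, j), else 0."""
    idx = np.arange(1, n + 1)
    return (idx[:, None] + idx[None, :] == ell).astype(float)

H = hankel_indicator(3, 4)          # ones on the main anti-diagonal
X = np.arange(9.0).reshape(3, 3)    # arbitrary test matrix
inner = np.sum(H * X)               # <H_{3,4}, X> sums that anti-diagonal of X
```

The inner product with H_{n,ℓ} simply adds up the entries X_ij with i + j = ℓ, which is how the coefficient of each power of t is collected from a Gram matrix.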

Lemma 1. Consider the polynomial p (t) = a₀ + a₁t + ⋯ + a_k t^k, $t \in [\underline{t}, \bar{t}]$ . When k is even, i.e., k = 2k₁, $k_{1} \in ℕ$ , p (t) is positive on the closed interval $[\underline{t}, \bar{t}]$ if and only if there exist positive semidefinite real symmetric matrices $X \in ℝ^{(k_{1} + 1) \times (k_{1} + 1)}$ and $Y \in ℝ^{k_{1} \times k_{1}}$ satisfying $\begin{matrix} a_{ℓ} = 〈 H_{k_{1} + 1, ℓ + 2}, X 〉 + 〈 - 𝕀_{\{ℓ \leq 2 k_{1} - 2\}} \underline{t} \bar{t} H_{k_{1}, ℓ + 2} \\ + 𝕀_{\{1 \leq ℓ \leq 2 k_{1} - 1\}} (\underline{t} + \bar{t}) H_{k_{1}, ℓ + 1} - 𝕀_{\{ℓ \geq 2\}} H_{k_{1}, ℓ}, Y 〉 \end{matrix}$ for all ℓ = 0, ⋯ , 2k₁.

When k is odd, i.e., k = 2k₁ - 1, $k_{1} \in ℕ$ , p (t) is positive on $[\underline{t}, \bar{t}]$ if and only if there exist positive semidefinite real symmetric matrices $U \in ℝ^{k_{1} \times k_{1}}$ and $V \in ℝ^{k_{1} \times k_{1}}$ satisfying $\begin{matrix} a_{ℓ} = 〈 - 𝕀_{\{ℓ \leq 2 k_{1} - 2\}} \underline{t} H_{k_{1}, ℓ + 2} + 𝕀_{\{ℓ \geq 1\}} H_{k_{1}, ℓ + 1}, U 〉 \\ + 〈 𝕀_{\{ℓ \leq 2 k_{1} - 2\}} \bar{t} H_{k_{1}, ℓ + 2} - 𝕀_{\{ℓ \geq 1\}} H_{k_{1}, ℓ + 1}, V 〉 \end{matrix}$ for all ℓ = 0, ⋯ , 2k₁ - 1.

According to the Markov-Lukács theorem, the necessary and sufficient condition for p (t) to be positive on $[\underline{t}, \bar{t}]$ is:

in case of even k, i.e., $k = 2 k_{1}, k_{1} \in ℕ$ $p (t) = p_{1} (t) + (t - \underline{t}) (\bar{t} - t) p_{2} (t)$ (14)

in case of odd k, i.e., $k = 2 k_{1} - 1, k_{1} \in ℕ$ $p (t) = (t - \underline{t}) p_{3} (t) + (\bar{t} - t) p_{4} (t)$ (15)

for some sum-of-squares polynomials p₁ of degree 2k₁ and p₂, p₃ and p₄ of degree 2k₁ - 2. Moreover, the cone of polynomials that are positive on $[\underline{t}, \bar{t}]$ ,

$K [\underline{t}, \bar{t}] : = \{a \mid a_{0} + a_{1} t + \dots + a_{k} t^{k} \geq 0, \forall t \in [\underline{t}, \bar{t}]\}$

is a linear image of the cone of positive semidefinite real symmetric matrices. A detailed proof can be found in [55].
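The even case of Lemma 1 can be checked numerically in the "if" direction with a minimal sketch: draw random positive semidefinite X and Y, map them to coefficients through the Hankel inner products, and confirm that the resulting polynomial is nonnegative on the interval. The random matrices, the interval [0, 1] and the function names are assumptions for illustration. The indicator functions are omitted because the corresponding Hankel matrices are already zero outside the stated index ranges.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def hankel_indicator(n, ell):
    idx = np.arange(1, n + 1)
    return (idx[:, None] + idx[None, :] == ell).astype(float)

def coeffs_even(X, Y, lo, hi):
    """Coefficients a_0..a_{2 k1} of p(t) = p1(t) + (t - lo)(hi - t) p2(t)."""
    k1 = Y.shape[0]
    a = np.zeros(2 * k1 + 1)
    for ell in range(2 * k1 + 1):
        M = (-lo * hi * hankel_indicator(k1, ell + 2)
             + (lo + hi) * hankel_indicator(k1, ell + 1)
             - hankel_indicator(k1, ell))
        a[ell] = np.sum(hankel_indicator(k1 + 1, ell + 2) * X) + np.sum(M * Y)
    return a

rng = np.random.default_rng(0)
k1, lo, hi = 2, 0.0, 1.0
A = rng.standard_normal((k1 + 1, k1 + 1))
B = rng.standard_normal((k1, k1))
X, Y = A @ A.T, B @ B.T            # random positive semidefinite matrices

a = coeffs_even(X, Y, lo, hi)
grid = np.linspace(lo, hi, 501)
p_vals = P.polyval(grid, a)

# Direct evaluation of p1 and p2 as quadratic forms in (1, t, t^2, ...).
V1 = np.vander(grid, k1 + 1, increasing=True)
V2 = np.vander(grid, k1, increasing=True)
direct = (np.einsum("ti,ij,tj->t", V1, X, V1)
          + (grid - lo) * (hi - grid) * np.einsum("ti,ij,tj->t", V2, Y, V2))
```

The agreement between `p_vals` and `direct` confirms that the Hankel inner products collect the coefficients of representation (14); the "only if" direction is the content of the Markov-Lukács theorem and is not tested here.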

With a straightforward application of Lemma 1, the semi-infinite program (10) can be transformed to one of the following two programs according to the parity of k.

In case of even k. Let k₁ = k/2. The coefficients a can be obtained by solving the following semidefinite program

$\begin{matrix} min_{a, U, V} \sum_{n = 1}^{N} w_{n}^{2} {\left(q_{n} - \sum_{ℓ = 0}^{2 k_{1}} a_{ℓ} \left(\sum_{m = 1}^{M_{n}} c_{nm} t_{nm}^{ℓ}\right)\right)}^{2} + λ a^{'} a \end{matrix}$ (16a) $\begin{matrix} subject to \\ 〈 𝕀_{\{ℓ \leq 2 k_{1} - 1\}} T H_{k_{1}, ℓ + 1} - 𝕀_{\{ℓ \geq 2\}} H_{k_{1}, ℓ}, V 〉 \\ + 〈 𝕀_{\{ℓ \geq 2\}} H_{k_{1}, ℓ}, U 〉 = - ℓ a_{ℓ}, ℓ = 1, \dots, 2 k_{1} \end{matrix}$ (16b) $\sum_{ℓ = 0}^{k} a_{ℓ} T^{ℓ} \geq 0, a_{0} = 1$ (16c) $a \in ℝ^{2 k_{1} + 1}, U, V \in ℝ_{+}^{k_{1} \times k_{1}} .$ (16d)

In case of odd k. Let k₁ = (k + 1)/2. The coefficients a can be obtained by solving the following semidefinite program

$\begin{matrix} min_{a, X, Y} \sum_{n = 1}^{N} w_{n}^{2} {\left(q_{n} - \sum_{ℓ = 0}^{2 k_{1} - 1} a_{ℓ} \left(\sum_{m = 1}^{M_{n}} c_{nm} t_{nm}^{ℓ}\right)\right)}^{2} + λ a^{'} a \end{matrix}$ (17a) $\begin{matrix} subject to \\ 〈 𝕀_{\{2 \leq ℓ \leq 2 k_{1} - 2\}} T H_{k_{1} - 1, ℓ} - 𝕀_{\{ℓ \geq 3\}} H_{k_{1} - 1, ℓ - 1}, Y 〉 \\ + 〈 H_{k_{1}, ℓ + 1}, X 〉 = - ℓ a_{ℓ}, ℓ = 1, \dots, 2 k_{1} - 1 \end{matrix}$ (17b) $\sum_{ℓ = 0}^{k} a_{ℓ} T^{ℓ} \geq 0, a_{0} = 1$ (17c) $a \in ℝ^{2 k_{1}}, X \in ℝ_{+}^{k_{1} \times k_{1}}, Y \in ℝ_{+}^{(k_{1} - 1) \times (k_{1} - 1)} .$ (17d)
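As a small sanity check of constraint (16b), the lowest even degree k = 2 (k₁ = 1) can be worked out by hand. In this case U and V reduce to nonnegative scalars, written u and v here purely for illustration, and H_{1,ℓ} is the 1 × 1 matrix that equals 1 when ℓ = 2 and 0 otherwise.

```latex
\begin{align*}
\ell = 1:&\quad T v = -a_{1}, \\
\ell = 2:&\quad u - v = -2 a_{2}, \\
-d'(t) &= -a_{1} - 2 a_{2} t = T v + (u - v)\, t = u\, t + v\, (T - t) \geq 0,
\quad \forall t \in [0, T].
\end{align*}
```

This is representation (15) with $\underline{t} = 0$, $\bar{t} = T$ and constant sum-of-squares polynomials p₃ = u and p₄ = v. It also matches the elementary fact that a linear function is nonnegative on [0, T] exactly when it is nonnegative at both endpoints, since -d′ (0) = Tv and -d′ (T) = Tu.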

At the end of this section, it is necessary to emphasize that extrapolating the estimated discount function d (t) beyond the maturity domain [0, T] is highly unreliable, because the positive and monotone shape requirements are imposed only on [0, T]. Therefore T must be larger than the longest tenor of cash flows in the portfolio.

4 Numerical examples

The proposed model is built on the CVX toolbox [40] for solving semidefinite programs. When T and k are very large, some quantities may exceed the largest number representable in floating-point arithmetic. For example, given that T = 50 and k = 200, T^k = 50²⁰⁰ will be treated as infinity because of overflow. Hence, all experiments scale the maturity domain [0, T] to the unit interval [0, 1] to avoid possible overflow. Benchmark models, including the Nelson-Siegel model [5], the Nelson-Siegel-Svensson model [6] and the unrestricted spline regression [11], employ the built-in Financial Instruments Toolbox of MATLAB R2021b with the default settings. Because the estimation of the Nelson-Siegel model [5] and the Nelson-Siegel-Svensson model [6] involves a non-convex optimization, it is difficult to detect the number of local optima and to search for the global optimum. For simplicity, this paper relies on the default settings of IRFitOptions in this toolbox.
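The overflow and its remedy can be reproduced with a minimal sketch in double precision. The coefficient vector b and the evaluation point are hypothetical; the sketch only illustrates that a polynomial fitted in the scaled variable s = t/T corresponds to coefficients a_ℓ = b_ℓ / T^ℓ in the original maturity.

```python
import numpy as np
from numpy.polynomial import polynomial as P

T, k = 50.0, 200.0
with np.errstate(over="ignore"):
    raw_power = np.power(T, k)        # 50**200 exceeds the float64 range
scaled_power = np.power(T / T, k)     # longest maturity maps to s = 1

# Hypothetical coefficients of a polynomial in the scaled maturity s = t / T.
b = np.array([1.0, -1.2, 0.5])
t = 20.0
d_scaled = P.polyval(t / T, b)

# Equivalent coefficients in the original maturity: a_l = b_l / T**l.
a = b / T ** np.arange(b.size)
d_original = P.polyval(t, a)
```

Working in the scaled variable keeps every power of the maturity inside [0, 1], so the degree can grow without any monomial overflowing.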

This paper does not report computational time, because the semidefinite programs considered here are solved efficiently by CVX. According to some trial experiments, even when the number of bonds N = 500 and the polynomial degree k = 60, all semidefinite problems can be solved within 1 minute. In financial practice, few cases have so many bond quotations.

4.1 Monte Carlo analysis

Motivated by [41], this subsection generates an artificial dataset from a multi-factor Cox-Ingersoll-Ross short rate model [42] $r_{t} = \sum_{j = 1}^{K} y_{tj}$ (18) where the state variables (y₁, ⋯ , y_K) are assumed to be independent and generated as square root diffusion processes ${dy}_{j} = κ_{j} (θ_{j} - y_{j}) dt + σ_{j} \sqrt{y_{j}} d z_{j}$ (19) for j = 1, ⋯ , K. Provided that the state variables at current time 0 are (y₀₁, ⋯ , y_{0K}), the price of a risk-free bond that pays $1 at time t is $\begin{matrix} d (t) = A_{1} (t) \dots A_{K} (t) exp \{- B_{1} (t) y_{01} - \dots - B_{K} (t) y_{0 K}\} \\ A_{j} (t) = {\left[\frac{2 γ_{j} exp \{(κ_{j} + γ_{j}) t / 2\}}{2 γ_{j} + (κ_{j} + γ_{j}) (exp \{t γ_{j}\} - 1)}\right]}^{2 κ_{j} θ_{j} / σ_{j}^{2}} \\ B_{j} (t) = \frac{2 (exp \{t γ_{j}\} - 1)}{2 γ_{j} + (κ_{j} + γ_{j}) (exp \{t γ_{j}\} - 1)} \\ γ_{j} = \sqrt{κ_{j}^{2} + 2 σ_{j}^{2}} . \end{matrix}$ In this experiment K = 2, (κ₁, θ₁, σ₁) = (0.7298, 0.04013, 0.16885), (κ₂, θ₂, σ₂) = (0.021185, 0.022543, 0.054415). The above parameters are estimated by [42] with a weekly dataset that consists of US Treasury bond prices on Thursday from January to December 1988. Bond data are generated from the assumption (y₀₁, y₀₂) = (0.48 % , 0.32 %), i.e., r₀ = 0.8%.
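The true discount curve of this experiment can be sketched as follows. The code uses the standard Cox-Ingersoll-Ross form B_j (t) = 2 (exp{γ_j t} - 1) / [2γ_j + (κ_j + γ_j)(exp{γ_j t} - 1)] together with the parameter values quoted above; the function name and the evaluation grid are assumptions for illustration.

```python
import numpy as np

def cir_discount(t, y0, kappa, theta, sigma):
    """Multi-factor CIR discount factors; parameter arrays hold one entry per factor."""
    t = np.atleast_1d(np.asarray(t, dtype=float))[:, None]
    gamma = np.sqrt(kappa**2 + 2.0 * sigma**2)
    growth = np.exp(gamma * t) - 1.0
    denom = 2.0 * gamma + (kappa + gamma) * growth
    A = (2.0 * gamma * np.exp((kappa + gamma) * t / 2.0) / denom) ** (
        2.0 * kappa * theta / sigma**2)
    B = 2.0 * growth / denom
    # Independent factors: the bond price is the product over factors.
    return np.prod(A * np.exp(-B * y0), axis=1)

kappa = np.array([0.7298, 0.021185])
theta = np.array([0.04013, 0.022543])
sigma = np.array([0.16885, 0.054415])
y0 = np.array([0.0048, 0.0032])       # initial state, r_0 = 0.8%

t = np.linspace(0.0, 50.0, 501)
d = cir_discount(t, y0, kappa, theta, sigma)
```

Because the short rate of a square root diffusion is nonnegative, the resulting curve starts at 1 and decreases with maturity, so it satisfies the shape requirements that the estimators are expected to reproduce.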

This artificial dataset consists of 100 zero-coupon bonds with face value 100. The maturities are randomly drawn from [0, T] = [0, 50] with a uniform distribution. Bond prices are obtained from the multi-factor Cox-Ingersoll-Ross model plus multiplicative normal noise with standard deviation 0.5%. The discount curve estimated by the Nelson-Siegel model [5], ${\hat{d}}_{NS} (t)$ , is shown in Fig. 1. Although ${\hat{d}}_{NS} (0) = 1$ and ${\hat{d}}_{NS} (t) \geq 0$ ∀t ∈ [0, T], the curve is increasing at some tenors near 0. Because in [5] the discount function is obtained from the instantaneous forward rate function through an exponential transformation, it is certain to be unity at tenor 0 and positive everywhere. However, it is challenging to enforce monotonicity, because the forward rate is a weighted sum of one constant and two exponential terms. This result is consistent with many previous studies, for example [6], which report that the Nelson-Siegel model [5] fails to obtain a satisfactory estimate of short-maturity interest rates. Since a bond with short maturity is insensitive to interest rates, the estimated spot rate curve within 1 year may fluctuate violently. Moreover, this figure clearly shows that the curve estimated by the Nelson-Siegel model [5] deviates from the true curve when the maturity is long. It is the mis-specification of the functional form that gives rise to this undesirable result.

Fig. 1

The estimated discount curve of the Nelson-Siegel model [5] on the artificial data.

Six shape-restricted term structure models are compared graphically: (a) the knots-constrained spline regression with order 3 [2]; (b) the knots-constrained spline regression with order 4 [2]; (c) the parameters-constrained spline regression with order 3 [3]; (d) the parameters-constrained spline regression with order 4 [3]; (e) the shape-restricted Gaussian process regression [4]; (f) the shape-restricted polynomial regression (this paper). Models (a) and (b) use L₁ fidelity and L₁ roughness. Models (c) and (d) use L₂ fidelity and ℓ₂-norm roughness. In all models (a) - (d), the number of internal knots is 13 and the trade-off parameter λ is 10^-2. In model (e), the kernel is Gaussian with σ² = 10^-2 and K = 14. In the proposed method, the degree of the polynomial is 20 and the trade-off parameter λ is 10^-8. The above parameter settings are arbitrary and serve illustrative purposes only.

Since each estimated nonparametric discount curve is quite close to the true curve, each estimated term structure is shown with its forward rate curve. The forward rate curve f (t) = - ∂d (t)/(d (t) ∂t) depends on the derivative of the discount curve, and can therefore disclose more detailed information.
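For a polynomial discount function the forward rate is available in closed form. The sketch below uses a hypothetical quadratic discount curve on [0, 40], chosen only so that it is positive and decreasing on that interval.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical polynomial discount curve d(t) = 1 - 0.03 t + 0.0003 t^2.
d = Polynomial([1.0, -0.03, 0.0003])
d_prime = d.deriv()

def forward_rate(t):
    """Instantaneous forward rate f(t) = -d'(t) / d(t)."""
    return -d_prime(t) / d(t)

grid = np.linspace(0.0, 40.0, 401)
f = forward_rate(grid)
```

Dividing by d (t) amplifies small wiggles in d′ (t) at long maturities, which is why forward rate plots reveal differences between models that are invisible in the discount curves.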

As displayed in Fig. 2(b), the forward rate curve estimated by [2] with order 4 is not positive over the entire maturity domain. This confirms that, given that the order is higher than 3, constraint (2b) is not a sufficient condition for the monotonicity of the discount curve. All forward rate curves estimated by the other five models are positive, which confirms the sufficiency of their implementations.

Fig. 2

Estimated forward rate curves by six shape-restricted models.

In three models, the knots-constrained spline regression with order 3, the parameters-constrained spline regression with order 3, and the shape-restricted Gaussian process regression, the forward rate curve is non-differentiable at internal knots. When the basis function is piece-wise quadratic, its derivative within each segment is linear, and at each internal knot the left derivative and the right derivative of that piece-wise linear function are not necessarily equal. The knots-constrained spline regression with order 4 and the parameters-constrained spline regression with order 4 can obtain discount curves that are second-order differentiable because they are piece-wise cubic. The proposed method uses a single polynomial, so that its estimated discount curve is infinitely differentiable.

The experiment repeats the above estimation 10⁴ times and reports out-of-sample performance in Table 2. Three state-of-the-art models are also included: the Nelson-Siegel model [5], the Nelson-Siegel-Svensson model [6] and the unrestricted spline regression [11]. [4] uses the Gaussian kernel with kernel parameter σ² = 2^-2. [11] uses cubic B-spline regression with trade-off parameter 10^-2.

Table 2

Out-of-sample experimental results on the artificial data

Model	d-RMSE (%)	f-RMSE(%)	PVNR(%)
Nelson-Siegel [5]	1.167 ± 0.883	7.762 ± 9.193	1.29
Nelson-Siegel-Svensson [6]	1.757 ± 0.741	12.993 ± 9.718	0.76
Unrestricted spline regression [11]	0.816 ± 1.418	5.763 ± 8.121	8.75
Knots-constrained spline regression (order m = 3) [2]	0.486 ± 0.110	3.861 ± 1.154
Knots-constrained spline regression (order m = 4) [2]	0.463 ± 0.127	3.174 ± 0.935	0.12
Parameters-constrained spline regression (order m = 3) [3]	0.457 ± 0.108	3.547 ± 0.826
Parameters-constrained spline regression (order m = 4) [3]	0.428 ± 0.101	3.236 ± 0.931
Shape-restricted Gaussian process regression [4]	0.480 ± 0.116	3.742 ± 0.724
Shape-restricted polynomial regression (this paper)	0.334 ± 0.090	2.708 ± 0.614

Model comparison is based on the following three performance measures.

d-RMSE. This RMSE measures the discrepancy between the true discount function d (t) and the estimated discount function $\hat{d} (t)$ . It is approximated with the RMSE on a grid of maturities $\mathcal{T} \subseteq [0, T]$ $\text{d-RMSE} = \sqrt{\frac{1}{| \mathcal{T} |} \sum_{t \in \mathcal{T}} {[d (t) - \hat{d} (t)]}^{2}} .$ (20) In this experiment $\mathcal{T}$ is the series of all year fractions based on the number of days between the quotation date and all dates before the maximum maturity.

f-RMSE. This RMSE measures the discrepancy between the true forward rate function f (t) = - d′ (t)/d (t) and the estimated forward rate function $\hat{f} (t) = - {\hat{d}}^{'} (t) / \hat{d} (t)$ . As with d-RMSE, it is approximated as $\text{f-RMSE} = \sqrt{\frac{1}{| \mathcal{T} |} \sum_{t \in \mathcal{T}} {[f (t) - \hat{f} (t)]}^{2}} .$ (21)

PVNR: Percent of Violating No-arbitrage Requirement. It is not easy to judge whether an estimated discount curve violates shape requirements over the whole continuous domain. For simplicity, violations are checked on the grid $\mathcal{T}$ . If there exists one $t \in \mathcal{T}$ that satisfies $\hat{d} (t) < 0$ , the estimated discount curve is regarded as violating the requirement of positiveness. If there exists one $t_{i} \in \mathcal{T}$ that satisfies $\hat{d} (t_{i}) < \hat{d} (t_{i + 1})$ , the estimated curve is regarded as violating the requirement of monotonicity. If $\hat{d} (0) \neq 1$ , the estimated curve is regarded as violating the requirement d (0)=1. If any of the above three violations happens, the estimated curve is regarded as violating shape requirements.
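The three measures above can be sketched as follows. This is a minimal illustration, assuming curves are already evaluated on a daily grid with a 365-day year; the helper names rmse, violates_shape and pvnr, the tolerance used for the check d(0)=1, and the flat example curves are assumptions introduced here and are not taken from the experiments.

```python
import numpy as np


def rmse(true_vals, est_vals) -> float:
    """Root mean squared error on a grid, as in the d-RMSE and f-RMSE formulas."""
    diff = np.asarray(true_vals, float) - np.asarray(est_vals, float)
    return float(np.sqrt(np.mean(diff ** 2)))


def violates_shape(d_hat_grid, d_hat_at_zero, tol=1e-10) -> bool:
    """Grid-based check of positiveness, monotonicity and d(0) = 1.

    tol is an assumption: it absorbs floating-point error in the d(0) = 1 check.
    """
    d = np.asarray(d_hat_grid, float)
    negative = bool(np.any(d < 0.0))              # some d_hat(t) < 0
    increasing = bool(np.any(d[:-1] < d[1:]))     # some d_hat(t_i) < d_hat(t_{i+1})
    bad_start = abs(d_hat_at_zero - 1.0) > tol    # d_hat(0) != 1
    return negative or increasing or bad_start


def pvnr(flags) -> float:
    """Percent of estimated curves flagged as violating the shape requirements."""
    return 100.0 * float(np.mean(np.asarray(flags, float)))


# Hypothetical example: flat 3.0% true forward rate versus a 3.1% estimate.
t_grid = np.arange(1, 3651) / 365.0               # daily grid over ten years
d_true, d_est = np.exp(-0.030 * t_grid), np.exp(-0.031 * t_grid)
d_rmse = rmse(d_true, d_est)
f_rmse = rmse(np.full_like(t_grid, 0.030), np.full_like(t_grid, 0.031))
violated = violates_shape(d_est, d_hat_at_zero=1.0)
```

A grid-based check can only detect violations at grid points, so it may understate how often a curve violates the requirements between two adjacent dates.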

In each cell of the form a ± b, a and b are the average and standard deviation of 10⁴ RMSEs. Experimental results in Table 2 show that explicit imposition of shape restrictions can improve out-of-sample fitting performance. In terms of all three measures, the method proposed in this paper has an advantage over the other eight models.

4.2 UK Gilt STRIPS data

The UK Gilt STRIPS data are downloaded from the website of the UK Debt Management Office (https://www.dmo.gov.uk/data/). The data consist of 427 563 daily closing quotes of 205 UK STRIPS between July 22, 2007 and July 21, 2017. The experiment does not analyze the evolution of term structures or study the associated time series prediction problem. In other words, discount curves at 2 528 trading days are estimated independently. Even for the same model, hyperparameter settings, such as the number of internal knots in spline models and the polynomial degree in the proposed model, are not necessarily the same on different trading days. For each trading day, T is the greatest maturity of all bonds.

Unlike in the above subsection on the artificial data, the underlying term structure at each trading day is never known. As a result, neither d-RMSE (20) nor f-RMSE (21) is available. The experiment therefore compares models by their fitting performance on test data. For each trading day, 70% of samples are randomly chosen as the training data, and the remaining samples as the test data. For all models, the training data are used to estimate $\hat{d} (t)$ , and the test data are used to measure its out-of-sample performance. In this experiment, model performance is based on the following measure $\text{q-RMSE} = \sqrt{\frac{1}{N} \sum_{n = 1}^{N} {[q_{n} - 100 \hat{d} (t_{n})]}^{2}}$ (22) where N is the number of samples in the test data, and t_n, q_n and $100 \hat{d} (t_{n})$ are the tenor, the quoted price and the estimated price of the n-th STRIPS bond in the test data. To decrease the error from the random training-test partition, the experiment conducts these steps ten times on each trading day. Thus, each model at each trading day has ten estimated discount curves, giving 25 280 estimated discount curves in total.
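The evaluation protocol for one trading day can be sketched as follows. This is a minimal illustration on synthetic quotes; the helper names q_rmse_strips and random_split are hypothetical, and the unconstrained cubic least-squares fit is only a stand-in estimator, not the shape-restricted method proposed in this paper.

```python
import numpy as np
from numpy.polynomial import Polynomial


def q_rmse_strips(quotes, tenors, d_hat) -> float:
    """q-RMSE (22): RMSE between quoted prices and 100 * d_hat(tenor)."""
    fitted = 100.0 * d_hat(np.asarray(tenors, float))
    return float(np.sqrt(np.mean((np.asarray(quotes, float) - fitted) ** 2)))


def random_split(n_samples, train_frac, rng):
    """Random training/test partition of sample indices."""
    perm = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    return perm[:n_train], perm[n_train:]


# Synthetic quotes for one trading day (assumption: flat 3% curve plus noise).
rng = np.random.default_rng(0)
tenors = rng.uniform(0.1, 30.0, size=50)
quotes = 100.0 * np.exp(-0.03 * tenors) + rng.normal(0.0, 0.1, size=50)

scores = []
for _ in range(10):                                    # ten random partitions
    train, test = random_split(len(tenors), 0.7, rng)
    # Stand-in estimator: unconstrained cubic least squares on training prices.
    price_fit = Polynomial.fit(tenors[train], quotes[train], deg=3)
    d_hat = lambda t, p=price_fit: p(t) / 100.0
    scores.append(q_rmse_strips(quotes[test], tenors[test], d_hat))

mean_score, std_score = float(np.mean(scores)), float(np.std(scores))
```

Repeating the partition averages out the luck of a single split; the a ± b entries in the result tables summarize such repeated scores.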

Five-fold cross-validation is used to determine the hyperparameters of all models. In the five spline-based models, the candidate set for the number of internal knots is {3, 4, ⋯ , 30}. In the proposed method, the candidate set for the polynomial degree k is {8, 9, ⋯ , 50}. The candidate set for the weight λ is $\{2^{- m}\}_{m = 0}^{20}$ . [4] employs the Gaussian kernel. The candidate set for the kernel parameter σ² is $\{2^{m}\}_{m = - 20}^{20}$ , and the candidate set for K is {4, 5, ⋯ , 31}.
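The grid search over the polynomial degree k and the weight λ can be sketched as below. The candidate sets follow the text, but everything else is an assumption: the estimator fit_ridge_poly is a ridge-penalized Chebyshev least-squares fit used only as a stand-in (it is not shape-restricted and uses a different penalty from the proposed method), and the data are synthetic.

```python
import itertools
import numpy as np
from numpy.polynomial import chebyshev as C


def fit_ridge_poly(t, q, degree, lam, t_max):
    """Stand-in estimator: ridge-penalized Chebyshev least squares on [0, t_max]."""
    V = C.chebvander(2.0 * t / t_max - 1.0, degree)
    coef = np.linalg.solve(V.T @ V + lam * np.eye(degree + 1), V.T @ q)
    return lambda s: C.chebval(2.0 * np.asarray(s, float) / t_max - 1.0, coef)


def cross_validate(t, q, grid, t_max, n_folds=5, seed=0):
    """Return the (degree, lambda) pair with the lowest mean validation RMSE."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(t)), n_folds)
    best, best_err = None, np.inf
    for degree, lam in grid:
        errs = []
        for i in range(n_folds):
            val = folds[i]
            trn = np.concatenate([folds[j] for j in range(n_folds) if j != i])
            model = fit_ridge_poly(t[trn], q[trn], degree, lam, t_max)
            errs.append(np.sqrt(np.mean((q[val] - model(t[val])) ** 2)))
        if np.mean(errs) < best_err:
            best, best_err = (degree, lam), float(np.mean(errs))
    return best, best_err


degrees = list(range(8, 51))                 # candidate set {8, 9, ..., 50}
lambdas = [2.0 ** -m for m in range(21)]     # candidate set {2^-m}, m = 0..20

# Synthetic discount factors (assumption: flat 4% curve plus small noise).
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 30.0, size=200)
q = np.exp(-0.04 * t) + rng.normal(0.0, 0.002, size=200)
best, best_err = cross_validate(t, q, itertools.product(degrees, lambdas), t_max=30.0)
```

Only the training portion of each trading day should enter such a search, so that the held-out test data remain untouched when out-of-sample errors are reported.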

Experimental results of the nine models are listed in Table 3. In each a ± b cell, a and b are the average and standard deviation of 25 280 RMSEs. The six shape-restricted models have an obvious advantage over the three classical models. The shape-restricted polynomial regression (this paper), the knots-constrained spline regression (order m = 4) [2] and the parameters-constrained spline regression (order m = 4) [3] achieve the first, second and third best performances respectively. Since these three models produce the smoothest curves, the result suggests that the requirement of smoothness can improve out-of-sample fitting performance.

Table 3

Out-of-sample results on the UK STRIPS data

Model	q-RMSE(%)	PVNR(%)
Nelson-Siegel [5]	2.645 ± 0.904	0.81
Nelson-Siegel-Svensson [6]	4.037 ± 1.641	0.95
Unrestricted spline regression [11]	4.863 ± 1.396	7.54
Knots-constrained spline regression (order m = 3) [2]	1.135 ± 0.453
Knots-constrained spline regression (order m = 4) [2]	0.785 ± 0.054	0.19
Parameters-constrained spline regression (order m = 3) [3]	0.914 ± 0.389
Parameters-constrained spline regression (order m = 4) [3]	0.801 ± 0.056
Shape-restricted Gaussian process regression [4]	1.127 ± 0.206
Shape-restricted polynomial regression (this paper)	0.771 ± 0.079

4.3 US treasury bonds data

This data set is taken from the CRSP government bond files. It consists of daily closing quotes for US Treasury bonds between January 2, 2007 and December 31, 2014. This period covers the 2007-2010 sub-prime mortgage crisis. All bonds that include contingent cash flows, i.e., callable bonds, flower bonds, and inflation-adjusted bonds, are eliminated from the data. For each trading day and each bond, the mean of the closing bid and asked prices is used as the quoted price. In this experiment, the knots-constrained spline regression [2] and the parameters-constrained spline regression [3] will not be compared, as they are incapable of estimating the term structure when the data include coupon-bonds.

Because the data cover 2003 trading days, i.e., 2003 yield curves, the experiment conducts 2003 independent model comparisons. For each trading day, 70% of samples are randomly chosen as the training data, and the remaining samples as the test data. To decrease the error from this random partition, the experiment conducts these steps 10 times and measures the model performance with the average of these 10 runs on each trading day. Other than PVNR, two performance measures are used for model comparison.

q-RMSE. q-RMSE is the RMSE of fitting errors of bond prices. Assume the test data include N bonds and the n-th bond has cash flows $\{(t_{nm}, c_{nm})\}_{m = 1}^{M_{n}}$ . Then q-RMSE is defined as

$\text{q-RMSE} = \sqrt{\frac{1}{N} \sum_{n = 1}^{N} {[q_{n} - \sum_{m = 1}^{M_{n}} c_{nm} \hat{d} (t_{nm})]}^{2}} .$

Hit rate. The hit rate is defined as the percent of fitted prices that fall within the corresponding bid-asked spreads.
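Both measures can be sketched for coupon bonds as follows. This is a minimal illustration, assuming each bond is given as a list of (time, amount) cash flows and that quotes and cash flows are expressed on the same price basis; the helper names, the two example bonds and the flat 3% discount curve are hypothetical.

```python
import numpy as np


def fitted_price(cash_flows, d_hat) -> float:
    """Model price of one bond: sum of cash flows times discount factors."""
    times = np.array([t for t, _ in cash_flows], float)
    amounts = np.array([c for _, c in cash_flows], float)
    return float(np.sum(amounts * d_hat(times)))


def q_rmse(quotes, bonds, d_hat) -> float:
    """RMSE between quoted prices and fitted prices of coupon bonds."""
    fitted = np.array([fitted_price(cf, d_hat) for cf in bonds])
    return float(np.sqrt(np.mean((np.asarray(quotes, float) - fitted) ** 2)))


def hit_rate(bids, asks, bonds, d_hat) -> float:
    """Percent of fitted prices that fall within the bid-asked spread."""
    fitted = np.array([fitted_price(cf, d_hat) for cf in bonds])
    inside = (fitted >= np.asarray(bids, float)) & (fitted <= np.asarray(asks, float))
    return 100.0 * float(np.mean(inside))


# Hypothetical test data: a 2-year 4% annual coupon bond and a 1-year zero.
d_hat = lambda t: np.exp(-0.03 * t)
bonds = [[(1.0, 4.0), (2.0, 104.0)], [(1.0, 100.0)]]
bids, asks = [101.7, 97.2], [101.9, 97.4]
quotes = [(b + a) / 2.0 for b, a in zip(bids, asks)]   # mean of bid and asked

score = q_rmse(quotes, bonds, d_hat)
hits = hit_rate(bids, asks, bonds, d_hat)
```

In this toy case the first fitted price lies inside its spread and the second does not, so the hit rate is 50%. The two measures can rank models differently, because the hit rate ignores how far a miss falls outside the spread.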

Experimental results are presented in Table 4. As expected, no curve estimated by the proposed model violates the arbitrage-free requirements, because the model is a sufficient implementation of the shape requirements. It achieves the lowest average q-RMSE among the five methods, and 23.642% of its fitted prices fall within their corresponding bid-asked spreads, which is the second-best hit rate among the five models. Overall, the proposed model compares favorably with the other four models. In this experiment, the two parametric methods perform worse than the three nonparametric methods. This result indicates that parametric functions with four or five parameters are insufficient for describing the term structure of interest rates in practice.

Table 4
Empirical results on the US treasury bonds data

Model	q-RMSE	Hit rate(%)	PVNR(%)
Nelson-Siegel [5]	0.652 ± 0.318	5.645	0.153
Nelson-Siegel-Svensson [6]	0.334 ± 0.295	12.534	0.217
Unrestricted spline regression [11]	0.316 ± 0.177	15.392	0.892
Shape-restricted Gaussian process regression [4]	0.201 ± 0.125	24.983
Shape-restricted polynomial regression (this paper)	0.173 ± 0.096	23.642

5 Conclusion

This paper proposes a nonparametric estimation method for the term structure of interest rates under shape requirements. To be free of arbitrage opportunities, a discount function d (·) is required to be continuous, decreasing and positive, with d (0)=1. Because the requirement of monotonicity is imposed on every point of the maturity domain, it amounts to infinitely many inequality constraints and is computationally intractable when imposed directly. Many parametric and nonparametric methods neglect shape requirements. Implementations that are necessary but not sufficient cannot guarantee full conformance to the shape requirements, while implementations that are sufficient but not necessary are too restrictive to achieve universal flexibility.

The proposed method approximates the discount function with an algebraic polynomial and presents an equivalent implementation of arbitrage-free requirements. The decreasing shape requirement is implemented by requiring its derivative to be non-positive everywhere. Its estimation can be solved by one of two semidefinite programs, chosen according to the parity of its degree k. Experimental results on artificial data, UK STRIPS data, and US Treasury bonds data show that the proposed method achieves the lowest out-of-sample fitting errors among the compared methods while fully complying with the shape requirements.
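For orientation, the link between a pointwise derivative constraint and a semidefinite program can be sketched with the classical Markov-Lukács characterization of polynomials that are nonnegative on an interval. The notation below (p, σ_0, σ_1, v_m, Q_0, Q_1) is introduced here for illustration and is an assumption; the exact formulation of the proposed method is the one given in the methodology section and may differ in detail.

```latex
% p(t) = -\hat{d}'(t) has degree k - 1; the requirement is p(t) >= 0 on [0, T].
% sigma_0 and sigma_1 are sums of squares; v_m(t) = (1, t, ..., t^m)^T.
\begin{aligned}
\deg p = 2m:\quad     & p(t) = \sigma_0(t) + t\,(T - t)\,\sigma_1(t),
  && \deg \sigma_0 \le 2m,\ \deg \sigma_1 \le 2m - 2, \\
\deg p = 2m + 1:\quad & p(t) = t\,\sigma_0(t) + (T - t)\,\sigma_1(t),
  && \deg \sigma_0,\ \deg \sigma_1 \le 2m, \\
\text{with}\quad      & \sigma_i(t) = v_{m_i}(t)^{\top} Q_i\, v_{m_i}(t),
  && Q_i \succeq 0 .
\end{aligned}
```

Matching the coefficients of equal powers of t on both sides gives linear equality constraints between the polynomial coefficients and the entries of Q_0 and Q_1, so the infinitely many inequalities reduce to two positive semidefinite constraints. Which decomposition applies depends on the parity of the degree, which is consistent with the two semidefinite programs mentioned above.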

The main drawback of the model is its lack of parsimony and economic interpretation, which is a limitation of all nonparametric models. In the proposed model, the estimated discount function has k + 1 coefficients that do not carry any economic meaning. Therefore, it has limited potential for analyzing the dynamics and mechanisms of interest rates. However, the proposed model is mainly motivated by pricing and valuation. After all, pricing accuracy, rather than economic interpretation, is the core concern in pricing fixed-income securities. An interesting future research direction is to extend the model to long maturities. For example, in actuarial science, insurance and reinsurance companies need to evaluate cash flows with very long maturities. The Solvency II framework released by the European Insurance and Occupational Pensions Authority (EIOPA) requires that the maturity domain of the estimated risk-free term structure should cover at least 60 years [43]. However, there are very few treasury bonds that have a maturity longer than 20 years. Therefore, some extrapolation techniques should be included in the model.

Acknowledgment

The work is supported by Zhejiang Natural Science Foundation (LY19G010001,LY20G010002) and National Natural Science Foundation of China (71571163).

References

1. Zhang and Metawa, Application of machine learning algorithm and static model of interest rate curve in futures analysis, Journal of Intelligent & Fuzzy Systems (2020), 1–12.
2. Laurini M.P. and Moura, Constrained smoothing B-splines for the term structure of interest rates, Insurance: Mathematics and Economics 46(2) (2010), 339–350.
3. Fengler M.R. and Hin L.-Y., A simple and general approach to fitting the discount curve under no-arbitrage constraints, Finance Research Letters 15 (2015), 78–84.
4. Cousin, Maatouk and Rulliere, Kriging of financial term-structures, European Journal of Operational Research 255(2) (2016), 631–648.
5. Nelson C.R. and Siegel A.F., Parsimonious modeling of yield curves, Journal of Business 60(4) (1987), 473–489.
6. Svensson L.E., Estimating and interpreting forward interest rates: Sweden, 1992–1994, 1994.
7. McCulloch J.H., Measuring the term structure of interest rates, Journal of Business 44(1) (1971), 19–31.
8. McCulloch J.H., The tax-adjusted yield curve, Journal of Finance 30(3) (1975), 811–830.
9. Chiu N.-C., Fang S.-C., Lavery J.E., Lin J.-Y. and Wang, Approximating term structure of interest rates using cubic L1 splines, European Journal of Operational Research 184(3) (2008), 990–1004.
10. Vasicek O.A. and Fong H.G., Term structure modeling using exponential splines, Journal of Finance 37(2) (1982), 339–348.
11. Fisher, Nychka and Zervos, Fitting the term structure of interest rates with smoothing splines, Technical Report, Board of Governors of the Federal Reserve System (US), 1995.
12. Kaushanskiy and Lapshin, A nonparametric method for term structure fitting with automatic smoothing, Applied Economics 48 (2016), 5654–5666.
13. Filipović and Willems, Exact smooth term-structure estimation, SIAM Journal on Financial Mathematics 9(3) (2018), 907–929.
14. Barzanti and Corradi, A note on interest rate term structure estimation using tension splines, Insurance: Mathematics and Economics 22(2) (1998), 139–143.
15. and Yu, Estimating the interest rate term structures of treasury and corporate debt with Bayesian penalized splines, Journal of Data Science 3(3) (2005), 223–240.
16. Chambers D.R., Carleton W.T. and Waldman D.W., A new approach to estimation of the term structure of interest rates, Journal of Financial and Quantitative Analysis 19(3) (1984), 233–252.
17. Schaefer S.M., Measuring a tax-specific term structure of interest rates in the market for British government securities, Economic Journal 91(362) (1981), 415–438.
18. Pham T.M., Estimation of the term structure of interest rates: an international perspective, Journal of Multinational Financial Management 8(2) (1998), 265–283.
19. Manousopoulos and Michalopoulos, Term structure of interest rates estimation using rational Chebyshev functions, Decisions in Economics and Finance 38(2) (2015), 119–146.
20. Andreasen M.M., Christensen J.H. and Rudebusch G.D., Term structure analysis with big data: one-step estimation using bond prices, Journal of Econometrics 212(1) (2019), 26–46.
21. Castro-Iragorri, Pena J.F. and Rodriguez, A segmented and observable yield curve for Colombia, Journal of Central Banking Theory and Practice 10(2) (2021), 179–200.
22. Brunk, Maximum likelihood estimates of monotone parameters, Annals of Mathematical Statistics 26(4) (1955), 607–616.
23. Hildreth, Point estimates of ordinates of concave functions, Journal of the American Statistical Association 49(267) (1954), 598–619.
24. Wang and Ghosh S.K., Shape restricted nonparametric regression with Bernstein polynomials, Computational Statistics & Data Analysis 56(9) (2012), 2729–2741.
25. Wang and Ni, Multivariate convex support vector regression with semidefinite programming, Knowledge-Based Systems 30 (2012), 87–94.
26. Wang, Modeling financial dependence with support vector regression, Intelligent Data Analysis 17(2) (2013), 233–249.
27. Wang, Wang, Dang and Ge, Nonparametric quantile frontier estimation under shape restriction, European Journal of Operational Research 232 (2014), 671–678.
28. Feng and Dang, Shape constrained risk-neutral density estimation by support vector regression, Information Sciences 333 (2016), 1–9.
29. Wang, Li and Dang, Calibrating classification probabilities with shape-restricted polynomial regression, IEEE Transactions on Pattern Analysis and Machine Intelligence 41(8) (2019), 1813–1827.
30. Wang and Liu, Multivariate probability calibration with isotonic Bernstein polynomials, in: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020, C. Bessiere, ed., ijcai.org, 2020, pp. 2547–2553. doi:10.24963/ijcai.2020/353.
31. Deng, Xie, Wang and Fu, Shape-restricted support vector machine (SR-SVM): a SVM classifier taking supplementary shape information of input, Journal of Intelligent & Fuzzy Systems 40(1) (2021), 1481–1494.
32. Samworth R.J. and Sen, Editorial: special issue on “Nonparametric Inference Under Shape Constraints”, Statistical Science 33(4) (2018), 469–472.
33. Stein, How to solve a semi-infinite optimization problem, European Journal of Operational Research 223(2) (2012), 312–320.
34. Goberna M.A., Guerra-Vazquez and Todorov M.I., Constraint qualifications in linear vector semi-infinite optimization, European Journal of Operational Research 227(1) (2013), 12–21.
35. Goberna M.A., Guerra-Vazquez and Todorov M.I., Constraint qualifications in convex vector semi-infinite optimization, European Journal of Operational Research 249(1) (2016), 32–40.
36. Hettich and Kortanek K.O., Semi-infinite programming: theory, methods, and applications, SIAM Review 35(3) (1993), 380–429.
37. Bliss R.R., Testing term structure estimation methods, Advances in Futures and Options Research 9 (1997), 197–231.
38. de Boor, A Practical Guide to Splines (Revised edition), Vol. 27, Springer-Verlag, New York, 2001.
39. Nesterov, Squared functional systems and optimization problems, in: High Performance Optimization, Springer-Verlag, 2000, pp. 405–440.
40. Grant and Boyd, CVX: Matlab Software for Disciplined Convex Programming, version 2.1, http://cvxr.com/cvx, 2014.
41. Laurini M.P. and Ohashi, A noisy principal component analysis for forward rate curves, European Journal of Operational Research 246(1) (2015), 140–153.
42. Chen R.-R. and Scott, Multi-factor Cox-Ingersoll-Ross models of the term structure: Estimates and tests from a Kalman filter model, Journal of Real Estate Finance and Economics 27(2) (2003), 143–172.
43. European Insurance and Occupational Pensions Authority, Technical documentation of the methodology to derive EIOPA’s risk-free interest rate term structures, Vol. EIOPA-BoS-15/035, 2017, pp. 1–135.
