Modified ridge and other regularization criteria: A brief review on meaningful regression models

Abstract

This work describes a series of techniques designed to obtain regression models that are resistant to multicollinearity and have other features needed for meaningful results. These models include enhanced ridge regressions with several regularization parameters, regressions by data segments and by levels of the dependent variable, latent class models, unitary response models, orthogonal and equidistant regressions, minimization in the Lp-metric, and other criteria and models. All the approaches have been implemented in practice in various projects and found useful for decision making in economics, management, marketing research, and other fields requiring data modeling and analysis.

Keywords

Ridge regressions, regularizations, latent class regression, equidistant model, Lp-metric

1. Introduction

Besides the classical and modern methods of regression modeling described in recent literature (for example, Young, 2017; Demidenko, 2020; Irizarry, 2020), there are many techniques developed for solving various special problems. Continuing the previous review on co-operative game theory in regression modeling (Lipovetsky, 2021), the current review presents works on building models by different criteria and with different properties. It covers enhanced ridge regressions with several regularization parameters, which help to diminish the distorting impact of multicollinear regressors on their coefficients in the model; regressions by data segments and by levels of the dependent variable; latent class regressions; models for a unitary constant response; orthogonal and equidistant regression models; minimization of deviations in different metrics and in the generalized power Lp-metric; and other criteria and models. The described approaches have been checked and implemented in practice in various research projects in economics, management, and marketing research, and they can also be helpful in other fields requiring data modeling and analysis.

2. Ridge-type regularizations in regression

Ridge regression was introduced by Hoerl and Kennard (1970, 2000) for building a model resistant to multicollinearity, and it has been modified in various works, among which the most popular are LASSO, elastic nets, and Shapley value regression. Developing the classical ridge regression further, the work (Lipovetsky & Conklin, 2005a) suggested regularizing the model not only by the minimum norm of its parameters but also by the deviations from orthogonality between the regressors and the residual errors, and by deviations from three other desired properties of the solution. This objective generalizes ridge regression to a two-parameter model which is not prone to multicollinearity and always outperforms the regular one-parameter ridge in quality of fit. Further works (Lipovetsky, 2006, 2009, 2010, 2018) studied the quality characteristics of the two-parameter ridge regression and extended it to a family of several enhanced ridge models with even better characteristics of fit and other valuable statistical features. An application of ridge regression to the problem of survey sample balancing with maximum effective base, well known in marketing research, was considered in (Lipovetsky, 2007a).
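The relation between the one-parameter and two-parameter ridge solutions can be illustrated with a minimal sketch. It assumes standardized data, so that C is the correlation matrix of the regressors and r is the vector of their correlations with the dependent variable, and it assumes that the two-parameter solution has the form of a ridge solution rescaled by a second parameter q chosen to maximize the fit. This form, the function names, the simulated data, and the value k = 0.1 are illustrative assumptions and may differ in detail from the cited works.

```python
import numpy as np

def standardize(a):
    """Center each column and scale it to unit length."""
    a = a - a.mean(axis=0)
    return a / np.linalg.norm(a, axis=0)

def ridge_coefficients(C, r, k):
    """One-parameter ridge solution b = (C + k I)^{-1} r for standardized data."""
    return np.linalg.solve(C + k * np.eye(C.shape[0]), r)

def r_squared(C, r, b):
    """R^2 for standardized data: 1 - ||y - X b||^2 = 2 b'r - b'C b."""
    return 2.0 * b @ r - b @ C @ b

def two_parameter_ridge(C, r, k):
    """Assumed two-parameter form: ridge solution rescaled by q that maximizes R^2."""
    b = ridge_coefficients(C, r, k)
    q = (b @ r) / (b @ C @ b)  # maximizer of 2 q b'r - q^2 b'C b over q
    return q * b, q

# Simulated, highly collinear regressors (illustrative data only)
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
X = np.column_stack([z + 0.05 * rng.normal(size=n) for _ in range(3)])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

Xs, ys = standardize(X), standardize(y)
C, r = Xs.T @ Xs, Xs.T @ ys

k = 0.1
b_ridge = ridge_coefficients(C, r, k)
b_two, q = two_parameter_ridge(C, r, k)
r2_ridge = r_squared(C, r, b_ridge)
r2_two = r_squared(C, r, b_two)
print(f"q = {q:.4f}, R2 ridge = {r2_ridge:.4f}, R2 two-parameter = {r2_two:.4f}")
```

Under these assumptions the gain in fit equals (b'r - b'Cb)^2 / (b'Cb), which is never negative, and the rescaled solution restores the orthogonality between the fitted values and the residuals that the one-parameter ridge loses for k > 0.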

Other regularization methods were developed in (Lipovetsky & Conklin, 2001a, 2003a, b). Several regularization techniques based on the orthonormal decomposition of the data matrix were compared in (Lipovetsky & Conklin, 2014), where it was shown that these approaches are useful in practical regression modeling, especially for big data.

3. Other criteria and models

The representation of the ordinary least squares (OLS) solution for multiple linear regression as a weighted mean of partial slopes, regression models by data segments via discriminant analysis (DA), and latent class regressions in the iteratively reweighted least squares (IRLS) approach are described in (Lipovetsky & Conklin, 2001b, 2005b, c). Unitary response regression models and regression split by levels of the dependent variable are considered in (Lipovetsky, 2007b, 2012).

Criteria of the shortest distance from the observations to the theoretical surface, used when all variables are measured with errors, were studied in the works on orthogonal regression in special metrics and in implicit function forms (Lipovetsky, 1975, 1976, 1979). Other criteria, such as equidistant deviations and optimization by the generalized power Lp-metric for deviations in regression, are presented in (Lipovetsky, 2007c, d).
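The shortest-distance criterion can be sketched in its plain Euclidean version, in which the fitted hyperplane minimizes the sum of squared perpendicular distances from the observations. The sketch below uses the singular value decomposition of the centered data; it is a generic total least squares illustration, not the quasi-orthogonal or special-metric variants of the cited works, and the function name and simulated data are assumptions made for the example.

```python
import numpy as np

def orthogonal_regression(X, y):
    """Fit y = intercept + X @ slopes by minimizing squared perpendicular distances."""
    Z = np.column_stack([X, y])
    center = Z.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal
    # to the best-fitting hyperplane through the centroid.
    _, _, vt = np.linalg.svd(Z - center, full_matrices=False)
    normal = vt[-1]
    slopes = -normal[:-1] / normal[-1]
    intercept = center[-1] - center[:-1] @ slopes
    return slopes, intercept

# Simulated data with errors of equal variance in both variables (illustrative only)
rng = np.random.default_rng(1)
n = 2000
t = rng.normal(size=n)                        # unobserved true regressor
x = t + 0.3 * rng.normal(size=n)              # observed with error
y = 1.0 + 2.0 * t + 0.3 * rng.normal(size=n)  # observed with error

tls_slopes, tls_intercept = orthogonal_regression(x.reshape(-1, 1), y)
ols_slope = np.polyfit(x, y, 1)[0]
print(f"orthogonal slope = {tls_slopes[0]:.3f}, OLS slope = {ols_slope:.3f}")
```

With one regressor, the orthogonal slope always lies between the OLS slope of y on x and the inverse of the OLS slope of x on y, so it is less attenuated than OLS when the regressor itself is measured with error.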

Determining the theoretical form of the relation between variables by dimensional considerations (Lipovetsky, 1987), and building on that basis the constant elasticity of substitution (CES) function mixed with the generalized Box-Cox (GBC) function, were used in studies of globally concave, monotone, and flexible cost functions of electricity consumption (Tishler & Lipovetsky, 1997, 2000).

Questions of prediction were considered in (Lipovetsky & Conklin, 2014), where in particular it was demonstrated why the predicted dependent variable and the coefficient of multiple determination in OLS regression do not depend on the degree of ill-conditioning of the correlation matrix. For forecasting with a new set of predictors, which are correlated in any real data, the work (Lipovetsky, 2017) shows how to adjust the new values of the predictors to take into account their own correlation structure. Regression modeling and prediction by individual observations versus their frequencies were compared in (Lipovetsky, 2019), which explains why a model built on a dataset can have a low quality of fit and poor predictions of individual observations, while using the frequencies of possible combinations of the predictors and the outcome yields a model with the same parameters but with a high quality of fit and precise predictions.
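The independence of OLS predictions from ill-conditioning can be checked numerically with a standard projection argument: the fitted values are the projection of the dependent variable onto the space spanned by the regressors, so any invertible recombination of the regressors changes their correlations but not the fit. The sketch below is an illustration under that argument and is not necessarily the demonstration given in the cited work; the mixing matrix T, the function name, and the simulated data are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS fitted values and R^2 for a model with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return fitted, 1.0 - ss_res / ss_tot

# Two nearly uncorrelated predictors (illustrative data only)
rng = np.random.default_rng(2)
n = 300
X = rng.normal(size=(n, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=n)

# Hypothetical invertible mixing matrix that makes the predictors nearly collinear
T = np.array([[1.0, 1.0],
              [1.0, 1.02]])
X_collinear = X @ T

fit_a, r2_a = ols_fit(X, y)
fit_b, r2_b = ols_fit(X_collinear, y)
corr_a = np.corrcoef(X, rowvar=False)[0, 1]
corr_b = np.corrcoef(X_collinear, rowvar=False)[0, 1]
print(f"predictor correlation: {corr_a:.4f} vs {corr_b:.4f}")
print(f"R2: {r2_a:.6f} vs {r2_b:.6f}")
```

The individual coefficients of the two models differ greatly, which is where multicollinearity does its damage; the fitted values and the coefficient of determination coincide up to rounding error.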

4. Conclusions

The listed modeling techniques are useful for solving various specific regression problems and for finding the meaningful and interpretable results that data scientists, managers, and decision makers need in actual applications of statistical models in various fields.

References

Demidenko (2020). Advanced Statistics with Applications in R. John Wiley & Sons, Hoboken, NJ.

Hoerl, A.E., & Kennard, R.W. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 12, 55-67.

Hoerl, A.E., & Kennard, R.W. (2000). Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 42, 80-86.

Irizarry, R.A. (2020). Data Analysis and Prediction Algorithms with R. Chapman & Hall/CRC, Boca Raton, FL.

Lipovetsky (1975). The method of multifactor quasi-orthogonal regression. Industrial Laboratory, Plenum Publishing, 41(1), 709-717.

Lipovetsky (1976). Statistical estimations of a single interrelation equation with measurement errors. Industrial Laboratory, Plenum Publishing, 42(1), 768-774.

Lipovetsky (1979). Regression models of implicit functions. Industrial Laboratory, Plenum Publishing, 45(7), 1136-1141.

Lipovetsky (1987). Determining the form of relation between variables by dimensional consideration. Industrial Laboratory, Plenum Publishing, 53(1), 59-64.

Lipovetsky (2006). Two-parameter ridge regression and its convergence to the eventual pairwise model. Mathematical and Computer Modelling, 44, 304-318.

Lipovetsky (2007a). Ridge regression approach to sample balancing with maximum effective base. Model Assisted Statistics and Applications, 2, 17-26.

Lipovetsky (2007b). Unitary response regression models. International Journal of Mathematical Education in Science and Technology, 38, 1113-1120.

Lipovetsky (2007c). Equidistant regression modeling. Model Assisted Statistics and Applications, 2, 71-80.

Lipovetsky (2007d). Optimal Lp-metric for minimizing powered deviations in regression. Journal of Modern Applied Statistical Methods, 6, 219-227.

Lipovetsky (2009). Multiple regression in pair correlation solution. Journal of Modern Applied Statistical Methods, 8, 122-131.

Lipovetsky (2010). Enhanced ridge regressions. Mathematical and Computer Modelling, 51, 338-348.

Lipovetsky (2012). Regression split by levels of the dependent variable. Journal of Modern Applied Statistical Methods, 11, 319-324.

Lipovetsky (2017). Prediction of percent change in linear regression by correlated variables. Journal of Modern Applied Statistical Methods, 16(2), 347-358.

Lipovetsky (2018). Regressions regularized by correlations. Journal of Modern Applied Statistical Methods, 17, 1-16.

Lipovetsky (2019). Regression modeling and prediction by individual observations versus frequency. Journal of Modern Applied Statistical Methods, 18(1), 2-19.

Lipovetsky (2021). Game theory in regression modeling: A brief review on Shapley Value regression. Model Assisted Statistics and Applications, 16(2), forthcoming.

Lipovetsky & Conklin (2001a). Multiobjective regression modifications for collinearity. Computers and Operations Research, 28, 1333-1345.

Lipovetsky & Conklin (2001b). Regression as weighted mean of partial lines: interpretation, properties, and extensions. International Journal of Mathematical Education in Science and Technology, 32, 697-706.

Lipovetsky & Conklin (2003a). Dual- and triple-mode matrix approximation and regression modeling. Applied Stochastic Models in Business and Industry, 19, 291-301.

Lipovetsky & Conklin (2003b). A model for considering multicollinearity. International Journal of Mathematical Education in Science and Technology, 34, 771-777.

Lipovetsky & Conklin (2005a). Ridge regression in two parameter solution. Applied Stochastic Models in Business and Industry, 21, 525-540.

Lipovetsky & Conklin (2005b). Regression by data segments via discriminant analysis. Journal of Modern Applied Statistical Methods, 4, 63-74.

Lipovetsky & Conklin (2005c). Latent class regression model in IRLS approach. Mathematical and Computer Modelling, 42, 301-312.

Lipovetsky & Conklin (2014). Predictor relative importance and matching regression parameters. Journal of Applied Statistics, 42, 1017-1031.

Tishler & Lipovetsky (1997). The flexible CES-GBC family of cost functions: Derivation and application. The Review of Economics and Statistics, LXXIX, 638-646.

Tishler & Lipovetsky (2000). A globally concave, monotone and flexible cost function: derivation and application. Applied Stochastic Models in Business and Industry, 16, 279-296.

Young, D.S. (2017). Handbook of Regression Methods. Chapman & Hall/CRC, Boca Raton, FL.