Abstract
This work reviews various techniques of logistic and multinomial-logit modeling and their modifications. These methods are useful for regression modeling with a binary or categorical outcome, structuring in regression and clustering, singular value decomposition and principal component analysis with positive loadings, and numerous other applications. In particular, these models are employed in discrete choice modeling and in best-worst scaling, both known in applied psychology and socio-economic studies.
Introduction
Logistic and multinomial-logit regressions are widely used modeling techniques applied across numerous fields. The two previous reviews listed the works on linear multiple regression and its modifications used for a continuous dependent variable (Lipovetsky, 2021a, b), and the current review describes methods developed for solving various special problems with binary and categorical dependent variables. They include techniques for logistic and multinomial-logit (MNL) modeling with their modifications useful for regression modeling, structuring in regression and clustering, singular value decomposition (SVD) and principal component analysis (PCA) with positive loadings, and numerous other applications. In particular, these models are widely employed in item response theory (IRT), discrete choice modeling (DCM), and best-worst scaling (BWS), well known in applied psychology and socio-economic studies (Linden, 2019; Mair, 2018). All the described approaches have been developed and tried in real research projects in economics, management, and marketing research, and they can be applied in other fields as well.
Logistic regression models enhanced
A generalization of the regular logistic model to a larger family of flexible functions with richer structure, produced by the Box-Cox power transformation, was suggested in (Lipovetsky & Conklin, 2000), where it was presented via a hyperbolic arcsine function having a better predictive ability. As the Box-Cox parameter approaches zero, the generalized function reduces to the regular logistic model; for other values it yields various algebraic forms. Applying the entropy criterion in logistic regression was described in (Lipovetsky, 2006a), where it was shown that this approach yields a logistic model with coefficients proportional to the coefficients of linear regression. Based on this property, the Shapley value estimation of predictors’ importance (Lipovetsky, 2021a) was employed to obtain logistic model parameters that are interpretable and robust to multicollinearity.
Double logistic regression modeling was presented in (Lipovetsky, 2010a), where the double sigmoid behavior was described instead of the regular sigmoid curve. It consists of a first increase to an early saturation at an intermediate level, followed by a second sigmoid with the eventual plateau of saturation. Such functions have been used, for example, in biometrics, physiology, and activation processes in tri-state neural networks. A trinomial response model given via one logit regression was considered in (Lipovetsky, 2015a), where it is shown that a response variable with three ordinal categorical levels of the negative-neutral-positive kind can be obtained in one logit regression, with the positive category predictions located closer to 1, the negative closer to 0, and the neutral in the middle of such a continuous 0-1 scale. Examples of such ordered categorical output include the large-medium-small size of soda that people buy in relation to other meals and demographics, and gold-silver-bronze medaling in Olympic sport, with relevant predictors of training hours, diet, age, and popularity of this kind of activity in the athletes’ home country.
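The double sigmoid behavior described at the start of this paragraph can be illustrated with a minimal sketch. It assumes one common form, a sum of two logistic curves with their own heights, slopes, and midpoints. This form, the function name double_logistic, and all parameter values are illustrative assumptions and not necessarily the parameterization used in (Lipovetsky, 2010a).

```python
import numpy as np

def double_logistic(x, h1, k1, c1, h2, k2, c2):
    """Sum of two logistic curves (an assumed, illustrative form).

    h1, h2: heights of the first and second rise
    k1, k2: slopes; c1, c2: midpoints of each rise
    """
    first = h1 / (1.0 + np.exp(-k1 * (x - c1)))
    second = h2 / (1.0 + np.exp(-k2 * (x - c2)))
    return first + second

# Made-up parameters: an early rise to 0.4, then a later rise to 1.0.
x = np.linspace(-10.0, 30.0, 401)
y = double_logistic(x, h1=0.4, k1=2.0, c1=0.0, h2=0.6, k2=2.0, c2=20.0)
```

With these values the curve saturates first near the intermediate level h1 and later near the final plateau h1 + h2, which is the two-stage shape described in the text.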
Analytical closed-form solution for binary logit regression by categorical predictors was described in (Lipovetsky, 2015b). In contrast to the common belief that the logit model has no analytical presentation, it is possible to find such a solution in the case of categorical predictors. No special software and no iterative procedures of nonlinear estimation are needed to obtain a model with its coefficients, their standard errors and
MNL in relation to logistic and linear regressions with special properties
MNL, or multinomial-logit regression, is a widely used tool for problems with a categorical dependent variable, particularly for DCM, or discrete choice modeling. In the work (Lipovetsky, 2011) it was shown that, by a special rearrangement of data, different kinds of MNL, such as conditional and multinomial logit models, can be represented via binary logit regressions, which are much easier to build and to interpret. This approach was applied to finding DCM utilities and choice probabilities via empirical Bayes estimation, using an example from a real marketing research project, in the work (Lipovetsky, 2014).
MNL structuring can be applied for building multiple linear regressions with improved and special features. In the work (Lipovetsky, 2008a), to get a better fit for the values of the dependent variable, it was segmented into a few ranges and modeled as a linear aggregate of the chain regressions weighted by the MNL shares. Several linear-MNL hybrid models were constructed by the maximum likelihood objectives for the multinomial output and least squares (LS) for the segmented linear aggregates. These hybrid models always outperform ordinary linear regressions, demonstrating a better quality of fit and more precise prediction results. Another work (Lipovetsky, 2009a) considers multiple linear regression generalized by its coefficients varying by each observation. Such individual coefficients are defined via MNL shares of the predictors’ importance. This approach corresponds to a special MNL parameterization in generalized additive and in projection pursuit modeling, and is related to the random-coefficients regression. Linear regressions with special coefficients built by parameterization via exponential, logistic, and MNL functions are described in the work (Lipovetsky, 2009b). To obtain coefficients that are always positive, the exponential parameterization is applied, and to get coefficients in an assigned range, the logistic parameterization is used. The total of the coefficients obtained by the MNL parameterization equals one, so they define the shares of the predictors, which is useful for interpreting their importance. All these regression models are constructed by nonlinear optimization techniques, have stable solutions and good quality of fit, have a simple structure of linear aggregates, demonstrate high predictive ability, and suggest a convenient way to identify the main predictors.
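The three coefficient parameterizations described above can be sketched as follows. This is a minimal illustration only: the function names, the range bounds low and high, and the numeric values are assumptions made for the example, and the fitting step by nonlinear optimization is not shown.

```python
import numpy as np

def exp_coefs(a):
    """Always-positive coefficients: b_j = exp(a_j)."""
    return np.exp(a)

def logistic_coefs(a, low, high):
    """Coefficients confined to the range (low, high) via the logistic function."""
    return low + (high - low) / (1.0 + np.exp(-a))

def mnl_coefs(a):
    """Coefficients as MNL shares: positive and summing to one."""
    z = np.exp(a - np.max(a))  # shift by the maximum for numerical stability
    return z / z.sum()

# Unconstrained parameters (made-up values); an optimizer would update these.
a = np.array([-2.0, 0.0, 1.5])
b_pos = exp_coefs(a)                           # all entries > 0
b_rng = logistic_coefs(a, low=-1.0, high=1.0)  # all entries in (-1, 1)
b_shr = mnl_coefs(a)                           # entries sum to 1
```

In each case the optimizer works on the unconstrained vector a, while the regression uses the transformed coefficients, so the constraints hold automatically at every iteration.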
Logit and MNL in SVD, PCA, and clustering methods with special features
Logit and MNL functions have proved very useful in constructing other multivariate techniques with special properties. For example, SVD, or singular value decomposition, is widely used in data processing, reduction, and visualization. However, a positive matrix approximated by the first several dual vectors of the regular SVD can yield irrelevant negative elements. In the work (Lipovetsky & Conklin, 2005) it was shown that a logistic modification of SVD can be applied, producing the matrix approximation in a desired range of values at any step of approximation. In another paper (Lipovetsky, 2009c), the exponential, logistic, and MNL parameterizations were applied to the elements of the eigenvectors in SVD and in PCA, or principal component analysis, with nonnegative loadings. In contrast to the regular PCA and SVD, a matrix decomposition by the positive vectors shows explicitly which variables contribute to the data approximation and with what precision. The LS objective of matrix fit is reduced to the Rayleigh quotient for the variational description of the eigenvalues, the eigenvectors with the nonlinear parameterization are found by a Newton-Raphson optimizing procedure, and the results are interpreted by the Perron-Frobenius theory for each subset of variables identified by sparse loading vectors.
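One simple way to keep a low-rank approximation inside a desired range is sketched below. It is an assumption made for illustration and not necessarily the construction of (Lipovetsky & Conklin, 2005): the matrix is rescaled to (0, 1), the logit transform is applied, the truncated SVD is taken on the logit scale, and the result is mapped back through the logistic function. The function names and the test matrix are hypothetical.

```python
import numpy as np

def truncated_svd(m, rank):
    """Best rank-`rank` LS approximation of m by the leading singular triplets."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def logistic_svd_approx(x, rank, low=0.0, high=1.0):
    """Approximate x (entries strictly inside (low, high)) within that range."""
    p = (x - low) / (high - low)   # rescale to (0, 1)
    z = np.log(p / (1.0 - p))      # logit transform to the unbounded scale
    z_r = truncated_svd(z, rank)   # low-rank approximation on the logit scale
    return low + (high - low) / (1.0 + np.exp(-z_r))  # map back into the range

rng = np.random.default_rng(0)
x = rng.uniform(0.01, 0.99, size=(6, 4))  # made-up data inside (0, 1)
x_hat = logistic_svd_approx(x, rank=2)    # rank-2 approximation, still in (0, 1)
x_full = logistic_svd_approx(x, rank=4)   # full rank reproduces x
```

Because the logistic function maps any real value into the chosen range, the approximation stays in that range at every rank, whereas a truncated SVD applied to x directly carries no such guarantee.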
Application of MNL structuring to clustering problems is studied in (Lipovetsky, 2012). The maximum likelihood objectives for estimating the probability of each multivariate observation’s assignment to one particular cluster, or to at least one or more clusters, are considered, and the combination of both objectives yields maximization of the total odds of the probability to belong to one or another cluster. The gradient of the total odds objective is reduced to the MNL probabilities, leading to a convenient clustering procedure presented via an iteratively re-weighted least squares (IRLS) technique. Several other objectives for clustering are also described. Another work (Lipovetsky, 2013) considers how to find clusters’ centers and sizes in the nonlinear LS optimization with multinomial parameterization. The method is especially useful for large data sets as it operates on the summary statistics only. This approach also works for the problem of finding clusters’ centers and sizes by the variance-covariance matrix when the original data is not available. Estimation of the clusters’ centers and sizes can be followed by actual clustering, and the applications are discussed.
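How MNL probabilities can enter a clustering procedure is illustrated by the generic sketch below. It is not the total-odds objective or the IRLS procedure of (Lipovetsky, 2012); it assumes a plain soft-assignment scheme in which membership probabilities are MNL shares of negative squared distances to the centers, and the centers are then updated as probability-weighted means. All names, the scale parameter, and the data are hypothetical.

```python
import numpy as np

def mnl_memberships(x, centers, scale=1.0):
    """MNL probability of each observation belonging to each cluster."""
    # Squared Euclidean distances, shape (n_observations, n_clusters).
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    u = -scale * d2                     # closer center means higher utility
    u -= u.max(axis=1, keepdims=True)   # stabilize the exponent
    p = np.exp(u)
    return p / p.sum(axis=1, keepdims=True)

def update_centers(x, p):
    """Centers as means of the observations weighted by membership."""
    return (p.T @ x) / p.sum(axis=0)[:, None]

# Made-up data: two well-separated groups in two dimensions.
x = np.array([[0.0, 0.0], [0.2, -0.1], [5.0, 5.0], [5.1, 4.8]])
centers = np.array([[1.0, 1.0], [4.0, 4.0]])  # rough starting guesses
for _ in range(20):
    p = mnl_memberships(x, centers)
    centers = update_centers(x, p)
labels = p.argmax(axis=1)  # hard assignment from the soft memberships
```

Each row of p sums to one, so it can be read as the shares of an observation across the clusters, and the re-weighted mean update plays a role loosely analogous to one re-weighting step.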
Logit and MNL applications to BWS and other choice models
The logistic and MNL techniques play an important role in special choice modeling techniques widely used in psychological and socio-economic research and applications. One of the most popular approaches to problems of choice and prioritization is the Best-Worst Scaling, or BWS (Louviere et al., 2000, 2015; Marley et al., 2008, 2016), sometimes also called the Maximum Difference, or MaxDiff, method. In the works (Lipovetsky & Conklin, 2014, 2015, 2019), employing the logistic and MNL regression properties, the BWS solutions were obtained in analytical closed form, with and without hierarchical Bayesian modeling, with adjustment to non-available items and network effects, respectively. The works (Lipovetsky, 2018a, b, 2019) present new approaches of quantum probability amplitude and complex utility in entangled discrete choice modeling, choice probability estimations on the aggregate and individual respondent level, and a BWS prioritization method based on D. Kahneman’s System 1 approach to the process of making choices, respectively.
Various other approaches to choice modeling and decision making solved with logistic and MNL techniques include, for example, the van Westendorp price sensitivity meter and the Bradley-Terry choice model (Lipovetsky, 2006b, 2008b). In the large area of multiple-criteria decision making, for example, in the Analytic Hierarchy Process (AHP) originated by T. Saaty (1980, 2005), new extensions can be achieved with logit and MNL modeling as well (Lipovetsky, 2021c, d).
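The link between the Bradley-Terry choice model mentioned above and the logistic function can be shown in a few lines. In its standard form, the probability that item i is preferred to item j is exp(u_i) / (exp(u_i) + exp(u_j)), which equals the logistic function of the utility difference u_i - u_j. The function name bt_prob and the utility values are hypothetical, and the sketch does not reproduce the estimation methods of the cited works.

```python
import numpy as np

def bt_prob(u_i, u_j):
    """Bradley-Terry probability that item i is preferred to item j."""
    return 1.0 / (1.0 + np.exp(-(u_i - u_j)))

utilities = np.array([0.5, 0.0, -1.0])      # hypothetical item utilities
p01 = bt_prob(utilities[0], utilities[1])   # item 0 over item 1
p10 = bt_prob(utilities[1], utilities[0])   # item 1 over item 0

# The same probability written as a ratio of exponentiated utilities.
ratio = np.exp(utilities[0]) / (np.exp(utilities[0]) + np.exp(utilities[1]))
```

The two directions of a paired comparison sum to one, and the ratio form coincides with the binary case of the MNL share.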
Conclusions
The listed techniques of modified and enhanced logistic and multinomial-logit modeling vary with specific requirements, but they are all useful for solving actual problems in different fields. The suggested approaches are convenient in application and can enrich practical statistical modeling and decision making.
