Abstract
Machine learning based systems and products are reaching society at large in many aspects of everyday life, including financial lending, online advertising, pretrial and immigration detention, child maltreatment screening, health care, social services, and education. This phenomenon has been accompanied by an increase in concern about the ethical issues that may rise from the adoption of these technologies. In response to this concern, a new area of machine learning has recently emerged that studies how to address disparate treatment caused by algorithmic errors and bias in the data. The central question is how to ensure that the learned model does not treat subgroups in the population unfairly. While the design of solutions to this issue requires an interdisciplinary effort, fundamental progress can only be achieved through a radical change in the machine learning paradigm. In this work, we will describe the state of the art on algorithmic fairness using statistical learning theory, machine learning, and deep learning approaches that are able to learn fair models and data representation.
Introduction
Machine learning is increasingly used in a wide range of decision-making scenarios that have serious implications for individuals and society, including financial lending [21, 130], hiring [17, 84], online advertising [79, 161], pretrial and immigration detention [7, 170], child maltreatment screening [30, 182], health care [42, 115], social services [3, 53], and education [150, 155]. Whilst this has the potential to overcome undesirable aspects of human decision-making, there is concern that biases in the training data and model inaccuracies can lead to decisions that treat historically discriminated groups unfavourably. The research community has therefore started to investigate how to ensure that learned models do not take decisions that are unfair with respect to sensitive attributes (i.e. race, gender, and political or sexual orientation).
The central question in algorithmic fairness is how to enhance machine learning algorithms with fairness requirements, namely ensuring that sensitive information does not unfairly influence the outcome of a model learned from data. In order to better understand the concept let us make some examples. If the learning problem is to decide whether a person should be offered a loan based on his previous credit card scores, we would like to build a model which does not unfairly use additional sensitive information such as the race. If the learning problem, instead, is to predict the probability of dropout or the performance of a student from the university based on his previous school records, we would like to build a model which does not unfairly use additional sensitive information such as the sex. Finally if the learning problem is to decide whether a person should be offered a job based on his skills, we would like to build a model which does not unfairly use additional sensitive information such as his political or sexual orientation.
Algorithmic fairness obviously requires, as cornerstone, a formal/mathematical definition of fairness and for these reasons researchers have put a lot of effort in the attempt of addressing this problem. This effort has led to the emergence of a rich set of fairness definitions [29, 200] providing researchers and practitioners with criteria to evaluate existing systems or to design new ones. Unfortunately, many of such definitions have been found to be mathematically incompatible [15, 111], and this has been viewed as representing an unavoidable trade-off establishing fundamental limits on fair machine learning, or as an indication that certain definitions do not map on to social or legal understandings of fairness [35]. Most fairness definitions express properties of the model output with respect to sensitive information, without considering the relations among the relevant variables underlying the data-generation mechanism. As different relations would require a model to satisfy different properties in order to be fair, this could lead to erroneously classify as fair/unfair models exhibiting undesirable/legitimate biases.
For this purpose, recently, the causal Bayesian network framework draw attention to this point, by visually describing unfairness in a dataset as the presence of an unfair causal path in the data-generation mechanism [27]. The causal framework [41, 162] offers an intuitive and powerful way of reasoning about fairness, by viewing unfairness as the presence of an unfair causal effect of the sensitive attribute on the decision [18, 204– 206]. This allows to understand how unfairness biases the data and it helps to select the correct notion of fairness to impose in the learned model.
In literature, it is possible to find a plethora of different methods to generate fair models, namely to impose fairness in the learned model, with respect to one or more notions of fairness and sensitive attributes. These methods can be mainly divided in three families: methods in the first family implement fairness by modifying the data representation, and then employ standard machine learning methods [25, 202]; in the second family, we can find methods that enforce fairness directly during the training phase, i.e. [2, 200]; the third family of methods change a pre-trained model in order to make it fairer (while trying to maintain the classification performance) [56, 163].
All methods in the previous families have in common the goal of creating a fair model from scratch on the specific task at hand. This solution may work well in specific cases, but in a large number of real world applications, using the same model (or at least part of it) over different tasks is helpful if not mandatory. For example, it is common to perform a fine tuning over pre-trained models [47], keeping fixed the internal representation. Indeed, most modern machine learning frameworks (especially the deep learning ones) offer a set of pre-trained models that are distributed in so-called model zoos 1 . Unfortunately, fine tuning pre-trained models on novel previously unseen tasks could lead to an unexpected unfairness behaviour, even starting from an apparently fair model for previous tasks (i.e. discriminatory transfer [118] or negative legacy [105]), due to missing generalization guarantees concerning the fairness property of the model.
In order to overcome the above problem, it is more appropriate to look at the learning problem in a multitask/lifelong learning framework [152]. In this context one leverages on task similarity in order to learn a fair representation that provably generalizes well to unseen tasks. By this we mean that when the representation is used to learn novel tasks, it is guaranteed to learn a model that has both a small error and meets the fairness requirement. Many papers in literature already pursued this goal [16, 186].
In this work, we will focus on describing the state of the art on algorithmic fairness using statistical learning theory, machine learning, and the deep leaning tools to be able to learn fair models and data representation. The rest of this work is organized as follows. Section 2 deals with the problem of formalize a way to measure the fairness. One can refer to [27] for understanding how unfair biases can be made apparent using causal models. Section 3 shows how to impose fairness in learning models from data. Section 4 shows how to learn a fair representation instead of a fair model. Section 5 concludes the work. Please refer to [151] for a deep discussion about the use of the sensitive attribute in the functional form of the models and representations. In order to not over-complicate the notation, we will use the most suited notation for each one of the different sections.
Measuring the (un)fairness
Algorithmic fairness obviously requires, as cornerstone, a formal/mathematical definition of fairness and for these reasons researchers have put a lot of effort in the attempt of addressing this problem. This effort has led to the emergence of a rich set of fairness definitions providing researchers and practitioners with criteria to evaluate existing systems or to design new ones. Unfortunately, many of such definitions have been found to be mathematically incompatible [15, 111], and this has been considered as an unavoidable trade-off establishing fundamental limits on fair machine learning, or as an indication that certain definitions do not map on to social or legal understandings of fairness [35]. Most fairness definitions express properties of the model output with respect to sensitive information. Other definitions, instead, try to define a fair representation concept, namely a way to change the data representation in order to remove the direct or indirect effect of the sensitive feature. Both approaches do not consider the relations among the relevant variables underlying the data-generation mechanism. As different relations would require a model to satisfy different properties in order to be fair, this could lead to erroneously classifying as fair/unfair models exhibiting undesirable/legitimate biases.
Examples of well known notion and widely exploited notions of fairness art Demographic Parity [76], Equal Opportunity [76], Equal Odds [76], Counterfactual Fairness [117], and the Wasserstein Distance [201]. A synthetic review of most of the notions of fairness proposed in literature can be found in Table 1.
A synthetic review of most of the notions of fairness.
A synthetic review of most of the notions of fairness.
In this work we will detail, in the following section, just the notions that we will actually use.
Broadly speaking, we can group current literature on algorithmic fairness into three main approaches. The first approach consists in pre-processing the data to remove bias, or in extracting representations that do not contain sensitive information during training [16, 2010]. The second approach consists in performing a post-processing of the model outputs [1, 208]. The third approach consists in enforcing fairness notions by imposing constraints into the optimization, or by using an adversary [4, 203]. Some methods transform the constrained optimization problem via the method of Lagrange multipliers [2, 197]. Other works similar in spirit add penalties to the objective [49, 114]. Adversarial methods maximize the system ability to predict the target while minimizing the ability to predict the sensitive attribute [203].
A full synthetic review of most of the papers in literature is reported in Table 3. In particular for each paper in
literature on fairness we reported the Method Family: Pre- (PreP), In- (InP), or Post-Processing
(PostP); the kind of Tasks is able to
solve: Binary Classification (BC), Multi-class Classification (MC), Regression (R),
Clustering (C), or Multi Armed Bandit (MAB); what kind of Protected Attribute is able to handle: Binary (B),
Categorical (C), or Numerical (N); Notion of Fairness (see Table 1); if the paper
contains or not Theoretical Results; if
the paper contains or not Experimental Results; the list of papers (methods) compared in the
papers; if the Code is
Available.
A synthetic review of most of the papers available in the literature
A synthetic review of most of the papers available in the literature
Public real world datasets exploited in fairness-related problems.
This review surely covers many papers in literature regarding the fairness topic but some of them are still missing. For example, all the works dealing just with the problem of detecting, and not solving, unfairness issues like [159, 204].
Moreover in Table 5 we review the public real world datasets exploited in fairness-related problems.
In the rest of this section we will present in details one method for each one of the three families (Sections 3.1, 3.2 and 3.3).
Common notions of fairness, methods, and studies have been developed in the setting of classification with categorical sensitive features. Nevertheless, they can be extended to the general supervised learning setting (regression and classification) with general sensitive features (categorical and continuous) [153]. For example, authors in [49] observed that simple notion of fairness can be incorporated within the Empirical Risk Minimization (ERM) framework. But, as the fairness measures mentioned above are more general than those employed in [49], authors in [153] entended the original framework to cover the whole supervised learning setting named General FERM (G-FERM). Authors in [153] show that G-FERM is supported by consistency guarantees both in terms of risk and fairness measure. Specifically, authors in [153] derive both risk and fairness bounds, which support the statistically consistency of G-FERM. Authors in [153] give a concrete instance of G-FERM in the setting of kernel methods, leading to a form of constrained regularized empirical risk minimization, in which the fairness constraint is obtained by composing the ℓ1 norm with a linear transformation.
In this section we present this approach and, in particular, we present new generalized notions of fairness that encompass well studied notions used for classification and regression with categorical and numerical sensitive features. Second, we report statistical bounds for G-FERM that imply consistency properties both in terms of fairness measure and risk of the selected model. As a third contribution, we instantiate G-FERM in the setting of kernel methods, leading to an efficient convex estimator. Authors in [153] test this estimators on a series of tests. The experimental results show that the estimator is effective at mitigating the trade-off between accuracy and fairness requirements.
The importance of having a general approach to fairness
In the context of fairness, most papers in literature address the problem of binary classification task with categorical (or even binary) sensitive features [76, 197]; a broad review on classification with categorical sensitive features is provided in [49]. This task is indeed very important, because it is strictly related to the possibility of having access to specific benefits (e.g., loans) without being discriminated due to gender or ethnic characteristics. On the other hand, the set of problems solvable by using these methods is limited and not comprehensive of all the real-world case scenarios.
Focusing on the works able to handle regression tasks, we can divide them by the type of problems they are able to solve and the notion of fairness they exploit. As we will see, with very few exceptions - e.g., [113] - most of the methods in literature are not able to deal with both classification and regression tasks and with both numerical and categorical sensitive features with a unified approach supported by theoretical consistency results. In fact, they introduce task oriented notions of fairness and/or do address the statistical consistency of their method with respect to the risk and the fairness measure employed.
The largest family of methods tackle regression problems with (single) categorical or binary sensitive features [14, 168]. For example, in [14], a convex approach for regression is proposed, where the authors use a specific definition of fairness in order to have models which treat similar examples in a similar way, in the sense of the predicted outcome. The authors tackle the problem by introducing a new convex regularizer and by imposing this notion on different regression tasks. Another example is [59], where the authors use an adapted version of Demographic Parity [50] for classification, in the context of regression.
Reducing the regression problem to have only categorical sensitive features is a serious limitation. In this sense, few interesting papers present regression methods able to deal with continuous sensitive attributes [113, 160]. Differently to the approach of this section, the authors impose other definitions of fairness (e.g., Disparate Impact [197] or even ad-hoc brand new definitions). Moreover, it is important to note that these methods do not naturally extend to the case of not-continuous sensitive attributes.
Considering a larger spectrum of possible methodologies, it is possible to find in literature other methods able to solve regression tasks by imposing some concept of fairness. The works [143, 144] tackle the regression problem exploiting the causal machine learning framework. These methods can potentially handle both continuous and categorical sensitive features. The authors’ analysis considers only the case of categorical ones, leaving the evolution to continuous sensitive attributes as possible future works. Another interesting idea, presented in [196], is to study the fairness as a property of the metric of the feature space. The authors introduce a new definition of metric-related fairness allowing them to solve a regression problem with categorical and continuous sensitive attributes. Finally, learning fair pre-processing rules is another possible way to obtain a regression model that is fair. In fact, for example in [202], the fair representation of the data can be used in synergy with any classic regression method, in order to generate a fair regression model.
Setting
Let
Let K and Q be positive integers and define the sets
We consider a function (or model) f chosen from a set
The purpose of a learning procedure is to find a model that minimizes the risk. Since the probability measure μ is usually unknown, the risk cannot be computed, however we can compute the empirical risk and a natural learning strategy, called Empirical Risk Minimization (ERM), is then used to minimize the empirical risk within a prescribed set of functions, see e.g., [175].
ε-loss general fair
In the literature different definitions of fairness of a classifier or real-valued function exist as described in Section 3.1.1. It is important to stress that there is not yet a consensus about which definition should be employed to evaluate algorithmic fairness. Moreover, most of the current fairness definitions are not able to deal with regression problems (or with continuous sensitive attributes), losing their meaning or being even not definable. In this work we use a general notion of fairness able to deal with both classification and regression and with both categorical and numerical sensitive features and which generalizes previously known notions of fairness.
We say that a function f is ε-loss general fair (ε-LGF) with ε ∈ [0, 1] if it satisfies the following condition
This definition says that a model is fair if its errors, relative to the loss function,
are approximately equally distributed independently of the value of the sensitive
attribute. Definition 2 includes Definition 1 when we choose
In this section, we aim at minimizing the risk subject to a fairness constraint. Specifically, we consider the problem
We will refer to Problem (6) as G-FERM since it generalizes the FERM approach introduced in [49].
Let f* be a solution of Problem (5), and let
For this purpose, we require that for any data distribution, it holds with probability at least 1 - δ with respect to the draw of a dataset that
We are now ready to state the first result of this section.
The proof is reported in [153] and is the generalization of the one in [49]. The proof is based on the triangular inequality and union bound.
A consequence of the first statement of Theorem 1 is that as n tends
to infinity
Thanks to Theorem 1 we can state that f* is close to
Note that, the quantities in Problem (12) cannot be computed since the underline data generating distribution is unknown. Moreover, the objective function and the fairness constraint of Problem (12) are non convex.
Theorem 1 allows us to solve the first issue since we can safely search for a solution
Unfortunately, Problem (12)empirical is still a difficult non-convex non-smooth
problem, and for this reason it is more convenient to solve a convex relaxation. That
is, we replace the possible non-convex loss function in the risk with its convex upper
bound ℓ
c
(e.g., the square loss
ℓ
c
= (y - f (z)) 2)
and the losses ℓ
k
,
1 ≤ k ≤ K, in the constraint with a relaxation
(e.g., the linear loss
Note that this approximation of the fairness constraint corresponds to matching the first order moment [49].
The questions that arise here are whether
The first statement of Proposition 1 tells us that exploiting the quality in
approximating the risk depends on the quality of the convex approximation. The second
statement of Proposition 1, instead, tells us that if
The bound in Proposition 1 may be tightened by using different non-linear approximations of the GF. However, the linear approximation proposed in this work gives a convex problem, and as showed in [153], works well in practice.
In summary, the combination of Theorem 1 and Proposition 1 provides conditions under
which a solution
In this section, we specify the G-FERM framework to the case that the underlying space of models is a reproducing kernel Hilbert space (RKHS) [176, 177].
We let
We propose to solve Problem (15) in the case that
Using Equation (17) the constraint in Problem (15) becomes
In practice, we solve the following Tikhonov regularization problem
Problem (20) can be kernelized by observing that, thanks to the Representer Theorem
[176]
The dual of Problem (20) may be derived using Fenchel duality, see e.g., [19, Theorem 3.3.5].
Finally, we note that in the case when φ is the identity mapping
(i.e., κ is the linear kernel on
In this section we will show that the method presented in Section 3.1 can be translated into a pre-processing method. To simplify, in this section, we will illustrate just the case of binary classification with a binary valued sensitive feature but the result can be easily generalized [49, 153].
Fair empirical risk minimization
We begin by introducing the notation. We let
The purpose of a learning procedure is to find a model that minimizes the risk. Since
the probability measure μ is usually unknown, the risk cannot be
computed, however we can compute the empirical risk
Let us introduce slightly less general notions of fairness with respect to the one defined in Section 3.1, which still encompasses some previously used notions and it allows to introduce new ones by specifying the loss function used below.
This definition says that a model is fair if it commits approximately the same error on the positive class independently of the group membership. That is, the conditional risk L+,g is approximately constant across the two groups. Note that if ε = 0 and we use the hard loss function, ℓ h (f (x) , y) = 1{yf(x)≤0}, then Definition 3 is equivalent to definition of EO proposed by [76], namely:
This equation means that the true positive rate is the same across the two groups. Furthermore, if we use the linear loss function ℓ l (f (x) , y) = (1 - yf (x))/2 and set ε = 0, then Definition 3 gives
By reformulating this expression we obtain a notion of fairness introduced in [51]
Yet another implication of Equation (23) is that the output of the model is uncorrelated with respect to the group membership conditioned on the label being positive [48], that is, for every g ∈ {a, b}, we have
Finally, we observe that the approach naturally generalizes to other fairness measures that are based on conditional probabilities, e.g., Equal Odds [76]. Specifically, we would require in Definition 3 that |Ly,a (f) - Ly,b (f) | ≤ ε for both y ∈ {-1, 1}.
Fair Empirical Risk Minimization
In this section, we aim at minimizing the risk subject to a fairness constraint. Specifically, we consider the problem
Since the measure μ is unknown we replace the deterministic quantities
with their empirical counterparts, possibly resulting in a convex problem (as we did in
Section 3.1). Then, we replace the deterministic quantity with their empirical
counterparts, the hard loss in the risk with a convex loss function
ℓ
c
(e.g., the Hinge loss
ℓ
c
= max {0, ℓ
l
}) and the hard
loss in the constraint with the linear loss ℓ
l
. In this way,
we look for a solution
Note that this approximation of the EO constraint corresponds to matching the first order moment. Other works tries to match the second order moment [189] or potentially infinitely many moments [164] but these approaches result in non-convex approaches.
Note that the solutions of Problems (24) and (25) are close under the same hypothesis described in Section 3.1.
In this section, we specify the FERM framework to the case that the underlying space of
models is a reproducing kernel Hilbert space (RKHS), see e.g., [176, 177] and references
therein. We let
We solve Problem (25) with
In practice, we solve the Tikhonov regularization problem
By the representer theorem [173], the
solution to Problem (29) is a linear combination of the feature vectors
φ (x1) , …,
φ (x
n
) and the vector
u. However, in this case u is itself a linear
combination of the feature vectors (in fact only those corresponding to the subset of
positive labeled points) hence w is a linear combination of the input
points, that is
Let us consider Problem (29) when φ is the identity mapping
(i.e., κ is the linear kernel on
In other words, we are able to obtain a fair linear model without any other constraint and by using a representation that has one feature fewer than the original one.
We can extend the approach to the non-linear case by also defining a fair kernel matrix instead of fair data mapping [49, 153].
In this section, we study the problem faced by [32] of fair binary classification using the notion of Equal Opportunity. It requires the true positive rate to distribute equally across the sensitive groups. Within this setting, it is possible to show that the fair optimal classifier is obtained by recalibrating the Bayes classifier by a group-dependent threshold. We provide a constructive expression for the threshold. This result motivates us to devise a plug-in classification procedure based on both unlabeled and labeled datasets. While the latter is used to learn the output conditional probability, the former is used for calibration. The overall procedure can be computed in polynomial time and it is shown to be statistically consistent both in terms of the classification error and fairness measure.
More precisely, in [32] the important problem of devising statistically consistent and computationally efficient learning procedures that meet the fairness constraint is addressed. Specifically, makes the following contributions are derived in [32]. First, authors derive in Proposition 2 the expression for the optimal equal opportunity classifier, derived via thresholding of the Bayes regressor. Second, inspired by the above result authors proposed a semi-supervised plug-in type method, which first estimates the regression function on labeled data and then estimates the unknown threshold using unlabeled data. Consequently, authors establish in Theorem 2 that the proposed procedure is consistent, that is, it asymptotically satisfies the equal opportunity constraint and its risk converges to the risk of the optimal equal opportunity classifier. Finally, in [32], authors present numerical experiments which indicate that the method is often superior or competitive with the state-of-the-art on benchmark datasets.
We highlight that the proposed learning algorithm can be applied on top of any off-the shelf method which consistently estimates the regression function (class condition probability), under mild additional assumptions which we discuss in the paper. Furthermore, the calibration procedure is based on solving a simple univariate problem. Hence the generality, statistical consistency and computational efficiency are strengths of the approach.
Related work
To the best of our knowledge the formula for the optimal fair classifier presented in [32] is novel. In [76] the authors note that the optimal equalized odds or equal opportunity classifier can be derived from the Bayes optimal regressor, however, no explicit expression for this threshold is provided. The idea of recalibrating the Bayes classifier is also discussed in a number of papers, see for example [139, 163] and references therein.
Plug-in methods in classification problems are well established and are well studied from statistical perspective, see [10, 193] and references therein; in particular, it is known that one can build a plug-in type classifier which is optimal in minimax sense [10, 193]. Until very recently, theoretical studies on such methods were reduced to an efficient estimation of the regression function. Indeed, in standard settings of classification the threshold is always known beforehand, thus, all the information about the optimal classifier is wrapped into the distribution of the label conditionally on the feature.
More recently, classification problems with a distribution dependent threshold have emerged. Prominent examples include classification with non-decomposable measures [116, 209], classification with reject option [43, 120], and confidence set setup of multi-class classification [31, 172], among others. A typical estimation algorithm in these scenarios is based on the plug-in strategy, which uses extra data to estimate the unknown threshold. Interestingly, in some setups a practitioner does not need to have access to two labeled samples and optimal estimation can be efficiently performed in semi-supervised manner [31, 44].
Optimal equal opportunity classifier
Let (X, S, Y) be a tuple on
There are various definitions of fairness available in the literature, each having its critics and its supporter. In this work, we employ the following definition introduced in [76]. We refer the reader to this work as well as [2, 139] for a discussion, motivation of this definition, and a comparison to other fairness definitions.
The set of all fair classifiers is denoted by
Note, that the definition of fairness depends on the underlying distribution
Using this notion of fairness we define an optimal equal opportunity classifier as a solution of the optimization problem
We now introduce an assumption on the regression function that plays an important role in establishing the form of the optimal fair classifier.
Moreover, for every s ∈ {0, 1}, we assume that η (X, s) ≥1/2S = s > 0.
The first part of Assumption 3.3.2 is achieved by many distributions and has been introduced in various contexts, see e.g., [31, 191] and references therein. It says that, for every s ∈ {0, 1} the random variable η (X, s) does not have atoms, that is, the event ensη (X, s) = t has probability zero. The second part of the assumption states that the regression function η (X, s) must surpass the level 1/2 on a set of non-zero measure. Informally, returning to scholarship example mentioned in the introduction, this assumption means that there are individuals from both groups who are more likely to be offered a scholarship based on their curriculum.
In the following result we establish that the optimal equal opportunity classifier is obtained by recalibrating the Bayes classifier.
Furthermore it holds that absθ* ≤ 2.
The proof is reported in [32].
Before proceeding further, let us define a notion of unfairness, which plays a key role in the statistical analysis; it is sometimes referred to as difference of equal opportunity (DEO) in the literature [49].
A principal goal of this paper is to construct a classification algorithm
In this section, we present the proposed plug-in algorithm and begin to study its theoretical properties.
We assume that we have at our disposal two datasets, labeled
Based on the estimator
Using the above estimators a straightforward procedure to mimic the optimal classifier
g* provided by Proposition 2 is to employ a
plug-in rule
It remains to define the value of
Notice that the empirical unfairness
Using the above expression we can rewrite
Hence, the passage from the population unfairness to its empirical version in Definition 6 formally reduces to substituting “hats” to all the unknown quantities.
Using Definition 6, a logical estimator
In this case, the algorithm
Consistency
In this section we establish that the proposed procedure is consistent. To present the
theoretical results we impose two assumptions on the estimator
There
exists a sequence
cn,N > 0
satisfying
The second part of Assumption 2 means that the quanitity
Additionally, we impose one more condition on the estimator
In the proposed settings this assumption allows us to show that the value of
The remarks suggest that both Assumptions 2 and 3 can be easily satisfied in a variety
of practical settings and the most demanding part of these assumptions is the
consistency of
The next result establishes the statistical consistency of the proposed algorithm.
The proof is reported in [32].
Methods for learning fair representations
Let us consider a composition of models
f (g (x)) where
In the current literature [16, 202] - with fair representation we refer to the
concept of learning a representation function g, which does not
discriminate subgroups in the data. Namely, g is conditionally independent
of subgroup membership. This approach is different from most commonly used approaches [49, 197], in which the focus is to solve a task (or a set
of tasks) without discriminating subgroups in the data, regardless of the fairness of the
representation itself. That is, in the previously mentioned work a fair model
Note that these method could be considered a special case of the Pre-Processing methods, but the conceptual difference is that here, the representation, is not a deterministic mapping but it is a learned, problem dependent, mapping.
In particular, in [16, 186], the authors propose different neural networks architectures together with modified learning strategies able to learn a representation that obscures or removes the sensitive variable. In the general case, all these methods have an input, a target variable (i.e., the task at hand) and a binary sensitive variable. The objective is to learn a representation that: preserves information about the input space; is useful for predicting the target; is approximately independent of the sensitive variable. In practice, these methods pursue the goal of making the generated model act randomly when the internal representation is exploited to predict the sensitive variable. In this sense, no actual constraint is directly imposed on the internal representation, but only on the output of the model.
In [92], instead, the authors show how to formulate the problem of counterfactual inference as a domain adaptation problem, and more specifically a covariate shift problem [165]. The authors derive two new families of representation algorithms for counterfactual inference. The first one is based on linear models and variable selection, and the other one on deep learning. The authors show that learning representations that encourage similarity (i.e., balance) between the treatment and control populations lead to better counterfactual inference; this is in contrast to many methods which attempt to create balance by re-weighting samples.
Finally, in [202], the authors learn a representation of the data that is a probability distribution over clusters where learning the cluster of a datapoint contains no-information about the sensitive variable, namely fair clustering. In this sense, the clustering is learned to be fair and also discriminative for the prediction task at hand.
Method
In this section, we present a method to learn a shared fair representation from multiple
tasks [152]. We consider T
supervised learning tasks (each could be a binary classification or regression problem).
Each task t ∈ {1, …, T} is identified by a probability
distribution μ
t
on
Depending on the application at hand, the model may include
(i.e.,
A general multitask learning formulation (MTL) is based on minimizing the multitask
empirical error plus a regularization term which leverages similarities between the tasks.
A natural choice for the regularizer which is considered in this section is given by the
trace norm, namely the sum of the singular values of the matrix
Note that the method can be interpreted as a 2-layer network with linear activation
functions. Indeed, the matrix applied to an input vector
This is a crude approximation since it corresponds to requiring the first order moment of the two distributions to be the same. However, as shown in [152], it works well in practice and has the major advantage of turning a non-convex constraint in a convex one. We note that a similar approximation has been considered in [153] in the case of fair regression, and reported to be empirically effective.
Based on the above reasoning, we propose to learn a fair linear representation as a
solution to the optimization problem
When we regard A as fixed and solve w.r.t. B, then Problem (37) can be reformulated as
We conclude this section by noting that if demographic parity is satisfied at the representation level, that is, Equation (34) holds true, then every model built from such representation will satisfy demographic parity as well. Likewise if the representation satisfies the convex relaxation of Equation (36), then it will also hold that 〈w t , c (z t )〉=〈b t ,A⊤ c (z t ) 〉=0, that is the task weight vectors will satisfy the first order moment approximation of demographic parity. More importantly, as we will show in the next section, if the tasks are randomly observed, then demographic parity will also be satisfied on future tasks with high probability. In this sense the method can be interpreted as learning a fair transferable representation.
In this section, we study the learning ability of the proposed method. We consider the setting of learning-to-learn [12], in which the training tasks (and their corresponding datasets) used to find a fair data representation are regarded as random variables from a meta-distribution. The learned representation matrix A is then transferred to a novel task, by applying ridge regression on the task dataset, in which the input x is transformed as . In [136] a learning bound is presented, linking the average risk of the method over tasks from the meta-distribution (the so-called transfer risk) to the multi-task empirical error on the training tasks. This result quantifies the good performance of the representation learning method when the number of tasks grow and the data distribution on the raw input data is intrinsically high dimensional (hence learning is difficult without representation learning). We extend this analysis to the setting of algorithmic fairness, in which the performance of the algorithm is evaluated both relative to risk and the fairness constraint. We show that both quantities can be bounded by their empirical counterparts evaluated on the training tasks.
To present the result we introduce some more notation. We let
Furthermore, for every matrix
The proof is reported in [152].
We need to make some remarks on the above result. The first bound in Theorem 4.2 improves
Theorem 2 in [136]. The improvement is due to
the introduction of the empirical total covariance in the second term in the RHS of the
inequality. The result in [136] instead contains
the term
Since machine learning based systems and products are reaching many aspects of everyday life, an increase in concern about the ethical issues that may rise from the adoption of these technologies started to emerge. Contemporary researchers have started to investigate methods to mitigate the possible side effects of these technologies and a new area of machine learning has recently emerged that studies how to address disparate treatment caused by algorithmic errors and bias in the data. In this work we tried to describe the state of the art on algorithmic fairness using statistical learning theory, machine learning, and the deep leaning tools to be able to learn fair models and data representation. The central question that we tried to answer is how to ensure that the learned model does not treat subgroups in the population unfairly. While solving such an issue requires an interdisciplinary effort, fundamental progress can only be achieved through a radical change in the machine learning paradigm. In particular, we first show that it is possible to learn fair models with pre/in/post-processing techniques able to ensure that the outputs of a learned model satisfy a particular notion of fairness. After this step, we show that forcing the fairness in the model output is not enough to reach fairness. In modern contexts, models are not learned from scratch since datasets are too complex or small in cardinality. Consequently, we need to exploit correlation between tasks in order to solve a new task. This can be done using a shared representation, in the transfer/multitask/lifelong setting, which allows to learn more effectively all the tasks. In the context of this work, the representation should also be fair, in the sense that each possible model learned from this representation should be, as a consequence, fair. For this reason, in this work, we also reviewed methods to build a fair representation.
This field or research is very interdisciplinary and relatively new and this work surely does not exhaustively review all the ideas around fairness. For example, all the legal issues about fairness have not been taken into account and the collaboration between society, politician, lawyers, and AI researchers is the necessary future to let this field of research be useful in real applications. Another interesting field of research, which will become increasingly important in the future, is the field of causal inference which provides a precious tool to reason about and deal with fairness, especially in complex unfairness scenarios. Another limitation of this manuscript is that it did not discuss fairness in a temporal setting, namely the consequences of current decisions in the long term. This area of research has only recently started to be explored but will have to play a critical role in the development of truly fair techniques.
Acknowledgments
This work has been partially supported by the Amazon AWS Machine Learning Research Award and the European Union’s Horizon 2020 research and innovation programme under the NGI_TRUST grant agreement no 825618 - Third-party project AMNESIA Assessment of fairness of NGI AI-based future interactive technologies.
Footnotes
This problem has been faced in deep in [
].
We did not face this issue in details in this paper since it is mainly a legal issue. In
fact, removing the sensitive feature from the model functional form of the model does not
ensure fairness since other features may be correlated to the sensitive one, nevertheless
including or not these features is just a technical issue (which has been treated in a
different section of this paper).
Note additionally that for all s ∈ {0, 1} we can write Y = 1, g (X, s) =1 ≡ Yg (X, s), since both Y and g are binary.
The method naturally extends to multiple sensitive variables but for ease of presentation we consider only the binary case.
If r < min(d, T) then Problem (33) is equivalent to trace norm regularization plus a rank constraint.
