Learning fair models and representations

Abstract

Machine learning based systems and products are reaching society at large in many aspects of everyday life, including financial lending, online advertising, pretrial and immigration detention, child maltreatment screening, health care, social services, and education. This phenomenon has been accompanied by growing concern about the ethical issues that may arise from the adoption of these technologies. In response to this concern, a new area of machine learning has recently emerged that studies how to address disparate treatment caused by algorithmic errors and bias in the data. The central question is how to ensure that the learned model does not treat subgroups in the population unfairly. While the design of solutions to this issue requires an interdisciplinary effort, fundamental progress can only be achieved through a radical change in the machine learning paradigm. In this work, we describe the state of the art on algorithmic fairness using statistical learning theory, machine learning, and deep learning approaches that are able to learn fair models and data representations.

Keywords

Algorithmic fairness, fair models, fair representation

1 Introduction

Machine learning is increasingly used in a wide range of decision-making scenarios that have serious implications for individuals and society, including financial lending [21, 130], hiring [17, 84], online advertising [79, 161], pretrial and immigration detention [7, 170], child maltreatment screening [30, 182], health care [42, 115], social services [3, 53], and education [150, 155]. Whilst this has the potential to overcome undesirable aspects of human decision-making, there is concern that biases in the training data and model inaccuracies can lead to decisions that treat historically discriminated groups unfavourably. The research community has therefore started to investigate how to ensure that learned models do not take decisions that are unfair with respect to sensitive attributes (e.g. race, gender, and political or sexual orientation).

The central question in algorithmic fairness is how to enhance machine learning algorithms with fairness requirements, namely ensuring that sensitive information does not unfairly influence the outcome of a model learned from data. A few examples help to make the concept concrete. If the learning problem is to decide whether a person should be offered a loan based on their previous credit card scores, we would like to build a model that does not unfairly use additional sensitive information such as race. If the learning problem is instead to predict the probability of dropout, or the performance, of a university student based on their previous school records, we would like to build a model that does not unfairly use additional sensitive information such as sex. Finally, if the learning problem is to decide whether a person should be offered a job based on their skills, we would like to build a model that does not unfairly use additional sensitive information such as their political or sexual orientation.

Algorithmic fairness requires, as its cornerstone, a formal mathematical definition of fairness, and researchers have therefore put considerable effort into addressing this problem. This effort has led to a rich set of fairness definitions [29, 200], providing researchers and practitioners with criteria to evaluate existing systems or to design new ones. Unfortunately, many of these definitions have been found to be mathematically incompatible [15, 111]. This has been viewed either as an unavoidable trade-off that establishes fundamental limits on fair machine learning, or as an indication that certain definitions do not map onto social or legal understandings of fairness [35]. Most fairness definitions express properties of the model output with respect to sensitive information, without considering the relations among the relevant variables underlying the data-generation mechanism. As different relations would require a model to satisfy different properties in order to be fair, this could lead to erroneously classifying as fair models that exhibit undesirable biases, or as unfair models that exhibit legitimate ones.

For this reason, the causal Bayesian network framework has recently been used to draw attention to this point, by visually describing unfairness in a dataset as the presence of an unfair causal path in the data-generation mechanism [27]. The causal framework [41, 162] offers an intuitive and powerful way of reasoning about fairness, by viewing unfairness as the presence of an unfair causal effect of the sensitive attribute on the decision [18, 204–206]. This makes it possible to understand how unfairness biases the data, and it helps to select the correct notion of fairness to impose on the learned model.

The literature offers a plethora of methods to generate fair models, namely to impose fairness on the learned model with respect to one or more notions of fairness and sensitive attributes. These methods can be divided into three main families: methods in the first family implement fairness by modifying the data representation and then employ standard machine learning methods [25, 202]; methods in the second family enforce fairness directly during the training phase, e.g. [2, 200]; methods in the third family change a pre-trained model in order to make it fairer while trying to maintain the classification performance [56, 163].

All methods in the previous families share the goal of creating a fair model from scratch for the specific task at hand. This solution may work well in specific cases, but in a large number of real world applications, using the same model (or at least part of it) over different tasks is helpful if not mandatory. For example, it is common to fine-tune pre-trained models [47] while keeping the internal representation fixed. Indeed, most modern machine learning frameworks (especially the deep learning ones) offer a set of pre-trained models that are distributed in so-called model zoos. Unfortunately, fine-tuning pre-trained models on novel, previously unseen tasks can lead to unexpectedly unfair behaviour, even when starting from a model that is apparently fair on previous tasks (e.g. discriminatory transfer [118] or negative legacy [105]), because generalization guarantees concerning the fairness property of the model are missing.

In order to overcome the above problem, it is more appropriate to look at the learning problem in a multitask/lifelong learning framework [152]. In this context one leverages task similarity in order to learn a fair representation that provably generalizes well to unseen tasks. By this we mean that when the representation is used to learn novel tasks, the learned model is guaranteed both to have a small error and to meet the fairness requirement. Many papers in the literature have already pursued this goal [16, 186].

In this work, we focus on describing the state of the art on algorithmic fairness using statistical learning theory, machine learning, and deep learning tools that are able to learn fair models and data representations. The rest of this work is organized as follows. Section 2 deals with the problem of formalizing a way to measure fairness. One can refer to [27] to understand how unfair biases can be made apparent using causal models. Section 3 shows how to impose fairness when learning models from data. Section 4 shows how to learn a fair representation instead of a fair model. Section 5 concludes the work. Please refer to [151] for an in-depth discussion of the use of the sensitive attribute in the functional form of the models and representations. In order not to over-complicate the notation, we use the most suitable notation for each section.

2 Measuring the (un)fairness

Algorithmic fairness requires, as its cornerstone, a formal mathematical definition of fairness, and researchers have therefore put considerable effort into addressing this problem. This effort has led to a rich set of fairness definitions, providing researchers and practitioners with criteria to evaluate existing systems or to design new ones. Unfortunately, many of these definitions have been found to be mathematically incompatible [15, 111]. This has been considered either as an unavoidable trade-off that establishes fundamental limits on fair machine learning, or as an indication that certain definitions do not map onto social or legal understandings of fairness [35]. Most fairness definitions express properties of the model output with respect to sensitive information. Other definitions, instead, try to define a fair representation, namely a way to change the data representation in order to remove the direct or indirect effect of the sensitive feature. Neither approach considers the relations among the relevant variables underlying the data-generation mechanism. As different relations would require a model to satisfy different properties in order to be fair, this could lead to erroneously classifying as fair models that exhibit undesirable biases, or as unfair models that exhibit legitimate ones.

Examples of well-known and widely exploited notions of fairness are Demographic Parity [76], Equal Opportunity [76], Equal Odds [76], Counterfactual Fairness [117], and the Wasserstein Distance [201]. A synthetic review of most of the notions of fairness proposed in the literature can be found in Table 1.

Table 1
A synthetic review of most of the notions of fairness.

Notion	Abbreviation	First Appeared
α-Protection	α-P	[159]
Indirect Discriminatory Measure	ELB	[72]
Decision Policy Discrimination	DPD	[131]
Prediction Dependency	PredD	[23]
Dataset Discrimination	DD	[97]
Discrimination Score	DS	[22]
Calders-Verwer Score	CVS	[22, 105]
Statistical Parity	SP	[50]
Mean Difference	MD	[24]
Area Under ROC Curve	AUC	[24]
Disparate Impact	DI	[56]
ε-Fairness	ε-F	[56]
η-Neutrality	η-N	[60]
Discrimination Correlation Indicator	DCI	[125]
Demographic Parity	DP	[76]
Equal Opportunity	EOp	[76]
Equal Odds	EOd	[76]
Fair Prediction Rule	FairPR	[124]
Indifference	Indiff	[94]
Total Causal Effect	TCE	[206]
Cross-Pair Group Fairness	CPGF	[14]
Hilbert-Schmidt Empirical Cross-Covariance	HSIC	[160]
Expected Statistical Parity	ESP	[37]
Expected Predictive Equality	EPE	[37]
Calibration	Calib	[37]
Balanced Loss	BL	[51]
False Positive Subgroup Fairness	FPSF	[107]
Proxy Discrimination	ProxD	[108]
Proxy Discrimination in Expectation	PDE	[108]
P% -Rule	P-R	[113]
Normalised Disparate Impact	NDI	[137]
α-Discrimination	α-D	[189]
Value Unfairness	ValU	[194]
Absolute Unfairness	AbsU	[194]
Underestimation Unfairness	UeU	[194]
Overestimation Unfairness	OeU	[194]
Preferred Impact	PrefI	[199]
Preferred Treatment	PrefT	[199]
Disparate Mistreatment	DM	[197]
Absolute Value Difference	AVD	[13]
Squared Difference	SD	[13]
Balance	Bal	[28]
Relaxed Equal Odds with Calibration	REOC	[163]
Path Specific Effect	PSE	[143]
Natural Direct Effect	NDE	[143]
Mean Difference Discrimination Score	MDDS	[168]
k-way Discrimination Score	k-DS	[168]
Maximum Discrimination	MaxD	[168]
Discrimination In Prediction	DiscrP	[208]
Loss-Averse Statistical Parity	L-ASP	[5]
Loss-Averse Equal Opportunity	L-AEOp	[5]
Difference of Equal Opportunity	DEO	[33]
Hirschfeld-Gebelein-Rényi	HGR	[134]
Coefficient of Determination	CoD ∣ R²	[114]
Difference of Equal Opportunity	DEOp	[151]
Difference of Equal Odds	DEOd	[151]
Subgroup Risk	SR	[188]
Strong Demographic Parity	SDP	[91]
Strong Pairwise Demographic Disparity	SPDD	[91]
Group Fairness in Expectation	GFE	[59]
Prejudice Index	PI	[105]
Fair-Factorization	FF	[106]
Resilience to Random Bias	RRB	[58]
Normalised Discounted Difference	rND	[192]
Normalised Discounted Ratio	rRD	[192]
Normalised Discounted KL-divergence	rKL	[192]
Explanatory Conditional Discrimination	ECD	[210]
Expected Conditional Statistical Parity	ECSP	[37]
Individual Proxy Discrimination	IPD	[108]
Disparate Treatment	DispT	[197]
Disparity Amplification	DA	[78]
k-Neighbours Difference	k-ND	[126]
Fairness Lipschitz Property	FLP	[50]
Cross-Pair Individual Fairness	CPIF	[14]
Decision Boundary Covariance	DBC	[198]
Random Bias Individual Fairness	RBIF	[57]
Inconsistency Score	IS	[168]
(α, γ)-Approximately Metric-Fair	(α, γ)-AMF	[196]
Constant Relative Risk Aversion	CRRA	[81]
Rawlsian Equal Opportunity	R-EOP	[82]
Egalitarian Equal Opportunity	e-EOP	[82]
Generalised Entropy Index	GEI	[179]
Counterfactual Fairness	CF	[117]
ε, δ-Approximate Counterfactual Fairness	ε, δ-ACF	[171]
Counterfactual Direct Effect	CF-DE	[204]
Counterfactual Indirect Effect	CF-IE	[204]
Counterfactual Spurious Effect	CF-SE	[204]
Chebyshev Demographic Parity	CDP	[207]
Maximum Mean Discrepancy	MMD	[68]
Fairness Ramp-Constraint	FRC	[65]
δ-fairness	δ-F	[96]
Impartiality Score	IS	[94]
Formal Equality of Opportunity	FEO	[94]
Full Substantive Equality of Opportunity	F-SEO	[94]
Log-Linear Interaction	LLI	[190]
Markov Decision Fairness	MDF	[88]
Approximate-Choice Markov Decision Fairness	α-CF	[88]
Approximate-Action Markov Decision Fairness	α-AF	[88]
Indirect Influence	II	[1]
ε-Loss Fair	ε-LF	[49]
α-MultiCalibration	α-MC	[80]
Covariance Constraint	CC	[149]
Metric MultiFairness	MMC	[109]
ε-Loss General Fair	ε-LGF	[153]
Mutual Information	MI	[186]
Kullback-Leibler Divergence	KL-D	[186]
Wasserstein Distance	WD	[201]
Path Specific Counterfactual Fairness	PSCF	[26]

In the following sections, we detail only the notions that we actually use.
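To make three of the group notions named above more concrete, the following is a minimal sketch of how the gaps associated with Demographic Parity, Equal Opportunity, and Equal Odds could be estimated from binary predictions. It assumes a binary sensitive attribute and binary labels; the function names and the choice of summarizing Equal Odds by the largest of the two conditional gaps are illustrative assumptions, not definitions taken from the cited papers.

```python
import numpy as np

def positive_rate(y_pred, mask):
    """Fraction of positive predictions among the samples selected by mask."""
    return float(np.mean(y_pred[mask])) if mask.any() else float("nan")

def group_fairness_gaps(y_true, y_pred, s):
    """Absolute gaps between the groups s == 0 and s == 1.

    dp  : gap in P(y_pred = 1 | s)                       (Demographic Parity)
    eop : gap in P(y_pred = 1 | y_true = 1, s)           (Equal Opportunity)
    eod : largest gap in P(y_pred = 1 | y_true = y, s)
          over y in {0, 1}                               (Equal Odds, assumed summary)
    """
    y_true, y_pred, s = map(np.asarray, (y_true, y_pred, s))
    g0, g1 = (s == 0), (s == 1)
    dp = abs(positive_rate(y_pred, g0) - positive_rate(y_pred, g1))
    tpr_gap = abs(positive_rate(y_pred, g0 & (y_true == 1))
                  - positive_rate(y_pred, g1 & (y_true == 1)))
    fpr_gap = abs(positive_rate(y_pred, g0 & (y_true == 0))
                  - positive_rate(y_pred, g1 & (y_true == 0)))
    return {"dp": dp, "eop": tpr_gap, "eod": max(tpr_gap, fpr_gap)}

# Hypothetical toy data: four samples per group.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
s      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
gaps = group_fairness_gaps(y_true, y_pred, s)
print(gaps)
```

A gap of zero means the corresponding notion is satisfied exactly on the sample; in practice one would typically tolerate a small positive value.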

3 Methods for learning fair models

Broadly speaking, we can group the current literature on algorithmic fairness into three main approaches. The first approach consists in pre-processing the data to remove bias, or in extracting representations that do not contain sensitive information during training [16, 210]. The second approach consists in post-processing the model outputs [1, 208]. The third approach consists in enforcing fairness notions by imposing constraints on the optimization, or by using an adversary [4, 203]. Some methods transform the constrained optimization problem via the method of Lagrange multipliers [2, 197]. Other works, similar in spirit, add penalties to the objective [49, 114]. Adversarial methods maximize the system's ability to predict the target while minimizing its ability to predict the sensitive attribute [203].
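The idea of adding a fairness penalty to the training objective can be sketched as follows. This is a minimal illustration and not the method of any cited paper: it assumes a linear logistic model, and it penalizes the squared sample covariance between the sensitive attribute and the model score. The function name, the penalty weight `mu`, the step size, and the synthetic data are all hypothetical choices.

```python
import numpy as np

def train_penalized_logreg(X, y, s, mu=0.0, lr=0.1, n_iter=2000):
    """Gradient descent on: mean log-loss(w) + mu * cov(s, X @ w)**2."""
    n, d = X.shape
    w = np.zeros(d)
    s_c = s - s.mean()            # centred sensitive attribute
    c = X.T @ s_c / n             # cov(s, X @ w) equals c @ w, linear in w
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))      # predicted probabilities
        grad_loss = X.T @ (p - y) / n           # gradient of the log-loss
        grad_pen = 2.0 * mu * (c @ w) * c       # gradient of the penalty
        w -= lr * (grad_loss + grad_pen)
    return w, c

# Synthetic data in which the first feature is correlated with s.
rng = np.random.default_rng(0)
n = 400
s = rng.integers(0, 2, n).astype(float)
x1 = s + rng.normal(0.0, 1.0, n)
x2 = rng.normal(0.0, 1.0, n)
X = np.column_stack([x1, x2, np.ones(n)])       # last column is the intercept
y = (x1 + x2 + rng.normal(0.0, 0.5, n) > 0.5).astype(float)

w_plain, c = train_penalized_logreg(X, y, s, mu=0.0)
w_fair, _ = train_penalized_logreg(X, y, s, mu=10.0)
print("cov(s, score) without penalty:", c @ w_plain)
print("cov(s, score) with penalty:   ", c @ w_fair)
```

Increasing `mu` trades predictive accuracy for a weaker linear dependence between the score and the sensitive attribute. The fixed step size is only adequate for penalty weights of this order; a larger `mu` would need a smaller step.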

A full synthetic review of most of the papers in the literature is reported in Table 2. In particular, for each paper on fairness we report

the Method Family: Pre- (PreP), In- (InP), or Post-Processing (PostP);

the kind of Task it is able to solve: Binary Classification (BC), Multi-class Classification (MC), Regression (R), Clustering (C), or Multi-Armed Bandit (MAB);

the kind of Protected Attribute it is able to handle: Binary (B), Categorical (C), or Numerical (N);

the Notion of Fairness it adopts (see Table 1);

whether the paper contains Theoretical Results;

whether the paper contains Experimental Results;

the list of papers (methods) it is compared against (Comparison Against);

whether the Code is Available.

Table 2
A synthetic review of most of the papers available in the literature

Paper	Method Family	Task	Protected Attribute	Notion of Fairness	Theoretical Results	Experimental Results	Comparison Against	Code Available
[97]	PreP	BC	B, C	DD		✓		✓
[158]	PostP	BC	B, C	α-P		✓
[23]	PreP	BC	B, C	PredD		✓	[97]
[22]	PreP, InP, PostP	BC	B	DS (CVS)		✓		✓
[100]	InP, PostP	BC	B	DS		✓	[22 , 97]	✓
[98]	PreP	BC	B, C	DS		✓	[23, 97]	✓
[126]	PreP	BC, MC	B, C, N	k-ND		✓		✓
[210]	PreP	BC	B	ECD		✓	[23, 98]	✓
[72]	PreP	BC	B, C	ELB		✓		✓
[105]	InP	BC	B, C	PI	✓	[22]
[50]	InP	BC	B, C, N	FLP, SP	✓
[71]	PreP	BC	B, C	ELB		✓		✓
[101]	InP	BC	B, C	DS		✓	[22 , 100]	✓
[99]	PreP	BC	B	DS	✓	✓	[22, 100]	✓
[131]	PreP	BC	B, C	DPD		✓		✓
[73]	PostP	BC	B, C	α-P		✓		✓
[24]	InP	BC, MC, R	B, C	MD, AUC		✓
[106]	InP	BC	B	FF		✓	[22]
[202]	PreP, InP	BC	B	SP		✓	[97, 103]
[102]	PreP, PostP	BC	B	ECD		✓	[23, 98]	✓
[132]	PreP	BC	B, C	α-P		✓	[22]	✓
[74]	PreP	BC	B, C	α-P		✓		✓
[56]	PreP	BC	B	DI, ε-F	✓	✓	[97 , 202]
[123]	PreP	BC, MC, R	B, C, N	DP, MMD		✓	[202]
[52]	PreP, InP	BC	B	SP		✓		✓
[60]	InP	BC, MC, R	B, C, N	η-N	✓	✓	[22, 104]
[75]	PreP	BC	B, C	α-P	✓	✓		✓
[125]	InP	BC, MC	B, C	DCI		✓		✓
[55]	PreP	BC	B, C	SP, DI		✓	[202]	✓
[57]	PreP, InP	BC	B	DP, RBIF		✓	[202]
[76]	PostP	BC	B, C	EOp, EOd	✓	✓
[65]	InP	BC	B, C	FRC		✓	[198]	✓
[96]	InP	BC	B, C	δ-F	✓			✓
[95]	InP	BC	B, C	δ-F	✓	✓		✓
[58]	PostP	BC, MC, R	B, C, N	RRB	✓	✓	[97 , 202]
[124]	PreP	BC, MC, R	B, C, N	FairPR		✓
[94]	InP	BC, MC, R	B, C, N	FEO, F-SEO		✓
[190]	PostP	BC, MC	B, C	LLI		✓		✓
[206]	PreP	BC	B, C	TCE		✓	[56, 210]	✓
[198]	InP	BC	B, C	DBC		✓	[98, 103]
[51]	InP	BC, MC	B, C	BL	✓	✓
[14]	InP	BC, MC, R	B	CPIF, CPGF		✓
[88]	InP	BC	B, C	MDF, α-CF, α-AF	✓
[93]	PreP	BC	B, C, R	FairPR		✓		✓
[108]	InP	BC, MC	B, C	ProxD, PDE, IPD
[113]	PreP, InP	BC, MC, R	B, C, N	P-R	✓	✓	[24 , 202]	✓
[117]	InP	BC, MC, R	B, C	CF	✓	✓
[171]	InP	BC, MC, R	B, C	ε, δ-ACF		✓
[189]	InP, PostP	BC	B	α-D	✓
[199]	InP, PostP	BC, MC	B, C	PrefI, PrefT		✓
[197]	InP	BC	B, C	DM		✓	[76]
[192]	InP	BC	B	rND, rKL, rRD		✓		✓
[207]	PreP	BC	B	CDP	✓	✓	[56, 210]	✓
[13]	InP	BC	B	AVD, SD		✓	[76, 197]
[28]	PreP	C	B	Bal	✓	✓
[160]	PreP, InP	BC, MC, R	B, C, N	HSIC		✓
[16]	PreP, InP	BC, MC, R	B, C	DP, EOp		✓
[164]	InP	BC	B, C	MMD		✓	[76, 197]
[194]	InP	BC, MC, R	B, C	ValU, AbsU, UeU, OeU	✓
[137]	PreP	BC, MC	B	SP, NDI	✓	✓
[149]	InP	BC	B, C	CC	✓	✓	[197]
[163]	PostP	BC	B	REOC	✓	✓	[76, 197]
[1]	PostP	BC, MC	B, C, N	II		✓	[83]	✓
[25]	PreP	BC	B, C	α-P, DP	✓	✓	[202]
[37]	InP	BC	B, C	ESP, ECSP, EPE	✓
[107]	InP	BC	B, C	SP, FPSF	✓	✓	[2]	✓
[80]	InP	BC	B, C, N	α-MC	✓			✓
[69]	InP	BC	B	ESP, EPE
[208]	PreP, PostP	BC	B	DiscrP	✓	✓	[76, 197]
[4]	InP	BC, MC	B, C	EOd	✓			✓
[49]	PreP, InP	BC	B, C	ε-LF	✓	✓	[76, 197]	✓
[2]	PostP	BC	B, C	DP, EOd	✓	✓	[76, 99]
[64]	InP	MAB	B, C	FLP
[78]	InP	BC, MC, R	B, C, N	DA		✓
[109]	PostP	BC	B, C, N	MMC				✓
[129]	InP	BC	B, C	EOd	✓
[128]	PreP	BC	B	DP, EOp, EOd	✓	✓	[52]	✓
[145]	InP	BC, MC	B, C	DP, EOd	✓	✓
[203]	PreP, InP	BC, MC, R	B, C	DP, EOd, EOp	✓	✓	[16]
[133]	PreP, InP	BC, MC	B, C	DP, EOd		✓	[2 , 203]
[143]	InP	BC, MC, R	B, C	PSE, NDE		✓
[168]	PostP	BC, MC, R	B, C, N	MDDS, k-DS, MaxD, IS		✓	[97 , 202]
[167]	PreP, InP	BC	B	MDDS, IS		✓	[52 , 202]
[196]	InP	BC, MC, R	B, C, N	(α, γ)-AMF	✓
[185]	PreP, InP	BC, MC, R	B, C	DP, EOd		✓
[67]	PreP	BC	B	DI	✓	✓	[56]
[63]	PreP, InP	BC, MC, R	B, C	MI, EOd	✓
[5]	PostP	BC	B	L-ASP, L-AEOp		✓
[39]	InP	BC, MC	B, C	DP, EOp	✓	✓		✓
[40]	InP	BC, MC	B, C	DP, EOp	✓	✓		✓
[81]	InP	BC, MC, R	B, C, N	CRRA		✓
[114]	InP	BC, MC, R	B, C, N	CoD	✓	✓
[186]	PreP	BC, MC, R	B, C, N	MI, KL-D		✓
[178]	PreP	BC, MC, R	B, C, N	MI	✓	✓
[82]	InP	BC, MC, R	B, C	R-EOP, e-EOP		✓	[81]
[179]	InP	BC, MC, R	B, C	GEI	✓	✓	[197]
[201]	PreP	BC, MC, R	B, C	WD	✓	✓
[59]	InP, PostP	BC, MC, R	B, C	GFE	✓	✓
[144]	InP	BC, MC, R	B, C	PSE	✓	✓
[33]	PostP	BC	B	DEO	✓	✓	[49 , 197]
[86]	InP	BC	B	ε-LF	✓
[110]	PostP	BC	B	α-MC	✓	✓		✓
[134]	InP	BC, MC, R	B, C, N	HGR	✓	✓	[13, 49]
[147]	PostP	BC	B	EOd, EOp		✓	[76]
[153]	PreP, InP	BC, MC, R	B, C, N	ε-LGF	✓	✓	[197, 198]
[151]	InP	BC	B, C	DEOp, DEOd		✓
[152]	PreP, InP	BC, MC, R	B, C	DP	✓	✓	[52, 128]
[188]	InP	BC, MC, R	B, C, N	SR	✓	✓	[49]
[26]	InP	BC, MC, R	B, C	PSCF, MMD	✓	✓
[91]	InP, PostP	BC	B, C, R	SDP, SPDD, WD	✓	✓	[76]
[200]	InP	BC, MC	B, C	DBC, DI, DM		✓	[37 , 103]
[138]	PreP	BC	B	SP, DI, FLP	✓
[89]	PostP, InP	BC, MC	B	EOd	✓	✓

Table 3

Public real world datasets exploited in fairness-related problems.

Datasets	Reference	Number of Samples	Number of Features	Sensitive Features	Task
School Effectiveness	[66]	15362	9	Ethnicity, Gender	R
Heart Disease	[90]	303	75	Age, Gender	MC, R
German Credit	[85]	1K	20	Age, Gender/Marital-Stat	MC
Census/Adult Income	[112]	48842	14	Age, Ethnicity, Gender, Native-Country	BC
Contraceptive Method Choice	[121]	1473	9	Age, Religion	MC
Law School Admission	[187]	21792	5	Ethnicity, Gender	R
Arrhythmia	[70]	452	279	Age, Gender	MC
Communities & Crime	[169]	1994	128	Ethnicity	R
Wine Quality	[154]	4898	13	Color	MC, R
Heritage Health	[146]	≈60K	≈20	Age, Gender	MC, R
Stop, Question & Frisk	[45]	84868	≈100	Age, Ethnicity, Gender	BC, MC
Bank Marketing	[142]	45211	17-20	Age	BC
Diabetes US	[181]	101768	55	Age, Ethnicity	BC, MC
Student Performance	[38]	649	33	Age, Gender	R
CelebA Faces	[122]	≈200K	40	Gender, Skin-Paleness, Youth	BC
xAPI Students Perf.	[6]	480	16	Gender, Nationality, Native-Country	MC
Chicago Faces	[127]	597	5	Ethnicity, Gender	MC
Credit Card Default	[195]	30K	24	Age, Gender	BC
COMPAS	[119]	11758	36	Age, Ethnicity, Gender	BC, MC
MovieLens	[77]	100K	≈20	Age, Gender	R
Drug Consumption	[54]	1885	32	Age, Ethnicity, Gender, Country	MC
Student Academics Perf.	[87]	300	22	Caste, Gender	MC
NLSY	[148]	≈10K		Birth-date, Ethnicity, Gender	BC, MC, R
Diversity in Faces	[140]	1M	47	Age, Gender	MC, R

This review covers many papers in the literature on fairness, but some are still missing. For example, it omits all the works that deal only with detecting, and not solving, unfairness issues, such as [159, 204].

Moreover, in Table 3 we review the public real world datasets exploited in fairness-related problems.

In the rest of this section we present in detail one method for each of the three families (Sections 3.1, 3.2 and 3.3).

3.1 In-processing methods

Common notions of fairness, methods, and studies have been developed in the setting of classification with categorical sensitive features. Nevertheless, they can be extended to the general supervised learning setting (regression and classification) with general sensitive features (categorical and continuous) [153]. For example, the authors of [49] observed that a simple notion of fairness can be incorporated within the Empirical Risk Minimization (ERM) framework. However, as the fairness measures mentioned above are more general than those employed in [49], the authors of [153] extended the original framework to cover the whole supervised learning setting, naming the result General FERM (G-FERM). They show that G-FERM is supported by consistency guarantees in terms of both risk and fairness measure. Specifically, they derive both risk and fairness bounds, which support the statistical consistency of G-FERM. They also give a concrete instance of G-FERM in the setting of kernel methods, leading to a form of constrained regularized empirical risk minimization in which the fairness constraint is obtained by composing the ℓ₁ norm with a linear transformation.
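As a rough illustration of what regularized ERM with a linear fairness constraint can look like, the sketch below solves ridge regression subject to a single equality constraint u·w = 0, where u is the difference between the feature means of two groups. This is a deliberately simplified stand-in and not the G-FERM estimator of [153]: it assumes a linear model, the squared loss, a binary sensitive attribute, and a zero tolerance, in which case an ℓ₁ constraint on a one-row linear map of w reduces to this single equality. All names are hypothetical.

```python
import numpy as np

def fair_ridge(X, y, s, lam=1.0):
    """Ridge regression with and without the constraint u @ w = 0.

    u is the difference between the feature means of groups s == 1 and s == 0,
    so the constraint forces both groups to receive the same mean prediction.
    """
    d = X.shape[1]
    u = X[s == 1].mean(axis=0) - X[s == 0].mean(axis=0)
    M = X.T @ X + lam * np.eye(d)
    w_ridge = np.linalg.solve(M, X.T @ y)       # unconstrained ridge solution
    M_inv_u = np.linalg.solve(M, u)
    # Closed-form Lagrangian correction enforcing u @ w = 0 (requires u != 0).
    w_fair = w_ridge - M_inv_u * (u @ w_ridge) / (u @ M_inv_u)
    return w_ridge, w_fair, u

# Synthetic data in which the first feature is shifted for group s == 1.
rng = np.random.default_rng(1)
n = 200
s = rng.integers(0, 2, n)
X = rng.normal(size=(n, 3))
X[:, 0] += 2.0 * s
y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(0.0, 0.1, n)

w_ridge, w_fair, u = fair_ridge(X, y, s)
pred = X @ w_fair
gap = abs(pred[s == 1].mean() - pred[s == 0].mean())
print("mean-prediction gap, unconstrained:", abs(u @ w_ridge))
print("mean-prediction gap, constrained:  ", gap)
```

The constrained solution stays convex and has a closed form here only because there is one linear equality; a non-zero tolerance or several constraint rows would call for a generic convex solver.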

In this section we present this approach. First, we present new generalized notions of fairness that encompass well-studied notions used for classification and regression with categorical and numerical sensitive features. Second, we report statistical bounds for G-FERM that imply consistency properties in terms of both the fairness measure and the risk of the selected model. Third, we instantiate G-FERM in the setting of kernel methods, leading to an efficient convex estimator. The authors of [153] test this estimator in a series of experiments. The experimental results show that the estimator is effective at mitigating the trade-off between accuracy and fairness requirements.

3.1.1 The importance of having a general approach to fairness

In the context of fairness, most papers in the literature address binary classification with categorical (or even binary) sensitive features [76, 197]; a broad review on classification with categorical sensitive features is provided in [49]. This task is indeed very important, because it is strictly related to the possibility of accessing specific benefits (e.g., loans) without being discriminated against on the basis of gender or ethnicity. On the other hand, the set of problems solvable by these methods is limited and does not cover all real-world scenarios.

Focusing on the works able to handle regression tasks, we can divide them by the type of problems they are able to solve and the notion of fairness they exploit. As we will see, with very few exceptions (e.g., [113]), most of the methods in the literature are not able to deal with both classification and regression tasks, and with both numerical and categorical sensitive features, within a unified approach supported by theoretical consistency results. In fact, they introduce task-oriented notions of fairness and/or do not address the statistical consistency of their method with respect to the risk and the fairness measure employed.

The largest family of methods tackles regression problems with (single) categorical or binary sensitive features [14, 168]. For example, in [14], a convex approach for regression is proposed, where the authors use a specific definition of fairness in order to obtain models which treat similar examples in a similar way, in the sense of the predicted outcome. The authors tackle the problem by introducing a new convex regularizer and by imposing this notion on different regression tasks. Another example is [59], where the authors adapt Demographic Parity [50], originally defined for classification, to the context of regression.

Restricting the regression problem to categorical sensitive features is a serious limitation. In this sense, a few interesting papers present regression methods able to deal with continuous sensitive attributes [113, 160]. Unlike the approach of this section, these works impose other definitions of fairness (e.g., Disparate Impact [197] or even brand new ad hoc definitions). Moreover, it is important to note that these methods do not naturally extend to the case of non-continuous sensitive attributes.

Considering a larger spectrum of possible methodologies, it is possible to find in the literature other methods able to solve regression tasks by imposing some concept of fairness. The works [143, 144] tackle the regression problem by exploiting the causal machine learning framework. These methods can potentially handle both continuous and categorical sensitive features, but the authors' analysis considers only the categorical case, leaving the extension to continuous sensitive attributes as possible future work. Another interesting idea, presented in [196], is to study fairness as a property of the metric of the feature space. The authors introduce a new definition of metric-related fairness that allows them to solve a regression problem with categorical and continuous sensitive attributes. Finally, learning fair pre-processing rules is another possible way to obtain a fair regression model. For example, in [202], the fair representation of the data can be used in synergy with any classic regression method in order to generate a fair regression model.

3.1.2 Setting

Let $D = \{(x_{1}, s_{1}, y_{1}), \dots, (x_{n}, s_{n}, y_{n})\}$ be a training set formed by n samples drawn independently from an unknown probability distribution μ over $X \times S \times Y$ , where $X$ is the input space, $S$ is the space of the sensitive attributes and $Y$ is the output space. $S$ is the set of attributes for which we aim to ensure fairness (e.g., race if we want to decide whether to offer a loan, gender if we want to predict student performance, and sexual orientation if we want to decide whether to hire someone). Both $S$ and $Y$ may be finite or continuous; if $Y$ is a finite set of labels we are dealing with the classification setting and if $Y \subseteq ℝ$ we are dealing with the regression setting.

Let K and Q be positive integers and define the sets $Y_{K} = \{t_{1}, \dots, t_{K + 1}\} \subset ℝ$ and $S_{Q} = \{σ_{1}, \dots, σ_{Q + 1}\} \subset ℝ$ , where $t_{1} < t_{2} < \cdots < t_{K + 1}$ and $σ_{1} < σ_{2} < \cdots < σ_{Q + 1}$ . The sets $Y_{K}$ and $S_{Q}$ are prescribed by the user: the discretization process is driven by the application at hand and points in the same interval are regarded as indistinguishable. For example, it does not make sense to state that a group of students in a university is mistreated when the average grades of the groups differ by less than 5% of the mark range. We also define, for every 1 ≤ k ≤ K and 1 ≤ q ≤ Q, the subsets of training points $D_{k, q} = \{(x_{i}, s_{i}, y_{i}) : 1 \leq i \leq n, y_{i} \in [t_{k}, t_{k + 1}), s_{i} \in [σ_{q}, σ_{q + 1})\}$ and let $n_{k, q} = | D_{k, q} |$ .
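As an illustration of the discretization just described, the following minimal Python sketch (using NumPy) assigns each training point to its cell and tabulates the counts $n_{k, q}$ . The function name `cell_counts`, the 0-based indexing, and the convention of marking points that fall outside the outermost thresholds with -1 are assumptions made here for illustration and are not part of [153].

```python
import numpy as np

def cell_counts(y, s, t, sigma):
    """Bin outputs y and sensitive values s, and count the points per cell.

    t holds the K+1 increasing thresholds t_1 < ... < t_{K+1} and sigma the
    Q+1 thresholds sigma_1 < ... < sigma_{Q+1}. With 0-based indices, a point
    lies in cell (k, q) when t[k] <= y < t[k+1] and sigma[q] <= s < sigma[q+1].
    Points outside the outermost thresholds get index -1 (an assumption of
    this sketch) and are not counted.
    """
    y, s = np.asarray(y, dtype=float), np.asarray(s, dtype=float)
    t, sigma = np.asarray(t, dtype=float), np.asarray(sigma, dtype=float)
    K, Q = len(t) - 1, len(sigma) - 1
    # side="right" makes the intervals closed on the left and open on the right
    k_idx = np.searchsorted(t, y, side="right") - 1
    q_idx = np.searchsorted(sigma, s, side="right") - 1
    k_idx[(k_idx < 0) | (k_idx >= K)] = -1
    q_idx[(q_idx < 0) | (q_idx >= Q)] = -1
    counts = np.zeros((K, Q), dtype=int)
    valid = (k_idx >= 0) & (q_idx >= 0)
    np.add.at(counts, (k_idx[valid], q_idx[valid]), 1)
    return k_idx, q_idx, counts

# Toy data: binary labels in {-1, +1} and a binary sensitive feature in {0, 1}
y = [-1, 1, 1, -1, 1]
s = [0, 0, 1, 1, 1]
k_idx, q_idx, counts = cell_counts(y, s, t=[-1.5, 0.0, 1.5], sigma=[-0.5, 0.5, 1.5])
print(counts)  # counts[k, q] is n_{k,q}; here [[1, 1], [1, 2]]
```

Choosing wider or narrower intervals in `t` and `sigma` changes which points are treated as indistinguishable, which is the role the text assigns to the user-prescribed sets.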

We consider a function (or model) f chosen from a set $F$ of possible ones. The functional form of the model may explicitly depend on the sensitive feature (i.e. $f : X \times S \to ℝ$ ) or not (i.e. $f : X \to ℝ$ ) based on specific legal requirements in the application at hand [51, 151]. For this reason we will indicate $f : Z \to ℝ$ where $Z$ may contain the sensitive feature (i.e. $Z = X \times S$ ) or not (i.e. $Z = X$ ). The error (risk) of f is measured by a prescribed loss function $ℓ : ℝ \times Y \to ℝ$ . The risk of a model L (f), together with its empirical counterpart $\hat{L} (f)$ , are defined respectively as $L (f) = E [ℓ (f (z), y)]$ , and $\hat{L} (f) = \frac{1}{n} \sum_{(z, y) \in D} ℓ (f (z), y)$ . When necessary we will indicate with a subscript the particular loss function used and the associated risk, i.e. $L_{p} (f) = E [ℓ_{p} (f (z), y)]$ .

The purpose of a learning procedure is to find a model that minimizes the risk. Since the probability measure μ is usually unknown, the risk cannot be computed. However, we can compute the empirical risk, and a natural learning strategy, called Empirical Risk Minimization (ERM), is to minimize the empirical risk within a prescribed set of functions; see e.g., [175].

3.1.3 ε-loss general fair

In the literature different definitions of fairness of a classifier or real-valued function exist, as described in Section 3.1.1. It is important to stress that there is not yet a consensus about which definition should be employed to evaluate algorithmic fairness. Moreover, most of the current fairness definitions are not able to deal with regression problems (or with continuous sensitive attributes): in those settings they lose their meaning or cannot even be defined. In this work we use a general notion of fairness which is able to deal with both classification and regression and with both categorical and numerical sensitive features, and which generalizes previously known notions of fairness.

Definition 1. A model f is ε-general fair (ε-GF) with ε ∈ [0, 1] if it satisfies the following condition

$\frac{1}{K Q^{2}} \sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | P^{k, p} (f) - P^{k, q} (f) | \leq ε,$ (1)

where, for every 1 ≤ k ≤ K and 1 ≤ q ≤ Q, we have defined the conditional probabilities

$P^{k, q} (f) = ℙ \{f (z) \in [t_{k}, t_{k + 1}) \mid y \in [t_{k}, t_{k + 1}), s \in [σ_{q}, σ_{q + 1})\} .$ (2)

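Definition 2. A function f is ε-loss general fair (ε-LGF) with ε ∈ [0, 1] if it satisfies the following condition

$\frac{1}{K Q^{2}} \sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | L_{k}^{k, p} (f) - L_{k}^{k, q} (f) | \leq ε ,$ (3)

where, for every 1 ≤ k ≤ K and 1 ≤ q ≤ Q, $L_{k}^{k, q} (f) = E [ℓ_{k} (f (z), y) \mid y \in [t_{k}, t_{k + 1}), s \in [σ_{q}, σ_{q + 1})]$ is the risk of f on the cell (k, q), measured by a loss function $ℓ_{k} : ℝ \times Y \to ℝ$ .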

This definition says that a model is fair if its errors, relative to the loss function, are approximately equally distributed independently of the value of the sensitive attribute. Definition 2 includes Definition 1 when we choose $ℓ_{k} (\hat{y}, y) = 1 \{\hat{y} \notin [t_{k}, t_{k + 1})\}$ , for 1 ≤ k ≤ K. Moreover, it is possible to link Definition 2 to other fairness measures used before in the literature.

Remark 1. If we choose $Y = \{- 1, + 1\}$ , $S = \{0, 1\}$ , $Y_{K} = \{- 1.5, 0, + 1.5\}$ , $S_{Q} = \{- 0.5, 0.5, 1.5\}$ , ε = 0, and, for every 1 ≤ k ≤ K, let $ℓ_{k}$ be the 0-1 loss, that is $ℓ_{k} (\hat{y}, y) = 1 \{y \hat{y} \leq 0\}$ , then Definition 2 reduces to the notion of Equalized Odds [49, 76]. On the other hand, in the same setting, if we let, for every k, $ℓ_{k}$ be the linear loss, $ℓ_{k} (\hat{y}, y) = (1 - y \hat{y}) / 2$ , then we recover other notions of fairness introduced in [51]. When ε = 0, $Y \subseteq ℝ$ , $S = \{0, 1\}$ , $Y_{K} = \{- \infty, \infty\}$ , and $S_{Q} = \{- 0.5, 0.5, 1.5\}$ , Definition 2 reduces to the notion of Mean Distance introduced in [24] and also exploited in [113]. Finally, in the same setting, if $S \subseteq ℝ$ , in [113] it is proposed to use the correlation coefficient, which is equivalent to setting $S_{Q} = S$ in Definition 2.

3.1.4 General fair empirical risk minimization

In this section, we aim at minimizing the risk subject to a fairness constraint. Specifically, we consider the problem

$\min_{f \in F} \{L (f) : \sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | L_{k}^{k, p} (f) - L_{k}^{k, q} (f) | \leq ε\},$ (4)

where ε ∈ [0, 1] is the amount of unfairness that we are willing to bear. Since the measure μ is unknown we replace the deterministic quantities with their empirical counterparts. That is, we replace Problem (4) with

$min_{f \in F} {\hat{L} (f) : \sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | {\hat{L}}_{k}^{k, p} (f) - {\hat{L}}_{k}^{k, q} (f) | \leq \hat{ε}},$ (5) where $\hat{ε} \in [0, 1]$ , and, for every k ∈ {1, ⋯ , K} and every q ∈ {1, ⋯ , Q} we define the empirical conditional risks

${\hat{L}}_{k}^{k, q} (f) = \frac{1}{n_{k, q}} \sum_{(z, y) \in D_{k, q}} ℓ_{k} (f (z), y) .$ (6)

We will refer to Problem (5) as G-FERM since it generalizes the FERM approach introduced in [49].
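The left-hand side of the empirical fairness constraint can be evaluated directly from the empirical conditional risks. The sketch below, in Python with NumPy, is one possible way to do so. The function name, the argument layout (cell indices are passed in as 0-based arrays), and the choice of skipping empty cells are assumptions of this sketch: the text does not say how cells with $n_{k, q} = 0$ should be handled.

```python
import numpy as np

def fairness_constraint_value(pred, y, k_idx, q_idx, K, Q, losses):
    """Sum over k, p, q of |Lhat_k^{k,p}(f) - Lhat_k^{k,q}(f)|.

    pred     : model outputs f(z_i)
    y        : observed outputs y_i
    k_idx    : 0-based output-bin index of each point
    q_idx    : 0-based sensitive-bin index of each point
    losses   : list of K vectorised losses, losses[k](pred, y) -> array
    Empty cells are skipped (assumption of this sketch).
    """
    pred, y = np.asarray(pred, dtype=float), np.asarray(y, dtype=float)
    total = 0.0
    for k in range(K):
        risks = []
        for q in range(Q):
            cell = (k_idx == k) & (q_idx == q)
            if cell.any():
                # empirical conditional risk on the cell (k, q)
                risks.append(losses[k](pred[cell], y[cell]).mean())
        risks = np.asarray(risks)
        # ordered pairs (p, q), matching the double sum in the constraint
        total += np.abs(risks[:, None] - risks[None, :]).sum()
    return total

# Toy check: one output bin, two sensitive bins, 0-1 loss
zero_one = lambda p, t: (p * t <= 0).astype(float)
value = fairness_constraint_value(
    pred=[1.0, -1.0, 1.0, 1.0], y=[1.0, 1.0, 1.0, 1.0],
    k_idx=np.array([0, 0, 0, 0]), q_idx=np.array([0, 0, 1, 1]),
    K=1, Q=2, losses=[zero_one],
)
print(value)  # conditional risks are 0.5 and 0.0, so the double sum is 1.0
```

Dividing the returned value by $K Q^{2}$ gives the normalisation used in Definitions 1 and 2.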

3.1.5 Statistical analysis

Let $f^{*}$ be a solution of Problem (4), and let $\hat{f}$ be a solution of Problem (5). In this section we will show that these solutions are linked to one another. In particular, if the parameter $\hat{ε}$ is chosen appropriately, we will show that, in a certain sense, the estimator $\hat{f}$ is consistent. The analysis extends the reasoning in [49] to the more general setting presented here.

For this purpose, we require that for any data distribution, it holds with probability at least 1 - δ with respect to the draw of a dataset that

$sup_{f \in F} | L (f) - \hat{L} (f) | \leq B (δ, n, F),$ (7) where $B (δ, n, F)$ goes to zero as n grows to infinity, that is the class $F$ is learnable with respect to the loss [175]. Moreover $B (δ, n, F)$ is usually an exponential bound, which means that $B (δ, n, F)$ grows logarithmically with respect to the inverse of δ.

Remark 2. If $F$ is a compact subset of linear separators in a reproducing kernel Hilbert space, and the loss is Lipschitz in its first argument, then $B (δ, n, F)$ can be obtained via Rademacher bounds [11]. In this case $B (δ, n, F)$ goes to zero at least as $\sqrt{1 / n}$ as n grows and decreases with δ as $\sqrt{ln (1 / δ)}$ .

We are now ready to state the first result of this section.

Theorem 1. Let $F$ be a learnable set of functions with respect to the loss function $ℓ : ℝ \times Y \to ℝ$ , let $f^{*}$ be a solution of Problem (4) and let $\hat{f}$ be a solution of Problem (5) with

$\hat{ε} = ε + \sum_{k = 1}^{K} \sum_{q, q^{'} = 1}^{Q} \sum_{p \in \{q, q^{'}\}} B (δ, n_{k, p}, F) .$ (8)

With probability at least 1 - δ it holds simultaneously that

$\sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | L_{k}^{k, p} (\hat{f}) - L_{k}^{k, q} (\hat{f}) | \leq ε + 2 \sum_{k = 1}^{K} \sum_{q, q^{'} = 1}^{Q} \sum_{p \in \{q, q^{'}\}} B (\frac{δ}{4 K Q^{2} + 2}, n_{k, p}, F),$ (9)

$L (\hat{f}) - L (f^{*}) \leq 2 B (\frac{δ}{4 K Q^{2} + 2}, n, F) .$ (10)

The proof is reported in [153] and generalizes the one in [49]. It is based on the triangle inequality and the union bound.

A consequence of the risk bound in Theorem 1 is that, as n tends to infinity, $L (\hat{f})$ , namely the error of the estimator, tends to a value which is not larger than $L (f^{*})$ , namely the error of the oracle; that is, G-FERM is consistent with respect to the risk of the selected model. The fairness bound in Theorem 1, instead, implies that as n tends to infinity $\hat{f}$ tends to be ε-fair, as the oracle is. In other words, G-FERM is consistent with respect to the fairness of the selected model.

Remark 3. Since K, Q ≤ n the bound in Theorem 1 behaves as $\sqrt{ln (1 / δ) / n}$ in the same setting of Remark 2, which is optimal [175].

Thanks to Theorem 1 we can state that $f^{*}$ is close to $\hat{f}$ both in terms of its risk and its fairness. Nevertheless, the final goal is to find an $f_{h}^{*}$ which solves the following problem

$min_{f \in F} {L (f) : \sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | P^{k, p} (f) - P^{k, q} (f) | \leq ε} .$ (11)

Note that the quantities in Problem (11) cannot be computed since the underlying data generating distribution is unknown. Moreover, the objective function and the fairness constraint of Problem (11) are non-convex.

Theorem 1 allows us to solve the first issue since we can safely search for a solution ${\hat{f}}_{h}$ of the empirical counterpart of Problem (11), which is given by

$min_{f \in F} {\hat{L} (f) : \sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | {\hat{P}}^{k, p} (f) - {\hat{P}}^{k, q} (f) | \leq \hat{ε}},$ (12) where

${\hat{P}}^{k, q} (f) = \frac{1}{n_{k, q}} \sum_{(z, y) \in D_{k, q}} 1 {f (z) \in [t_{k}, t_{k + 1})} .$ (13)

Unfortunately, Problem (12) is still a difficult non-convex non-smooth problem, and for this reason it is more convenient to solve a convex relaxation. That is, we replace the possibly non-convex loss function in the risk with a convex upper bound $ℓ_{c}$ (e.g., the square loss $ℓ_{c} (f (z), y) = (y - f (z))^{2}$ ) and the losses $ℓ_{k}$ , 1 ≤ k ≤ K, in the constraint with a relaxation (e.g., the linear loss $ℓ_{l} (\hat{y}, y) = \hat{y} - y$ ) which makes the constraint convex. In this way, we look for a solution ${\hat{f}}_{c}$ of the convex G-FERM problem

$min_{f \in F} {{\hat{L}}_{c} (f) : \sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | {\hat{L}}_{l}^{k, p} (f) - {\hat{L}}_{l}^{k, q} (f) | \leq \hat{ε}} .$ (14)

Note that this approximation of the fairness constraint corresponds to matching the first order moment [49].

The questions that arise here are whether ${\hat{f}}_{c}$ is close to ${\hat{f}}_{h}$ , how close, and under which assumptions. The following proposition sheds some light on these issues.

Proposition 1. If $ℓ_{c}$ is a convex upper bound of the loss exploited to compute the risk, then ${\hat{L}}_{h} (f) \leq {\hat{L}}_{c} (f)$ . Moreover, if for $f : X \to ℝ$ and for $ℓ_{l}$

$\sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} \big[ | {\hat{P}}^{k, p} (f) - {\hat{P}}^{k, q} (f) | - | {\hat{L}}_{l}^{k, p} (f) - {\hat{L}}_{l}^{k, q} (f) | \big] \leq \hat{Δ},$ (15)

with $\hat{Δ}$ small, then also the fairness is well approximated.

The first statement of Proposition 1 tells us that the quality of the risk approximation depends on the quality of the convex approximation. The second statement of Proposition 1, instead, tells us that if $\hat{Δ}$ is small then the linear loss based fairness is close to the GF. This condition is quite natural, empirically verifiable, and has been exploited in previous work [49, 135]. Moreover, the authors of [153] present experiments showing that $\hat{Δ}$ is small.

The bound in Proposition 1 may be tightened by using different non-linear approximations of the GF. However, the linear approximation proposed in this work gives a convex problem and, as shown in [153], works well in practice.

In summary, the combination of Theorem 1 and Proposition 1 provides conditions under which a solution ${\hat{f}}_{c}$ of Problem (14), which is convex, is close, both in terms of risk and fairness measure, to a solution $f_{h}^{*}$ of Problem (11), which is the final goal.

3.1.6 G-FERM with kernel methods

In this section, we specialize the G-FERM framework to the case in which the underlying space of models is a reproducing kernel Hilbert space (RKHS) [176, 177].

We let $κ : Z \times Z \to ℝ$ be a positive definite kernel and let $φ : Z \to ℍ$ be an induced feature mapping such that κ (z, z′) = 〈φ (z) , φ (z′) 〉, for all $z, z^{'} \in Z$ , where $ℍ$ is the Hilbert space of square summable sequences. Functions in the RKHS can be parametrized as $f (z) = 〈 w, φ (z) 〉, z \in Z,$ (16) for some vector of parameters $w \in ℍ$ . In practice a bias term (threshold) can be added to f but to ease the presentation we do not include it here.

We propose to solve Problem (14) in the case that $F$ is a ball in the RKHS and employ a convex loss function $ℓ_{c} (y, \hat{y})$ to measure the empirical error. Standard choices are the square loss in the case of regression or the hinge loss in the case of binary classification. They are defined, for every $y, \hat{y} \in ℝ$ , as $(y - \hat{y})^{2}$ and $max (0, 1 - y \hat{y})$ , respectively. As for the fairness constraint we use the linear loss function $ℓ_{l}$ , which makes the constraint convex. Then, we introduce the mean of the feature vectors associated with the training points restricted by the discretization of the sensitive feature and real outputs, namely

$u_{k, q} = \frac{1}{n_{k, q}} \sum_{(z, y) \in D_{k, q}} φ (z) .$ (17)

Using Equation (17) the constraint in Problem (14) becomes

$\sum_{k = 1}^{K} \sum_{p, q = 1}^{Q} | 〈 w, u_{k, p} - u_{k, q} 〉 | \leq \hat{ε},$ (18)

which can be written with more compact notation as $∥ A^{⊤} w ∥_{1} \leq \hat{ε}$ , where A is the matrix having as columns the vectors $u_{k, p} - u_{k, q}$ . With this notation, the fairness constraint can be interpreted as the composition of the $\hat{ε}$ -ball of the ℓ₁ norm with the linear transformation A.

In practice, we solve the following Tikhonov regularization problem

$\min_{w \in ℍ} \sum_{(z, y) \in D} ℓ_{c} (y, 〈 w, φ (z) 〉) + λ ∥ w ∥^{2} \quad \text{s.t.} \quad ∥ A^{⊤} w ∥_{1} \leq \hat{ε},$ (19)

where λ is a positive parameter. Note that, if $\hat{ε} = 0$ , the constraint reduces to the linear constraint $A^{⊤} w = 0$ .

Problem (19) can be kernelized by observing that, thanks to the Representer Theorem [176], the solution can be written as

$w = \sum_{(z_{i}, y_{i}) \in D} α_{i} φ (z_{i}),$ (20)

for some vector of coefficients $α \in ℝ^{n}$ .

The dual of Problem (19) may be derived using Fenchel duality, see e.g., [19, Theorem 3.3.5].

Finally, we note that when φ is the identity mapping (i.e., κ is the linear kernel on $ℝ^{d}$ ) and $\hat{ε} = 0$ , the fairness constraint of Problem (19) can be implicitly enforced by making a change of representation [49].
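For the special case just mentioned (linear kernel, $\hat{ε} = 0$ ) with the square loss, the constrained problem admits a closed-form solution: the constraint $A^{⊤} w = 0$ restricts w to the null space of $A^{⊤}$ , and ridge regression can be solved in an orthonormal basis of that subspace. The sketch below is a minimal illustration in Python with NumPy under these assumptions. The function name `fair_ridge` and the SVD-based null-space construction are choices made here and are not the implementation of [49] or [153].

```python
import numpy as np

def fair_ridge(X, y, A, lam=1.0, tol=1e-10):
    """Minimise ||X w - y||^2 + lam * ||w||^2 subject to A^T w = 0.

    X : (n, d) data matrix (linear kernel, so phi is the identity)
    A : (d, m) matrix whose columns are the differences u_{k,p} - u_{k,q}
    """
    # Left singular vectors with zero singular value span the null space of A^T
    U, S, _ = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(S > tol * max(S.max(), 1.0)))
    N = U[:, rank:]                      # (d, d - rank), orthonormal columns
    XN = X @ N                           # data expressed in the fair subspace
    v = np.linalg.solve(XN.T @ XN + lam * np.eye(N.shape[1]), XN.T @ y)
    return N @ v                         # back to the original coordinates

# Toy usage with two groups and a single constraint column
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)
group = np.arange(50) % 2
A = (X[group == 0].mean(axis=0) - X[group == 1].mean(axis=0))[:, None]
w = fair_ridge(X, y, A, lam=0.5)
print(np.abs(A.T @ w).max())  # constraint violation, zero up to rounding
```

Working in the basis `N` is itself a change of representation: the model is fitted on `X @ N`, which has fewer features than `X`, and no explicit constraint is needed.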

3.2 Pre-processing methods

In this section we will show that the method presented in Section 3.1 can be translated into a pre-processing method. To simplify, in this section, we will illustrate just the case of binary classification with a binary valued sensitive feature but the result can be easily generalized [49, 153].

3.2.1 Fair empirical risk minimization

We begin by introducing the notation. We let $D = {(x_{1}, s_{1}, y_{1}), \dots, (x_{n}, s_{n}, y_{n})}$ be a sequence of n samples drawn independently from an unknown probability distribution μ over $X \times S \times Y$ , where $Y = {- 1, + 1}$ is the set of binary output labels, $S = {a, b}$ represents group membership among two groups (e.g., ‘female’ or ‘male’), and $X$ is the input space. We note that the input $x \in X$ may further contain or not the sensitive feature $s \in S$ in it². We also denote by $D^{+, g} = {(x_{i}, s_{i}, y_{i}) : y_{i} = 1, s_{i} = g}$ for g ∈ {a, b} and $n^{+, g} = | D^{+, g} |$ . Let us consider a function (or model) $f : X \to ℝ$ chosen from a set $F$ of possible models. The error (risk) of f in approximating μ is measured by a prescribed loss function $ℓ : ℝ \times Y \to ℝ$ . The risk of f is defined as $L (f) = E [ℓ (f (x), y)]$ . When necessary we will indicate with a subscript the particular loss function used, i.e., $L_{p} (f) = E [ℓ_{p} (f (x), y)]$ .

The purpose of a learning procedure is to find a model that minimizes the risk. Since the probability measure μ is usually unknown, the risk cannot be computed. However, we can compute the empirical risk $\hat{L} (f) = \hat{E} [ℓ (f (x), y)]$ , where $\hat{E}$ denotes the empirical expectation. A natural learning strategy, called Empirical Risk Minimization (ERM), is then to minimize the empirical risk within a prescribed set of functions.

Fairness Definitions

Let us introduce a slightly less general notion of fairness than the one defined in Section 3.1, which still encompasses some previously used notions and allows us to introduce new ones by specifying the loss function used below.

Definition 3. Let $L^{+, g} (f) = E [ℓ (f (x), y) | y = 1, s = g]$ be the risk of the positive labeled samples in the g-th group, and let ε ∈ [0, 1]. We say that a function f is ε-fair if $| L^{+, a} (f) - L^{+, b} (f) | \leq ε$ .

This definition says that a model is fair if it commits approximately the same error on the positive class independently of the group membership. That is, the conditional risk $L^{+, g}$ is approximately constant across the two groups. Note that if ε = 0 and we use the hard loss function, $ℓ_{h} (f (x), y) = 1 \{y f (x) \leq 0\}$ , then Definition 3 is equivalent to the definition of Equal Opportunity (EO) proposed in [76], namely:

$ℙ {f (x) > 0 | y = 1, s = a} = ℙ {f (x) > 0 | y = 1, s = b} .$ (21)

This equation means that the true positive rate is the same across the two groups. Furthermore, if we use the linear loss function $ℓ_{l} (f (x), y) = (1 - y f (x)) / 2$ and set ε = 0, then Definition 3 gives

$E [f (x) | y = 1, s = a] = E [f (x) | y = 1, s = b] .$ (22)

By reformulating this expression we obtain a notion of fairness introduced in [51]

$\sum_{g \in {a, b}} | E [f (x) | y = 1, s = g] - E [f (x) | y = 1] | = 0 .$

Yet another implication of Equation (22) is that the output of the model is uncorrelated with the group membership conditioned on the label being positive [48], that is, for every g ∈ {a, b}, we have

$E [f (x) 1_{{s = g}} | y = 1] = E [f (x) | y = 1] E [1_{{s = g}} | y = 1] .$

Finally, we observe that the approach naturally generalizes to other fairness measures that are based on conditional probabilities, e.g., Equalized Odds [76]. Specifically, we would require in Definition 3 that $| L^{y, a} (f) - L^{y, b} (f) | \leq ε$ for both y ∈ {-1, 1}.
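To make Definition 3 concrete, the following Python sketch estimates the conditional risks on the positive class of each group and returns their absolute difference, for both the hard loss and the linear loss. The function and variable names are hypothetical and the toy scores are made up purely for illustration.

```python
import numpy as np

def positive_class_gap(scores, y, s, loss, groups=("a", "b")):
    """Empirical |L^{+,a}(f) - L^{+,b}(f)| for a given vectorised loss."""
    scores, y, s = np.asarray(scores, dtype=float), np.asarray(y), np.asarray(s)
    risks = []
    for g in groups:
        cell = (y == 1) & (s == g)          # positively labelled points of group g
        if not cell.any():
            raise ValueError(f"group {g!r} has no positive examples")
        risks.append(loss(scores[cell], y[cell]).mean())
    return abs(risks[0] - risks[1])

hard_loss = lambda f, y: (y * f <= 0).astype(float)   # 1{y f(x) <= 0}
linear_loss = lambda f, y: (1 - y * f) / 2            # (1 - y f(x)) / 2

# Toy data: model outputs f(x), labels in {-1, +1}, group membership in {a, b}
scores = [2.0, -1.0, 0.5, 0.5, -3.0]
y = [1, 1, 1, 1, -1]
s = ["a", "a", "b", "b", "a"]

gap_hard = positive_class_gap(scores, y, s, hard_loss)      # 0.5
gap_linear = positive_class_gap(scores, y, s, linear_loss)  # 0.0
print(gap_hard, gap_linear)
```

In this toy example the linear-loss gap is zero while the hard-loss gap is not: the mean output on the positive class is the same in both groups, yet the true positive rates differ. Matching the conditional means therefore does not by itself guarantee Equal Opportunity.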

Fair Empirical Risk Minimization

In this section, we aim at minimizing the risk subject to a fairness constraint. Specifically, we consider the problem

$min {L_{h} (f) : f \in F, | L_{h}^{+, a} (f) - L_{h}^{+, b} (f) | \leq ε},$ (23) where ε ∈ [0, 1] is the amount of unfairness that we are willing to bear. Basically we want to minimize the misclassification error keeping the EO small.

Since the measure μ is unknown, we replace the deterministic quantities with their empirical counterparts and, as in Section 3.1, relax the problem to a convex one. Specifically, we replace the hard loss in the risk with a convex loss function $ℓ_{c}$ (e.g., the hinge loss $ℓ_{c} = max \{0, ℓ_{l}\}$ ) and the hard loss in the constraint with the linear loss $ℓ_{l}$ . In this way, we look for a solution ${\hat{f}}_{c}$ of the convex FERM problem

$min {{\hat{L}}_{c} (f) : f \in F, | {\hat{L}}_{l}^{+, a} (f) - {\hat{L}}_{l}^{+, b} (f) | \leq \hat{ε}} .$ (24)

Note that this approximation of the EO constraint corresponds to matching the first order moment. Other works try to match the second order moment [189] or potentially infinitely many moments [164], but these approaches result in non-convex problems.

Note that the solutions of Problems (23) and (24) are close under the same hypotheses described in Section 3.1.

3.2.2 Fair learning with kernels

In this section, we specialize the FERM framework to the case in which the underlying space of models is a reproducing kernel Hilbert space (RKHS), see e.g., [176, 177] and references therein. We let $κ : X \times X \to ℝ$ be a positive definite kernel and let $φ : X \to ℍ$ be an induced feature mapping such that $κ (x, x^{'}) = 〈 φ (x), φ (x^{'}) 〉$ , for all $x, x^{'} \in X$ , where $ℍ$ is the Hilbert space of square summable sequences. Functions in the RKHS can be parametrized as $f (x) = 〈 w, φ (x) 〉, x \in X,$ (25) for some vector of parameters $w \in ℍ$ . In practice a bias term (threshold) can be added to f but to ease the presentation we do not include it here.

We solve Problem (24) with $F$ a ball in the RKHS and employ a convex loss function ℓ. As for the fairness constraint we use the linear loss function, which makes the constraint convex. Let $u_{g}$ be the barycenter in the feature space of the positively labelled points in the group g ∈ {a, b}, that is

$u_{g} = \frac{1}{n^{+, g}} \sum_{i \in I^{+, g}} φ (x_{i}),$ (26)

where $I^{+, g} = \{i : y_{i} = 1, s_{i} = g\}$ . Then, using Equation (26), the constraint in Problem (24) takes the form $| 〈 w, u_{a} - u_{b} 〉 | \leq ε$ .

In practice, we solve the Tikhonov regularization problem

$\min_{w \in ℍ} \sum_{i = 1}^{n} ℓ (〈 w, φ (x_{i}) 〉, y_{i}) + λ ∥ w ∥^{2} \quad \text{s.t.} \quad | 〈 w, u 〉 | \leq ε,$ (27)

where $u = u_{a} - u_{b}$ and λ is a positive parameter which controls model complexity. In particular, if ε = 0 the constraint in Problem (27) reduces to an orthogonality constraint that has a simple geometric interpretation. Specifically, the vector w is required to be orthogonal to the vector formed by the difference between the barycenters of the positively labelled input samples in the two groups.

By the representer theorem [173], the solution to Problem (27) is a linear combination of the feature vectors $φ (x_{1}), \dots, φ (x_{n})$ and the vector u. However, in this case u is itself a linear combination of the feature vectors (in fact only those corresponding to the subset of positively labelled points), hence w is a linear combination of the feature vectors of the input points, that is $w = \sum_{i = 1}^{n} α_{i} φ (x_{i})$ . The corresponding function used to make predictions is then given by $f (x) = \sum_{i = 1}^{n} α_{i} κ (x_{i}, x)$ . Let K be the Gram matrix. The vector of coefficients α can then be found by solving

$\min_{α \in ℝ^{n}} \sum_{i = 1}^{n} ℓ (\sum_{j = 1}^{n} K_{ij} α_{j}, y_{i}) + λ \sum_{i, j = 1}^{n} α_{i} α_{j} K_{ij} \quad \text{s.t.} \quad | \sum_{i = 1}^{n} α_{i} [\frac{1}{n^{+, a}} \sum_{j \in I^{+, a}} K_{ij} - \frac{1}{n^{+, b}} \sum_{j \in I^{+, b}} K_{ij}] | \leq ε .$ (28)
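As an illustration, the kernelized problem can be solved in closed form in one special case: the square loss with ε = 0. Writing the constraint as $c^{⊤} α = 0$ with $c = K e$ , where e has entries $1 / n^{+, a}$ on $I^{+, a}$ , $- 1 / n^{+, b}$ on $I^{+, b}$ and zero elsewhere, the stationarity condition of the Lagrangian is satisfied by $α = (K + λ I)^{- 1} (y - μ e)$ , with the multiplier μ chosen so that $c^{⊤} α = 0$ . This derivation and the Python sketch below are given here for illustration only; they are not the solver used in [49], and the function names and the Gaussian kernel are assumptions.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian kernel matrix with entries exp(-gamma * ||x - z||^2)."""
    sq_dist = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

def fair_kernel_ridge(K, y, s, lam=1.0, groups=("a", "b")):
    """Square-loss kernel problem with the fairness constraint at epsilon = 0."""
    y, s = np.asarray(y, dtype=float), np.asarray(s)
    n = len(y)
    pos_a = (y == 1) & (s == groups[0])
    pos_b = (y == 1) & (s == groups[1])
    e = np.zeros(n)
    e[pos_a] = 1.0 / pos_a.sum()
    e[pos_b] = -1.0 / pos_b.sum()
    c = K @ e                               # constraint vector, c_i as in the text
    M = K + lam * np.eye(n)
    alpha_y = np.linalg.solve(M, y)         # unconstrained kernel ridge solution
    alpha_e = np.linalg.solve(M, e)
    mu = (c @ alpha_y) / (c @ alpha_e)      # multiplier enforcing c @ alpha = 0
    return alpha_y - mu * alpha_e

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
idx = np.arange(30)
y = np.where(idx % 3 == 0, 1.0, -1.0)
s = np.where(idx % 2 == 0, "a", "b")
K = rbf_kernel(X, X)
alpha = fair_kernel_ridge(K, y, s, lam=0.1)
f_train = K @ alpha                          # predictions f(x_i) on the training set
pos_a, pos_b = (y == 1) & (s == "a"), (y == 1) & (s == "b")
print(abs(f_train[pos_a].mean() - f_train[pos_b].mean()))  # zero up to rounding
```

For ε > 0 or for a non-quadratic loss such as the hinge loss, no such closed form is available in general and a convex solver would be needed.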

Let us consider Problem (27) when φ is the identity mapping (i.e., κ is the linear kernel on $ℝ^{d}$ ) and ε = 0. In this special case we can solve the orthogonality constraint $〈 w, u 〉 = 0$ for $w_{i}$ , where the index i is such that $| u_{i} | = ∥ u ∥_{\infty}$ , obtaining $w_{i} = - \sum_{j = 1, j \neq i}^{d} w_{j} \frac{u_{j}}{u_{i}}$ . Consequently the linear model can be rewritten as $\sum_{j = 1}^{d} w_{j} x_{j} = \sum_{j = 1, j \neq i}^{d} w_{j} (x_{j} - x_{i} \frac{u_{j}}{u_{i}})$ . In this way, we see that the fairness constraint is implicitly enforced by making the change of representation $x \mapsto \tilde{x} \in ℝ^{d - 1}$ , with

${\tilde{x}}_{j} = x_{j} - x_{i} \frac{u_{j}}{u_{i}}, \quad j \in \{1, \dots, i - 1, i + 1, \dots, d\} .$ (29)

In other words, we are able to obtain a fair linear model without any other constraint and by using a representation that has one feature fewer than the original one.
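The change of representation described above is straightforward to implement. The following Python sketch with NumPy computes $\tilde{x}$ for every sample and maps a weight vector learned on the new representation back to the original coordinates; the function names `fair_representation` and `lift_weights` are hypothetical, and the sketch assumes that u is not the zero vector.

```python
import numpy as np

def fair_representation(X, y, s, groups=("a", "b")):
    """Map each x in R^d to x_tilde in R^{d-1} as in the change of representation."""
    y, s = np.asarray(y), np.asarray(s)
    pos_a = (y == 1) & (s == groups[0])
    pos_b = (y == 1) & (s == groups[1])
    u = X[pos_a].mean(axis=0) - X[pos_b].mean(axis=0)   # u = u_a - u_b
    i = int(np.argmax(np.abs(u)))                       # |u_i| = ||u||_inf
    keep = np.delete(np.arange(X.shape[1]), i)
    X_tilde = X[:, keep] - np.outer(X[:, i], u[keep] / u[i])
    return X_tilde, u, i

def lift_weights(w_tilde, u, i):
    """Recover w in R^d with <w, u> = 0 from weights learned on x_tilde."""
    keep = np.delete(np.arange(len(u)), i)
    w = np.empty(len(u))
    w[keep] = w_tilde
    w[i] = -(w_tilde @ u[keep]) / u[i]
    return w

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
idx = np.arange(40)
y = np.where(idx % 3 == 0, 1, -1)
s = np.where(idx % 2 == 0, "a", "b")
X_tilde, u, i = fair_representation(X, y, s)
w_tilde = rng.normal(size=4)            # any linear model on the new features
w = lift_weights(w_tilde, u, i)
print(abs(w @ u))                       # orthogonality holds up to rounding
```

Any off-the-shelf linear learner can then be trained on `X_tilde`: whatever weights it returns, the corresponding model in the original space satisfies the orthogonality constraint by construction.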

We can extend the approach to the non-linear case by defining a fair kernel matrix instead of a fair data mapping [49, 153].

3.3 Post-processing methods

In this section, we study the problem addressed in [32] of fair binary classification using the notion of Equal Opportunity, which requires the true positive rate to be equal across the sensitive groups. Within this setting, it is possible to show that the fair optimal classifier is obtained by recalibrating the Bayes classifier by a group-dependent threshold, for which a constructive expression is provided. This result motivates a plug-in classification procedure based on both unlabeled and labeled datasets. While the latter is used to learn the output conditional probability, the former is used for calibration. The overall procedure can be computed in polynomial time and it is shown to be statistically consistent both in terms of the classification error and fairness measure.

More precisely, [32] addresses the important problem of devising statistically consistent and computationally efficient learning procedures that meet the fairness constraint, and makes the following contributions. First, the authors derive in Proposition 2 the expression for the optimal equal opportunity classifier, obtained via thresholding of the Bayes regressor. Second, inspired by this result, they propose a semi-supervised plug-in type method, which first estimates the regression function on labeled data and then estimates the unknown threshold using unlabeled data. Third, they establish in Theorem 2 that the proposed procedure is consistent, that is, it asymptotically satisfies the equal opportunity constraint and its risk converges to the risk of the optimal equal opportunity classifier. Finally, the authors of [32] present numerical experiments which indicate that the method is often superior to or competitive with the state of the art on benchmark datasets.

We highlight that the proposed learning algorithm can be applied on top of any off-the-shelf method which consistently estimates the regression function (class conditional probability), under mild additional assumptions discussed in [32]. Furthermore, the calibration procedure is based on solving a simple univariate problem. Hence generality, statistical consistency and computational efficiency are strengths of the approach.

3.3.1 Related work

To the best of our knowledge the formula for the optimal fair classifier presented in [32] is novel. In [76] the authors note that the optimal equalized odds or equal opportunity classifier can be derived from the Bayes optimal regressor, however, no explicit expression for this threshold is provided. The idea of recalibrating the Bayes classifier is also discussed in a number of papers, see for example [139, 163] and references therein.

Plug-in methods in classification problems are well established and well studied from a statistical perspective, see [10, 193] and references therein; in particular, it is known that one can build a plug-in type classifier which is optimal in the minimax sense [10, 193]. Until very recently, theoretical studies of such methods reduced to the efficient estimation of the regression function. Indeed, in standard classification settings the threshold is always known beforehand; thus, all the information about the optimal classifier is contained in the distribution of the label conditionally on the feature.

More recently, classification problems with a distribution dependent threshold have emerged. Prominent examples include classification with non-decomposable measures [116, 209], classification with reject option [43, 120], and the confidence set setup of multi-class classification [31, 172], among others. A typical estimation algorithm in these scenarios is based on the plug-in strategy, which uses extra data to estimate the unknown threshold. Interestingly, in some setups a practitioner does not need to have access to two labeled samples and optimal estimation can be efficiently performed in a semi-supervised manner [31, 44].

3.3.2 Optimal equal opportunity classifier

Let (X, S, Y) be a tuple on $ℝ^{d} \times \{0, 1\} \times \{0, 1\}$ having a joint distribution $ℙ$ . Here the vector $X \in ℝ^{d}$ is seen as the vector of features, S ∈ {0, 1} is a binary sensitive variable and Y ∈ {0, 1} is a binary output label that we wish to predict from the pair (X, S). We also assume that the distribution is non-degenerate in Y and S, that is, $ℙ (S = 1) \in (0, 1)$ and $ℙ (Y = 1) \in (0, 1)$ . A classifier g is a measurable function from $ℝ^{d} \times \{0, 1\}$ to {0, 1}, and the set of all such functions is denoted by $G$ . In words, each classifier receives a pair $(x, s) \in ℝ^{d} \times \{0, 1\}$ and outputs a binary prediction g (x, s) ∈ {0, 1}. For any classifier g we introduce its associated misclassification risk as

$R (g) : = ℙ (g (X, S) \neq Y) .$ (30)

A fair optimal classifier is formally defined as

$g^{*} \in \underset{g \in G}{\arg \min} \{R (g) : g \text{ is fair}\} .$

There are various definitions of fairness available in the literature, each having its critics and its supporters. In this work, we employ the following definition introduced in [76]. We refer the reader to this work as well as [2, 139] for a discussion, motivation of this definition, and a comparison to other fairness definitions.

Definition 4. [Equal Opportunity [76]] A classifier (x, s) ↦ g (x, s) ∈ {0, 1} is called fair if $\begin{matrix} g (X, S) = 1 S = 1, Y = 1 & = g (X, S) = 1 S = 0, Y = 1 . \end{matrix}$

The set of all fair classifiers is denoted by $F (ℙ)$ .

Note that the definition of fairness depends on the underlying distribution $ℙ$, and hence the whole class $F (ℙ)$ of fair classifiers has to be estimated. Further, notice that the class $F (ℙ)$ is non-empty, as it always contains the constant classifier g (x, s) ≡ 0.

Using this notion of fairness we define an optimal equal opportunity classifier as a solution of the optimization problem

$\min_{g \in G} \{R (g) : ℙ (g (X, S) = 1 \mid Y = 1, S = 1) = ℙ (g (X, S) = 1 \mid Y = 1, S = 0)\} .$ (31)

We now introduce an assumption on the regression function that plays an important role in establishing the form of the optimal fair classifier.

Assumption 1. For each s ∈ {0, 1} we require the mapping $t \mapsto ℙ (η (X, S) \leq t \mid S = s)$ to be continuous on (0, 1), where for all $(x, s) \in ℝ^{d} \times \{0, 1\}$ we let the regression function be $η (x, s) := ℙ (Y = 1 \mid X = x, S = s) = E [Y \mid X = x, S = s] .$

Moreover, for every s ∈ {0, 1}, we assume that $ℙ (η (X, s) \geq 1/2 \mid S = s) > 0$.

The first part of Assumption 1 is satisfied by many distributions and has been introduced in various contexts, see e.g., [31, 191] and references therein. It says that, for every s ∈ {0, 1}, the random variable η (X, s) does not have atoms, that is, the event {η (X, s) = t} has probability zero for every t ∈ (0, 1). The second part of the assumption states that the regression function η (X, s) must reach the level 1/2 on a set of non-zero measure. Informally, returning to the scholarship example mentioned in the introduction, this assumption means that there are individuals from both groups who are more likely to be offered a scholarship based on their curriculum.

In the following result we establish that the optimal equal opportunity classifier is obtained by recalibrating the Bayes classifier.

Proposition 2. [Optimal Rule] Under Assumption 1 an optimal classifier g^* can be obtained for all $(x, s) \in ℝ^{d} \times \{0, 1\}$ as $\begin{matrix} g^{*} (x, 1) = \mathbb{1}\{1 \leq η (x, 1) (2 - \frac{θ^{*}}{ℙ (Y = 1, S = 1)})\}, \\ g^{*} (x, 0) = \mathbb{1}\{1 \leq η (x, 0) (2 + \frac{θ^{*}}{ℙ (Y = 1, S = 0)})\}, \end{matrix}$ where $θ^{*} \in ℝ$ is determined from the equation $\begin{matrix} \frac{E_{X | S = 1} [η (X, 1) \mathbb{1}\{1 \leq η (X, 1) (2 - \frac{θ^{*}}{ℙ (Y = 1, S = 1)})\}]}{ℙ (Y = 1 \mid S = 1)} \\ = \frac{E_{X | S = 0} [η (X, 0) \mathbb{1}\{1 \leq η (X, 0) (2 + \frac{θ^{*}}{ℙ (Y = 1, S = 0)})\}]}{ℙ (Y = 1 \mid S = 0)} . \end{matrix}$

Furthermore it holds that $| θ^{*} | \leq 2$.

The proof is reported in [32].
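To make the recalibration in Proposition 2 concrete, the following minimal Python sketch evaluates the rule at a single point and reports the threshold on η that it implies for each group. The function names are ours, and the sketch assumes that θ and the probabilities ℙ(Y = 1, S = s) are given; in practice these quantities are unknown and have to be estimated.

```python
def optimal_rule(eta, s, theta, p1, p0):
    """Evaluate the recalibrated rule of Proposition 2 at one point.

    eta    : value of the regression function eta(x, s), in [0, 1]
    s      : sensitive attribute, 0 or 1
    theta  : recalibration parameter (theta* in the text, assumed known here)
    p1, p0 : P(Y=1, S=1) and P(Y=1, S=0), assumed known here
    """
    factor = 2.0 - theta / p1 if s == 1 else 2.0 + theta / p0
    return 1 if eta * factor >= 1.0 else 0


def group_threshold(s, theta, p1, p0):
    """Threshold on eta implied by the rule for group s.

    When the factor multiplying eta is positive, the rule predicts 1 exactly
    when eta >= 1 / factor. Otherwise the rule never predicts 1 and the
    threshold is reported as infinity.
    """
    factor = 2.0 - theta / p1 if s == 1 else 2.0 + theta / p0
    return 1.0 / factor if factor > 0 else float("inf")
```

With θ = 0 both thresholds equal 1/2 and the rule coincides with the Bayes classifier; a positive θ raises the threshold for the group S = 1 and lowers it for the group S = 0.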

Before proceeding further, let us define a notion of unfairness, which plays a key role in the statistical analysis; it is sometimes referred to as difference of equal opportunity (DEO) in the literature [49].

Definition 5. [Unfairness] For any classifier g we define its unfairness as $Δ (g, ℙ) = | ℙ (g (X, S) = 1 \mid S = 1, Y = 1) - ℙ (g (X, S) = 1 \mid S = 0, Y = 1) | .$
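As an illustration of Definition 5, the sketch below computes a sample version of the unfairness on a labeled evaluation set by replacing the two conditional probabilities with empirical frequencies. The function names and the list-based interface are assumptions made for this example only.

```python
def true_positive_rate(preds, labels, groups, s):
    """Fraction of examples with label 1 in group s that are predicted 1."""
    hits = [p for p, y, g in zip(preds, labels, groups) if y == 1 and g == s]
    if not hits:
        raise ValueError(f"no positive examples in group {s}")
    return sum(hits) / len(hits)


def deo(preds, labels, groups):
    """Sample difference of equal opportunity between groups 1 and 0."""
    return abs(true_positive_rate(preds, labels, groups, 1)
               - true_positive_rate(preds, labels, groups, 0))
```

For instance, if every positive example of group 1 is accepted but only half of the positive examples of group 0 are, the function returns 0.5.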

A principal goal of this paper is to construct a classification algorithm $\hat{g}$ which satisfies $\underbrace{E [Δ (\hat{g}, ℙ)] \to 0}_{\text{asymptotically fair}} \quad \text{and} \quad \underbrace{E [R (\hat{g})] \to R (g^{*})}_{\text{asymptotically optimal}},$ where the expectations are taken with respect to the distribution of the data samples. As we shall see, the estimator is built from independent sets of labeled and unlabeled samples. Hence the convergence above is meant to hold as both samples grow to infinity.

3.3.3 Proposed procedure

In this section, we present the proposed plug-in algorithm and begin to study its theoretical properties.

We assume that we have at our disposal two datasets, labeled $D_{n}$ and unlabeled $D_{N}$, defined as $D_{n} = \{(X_{i}, S_{i}, Y_{i})\}_{i = 1}^{n} \overset{\text{i.i.d.}}{\sim} ℙ$ and $D_{N} = \{(X_{i}, S_{i})\}_{i = n + 1}^{n + N} \overset{\text{i.i.d.}}{\sim} ℙ_{(X, S)},$ where $ℙ_{(X, S)}$ is the marginal distribution of the vector (X, S). We additionally assume that the estimator $\hat{η}$ of the regression function is constructed based on $D_{n}$, independently of $D_{N}$. Let us denote by ${\hat{E}}_{X | S = 1}, {\hat{E}}_{X | S = 0}$ the expectations taken w.r.t. the empirical distributions induced by $D_{N}$, that is, ${\hat{ℙ}}_{X | S = s} = \frac{1}{| \{(X, S) \in D_{N} : S = s\} |} \sum_{\{(X, S) \in D_{N} : S = s\}} δ_{X},$ for all s ∈ {0, 1}, and by ${\hat{E}}_{S}$ the expectation taken w.r.t. the empirical measure of S, that is, ${\hat{ℙ}}_{S} = \frac{1}{N} \sum_{(X, S) \in D_{N}} δ_{S}$.

Remark 4. In theory, the empirical distributions might not be well defined, since they are only valid if the unlabeled dataset $D_{N}$ is composed of features from both groups. We show how to bypass this problem theoretically in the supplementary material. Nevertheless, this remark has little to no impact in practice, and in most situations these quantities are well defined.

Based on the estimator $\hat{η}$ and the unlabeled sample $D_{N}$, let us introduce the following estimators for each s ∈ {0, 1}: $\hat{ℙ} (Y = 1, S = s) := {\hat{E}}_{X | S = s} [\hat{η} (X, s)] \, {\hat{ℙ}}_{S} (S = s) .$

Using the above estimators, a straightforward procedure to mimic the optimal classifier g^* provided by Proposition 2 is to employ a plug-in rule $\hat{g}$, obtained by replacing all the unknown quantities by either their empirical versions or their estimates. Specifically, we define $\hat{g}$ at $(x, s) \in ℝ^{d} \times \{0, 1\}$ as $\begin{matrix} \hat{g} (x, 1) = \mathbb{1}\{1 \leq \hat{η} (x, 1) (2 - \frac{\hat{θ}}{\hat{ℙ} (Y = 1, S = 1)})\}, \\ \hat{g} (x, 0) = \mathbb{1}\{1 \leq \hat{η} (x, 0) (2 + \frac{\hat{θ}}{\hat{ℙ} (Y = 1, S = 0)})\} . \end{matrix}$

It remains to define the value of $\hat{θ}$. Clearly, it is desirable to mimic the condition that is satisfied by θ^* in Proposition 2. To this end, we make use of the unlabeled data $D_{N}$ and of the estimator $\hat{η}$ previously built from the labeled dataset $D_{n}$. Consequently, we define a data-driven version of the unfairness $Δ (g, ℙ)$, which allows us to construct an approximation $\hat{θ}$ of the true value θ^*.

Definition 6. (Empirical unfairness) For any classifier g, an estimator $\hat{η}$ based on $D_{n}$, and unlabeled sample $D_{N}$, the empirical unfairness is defined as $\hat{Δ} (g, ℙ) = | \frac{{\hat{E}}_{X | S = 1} [\hat{η} (X, 1) g (X, 1)]}{{\hat{E}}_{X | S = 1} [\hat{η} (X, 1)]} - \frac{{\hat{E}}_{X | S = 0} [\hat{η} (X, 0) g (X, 0)]}{{\hat{E}}_{X | S = 0} [\hat{η} (X, 0)]} | .$

Notice that the empirical unfairness $\hat{Δ} (g, ℙ)$ is data-driven, that is, it does not involve unknown quantities. One might wonder why it is an empirical version of the quantity $Δ (g, ℙ)$ in Definition 5 and what the reason to introduce it is. The definition explains itself when we rewrite the population unfairness $Δ (g, ℙ)$ using the identity (see footnote 3) $\begin{matrix} ℙ (g (X, S) = 1 \mid S = s, Y = 1) & = \frac{ℙ (g (X, S) = 1, Y = 1 \mid S = s)}{ℙ (Y = 1 \mid S = s)} \\ & = \frac{E_{X | S = s} [η (X, s) g (X, s)]}{E_{X | S = s} [η (X, s)]} . \end{matrix}$

Using the above expression we can rewrite $Δ (g, ℙ) = | \frac{E_{X | S = 1} [η (X, 1) g (X, 1)]}{E_{X | S = 1} [η (X, 1)]} - \frac{E_{X | S = 0} [η (X, 0) g (X, 0)]}{E_{X | S = 0} [η (X, 0)]} | .$

Hence, the passage from the population unfairness to its empirical version in Definition 6 formally reduces to putting “hats” on all the unknown quantities.

Using Definition 6, a logical estimator $\hat{θ}$ of θ^* can be obtained as $\hat{θ} \in \underset{θ \in [- 2, 2]}{\arg \min} \hat{Δ} ({\hat{g}}_{θ}, ℙ),$ where, for all θ ∈ [-2, 2], ${\hat{g}}_{θ}$ is defined at $(x, s) \in ℝ^{d} \times \{0, 1\}$ as $\begin{matrix} {\hat{g}}_{θ} (x, 1) = \mathbb{1}\{1 \leq \hat{η} (x, 1) (2 - \frac{θ}{\hat{ℙ} (Y = 1, S = 1)})\}, \\ {\hat{g}}_{θ} (x, 0) = \mathbb{1}\{1 \leq \hat{η} (x, 0) (2 + \frac{θ}{\hat{ℙ} (Y = 1, S = 0)})\} . \end{matrix}$

In this case, the algorithm $\hat{g}$ that we propose is such that $\hat{g} \equiv {\hat{g}}_{\hat{θ}}$. It is crucial to mention that, since the quantity $\hat{Δ} ({\hat{g}}_{θ}, ℙ)$ is empirical, there might be no θ which delivers zero empirical unfairness. This is exactly the reason why we perform a minimization of this quantity.
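The second step of the procedure can be sketched in a few lines of Python. The sketch below is only illustrative: it assumes that an estimator `eta_hat(x, s)` has already been fitted on the labeled sample, it takes the unlabeled sample as a list of pairs (x, s), and it replaces the exact minimization over [-2, 2] with a search over a uniform grid, which is our simplification and not part of the analysis. All function names are hypothetical.

```python
def group_scores(eta_hat, unlabeled):
    """Evaluate eta_hat on the unlabeled pairs (x, s), grouped by s."""
    scores = {0: [], 1: []}
    for x, s in unlabeled:
        scores[s].append(eta_hat(x, s))
    if any(sum(v) == 0 for v in scores.values()):
        raise ValueError("each group needs unlabeled points with eta_hat > 0")
    return scores


def joint_estimates(scores):
    """Estimate P(Y=1, S=s): the group mean of eta_hat times the group
    frequency, which equals the group sum divided by the total count."""
    n_total = len(scores[0]) + len(scores[1])
    return {s: sum(v) / n_total for s, v in scores.items()}


def predict(eta, s, theta, p_hat):
    """Thresholding rule g_theta applied to a score eta = eta_hat(x, s)."""
    factor = 2.0 - theta / p_hat[1] if s == 1 else 2.0 + theta / p_hat[0]
    return 1 if eta * factor >= 1.0 else 0


def empirical_unfairness(theta, scores, p_hat):
    """Empirical unfairness of g_theta computed from unlabeled data only."""
    rate = {s: sum(e * predict(e, s, theta, p_hat) for e in v) / sum(v)
            for s, v in scores.items()}
    return abs(rate[1] - rate[0])


def fit_theta(scores, p_hat, grid_size=401):
    """Grid approximation of the minimizer of the empirical unfairness."""
    grid = [-2.0 + 4.0 * k / (grid_size - 1) for k in range(grid_size)]
    return min(grid, key=lambda th: empirical_unfairness(th, scores, p_hat))
```

Given `scores = group_scores(eta_hat, unlabeled)`, `p_hat = joint_estimates(scores)`, and `theta_hat = fit_theta(scores, p_hat)`, a new point (x, s) is classified by `predict(eta_hat(x, s), s, theta_hat, p_hat)`. Note that no labels are used in this step.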

Remark 5. Even though we believe that the introduction of the unlabeled sample is one of the strong points of the approach, this sample may not be available on some benchmark datasets. In this case, we can simply split the data randomly into two parts, disregarding the labels in one of them, or alternatively we can use the same sample twice. The second path is not directly justified by the theoretical results; yet, let us suggest the following intuitive explanation for this approach. In the first and the second steps, the procedure approximates two independent parts of the distribution $ℙ$ of the random tuple (X, S, Y). Indeed, following the factorization $ℙ = ℙ_{Y | X, S} \otimes ℙ_{(X, S)}$, the first step of the procedure approximates $ℙ_{Y | X, S}$, whereas the second step is aimed at $ℙ_{(X, S)}$, which is independent of $ℙ_{Y | X, S}$. In the experiments it is possible to exploit the same set of data for both $D_{n}$ and $D_{N}$, since unlabelled samples are not always available and splitting the dataset would reduce the quality of the trained model because the datasets have a small sample size.

3.3.4 Consistency

In this section we establish that the proposed procedure is consistent. To present the theoretical results we impose two assumptions on the estimator $\hat{η}$ and demonstrate how to satisfy them in practice.

Assumption 2. The estimator $\hat{η}$, which is constructed on $D_{n}$, satisfies for all s ∈ {0, 1}:

(i) $E_{D_{n}} E_{X | S = s} | η (X, s) - \hat{η} (X, s) | \to 0$ as n → ∞;

(ii) there exists a sequence $c_{n, N} > 0$ satisfying $\frac{1}{c_{n, N} \sqrt{N}} = o_{n, N} (1)$ and $c_{n, N} = o_{n, N} (1)$ such that $E_{X | S = s} [\hat{η} (X, s)] \geq c_{n, N}$ almost surely.

Remark 6. There are two parts in Assumption 2. The first one requires a consistent estimator in ℓ₁ norm. This first assumption is rather weak, since many consistent estimators of the regression function are available in the literature, including the maximum likelihood estimator [191] for the Gaussian generative model, the local polynomial estimator [10] for a β-Hölder smooth regression function η (· , s), regularized logistic regression [183] for the generalized linear model, the k-nearest neighbors estimator [46] for a Lipschitz regression function η (· , s), and random forest type estimators in various settings [9, 174].

The second part of Assumption 2 means that the quantity $E_{X | S = s} [\hat{η} (X, s)]$ is lower bounded by a positive term vanishing as N, n grow to infinity. This condition can be enforced artificially on any predefined estimator. Indeed, assume that we have a consistent estimator $\tilde{η}$ and let $\hat{η} (x, s) = \max \{\tilde{η} (x, s), c_{n, N}\}$; then the second item of the assumption is satisfied in an even stronger form. Moreover, this estimator $\hat{η}$ remains consistent, since using the triangle inequality and the fact that $| \hat{η} (x, s) - \tilde{η} (x, s) | \leq c_{n, N}$ for all $x \in ℝ^{d}$, we have $\begin{matrix} E_{D_{n}} E_{X | S = s} | η (X, s) - \hat{η} (X, s) | \\ \leq E_{D_{n}} E_{X | S = s} | η (X, s) - \tilde{η} (X, s) | + c_{n, N} \to 0 . \end{matrix}$
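The truncation described above is straightforward to implement. In the sketch below, the choice $c_{n, N} = N^{-1/4}$ is our own illustrative assumption: any exponent strictly between 0 and 1/2 gives both $c_{n, N} \to 0$ and $1 / (c_{n, N} \sqrt{N}) \to 0$. The function names are hypothetical.

```python
def floor_level(N, power=0.25):
    """Candidate sequence c = N**(-power).

    For 0 < power < 1/2 both c and 1 / (c * sqrt(N)) vanish as N grows,
    which is what the second item of Assumption 2 asks for.
    """
    if not 0.0 < power < 0.5:
        raise ValueError("power must lie strictly between 0 and 1/2")
    return N ** (-power)


def clip_estimator(eta_tilde, c):
    """Return eta_hat(x, s) = max(eta_tilde(x, s), c)."""
    def eta_hat(x, s):
        return max(eta_tilde(x, s), c)
    return eta_hat
```

Since the truncated estimator is at least c everywhere, its average over each group is also at least c, which is the stronger form mentioned in the text.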

Additionally, we impose one more condition on the estimator $\hat{η}$ that was already successfully used in the context of confidence set classification [31].

Assumption 3. The estimator $\hat{η}$ is such that for all s ∈ {0, 1} the mapping $t \mapsto ℙ (\hat{η} (X, s) \leq t \mid S = s)$ is continuous on (0, 1) almost surely.

In the proposed settings this assumption allows us to show that the value of $\hat{Δ} (\hat{g}, ℙ)$ cannot be large, that is, the empirical unfairness of the proposed procedure is small or zero. As we shall see, a control on the empirical unfairness $\hat{Δ} (\hat{g}, ℙ)$ in Definition 6 is crucial in proving that the proposed procedure $\hat{g}$ achieves both asymptotic fairness and risk consistency.

Remark 7. Assumption 3 is equivalent to saying that the distribution of the estimated regression function $\hat{η} (X, s)$ has no atoms. It can be fulfilled by a simple modification of any preliminary estimator, namely by adding a small deterministic “noise”, the amplitude of which must decrease with n, N in order to preserve statistical consistency.

The remarks suggest that both Assumptions 2 and 3 can be easily satisfied in a variety of practical settings and the most demanding part of these assumptions is the consistency of $\hat{η}$ .

The next result establishes the statistical consistency of the proposed algorithm.

Theorem 2. (Asymptotic properties) Under Assumptions 1, 2, and 3 the proposed algorithm satisfies $\lim_{n, N \to \infty} E_{(D_{n}, D_{N})} [Δ (\hat{g}, ℙ)] = 0$ and $\lim_{n, N \to \infty} E_{(D_{n}, D_{N})} [R (\hat{g})] \leq R (g^{*}) .$

The proof is reported in [32].

Remark 8. Let us mention that it is possible to present the result in a finite sample regime, since the proof of consistency is based on non-asymptotic theory of empirical processes. However, the actual rate of convergence depends on the rate of ℓ₁-norm estimation of the regression function η, which can vary significantly from one setup to another. That is why we decided to present the result in the asymptotic sense.

4 Methods for learning fair representations

Let us consider a composition of models f (g (x)) where $x \in ℝ^{d}$ is a vector of raw features (an element of the input space), $g : ℝ^{d} \to ℝ^{r}$ is a function mapping the input space into a new one, that we refer to as the representation. In other words, the function g synthesizes the information needed to solve a particular task (or a set of tasks) by learning a function f, chosen from a set of possible functions.

In the current literature [16, 202], the term fair representation refers to learning a representation function g which does not discriminate against subgroups in the data. Namely, g is conditionally independent of subgroup membership. This approach differs from the most commonly used approaches [49, 197], in which the focus is on solving a task (or a set of tasks) without discriminating against subgroups in the data, regardless of the fairness of the representation itself. That is, in the previously mentioned works a fair model $f : ℝ^{r} \to ℝ$ is learned directly from the raw data, without performing any explicit representation extraction.

Note that these methods could be considered a special case of the pre-processing methods, but the conceptual difference is that here the representation is not a deterministic mapping but a learned, problem-dependent mapping.

In particular, in [16, 186], the authors propose different neural network architectures, together with modified learning strategies, able to learn a representation that obscures or removes the sensitive variable. In the general case, all these methods have an input, a target variable (i.e., the task at hand), and a binary sensitive variable. The objective is to learn a representation that (i) preserves information about the input space, (ii) is useful for predicting the target, and (iii) is approximately independent of the sensitive variable. In practice, these methods pursue the goal of making the generated model act randomly when the internal representation is exploited to predict the sensitive variable. In this sense, no actual constraint is directly imposed on the internal representation, but only on the output of the model.

In [92], instead, the authors show how to formulate the problem of counterfactual inference as a domain adaptation problem, and more specifically as a covariate shift problem [165]. The authors derive two new families of representation algorithms for counterfactual inference. The first one is based on linear models and variable selection, and the other one on deep learning. The authors show that learning representations that encourage similarity (i.e., balance) between the treatment and control populations leads to better counterfactual inference; this is in contrast to many methods which attempt to create balance by re-weighting samples.

Finally, in [202], the authors learn a representation of the data that is a probability distribution over clusters, where knowing the cluster of a datapoint conveys no information about the sensitive variable, namely fair clustering. In this sense, the clustering is learned to be fair and also discriminative for the prediction task at hand.

4.1 Method

In this section, we present a method to learn a shared fair representation from multiple tasks [152]. We consider T supervised learning tasks (each could be a binary classification or regression problem). Each task t ∈ {1, …, T} is identified by a probability distribution μ_t on $X \times S \times Y$, where $X \subset ℝ^{d}$ is the set of non-sensitive input variables, $S = \{1, 2\}$ is the set of values of a binary sensitive variable (see footnote 4), and $Y$ is the output space, which is either {-1, 1} for binary classification or $Y \subset ℝ$ for regression. We let $z_{t} = (x_{t, i}, s_{t, i}, y_{t, i})_{i = 1}^{m} \in (X \times S \times Y)^{m}$ be the training sequence for task t, which is sampled independently from μ_t. The goal is to learn a predictive model $f_{t} : X \times S \to Y$ for each task t ∈ {1, …, T}.

Depending on the application at hand, the model may include (i.e., $f : X \times S \to Y$) or not (i.e., $f : X \to Y$) the sensitive feature in its functional form. In the following we consider the case that the functions f_t are linear (see footnote 5), and to simplify the presentation we consider the case that s is not included in the functional form, that is, f_t (x) = 〈w_t, x〉, where $w_{t} \in ℝ^{d}$ is a vector of parameters. The case in which both x and s are used as predictors is obtained by adding two more components to x, representing the one-hot encoding of s, and letting $w_{t} \in ℝ^{d + 2}$.

A general multitask learning (MTL) formulation is based on minimizing the multitask empirical error plus a regularization term which leverages similarities between the tasks. A natural choice for the regularizer, which is considered in this section, is given by the trace norm, namely the sum of the singular values of the matrix $W = [w_{1} \dots w_{T}] \in ℝ^{d \times T}$. It is well known that this problem is equivalent to the matrix factorization problem

$\min_{A, B} \frac{1}{Tm} \sum_{t = 1}^{T} \sum_{i = 1}^{m} (y_{t, i} - 〈 b_{t}, A^{⊤} x_{t, i} 〉)^{2} + \frac{λ}{2} (∥ A ∥_{F}^{2} + ∥ B ∥_{F}^{2}),$ (32) where $A = [a_{1} \dots a_{r}] \in ℝ^{d \times r}$ and $B = [b_{1} \dots b_{T}] \in ℝ^{r \times T}$ and ∥ · ∥_F is the Frobenius norm, see e.g., [180] and references therein. Here $r \in ℕ$ is the number of factors, that is, the upper bound on the rank of W = AB. If r ≥ min(d, T) then Problem (32) is equivalent to trace norm regularization [61, 184], see e.g., [152] and references therein; see also footnote 6. We follow the formulation of Equation (32) since it can easily be solved by gradient descent or alternate minimization, as we discuss next. Once the problem is solved, the estimated parameters w_t of the tasks’ linear models are simply computed as w_t = Ab_t. We also note that for simplicity the problem is stated with the square loss function, but the observations extend to the general case of proper convex loss functions.

Note that the method can be interpreted as a 2-layer network with linear activation functions. Indeed, the matrix $A^{⊤}$ applied to an input vector $x \in ℝ^{d}$ induces the linear representation $A^{⊤} x \in ℝ^{r}$. We would like this representation to be fair w.r.t. the sensitive feature. Specifically, we require that each component of the representation vector satisfies the demographic parity constraint [61, 184] on each task. This means that, for every measurable subset $C \subset ℝ^{r}$, and for every t ∈ {1, …, T}, we require that

$ℙ (A^{⊤} x_{t} \in C \mid s = 1) = ℙ (A^{⊤} x_{t} \in C \mid s = 2),$ (33) that is, the two conditional distributions are the same. We relax this constraint by requiring, for every t ∈ {1, …, T}, that both distributions have the same mean. Furthermore, we compute the means from empirical data. For each training sequence $z \in (X \times S \times Y)^{m}$ and $s \in S$, we use the notation I_s (z) = {(x_i, y_i) : s_i = s}, define the difference of the empirical conditional means

$c (z) = \frac{1}{| I_{1} (z) |} \sum_{i \in I_{1} (z)} x_{i} - \frac{1}{| I_{2} (z) |} \sum_{i \in I_{2} (z)} x_{i},$ (34) and then relax the constraint of Equation (33) to

$A^{⊤} c (z_{t}) = 0 .$ (35)

This is a crude approximation, since it only requires the first-order moments of the two distributions to be the same. However, as shown in [152], it works well in practice and has the major advantage of turning a non-convex constraint into a convex one. We note that a similar approximation has been considered in [153] in the case of fair regression, and reported to be empirically effective.
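The first-order relaxation is easy to check numerically. The sketch below computes the vector c(z) for one task and evaluates $A^{⊤} c (z)$ for a candidate representation; the function names and the plain-list data layout are assumptions made for illustration.

```python
def mean_difference(xs, ss):
    """c(z): mean input of the group s == 1 minus mean input of the group s == 2."""
    d = len(xs[0])
    sums = {1: [0.0] * d, 2: [0.0] * d}
    counts = {1: 0, 2: 0}
    for x, s in zip(xs, ss):
        counts[s] += 1
        for j in range(d):
            sums[s][j] += x[j]
    if 0 in counts.values():
        raise ValueError("both sensitive groups must be present")
    return [sums[1][j] / counts[1] - sums[2][j] / counts[2] for j in range(d)]


def apply_transpose(A, c):
    """Compute A^T c, with A given as a list of d rows of length r."""
    r = len(A[0])
    return [sum(A[j][k] * c[j] for j in range(len(c))) for k in range(r)]
```

A representation A satisfies the relaxed constraint on a task exactly when `apply_transpose(A, mean_difference(xs, ss))` is the zero vector, that is, when the two groups have the same mean in the representation space.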

Based on the above reasoning, we propose to learn a fair linear representation as a solution to the optimization problem $\begin{matrix} \min_{A, B} & \frac{1}{Tm} \sum_{t = 1}^{T} \sum_{i = 1}^{m} (y_{t, i} - 〈 b_{t}, A^{⊤} x_{t, i} 〉)^{2} + \frac{λ}{2} (∥ A ∥_{F}^{2} + ∥ B ∥_{F}^{2}) \\ \text{s.t.} & A^{⊤} c (z_{t}) = 0, \quad t \in \{1, \dots, T\} . \end{matrix}$ (36) In the following we use the shorthand notation c_t = c (z_t). There are many methods to tackle Problem (36). A natural approach is based on alternate minimization. We discuss the main steps below. Let $y_{t} = [y_{t, 1}, \dots, y_{t, m}]^{⊤}$ be the vector formed by the outputs of task t, and let $X_{t} = [x_{t, 1}^{⊤}, \dots, x_{t, m}^{⊤}]^{⊤}$ be the data matrix for task t.

When we regard A as fixed and solve w.r.t. B, then Problem (36) can be reformulated as

$\begin{matrix} \min_{B} & {∥ [\begin{matrix} y_{1} \\ ⋮ \\ y_{T} \end{matrix}] - [\begin{matrix} X_{1} A & 0 & \dots & 0 \\ ⋮ & ⋱ & ⋱ & ⋮ \\ 0 & \dots & 0 & X_{T} A \end{matrix}] [\begin{matrix} b_{1} \\ ⋮ \\ b_{T} \end{matrix}] ∥}^{2} + λ {∥ [\begin{matrix} b_{1} \\ ⋮ \\ b_{T} \end{matrix}] ∥}^{2}, \end{matrix}$ (37) which can be easily solved. In particular, note that the problem decouples across the tasks, and each task-specific problem amounts to running ridge regression on the data transformed by the representation matrix $A^{⊤}$. When instead B is fixed and we solve w.r.t. A, Problem (36) can be reformulated as

$\begin{matrix} \min_{A} {∥ [\begin{matrix} y_{1} \\ ⋮ \\ y_{T} \end{matrix}] - [\begin{matrix} b_{1, 1} X_{1} & \dots & b_{1, r} X_{1} \\ ⋮ & & ⋮ \\ b_{T, 1} X_{T} & \dots & b_{T, r} X_{T} \end{matrix}] [\begin{matrix} a_{1} \\ ⋮ \\ a_{r} \end{matrix}] ∥}^{2} + λ {∥ [\begin{matrix} a_{1} \\ ⋮ \\ a_{r} \end{matrix}] ∥}^{2}, \\ \text{s.t.} \quad [\begin{matrix} a_{1}^{⊤} \\ ⋮ \\ a_{r}^{⊤} \end{matrix}] \circ [\begin{matrix} c_{1}, \dots, c_{T} \end{matrix}] = 0, \end{matrix}$ (38) where ∘ is the Kronecker product for partitioned tensors (or Tracy-Singh product). Consequently, by alternating minimization we can solve the original problem. Note also that we may relax the equality constraint as $\frac{1}{T} \sum_{t = 1}^{T} ∥ A^{⊤} c (z_{t}) ∥^{2} \leq ε$, where ε is some tolerance parameter. In fact, this may be required when the vectors c (z_t) span the whole input space. In this case we may also add a soft constraint in the regularizer.
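The alternating scheme can be sketched with NumPy as follows. This is a minimal illustration under several assumptions of ours, not the authors' implementation: all tasks have the same sample size, the equality constraint is enforced by writing A = Q Ã with Q an orthonormal basis of the subspace orthogonal to all the vectors c_t (so that ∥A∥_F = ∥Ã∥_F), and the ridge parameter follows the scaling of the fair problem with the factors 1/(Tm) and λ/2. The function names are hypothetical.

```python
import numpy as np


def objective(A, B, Xs, ys, lam):
    """Regularized multitask squared error of the factorization W = A B."""
    T, m = len(Xs), Xs[0].shape[0]
    err = sum(np.sum((y - X @ A @ B[:, t]) ** 2)
              for t, (X, y) in enumerate(zip(Xs, ys)))
    return err / (T * m) + 0.5 * lam * (np.sum(A ** 2) + np.sum(B ** 2))


def fair_representation(Xs, ys, cs, r, lam=0.1, n_iter=50, seed=0):
    """Alternate exact minimization over B and A subject to A.T @ c_t = 0."""
    T, m = len(Xs), Xs[0].shape[0]
    ridge = lam * T * m / 2.0  # from the 1/(Tm) and lambda/2 scalings

    # Orthonormal basis Q of the subspace orthogonal to every c_t.
    U, sv, _ = np.linalg.svd(np.column_stack(cs), full_matrices=True)
    Q = U[:, int(np.sum(sv > 1e-10)):]
    k = Q.shape[1]
    if k == 0:
        raise ValueError("the vectors c_t span the whole input space")

    XQ = [X @ Q for X in Xs]  # data expressed in the feasible subspace
    y_all = np.concatenate(ys)

    def solve_B(A_tilde):
        # One ridge regression per task on the transformed data X_t Q A_tilde.
        cols = []
        for XQ_t, y in zip(XQ, ys):
            Z = XQ_t @ A_tilde
            cols.append(np.linalg.solve(Z.T @ Z + ridge * np.eye(r), Z.T @ y))
        return np.column_stack(cols)

    A_tilde = np.random.default_rng(seed).standard_normal((k, r))
    for _ in range(n_iter):
        B = solve_B(A_tilde)
        # Row block t of the design matrix is kron(b_t^T, X_t Q); the unknown
        # is vec(A_tilde) stacked column by column.
        M = np.vstack([np.kron(B[:, t][None, :], XQ[t]) for t in range(T)])
        v = np.linalg.solve(M.T @ M + ridge * np.eye(k * r), M.T @ y_all)
        A_tilde = v.reshape((k, r), order="F")
    return Q @ A_tilde, solve_B(A_tilde)
```

Because each step minimizes the objective exactly in one block of variables, the objective value does not increase along the iterations. When the vectors c_t span the whole input space the feasible set contains only A = 0, which is the situation where the relaxed constraint with tolerance ε becomes necessary.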

We conclude this section by noting that if demographic parity is satisfied at the representation level, that is, Equation (33) holds true, then every model built from such a representation will satisfy demographic parity as well. Likewise, if the representation satisfies the convex relaxation in Equation (35), then it also holds that $〈 w_{t}, c (z_{t}) 〉 = 〈 b_{t}, A^{⊤} c (z_{t}) 〉 = 0$, that is, the task weight vectors satisfy the first-order moment approximation of demographic parity. More importantly, as we will show in the next section, if the tasks are randomly observed, then demographic parity will also be satisfied on future tasks with high probability. In this sense the method can be interpreted as learning a fair transferable representation.

4.2 Learning bound

In this section, we study the learning ability of the proposed method. We consider the setting of learning-to-learn [12], in which the training tasks (and their corresponding datasets) used to find a fair data representation are regarded as random variables from a meta-distribution. The learned representation matrix A is then transferred to a novel task by applying ridge regression on the task dataset, in which the input x is transformed as $A^{⊤} x$. In [136] a learning bound is presented, linking the average risk of the method over tasks from the meta-distribution (the so-called transfer risk) to the multi-task empirical error on the training tasks. This result quantifies the good performance of the representation learning method when the number of tasks grows and the data distribution on the raw input data is intrinsically high dimensional (hence learning is difficult without representation learning). We extend this analysis to the setting of algorithmic fairness, in which the performance of the algorithm is evaluated relative to both the risk and the fairness constraint. We show that both quantities can be bounded by their empirical counterparts evaluated on the training tasks.

To present the result we introduce some more notation. We let $R_{μ} (w)$ and $R_{z} (w)$ be the expected and empirical errors of a weight vector w, that is $R_{μ} (w) = E_{(x, y) \sim μ} [(y - 〈 w, x 〉)^{2}],$ (39) $R_{z} (w) = \frac{1}{m} \sum_{i = 1}^{m} (y_{i} - 〈 w, x_{i} 〉)^{2} .$ (40)

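Furthermore, for every matrix $A \in ℝ^{d \times r}$ and for every data sample $z = (x_{i}, y_{i})_{i = 1}^{m}$, we let $b_{A} (z) = \arg \min_{b \in ℝ^{r}} \frac{1}{m} \sum_{i = 1}^{m} (y_{i} - 〈 b, A^{⊤} x_{i} 〉)^{2} + λ ∥ b ∥^{2}$ be the minimizer of ridge regression with modified data representation, that is, $b_{A} (z) = (A^{⊤} X^{⊤} X A + λ m I)^{+} A^{⊤} X^{⊤} y$ with $X = [x_{1}, \dots, x_{m}]^{⊤}$ and $y = [y_{1}, \dots, y_{m}]^{⊤}$, where “⁺” is the pseudo-inverse operation. We also let $w_{A} (z) = A b_{A} (z)$ be the corresponding weight vector.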

Theorem 3. Let A be the representation learned by solving Problem (32) and renormalized so that ∥A∥_F = 1. Let tasks μ₁, …, μ_T be independently sampled from a meta-distribution ρ, and let z_t be sampled from $μ_{t}^{m}$ for t ∈ {1, …, T}. Assume that the input marginal distribution of random tasks from ρ is supported on the unit sphere and that the outputs are in the interval [-1, 1], almost surely. Let r = min(d, T). Then, for any δ ∈ (0, 1] it holds with probability at least 1 - δ in the drawing of the datasets z₁, …, z_T, that $\begin{matrix} E_{μ \sim ρ} E_{z \sim μ^{m}} R_{μ} (w_{A} (z)) - \frac{1}{T} \sum_{t = 1}^{T} R_{z_{t}} (w_{A} (z_{t})) \\ \leq \frac{4}{λ} \sqrt{\frac{∥ \hat{C} ∥_{\infty}}{m}} + \frac{24}{λ m} \sqrt{\frac{\ln \frac{8 mT}{δ}}{T}} + \frac{14}{λ} \sqrt{\frac{\ln (mT) ∥ \hat{C} ∥_{\infty}}{T}} + \sqrt{\frac{2 \ln \frac{4}{δ}}{T}}, \end{matrix}$ (41) and $\begin{matrix} E_{μ \sim ρ} E_{z \sim μ^{m}} ∥ A^{⊤} c (z) ∥^{2} - \frac{1}{T} \sum_{t = 1}^{T} ∥ A^{⊤} c (z_{t}) ∥^{2} \\ \leq 96 \frac{\ln \frac{8 r^{2}}{δ}}{T} + 6 \sqrt{\frac{∥ \hat{Σ} ∥_{\infty} \ln \frac{8 r^{2}}{δ}}{T}} . \end{matrix}$ (42)

The proof is reported in [152].

We need to make some remarks on the above result. The first bound in Theorem 3 improves Theorem 2 in [136]. The improvement is due to the introduction of the empirical total covariance in the second term in the RHS of the inequality. The result in [136] instead contains the term $\sqrt{1 / T}$, which can be considerably larger when the raw input is distributed on a high dimensional manifold. The bounds in Theorem 3 can be extended to hold with variable sample size per task. In order to simplify the presentation, we assume that all datasets are composed of the same number of points m. The general setting can be addressed by letting the sample size be a random variable and introducing a slightly different definition of the transfer risk in which we also take the expectation w.r.t. the sample size. The hyperparameter λ is regarded as fixed in the analysis. In practice it will be chosen by cross-validation. The bound on the fairness measure contains two terms in the right hand side, in the spirit of Bernstein’s inequality. The slow term $O (1 / \sqrt{T})$ contains the spectral norm of the covariance of the difference of means across the sensitive groups. Notice that $∥ \hat{Σ} ∥_{\infty} \leq 1$, but it can be much smaller when the means are close to each other, that is, when the original representation is already approximately fair.

5 Discussion and conclusions

Since machine learning based systems and products are reaching many aspects of everyday life, concern about the ethical issues that may arise from the adoption of these technologies has started to emerge. Researchers have started to investigate methods to mitigate the possible side effects of these technologies, and a new area of machine learning has recently emerged that studies how to address disparate treatment caused by algorithmic errors and bias in the data. In this work we tried to describe the state of the art on algorithmic fairness using statistical learning theory, machine learning, and deep learning tools that are able to learn fair models and data representations. The central question that we tried to answer is how to ensure that the learned model does not treat subgroups in the population unfairly. While solving such an issue requires an interdisciplinary effort, fundamental progress can only be achieved through a radical change in the machine learning paradigm. In particular, we first showed that it is possible to learn fair models with pre/in/post-processing techniques able to ensure that the outputs of a learned model satisfy a particular notion of fairness. After this step, we showed that forcing fairness in the model output is not enough to reach fairness. In modern contexts, models are not learned from scratch, since datasets are too complex or too small in cardinality. Consequently, we need to exploit correlation between tasks in order to solve a new task. This can be done using a shared representation, in the transfer/multitask/lifelong setting, which allows all the tasks to be learned more effectively. In the context of this work, the representation should also be fair, in the sense that each possible model learned from this representation should be, as a consequence, fair. For this reason, in this work, we also reviewed methods to build a fair representation.

This field of research is highly interdisciplinary and relatively new, and this work surely does not exhaustively review all the ideas around fairness. For example, the legal issues about fairness have not been taken into account, and collaboration between society, politicians, lawyers, and AI researchers is necessary for this field of research to be useful in real applications. Another interesting field of research, which will become increasingly important in the future, is causal inference, which provides a precious tool to reason about and deal with fairness, especially in complex unfairness scenarios. Another limitation of this manuscript is that it did not discuss fairness in a temporal setting, namely the long-term consequences of current decisions. This area of research has only recently started to be explored but will have to play a critical role in the development of truly fair techniques.

Acknowledgments

This work has been partially supported by the Amazon AWS Machine Learning Research Award and the European Union’s Horizon 2020 research and innovation programme under the NGI_TRUST grant agreement no 825618 - Third-party project AMNESIA Assessment of fairness of NGI AI-based future interactive technologies.

Footnotes

See for example the Caffe Model Zoo:

This problem has been addressed in depth in []. We did not address this issue in detail in this paper since it is mainly a legal issue. In fact, removing the sensitive feature from the functional form of the model does not ensure fairness, since other features may be correlated with the sensitive one; nevertheless, whether or not to include these features is just a technical issue (which has been treated in a different section of this paper).

Note additionally that for all s ∈ {0, 1} we can write 1{Y = 1, g(X, s) = 1} ≡ Y g(X, s), where 1{·} denotes the indicator function, since both Y and g are binary.

The method naturally extends to multiple sensitive variables but for ease of presentation we consider only the binary case.

The method naturally extends to the nonlinear case [].

If r < min(d, T) then Problem (33) is equivalent to trace norm regularization plus a rank constraint.

References

Adler

, Falk

, Friedler

S.A.

, Nix

, Rybeck

, Scheidegger

, Smith

and Venkatasubramanian

, Auditing black-box models for indirect influence, Knowledge and Information Systems 54(1) (2018), 95–122.

Agarwal

, Beygelzimer

, Dudik

, Langford

, Wallach

, A reductions approach to fair classification, In, International Conference on Machine Learning (2018).

AINowInstitute, Litigating algorithms: Challenging government use of algorithmic decision systems, 2016. URL https://ainowinstitute.org/litigatingalgorithms.pdf.

Alabi

, Immorlica

, Kalai

A.T.

, Unleashing linear optimizers for group-fair learning and optimization, arXiv preprint arXiv:1804.04503 (2018).

Ali

, Zafar

M.B.

, Singla

, Gummadi

K.P.

, Lossaversively fair classification, In AAAI/ACM Conference on AI, Ethics and Society (2019).

Amrieh

E.A.

, Hamtini

, Aljarah

, Students’academic performance data set, Available at https://www.kaggle.com/aljarah/xAPI-Edu-Data, (2015).

Angwin

, Larson

, Mattu

, Kirchner

, Machine Bias: There’s software used across the country to predict future criminals, And it’s Biased Against Blacks (2016). URL https://www.propublica.org/article/machine-biasrisk-assessmentsin-criminal-sentencing.

Argyriou

, Evgeniou

and Pontil

, Convex multi-task feature learning, Machine Learning 73(3) (2008), 243–272.

Arlot

, Genuer

, Analysis of purely random forests bias, arXiv preprint arXiv:1407.3939 (2014).

10.

Audibert

J.Y.

and Tsybakov

, Fast learning rates for plug-in classifiers, The Annals of Statistics 35(2) (2007), 608–633.

11.

Bartlett

P.L.

and Mendelson

, Rademacher and gaussian complexities: Risk bounds and structural results, Journal of Machine Learning Research 3(Nov) (2002), 463–482.

12.

Baxter

, A model of inductive bias learning, Journal of Artificial Intelligence research 12 (2000), 149–198.

13.

Bechavod

, Ligett

, Penalizing unfairness in binary classification, arXiv preprint arXiv:1707.00044v3 (2018).

14.

Berk

, Heidari

, Jabbari

, Joseph

, Kearns

, Morgenstern

, Neel

, Roth

, A convex framework for fair regression, arXiv preprint arXiv:1706.02409 (2017).

15.

Berk

, Heidari

, Jabbari

, Kearns

, Roth

, Fairness in criminal justice risk assessments: The state of the art, Sociological Methods & Research (2018).

16.

Beutel

, Chen

, Zhao

, Chi

E.H.

, Data decisions and theoretical implications when adversarially learning fair representations, arXiv preprint arXiv:1707.00075 (2017).

17.

Bogen

, Rieke

, Help wanted: An examination of hiring algorithms, equity and bias, Upturn Technical Report (2018).

18.

Bonchi

, Hajian

, Mishra

and Ramazzotti

, Exposing the probabilistic causal structure of discrimination, International Journal of Data Science and Analytics 3(1) (2017), 1–21.

19.

Borwein

, Lewis

A.S.

, Convex Analysis and Nonlinear Optimization: Theory and Examples, Springer (2010).

20.

Breiman

, Consistency for a simple model of random forestsTechnical report, Statistics Department University Of California At Berkeley (2004).

21.

Byanjankar

, Heikkilä

, Mezei

, Predicting credit risk in peer-to-peer lending: A neural network approach, In IEEE Symposium Series on Computational Intelligence (2015).

22.

Calders

and Verwer

, Three naive bayes approaches for discrimination-free classification, Data Mining and Knowledge Discovery 21(2) (2010), 277–292.

23.

Calders

, Kamiran

, Pechenizkiy

, Building classifiers with independency constraints, In IEEE International Conference on Data Mining Workshops (2009).

24.

Calders

, Karim

, Kamiran

, Ali

, Zhang

, Controlling attribute effect in linear regression, In IEEE International Conference on Data Mining (2013).

25.

Calmon

, Wei

, Vinzamuri

, Ramamurthy

K.N.

, Varshney

K.R.

, Optimized pre-processing for discrimination prevention, In Neural Information Processing Systems (2017).

26.

Chiappa

, Gillam

P.S.

, Path-specific

, counterfactual fairness, In AAAI Conference on Artificial Intelligence (2019).

27.

Chiappa

, Isaac

W.S.

, A causal bayesian networks viewpoint on fairness, In IFIP International Summer School on Privacy and Identity Management (2018).

28.

Chierichetti

, Kumar

, Lattanzi

, Vassilvitskii

, Fair clustering through fairlets, In Neural Information Processing Systems (2017).

29.

Chouldechova

, Fair prediction with disparate impact: A study of bias in recidivism prediction instruments, Big Data 5(2) (2017), 153–163.

30.

Chouldechova

, Putnam-Hornstein

, Benavides-Prado

, Fialko

and Vaithianathan

, A case study of algorithmassisted decision making in child maltreatment hotline screening decisions, Proceedings of Machine Learning Research 81 (2018), 134–148.

31.

Chzhen

, Denis

, Hebiri

, Minimax semisupervised confidence sets for multi-class classification, arXiv preprint arXiv:1904.12527 (2019).

32.

Chzhen

, Hebiri

, Denis

, Oneto

, Pontil

, Leveraging labeled and unlabeled data for consistent fair binary classification, In Advances in Neural Information Processing Systems (NIPS) (2019).

33.

Chzhen

, Hebiri

, Denis

, Oneto

, Pontil

, Leveraging labeled and unlabeled data for consistent fair binary classification, In Neural Information Processing Systems (2019).

34.

Ciliberto

, Stamos

, Pontil

, Reexamining low rank matrix factorization for trace norm regularization, arXiv preprint arXiv:1706.08934 (2017).

35.

Corbett-Davies

, Goel

, The measure and mismeasure of fairness: A critical review of fair machine learning, arXiv preprint arXiv:1808.00023 (2018).

36.

Corbett-Davies

, Pierson

, Feller

, Goel

, Huq

, A computer program used for bail and sentencing decisions was labeled biased against blacks, It’s Actually not that Clear (2016). URL https://www.washingtonpost.com/news/monkeycage/wp/2016/10/17/can-an-algorithm-beracist-our-analysis-is-more-cautiousthanpropublicas/?utm term=.8c6e8c1cfbdf.

37.

Corbett-Davies

, Pierson

, Feller

, Goel

, Huq

, Algorithmic decision making and the cost of fairness, In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2017).

38.

Cortez

, Student performance data set. Available at https://archive.ics.uci.edu/ml/datasets/Student+Performance, (2014).

39.

Cotter

, Gupta

, Jiang

, Srebro

, Sridharan

, Wang

, Woodworth

, You

, Training wellgeneralizing classifiers for fairness metrics and other datadependent constraints, arXiv preprint arXiv:1807.00028 (2018).

40.

Cotter

, Jiang

, Sridharan

, Two-player games for efficient non-convex constrained optimization, In Algorithmic Learning Theory (2019).

41.

Dawid

, Fundamentals of statistical causality, Technical Report (2007).

42.

, Fauw, J.R. Ledsam, B. Romera-Paredes, S. Nikolov, N. Tomasev, S. H. Askham, X. Glorot, B. O’Donoghue, D. Visentin, G. Van Den Driessche, B. Lakshminarayanan, C. Meyer, F. Mackinder, S. Bouton, K. Ayoub, R. Chopra, D. King, A. Karthikesalingam, C.O. Hughes, R. Raine, J. Hughes, D. A. Sim, C. Egan, A. Tufail, H. Montgomery, D. Hassabis, G. Rees, T. Back, P.T. Khaw, M. Suleyman, J. Cornebise, P.A. Keane and O. Ronneberger, Clinically applicable deep learning for diagnosis and referral in retinal disease, Nature Medicine 24(9) (2018), 1342–1350 Blackwell.

43.

Denis

, Hebiri

, Consistency of plug-in confidence sets for classification in semi-supervised learning, arXiv preprint arXiv:1507.07235 (2015).

44.

Denis

and Hebiri

, Confidence sets with expected sizes for multiclass classification, Journal of Machine Learning Research 18(1) (2017), 3571–3598.

45.

N.Y.P. Department. Stop, question and frisk data set. Available at https://www1.nyc.gov/site/nypd/stats/reports-analysis/stopfrisk.page, (2012).

46.

Devroye

, The uniform convergence of nearest neighbor regression function estimators and their application in optimization, IEEE Transactions on Information Theory 24(2) (1978), 142–151.

47.

Donahue

, Jia

, Vinyals

, Hoffman

, Zhang

, Tzeng

, Darrell

, Decaf: A deep convolutional activation feature for generic visual recognition, In International Conference on Machine Learning (2014).

48.

Donini

, Ben-David

, Pontil

, Shawe-Taylor

, An efficient method to impose fairness in linear models, In NIPS Workshop on Prioritising Online Content (2017).

49.

Donini

, Oneto

, Ben-David

, Shawe-Taylor

J.S.

, Pontil

, Empirical risk minimization under fairness constraints, In Neural Information Processing Systems (2018).

50.

Dwork

, Hardt

, Pitassi

, Reingold

, Zemel

, Fairness through awareness, In Innovations in Theoretical Computer Science Conference (2012).

51.

Dwork

, Immorlica

, Kalai

A.T.

, Leiserson

M.D.M.

, Decoupled classifiers for group-fair and efficient machine learning, In Conference on Fairness, Accountability and Transparency (2018).

52.

Edwards

, Storkey

, Censoring representations with an adversary, arXiv preprint arXiv:1511.05897 (2015).

53.

Eubanks

, Automating Inequality: How High-Tech Tools Profile, Police and Punish the Poor. St. Martin’s Press (2018).

54.

Fehrman

, Egan

, Mirkes

E.M.

, Drug consumption data set. Available at https://archive.ics.uci.edu/ml/datasets/Drug+consumption+%28quantified%29, (2016).

55.

Feldman

, Computational fairness: Preventing machine-learned discrimination, 2015. URL https://scholarship.tricolib.brynmawr.edu/handle/10066/17628.

56.

Feldman

, Friedler

S.A.

, Moeller

, Scheidegger

, Venkatasubramanian

, Certifying and removing disparate impact, In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2015).

57.

Fish

, Kun

, Lelkes

, Fair boosting: a case study, In Workshop on Fairness, Accountability and Transparency in Machine Learning (2015).

58.

Fish

, Kun

, Lelkes

A.D.

, A confidence-based approach for balancing fairness and accuracy, In SIAM International Conference on Data Mining (2016).

59.

Fitzsimons

, Ali

A.A.

, Osborne

, Roberts

, Equality constrained decision trees: For the algorithmic enforcement of group fairness, arXiv preprint arXiv:1810.05041 (2018).

60.

Fukuchi

, Kamishima

and Sakuma

, Prediction with model-based neutrality, IEICE TRANSACTIONS on Information and Systems 98(8) (2015), 1503–1516.

61.

Gajane

, Pechenizkiy

, On formalizing fairness in prediction with machine learning, arXiv preprint arXiv:1710.03184 (2017).

62.

Genuer

, Variance reduction in purely random forests, Journal of Nonparametric Statistics 24(3) (2012), 543–562.

63.

Ghassami

, Khodadadian

, Kiyavash

, Fairness in supervised learning: An information theoretic approach, In IEEE International Symposium on Information Theory (2018).

64.

Gillen

, Jung

, Kearns

, Roth

, Online learning with an unknown fairness metric, In Neural Information Processing Systems (2018).

65.

Goh

, Cotter

, Gupta

, Friedlander

M.P.

, Satisfying real-world goals with dataset constraints, In Neural Information Processing Systems (2016).

66.

Goldstein

, School effectiveness data set. Available at http://www.bristol.ac.uk/cmm/learning/support/datasets/, (1987).

67.

Gordaliza

, Del Barrio

, Fabrice

, Jean-Michel

, Obtaining fairness using optimal transport theory, In International Conference on Machine Learning (2019).

68.

Gretton

, Borgwardt

, Rasch

, Schölkopf

, Smola

A.J.

, A kernel method for the two-sample-problem, In Neural Information Processing Systems (2007).

69.

Grgić-Hlača

, Zafar

M.B.

, Gummadi

K.P.

, Weller

, On fairness, diversity and randomness in algorithmic decision making, arXiv preprint arXiv:1706.10208 (2017).

70.

Guvenir

H.A.

, Acar

and Muderrisoglu

, Arrhythmia data set, Available at https://archive.ics.uci.edu/ml/datasets/Arrhythmia[datasets/Arrhythmia], (1998).

71.

Hajian

and Domingo-Ferrer

, A methodology for direct and indirect discrimination prevention in data mining, IEEE Transactions on Knowledge and Data Engineering 25(7) (2012), 1445–1459.

72.

Hajian

, Domingo-Ferrer

, Martinez-Balleste

, Rule protection for indirect discrimination prevention in data mining, In International Conference on Modeling Decisions for Artificial Intelligence (2011).

73.

Hajian

, Monreale

, Pedreschi

, Domingo-Ferrer

, Giannotti

, Injecting discrimination and privacy awareness into pattern discovery, In IEEE International Conference on Data Mining Workshops (2012).

74.

Hajian

, Domingo-Ferrer

and Farrás

, Generalizationbased privacy preservation and discrimination prevention in data publishing and mining, Data Mining and Knowledge Discovery 28(5-6) (2014), 1158–1188.

75.

Hajian

, Domingo-Ferrer

, Monreale

, Pedreschi

and Giannotti

, Discrimination-and privacy-aware patterns, Data Mining and Knowledge Discovery 29(6) (2015), 1733–1782.

76.

Hardt

, Price

, Srebro

, Equality of opportunity in supervised learning, In Neural Information Processing Systems (2016).

77.

Harper

F.M.

, Konstan

J.A.

, Movielens data set, Available at https://grouplens.org/datasets/movielens/, (2016).

78.

Hashimoto

T.B.

, Srivastava

, Namkoong

, Liang

, Fairness without demographics in repeated loss minimization, arXiv preprint arXiv:1806.08010 (2018).

79.

, Pan

, Jin

, Xu

, Liu

, Xu

, Shi

, Atallah

, Herbrich

, Bowers

, Candela

J.Q.

, Practical lessons from predicting clicks on ads at facebook, In International Workshop on Data Mining for Online Advertising (2014).

80.

Hébert-Johnson

, Kim

M.P.

, Reingold

, Rothblum

G.N.

, Calibration for the (computationally-identifiable) masses, arXiv preprint arXiv:1711.08513 (2017).

81.

Heidari

, Ferrari

, Gummadi

, Krause

, Fairness behind a veil of ignorance: A welfare analysis for automated decision making, In Neural Information Processing Systems (2018).

82.

Heidari

, Loi

, Gummadi

K.P.

, Krause

, A moral framework for understanding of fair ml through economic models of equality of opportunity, arXiv preprint arXiv:1809.03400 (2018).

83.

Henelius

, Puolamäki

, Boström

, Asker

and Papapetrou

, A peek into the black box: exploring classifiers by randomization, Data Mining and Knowledge Discovery 28(5-6) (2014), 1503–1529.

84.

Hoffman

, Kahn

L.B.

and Li

, Discretion in hiring, The Quarterly Journal of Economics 133(2) (2018), 765–800.

85.

Hofmann

, Statlog (german credit data) data set. Available at https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data), (1994).

86.

, Chen

, Fair classification and social welfare, arXiv preprint arXiv:1905.00147 (2019).

87.

Hussain

, Dahan

N.A.

, Ba-Alwib

F.M.

and Ribata

, Student s performance data set, Available at, (2018)–academic https://archive.ics.uci.edu/ml/datasets/Student+Academics+Performance.

88.

Jabbari

, Joseph

, Kearns

, Morgenstern

, Roth

, Fairness in reinforcement learning, In International Conference on Machine Learning (2017).

89.

Jagielski

, Kearns

, Mao

, Oprea

, Roth

S.S.-M.

, Ullman

, Differentially private fair learning, arXiv preprint arXiv:1812.02696 (2018).

90.

Janosi

, Steinbrunn

, Pfisterer

, Detrano

, Heart disease data set, Available at https://archive.ics.uci.edu/ml/datasets/Heart+Disease, (1988).

91.

Jiang

, Pacchiano

, Stepleton

, Jiang

, Chiappa

, Wasserstein fair classification, arXiv preprint arXiv:1907.12059 (2019).

92.

Johansson

, Shalit

, Sontag

, Learning representations for counterfactual inference, In International conference on machine learning (2016).

93.

Johndrow

J.E.

and Lum

, An algorithm for removing sensitive information: application to race-independent recidivism prediction, The Annals of Applied Statistics 13(1) (2019), 189–220.

94.

Johnson

K.D.

, Foster

D.P.

, Stine

R.A.

, Impartial predictive modeling: Ensuring fairness in arbitrary models, arXiv preprint arXiv:1608.00528 (2016).

95.

Joseph

, Kearns

, Morgenstern

, Neel

, Roth

, Rawlsian fairness for machine learning, arXiv preprint arXiv:1610.09559 (2016).

96.

Joseph

, Kearns

, Morgenstern

J.H.

, Roth

, Fairness in learning: Classic and contextual bandits, In Neural Information Processing Systems (2016).

97.

Kamiran

, Calders

, Classifying without discriminating, In International Conference on Computer, Control and Communication (2009).

98.

Kamiran

, Calders

, Classification with no discrimination by preferential sampling, In Machine Learning Conference (2010).

99.

Kamiran

and Calders

, Data preprocessing techniques for classification without discrimination, Knowledge and Information Systems 33(1) (2012), 1–33.

100.

Kamiran

, Calders

, Pechenizkiy

, Discrimination aware decision tree learning, In IEEE International Conference on Data Mining (2010).

101.

Kamiran

, Karim

, Zhang

, Decision theory for discrimination-aware classification, In IEEE International Conference on Data Mining (2012).

102.

Kamiran

, Žliobaitė

and Calders

, Quantifying explainable discrimination and removing illegal discrimination in automated decision making, Knowledge and Information Systems 35(3) (2013), 613–644.

103.

Kamishima

, Akaho

, Sakuma

, Fairness-aware learning through regularization approach, In International Conference on Data Mining Workshops (2011).

104.

Kamishima

, Akaho

, Asoh

, Sakuma

, Enhancement of the neutrality in recommendation, In ACM conference on Recommender Systems (2012).

105.

Kamishima

, Akaho

, Asoh

, Sakuma

, Fairnessaware classifier with prejudice remover regularizer, In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2012).

106.

Kamishima

, Akaho

, Asoh

, Sakuma

, The independence of fairness-aware classifiers, In IEEE International Conference on Data Mining Workshops (2013).

107.

Kearns

, Neel

, Roth

, Wu

Z.S.

, Preventing fairness gerrymandering: Auditing and learning for subgroup fairness, In International Conference on Machine Learning (2018).

108.

Kilbertus

, Carulla

M.R.

, Parascandolo

, Hardt

, Janzing

, Schölkopf

, Avoiding discrimination through causal reasoning, In Neural Information Processing Systems (2017).

109.

Kim

, Reingold

, Rothblum

, Fairness through computationally-bounded awareness, In Neural Information Processing Systems (2018).

110.

Kim

M.P.

, Ghorbani

, Zou

, Multiaccuracy: Black-box post-processing for fairness in classification, In AAAI/ACM Conference on AI, Ethics and Society (2019).

111.

Kleinberg

, Mullainathan

, Raghavan

, Inherent trade-offs in the fair determination of risk scores, In Innovations in Theoretical Computer Science Conference (2016).

112.

Kohavi

, Becker

, Census income data set, Available at https://archive.ics.uci.edu/ml/datasets/census+income, (1996).

113.

Komiyama

, Shimao

, Two-stage algorithm for fairness-aware machine learning, arXiv preprint arXiv:1710.04924 (2017).

114.

Komiyama

, Takeda

, Honda

, Shimao

, Nonconvex optimization for regression with fairness constraints, In International Conference on Machine Learning (2018).

115.

[115] Kourou

, Exarchos

T.P.

, Exarchos

K.P.

, Karamouzis

M.V.

and Fotiadis

D.I.

, Machine learning applications in cancer prognosis and prediction, Computational and Structural Biotechnology Journal 13 (2015), 8–17.

116.

Koyejo

, Natarajan

, Ravikumar

, Dhillon

, Consistent multilabel classification, In Neural Information Processing Systems (2015).

117.

Kusner

M.J.

, Loftus

, Russell

, Silva

, Counterfactual fairness, In Neural Information Processing Systems (2017).

118.

Lan

, Huan

, Discriminatory transfer, arXiv preprint arXiv:1707.00780 (2017).

119.

Larson

, Mattu

, Kirchner

, Angwin

, Propublica compas risk assessment data set, Available at https://github.com/propublica/compas-analysis, (2016).

120.

Lei.

, Classification with confidence. Biometrika, 101(4): 755– 769, 2014.

121.

Lim

T.S.

, Contraceptive method choice data set, Available at https://archive.ics.uci.edu/ml/datasets/Contraceptive+Method+Choice, (1997).

122.

Liu

, Luo

, Wang

, Tang

, CelebA data set, Available at http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html, (2015).

123.

Louizos

, Swersky

, Li

, Welling

, Zemel

, The variational fair autoencoder, arXiv preprint arXiv:1511.00830 (2015).

124.

Lum

, Johndrow

, A statistical framework for fair predictive algorithms, arXiv preprint arXiv:1610.08077 (2016).

125.

Luo

, Liu

, Koprinska

, Chen

, Discriminationaware association rule mining for unbiased data analytics, In International Conference on Big Data Analytics and Knowledge Discovery pages 108– 120. Springer, (2015).

126.

Luong

B.T.

, Ruggieri

, Turini

, k-nn as an implementation of situation testing for discrimination discovery and prevention, In ACM SIGKDD international conference on Knowledge discovery and data mining (2011).

127.

D.S.

, Correll

, Wittenbrink

, Chicago face data set, Available at https://chicagofaces.org/default/, (2015).

128.

Madras

, Creager

, Pitassi

, Zemel

, Learning adversarially fair and transferable representations, arXiv preprint arXiv:1802.06309 (2018).

129.

Madras

, Pitassi

, Zemel

, Predict responsibly: improving fairness and accuracy by learning to defer, In Neural Information Processing Systems (2018).

130.

Malekipirbazari

and Aksakalli

, Risk assessment in social lending via random forests, Expert Systems with Applications 42(10) (2015), 4621–4631.

131.

Mancuhan

, Clifton

, Discriminatory decision policy aware classification, In IEEE International Conference on Data Mining Workshops (2012).

132.

Mancuhan

and Clifton

, Combating discrimination using bayesian networks, Artificial Intelligence and Law 22(2) (2014), 211–238.

133.

Manisha

, Gujar

, A neural network framework for fair classifier, arXiv preprint arXiv:1811.00247 (2018).

134.

Mary

, Calauzenes

, El

, Karoui, Fairness-aware learning for continuous attributes and treatments, In International Conference on Machine Learning (2019).

135.

Maurer

, A note on the pac bayesian theorem, arXiv preprint cs/0411099 (2004).

136.

Maurer

, Transfer bounds for linear feature learning, Machine Learning 75(3) (2009), 327–350.

137.

McNamara

, Ong

C.S.

, Williamson

R.C.

, Provably fair representations, arXiv preprint arXiv:1710.04394 (2017).

138.

McNamara

, Ong

C.S.

, Williamson

, Costs and benefits of fair representation learning, In AAAI Conference on Artificial Intelligence, Ethics and Society (2019).

139.

Menon

A.K.

, Williamson

R.C.

, The cost of fairness in binary classification, In FAT (2018).

140.

Merler

, Ratha

, Feris

R.S.

, Smith

J.R.

, Diversity in faces data set, Available at https://research.ibm.com/artificial-intelligence/trustedai/diversity-in-faces/#highlights, (2019).

141.

Mitchell

, Potash

, Barocas

, Prediction-based decisions and fairness: A catalogue of choices, assumptions and definitions, arXiv preprint arXiv:1811.07867 (2018).

142.

Moro

, Cortez

, Rita

, Bank marketing data set, Available at https://archive.ics.uci.edu/ml/datasets/bank+marketing, (2014).

143.

Nabi

, Shpitser

, Fair inference on outcomes, In AAAI Conference on Artificial Intelligence (2018).

144.

Nabi

, Malinsky

, Shpitser

, Learning optimal fair policies, arXiv preprint arXiv:1809.02244 (2018).

145.

Narasimhan

, Learning with complex loss functions and constraints, In International Conference on Artificial Intelligence and Statistics (2018).

146.

Network

H.P.

, Heritage health date set, Available at https://www.kaggle.com/c/hhp/data, (2011).

147.

Noriega-Campero

, Bakker

M.A.

, Garcia-Bulle

, Pentland

, Active fairness in algorithmic decision making, In AAAI/ACM Conference on AI, Ethics and Society (2019).

148.

B. of Labor Statistics, National longitudinal surveys of youth data set, Available at https://www.bls.gov/nls/, (2019).

149.

Olfat

, Aswani

, Spectral algorithms for computing fair support vector machines, arXiv preprint arXiv:1710.05895 (2017).

150.

Oneto

, Siri

, Luria

, Anguita

, Dropout prediction at university of genoa: a privacy preserving data driven approach, In European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (2017).

151.

Oneto

, Donini

, Elders

, Pontil

, Taking advantage of multitask learning for fair classification, In AAAI/ACM Conference on AI, Ethics and Society (2019).

152.

Oneto

, Donini

, Maurer

, Pontil

, Learning fair and transferable representations, arXiv preprint arXiv:1906.10673 (2019).

153.

Oneto

, Donini

, Pontil

, General fair empirical risk minimization, arXiv preprint arXiv:1901.10080 (2019).

154.

C.P. Wine quality data set. Available at https://archive.ics.uci.edu/ml/datasets/Wine+Quality, (2009).

155.

Papamitsiou

and Economides

A.A.

, Learning analytics and educational data mining in practice: A systematic literature review of empirical evidence, Journal of Educational Technology & Society 17(4) (2014), 49–64.

156.

Pearl

, Causality: models, reasoning and inference, Springer (2000).

157.

Pearl

, Glymour

, Jewell

N.P.

, Causal inference in statistics: A primer, John Wiley & Sons (2016).

158.

Pedreschi

, Ruggieri

, Turini

, Measuring discrimination in socially-sensitive decision records, In SIAM International Conference on Data Mining (2009).

159.

Pedreshi

, Ruggieri

, Turini

, Discrimination-aware data mining, In ACM SIGKDD international conference on Knowledge discovery and data mining (2008).

160.

Pérez-Suay

, Laparra

, Mateo-García

, Muñoz-Marí

, Gómez-Chova

, Camps-Valls

, Fair kernel learning, In Machine Learning and Knowledge Discovery in Databases (2017).

161.

Perlich

, Dalessandro

, Raeder

, Stitelman

and Provost

, Machine learning for targeted display advertising: Transfer learning in action, Machine Learning 95(1) (2014), 103–127.

162.

Peters

, Janzing

, Schölkopf

, Elements of causal inference: foundations and learning algorithms, MIT press (2017).

163.

Pleiss

, Raghavan

, Wu

, Kleinberg

, Weinberger

K.Q.

, On fairness and calibration, In Neural Information Processing Systems (2017).

164.

Quadrianto

, Sharmanska

, Recycling privileged learning and distribution matching for fairness, In Neural Information Processing Systems (2017).

165.

Quionero-Candela

, Sugiyama

, Schwaighofer

, Lawrence

N.D.

, Dataset shift in machine learning, The MIT Press (2009).

166.

Qureshi

, Kamiran

, Karim

, Ruggieri

, Causal discrimination discovery through propensity score analysis, arXiv preprint arXiv:1608.03735 (2016).

167.

Raff

, Sylvester

, Gradient reversal against discrimination, arXiv preprint arXiv:1807.00392 (2018).

168.

Raff

, Sylvester

, Mills

, Fair forests: Regularized tree induction to minimize model bias, In AAAI/ACM Conference on AI, Ethics and Society (2018).

169.

Redmond

, Communities and crime data set. Available at http://archive.ics.uci.edu/ml/datasets/communities+and+crime, (2009).

170.

Rosenberg

, Levinson

, Trump’s catch-anddetain policy snares many who call the u.s. home, (2018). URL https://www.reuters.com/investigates/special-report/usaimmigration-court.

171.

Russell

, Kusner

M.J.

, Loftus

, Silva

, When worlds collide: integrating different counterfactual assumptions in fairness, In Neural Information Processing Systems (2017).

172.

Sadinle

, Lei

and Wasserman

, Least ambiguous setvalued classifiers with bounded error levels, Journal of the American Statistical Association (2018), 1–12.

173.

Schölkopf

, Herbrich

, Smola

, A generalized representer theorem, In Computational Learning Theory (2001).

174.

Scornet

, Biau

and Vert

, Consistency of random forests, Ann Statist 43(4) (2015), 1716–1741. 08.

175.

Shalev-Shwartz

, Ben-David

, Understanding machine learning: From theory to algorithms, Cambridge University Press (2014).

176.

Shawe-Taylor

, Cristianini

, Kernel methods for pattern analysis, Cambridge University Press (2004).

177.

Smola

A.J.

, Schölkopf

, Learning with Kernels, MIT Press (2001).

178.

Song

, Kalluri

, Grover

, Zhao

, Ermon

, Learning controllable fair representations, arXiv preprint arXiv:1812.04218 (2018).

179.

Speicher

, Heidari

, Grgic-Hlaca

, Gummadi

K.P.

, Singla

, Weller

, Zafar

M.B.

, A unified approach to quantifying algorithmic unfairness: Measuring individual & group unfairness via inequality indices, In ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2018).

180.

Srebro

, Learning with matrix factorizations, (2004).

181.

Strack

, DeShazo

J.P.

, Gennings

, Olmo

J.L.

, Ventura

, Cios

K.J.

and Clore

J.N.

, Diabetes 130-us hospitals for years – data set. Available at, (2014)–https://archive.ics.uci.edu/ml/datasets/Diabetes+130-US+hospitals+for+years+1999-2008.

182.

Vaithianathan

, Maloney

, Putnam-Hornstein

and Jiang

, Children in the public benefit system at risk of maltreatment: Identification via predictive modeling, American Journal of Preventive Medicine 45(3) (2013), 354–359.

183.

Van

, de Geer, High-dimensional generalized linear models and the lasso, The Annals of Statistics 36(2) (2008), 614–645.

184.

Verma

, Rubin

, Fairness definitions explained, In IEEE/ACM International Workshop on Software Fairness (2018).

185.

Wadsworth

, Vera

, Piech

, Achieving fairness through adversarial learning: an application to recidivism prediction, arXiv preprint arXiv:1807.00199 (2018).

186.

Wang

, Koike-Akino

, Erdogmus

, Invariant representations from adversarially censored autoencoders, arXiv preprint arXiv:1805.08097 (2018).

187.

Wightman

L.F.

, Law school admissions, Available at https://www.lsac.org/data-research, (1998).

188.

Williamson

R.C.

, Menon

A.K.

, Fairness risk measures, arXiv preprint arXiv:1901.08665 (2019).

189.

Woodworth

, Gunasekar

, Ohannessian

M.I.

, Srebro

, Learning non-discriminatory predictors, In Computational Learning Theory, (2017).

190.

, Wu

, Using loglinear model for discrimination discovery and prevention, In IEEE International Conference on Data Science and Advanced Analytics, (2016).

191.

Yan

, Koyejo

, Zhong

, Ravikumar

, Binary classification with karmic, threshold-quasi-concave metrics, In International Conference on Machine Learning (2018).

192.

Yang

, Stoyanovich

, Measuring fairness in ranked outputs, In International Conference on Scientific and Statistical Database Management, (2017).

193.

Yang

, Minimax nonparametric classification: Rates of convergence, IEEE Transactions on Information Theory 45(7) (1999), 2271–2284.

194.

Yao

, Huang

, Beyond parity: Fairness objectives for collaborative filtering, In Neural Information Processing Systems, (2017).

195.

Yeh

I.C.

, Lien

C.H.

, Default of credit card clients data set, Available at https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients, (2016).

196.

Yona

, Rothblum

, Probably approximately metricfair learning, In International Conference on Machine Learning, (2018).

197.

Zafar

M.B.

, Valera

, Gomez Rodriguez

, Gummadi

K.P.

, Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment, In International Conference onWorld Wide Web, (2017).

198. Zafar M.B., Valera, Gomez Rodriguez, Gummadi K.P., Fairness constraints: Mechanisms for fair classification, In International Conference on Artificial Intelligence and Statistics (2017).

199. Zafar M.B., Valera, Gomez Rodriguez, Gummadi, Weller, From parity to preference-based notions of fairness in classification, In Neural Information Processing Systems (2017).

200. Zafar M.B., Valera, Gomez-Rodriguez, Gummadi K.P., Fairness constraints: A flexible approach for fair classification, Journal of Machine Learning Research 20(75) (2019), 1–42.

201. Zehlike, Hacker, Wiedemann, Matching code and law: Achieving algorithmic fairness with optimal transport, arXiv preprint arXiv:1712.07924 (2017).

202. Zemel, Wu, Swersky, Pitassi, Dwork, Learning fair representations, In International Conference on Machine Learning (2013).

203. Zhang B.H., Lemoine, Mitchell, Mitigating unwanted biases with adversarial learning, In AAAI/ACM Conference on AI, Ethics and Society (2018).

204. Zhang, Bareinboim, Fairness in decision-making - the causal explanation formula, In AAAI Conference on Artificial Intelligence (2018).

205. Zhang, Wu, Anti-discrimination learning: a causal modeling-based framework, International Journal of Data Science and Analytics 4(1) (2017), 1–16.

206. Zhang, Wu, A causal framework for discovering and removing direct and indirect discrimination, arXiv preprint arXiv:1611.07509 (2016).

207. Zhang, Wu, Achieving non-discrimination in data release, In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2017).

208. Zhang, Wu, Achieving non-discrimination in prediction, arXiv preprint arXiv:1703.00060 (2017).

209. Zhao M.J., Edakunni, Pocock, Brown, Beyond Fano’s inequality: bounds on the optimal F-score, BER, and cost-sensitive risk and their implications, Journal of Machine Learning Research 14 (2013), 1033–1090.

210. Zliobaite, Kamiran, Calders, Handling conditional discrimination, In IEEE International Conference on Data Mining (2011).