Fairness notions and metrics

In brief

The term fairness is defined as the quality or state of being fair, or a lack of favoritism towards one side. The notions of fairness, and the quantitative measures of them (fairness metrics), can be distinguished according to whether they focus on individuals, groups, or sub-groups.

More in detail

The term fairness is defined as the quality or state of being fair, or a lack of favoritism towards one side. However, like bias, fairness can mean different things to different people, in different contexts, and in different disciplines. The definitions of fairness used in various disciplines are detailed in [1]. An unfair model produces results that are skewed towards particular individuals or groups. The primary source of this unfairness is the presence of biases. Two categories of bias play a crucial role in fairness: (i) technical bias and (ii) social bias. Technical biases can be traced back to their sources, but social biases are very difficult to fix, as they are a matter of politics, perspectives, and shifts in prejudices and preconceptions that can take years to change [3]. Most state-of-the-art techniques tackle technical errors, but they cannot resolve the root causes of bias. Based on this observation, Wachter et al. [3] describe three responses to algorithmic bias and the resulting social inequality. The first is to do nothing to fix biases, which is not an active choice and allows the system to get worse. The second is to incorporate techniques that fix technical errors and maintain the status quo, ensuring that the system does not make matters worse. Much work on fairness has focused on this option, called ‘bias preserving fairness’, which takes the status quo as a baseline and aligns with the formal equality of EU non-discrimination law. The third response, ‘bias transforming fairness’, focuses on the substantive equality of EU non-discrimination law, which can only be achieved by accounting for historical (social) inequalities. As argued in [3], users (developers, deployers, etc.) should give preference to ‘bias transforming’ fairness metrics when a fairness metric is used to make substantive decisions about people in contexts where significant disparity has previously been observed.

Notions of fairness fall into three categories: individual, group, and sub-group fairness. Individual fairness requires that similar individuals be treated similarly. It relies on distance measures to evaluate the similarity of individuals [4, 5]. Group fairness, on the other hand, compares quantities at the group level, where groups are primarily identified by protected features such as gender or ethnicity [6, 7]. Sub-group fairness is stricter than group fairness: it requires fairness with respect to one or more structured sub-groups defined by sensitive features, and thereby interpolates between the individual and group fairness notions [8]. According to [9], it is impossible to satisfy all of the above notions simultaneously, which leads to conflicts between fairness definitions. One suggestion is therefore to select the fairness criteria appropriate to the application and deployment context. Another concern, raised in [10], is that, because of temporal effects, fairness notions may harm the sensitive groups over time if they are not updated.

Some widely used fairness metrics: In order to recall some widely used fairness metrics we need to introduce some notation. Let \(V\), \(A\), and \(X\) be three random variables representing, respectively, the total set of features, the sensitive features, and the remaining features describing an individual such that \(V=(X,A)\) and \(P(V=v_i)\) represents the probability of drawing an individual with a vector of values \(v_i\) from the population. For simplicity, we focus on the case where \(A\) is a binary random variable where \(A=0\) designates the protected group, while \(A=1\) designates the non-protected group. Let \(Y\) represent the actual outcome and \(\hat{Y}\) represent the outcome returned by the prediction algorithm. Without loss of generality, assume that \(Y\) and \(\hat{Y}\) are binary random variables where \(Y=1\) designates a positive instance, while \(Y=0\) a negative one. Typically, the predicted outcome \(\hat{Y}\) is derived from a score represented by a random variable \(S\) where \(P[S = s]\) is the probability that the score value is equal to \(s\).
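As a minimal sketch of this notation in code, the following Python snippet counts true positives, false positives, false negatives, and true negatives separately for each value of \(A\). The function name `confusion_by_group` and the toy data are assumptions made purely for illustration; they do not come from any of the cited works.

```python
from collections import Counter

def confusion_by_group(y_true, y_pred, group):
    """Return {a: Counter over 'TP', 'FP', 'FN', 'TN'} for each group value a."""
    counts = {}
    for y, y_hat, a in zip(y_true, y_pred, group):
        # First letter: was the prediction correct? Second letter: what was predicted?
        cell = ("T" if y == y_hat else "F") + ("P" if y_hat == 1 else "N")
        counts.setdefault(a, Counter())[cell] += 1
    return counts

# Toy data (made up): A=0 is the protected group, A=1 the non-protected group.
y_true = [1, 0, 1, 0, 1, 1, 0, 0]   # actual outcomes Y
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]   # predicted outcomes Y_hat
group  = [0, 0, 0, 0, 1, 1, 1, 1]   # sensitive feature A

counts = confusion_by_group(y_true, y_pred, group)
for a in sorted(counts):
    print(a, dict(counts[a]))
```

Most of the group-level metrics below can be read off such per-group counts; cells that never occur are simply reported as zero by `Counter`.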

Statistical parity [11] is one of the most commonly accepted notions of fairness. It requires the prediction to be statistically independent of the sensitive feature \((\hat{Y} \perp A)\). In other words, the predicted acceptance rates for both protected and unprotected groups should be equal. Statistical parity implies that
\(\displaystyle \frac{TP+FP}{TP+FP+FN+TN}\) (see note 1)
is equal for both groups. A classifier \(\hat{Y}\) satisfies statistical parity if:
\(\label{eq:sp} P[\hat{Y} = 1 \mid A = 0] = P[\hat{Y} = 1 \mid A = 1].\)
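The statistical parity condition can be checked empirically by comparing the fraction of positive predictions in each group. The sketch below is one possible way to do so; the helper names and the toy data are hypothetical, and on finite samples one would usually compare the difference against a tolerance rather than require exact equality.

```python
def positive_rate(y_pred, group, a):
    """Estimate P[Y_hat = 1 | A = a] from samples."""
    preds = [p for p, g in zip(y_pred, group) if g == a]
    return sum(preds) / len(preds)

def statistical_parity_difference(y_pred, group):
    """P[Y_hat=1 | A=0] - P[Y_hat=1 | A=1]; zero means parity on this sample."""
    return positive_rate(y_pred, group, 0) - positive_rate(y_pred, group, 1)

# Toy data (made up): A=0 is the protected group.
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

spd = statistical_parity_difference(y_pred, group)
print(spd)  # -0.25: the protected group is accepted less often
```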

Conditional statistical parity [12] is a variant of statistical parity obtained by controlling for a set of resolving features (see note 2). The resolving features (we refer to them as \(R\)) among \(X\) are correlated with the sensitive feature \(A\) and, at the same time, give some factual information about the label, which leads to legitimate discrimination. Conditional statistical parity holds if:
\(\label{eq:csp} P[\hat{Y}=1 \mid R=r,A = 0] = P[\hat{Y}=1 \mid R=r,A = 1] \quad \forall r \in range(R).\)
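A minimal sketch of this check, assuming a single discrete resolving feature, is to compute the statistical parity gap separately within each stratum \(R=r\). The function name, the feature values `"high"` and `"low"`, and the data are assumptions for illustration only.

```python
from collections import defaultdict

def conditional_parity_gaps(y_pred, group, resolving):
    """For each value r of R, return P[Y_hat=1 | R=r, A=0] - P[Y_hat=1 | R=r, A=1].

    Strata in which one of the two groups is absent are skipped, because the
    conditional probability cannot be estimated there.
    """
    cells = defaultdict(lambda: {0: [], 1: []})
    for p, a, r in zip(y_pred, group, resolving):
        cells[r][a].append(p)
    gaps = {}
    for r, by_group in cells.items():
        if by_group[0] and by_group[1]:
            rate0 = sum(by_group[0]) / len(by_group[0])
            rate1 = sum(by_group[1]) / len(by_group[1])
            gaps[r] = rate0 - rate1
    return gaps

# Toy data (made up): the resolving feature R is unevenly distributed over groups.
y_pred    = [1, 1, 0, 0, 1, 1, 1, 0]
group     = [0, 0, 0, 0, 1, 1, 1, 1]
resolving = ["high", "high", "low", "low", "high", "high", "high", "low"]

gaps = conditional_parity_gaps(y_pred, group, resolving)
print(gaps)  # {'high': 0.0, 'low': 0.0}
```

In this toy sample the overall acceptance rates differ (0.5 for \(A=0\) versus 0.75 for \(A=1\)), so statistical parity fails, yet the gap vanishes within every stratum of \(R\), so conditional statistical parity holds.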

Equalized odds [13] considers both the predicted and the actual outcomes. It requires the prediction to be conditionally independent of the protected feature, given the actual outcome \((\hat{Y} \perp A \mid Y)\). In other words, equalized odds requires both sub-populations to have the same true positive rate \(TPR = \frac{TP}{TP+FN}\) and the same false positive rate \(FPR = \frac{FP}{FP+TN}\):
\(\label{eq:eqOdds} P[\hat{Y} = 1 \mid Y=y,\; A=0] = P[\hat{Y}=1 \mid Y= y,\; A=1] \quad \forall{ y \in \{0,1\}}.\)

Because the equalized odds requirement is rarely satisfied in practice, two variants can be obtained by relaxing it. The first one is called equal opportunity [13] and is obtained by requiring only TPR equality among groups:
\(\label{eq:eqOpp} P[\hat{Y}=1 \mid Y=1,A = 0] = P[\hat{Y}=1\mid Y=1,A = 1].\)
As \(TPR\) does not take into consideration \(FP\), equal opportunity is completely insensitive to the number of false positives.

The second relaxed variant of equalized odds is called predictive equality [12] which requires only the FPR to be equal in both groups:
\(\label{eq:predEq} P[\hat{Y}=1 \mid Y=0,A = 0] = P[\hat{Y}=1\mid Y=0,A = 1].\)
Since \(FPR\) is independent from \(FN\), predictive equality is completely insensitive to false negatives.
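Equalized odds and its two relaxations can be checked together by estimating \(P[\hat{Y}=1 \mid Y=y, A=a]\) for each group and each actual outcome. The sketch below is illustrative only: the helper name, the tolerance `tol`, and the data are assumptions, and the code presumes every (group, outcome) cell is non-empty.

```python
def conditional_positive_rate(y_true, y_pred, group, a, y):
    """Estimate P[Y_hat = 1 | Y = y, A = a] (TPR for y=1, FPR for y=0)."""
    preds = [p for t, p, g in zip(y_true, y_pred, group) if g == a and t == y]
    return sum(preds) / len(preds)  # raises ZeroDivisionError on an empty cell

# Toy data (made up): A=0 is the protected group.
y_true = [1, 0, 1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

tpr_gap = (conditional_positive_rate(y_true, y_pred, group, 0, 1)
           - conditional_positive_rate(y_true, y_pred, group, 1, 1))
fpr_gap = (conditional_positive_rate(y_true, y_pred, group, 0, 0)
           - conditional_positive_rate(y_true, y_pred, group, 1, 0))

tol = 1e-9  # hypothetical tolerance; larger values are common on real samples
equal_opportunity   = abs(tpr_gap) <= tol
predictive_equality = abs(fpr_gap) <= tol
equalized_odds      = equal_opportunity and predictive_equality

print(tpr_gap, fpr_gap)                                        # -0.5 0.0
print(equal_opportunity, predictive_equality, equalized_odds)  # False True False
```

Here the false positive rates coincide, so predictive equality holds, while the true positive rates differ, so neither equal opportunity nor equalized odds is satisfied.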

Conditional use accuracy equality [14] is achieved when all population groups have equal positive predictive value \(PPV=\frac{TP}{TP+FP}\) and negative predictive value \(NPV=\frac{TN}{FN+TN}\). In other words, the probability that subjects with a positive predicted outcome truly belong to the positive class, and the probability that subjects with a negative predicted outcome truly belong to the negative class, should be the same across groups. In contrast to equalized odds, one conditions on the algorithm’s predicted outcome, not on the actual outcome. That is, the emphasis is on the precision of the prediction rather than its recall:
\(\label{eq:condUseAcc} P[Y=y\mid \hat{Y}=y ,A = 0] = P[Y=y\mid \hat{Y}=y,A = 1] \quad \forall{ y \in \{0,1\}}.\)

Predictive parity [15] is a relaxation of conditional use accuracy equality requiring only equal \(PPV\) among groups:
\(\label{eq:predPar} P[Y=1 \mid \hat{Y} =1,A = 0] = P[Y=1\mid \hat{Y} =1,A = 1].\)
Like predictive equality, predictive parity is insensitive to false negatives.
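Conditional use accuracy equality and predictive parity both condition on the predicted outcome, which the following sketch makes explicit by estimating \(P[Y=y \mid \hat{Y}=y, A=a]\). The helper name and the data are hypothetical, and the code assumes that each group receives at least one positive and one negative prediction.

```python
def predictive_value(y_true, y_pred, group, a, y):
    """Estimate P[Y = y | Y_hat = y, A = a] (PPV for y=1, NPV for y=0)."""
    actual = [t for t, p, g in zip(y_true, y_pred, group) if g == a and p == y]
    return sum(1 for t in actual if t == y) / len(actual)

# Toy data (made up): A=0 is the protected group.
y_true = [1, 0, 1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

ppv = {a: predictive_value(y_true, y_pred, group, a, 1) for a in (0, 1)}
npv = {a: predictive_value(y_true, y_pred, group, a, 0) for a in (0, 1)}

print(ppv)  # PPV per group: 1/2 for A=0, 2/3 for A=1
print(npv)  # NPV per group: 1/2 for A=0, 1 for A=1
```

Since the PPV values differ between the groups, predictive parity fails on this sample, and therefore so does conditional use accuracy equality.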

Overall accuracy equality [14] is achieved when overall accuracy for both groups is the same. This implies that

\[\label{eq:accuracy} \frac{TP+TN}{TP+FN+FP+TN}\]

is equal for both groups:

\[\label{eq:ovAcc} P[\hat{Y} = Y | A = 0] = P[\hat{Y} = Y | A = 1]\]

Treatment equality [14] is achieved when the ratio of FNs to FPs is the same for both protected and unprotected groups:
\(\label{eq:treatEq} \left.\frac{FN}{FP}\right|_{A=0} = \left.\frac{FN}{FP}\right|_{A=1}.\)
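Overall accuracy equality and treatment equality can diverge: two groups may have the same accuracy while their errors are of different kinds. The sketch below works directly from per-group confusion counts; the counts are invented for illustration, and returning `None` when \(FP = 0\) is one possible convention for the undefined ratio, not something prescribed by [14].

```python
def accuracy(c):
    """(TP + TN) / (TP + FN + FP + TN) for one group's confusion counts."""
    return (c["TP"] + c["TN"]) / (c["TP"] + c["FN"] + c["FP"] + c["TN"])

def fn_fp_ratio(c):
    """FN / FP for one group; None when FP == 0, where the ratio is undefined."""
    return c["FN"] / c["FP"] if c["FP"] else None

# Hypothetical confusion counts for the two groups (100 individuals each).
protected     = {"TP": 40, "FP": 10, "FN": 20, "TN": 30}   # A = 0
non_protected = {"TP": 45, "FP": 20, "FN": 10, "TN": 25}   # A = 1

print(accuracy(protected), accuracy(non_protected))        # 0.7 0.7
print(fn_fp_ratio(protected), fn_fp_ratio(non_protected))  # 2.0 0.5
```

Both groups are classified with 70% accuracy, so overall accuracy equality holds, but the protected group suffers twice as many false negatives as false positives while the opposite is true for the other group, so treatment equality fails.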

Total fairness [14] holds when all aforementioned fairness notions are satisfied simultaneously, that is, statistical parity, equalized odds, conditional use accuracy equality (hence, overall accuracy equality), and treatment equality. Total fairness is a very strong notion that is very difficult to satisfy in practice.

Balance [9] uses the score (\(S\)) from which the predicted outcome \(\hat{Y}\) is typically derived through thresholding.
Balance for the positive class focuses on the applicants who constitute positive instances and is satisfied if the average score \(S\) received by those applicants is the same for both groups:
\(\label{eq:balPosclass} E[S \mid Y =1,A = 0] = E[S \mid Y =1,A = 1].\)
Balance for the negative class focuses instead on the negative instances:
\(\label{eq:balNegclass} E[S \mid Y =0,A = 0] = E[S \mid Y =0,A = 1].\)
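The two balance conditions compare average scores rather than thresholded predictions. A minimal sketch, with a hypothetical helper name and made-up scores, could look as follows.

```python
def mean_score(scores, y_true, group, a, y):
    """Estimate E[S | Y = y, A = a] from samples."""
    selected = [s for s, t, g in zip(scores, y_true, group) if g == a and t == y]
    return sum(selected) / len(selected)

# Toy data (made up): A=0 is the protected group.
scores = [0.9, 0.7, 0.2, 0.4, 0.8, 0.8, 0.1, 0.3]
y_true = [1,   1,   0,   0,   1,   1,   0,   0]
group  = [0,   0,   0,   0,   1,   1,   1,   1]

pos_gap = mean_score(scores, y_true, group, 0, 1) - mean_score(scores, y_true, group, 1, 1)
neg_gap = mean_score(scores, y_true, group, 0, 0) - mean_score(scores, y_true, group, 1, 0)

print(round(pos_gap, 6), round(neg_gap, 6))
```

On this sample the positive instances of both groups receive an average score of about 0.8, so balance for the positive class holds (up to floating-point rounding), whereas negative instances of the protected group receive a higher average score (about 0.3 versus 0.2), so balance for the negative class does not.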

Calibration [15] holds if, for each predicted probability score \(S=s\), individuals in all groups have the same probability of actually belonging to the positive class:
\(\label{eq:calib} P[Y =1 \mid S =s,A = 0] = P[Y =1 \mid S =s,A = 1] \quad \forall s \in [0,1].\)

Well-calibration [9] is a stronger variant of calibration. It requires that (1) calibration is satisfied, (2) the score is interpreted as the probability of truly belonging to the positive class, and (3) for each score \(S=s\), the probability of truly belonging to the positive class is equal to that particular score:
\(\label{eq:wellCalib} P[Y =1 \mid S =s,A = 0] = P[Y =1 \mid S =s,A = 1] = s \quad \forall \; {s \in [0,1]}.\)
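The difference between calibration and well-calibration can be sketched in code under the simplifying assumption that the score takes only a few discrete values (continuous scores would first have to be binned, which is not shown here). The function name, the tolerance, and the data are assumptions for illustration.

```python
from collections import defaultdict

def calibration_table(scores, y_true, group):
    """Return {(s, a): empirical P[Y = 1 | S = s, A = a]} for discrete scores."""
    cells = defaultdict(list)
    for s, y, a in zip(scores, y_true, group):
        cells[(s, a)].append(y)
    return {key: sum(ys) / len(ys) for key, ys in cells.items()}

# Toy data (made up): eight individuals per group, two possible score values.
scores = [0.75] * 4 + [0.25] * 4 + [0.75] * 4 + [0.25] * 4
y_true = [1, 1, 1, 0,  0, 0, 0, 1,  1, 1, 0, 1,  1, 0, 0, 0]
group  = [0] * 8 + [1] * 8

table = calibration_table(scores, y_true, group)
tol = 1e-9  # hypothetical tolerance

# Calibration: for each score, the positive rate is the same in both groups.
calibrated = all(abs(table[(s, 0)] - table[(s, 1)]) < tol for s in set(scores))
# Well-calibration: in addition, that positive rate equals the score itself.
well_calibrated = all(abs(rate - s) < tol for (s, _), rate in table.items())

print(calibrated, well_calibrated)  # True True
```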

Fairness through awareness [11] implies that similar individuals should have similar predictions. Let \(i\) and \(j\) be two individuals represented by their attribute value vectors \(v_i\) and \(v_j\). Let \(d(v_i,v_j)\) represent the similarity distance between individuals \(i\) and \(j\). Let \(M(v_i)\) represent the probability distribution over the outcomes of the prediction. For example, if the outcome is binary (\(0\) or \(1\)), \(M(v_i)\) might be \([0.2,0.8]\), which means that for individual \(i\), \(P[\hat{Y}=0] = 0.2\) and \(P[\hat{Y}=1] = 0.8\). Let \(d_M\) be a distance metric between probability distributions. Fairness through awareness is achieved iff, for any pair of individuals \(i\) and \(j\):
\(d_M(M(v_i), M(v_j)) \leq d(v_i, v_j).\)
In practice, fairness through awareness assumes that the similarity metric is known for each pair of individuals [16]. A challenging aspect of this approach is therefore the difficulty of determining an appropriate metric function to measure the similarity between two individuals. Typically, this requires careful human intervention from professionals with domain expertise [17].
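The pairwise condition above can be audited by brute force on a small sample. In the sketch below, total variation distance is used as one possible choice of \(d_M\), and the similarity metric `d` is a purely hypothetical stand-in (a scaled difference of a single feature), since the appropriate task-specific metric has to come from domain experts.

```python
from itertools import combinations

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def lipschitz_violations(features, outputs, d, d_M=total_variation):
    """Return index pairs (i, j) with d_M(M(v_i), M(v_j)) > d(v_i, v_j)."""
    return [
        (i, j)
        for i, j in combinations(range(len(features)), 2)
        if d_M(outputs[i], outputs[j]) > d(features[i], features[j])
    ]

def d(v_i, v_j):
    """Hypothetical similarity metric: scaled gap in the first feature."""
    return abs(v_i[0] - v_j[0]) / 10.0

# Toy data (made up): one feature per individual, and M(v) = [P(Y_hat=0), P(Y_hat=1)].
features = [(5.0,), (6.0,), (9.0,)]
outputs  = [[0.2, 0.8], [0.6, 0.4], [0.5, 0.5]]

violations = lipschitz_violations(features, outputs, d)
print(violations)  # [(0, 1)]
```

Individuals 0 and 1 are close under `d` (distance 0.1) but receive quite different output distributions (distance about 0.4), so this pair violates the condition; the other two pairs satisfy it. The check is quadratic in the number of individuals, which is acceptable for a sketch but not for large populations.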

Process fairness [18] (or procedural fairness) can be described as a set of subjective fairness notions that are centered on the process that leads to outcomes. These notions do not focus on the fairness of the outcomes; instead, they quantify the fraction of users who consider the use of a particular set of features to be fair. They are subjective because they depend on user judgments, which may be obtained by subjective reasoning.

A natural approach to improving process fairness is to remove all sensitive (protected or salient) features before training classifiers. This simple approach connects process fairness to fairness through unawareness. However, there is a trade-off to manage, since dropping sensitive features may negatively affect classification performance [19].
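As a minimal sketch of this preprocessing step, the sensitive columns can be dropped from each record before the data reaches the training routine. The function name and the feature names (`age`, `income`, `gender`) are hypothetical, and no particular learning library is assumed.

```python
def drop_sensitive(rows, sensitive):
    """Return copies of the feature dicts without the sensitive keys."""
    return [{k: v for k, v in row.items() if k not in sensitive} for row in rows]

# Toy records (made up); "gender" plays the role of the sensitive feature A.
rows = [
    {"age": 34, "income": 52000, "gender": "F"},
    {"age": 41, "income": 61000, "gender": "M"},
]

X_unaware = drop_sensitive(rows, sensitive={"gender"})
print(X_unaware)  # [{'age': 34, 'income': 52000}, {'age': 41, 'income': 61000}]
```

The original records are left untouched, so the sensitive feature remains available for evaluating the group metrics described earlier, which still require \(A\).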

Nonstatistical fairness metrics: More recently, further metrics have been proposed that differ from the previous ones in that they do not rely solely on statistical considerations: they take into account domain knowledge that is not directly observable from data, require expert input, or reason about hypothetical situations. As they fall outside the scope of this chapter, we will not delve into them further and simply mention a few for the interested reader: total effect [20] (the “causal” version of statistical parity, which measures the effect of changing the value of an attribute, taking into account a given causal graph), effect of treatment on the treated [20] (which relies on counterfactuals with respect to sensitive features and measures the difference between the probabilities of instances and their counterfactuals), and counterfactual fairness [17] (a fine-grained variant of the previous one, but with respect to the set of all features).

Discussion: Because the above fairness metrics often conflict, and it is not possible to be fair according to all of these definitions, choosing the relevant metric to focus on is a challenge. While this is still very much an open research area, some suggestions on how to deal with conflicts between fairness metrics can be found in [2, 21]. In addition, fairness metrics frequently conflict with other objectives such as accuracy and privacy. For example, [22] show that, in a credit scoring case, enforcing fairness metrics can lead to significant drops in accuracy and, thus, in maximum profit. Such trade-offs are common: improvements in fairness often result in lower accuracy, and research on the Pareto frontier for this trade-off is now emerging [23, 24]. Similarly, there is a trade-off between fairness and privacy, as fairness metrics typically require sensitive information in order to be used. As a result, fairness affects privacy (and vice versa), for example in facial recognition [25] and medical applications [26].
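One way to see why conflicts between the metrics arise is to combine the confusion-matrix definitions used above into a single identity. The following is only a sketch of the reasoning; the base-rate symbol \(p_a = P[Y=1 \mid A=a]\) and the group size \(n_a\) are introduced here for illustration. Within group \(a\), \(TP = TPR_a \, p_a \, n_a\) and \(FP = FPR_a \, (1-p_a) \, n_a\), so that

\[PPV_a = \frac{TP}{TP+FP} = \frac{TPR_a \, p_a}{TPR_a \, p_a + FPR_a \, (1-p_a)}\]

Solving for \(FPR_a\), assuming \(0 < p_a < 1\) and \(PPV_a > 0\), gives

\[FPR_a = \frac{p_a}{1-p_a} \cdot \frac{1-PPV_a}{PPV_a} \cdot TPR_a\]

If the two groups have different base rates (\(p_0 \neq p_1\)) but the same \(PPV\) and the same \(TPR\), the right-hand side differs between the groups unless it is zero (for example, when there are no false positives), and so the \(FPR\) must differ as well. Under these assumptions, predictive parity and equalized odds cannot both hold for an imperfect classifier.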

Finally, there is a connection between fairness and justice, seen for example in Rawls’ work on Justice as Fairness [27]. Indeed, a range of theories of (distributive) justice describe how benefits and burdens should be distributed (cf. the entry on ). As such, they can be seen as guiding the outcomes of algorithms, even if they describe what these distributions should be in society as a whole. Yet, as [28] argue at length, there is little overlap between theories of distributive justice and fairness metrics. Non-comparative notions of justice are not captured by fairness metrics, nor are notions such as Rawls’ difference principle, according to which the right distribution is the one where the worst off have the highest absolute level of benefits. Fairness metrics have focused more on discrimination than on justice.



[1] Deirdre K Mulligan, Joshua A Kroll, Nitin Kohli, and Richmond Y Wong. This thing called fairness: disciplinary confusion realizing a value in technology. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW):1–36, 2019.

[2] Michele Loi and Markus Christen. Choosing how to discriminate: navigating ethical trade-offs in fair algorithmic design for the insurance sector. Philosophy & Technology, 34(4):967–992, 2021.

[3] Sandra Wachter, Brent Mittelstadt, and Chris Russell. Bias preservation in machine learning: the legality of fairness metrics under EU non-discrimination law. W. Va. L. Rev., 123:735, 2020.

[4] Philips George John, Deepak Vijaykeerthy, and Diptikalyan Saha. Verifying individual fairness in machine learning models. In UAI, volume 124 of Proceedings of Machine Learning Research, 749–758. AUAI Press, 2020.

[5] Asia J. Biega, Krishna P. Gummadi, and Gerhard Weikum. Equity of attention: amortizing individual fairness in rankings. In SIGIR, 405–414. ACM, 2018.

[6] Yu Cheng, Zhihao Jiang, Kamesh Munagala, and Kangning Wang. Group fairness in committee selection. ACM Transactions on Economics and Computation (TEAC), 8(4):1–18, 2020.

[7] Vincent Conitzer, Rupert Freeman, Nisarg Shah, and Jennifer Wortman Vaughan. Group fairness for the allocation of indivisible goods. In AAAI, 1853–1860. AAAI Press, 2019.

[8] Michael J. Kearns, Seth Neel, Aaron Roth, and Zhiwei Steven Wu. Preventing fairness gerrymandering: auditing and learning for subgroup fairness. In ICML, volume 80 of Proceedings of Machine Learning Research, 2569–2577. PMLR, 2018.

[9] Jon M. Kleinberg, Sendhil Mullainathan, and Manish Raghavan. Inherent trade-offs in the fair determination of risk scores. In ITCS, volume 67 of LIPIcs, 43:1–43:23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.

[10] Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, and Moritz Hardt. Delayed impact of fair machine learning. In IJCAI, 6196–6200. ijcai.org, 2019.

[11] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, 214–226. 2012.

[12] Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. Algorithmic decision making and the cost of fairness. In KDD, 797–806. ACM, 2017.

[13] Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. In NIPS, 3315–3323. 2016.

[14] Richard Berk, Hoda Heidari, Shahin Jabbari, Michael Kearns, and Aaron Roth. Fairness in criminal justice risk assessments: the state of the art. Sociological Methods & Research, 50:3–44, 2018.

[15] Alexandra Chouldechova. Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data, 5(2):153–163, 2017.

[16] Michael P. Kim, Omer Reingold, and Guy N. Rothblum. Fairness through computationally-bounded awareness. In NeurIPS, 4847–4857. 2018.

[17] Matt J. Kusner, Joshua R. Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. In NIPS, 4066–4076. 2017.

[18] Nina Grgic-Hlaca, Muhammad Bilal Zafar, Krishna P. Gummadi, and Adrian Weller. Beyond distributive fairness in algorithmic decision making: feature selection for procedurally fair learning. In AAAI, 51–60. AAAI Press, 2018.

[19] Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez-Rodriguez, and Krishna P. Gummadi. Fairness beyond disparate treatment & disparate impact: learning classification without disparate mistreatment. In WWW, 1171–1180. ACM, 2017.

[20] Judea Pearl. Causality. Cambridge University Press, 2009.

[21] Michelle Seng Ah Lee, Luciano Floridi, and Jatinder Singh. Formalising trade-offs beyond algorithmic fairness: lessons from ethical philosophy and welfare economics. AI Ethics, 1(4):529–544, 2021.

[22] Nikita Kozodoi, Johannes Jacob, and Stefan Lessmann. Fairness in credit scoring: assessment, implementation and profit implications. European Journal of Operational Research, 297(3):1083–1094, 2022.

[23] Susan Wei and Marc Niethammer. The fairness-accuracy pareto front. Statistical Analysis and Data Mining: The ASA Data Science Journal, 2020.

[24] Annie Liang, Jay Lu, and Xiaosheng Mu. Algorithmic design: fairness versus accuracy. arXiv preprint arXiv:2112.09975, 2021. URL: https://arxiv.org/abs/2112.09975.

[25] Alice Xiang. Being 'seen' vs. 'mis-seen': tensions between privacy and fairness in computer vision. Harvard Journal of Law & Technology, Forthcoming, 2022.

[26] Andrew Chester, Yun Sing Koh, Jörg Wicker, Quan Sun, and Junjae Lee. Balancing utility and fairness against privacy in medical data. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI), 1226–1233. IEEE, 2020.

[27] John Rawls. Justice as fairness: A restatement. Harvard University Press, 2001.

[28] Matthias Kuppler, Christoph Kern, Ruben L Bach, and Frauke Kreuter. Distributive justice and fairness metrics in automated decision-making: how much overlap is there? arXiv preprint arXiv:2105.01441, 2021. URL: https://arxiv.org/abs/2105.01441.

[29] Faisal Kamiran, Indre Zliobaite, and Toon Calders. Quantifying explainable discrimination and removing illegal discrimination in automated decision making. Knowl. Inf. Syst., 35(3):613–644, 2013.

This entry was written by Resmi Ramachandranpillai, Fredrik Heintz, Stefan Buijsman, Miguel Couceiro, Guilherme Alves, Karima Makhlouf, and Sami Zhioua.


Note 1: \(TP\), \(FP\), \(FN\), and \(TN\) stand for true positives, false positives, false negatives, and true negatives, respectively.


Note 2: Resolving features are called explanatory features in [29].