What is this?
Responsible use
Analysis methodology
Data pipeline
For experts

Some system performance metrics, which indicate obtained benefits like correct or favorable operation, were found unevenly distributed across the population. These biases occurred in at least one prediction class and at least one way of aggregating the comparison among multiple groups. Expert assessment is needed to help understand which biases may be considered unfair. The biased metrics are:

the true negative rate/specificity (tnr)
the true rejection ratio (true negatives compared to all) (trr)
the Matthews correlation coefficient (mcc)
the true acceptance ratio (true positives compared to all) (tar)
the positive rate (pr)
the accuracy (acc)
the geometric mean of tpr and tnr - accounts for class imbalance (gmi)
the f1 score (f1)
the Cohen's Kappa score (kappa)
the lift ratio (tpr divided by pr) (lift)
the true positive rate/recall/sensitivity/hit rate (tpr)
the positive predictive value/precision (ppv)

Fairness does not end after producing AI outputs.

💡 Continue interacting with stakeholders to assert that their idea of fairness is correctly implemented.
💡 Monitor the outputs of deployed systems by rerunning the analysis on updated models and datasets.
💡 Test model and dataset variations for multiple sensitive characteristics and parameters.

Keep a balance between justifying outputs as part of a fair process and accommodating constructive criticism. Do not over-rely on technical justification, and ensure meaningful human oversight whenever AI systems are deployed in decision-making, high-stakes, or rights-impacting contexts. Human oversight prevents overreliance on imperfect models, catches context-specific errors, and enables ethical judgment, accountability, and recourse for affected people.

Groups were compared pairwise. A result is considered biased if it lies more than 0.050 away from the ideal target that would indicate fairness; such deviations guide where deeper inspection is needed. For example, the ideal target is 0 for differences between measure values, and 1 for values that should be large (e.g., the minimum accuracy across all groups). Some metrics have no known ideal values.
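The thresholding rule can be sketched in a few lines of Python. This is a minimal illustration and not the tool's implementation: the helper names `is_problematic` and `pairwise_differences` are hypothetical, and the per-group accuracies are copied from the results reported in this analysis.

```python
from itertools import combinations

THRESHOLD = 0.050  # deviation tolerated before a value is counted as problematic

# Per-group accuracies as reported in this analysis.
ACCURACY = {
    "education unknown": 0.898,
    "education tertiary": 0.860,
    "education secondary": 0.894,
    "education primary": 0.903,
    "marital married": 0.901,
    "marital single": 0.861,
    "marital divorced": 0.856,
}


def is_problematic(value, ideal, threshold=THRESHOLD):
    """Return True when value lies more than threshold away from its ideal target."""
    return abs(value - ideal) > threshold


def pairwise_differences(values):
    """Absolute difference of a measure for every unordered pair of groups."""
    return {(a, b): abs(values[a] - values[b]) for a, b in combinations(values, 2)}


worst_accuracy = min(ACCURACY.values())  # should be large, so the ideal is 1
largest_gap = max(pairwise_differences(ACCURACY).values())  # a difference, so the ideal is 0

print(is_problematic(worst_accuracy, ideal=1.0))
print(is_problematic(largest_gap, ideal=0.0))
```

With these values the minimum accuracy (0.856) deviates by 0.144 from its ideal of 1 and is flagged, while the largest pairwise accuracy gap (0.047) stays under the threshold.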

The analysis considered 7 protected groups:
education unknown
education tertiary
education secondary
education primary
marital married
marital single
marital divorced

csv

tabular data with custom formatting

path: /home/maniospas/Documents/mammoth-commons/data/bank/bank.csv
delimiter: ;
numeric: age, duration, campaign, pdays, previous
categorical: marital, job, education, default, housing, contact, loan, poutcome
label: y
skip invalid lines: True
Uses pandas to load a CSV file that contains custom specification of numeric, categorical, and predictive data columns. Each row corresponds to a different data sample, with the first one sometimes holding column names (this is automatically detected).
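A loader with the settings listed above might look roughly like the following. This is a sketch under assumptions, not the module's actual code: the function name `load_bank_csv` and the inline sample rows are hypothetical, only a subset of the listed columns is used, and `on_bad_lines="skip"` is assumed to correspond to "skip invalid lines".

```python
import io

# Hypothetical sample in the same ';'-delimited layout (made-up rows, subset of columns).
SAMPLE = """age;marital;education;duration;y
30;married;primary;79;no
33;married;secondary;220;no
this;line;has;too;many;fields;x
35;single;tertiary;185;yes
"""


def load_bank_csv(source, numeric, categorical, label, delimiter=";"):
    """Load a delimited file and split it into typed feature columns and a label column."""
    import pandas as pd  # imported lazily so the sketch can be read without pandas installed

    # Rows with too many fields are dropped instead of raising an error.
    frame = pd.read_csv(source, sep=delimiter, on_bad_lines="skip")
    features = frame[numeric + categorical].copy()
    # Numeric columns: anything unparsable becomes NaN.
    features[numeric] = features[numeric].apply(pd.to_numeric, errors="coerce")
    # Categorical columns: stored with pandas' category dtype.
    for column in categorical:
        features[column] = features[column].astype("category")
    return features, frame[label]
```

For example, `load_bank_csv(io.StringIO(SAMPLE), ["age", "duration"], ["marital", "education"], "y")` returns the three well-formed rows and their labels.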


ONNX

a serialized machine learning model

path: /home/maniospas/Documents/mammoth-commons/data/model.onnx
trained with sensitive: True
Loads an inference model stored in the ONNX format, which is a generic cross-platform way of representing AI models with a common set of operations. The loaded model should be compatible with the dataset being analysed, for example having been trained on the same tabular data columns.
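Loading such a model for CPU inference could be sketched as below. This is an illustration under assumptions rather than the module's code: the helper name `predict_with_onnx` is hypothetical, and the model is assumed to take a single float32 input tensor whose columns match the dataset.

```python
# Restrict ONNX Runtime to the CPU provider for broad machine compatibility.
PROVIDERS = ["CPUExecutionProvider"]


def predict_with_onnx(path, features):
    """Run a serialized ONNX model on a 2D array of features and return its first output."""
    import numpy as np  # imported lazily so defining the sketch needs no extra packages
    import onnxruntime as ort

    session = ort.InferenceSession(path, providers=PROVIDERS)
    # Assumption: the model exposes exactly one input tensor of type float32.
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: np.asarray(features, dtype=np.float32)})
    return outputs[0]
```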
Technical details and how to export a model to this format. Several machine learning frameworks can export to ONNX. The format supports several different runtimes, but this module's implementation selects the `CPUExecutionProvider` runtime to maintain compatibility with most machines. For inference on GPUs, prefer storing and loading models in formats that are guaranteed to maintain all features that could be included in the architectures of the respective frameworks; this can be achieved with other model loaders. Here are some quick links on how to export ONNX models from popular frameworks:

Summary of measures.
Name    Description
acc     the accuracy
tpr     the true positive rate/recall/sensitivity/hit rate
ppv     the positive predictive value/precision
f1      the f1 score
gmi     the geometric mean of tpr and tnr - accounts for class imbalance
mcc     the Matthews correlation coefficient
kappa   the Cohen's Kappa score
pr      the positive rate
tar     the true acceptance ratio (true positives compared to all)
trr     the true rejection ratio (true negatives compared to all)
lift    the lift ratio (tpr divided by pr)
tnr     the true negative rate/specificity
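For reference, these measures can be written in terms of the four confusion-matrix counts. The sketch below uses the standard textbook definitions that match the descriptions above; the tool's own implementation may differ in details such as how empty denominators are handled. The function name `measures` and the example counts are hypothetical.

```python
import math


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (an assumption of this sketch)."""
    return numerator / denominator if denominator else 0.0


def measures(tp, fp, tn, fn):
    """Compute the listed measures from true/false positive/negative counts."""
    n = tp + fp + tn + fn
    acc = safe_div(tp + tn, n)
    tpr = safe_div(tp, tp + fn)
    tnr = safe_div(tn, tn + fp)
    ppv = safe_div(tp, tp + fp)
    pr = safe_div(tp + fp, n)
    # Agreement expected by chance, used by Cohen's Kappa.
    expected = safe_div((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "acc": acc,
        "tpr": tpr,
        "ppv": ppv,
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        "gmi": math.sqrt(tpr * tnr),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
        "kappa": safe_div(acc - expected, 1 - expected),
        "pr": pr,
        "tar": safe_div(tp, n),
        "trr": safe_div(tn, n),
        "lift": safe_div(tpr, pr),
        "tnr": tnr,
    }


# Hypothetical counts, for illustration only.
example = measures(tp=40, fp=10, tn=45, fn=5)
print({name: round(value, 3) for name, value in example.items()})
```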

Summary of reductions.
Name      Description
min       the minimum
maxerror  the maximum deviation from the ideal value
wmean     the weighted average
mean      the average
gm        the geometric mean
maxrel    the maximum relative difference
maxdiff   the maximum difference
gini      the gini coefficient
stdx2     the standard deviation x2
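A minimal sketch of how several of these reductions could be computed from per-group values. The formulas are assumptions chosen to match the names above (for example, `maxrel` is taken as 1 minus the ratio of the smallest to the largest value, `maxerror` assumes an ideal value of 1, and `stdx2` uses the population standard deviation); FairBench's own definitions may differ. `wmean` and `gini` are left out because they depend on group sizes or on a definition this report does not give.

```python
import statistics

# Per-group accuracies as reported in this analysis.
ACC = [0.898, 0.860, 0.894, 0.903, 0.901, 0.861, 0.856]


def reduce_values(values, ideal=1.0):
    """Apply the assumed reduction formulas to a list of per-group values."""
    lowest, highest = min(values), max(values)
    return {
        "min": lowest,
        "maxerror": max(abs(ideal - value) for value in values),
        "mean": statistics.fmean(values),
        "gm": statistics.geometric_mean(values),  # requires strictly positive values
        "maxrel": 1 - lowest / highest,
        "maxdiff": highest - lowest,
        "stdx2": 2 * statistics.pstdev(values),
    }


reduced = reduce_values(ACC)
print({name: round(value, 3) for name, value in reduced.items()})
```

With the per-group accuracies reported in this analysis, the sketch yields min 0.856, maxerror 0.144, mean 0.882, gm 0.882 and maxrel 0.052, which agrees with the reported `acc` values.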

FairBench
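class yes
measure min     maxerror  wmean  mean   gm     maxrel  maxdiff  gini   stdx2
acc     0.856   0.144     0.885  0.882  0.882  0.052   -        -      -
tpr     0       1         0.010  0.011  0      1       -        0.523  -
ppv     0       1         0.534  0.452  0      1       0.833    0.361  0.610
f1      0       1         -      0.021  0      1       0.050    0.521  -
gmi     0       1         -      0.008  0      1       -        0.573  -
mcc     -0.018  1         -      0.056  -      1       0.149    0.519  0.102
kappa   -0.006  1         0.016  0.016  -      1       -        0.597  -
pr      -       -         -      -      -      1       -        0.411  -
tar     -       -         -      -      -      1       -        0.554  -
trr     -       -         -      -      -      0.056   0.050    -      -
lift    -       -         -      -      -      1       5        0.318  4
tnr     -       -         -      -      -      -       -        -      -

class no
measure min     maxerror  wmean  mean   gm     maxrel  maxdiff  gini   stdx2
acc     0.856   0.144     0.885  0.882  0.882  0.052   -        -      -
tpr     -       -         -      -      -      -       -        -      -
ppv     0.857   0.143     0.886  0.883  0.882  0.053   -        -      -
f1      0.922   0.078     -      0.937  0.937  -       -        -      -
gmi     0.855   0.145     -      0.882  0.881  0.052   -        -      -
mcc     -0.018  1         -      0.056  -      1       0.149    0.519  0.102
kappa   -0.006  1         0.016  0.016  -      1       -        0.597  -
pr      -       -         -      -      -      -       -        -      -
tar     -       -         -      -      -      0.056   0.050    -      -
trr     -       -         -      -      -      1       -        0.554  -
lift    -       -         -      -      -      -       -        -      -
tnr     0       1         0.010  0.011  0      1       -        0.523  -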
FairBench

fairness modelcard

This is a modelcard that contains popular fairness stamps.
These are obtained from analysis that compares several groups.
Stamps contain caveats and recommendations that should be considered during practical adoption. They are only a part of the full analysis that has been conducted, so consider also viewing the full generated report to find more prospective biases.


Computations cover several cases.

worst accuracy

This stamp is the accuracy of analysis that compares several groups.

Details
This is the minimum benefit the system brings to any group.

Caveats and recommendations
• The worst case is a lower bound but not an estimation of overall performance.
• There may be different distributions of benefits that could be protected.
• Ensure continuous monitoring and re-evaluation as group dynamics and external factors evolve.
• Ensure that high worst accuracy translates to meaningful benefits across all groups in the real-world context.
• Seek input from affected groups to understand the impact of errors and to inform remediation strategies.

Distribution

Computations cover several cases.
class yes
This is branch class yes.
0.856 min acc
Obtained from 7 values

class no
This is branch class no.
0.856 min acc
Obtained from 7 values

differential fairness

This stamp is the accuracy of analysis that compares several groups.

Details
The worst deviation of accuracy ratios from 1 is reported, so that a value of 1 indicates disparate impact and a value of 0 indicates that disparate impact is mitigated.

Caveats and recommendations
• Disparate impact may not always be an appropriate fairness consideration, and may obscure other important fairness concerns or create new disparities.
• Always consider trade-offs with overall or minimum accuracy, as the easiest way to "optimize" for this measure would be to degrade accuracy for all groups to the lowest level among groups.
• Ensure continuous monitoring and re-evaluation as group dynamics and external factors evolve.

Distribution

Computations cover several cases.
class yes
This is branch class yes.
0.052 maxrel acc
Obtained from 7 values

class no
This is branch class no.
0.052 maxrel acc
Obtained from 7 values


FairBench
class yes
group                acc    tpr    ppv    f1     gmi    mcc     kappa   pr     tar    trr    lift
education unknown    0.898  0      0      0      0      0       0       0      0      0.898  0
education tertiary   0.860  0.026  0.833  0.050  0.022  0.132   0.042   0.004  0.004  0.856  5
education secondary  0.894  0.004  0.500  0.008  0.002  0.038   0.006   0.001  0.000  0.893  4
education primary    0.903  0      0      0      0      -0.018  -0.006  0.003  0      0.903  0
marital married      0.901  0.007  0.500  0.014  0.004  0.051   0.011   0.001  0.001  0.900  5
marital single       0.861  0.012  0.667  0.024  0.008  0.076   0.019   0.003  0.002  0.860  4
marital divorced     0.856  0.026  0.667  0.050  0.017  0.112   0.039   0.006  0.004  0.852  4

class no
group                acc    ppv    f1     gmi    mcc     kappa   tar    trr    tnr
education unknown    0.898  0.898  0.946  0.898  0       0       0.898  0      0
education tertiary   0.860  0.860  0.924  0.859  0.132   0.042   0.856  0.004  0.026
education secondary  0.894  0.894  0.944  0.894  0.038   0.006   0.893  0.000  0.004
education primary    0.903  0.905  0.949  0.902  -0.018  -0.006  0.903  0      0
marital married      0.901  0.902  0.948  0.901  0.051   0.011   0.900  0.001  0.007
marital single       0.861  0.862  0.925  0.861  0.076   0.019   0.860  0.002  0.012
marital divorced     0.856  0.857  0.922  0.855  0.112   0.039   0.852  0.004  0.026