We used IBM’s AIF360 library to check for common types of bias and found the following:

- false positive rate ratio

Fairness does not end after producing AI outputs.
💡 Continue interacting with stakeholders to verify that their idea of fairness is correctly implemented. Keep a balance between justifying outputs as part of a fair process and accommodating constructive criticism. Do not over-rely on technical justification, and ensure meaningful human oversight whenever AI systems are deployed in decision-making, high-stakes, or rights-impacting contexts. Human oversight prevents over-reliance on imperfect models, catches context-specific errors, and enables ethical judgment, accountability, and recourse for affected people.
Each fairness metric provided by AIF360 is computed across 7 groups, each of which is compared to the rest of the population; group intersections are not accounted for. We check whether bias metrics exceed 0.05 on a 0-1 scale where 0 represents an unbiased system, or whether fairness metrics fall below 0.95 on a 0-1 scale where 1 represents a fair system.
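As a minimal sketch of this thresholding, assuming the metric values have already been computed (the metric lists and names below are illustrative examples taken from this report, not part of the AIF360 API):

```python
# Illustrative sketch of the reported threshold check.
BIAS_MAX = 0.05      # bias metrics: 0 means unbiased, flag values above 0.05
FAIRNESS_MIN = 0.95  # fairness metrics: 1 means fair, flag values below 0.95

bias_metrics = {
    "Statistical Parity Difference": 0.000,
    "Average Odds Difference": 0.000,
}
fairness_metrics = {
    "Disparate Impact": 1.000,
    "False Positive Rate Ratio": 1.000,
}

flagged = [m for m, v in bias_metrics.items() if abs(v) > BIAS_MAX]
flagged += [m for m, v in fairness_metrics.items() if v < FAIRNESS_MIN]
print("Flagged metrics:", flagged or "none")
```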
Considered groups are:

The dataset was loaded and preprocessed as follows:

```python
# install dependencies first, e.g. in a notebook:
#   %pip install --upgrade pandas
#   %pip install --upgrade mammoth_commons
import numpy as np
import pandas as pd
from mammoth_commons.externals import pd_read_csv
from mammoth_commons.datasets import CSV

# set parameters and load data (modify max_discrete as needed)
path = ...
max_discrete = 10
df = pd_read_csv(path, on_bad_lines="skip")

# identify numeric and categorical columns: real-valued columns with more
# than max_discrete distinct values are numeric, everything else categorical
num = [col for col in df if pd.api.types.is_any_real_numeric_dtype(df[col])]
num = [col for col in num if len(set(df[col])) > max_discrete]
num_set = set(num)
cat = [col for col in df if col not in num_set]

# wrap in a mammoth_commons CSV dataset; the last categorical column is the label
csv_dataset = CSV(df, num=num, cat=cat[:-1], labels=cat[-1])

# convert to numpy data; the original snippet left X undefined, so we assume
# here that the feature matrix is the one-hot-encoded non-label columns
X = pd.get_dummies(df.drop(columns=[cat[-1]]), columns=cat[:-1])
X = X.to_numpy().astype(np.float32)
y = df[cat[-1]]
```
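Here `max_discrete` controls how many distinct values a real-valued column may take before it is treated as numeric rather than categorical; the resulting `X` and `y` presumably feed the classifier whose predictions are scored in the table below.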
| Metric | Value |
|---|---|
| Accuracy | 0.885 |
| Average Abs Odds Difference | 0.000 |
| Average Odds Difference | 0.000 |
| Base Rate | 0.885 |
| Between All Groups CV | 0.026 |
| Between All Groups GEI | 0.000 |
| Between All Groups Theil Index | 0.000 |
| Bias Amplification | 2.819 |
| Coefficient of Variation | 0.286 |
| Disparate Impact | 1.000 |
| Equal Opportunity Difference | 0.000 |
| Equalized Odds Difference | 0.000 |
| Error Rate | 0.115 |
| False Discovery Rate | 0.115 |
| False Negative Rate | 0.000 |
| False Negative Rate Difference | 0.000 |
| False Omission Rate | 0.000 |
| False Omission Rate Difference | 0.000 |
| False Positive Rate | 1.000 |
| False Positive Rate Difference | 0.000 |
| False Positive Rate Ratio | 1.000 |
| Gen. Entropy Index | 0.041 |
| Gen. Equalized Odds Difference | 0.000 |
| Gen. False Negative Rate | 0.000 |
| Gen. False Positive Rate | 0.000 |
| Gen. True Negative Rate | 1.000 |
| Gen. True Positive Rate | 1.000 |
| Negative Predictive Value | 0.000 |
| Num False Negatives | 0 |
| Num False Positives | 521 |
| Num Gen. False Negatives | 0 |
| Num Gen. False Positives | 0 |
| Num Gen. True Negatives | 521 |
| Num Gen. True Positives | 4000 |
| Num Instances | 4521 |
| Num Negatives | 521 |
| Num Positives | 4000 |
| Num Pred. Negatives | 0 |
| Num Pred. Positives | 4521 |
| Num True Negatives | 0 |
| Num True Positives | 4000 |
| Positive Predictive Value | 0.885 |
| Selection Rate | 1.000 |
| Smoothed EDF | 1.226 |
| Statistical Parity Difference | 0.000 |
| Theil Index | 0.034 |
| True Negative Rate | 0.000 |
| True Positive Rate | 1.000 |
| True Positive Rate Difference | 0.000 |
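Several of the headline values above follow directly from the raw counts, and the count rows explain the pattern: with zero predicted negatives, the model labels every instance positive. A quick sanity check in plain Python (no AIF360 needed), using the counts from the table:

```python
# Reproduce headline table values from the raw confusion counts above.
tp, tn, fp, fn = 4000, 0, 521, 0
n = tp + tn + fp + fn                      # 4521 instances

accuracy = (tp + tn) / n                   # 0.885, matches the table
error_rate = (fp + fn) / n                 # 0.115
tpr = tp / (tp + fn)                       # 1.000: all positives recovered
fpr = fp / (fp + tn)                       # 1.000: all 521 negatives misclassified
selection_rate = (tp + fp) / n             # 1.000: every instance predicted positive
print(f"{accuracy:.3f} {error_rate:.3f} {tpr:.3f} {fpr:.3f} {selection_rate:.3f}")
```

In other words, the perfect parity scores (all difference metrics at 0.000) coexist with a classifier that never predicts the negative class, which is why the false positive rate is 1.000.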