MAI-BIAS analysis for sensitive attributes: marital, education
based on FairBench reporting on 16 March 2026
Contents: What is this? · Responsible use · Analysis methodology · Data pipeline · For experts
We analysed how accuracy is distributed across a model's outputs on a tested dataset
by comparing several protected groups pairwise.
The assessment depends on specific parameters provided as inputs.
Fairness does not end after producing AI outputs.
💡 Continue interacting with stakeholders to verify that their idea of fairness is correctly implemented.
💡 Monitor the outputs of deployed systems by rerunning the analysis on updated models and datasets.
💡 Test model and dataset variations for multiple sensitive characteristics and parameters.
Keep a balance between justifying outputs as part of a fair process
and accommodating constructive criticism. Do not over-rely on technical justification, and ensure
meaningful human oversight whenever AI systems are deployed in decision-making,
high-stakes, or rights-impacting contexts. Human oversight prevents overreliance on imperfect models,
catches context-specific errors, and enables ethical judgment, accountability, and recourse for
affected people.
The max relative difference of the accuracy
is obtained across all protected groups by comparing them pairwise.
The result is considered biased if it lies more than 0.200 away from the ideal target
that would indicate fairness. For example, the ideal target is 0 for differences between measure values,
and 1 for values that should be large (e.g., the minimum accuracy across all groups).
Some metrics have no known ideal values.
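As an illustrative sketch (the function name and group accuracies below are hypothetical, not the FairBench implementation), the pairwise max relative difference and the 0.200 deviation check described above can be computed like this:

```python
from itertools import combinations

def max_relative_difference(values):
    """Largest relative difference between any pair of per-group values."""
    return max(
        abs(a - b) / max(a, b)  # difference relative to the larger of the pair
        for a, b in combinations(values, 2)
    )

# hypothetical per-group accuracies for three protected groups
accuracies = [0.90, 0.85, 0.70]
score = max_relative_difference(accuracies)

# the ideal target for a difference-type measure is 0;
# flag as biased when the result lies more than 0.200 away from it
biased = abs(score - 0.0) > 0.200
print(round(score, 3), biased)  # → 0.222 True
```

The worst pair here is 0.90 vs 0.70, giving 0.20/0.90 ≈ 0.222, which exceeds the 0.200 threshold.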
The analysis considered 12 protected groups:
marital divorced - education primary, marital divorced - education secondary, marital divorced - education tertiary, marital divorced - education unknown, marital single - education primary, marital single - education secondary, marital single - education tertiary, marital single - education unknown, marital married - education primary, marital married - education secondary, marital married - education tertiary, marital married - education unknown
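The 12 groups arise from crossing the 3 marital values with the 4 education values. A minimal sketch of how such intersectional group masks can be built with pandas (column names and toy values mirror the report, but this is not the pipeline's actual code):

```python
import pandas as pd

# toy frame with the two sensitive columns
df = pd.DataFrame({
    "marital": ["divorced", "single", "married", "single"],
    "education": ["primary", "tertiary", "unknown", "primary"],
})

# one boolean mask per intersectional protected group
groups = {
    f"marital {m} - education {e}": (df["marital"] == m) & (df["education"] == e)
    for m in ["divorced", "single", "married"]
    for e in ["primary", "secondary", "tertiary", "unknown"]
}
print(len(groups))  # 3 marital values x 4 education values = 12 groups
```

Each mask selects the rows belonging to one group, so per-group measures (e.g., accuracy) can be computed by restricting predictions to the rows where the mask is true.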
Tabular data with common formatting
path: /home/maniospas/Documents/mammoth-commons/data/bank.csv
max discrete: 10
Uses pandas to load
a CSV file that contains numeric, categorical, and predictive data columns.
This automatically detects the characteristics of the dataset being loaded,
namely the delimiter that separates the columns, and whether each column contains
numeric or categorical data.
The last categorical column is used as the dataset label. To load the file maintaining
more control over options (e.g., a subset of columns, a different label column) use the
custom csv loader instead.
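The auto-detection described above can be approximated with plain pandas, as in the sketch below; the actual internals of `pd_read_csv` may differ. Passing `sep=None` with the python engine makes pandas sniff the delimiter, and dtype checks then separate numeric from categorical columns:

```python
import io
import pandas as pd

# toy CSV text with a ';' delimiter (stands in for a file on disk)
text = "age;job;balance\n30;admin;1200\n45;technician;300\n"

# sep=None with the python engine makes pandas sniff the delimiter
df = pd.read_csv(io.StringIO(text), sep=None, engine="python")

# a column counts as numeric if pandas parsed it to a real numeric dtype
numeric = [c for c in df if pd.api.types.is_any_real_numeric_dtype(df[c])]
categorical = [c for c in df if c not in numeric]
print(numeric, categorical)  # → ['age', 'balance'] ['job']
```

Under this scheme, the last categorical column (`job` in the toy example) would serve as the dataset label.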
How to replicate this data loader during AI creation?
If you want to train a model while using the same loading mechanism as this dataset,
run the following Python script. This uses supporting methods from the lightweight
mammoth-commons core to retrieve numpy
arrays *X,y* holding dataset features and categorical labels respectively.
% pip install --upgrade pandas
% pip install --upgrade mammoth_commons
import numpy as np
import pandas as pd
from mammoth_commons.externals import pd_read_csv
from mammoth_commons.datasets import CSV
# set parameters and load data (modify max_discrete as needed)
path = ...
max_discrete = 10
df = pd_read_csv(path, on_bad_lines="skip")
# identify numeric and categorical columns
num = [col for col in df if pd.api.types.is_any_real_numeric_dtype(df[col])]
num = [col for col in num if len(set(df[col])) > max_discrete]
num_set = set(num)
cat = [col for col in df if col not in num_set]
# wrap into the commons dataset type, then convert to numpy data;
# X is built here with plain pandas (one-hot encoding categorical features)
csv_dataset = CSV(df, num=num, cat=cat[:-1], labels=cat[-1])
X = pd.get_dummies(df[num + cat[:-1]], columns=cat[:-1]).to_numpy()
X = X.astype(np.float32)
y = df[cat[-1]].to_numpy()
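With feature and label arrays of this shape, model training proceeds as usual. A minimal sketch using scikit-learn (the model choice and the synthetic stand-in arrays are illustrative, not part of the pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# synthetic stand-in arrays shaped like the X, y produced by the loader
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)).astype(np.float32)
y = np.where(X[:, 0] + rng.normal(scale=0.5, size=200) > 0, "yes", "no")

# hold out a test split and fit a simple classifier
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(round(model.score(X_te, y_te), 2))  # overall accuracy on held-out data
```

The resulting model's per-group accuracies are what the pairwise analysis above compares.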
FairBench
pairwise maxrel acc
This measure reports the accuracy, compared across protected groups pairwise via the maximum relative difference.
Caveats and recommendations
• This is a generic list of caveats that applies to all measures.
• Non-quantitative criteria may also impact perceived fairness.
• Choose carefully the criteria for when measures are considered close to their ideal values.
• A single measure cannot decide whether a system is fair or biased without further investigation; at best it can indicate the absence of a particular bias. Moreover, different measures are often at odds with each other, even when they have similar optima.
• Consult with stakeholders to determine which social and legal criteria systems should follow. This translates to choosing measures appropriate for the operating context.