Check for a specific bias
Consider any dataset and a predictive model able to process it. You can compute a standardized measure of bias by choosing parameters from several options. This measure assesses whether business benefits, such as model accuracy (how often the system is correct), are equalized across all population groups. This includes intersectional groups (e.g., Black women, at the intersection of gender and race).
To compute a standardized measure, use the specific concerns module to make an assessment like the following. We provide a default threshold for the assessment, obtained through MAMMOth's co-creation process, but this should be adjusted in new social or legal contexts based on stakeholder feedback. Other expert parameters include the type of predictive benefits to be safeguarded, how these are compared, and so on.
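The idea of such a measure can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function name, the choice of accuracy as the safeguarded benefit, the min/max ratio comparison, and the 0.8 threshold are all assumptions for demonstration.

```python
import numpy as np

def min_benefit_ratio(y_true, y_pred, gender, race):
    # Hypothetical sketch: compute a per-group benefit (here accuracy) for
    # every intersection of gender and race, then summarize disparity as the
    # ratio of the worst-off to the best-off group (1 means equalized).
    groups = {}
    for g, r in set(zip(gender, race)):
        mask = (gender == g) & (race == r)
        groups[(g, r)] = np.mean(y_true[mask] == y_pred[mask])
    vals = list(groups.values())
    return min(vals) / max(vals)

# Toy data: two sensitive attributes with two values each.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
gender = np.array(["M", "M", "F", "F", "M", "F", "F", "M"])
race   = np.array(["A", "B", "A", "B", "A", "B", "A", "B"])

ratio = min_benefit_ratio(y_true, y_pred, gender, race)
passed = ratio >= 0.8  # illustrative threshold; the default comes from co-creation
```

In practice the threshold, the benefit, and the comparison rule are exactly the expert parameters the module exposes.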
Analyze a model's bias
Given a reasonably trustworthy dataset (one that at most encodes unintended biases) to serve as ground truth, you can use it to assess the biases of an AI model. Here we run a model card summary on one such model. This explores hundreds of potential biases across pairwise comparisons of sensitive attribute intersections, as well as performance measures and benefits to safeguard in the population.
Not all found issues are deal-breakers in each setting, especially since it is mathematically impossible to mitigate all conceivable definitions of bias simultaneously.
We also compute a business benefit measure that summarizes model performance, for example one that correlates with business revenue. This is typically accuracy, though here we used expert options to designate high precision as the desired benefit (not erring in positive classifications). One can then investigate system variations to get a sense of the trade-offs between performance and fairness.
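A pairwise scan of this kind can be sketched as below. This is an illustrative assumption about how such a scan works, not the model card's actual code: it computes the safeguarded benefit (precision) for each subgroup and then the absolute gap for every pair of subgroups.

```python
import numpy as np
from itertools import combinations

def precision(y_true, y_pred, mask):
    # Precision within a subgroup: fraction of positive predictions that
    # are correct. Returns None when the subgroup has no positives.
    pos = y_pred[mask] == 1
    if pos.sum() == 0:
        return None
    return np.mean(y_true[mask][pos] == 1)

def pairwise_precision_gaps(y_true, y_pred, attrs):
    # attrs is a tuple of attribute arrays; zipping them yields the
    # intersectional subgroups (e.g., gender x age).
    keys = sorted(set(zip(*attrs)))
    scores = {}
    for k in keys:
        mask = np.all([a == v for a, v in zip(attrs, k)], axis=0)
        p = precision(y_true, y_pred, mask)
        if p is not None:
            scores[k] = p
    return {(a, b): abs(scores[a] - scores[b])
            for a, b in combinations(scores, 2)}

# Toy data with two sensitive attributes, hence four intersections.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 1, 1, 0, 0, 1, 1, 1])
gender = np.array(["M", "M", "F", "F", "M", "F", "F", "M"])
age    = np.array(["young", "young", "young", "old", "old", "old", "young", "old"])

gaps = pairwise_precision_gaps(y_true, y_pred, (gender, age))
```

Even this toy scan already yields six pairwise comparisons from four subgroups; with many attributes and measures, the combinations quickly reach the hundreds mentioned above.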
Audit a dataset for representational imbalances
Biased datasets are problematic in themselves, as extreme biases cannot be offset by models without concessions in predictive accuracy (or in safeguarding other kinds of business benefits). Furthermore, if models take no bias mitigation action, dataset biases can be encoded or even amplified during real-world usage.
For this reason, MAI-BIAS offers various means of auditing datasets. Here we show the outcome of checking for representational imbalances, especially those that arise from the intersection of multiple sensitive and predictive attributes; excessive specialization may leave some societal subgroups too small or even empty. This particular analysis both visualizes the intersectional imbalances and provides means of mitigating them in the "for experts" tab.
Face region importance
When it comes to computer vision models, it is often desirable to check whether their way of making predictions conforms to human intuition. For example, a model that makes assessments based on proxy visual features correlated with sensitive attribute values (e.g., wearing earrings as a proxy for gender) can be considered biased. Models may also pick up on unrelated features, such as image backgrounds.
One of the MAI-BIAS modules can be used to investigate how computer vision models focus on facial features. The focus is shown as a heatmap, in which red areas indicate more attention and blue areas indicate indifference. Characteristic image regions are shown underneath for each area. This result is interpreted qualitatively.
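One common way to build such importance maps is occlusion sensitivity; this is an assumption for illustration, as the module may use a different attribution method. A gray patch is slid over the image and the drop in the model's score is recorded: large drops mark regions the model focuses on.

```python
import numpy as np

def occlusion_heatmap(image, score_fn, patch=4, stride=4):
    # Slide a gray patch over the image; each heatmap cell stores how much
    # the model's score drops when that region is hidden.
    h, w = image.shape
    base = score_fn(image)
    heat = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i in range(heat.shape[0]):
        for j in range(heat.shape[1]):
            occluded = image.copy()
            occluded[i*stride:i*stride+patch, j*stride:j*stride+patch] = 0.5
            heat[i, j] = base - score_fn(occluded)
    return heat

# Toy "model" that only looks at the top-left corner of the image, so the
# heatmap should light up there and nowhere else.
score_fn = lambda img: img[:4, :4].mean()
image = np.random.default_rng(0).random((16, 16))
heat = occlusion_heatmap(image, score_fn)
```

Rendering `heat` with a red-blue colormap over the face image yields exactly the kind of qualitative focus map described above.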
Uncover biases in text
The advent of large language models (LLMs) has created a wave of malicious actors who leverage them to promote harmful opinions and disinformation. Furthermore, when these models are used to generate text, they may encode biases and stereotypes from their training datasets.
Yet the same models can also help uncover biases in human-generated or LLM-generated text. One challenge in doing so is mitigating the effect of hallucinations (information that LLMs present as factual but is not), as well as the fact that outputs can vary with small input changes.
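A simple way to dampen output variability is to query the auditing model several times and keep the majority verdict, along with its stability. The sketch below assumes this strategy; `query_llm` is a hypothetical stub standing in for a real API call, and its noisy behavior is simulated.

```python
from collections import Counter

def query_llm(prompt, seed):
    # Hypothetical stub simulating a noisy classifier: it usually answers
    # "biased" but occasionally flips. A real implementation would call an
    # actual LLM API here.
    return "biased" if seed % 5 != 0 else "not biased"

def majority_verdict(prompt, n_samples=5):
    # Sample the model several times and aggregate by majority vote; the
    # vote share doubles as a stability score for flagging shaky verdicts.
    votes = Counter(query_llm(prompt, seed=s) for s in range(n_samples))
    verdict, count = votes.most_common(1)[0]
    return verdict, count / n_samples

verdict, stability = majority_verdict("Does this passage stereotype a group?")
```

Verdicts with low stability can be routed to a human reviewer instead of being reported directly.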
Some MAI-BIAS modules can be used to assess LLMs, or to use them for auditing textual biases. Below is a report in which a model developed in 2023 is used to uncover biases in a news report on the 2026 war on Iran.