Models
No model
focus on dataset bias
Signifies that the analysis should focus solely on the bias/fairness of the dataset, which is verified through different means. Also consider alternative models that can help analyze dataset biases, such as the trivial predictor. Failing to audit datasets early in system creation may irrevocably embed their biases in the dataflow in ways that are hard to catch, quantify, or mitigate later.
Ollama model

interacts with an ollama LLM
Allows interaction with a locally hosted ollama large
language model (LLM). The interaction can either aim to assess biases of that model, or to use it as an
aid in discovering qualitative biases in text.
A simple guide to get ollama running.
Set this up on the machine where MAI-BIAS runs. Here, the information required to prompt that model is provided.
On macOS (Homebrew):
brew install ollama
ollama --version
On Linux:
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
On Windows:
winget install Ollama.Ollama
Start the local service and keep it running while you work.
ollama serve
You only need to pull a model once; updates reuse most weights via deltas.
ollama pull llama3
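As a minimal sketch (assuming the default local endpoint http://localhost:11434 and the llama3 model pulled above; prompt text and model name are only illustrative), a prompt can be sent to the running service through its REST API:

import requests

# Send a single, non-streaming prompt to the locally running Ollama service.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",  # the model pulled with `ollama pull llama3`
        "prompt": "Briefly list potential biases in hiring decisions.",
        "stream": False,    # return one JSON object instead of a token stream
    },
    timeout=120,
)
print(response.json()["response"])  # the generated text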
Manual predictor
manual predictions for the dataset
Manually loaded predictions may have been produced by workflows external to the toolkit.
How to format the predictions.
Input a comma-separated list of predictions that correspond to the data you are processing.
This is useful so that you can export the predictions directly from your testing code. If no
commas are provided, this module's argument is considered to be the URL of a CSV file whose last column contains
the predictions. Other columns are ignored but allowed, to give you flexibility. If your dataset's
last column is also the prediction label (e.g., if it is loaded with auto csv), you will obtain analysis
for a perfect predictor.
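For instance, a minimal sketch of exporting predictions from external testing code into a CSV whose last column holds the predictions (the file name and column names are only illustrative):

import csv

# Hypothetical predictions produced by an external workflow, in the same
# order as the rows of the dataset being analysed.
predictions = [1, 0, 0, 1, 1]

with open("predictions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["row_id", "prediction"])  # extra columns are ignored by the module
    for i, pred in enumerate(predictions):
        writer.writerow([i, pred])             # the prediction must be the last column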
Trivial predictor
uncovers biases when features are ignored
This is a deliberately biased predictor that ignores dataset features and decides on a fixed prediction based on the majority.
How does this work?
Creates a trivial predictor that returns the most common predictive label value among the provided data.
If the label is numeric, the median is computed instead. This model serves as an informed baseline
of what happens even with an uninformed predictor. Several kinds of class biases may exist, for example
due to different class imbalances for each sensitive attribute dimension (e.g., for old white men
compared to young Hispanic women).
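A conceptual sketch of such a predictor (not the toolkit's actual implementation) could look as follows:

from collections import Counter
import statistics

def trivial_prediction(labels):
    """Return the constant prediction a trivial predictor would emit."""
    if all(isinstance(v, (int, float)) for v in labels):
        return statistics.median(labels)         # numeric labels: use the median
    return Counter(labels).most_common(1)[0][0]  # categorical labels: use the majority class

# Example: the majority class "approved" is predicted for every sample.
print(trivial_prediction(["approved", "denied", "approved"]))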
Onnx
a serialized machine learning model
Loads an inference model stored in the ONNX format, which is a generic cross-platform way of representing AI models with a common set of operations. The loaded model should be compatible with the dataset being analysed, for example having been trained on the same tabular data columns.
Technical details and how to export a model to this format.
Several machine learning frameworks can export to ONNX. The latter
supports several different runtimes, but this module's implementation selects
the CPUExecutionProvider runtime to maintain compatibility
with most machines.
For inference on GPUs, prefer storing and loading models in formats
that are guaranteed to retain all features that could be included in the architectures
of the respective frameworks; this can be achieved with other model loaders.
Popular frameworks provide their own exporters for ONNX models, for example torch.onnx.export for PyTorch, skl2onnx for scikit-learn, and tf2onnx for TensorFlow.
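For illustration, a minimal sketch of exporting a scikit-learn classifier with skl2onnx and running it through onnxruntime on the CPU (the data, column count, and file name are assumptions of the example):

import numpy as np
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnxruntime as ort

# Train a small model on hypothetical tabular data with 4 feature columns.
X = np.random.rand(100, 4).astype(np.float32)
y = np.random.randint(0, 2, 100)
model = LogisticRegression().fit(X, y)

# Export to ONNX, declaring the expected input shape.
onnx_model = convert_sklearn(model, initial_types=[("input", FloatTensorType([None, 4]))])
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Run inference with the CPU execution provider, as this module does.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
predictions = session.run(None, {"input": X})[0]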
Onnx ensemble

boosted ensemble of weak learners
Enables predictions using a boosting ensemble mechanism, which combines multiple simple AI models (called weak learners) to obtain an improvement over their individual prediction accuracy. The MMM-fair library is used to load and process the ensemble.
Technical details.
Weak learners are often decision trees. However, this module allows any model converted to the ONNX format and zipped inside a directory path along with other meta-information (if any) stored in .npy format.
To load a model, users need to supply a zip file. This should include at least one or possibly many trained models, each saved in the ONNX format, as well as parameters, such as weights (often denoted as ‘alphas’), that define each learner’s contribution to the final model. For an example of preparing this file, please see our notebook.
The module recommends using the MMM-Fair models. MMM-Fair is a fairness-aware machine learning framework designed to support high-stakes AI decision-making under competing fairness and accuracy demands. The three M's stand for:
• Multi-Objective: optimizes across classification accuracy, balanced accuracy, and fairness (specifically, maximum group-level discrimination).
• Multi-Attribute: supports multiple protected groups (e.g., race, gender, age) simultaneously, analyzing group-specific disparities.
• Multi-Definition: evaluates and compares fairness under multiple definitions: Demographic Parity (DP), Equal Opportunity (EP), and Equalized Odds (EO).
MMM-Fair enables developers, researchers, and decision-makers to explore the full spectrum of possible trade-offs and select the model configuration that aligns with their social or organizational goals. For a theoretical understanding of MMM-Fair, it is recommended to read the published scientific article that introduced the foundation of the MMM-algorithms.
How to create an AI ensemble with MMM-fair?
To create and integrate your own model on the intended data, follow the instructions given in the library's PyPI package guidance.
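As a purely hypothetical sketch of the archive layout described above (the file names model_0.onnx, model_1.onnx, and alphas.npy are illustrative, not prescribed by MMM-fair), weak learners and their weights could be packaged like this:

import zipfile
import numpy as np

# Hypothetical learner weights ("alphas") defining each weak learner's contribution.
np.save("alphas.npy", np.array([0.7, 0.3]))

# Bundle the ONNX weak learners together with their meta-information.
with zipfile.ZipFile("ensemble.zip", "w") as archive:
    archive.write("model_0.onnx")
    archive.write("model_1.onnx")
    archive.write("alphas.npy")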
Torch

deep learning model (for GPU)
Loads a PyTorch deep learning model that comprises Python code initializing the architecture and a file of trained parameters.
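A minimal sketch of what loading such a pair of files involves (the architecture class MyNet and the file name weights.pt are placeholders):

import torch
import torch.nn as nn

# Architecture-defining code, normally provided as a separate Python file.
class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))

    def forward(self, x):
        return self.layers(x)

# Instantiate the architecture and restore the trained parameters.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = MyNet()
model.load_state_dict(torch.load("weights.pt", map_location=device))
model.eval()  # inference mode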
Torch2onnx

deep learning model (for CPU)
Loads a PyTorch deep learning model that comprises code initializing the architecture and a file of trained parameters. The result, however, is converted into the ONNX format to support processing by analysis methods that are not compatible with GPU computations.
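A minimal sketch of such a conversion using PyTorch's built-in exporter (the model, input shape, and file name are placeholders):

import torch
import torch.nn as nn

# Placeholder architecture; in practice this is the model loaded from the user's code and weights.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()

# A dummy input fixes the traced input shape; the exported file can then run on CPU runtimes.
dummy_input = torch.randn(1, 4)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}},  # allow variable batch size
)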
Fair node ranking

fairness-aware node ranking algorithm
Constructs a node ranking algorithm that is a variation of non-personalized PageRank. The base algorithm computes a notion of centrality/structural importance for each node in the graph and employs a diffusion parameter in the range [0, 1). Find more details on how the algorithm works in the following seminal paper:
Page, L. (1999). The PageRank citation ranking: Bringing order to the web. Technical Report.
The base node ranking algorithm is enriched by fairness-aware interventions implemented
by the pygrank library. The latter
may run on various computational backends, but numpy is selected due to its compatibility
with a broad range of software and hardware. All implemented algorithms transfer node score
mass from over-represented groups of nodes to those with lesser average mass using different
strategies that determine the redistribution details. Fairness is imposed in terms of centrality
scores achieving similar score mass between groups. The three available strategies are
described here:
none: does not employ any fairness intervention and runs the base algorithm.
uniform: applies a uniform rank redistribution strategy.
original: tries to preserve the order of original node ranks by distributing more score mass to those.
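To make the idea of transferring score mass concrete, here is a toy illustration of redistributing PageRank mass between two groups so that each holds a share proportional to its size (this mirrors the intuition only; it is not pygrank's actual implementation, and the group assignment is hypothetical):

import networkx as nx

# Toy graph with a hypothetical protected group attribute on the nodes.
graph = nx.karate_club_graph()
scores = nx.pagerank(graph, alpha=0.85)               # base (non-personalized) PageRank
group = {node: int(node % 2 == 0) for node in graph}  # hypothetical group assignment

# How much total score mass each group currently holds, and its fair share by node count.
mass = {g: sum(s for n, s in scores.items() if group[n] == g) for g in (0, 1)}
share = {g: sum(1 for n in graph if group[n] == g) / graph.number_of_nodes() for g in (0, 1)}

# Uniformly rescale scores within each group so its mass matches its share of nodes.
fair_scores = {n: s * share[group[n]] / mass[group[n]] for n, s in scores.items()}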
Mitigation ranking

mitigate ranking disparities
This is a hyper-fair algorithm for mitigating researcher ranking disparities.
Technical details.
The algorithm utilizes a sampling technique based on Statistical Parity; it aims to ensure equitable treatment
across different groups by mitigating bias in the ranking process. Additionally, it compares
the results of this fair ranking with a standard ranking derived from one of the numerical columns.
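As a rough conceptual sketch only (the toolkit's actual sampling procedure may differ, and the data and column names below are hypothetical), statistical parity in a top-k shortlist can be illustrated by selecting the same number of top candidates from each group, then comparing with the standard ranking by the numerical column:

import pandas as pd

# Hypothetical researcher data: a numerical score column and a group column.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "citations": [120, 90, 80, 60, 50, 40],
    "group": ["m", "m", "m", "f", "f", "f"],
})

k = 4  # size of the shortlist
per_group = k // df["group"].nunique()

# Standard ranking: sort by the numerical column only.
standard = df.sort_values("citations", ascending=False).head(k)

# Parity-aware ranking: take the same number of top candidates from each group.
fair = df.sort_values("citations", ascending=False).groupby("group").head(per_group)

print(standard["name"].tolist(), fair["name"].tolist())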