Datasets
Auto csv
Loads a CSV file that contains numeric, categorical, and predictive data columns. This automatically detects the characteristics of the dataset being loaded, namely the delimiter that separates the columns, and whether each column contains numeric or categorical data. A pandas CSV reader is employed internally. The last categorical column is used as the dataset label. To load the file using different options (e.g., a subset of columns, a different label column) use the custom csv loader instead.
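The detection steps described above can be sketched as follows. This is a minimal illustration, not the loader's actual code: the function name `load_auto_csv` and the sample data are hypothetical. It relies on pandas sniffing the delimiter when `sep=None` is combined with `engine="python"`, and treats every non-numeric column as categorical.

```python
import io

import pandas as pd


def load_auto_csv(source):
    """Hypothetical helper: sniff the delimiter, split columns by type, pick a label."""
    # sep=None with the python engine asks pandas to detect the delimiter itself.
    df = pd.read_csv(source, sep=None, engine="python")
    numeric = [c for c in df.columns if pd.api.types.is_numeric_dtype(df[c])]
    # Assumption: anything that is not numeric counts as categorical.
    categorical = [c for c in df.columns if c not in numeric]
    if not categorical:
        raise ValueError("no categorical column available to serve as the label")
    label = categorical[-1]  # the last categorical column becomes the label
    return df, numeric, categorical[:-1], label


# Made-up sample that uses ';' as its delimiter.
sample = io.StringIO(
    "age;city;income;approved\n"
    "31;Athens;1200.5;yes\n"
    "45;Oslo;2100.0;no\n"
)
df, numeric, categorical, label = load_auto_csv(sample)
print(numeric, categorical, label)
```

Type inference on raw values is only a heuristic (an integer-coded category would be counted as numeric here), which is one reason to fall back to the custom csv loader.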
Parameters
Uci
Loads a dataset from the UCI Machine Learning Repository (archive.ics.uci.edu) containing numeric, categorical, and predictive data columns. The dataset is automatically downloaded from the repository, and basic preprocessing is applied to identify the column types. The specified target column is treated as the predictive label. To customize the loading process (e.g., use a different target column, load a subset of features, or handle missing data differently), additional parameters or a custom loader can be used.
Parameters
Read any
Loads a dataset for analysis from either a pre-loaded pandas DataFrame or a file in one of the supported formats: .csv, .xls, .xlsx, .xlsm, .xlsb, .odf, .ods, .json, .html, or .htm.
The module accepts either a raw DataFrame or a file path (local or URL). If a file path is provided, the data is automatically loaded using the appropriate pandas function based on the file extension. Basic preprocessing is applied to infer column types, and the specified target column is treated as the predictive label.
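The extension-based dispatch described above might look roughly like the sketch below. The names `READERS`, `extension_of`, and `read_any` are hypothetical, and keeping only the first table found in an HTML page is an assumption of this sketch, not documented behavior of the module.

```python
import os
from urllib.parse import urlparse

import pandas as pd

# Extension -> pandas reader. pd.read_html returns a list of tables, so this
# sketch (as an assumption) keeps only the first one.
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".html": lambda src: pd.read_html(src)[0],
    ".htm": lambda src: pd.read_html(src)[0],
}
for _ext in (".xls", ".xlsx", ".xlsm", ".xlsb", ".odf", ".ods"):
    READERS[_ext] = pd.read_excel


def extension_of(path):
    """Lower-cased extension of a local path or URL, ignoring any query string."""
    return os.path.splitext(urlparse(str(path)).path)[1].lower()


def read_any(source, target):
    """Hypothetical helper: return (features, label) from a DataFrame or a file."""
    if isinstance(source, pd.DataFrame):
        df = source
    else:
        ext = extension_of(source)
        if ext not in READERS:
            raise ValueError(f"unsupported file extension: {ext!r}")
        df = READERS[ext](source)
    if target not in df.columns:
        raise KeyError(f"target column {target!r} not found")
    return df.drop(columns=[target]), df[target]


# Usage with a pre-loaded DataFrame (made-up data).
frame = pd.DataFrame({"age": [31, 45], "approved": ["yes", "no"]})
features, label = read_any(frame, target="approved")
```

Some of these readers need optional pandas dependencies (for example an Excel engine or an HTML parser) to be installed.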
To customize the loading process (e.g., load a subset of columns, handle missing values, or change column type inference), additional parameters or a custom loader function may be provided.
The Data loader module is also recommended for loading and processing local data when training models that are intended to be tested using the ONNXEnsemble module.
Parameters
Custom csv
Loads a CSV file that contains numeric, categorical, and predictive data columns separated by a user-defined delimiter. Each row corresponds to a different data sample, with the first one sometimes holding column names (this is automatically detected). To use all data in the file and automate discovery of numerical and categorical columns, as well as of delimiters, use the auto csv loader instead. Otherwise, set all loading parameters here. A pandas CSV reader is employed internally.
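As a rough illustration of loading with explicit settings, the sketch below passes a user-chosen delimiter and column lists to the pandas CSV reader. The function name `load_custom_csv` and its parameter names are hypothetical and do not reflect the module's real parameters. For brevity it also assumes the file has a header row, whereas the loader itself detects this.

```python
import io

import pandas as pd


def load_custom_csv(source, delimiter, numeric, categorical, label):
    """Hypothetical helper: load only the requested columns with explicit types."""
    wanted = [*numeric, *categorical, label]
    # usecols restricts loading to the requested subset of columns.
    df = pd.read_csv(source, sep=delimiter, usecols=wanted)
    for col in numeric:
        # Unparseable entries become NaN instead of raising an error.
        df[col] = pd.to_numeric(df[col], errors="coerce")
    for col in categorical:
        df[col] = df[col].astype("category")
    return df[[*numeric, *categorical]], df[label]


# Made-up sample: '|' delimiter, and a "note" column that is deliberately skipped.
sample = io.StringIO(
    "age|city|note|approved\n"
    "31|Athens|x|yes\n"
    "45|Oslo|y|no\n"
)
features, labels = load_custom_csv(
    sample, delimiter="|", numeric=["age"], categorical=["city"], label="approved"
)
```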
Csv rankings
This is a loader for .csv files with information about researchers. The Path should be given relative to your locally running instance (e.g., ./data/researchers/Top_researchers.csv). The Delimiter should match the one used in your CSV file (e.g., '|').
Path Delimiter
Researchers
This is a loader for .csv files with information about researchers. The papers path and papers affiliations should be given relative to your locally running instance (e.g., ./data/researchers/Top_researchers.csv). The Delimiter should match the one used in your CSV file (e.g., '|').
Graph
Loads the edges of a graph organized as rows of a comma-delimited file.
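A minimal sketch of reading such a file with the standard library is shown below. It assumes each row holds a source node followed by a target node in its first two columns; the function name `load_edges` and the sample rows are hypothetical.

```python
import csv
import io


def load_edges(lines):
    """Hypothetical helper: read (source, target) pairs from comma-delimited rows."""
    edges = []
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        edges.append((row[0].strip(), row[1].strip()))
    return edges


# Made-up edge list; a real file object can be passed in the same way.
sample = io.StringIO("a,b\nb,c\n\nc,a\n")
edges = load_edges(sample)
nodes = sorted({node for edge in edges for node in edge})
print(edges, nodes)
```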
Parameters
Images
Loads image data from a CSV file holding their sensitive and predictive attribute data, as well as paths relative to a root directory. Loaded images are subjected to a Python transformation.
Parameters
Image pairs
Loads image pairs declared in a CSV file. The expected format is to have the first image's identifier in the first column, and the second image's identifier in the second column. Sensitive attributes can be selected from the rest of the columns. The image identifiers read from the columns are transformed to loading paths by string specifications that can contain the symbols: {root} to refer to the root directory, {col} to refer to the column name, and {id} to refer to the column entry.
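To illustrate how such a string specification could expand into loading paths, here is a small sketch built on Python's `str.format`. The function name `pair_paths`, the column names, and the example specification are hypothetical, and the module's own substitution logic may differ.

```python
import csv
import io


def pair_paths(lines, root, spec):
    """Hypothetical helper: expand {root}, {col}, and {id} for each image pair."""
    reader = csv.DictReader(lines)
    # The first two columns hold the identifiers of the paired images.
    first_col, second_col = reader.fieldnames[:2]
    pairs = []
    for row in reader:
        pairs.append((
            spec.format(root=root, col=first_col, id=row[first_col]),
            spec.format(root=root, col=second_col, id=row[second_col]),
        ))
    return pairs


# Made-up sample with one sensitive attribute column after the identifiers.
sample = io.StringIO("img1,img2,gender\n0001,0002,f\n0003,0004,m\n")
pairs = pair_paths(sample, root="./data", spec="{root}/{col}/{id}.jpg")
print(pairs[0])
```

Because `str.format` treats every brace as a placeholder, a specification containing other braces would need them doubled in this sketch.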
Parameters
Free text
Sets a free text that can be used by text-based AI to perform various kinds of analysis, such as detecting biases and sentiment. Some modules may also use this text as a prompt to feed into large language models (LLMs). You may optionally provide a website's URL (starting with http: or https:) to retrieve its textual contents.
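One way to picture the URL handling is the standard-library sketch below: input that starts with http: or https: is fetched and reduced to its visible text, while anything else is returned unchanged. The names `html_to_text` and `resolve_free_text`, the injectable `fetch` argument, and the choice to drop script and style content are all assumptions of this sketch.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def _download(url):
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def resolve_free_text(value, fetch=_download):
    """Hypothetical helper: fetch page text for URLs, pass other text through."""
    if value.startswith(("http:", "https:")):
        return html_to_text(fetch(value))
    return value


print(resolve_free_text("Is this sentence biased?"))
```

Passing a custom `fetch` callable makes the function usable offline, for instance with a stub that returns a fixed HTML string.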
Parameters