Wage classification [scikit-learn]
Giskard is an open-source framework for testing all ML models, from LLMs to tabular models. Don't hesitate to give the project a star on GitHub ⭐ if you find it useful!
In this notebook, you'll learn how to create comprehensive test suites for your model in a few lines of code, thanks to Giskard's open-source Python library.
Use-case:
Binary classification to predict whether a person earns more than 50K a year, given their demographic attributes.
Outline:
Detect vulnerabilities automatically with Giskard's scan
Automatically generate & curate a comprehensive test suite to test your model beyond accuracy-related metrics
Install dependencies
Make sure to install the giskard library.
[ ]:
%pip install giskard --upgrade
Import libraries
[1]:
from pathlib import Path
from urllib.request import urlretrieve
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from giskard import Model, Dataset, scan, testing
Define constants
[ ]:
# Constants
RANDOM_SEED = 0
TEST_RATIO = 0.2
DROP_FEATURES = ["education", "native-country", "occupation", "marital-status", "educational-num"]
CATEGORICAL_FEATURES = ["workclass", "relationship", "race", "gender"]
NUMERICAL_FEATURES = [
    "age",
    "fnlwgt",
    "capital-gain",
    "capital-loss",
    "hours-per-week",
]
TARGET_COLUMN = "income"
# Paths.
DATA_URL = (
    "https://giskard-library-test-datasets.s3.eu-north-1.amazonaws.com/wage_classification_dataset-adult.csv.tar.gz"
)
DATA_PATH = Path.home() / ".giskard" / "wage_classification_dataset" / "adult.csv.tar.gz"
Dataset preparation
Load and preprocess data
[ ]:
def fetch_demo_data(url: str, file: Path) -> None:
    """Helper to fetch the data archive from the remote server."""
    # Create the local cache directory if it does not exist yet.
    if not file.parent.exists():
        file.parent.mkdir(parents=True, exist_ok=True)

    # Download only once; later runs reuse the cached file.
    if not file.exists():
        print(f"Downloading data from {url}")
        urlretrieve(url, file)

    print("Data was loaded!")


def download_data(**kwargs) -> pd.DataFrame:
    """Download the dataset from DATA_URL and read it into a DataFrame."""
    fetch_demo_data(DATA_URL, DATA_PATH)
    _df = pd.read_csv(DATA_PATH, **kwargs)
    return _df


def preprocess_data(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with NaNs and the columns listed in DROP_FEATURES."""
    df = df.dropna()
    df = df.drop(columns=DROP_FEATURES)
    return df
[ ]:
income_df = download_data()
income_df = preprocess_data(income_df)
Train-test split
[5]:
X_train, X_test, y_train, y_test = train_test_split(
    income_df.drop(columns=TARGET_COLUMN), income_df[TARGET_COLUMN], test_size=TEST_RATIO, random_state=RANDOM_SEED
)
Wrap dataset with Giskard
To prepare for the vulnerability scan, make sure to wrap your dataset using Giskard's Dataset class. More details here.
[ ]:
raw_data = pd.concat([X_test, y_test], axis=1)
giskard_dataset = Dataset(
    df=raw_data,
    # A pandas.DataFrame that contains the raw data (before all the pre-processing steps) and the actual ground truth variable (target).
    target=TARGET_COLUMN,  # Ground truth variable.
    name="salary_data",  # Optional.
    cat_columns=CATEGORICAL_FEATURES,
    # List of categorical columns. Optional but strongly recommended when known; inferred automatically if omitted.
)
Model building
Define preprocessing pipeline
[7]:
preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), NUMERICAL_FEATURES),
        ("cat", OneHotEncoder(handle_unknown="ignore", sparse_output=False), CATEGORICAL_FEATURES),
    ]
)
Build estimator
[ ]:
pipeline = Pipeline(steps=[("preprocessor", preprocessor), ("classifier", RandomForestClassifier())])
pipeline.fit(X_train, y_train)
# Accuracy score.
train_metric = pipeline.score(X_train, y_train)
test_metric = pipeline.score(X_test, y_test)
print(f"Train accuracy: {train_metric:.2f}")
print(f"Test accuracy: {test_metric:.2f}")
Wrap model with Giskard
To prepare for the vulnerability scan, make sure to wrap your model using Giskard's Model class. You can choose to either wrap the prediction function (preferred option) or the model object. More details here.
[ ]:
giskard_model = Model(
    model=pipeline,
    # The fitted scikit-learn pipeline, which encapsulates all the data pre-processing steps and can be executed with the dataset used by the scan.
    model_type="classification",  # Either regression, classification or text_generation.
    name="salary_cls",  # Optional.
    classification_labels=pipeline.classes_,  # Their order MUST be identical to the order of the model's predicted probabilities.
    feature_names=X_train.columns,  # Default: all columns of your dataset.
)
# Validate wrapped model.
wrapped_predict = giskard_model.predict(giskard_dataset)
wrapped_test_metric = accuracy_score(y_test, wrapped_predict.prediction)
print(f"Wrapped Test accuracy: {wrapped_test_metric:.2f}")
Detect vulnerabilities in your model
Scan your model for vulnerabilities with Giskard
Giskard's scan allows you to detect vulnerabilities in your model automatically. These include performance biases, unrobustness, data leakage, stochasticity, underconfidence, ethical issues, and more. For detailed information about the scan feature, please refer to our scan documentation.
[ ]:
results = scan(giskard_model, giskard_dataset)
[11]:
display(results)
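To make one of these vulnerability types concrete, the sketch below computes an underconfidence rate on toy class probabilities. It rests on an assumption rather than on Giskard's source code: a prediction is treated as underconfident when its second-highest class probability exceeds `p_threshold` times its highest one. The helper name `underconfidence_rate` and the toy values are hypothetical and are not part of the Giskard API.

```python
from typing import Sequence


def underconfidence_rate(probas: Sequence[Sequence[float]], p_threshold: float = 0.95) -> float:
    """Share of rows whose two most probable classes are nearly tied.

    ASSUMPTION: "nearly tied" means second_best / best > p_threshold.
    """
    if not probas:
        return 0.0
    flagged = 0
    for row in probas:
        # Take the two largest probabilities of the row.
        best, second = sorted(row, reverse=True)[:2]
        if best > 0 and second / best > p_threshold:
            flagged += 1
    return flagged / len(probas)


# Hypothetical predicted probabilities for a binary classifier.
toy_probas = [
    [0.51, 0.49],  # nearly tied -> flagged
    [0.90, 0.10],  # clear winner
    [0.60, 0.40],  # clear winner
    [0.50, 0.50],  # exact tie -> flagged
]
rate = underconfidence_rate(toy_probas)
print(f"Underconfidence rate: {rate:.2f}")
```

Under this reading, a data slice would be reported when its rate is noticeably higher than the rate on the whole dataset.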
Generate comprehensive test suites automatically for your model
Generate test suites from the scan
The objects produced by the scan can be used as fixtures to generate a test suite that integrates all detected vulnerabilities. Test suites allow you to evaluate and validate your model's performance, ensuring that it behaves as expected on a set of predefined test cases, and to identify any regressions or issues that might arise during development or updates.
[12]:
test_suite = results.generate_test_suite("My first test suite")
test_suite.run()
2024-05-29 14:14:33,648 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,654 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (6902, 10) executed in 0:00:00.041066
Executed 'Overconfidence on data slice “`hours-per-week` < 41.500”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b723070>, 'threshold': 0.4973034997131383, 'p_threshold': 0.5}:
Test failed
Metric: 0.5
2024-05-29 14:14:33,676 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,684 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (879, 10) executed in 0:00:00.017064
Executed 'Underconfidence on data slice “`age` >= 41.500 AND `age` < 45.500”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b6c2b00>, 'threshold': 0.011710512846760161, 'p_threshold': 0.95}:
Test failed
Metric: 0.02
2024-05-29 14:14:33,714 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,717 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (3923, 10) executed in 0:00:00.018021
Executed 'Underconfidence on data slice “`relationship` == "Husband"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b6c9480>, 'threshold': 0.011710512846760161, 'p_threshold': 0.95}:
Test failed
Metric: 0.02
2024-05-29 14:14:33,742 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,745 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (1335, 10) executed in 0:00:00.017211
Executed 'Underconfidence on data slice “`age` >= 48.500 AND `age` < 58.500”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b6c2d40>, 'threshold': 0.011710512846760161, 'p_threshold': 0.95}:
Test failed
Metric: 0.02
2024-05-29 14:14:33,772 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,776 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (6485, 10) executed in 0:00:00.022294
Executed 'Underconfidence on data slice “`gender` == "Male"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b6ca350>, 'threshold': 0.011710512846760161, 'p_threshold': 0.95}:
Test failed
Metric: 0.01
2024-05-29 14:14:33,790 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,792 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (1498, 10) executed in 0:00:00.010130
Executed 'Recall on data slice “`relationship` == "Own-child"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b66d180>, 'threshold': 0.5253512132822478}:
Test failed
Metric: 0.3
2024-05-29 14:14:33,807 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,809 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (576, 10) executed in 0:00:00.007495
Executed 'Recall on data slice “`workclass` == "?"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b66f0a0>, 'threshold': 0.5253512132822478}:
Test failed
Metric: 0.34
2024-05-29 14:14:33,827 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,829 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (2528, 10) executed in 0:00:00.013633
Executed 'Recall on data slice “`relationship` == "Not-in-family"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b66ca90>, 'threshold': 0.5253512132822478}:
Test failed
Metric: 0.35
2024-05-29 14:14:33,849 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,850 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (1076, 10) executed in 0:00:00.008265
Executed 'Recall on data slice “`relationship` == "Unmarried"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b66ee90>, 'threshold': 0.5253512132822478}:
Test failed
Metric: 0.38
2024-05-29 14:14:33,865 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,867 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (935, 10) executed in 0:00:00.008937
Executed 'Recall on data slice “`race` == "Black"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b7ef3d0>, 'threshold': 0.5253512132822478}:
Test failed
Metric: 0.38
2024-05-29 14:14:33,880 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,882 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (721, 10) executed in 0:00:00.008114
Executed 'Recall on data slice “`workclass` == "Self-emp-not-inc"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b721120>, 'threshold': 0.5253512132822478}:
Test failed
Metric: 0.39
2024-05-29 14:14:33,902 pid:72955 MainThread giskard.datasets.base INFO Casting dataframe columns from {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'} to {'age': 'int64', 'workclass': 'object', 'fnlwgt': 'int64', 'relationship': 'object', 'race': 'object', 'gender': 'object', 'capital-gain': 'int64', 'capital-loss': 'int64', 'hours-per-week': 'int64'}
2024-05-29 14:14:33,905 pid:72955 MainThread giskard.utils.logging_utils INFO Predicted dataset with shape (3284, 10) executed in 0:00:00.014940
Executed 'Recall on data slice “`gender` == "Female"”' with arguments {'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b7ef040>, 'threshold': 0.5253512132822478}:
Test failed
Metric: 0.52
2024-05-29 14:14:33,917 pid:72955 MainThread giskard.core.suite INFO Executed test suite 'My first test suite'
2024-05-29 14:14:33,917 pid:72955 MainThread giskard.core.suite INFO result: failed
2024-05-29 14:14:33,917 pid:72955 MainThread giskard.core.suite INFO Overconfidence on data slice “`hours-per-week` < 41.500” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b723070>, 'threshold': 0.4973034997131383, 'p_threshold': 0.5}): {failed, metric=0.5041237113402062}
2024-05-29 14:14:33,918 pid:72955 MainThread giskard.core.suite INFO Underconfidence on data slice “`age` >= 41.500 AND `age` < 45.500” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b6c2b00>, 'threshold': 0.011710512846760161, 'p_threshold': 0.95}): {failed, metric=0.023890784982935155}
2024-05-29 14:14:33,918 pid:72955 MainThread giskard.core.suite INFO Underconfidence on data slice “`relationship` == "Husband"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b6c9480>, 'threshold': 0.011710512846760161, 'p_threshold': 0.95}): {failed, metric=0.02013764975783839}
2024-05-29 14:14:33,918 pid:72955 MainThread giskard.core.suite INFO Underconfidence on data slice “`age` >= 48.500 AND `age` < 58.500” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b6c2d40>, 'threshold': 0.011710512846760161, 'p_threshold': 0.95}): {failed, metric=0.01647940074906367}
2024-05-29 14:14:33,918 pid:72955 MainThread giskard.core.suite INFO Underconfidence on data slice “`gender` == "Male"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b6ca350>, 'threshold': 0.011710512846760161, 'p_threshold': 0.95}): {failed, metric=0.013415574402467233}
2024-05-29 14:14:33,919 pid:72955 MainThread giskard.core.suite INFO Recall on data slice “`relationship` == "Own-child"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b66d180>, 'threshold': 0.5253512132822478}): {failed, metric=0.2962962962962963}
2024-05-29 14:14:33,919 pid:72955 MainThread giskard.core.suite INFO Recall on data slice “`workclass` == "?"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b66f0a0>, 'threshold': 0.5253512132822478}): {failed, metric=0.3448275862068966}
2024-05-29 14:14:33,919 pid:72955 MainThread giskard.core.suite INFO Recall on data slice “`relationship` == "Not-in-family"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b66ca90>, 'threshold': 0.5253512132822478}): {failed, metric=0.3490909090909091}
2024-05-29 14:14:33,919 pid:72955 MainThread giskard.core.suite INFO Recall on data slice “`relationship` == "Unmarried"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b66ee90>, 'threshold': 0.5253512132822478}): {failed, metric=0.38095238095238093}
2024-05-29 14:14:33,920 pid:72955 MainThread giskard.core.suite INFO Recall on data slice “`race` == "Black"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b7ef3d0>, 'threshold': 0.5253512132822478}): {failed, metric=0.38333333333333336}
2024-05-29 14:14:33,920 pid:72955 MainThread giskard.core.suite INFO Recall on data slice “`workclass` == "Self-emp-not-inc"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b721120>, 'threshold': 0.5253512132822478}): {failed, metric=0.391304347826087}
2024-05-29 14:14:33,920 pid:72955 MainThread giskard.core.suite INFO Recall on data slice “`gender` == "Female"” ({'model': <giskard.models.sklearn.SKLearnModel object at 0x17c037f40>, 'dataset': <giskard.datasets.base.Dataset object at 0x17afa3310>, 'slicing_function': <giskard.slicing.slice.QueryBasedSliceFunction object at 0x32b7ef040>, 'threshold': 0.5253512132822478}): {failed, metric=0.5231607629427792}
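Each "Recall on data slice" entry in the log above compares a metric computed on a subset of rows with a threshold, and the logged values suggest that a test fails when the slice recall falls below that threshold. The following is a minimal pure-Python sketch of that idea on toy data, not Giskard's implementation; the function names, the toy rows, and the threshold of 0.6 are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def recall(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def recall_on_slice(
    rows: List[Dict[str, str]],
    y_true: Sequence[str],
    y_pred: Sequence[str],
    slice_fn: Callable[[Dict[str, str]], bool],
    positive: str,
    threshold: float,
) -> Tuple[float, bool]:
    """Compute recall on the rows selected by slice_fn and compare it with threshold."""
    idx = [i for i, row in enumerate(rows) if slice_fn(row)]
    value = recall([y_true[i] for i in idx], [y_pred[i] for i in idx], positive)
    # ASSUMPTION: the check passes when the metric reaches the threshold.
    return value, value >= threshold


# Hypothetical toy data, unrelated to the notebook's dataset.
rows = [{"gender": "Female"}, {"gender": "Female"}, {"gender": "Female"},
        {"gender": "Male"}, {"gender": "Male"}]
y_true = [">50K", ">50K", "<=50K", ">50K", "<=50K"]
y_pred = [">50K", "<=50K", "<=50K", ">50K", "<=50K"]

value, passed = recall_on_slice(
    rows, y_true, y_pred, lambda r: r["gender"] == "Female", positive=">50K", threshold=0.6
)
print(f"Recall on slice: {value:.2f}, passed: {passed}")
```

In this toy case the slice contains two positive examples, only one of which is predicted correctly, so the check fails against the 0.6 threshold.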
Customize your suite by loading objects from the Giskard catalog
The Giskard open-source catalog lets you load:
Tests such as metamorphic, performance, prediction & data drift, statistical tests, etc.
Slicing functions such as detectors of toxicity, hate, emotion, etc.
Transformation functions such as generators of typos, paraphrase, style tune, etc.
To create custom tests, refer to this page.
For demo purposes, we will load a simple unit test (test_f1) that checks if the test F1 score is above the given threshold. For more examples of tests and functions, refer to the Giskard catalog.
[ ]:
test_suite.add_test(testing.test_f1(model=giskard_model, dataset=giskard_dataset, threshold=0.7)).run()
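For intuition about what such a threshold test checks, here is a small pure-Python sketch that computes a binary F1 score and compares it with the same 0.7 threshold. It is an illustration under assumptions, not the code behind `testing.test_f1`: the helper `f1_binary`, the toy labels, and the pass condition (F1 greater than or equal to the threshold) are hypothetical.

```python
from typing import Sequence


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2 * TP / (2 * TP + FP + FN) for the chosen positive label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical toy labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]

THRESHOLD = 0.7
f1 = f1_binary(y_true, y_pred)
passed = f1 >= THRESHOLD  # ASSUMPTION: the test passes when F1 reaches the threshold.
print(f"F1: {f1:.3f}, passed: {passed}")
```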