Fair and Private Data Preprocessing through Microaggregation

Gonzalez-Zelaya, V; Salas, J; Megias, D; Missier, P

doi:10.1145/3617377

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).

Abstract

Copyright © 2023 held by the owner/author(s).

Privacy protection for personal data and fairness in automated decisions are fundamental requirements for responsible Machine Learning. Both may be enforced through data preprocessing and share a common target: data should remain useful for a task while becoming uninformative of the sensitive information. The intrinsic connection between privacy and fairness implies that modifications performed to guarantee one of these goals may affect the other; for example, hiding a sensitive attribute from a classification algorithm might prevent a biased decision rule from using that attribute as a criterion. This work resides at the intersection of algorithmic fairness and privacy. We show that the two goals are compatible and may be achieved simultaneously with a small loss in predictive performance. Our results are competitive with both state-of-the-art fairness-correcting algorithms and hybrid privacy-fairness methods. Experiments were performed on three widely used benchmark datasets: Adult Income, COMPAS, and German Credit.

Publication metadata

Author(s): Gonzalez-Zelaya V, Salas J, Megias D, Missier P

Publication type: Article

Publication status: Published

Journal: ACM Transactions on Knowledge Discovery from Data

Year: 2024

Volume: 18

Issue: 3

Print publication date: 01/04/2024

Online publication date: 09/12/2023

Acceptance date: 18/08/2023

Date deposited: 30/01/2024

ISSN (print): 1556-4681

ISSN (electronic): 1556-472X

Publisher: Association for Computing Machinery

URL: https://doi.org/10.1145/3617377

DOI: 10.1145/3617377

Data Access Statement: The datasets on which our experiments were run are available at:

Adult Income: https://archive.ics.uci.edu/ml/datasets/adult

COMPAS: https://github.com/propublica/compas-analysis

German Credit: https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)

They were all prepared using the data_cleanup.ipynb notebook available at the project’s repository.
