NU author(s): Dr Maksim Kalameyets, Professor Ben Farrand, Dr Lei Shi
Full text for this publication is not currently held within this repository. Alternative links are provided below where available.
The United Nations' Sustainable Development Goals (UN SDGs) prioritise inclusive and fair employment. However, AI-powered recruitment tools, particularly Large Language Models (LLMs), raise concerns about potential demographic bias. This paper presents a controlled synthetic dataset and methodology to measure how sensitive attributes (e.g., race, gender, age) influence candidate rankings and pairwise comparisons in LLM-based hiring pipelines. Specifically, we generated a balanced dataset of 1,000 synthetic candidate profiles (each including a cover letter) and evaluated it using 28 frontier LLMs, including proprietary (e.g., OpenAI GPT, Gemini, Grok, Claude) and open-source (e.g., Llama, GigaChat) models. Synthetic data eliminates real-world demographic/occupational confounders, ensuring that observed disparities reflect only the LLMs' intrinsic behaviour. Results show that professional attributes (e.g., skills, experience) are the primary ranking drivers, with 76%–80% of their effects statistically significant; however, 8%–9% of demographic attributes exhibit persistent, significant biases across multiple LLMs. We develop a "bias map" quantifying LLM performance, emphasising that mitigating even minor biases in automated hiring is critical to avoid perpetuating employment inequities and to uphold the UN SDGs' inclusive vision.
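The balanced-dataset idea described in the abstract can be sketched roughly as follows. This is an illustrative toy, not the paper's pipeline: the profile schema, the stand-in "judge" (a scoring function with a deliberately injected bias, in place of a real LLM call), and the matched-pair counting are all assumptions for demonstration.

```python
import itertools
import random
from collections import Counter

# Hypothetical attribute pools; the paper's actual profile schema is not shown here.
GENDERS = ["female", "male"]
SKILL_LEVELS = [1, 2, 3]  # proxy for professional quality

def make_profiles(n, seed=0):
    """Generate a balanced synthetic candidate set: every combination of
    a sensitive attribute (gender) and a professional attribute (skill)
    appears equally often, removing real-world confounders."""
    rng = random.Random(seed)
    combos = list(itertools.product(GENDERS, SKILL_LEVELS))
    profiles = [{"gender": g, "skill": s} for g, s in combos * (n // len(combos))]
    rng.shuffle(profiles)
    return profiles

def toy_judge(a, b, rng):
    """Stand-in for an LLM pairwise comparison: mostly prefers higher
    skill, but with a small injected gender bias for demonstration."""
    score_a = a["skill"] + (0.3 if a["gender"] == "male" else 0.0)
    score_b = b["skill"] + (0.3 if b["gender"] == "male" else 0.0)
    if score_a == score_b:
        return rng.choice([a, b])
    return a if score_a > score_b else b

def measure_bias(profiles, trials=2000, seed=1):
    """Count wins between equally skilled candidates who differ only in
    gender; under no bias the split should be roughly 50/50."""
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(trials):
        a, b = rng.sample(profiles, 2)
        if a["skill"] != b["skill"] or a["gender"] == b["gender"]:
            continue  # only matched pairs isolate the sensitive attribute
        winner = toy_judge(a, b, rng)
        wins[winner["gender"]] += 1
    return wins

profiles = make_profiles(600)
wins = measure_bias(profiles)
```

Because the synthetic set is fully balanced, any skew in `wins` can only come from the judge itself, which is the same logic that lets the paper attribute observed disparities to the LLMs' intrinsic behaviour rather than to the data.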
Author(s): Jalilzade E, Kalameyets M, Malviya S, Owens R, Katsigiannis S, Farrand B, Shi L
Publication type: Conference Proceedings (inc. Abstract)
Publication status: In Press
Conference Name: IEEE BigData 2025 - 1st International Workshop on Harnessing Big Data Analytics with Large Language Models
Year of Conference: 2025
Acceptance date: 25/11/2025
Publisher: IEEE