Gender Bias in Automated CV Evaluation: Evidence from Counterfactual Simulations Using Synthetic Data from Mexico
DOI:
https://doi.org/10.60758/laer.v37i.538
Keywords:
Gender bias, Automated CV screening, Synthetic data experiments
Abstract
This paper investigates the presence of gender bias in automated CV evaluations conducted by a large language model (LLM). Using over 14,000 synthetically generated CVs representative of the Mexican labor market across six occupational categories, we implement a counterfactual design that isolates the effect of perceived gender by switching only the name and reported gender of each candidate. The analysis reveals systematic and occupation-specific biases: female candidates receive higher scores when presented as male in traditionally masculine roles (e.g., truck driver), while male candidates gain when reclassified as female in feminized occupations (e.g., nursing, elementary teaching). Notably, we document a statistically significant and operationally meaningful pro-female bias in the high-status Chief Financial Officer role. These asymmetries persist under deterministic prompting (temperature = 0), ruling out randomness as a confounder. Our design is contextually grounded, using names, educational institutions, and employers common in Mexico, and offers a scalable methodology for bias auditing in LLMs. The findings highlight the necessity of localized fairness assessments and raise concerns about the equity implications of deploying general-purpose AI tools in personnel selection.
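The gender-swap counterfactual described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the field names, the `score_cv` stub, and its toy bias rule are assumptions standing in for the paper's actual prompts and deterministic (temperature = 0) LLM calls.

```python
# Sketch of a counterfactual gender-swap audit for CV scoring.
# The scoring stub below is a stand-in for a temperature-0 LLM call.

def swap_gender(cv: dict, male_name: str, female_name: str) -> dict:
    """Return a copy of the CV with only the name and reported gender flipped."""
    flipped = dict(cv)
    if cv["gender"] == "female":
        flipped["gender"], flipped["name"] = "male", male_name
    else:
        flipped["gender"], flipped["name"] = "female", female_name
    return flipped

def score_cv(cv: dict) -> float:
    """Stand-in for a deterministic LLM evaluation returning a suitability score.

    A real implementation would send the CV text to the model and parse a
    numeric score from its response; the penalty below is a toy bias rule
    used only so the sketch produces a nonzero gap.
    """
    base = 70.0
    if cv["occupation"] == "truck driver" and cv["gender"] == "female":
        base -= 5.0  # illustrative gender-incongruence penalty
    return base

def counterfactual_gap(cv: dict, male_name: str, female_name: str) -> float:
    """Score difference attributable solely to perceived gender."""
    return score_cv(swap_gender(cv, male_name, female_name)) - score_cv(cv)
```

Because everything except the name and gender is held fixed, any nonzero gap isolates the effect of perceived gender; averaging the gap over many synthetic CVs within an occupation yields the occupation-specific bias estimates the paper reports.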
References
Abid, A., Farooqi, M., & Zou, J. (2021). Persistent anti-Muslim bias in large language models. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (pp. 298–306). Association for Computing Machinery. https://doi.org/10.1145/3461702.3462624
Arceo-Gomez, E., & Campos-Vazquez, R. M. (2014). Race and marriage in the labor market: A discrimination correspondence study in a developing country. American Economic Review, 104(5), 376–380. https://doi.org/10.1257/aer.104.5.376
Arceo-Gomez, E., & Campos-Vazquez, R. M. (2019). Double discrimination: Is discrimination in job ads accompanied by discrimination in callbacks? Journal of Economics, Race, and Policy, 2(2), 82–94. https://doi.org/10.1007/s41996-019-00031-3
Armstrong, B., Hernández, L., & Rivera, D. (2024). Bias in CV Evaluation by Large Language Models: Evidence from Gender-Swapped Simulations. Economics of AI Review, 12(1), 45–72.
Bertrand, M., & Mullainathan, S. (2004). Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. American Economic Review, 94(4), 991–1013. https://doi.org/10.1257/0002828042002561
Bertrand, M., & Duflo, E. (2017). Field experiments on discrimination. In A. Banerjee & E. Duflo (Eds.), Handbook of Economic Field Experiments (Vol. 1, pp. 309–393). North-Holland. https://doi.org/10.1016/bs.hefe.2016.08.004
Blommaert, L., Coenders, M., & van Tubergen, F. (2014). Ethnic discrimination in recruitment and decision makers’ features: Evidence from laboratory experiment and survey data using a student sample. Social Indicators Research, 116, 731–754. https://doi.org/10.1007/s11205-013-0329-4
Blommaert, L., van Tubergen, F., & Coenders, M. (2012). Implicit and explicit interethnic attitudes and ethnic discrimination in hiring. Social Science Research, 41, 61–73. https://doi.org/10.1016/j.ssresearch.2011.09.007
Bravo, D., Sanhueza, C., & Urzúa, S. (2011). An experimental study of labor market discrimination: Gender, social class and neighborhood in Chile. IDB Working Paper No. 226. http://dx.doi.org/10.2139/ssrn.1815907
Buolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. Proceedings of Machine Learning Research, 81, 77–91. https://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf
Campos-Vazquez, R. M., & Gonzalez, E. (2020). Obesity and hiring discrimination. Economics & Human Biology, 37, 100850. https://doi.org/10.1016/j.ehb.2020.100850
Chang, X. (2023). Gender bias in hiring: An analysis of the impact of Amazon's recruiting algorithm. Advances in Economics, Management and Political Sciences, 23(1), 134–140. https://doi.org/10.54254/2754-1169/23/20230367
Chaturvedi, S., & Chaturvedi, R. (2025). Who gets the callback? Generative AI and gender bias. arXiv preprint arXiv:2504.21400. https://arxiv.org/pdf/2504.21400
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Ding, L., Smith, J., Wang, Y., & Lee, K. (2024). Probing social bias in labor market text generation by ChatGPT: A masked language model approach. In Proceedings of the Neural Information Processing Systems Conference (NeurIPS 2024). https://papers.nips.cc/paper_files/paper/2024/hash/fce2d8a485746f76aac7b5650db2679d-Abstract-Conference.html
Expansión. (2023). Las 500 empresas más importantes de México. Expansión. https://expansion.mx/las-500-empresas-mas-importantes-mexico
Feng, X., Dou, L., Li, E., Wang, Q., Wang, H., Guo, Y., ... & Kong, L. (2024). A survey on large language model-based social agents in game-theoretic scenarios. arXiv preprint, arXiv:2412.03920. https://doi.org/10.48550/arXiv.2412.03920
Galarza, F. B., & Yamada, G. (2014). Labor market discrimination in Lima, Peru: Evidence from a field experiment. World Development, 58, 83–94. https://doi.org/10.1016/j.worlddev.2014.01.003
Guo, F. (2023). GPT in game theory experiments. arXiv preprint, arXiv:2305.05516. https://doi.org/10.48550/arXiv.2305.05516
Howard, S., & Borgella, A. M. (2019). Are Adewale and Ngochi more employable than Jamal and Lakeisha? The influence of nationality and ethnicity cues on employment-related evaluations of Blacks in the United States. The Journal of Social Psychology, 160(4), 509–519. https://doi.org/10.1080/00224545.2019.1687415
Instituto Nacional de Estadística y Geografía. (2023). Estadística de Nacimientos Registrados (serie 2000–2023) [Conjunto de datos]. INEGI. https://www.inegi.org.mx/programas/natalidad/
King, E. B., Madera, J. M., Hebl, M. R., & Knight, J. L. (2006). What’s in a name? A multiracial investigation of the role of occupational stereotypes in selection decisions. Journal of Applied Social Psychology, 36(5), 1145–1159. https://doi.org/10.1111/j.0021-9029.2006.00035.x
Kiritchenko, S., & Mohammad, S. M. (2018). Examining gender and race bias in two hundred sentiment analysis systems. arXiv preprint, arXiv:1805.04508.
Kotek, H., Dockum, R., & Sun, D. (2023). Gender bias and stereotypes in large language models. In Proceedings of the ACM Collective Intelligence Conference (pp. 12–24). Association for Computing Machinery.
Kotek, H., Zhang, Y., Zhou, P., & Smith, N. A. (2023). Stereotypical Bias Amplification in Large Language Models. Proceedings of the ACL 2023, 5112–5124.
Kübler, D., Schmid, J., & Stüber, R. (2018). Gender discrimination in hiring across occupations: A nationally-representative vignette study. Labour Economics, 55, 215–229. https://doi.org/10.1016/j.labeco.2018.10.002
Lippens, L. (2024). Computer says ‘no’: Exploring systemic bias in ChatGPT using an audit approach. Computers in Human Behavior: Artificial Humans, 2(1), 100054. https://doi.org/10.1016/j.chbah.2024.100054
Lippens, L., Dalle, A., D'hondt, F., Verhaeghe, P., & Baert, S. (2023). Understanding ethnic hiring discrimination: A contextual analysis of experimental evidence. Labour Economics, 85, 102453. https://doi.org/10.1016/j.labeco.2023.102453
Martínez-Alfaro, A., Silverio-Murillo, A., & Balmori-de-la-Miyar, J. (2024). What’s in a name? Evidence of transgender labor discrimination in Mexico. Journal of Economic Behavior & Organization, 227, 106738. https://doi.org/10.1016/j.jebo.2024.106738
Moss-Racusin, C. A., Dovidio, J. F., Brescoll, V. L., Graham, M. J., & Handelsman, J. (2012). Science faculty’s subtle gender biases favor male students. Proceedings of the National Academy of Sciences, 109(41), 16474–16479. https://doi.org/10.1073/pnas.1211286109
Nogales, R., Córdova, P., & Urquidi, M. (2020). The impact of university reputation on employment opportunities: Experimental evidence from Bolivia. The Economic and Labour Relations Review, 31(4), 524–542. https://doi.org/10.1177/1035304620962265
OpenAI. (2024). Evaluating fairness in ChatGPT. https://openai.com/index/evaluating-fairness-in-chatgpt/
Quillian, L., Pager, D., Hexel, O., & Midtbøen, A. H. (2017). Meta-analysis of field experiments shows no change in racial discrimination in hiring over time. Proceedings of the National Academy of Sciences, 114(41), 10870–10875. https://doi.org/10.1073/pnas.1706255114
Ross, J., Kim, Y., & Lo, A. W. (2024). LLM economicus? Mapping the behavioral biases of LLMs via utility theory. SSRN. https://doi.org/10.2139/ssrn.4926791
Society for Human Resource Management. (2022, April 12). Fresh SHRM research explores use of automation and AI in HR. SHRM. https://www.shrm.org/content/dam/en/shrm/topics-tools/news/technology/SHRM-2022-Automation-AI-Research.pdf
Torres, J., Herz, S., Pérez, A., & Barrón, M. (2024). Labor market discrimination against Venezuelans in Peru: Evidence from a correspondence study. Economia, 47(94), 1–23. https://doi.org/10.18800/economia.202402.001
Venkit, P. N., Gautam, S., Panchanadikar, R., Huang, T., and Wilson, S. (2023). Nationality Bias in Text Generation. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 116–122, Dubrovnik, Croatia. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.eacl-main.9
Venkit, P. N., Srinath, M., & Wilson, S. (2022). A study of implicit bias in pretrained language models against people with disabilities. In Proceedings of the 29th International Conference on Computational Linguistics (pp. 1324–1332). International Committee on Computational Linguistics, Gyeongju, Republic of Korea. https://aclanthology.org/2022.coling-1.113
Verhaeghe, P. P. (2022). Correspondence studies. In K. F. Zimmermann (Ed.), Handbook of Labor, Human Resources and Population Economics. Springer. https://doi.org/10.1007/978-3-319-57365-6_306-1
License
Copyright (c) 2026 Edgar Cruz, Alejandro T. Moreno‑Okuno, Johanna Zamilpa

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
LAER Copyright and License
Authors submitting articles to Latin American Economic Review (LAER) automatically grant this journal a license to publish. Copyright of all published material remains with the authors, who may reuse it in future work without needing to reference LAER. Similarly, any other contribution of material to the website (for example, text, photographs, graphics, video, or audio) automatically grants us a right to publish; copyright, however, remains with the author(s).
Authors release their work under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0). This license allows anyone to copy, distribute and transmit the work, provided the use has no derivatives, is non-commercial and appropriate credit to the author(s) is given. (If you remix, transform, or build upon the material, you may not distribute the modified material.)
A human-readable summary of the license:
https://creativecommons.org/licenses/by-nc-nd/4.0/
Full legal text: