Preprint | Citations: 0 | 2025
Camellia: Benchmarking Cultural Biases in LLMs for Asian Languages
Tarek Naous, Anagha Savit, Carlos Rafael Catalan, Geyang Guo, Jaehyeok Lee, K. Lee, Lheane Marie Dizon, Mengyu Ye, Neel Kothari, Sahajpreet Singh, Sarah Masud, Tanish Patwa, Trung Thanh Tran, Zohaib Khan, Alan Ritter, JinYeong Bak, Keisuke Sakaguchi, Tanmoy Chakraborty, Yuki Arase, Wei Xu
ArXiv.org
Abstract

As Large Language Models (LLMs) gain stronger multilingual capabilities, their ability to handle culturally diverse entities becomes crucial. Prior work has shown that LLMs often favor Western-associated entities in Arabic, raising concerns about cultural fairness. Due to the lack of multilingual benchmarks, it remains unclear if such biases also manifest in different non-Western languages. In this paper, we introduce Camellia, a benchmark for measuring entity-centric cultural biases in nine Asian languages spanning six distinct Asian cultures. Camellia includes 19,530 entities manually annotated for association with the specific Asian or Western culture, as well as 2,173 naturally occurring masked contexts for entities derived from social media posts. Using Camellia, we evaluate cultural biases in four recent multilingual LLM families across various tasks such as cultural context adaptation, sentiment association, and entity extractive QA. Our analyses show a struggle by LLMs at cultural adaptation in all Asian languages, with performance differing across models developed in regions with varying access to culturally-relevant data. We further observe that different LLM families hold their distinct biases, differing in how they associate cultures with particular sentiments. Lastly, we find that LLMs struggle with context understanding in Asian languages, creating performance gaps between cultures in entity extraction.

Keywords
Benchmarking, Adaptation (eye), Context (archaeology), Asian Indian, Cultural diversity, Asian americans, Cultural bias, Languages of Asia
Type
preprint
IF / Citations
- / 0
Publication year
2025
