The University Forecasting Challenge asks:

Will a Google large language model (LLM) be ranked first as of 12 December 2025, according to LMArena's "Text Arena"?

Started Oct 03, 2025 05:00PM UTC
Closed Dec 12, 2025 08:01AM UTC

LMArena is an open-source platform for crowdsourced AI benchmarking, created by researchers from UC Berkeley's SkyLab. For more information on how the leaderboard is constructed, see LMArena - Blog.

The question will be suspended on 11 December 2025, and the outcome will be determined using the ranks reported by LMArena at approximately 5:00 p.m. ET on 12 December 2025 (LM Arena - Text Arena Leaderboard, see "Rank (UB)"). As of 30 September 2025, Google was ranked first, with its "gemini-2.5-pro" scoring 1456, followed by Anthropic's "claude-opus-4-1-20250805-thinking-16k" at 1449. In the event of a tie for first place between LLMs from different organizations, the LLM with the higher "Score" will be considered first, with the "Votes" total as the next tie-breaker. If the named source changes the way it presents the data, further instructions will be provided.
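
To make the resolution rule concrete, here is a minimal Python sketch of the tie-breaking order described above. The field names (`rank_ub`, `score`, `votes`), the tied ranks, and the vote counts are illustrative assumptions, not LMArena's actual data schema; only the two scores come from the 30 September 2025 snapshot.

```python
def first_place(leaderboard):
    """Pick the top entry: lowest Rank (UB), with ties broken by the
    higher Score and then by the higher Votes total, per the question's
    resolution criteria."""
    return min(leaderboard, key=lambda m: (m["rank_ub"], -m["score"], -m["votes"]))

# Illustrative rows: the scores match the 30 September 2025 snapshot, but
# the tied ranks and the vote counts are invented to exercise the tie-break.
leaderboard = [
    {"org": "Google", "model": "gemini-2.5-pro",
     "rank_ub": 1, "score": 1456, "votes": 50_000},
    {"org": "Anthropic", "model": "claude-opus-4-1-20250805-thinking-16k",
     "rank_ub": 1, "score": 1449, "votes": 48_000},
]

top = first_place(leaderboard)
print("Resolves Yes" if top["org"] == "Google" else "Resolves No")
```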



The question closed "Yes" with a closing date of 12 December 2025.

See our FAQ to learn about how we resolve questions and how scores are calculated.

Possible Answer    Correct?    Final Crowd Forecast
Yes                ✓           97%
No                             3%

Crowd Forecast Profile

Participation Level
  Number of Forecasters: 47 (average for questions in their first 6 months: 134)
  Number of Forecasts: 212 (average for questions in their first 6 months: 361)

Accuracy
  [Chart: participants in this question vs. the all-forecasters average]

Most Accurate (Relative Brier Score)
1.  -0.090421
2.  -0.088957
4.  -0.085847
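
For context on the Relative Brier Scores above: a Brier score measures the squared error of a probability forecast (lower is better), and a negative relative score means the forecaster beat the crowd. The sketch below assumes the standard two-alternative Brier score and treats the relative score as simply the forecaster's Brier minus the crowd's; Good Judgment's actual scoring averages over every day a question is open, so this is a simplification, and the 99% forecaster is hypothetical.

```python
def brier(p_yes: float, resolved_yes: bool) -> float:
    """Two-alternative Brier score: the sum of squared errors across both
    answer options. 0.0 is perfect; 2.0 is maximally wrong."""
    y = 1.0 if resolved_yes else 0.0
    return (p_yes - y) ** 2 + ((1.0 - p_yes) - (1.0 - y)) ** 2

# The final crowd forecast here was 97% "Yes" and the question resolved Yes:
crowd_score = brier(0.97, True)       # (0.03)^2 + (0.03)^2 = 0.0018
forecaster_score = brier(0.99, True)  # hypothetical forecaster at 99% "Yes"

# Negative means the forecaster beat the crowd on this simplified measure.
print(round(forecaster_score - crowd_score, 6))  # -0.0016
```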

[Chart: Recent Consensus, Probability Over Time]
