Among the best (highest-ranked) large language models (LLMs) of each organization in the Chatbot Arena LLM Leaderboard, what will be the rank of Elon Musk's xAI as of 17 October 2025?

Started Apr 18, 2025 04:00PM UTC
Closed Oct 17, 2025 07:01AM UTC

Elon Musk's artificial intelligence startup, xAI, has joined the likes of OpenAI, Google, and Anthropic in developing cutting-edge LLMs (Guardian, Testing Catalog, x.ai). The question will be suspended on 16 October 2025 and the outcome determined using the "Arena Score" data as reported by the Chatbot Arena LLM Leaderboard at approximately 5:00PM ET on 17 October 2025 (Chatbot Arena, click the arrow to the right of "Arena Score" and see the "Organization" column). As of 14 April 2025, xAI ranked third as an organization, with its "Grok-3-Preview-02-24" model ranked behind Google's best model [Gemini-2.5-Pro-Exp-03-25] and OpenAI's best model [ChatGPT-4o-latest (2025-03-26)]. For more information on how the leaderboard is constructed, see (LMSYS Blog). In the event of an Arena Score tie for a rank among models from different organizations, the LLM with the higher number of "Votes" will be deemed to be ranked higher. If an organization has two or more models ranked higher than the highest-ranked model of xAI, that organization will be counted only once.
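The resolution rule above can be sketched in a few lines of Python. This is only an illustrative reading of the rule, not the site's actual resolution procedure: the `Entry` fields, the function names, the organization names other than xAI, and all scores and vote counts below are made-up placeholders, not leaderboard data. The rule does not say what happens if both Arena Score and Votes are tied; this sketch simply keeps the input order in that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical structure, not the site's schema)."""
    model: str
    organization: str
    arena_score: float
    votes: int


def organization_rank(entries, organization):
    """Return the 1-based rank of `organization`, counting each organization
    once by its best model (higher Arena Score first, then higher Votes)."""
    best = {}
    for e in entries:
        current = best.get(e.organization)
        if current is None or (e.arena_score, e.votes) > (current.arena_score, current.votes):
            best[e.organization] = e
    if organization not in best:
        raise ValueError(f"no models listed for {organization!r}")
    # sorted() is stable, so full ties keep their input order.
    ordered = sorted(best.values(), key=lambda e: (e.arena_score, e.votes), reverse=True)
    return [e.organization for e in ordered].index(organization) + 1


def answer_bin(rank):
    """Map a numeric rank onto the question's answer options."""
    return {1: "1st", 2: "2nd", 3: "3rd", 4: "4th"}.get(rank, "5th or lower")


# Placeholder rows for illustration only; none of these are real scores.
entries = [
    Entry("model-a1", "OrgA", 1400, 5000),
    Entry("model-a2", "OrgA", 1390, 7000),  # second OrgA model: OrgA still counts once
    Entry("model-b1", "OrgB", 1380, 4000),
    Entry("grok-x", "xAI", 1380, 6000),     # ties OrgB on score, wins on votes
    Entry("model-c1", "OrgC", 1370, 9000),
]

print(answer_bin(organization_rank(entries, "xAI")))
```

In this made-up example, OrgA's two models count as one organization, and xAI's tie with OrgB on score is broken in xAI's favor by its higher vote count.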


NOTE 27 May 2025: The source has changed the layout of its site. The "Text Arena" leaderboard will be used for resolution (https://lmarena.ai/leaderboard/text).


The question closed "5th or lower" with a closing date of 17 October 2025.

Possible Answer   Correct?   Final Crowd Forecast
1st               No         1%
2nd               No         1%
3rd               No         1%
4th               No         2%
5th or lower      Yes        95%

Crowd Forecast Profile

Participation Level
Number of Forecasters: 45 (average for questions older than 6 months: 160)
Number of Forecasts: 178 (average for questions older than 6 months: 475)
Most Accurate (Relative Brier Score)

1. -0.243427
2. -0.200004
3. -0.174933
4. -0.132179
5. -0.094908
