Frequently Asked Questions (FAQ)
Last updated: November 3, 2016
If you have questions about GJ Open, please read our FAQ below. If you still have questions about how GJ Open works, you can email us at [email protected].
GJ Open is a crowd forecasting site where you can hone your forecasting skills, learn about the world, and engage with other forecasters. On GJ Open, you can make forecasts about the likelihood of future events and learn how accurate you were and how your accuracy compares with the crowd. Unlike prediction markets and other forecasting sites, you can share your reasoning with other forecasters to challenge your assumptions.
GJ Open taps into the wisdom of the crowd. We hope to use that wisdom to better understand, and predict, the complex and ever-evolving world that we live in.
GJ Open was born out of the Good Judgment Project, a multi-year research project which showed that the wisdom of the crowd could be applied to forecasting. Good Judgment Inc was founded to bring the science of forecasting learned in the Good Judgment Project to the public. GJ Open is designed for anyone and everyone to improve their forecasting skills, and is not a scientific research project.
If you're new to forecasting, we encourage you to watch a short video about probability forecasting.
When you're ready to begin, check out our active questions to start forecasting!
Competitions on GJ Open are called Challenges. Challenges are collections of questions organized by a theme or topic. Each challenge has its own leaderboard, which ranks forecasters by how much more accurate their forecasts were than the crowd.
We encourage all forecasters to watch our short video about scoring at training.goodjudgment.com/keepingscore
We report a few different numbers to quantify your forecasting accuracy and compare it to other users on the site. An important point that sometimes confuses forecasters is that lower scores always indicate better accuracy, like in golf. Our primary measure of accuracy is called the Accuracy Score, which compares your score to the crowd. The median score is like “par” in golf, so negative Accuracy Scores are “below par” or more accurate than the crowd, and positive scores are above par, or less accurate than the crowd.
On your profile page, next to each question you’ll see several columns. Here’s a more detailed explanation of each:
Brier Score: The Brier score was originally proposed to quantify the accuracy of weather forecasts, but can be used to describe the accuracy of any probabilistic forecast. Roughly, the Brier score indicates how far away from the truth your forecast was.
The Brier score is the squared error of a probabilistic forecast. To calculate it, we divide your forecast by 100 so that your probabilities range between 0 (0%) and 1 (100%). Then, we code reality as either 0 (if the event did not happen) or 1 (if the event did happen). For each answer option, we take the difference between your forecast and the correct answer, square the differences, and add them all together. For a yes/no question where you forecasted 70% and the event happened, your score would be (1 – 0.7)² + (0 – 0.3)² = 0.18. For a question with three possible outcomes (A, B, C) where you forecasted A = 60%, B = 10%, C = 30% and A occurred, your score would be (1 – 0.6)² + (0 – 0.1)² + (0 – 0.3)² = 0.26. The best (lowest) possible Brier score is 0, and the worst (highest) possible Brier score is 2.
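As a rough illustration, the calculation described above can be sketched in a few lines of Python. The function name and argument layout are our own choices for this example, not part of the site.

```python
def brier_score(forecast_percent, correct_index):
    """Sum of squared errors across all answer options.

    forecast_percent: one probability per answer option, in percent (0-100).
    correct_index: position of the answer option that actually occurred.
    """
    # Divide by 100 so probabilities range between 0 and 1.
    probabilities = [p / 100 for p in forecast_percent]
    # Reality is coded as 1 for the option that happened and 0 for the rest.
    return sum(
        (p - (1.0 if i == correct_index else 0.0)) ** 2
        for i, p in enumerate(probabilities)
    )


# Yes/no question: 70% on "yes", and "yes" happened.
print(round(brier_score([70, 30], 0), 2))      # 0.18
# Three options with A = 60%, B = 10%, C = 30%, and A occurred.
print(round(brier_score([60, 10, 30], 0), 2))  # 0.26
```

Putting 100% on an option that does not occur gives the worst possible score of 2, and putting 100% on the option that does occur gives the best possible score of 0.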
To determine your accuracy over the lifetime of a question, we calculate a Brier score for every day on which you had an active forecast, then take the average of those daily Brier scores and report it on your profile page. On days before you make your first forecast on a question, you do not receive a Brier score. Once you make a forecast on a question, we carry that forecast forward each day until you update it by submitting a new forecast.
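The carry-forward rule can be sketched as follows. This is a simplified illustration: days are numbered from 0, probabilities are already on a 0 to 1 scale, and we assume at most one forecast per day. The helper names are hypothetical.

```python
def brier(probabilities, correct_index):
    """Sum of squared errors across answer options (probabilities in 0..1)."""
    return sum(
        (p - (1.0 if i == correct_index else 0.0)) ** 2
        for i, p in enumerate(probabilities)
    )


def average_daily_brier(forecasts_by_day, num_days, correct_index):
    """Average of daily Brier scores over the days with an active forecast.

    forecasts_by_day: dict mapping a day number to the forecast made that day.
    num_days: number of days the question was open.
    """
    active = None          # no score is given before the first forecast
    daily_scores = []
    for day in range(num_days):
        if day in forecasts_by_day:
            active = forecasts_by_day[day]   # a new forecast replaces the old one
        if active is not None:
            daily_scores.append(brier(active, correct_index))  # carried forward
    if not daily_scores:
        return None
    return sum(daily_scores) / len(daily_scores)


# Question open for 5 days; first forecast on day 1, update on day 3.
forecasts = {1: [0.5, 0.5], 3: [0.9, 0.1]}
print(round(average_daily_brier(forecasts, 5, 0), 2))  # 0.26
```

In this example day 0 is not scored, days 1 and 2 each score 0.5, and days 3 and 4 each score 0.02, so the average over the four active days is 0.26.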
The Brier Score listed in large font near the top of your profile page is the average of all of your questions’ Brier scores.
Median Score: The Median score is simply the median of all Brier scores from all users with an active forecast on a question (in other words, forecasts made on or before that day). Like with your Brier score, we calculate a Median score for each day that a question is open, and the Median score reported on the profile page is the average median score for those days when you had an active forecast. We also report the average across all questions on which you made forecasts (in parentheses under your overall Brier score).
Accuracy Score: The Accuracy Score is how we quantify how much more or less accurate you were than the crowd. It’s what we use to determine your position on leaderboards for Challenges and individual questions.
To calculate your Accuracy Score for a single question, we take your average daily Brier score and subtract the average Median daily Brier score of the crowd. Then, we multiply the difference by your Participation Rate, which is the percentage of possible days on which you had an active forecast. That means negative scores indicate you were more accurate than the crowd, and positive scores indicate you were less accurate than the crowd (on average).
For Challenges, we calculate your Accuracy Score for each question and add them together to calculate your cumulative Accuracy Score. On questions where you don’t make a forecast, your Accuracy Score is 0, so you aren’t penalized for skipping a question.
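The Accuracy Score arithmetic above can be illustrated with a short Python sketch. The data layout (dictionaries keyed by day number) and the function name are assumptions made for this example only.

```python
from statistics import median


def accuracy_score(my_daily_brier, crowd_daily_brier, total_days):
    """Accuracy Score for one question.

    my_daily_brier: dict of day -> your Brier score on days you had an
        active forecast.
    crowd_daily_brier: dict of day -> list of Brier scores of all users
        with an active forecast that day (yours included).
    total_days: number of days the question was open.
    """
    if not my_daily_brier:
        return 0.0  # skipped questions score 0
    days = sorted(my_daily_brier)
    my_average = sum(my_daily_brier[d] for d in days) / len(days)
    # The crowd's median is averaged only over days you were active.
    median_average = sum(median(crowd_daily_brier[d]) for d in days) / len(days)
    participation_rate = len(days) / total_days
    return (my_average - median_average) * participation_rate


mine = {2: 0.18, 3: 0.18}
crowd = {2: [0.18, 0.5, 0.32], 3: [0.18, 0.5, 0.08]}
question_score = accuracy_score(mine, crowd, total_days=4)
print(round(question_score, 3))  # -0.035

# Cumulative Challenge score: sum over questions, with 0 for a skipped one.
challenge_scores = [question_score, accuracy_score({}, {}, 4), 0.02]
print(round(sum(challenge_scores), 3))  # -0.015
```

Here the forecaster beat the crowd's median by 0.07 on average but was active on only half of the days, so the Accuracy Score for the question is -0.035.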
The Consensus Trend graph displays the median of the most recent 40% of the current forecasts from each forecaster. In other words, it reflects the consensus of the most recent 40% of forecasters. We’ve found in our experience with GJP that 40% provides a good mix of recent activity and historical perspective for our types of questions.
This means that the trend may not change very much when one or even many users make forecasts that differ from the trend. We do this deliberately in order to display a consensus forecast that is not overly influenced by outlier forecasts but still reflects the most recent wisdom of the crowd.
Importantly, because it reflects some historical perspective, the Consensus Trend graph does not show the median of all forecasts on each day and might lag behind a little bit. Your Accuracy Score, however, is based only on the median Brier score of all active forecasts on each day – no matter when they were made. This means that even if you beat the consensus displayed for a given day, you might not beat the median forecast on that date. The purpose of the graph is not to anchor your forecast against the “number to beat,” but to provide an informative estimate of the general consensus.
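A minimal sketch of the Consensus Trend calculation, assuming each forecaster's current forecast is stored as a (timestamp, probability) pair. How the 40% count is rounded is not stated in this FAQ, so rounding up (with a minimum of one forecast) is an assumption of this example.

```python
import math
from statistics import median


def consensus(current_forecasts, fraction=0.4):
    """Median of the most recent `fraction` of current forecasts.

    current_forecasts: list of (timestamp, probability) pairs, one per
        forecaster, holding that forecaster's latest forecast.
    """
    if not current_forecasts:
        return None
    newest_first = sorted(current_forecasts, key=lambda f: f[0], reverse=True)
    # Assumption: round the 40% cutoff up and keep at least one forecast.
    keep = max(1, math.ceil(len(newest_first) * fraction))
    return median([probability for _, probability in newest_first[:keep]])


# Ten forecasters; the timestamp is just a counter (10 = most recent).
forecasts = [
    (1, 0.2), (2, 0.2), (3, 0.3), (4, 0.3), (5, 0.3),
    (6, 0.4), (7, 0.6), (8, 0.7), (9, 0.7), (10, 0.8),
]
print(consensus(forecasts))                  # 0.7
print(median([p for _, p in forecasts]))     # median of all current forecasts
```

The consensus uses only the four most recent forecasts here, so it can differ noticeably from the median of all active forecasts, which is the quantity that matters for the Accuracy Score.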
Some forecasting questions require the assignment of probabilities across answer options that are arranged in a specific order. The most common examples in our forecasting tournament are questions that ask the likelihood that an event will occur during one of three or more date ranges or the likelihood that the value of a quantitative variable (such as the number of refugees or the price of a barrel of oil) will fall within one of three or more quantity ranges.
Our usual Brier scoring rule does not consider the order of the answer options and therefore does not give any credit for “near-misses.” Therefore, the usual rule treats a forecaster whose prediction is “wrong” as a matter of rounding error as being no more accurate than a forecaster whose prediction is off by an order of magnitude.
To address this issue, we have adopted a special “ordered categorical scoring rule” for questions with multiple answer options that are arranged in a special order. For more information on how scores are calculated, you can read this PDF document.
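This FAQ does not spell out the ordered rule itself; the PDF does. Purely as an illustration of how a rule can give credit for near-misses, the sketch below uses one construction sometimes used for ordered categories: split the ordered options into "at or below" and "above" at every boundary, compute a two-option Brier score for each split, and average them. Treat this as an assumption for the example, not as the exact rule used on the site.

```python
def ordered_brier(probabilities, correct_index):
    """Average two-option Brier score over every cumulative split.

    probabilities: one probability (0..1) per answer option, in order.
    correct_index: position of the answer option that occurred.
    """
    n = len(probabilities)
    split_scores = []
    for k in range(1, n):
        p_low = sum(probabilities[:k])            # probability on options 0..k-1
        happened_low = 1.0 if correct_index < k else 0.0
        # Two-option Brier score: both sides of the split have the same error.
        split_scores.append(2 * (p_low - happened_low) ** 2)
    return sum(split_scores) / len(split_scores)


# Four ordered bins; the first bin (index 0) is what actually happened.
near_miss = [0.0, 1.0, 0.0, 0.0]   # everything on the adjacent bin
far_miss = [0.0, 0.0, 0.0, 1.0]    # everything on the farthest bin
print(round(ordered_brier(near_miss, 0), 3))  # 0.667
print(round(ordered_brier(far_miss, 0), 3))   # 2.0
```

Under the usual rule both forecasts would score 2. Under this illustrative ordered rule the near-miss scores better than the far miss, which is the behavior the paragraph above describes.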
Conditional questions ask you to make predictions on whether an event will occur IF the condition presented occurs. If the condition does not occur, the question will be voided and not scored.
Open questions allow you to suggest new questions, share your opinion on a particular topic, or discuss something without making a forecast. Open questions are never scored for accuracy.
There is no way to delete a forecast or withdraw from a question. We do this in order to avoid situations where a forecaster withdraws or deletes their forecast when it becomes clear that they will receive a bad score. If you unfollow a question on which you’ve made a forecast, it does not affect your score – it affects only your notifications and where you can find the question on the site.
The best way to suggest a new question is through one of our open question threads. Currently, you can suggest any question in our general open question. Check out all of our Open questions to see if you can suggest a question for a particular Challenge.
If you're interested in learning some of the strategies that Superforecasters use, we recommend viewing the videos available at training.goodjudgment.com and reading Superforecasting: The Art and Science of Prediction by Philip E. Tetlock and Dan Gardner.
Last updated: September 19, 2016
Our etiquette policy can be summarized as “Be kind, respect others, and stay on topic.” Most forecasters participate in this site to test and improve their forecasting ability, and our goal is to provide a fun and constructive place for everyone to do so. We don’t censor comments, usernames, or taglines that follow this guideline and do not violate our Terms of Service. As long as you participate in a manner that doesn’t disrespect other users or the goals of the site, we welcome you with open arms.
Some things that we don’t allow are: (1) meanness, threats, or hostile personal attacks against other users or the site administrators; (2) spamming, recruiting, or soliciting other users; (3) “crusading” or repeatedly steering discussion away from forecasting, particularly for the purposes of (1) and (2).
If you believe that another forecaster is violating this policy, we encourage you to flag their comment by clicking the green “Flag” text below their forecast, or to report their behavior to the site administrators by using the “Click here for help” popover in the lower left corner.
When a forecaster violates this policy, we will take the following actions:
- First minor infraction: Warning and reminder of the policy
- First significant infraction or any repeated infractions: Suspension for a week
- Repeated significant infractions, violations of the law, or specific threats against other forecasters will be resolved on a case-by-case basis, and may include a permanent ban from the site.
You can read more about our policies on user conduct in our Terms of Service.
You can flag any comments against our Terms of Service for attention by our system administrators. However, please don’t flag comments just because you disagree with them. If you find that you're unable to persuade another forecaster, let your forecast speak for itself. Eventually, your scores will show who was right and who was wrong.
Question Clarifications & Resolutions
Last updated: November 3, 2016
This section describes our policies related to forecasting questions. If you're confused about a specific forecasting question, please read our FAQ below. If you still have questions, you can email us at [email protected].
Generally, the outcome of the forecasting problem will be determined by credible open source media reporting (e.g., Reuters, BBC, AP). Some questions may specify that resolution will be determined by specific sources (e.g., UNHCR data, Moody's).
Assessing forecasting accuracy requires questions with clear answers that can be objectively scored, but the details of many important geo-political events are difficult to anticipate. Many different scenarios and situations can produce the same result and there is a tension between how specific we are in delineating which scenarios will resolve a question and the relevance of that question. In many cases it is possible to forecast on the larger, more meaningful events, even if it is impractical to specify all the potential ways in which those events can occur. In these cases - and especially when the relevant event has significant implications - focusing only on one or two well defined scenarios risks missing important ways in which the event can unfold, making those forecasts considerably less relevant and inconsistent with what has occurred in the real world. We strive to strike a balance by asking questions that can be objectively scored without making them unnecessarily narrow. For some events, very specific questions can be asked. Other events involve more uncertainty because the ways in which the event can occur are many and varied, even if the event itself is quite clear. We believe that crowd-sourced forecasts can still be useful in predicting these important events, so we ask some questions with less specificity. After all, the value of good judgment lies in its ability to inform good decisions in an uncertain world.
Some forecasters will enjoy making predictions on these questions, even though they involve more uncertainty. Others will prefer to stick to the narrower questions that have more specific resolution criteria. We try to have a mix of both types of questions so that those who prefer strict falsifiability and stringent technical requirements can focus on those questions, while those who prefer the more uncertain (but often very relevant and important) questions can focus on those. The goal of GJ Open is to provide a forum where people can hone their forecasting skills while engaging on the important questions of the day: forecasting will be most rewarding if you choose the questions you are most comfortable with and whose topics are most intriguing to you.
Generally, media reports from outlets based in “free” nations (e.g., per Freedom House) are to be trusted above reports from outlets whose civil liberties and/or media freedom are contested. In addition, Western media outlets with large circulation and good reputations are treated as having high credibility. Reports from small, local outlets will be assigned less credibility as a rule, though this will be treated on a case-by-case basis. In other cases, minor sources may be deemed credible with regard to specific issues. For example, Syria’s official news agency (SANA) will not generally be considered credible, but it may suffice as a credible source for official announcements from the Syrian government.
For questions that ask about minimum, maximum, or total values over a specified time period, the question will be resolved based on the most recent data available on the question’s closing date.
For questions that ask about data conveyed by a specific report, the question will remain open until the relevant data is reported by the specified source.
In cases of substantial controversy or uncertainty about an outcome or the credibility of a source, we may take various steps, such as referring the question to our outside subject matter expert consultants or declaring the question invalid/void. In any case, we reserve the right to make the final decision, in our sole discretion, regarding the resolution of questions.
We will generally close questions based on when events have occurred, rather than on when events are reported in the media. Although we know that forecasters must take into account the likelihood of open source reporting when making predictions, our scoring remains consistent by focusing on when events themselves occur. This may occasionally mean that questions have retroactive closing dates, but by focusing on actual events rather than media coverage of those events, our forecasts will be more relevant to decision makers.
The official closing date of the question will be listed in the Question Information section after an event has occurred. Forecasts made through 11:59pm Pacific time on the day prior to the official closing will be scored for accuracy. Forecasts made after this time will not be scored.
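The cutoff described above amounts to a simple date comparison. The sketch below assumes the forecast timestamp has already been converted to Pacific time; the function name is hypothetical.

```python
from datetime import date, datetime


def is_scored(forecast_time_pacific, official_closing_date):
    """True if the forecast was made through 11:59pm Pacific on the day
    prior to the official closing date.

    forecast_time_pacific: datetime of the forecast, already expressed in
        Pacific time (an assumption of this sketch).
    official_closing_date: the date listed as the official closing date.
    """
    return forecast_time_pacific.date() < official_closing_date


closing = date(2016, 11, 3)
print(is_scored(datetime(2016, 11, 2, 23, 59), closing))  # True
print(is_scored(datetime(2016, 11, 3, 0, 0), closing))    # False
```

A forecast entered at any time on the official closing date itself falls after the cutoff and is not scored.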
If you believe that a question should close, please send us a resolution request by emailing [email protected] or clicking on the "request resolution" button at the top right of the Question Information section. If you are unsure whether an event has closed the question, you can email us at [email protected].
Sometimes, we keep questions open even after the outcome appears to be known, to confirm the answer or to allow you to continue to interact with one another about the question as events unfold. In these situations the question will be closed retroactively and the official closing date listed in the Question Information section. Only forecasts made through 11:59 PM PT the day prior to the official closing date will be counted when we calculate scores.
Time Zone: Unless otherwise specified, we will use Pacific Time to evaluate deadlines (e.g., if the forecasting question asks about an event occurring before 31 December 2015, the deadline is 23:59:59 Pacific on 31 December 2015).
“Before” vs “As Of”: When a question asks about a situation "as of" a certain date, the question and all answer options will generally remain open until the specified end date. The goal of these questions is to gauge how a situation will look at a certain point in time. For questions about events occurring "before" or "by" a certain date, the question and/or individual answer options will generally close as soon as an event has transpired prior to the specified end date.
No, only events that occur after the question’s launch date count towards the resolution of the question. The only exception is when a question asks for forecasts for a specific time period that may have begun prior to the release of the question (e.g., “What will be the total number of X sold during 2016?”). These questions will clearly state the timeframe that the forecast needs to address and may indicate that the tallying of the number of items sold or events began before the beginning of the forecasting question.
Unless otherwise specified, forecasting questions will be resolved when relevant data is initially released, and subsequent revisions to the data will not affect the question’s resolution.
In general the precision of our threshold will match the source’s precision, so rounding will not be an issue. When this does not hold, rounding will not be used (e.g., 1.011 will be considered greater than 1.01).
If the question itself specifies a deadline (e.g., “Will North Korea conduct a missile test before 1 January 2016?”), then events that occur after the deadline will not count. Some questions do not include a specific deadline; in those cases, we set a likely closing date. For example, questions about election outcomes often have a provisional end date set to coincide with the election date. However, if the question is not resolved by then (e.g., parties take additional time to negotiate before naming a Prime Minister), the end date can, and will, be pushed back. In all cases the question will be closed as of 11:59pm Pacific on the day prior to the event occurring. See above for more details on deadlines.
No, we do not close any answer options until the question officially ends. We do, however, encourage you to update your forecasts when those bins become obsolete.
Military operations are incredibly complex. Because the details associated with these events are hard to anticipate beforehand, GJI tries to use simple, natural language to communicate the big-picture goal of the question (e.g., whether troops will be deployed, whether a ground offensive will be launched, whether key cities will be taken). Because of the complexity of military operations, these questions have a higher degree of uncertainty than other types of questions. In some cases, there may even be a grey-zone period during the course of an event unfolding in which the outcome is unclear. Having simple resolution criteria gives GJI's question team flexibility to evaluate all the details so that resolution decisions accurately reflect what is happening on the ground. Please keep this inherent uncertainty in mind when deciding whether to answer these questions and when making your forecasts.
We do not generally issue guidance on how we will score hypothetical scenarios because we want to retain flexibility to evaluate all of the relevant details of an event before making a decision. Some questions involve more inherent uncertainty. In those cases, brainstorming potential scenarios and thinking about the probabilities of those scenarios unfolding can be a useful analytical technique.
Sometimes, we ask questions where the answer might not be known before the challenge itself ends. For example, a question might ask about whether an event will happen before 2018, but will be part of a challenge that ends in 2017. Generally, we'll handle these questions in one of two ways:
1. If the answer to the question is known before the challenge ends (usually, because the event occurred), we'll score the question and use it to rank forecasters on the challenge leaderboard.
2. If the answer to the question is unknown when the challenge ends (usually, the event has not yet happened, but the question has not ended even though the challenge has), we will not score the question as part of the challenge, and we'll roll the question over into the next version of the challenge or a similar challenge.
The best resource is other forecasters on the site. If you are confused about a question, reach out to your fellow forecasters to get their perspective and insight. This can be immeasurably helpful both in understanding the question and in forecasting it. The above FAQ also addresses specific issues that have arisen in the past. If these resources don't answer your question, please submit an official clarification request by emailing [email protected]. You can also request a clarification by clicking on the "ask us for help" hyperlink at the bottom of the Question Information section for the relevant question.
Good Judgment takes clarification requests seriously and will respond to your request after taking time to investigate the issue and, if necessary, consult with the question's sponsor. If an official clarification is released for all forecasters, it will be added to the Question Information section with the date of release.
When a question asks about an 'announcement', legal actions subsequent to the announcement (e.g., actions by a court) do not affect the resolution of a question. When a question asks about a law, policy, order, or similar act of a legal nature going into effect, subsequent legal actions (including, but not limited to, actions by a court) will only affect the resolution of the question if the subsequent legal actions prevent the act of a legal nature from taking effect. For instance, if a question asks about whether or not a piece of legislation will become law, that legislation becoming law would close the question irrespective of an injunction being granted by a court to prevent further enforcement of that law. However, if a question asks about a policy change and an executive order is signed which makes that change but an injunction is granted before the change takes effect, the question would not resolve.
Last updated: September 19, 2016
Badges are public awards you can earn by participating on GJ Open. We award new badges (and expire old ones) on the first of each month. The awarding of badges that require a minimum level of activity to earn will be based on your activity from the preceding calendar month. Badges do not affect how we score your forecasts.
Here is a list of all available badges on the site, and the rules for earning and keeping them:
Frequent Forecaster: For those who are dedicated to making forecasts and updating their beliefs. You will earn this badge if you made at least 10 forecasts in the previous calendar month. This badge will expire at the end of every month and be re-awarded based on the previous month's activity, so if you want to keep it, you have to stay active!
Influencer: The best forecasters don't just put their reputation on the line, but explain how they chose their probabilities. You will earn this badge if you made at least 2 comments (as rationales accompanying forecasts or replies to other forecasters) with at least 2 upvotes each in the previous calendar month. This badge will expire at the end of every month and be re-awarded based on the previous month's activity, so if you want to keep it, keep making high-quality comments!
Economist MVP (Most Valuable Predictor): You proved your value by out-forecasting the competition in The Economist's World in 2016 Challenge. This permanent badge was awarded to the top 20% (by Accuracy Score) of forecasters who answered at least 10 questions in the challenge.
Rationale badges: Show your work! Each rationale you include with a forecast counts toward your running total of rationales. You'll earn a badge for a total of 25, 50, and 100 rationales. Each month, we'll check again to see if you've earned the next level.
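The activity-based rules above can be summarized in a short sketch. The function name, argument layout, and the label used for rationale badges are hypothetical. The thresholds come from the badge descriptions above.

```python
def monthly_badges(forecasts_last_month, comment_upvotes_last_month, total_rationales):
    """Badges a forecaster would hold after a monthly check.

    forecasts_last_month: number of forecasts made in the previous calendar month.
    comment_upvotes_last_month: upvote count for each comment made in the
        previous calendar month.
    total_rationales: running total of rationales included with forecasts.
    """
    badges = []
    # Frequent Forecaster: at least 10 forecasts in the previous month.
    if forecasts_last_month >= 10:
        badges.append("Frequent Forecaster")
    # Influencer: at least 2 comments with at least 2 upvotes each.
    if sum(1 for upvotes in comment_upvotes_last_month if upvotes >= 2) >= 2:
        badges.append("Influencer")
    # Rationale badges: one for each level reached (25, 50, 100).
    for level in (25, 50, 100):
        if total_rationales >= level:
            badges.append(f"Rationales: {level}")  # hypothetical label
    return badges


print(monthly_badges(12, [3, 0, 2], 60))
# ['Frequent Forecaster', 'Influencer', 'Rationales: 25', 'Rationales: 50']
```

Because the first two badges depend only on the previous month's activity, a quiet month drops them, while the rationale levels are based on a running total.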
We'll be adding more badges to the site over time. If you have an idea for a badge, let us know via the "Click here for help" menu in the lower left corner of any page.
You can choose which badges will be displayed on your profile page, under your tag line, by editing your profile, navigating to Badges on the left side, and checking the Featured? box next to the badges you've earned. All badges that you've earned will be displayed on the Badges tab of your profile, regardless of whether you choose to feature them.