The Census Bureau announces a prize competition under Section 105 of the America COMPETES Reauthorization Act of 2010, Public Law 111-358 (2011), to create a statistical model that predicts the census mail return rate of small-area geographic units based on their demographic characteristics. Census and survey participation rates vary considerably across geographic areas. For example, 2010 Census mail-form return rates varied across states from a high of 82 percent to a low of 65 percent. These differences have many causes, but they have been found to be related to population and housing characteristics. Subpopulations may differ in their lifestyles and their attitudes toward census participation, and Census planners need to develop appropriate strategies to contact respondents and gain their cooperation for timely and efficient data collection.
This competition is intended to develop a statistical model to predict census mail return rates at the Census block group level of geography. The Census Bureau will use this model for planning purposes for the decennial census and for demographic sample surveys. The model-based estimates of predicted mail return will be publicly released in a later version of the Census “planning database” containing updated demographic data.
The Census Bureau announced this competition on its public Web site on August 31, 2012. This notice is intended to formally announce the competition in the
Participants are encouraged to develop and evaluate different statistical approaches to propose the best predictive model for geographic units. The intent is to improve our current predictive analytics.
The challenge will be hosted at
(a) The 2010 Census mail form return rate will be used as the dependent measure in the model. Units of analysis are census block groups as defined by Census.
(b) The Census Return Rate Predictive Model is to be developed from the variables in our newly updated planning database, which includes selected 2010 Census and ACS 5-year estimates of characteristics that Census experience and the survey literature have found to be associated with enumeration difficulty.
(c) Participants can propose inclusion of additional variables not on the planning database as long as they meet the following criteria:
(i) The data are administrative data, such as school enrollment or other compiled data, that are publicly available at no cost, and
(ii) The data are not proprietary information, such as commercial telephone and household characteristics lists, which require purchase from a vendor.
Participants are encouraged to notify the Census Bureau of additional data sources to be used before completion of the model to assure compliance with the criteria.
(d) The models will be evaluated as outlined in the
(e) Entry materials will include the model documentation, including the prediction equation, a description of the methodology used to create it, and the algorithm or code (e.g., R, MATLAB, Python, or SAS) used to create it. The documentation will provide a thorough understanding of the methods and allow for replication in the future.
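For illustration only, the kind of prediction equation described in items (a), (b), and (e) above can be sketched as an ordinary least squares fit of block group return rates on a few predictors. This is a minimal sketch under stated assumptions, not the Census Bureau's method: the predictor names (pct_owner_occupied, pct_vacant), the function names, and the five toy rows are all hypothetical and are not taken from the planning database. Ordinary least squares is used only because it is simple; the notice does not prescribe any modeling technique.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b]; copying leaves the inputs untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_prediction_equation(
    predictors: Sequence[Sequence[float]], return_rates: Sequence[float]
) -> List[float]:
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + [float(v) for v in row] for row in predictors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, return_rates)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients: Sequence[float], row: Sequence[float]) -> float:
    """Apply the prediction equation to one block group's predictor values."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Hypothetical toy block groups: (pct_owner_occupied, pct_vacant).
# These numbers are invented for the example and are NOT Census data.
TOY_PREDICTORS = [(50, 10), (70, 30), (40, 20), (80, 5), (60, 40)]
# Toy return rates generated exactly from 60 + 0.3 * x1 - 0.2 * x2.
TOY_RETURN_RATES = [73.0, 75.0, 68.0, 83.0, 70.0]

coefficients = fit_prediction_equation(TOY_PREDICTORS, TOY_RETURN_RATES)
# Close to 73.5, because the toy rates follow the generating equation exactly.
example_prediction = predict(coefficients, (55, 15))
```

A real entry would replace the toy rows with planning database variables and would document the fitted coefficients, the methodology, and the code together, as item (e) requires.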
(a) Enter the competition through
(b) Agree to all terms of Kaggle.com;
(c) Participants may be individuals or teams. For purposes of this Notice, “Entrant” or “Entrants” refers to individual participants and each individual participating as a member of a team.
(a) Must have agreed to the rules of this competition;
(b) Are either (i) in the case of an entity, incorporated in and maintaining a primary place of business in the United States, or (ii) in the case of an individual, a citizen or permanent resident of the United States who is 18 years of age or older;
(c) Must not be a Federal entity or Federal employee acting within the scope of employment;
(d) Must assume risks, agree to indemnify, and waive claims against the
(e) Anyone whose job duties or official work capacity are closely related to the statistical model that is the subject of the competition is not eligible.
(a) The Census Bureau will monitor questions or discussion posted on the Kaggle.com competition site.
(b) Entrants may also direct questions to
(a) Until the last day of the competition, Entrants' scores and ranks on the Public Leaderboard on the Kaggle Web site will be calculated from the predicted results in an Entrant's submission and the ground truth of a validation dataset. At the close of the competition, the scores and associated ranks on the Public Leaderboard will be calculated from the predicted results and ground truth in the private testing dataset to confirm accuracy. The top three Entrants, based on the results using the private testing dataset, will be declared tentative Prize Winners.
(b) A week before the end of the competition, there will be a visualization competition. The goal of this competition will be to create insightful visualizations from the data that were provided for the predictive modeling competition. A single winner will be chosen by Kaggle community vote on the Web site. The winner of the visualization competition will receive one thousand dollars as a prize.
(c) The evaluation metric that forms the basis for the Leaderboard scores will be displayed on the Web site. Because of variability in block group population counts, the evaluation metric may be weighted by the 2010 Census population block group count.
(d) As a condition of receipt of the prize, the winner(s) must deliver the algorithm's code and documentation to the Census Bureau. The source code must contain a description of the resources required to build and run the algorithm. The individual winner, or each individual on the team if the winner is a team Entrant, will be required to complete, sign, and return a Declaration of Eligibility, Non-Exclusive License, and Release form.
(e) The prize may be delivered by U.S. mail or electronically. To facilitate electronic delivery, the winner will need to submit financial account information sufficient to support electronic transfer of the prize.
(f) Regardless of the method of delivering the prize money, the Entrant(s) may be subject to Federal and/or state income taxation. Entrant(s) may be required to fill out tax and related forms before receiving the prize. Kaggle will provide necessary forms at the end of the challenge to the winning Entrants.
(g) For more information on judging and judging procedures, please refer to
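As an illustration of the population weighting mentioned in item (c) above, the sketch below computes a population-weighted mean absolute error. This is an assumption made for the example: the notice says only that the metric will be displayed on the Web site and may be weighted by the 2010 Census block group population count, and it does not name the metric. The function name and the three toy block groups are hypothetical and are not Census data.

```python
from typing import Sequence


def weighted_mean_absolute_error(
    actual: Sequence[float],
    predicted: Sequence[float],
    population: Sequence[float],
) -> float:
    """Mean absolute error with each block group weighted by its population."""
    if not (len(actual) == len(predicted) == len(population)):
        raise ValueError("all inputs must have the same length")
    total_population = sum(population)
    if total_population <= 0:
        raise ValueError("total population weight must be positive")
    weighted_errors = sum(
        w * abs(a - p) for a, p, w in zip(actual, predicted, population)
    )
    return weighted_errors / total_population


# Hypothetical toy values, invented for the example.
toy_actual = [80.0, 70.0, 60.0]        # observed return rates (percent)
toy_predicted = [78.0, 75.0, 60.0]     # model predictions (percent)
toy_population = [1000, 500, 1500]     # block group population counts

weighted_score = weighted_mean_absolute_error(
    toy_actual, toy_predicted, toy_population
)
# Equal weights reduce the same function to the ordinary mean absolute error.
unweighted_score = weighted_mean_absolute_error(
    toy_actual, toy_predicted, [1, 1, 1]
)
```

In this toy case the largest error falls on the smallest block group, so the weighted score is lower than the unweighted one. Weighting of this kind keeps sparsely populated block groups from dominating the score.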