Painting a Portrait of Canada: The 2021 Census of Population
6. Data processing

Statistics Canada is committed to ensuring that high-quality information on Canadian communities from coast to coast to coast is easily accessible, available in a range of media formats and published as quickly as possible.

To achieve this, the millions of census questionnaires received electronically and by mail undergo a series of carefully designed and monitored processing steps. Statistical design, quality assurance and validation underpin every stage of data collection, processing and analysis.

Receipt and registration

Canada Post completes the initial registration of returned paper questionnaires by scanning the barcodes on the see-through portion of the return envelope. This step ensures that census employees can follow up with non-responding households in a timely fashion. Canada Post employees do not have access to questionnaire answers.

Sealed questionnaires are sent to Statistics Canada’s Data Operations Centre (DOC), where registration is completed.

Questionnaires submitted online by respondents or completed over the phone with the help of a Census Help Line operator are registered automatically in the processing system.
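The link between registration and non-response follow-up can be illustrated with a minimal sketch. It assumes each dwelling has a unique identifier and that registration simply records which identifiers have come back; the function name and the identifiers are hypothetical and do not describe Statistics Canada's actual systems.

```python
def non_responding(dwelling_ids, paper_registered, online_registered):
    """Return the dwelling IDs with no registered questionnaire, sorted."""
    # A dwelling counts as registered if it came back through either channel.
    registered = set(paper_registered) | set(online_registered)
    return sorted(set(dwelling_ids) - registered)


# Hypothetical identifiers, for illustration only.
all_dwellings = ["D001", "D002", "D003", "D004"]
paper = ["D002"]   # barcode scanned on the return envelope
online = ["D004"]  # registered automatically on submission

follow_up = non_responding(all_dwellings, paper, online)
print(follow_up)
```

The earlier a return is registered, the sooner its dwelling drops out of the follow-up list, which is why registration happens before the envelope is even opened.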

Preparation of paper questionnaires

Paper questionnaires are removed from their envelopes by Statistics Canada employees and prepared for scanning.

Cutting: Questionnaire booklets are separated into single sheets and placed into batches to be scanned.

Transcription: Damaged questionnaires that do not meet the scanning requirements are transcribed onto a new questionnaire of the same type, then scanned.

Scanning: Questionnaires are converted to digital images using high-speed scanners.

Data capture

Automated data capture: Optical mark recognition and optical character recognition technologies are used to extract respondent data.

Manual keying: Responses are keyed manually when the automated recognition system detects inconsistencies in them, for example when a person’s handwriting is difficult to decipher.
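One way to picture the split between automated capture and manual keying is a confidence threshold. The sketch below is an assumption for illustration: the text does not say how inconsistencies are detected, and the `confidence` score, the `0.9` threshold and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CapturedField:
    name: str
    value: str
    confidence: float  # hypothetical recognition score between 0 and 1


def route_fields(fields, threshold=0.9):
    """Split field names into those accepted as captured and those sent to keying."""
    accepted, to_key = [], []
    for field in fields:
        if field.confidence >= threshold:
            accepted.append(field.name)
        else:
            to_key.append(field.name)
    return accepted, to_key


# Hypothetical captured fields from one scanned questionnaire.
fields = [
    CapturedField("surname", "Tremblay", 0.98),
    CapturedField("birth_year", "19?4", 0.41),  # hard-to-read handwriting
]
accepted, to_key = route_fields(fields)
print(accepted, to_key)
```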

Quality assurance: The agency conducts rigorous quality control of paper questionnaires to meet pre-set quality targets:

Verification of data capture: If the automated data capture technology identifies inconsistencies in the data, the responses are sent to a census employee to make verifications or corrections. Any differences between the automated capture and the employee’s entry are sent to an arbitrator, who makes the final decision to ensure high-quality capture.

Check-out: Once the paper questionnaires have been processed, they are checked out of the system. Check-out is a quality assurance process that ensures that the images and captured data are of sufficient quality for the paper questionnaires to no longer require manual keying.
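The verification step can be sketched as a comparison with an arbitration fallback. This is a simplified illustration that assumes the values being compared are the automated capture and the employee's entry; the function names are hypothetical.

```python
def verify_capture(automated, keyed, arbitrate):
    """Return the final value for a field after verification.

    If the automated value and the employee's value agree, it is accepted.
    Otherwise the arbitrator (a callable here) makes the final decision.
    """
    if automated == keyed:
        return automated
    return arbitrate(automated, keyed)


def prefer_keyed(automated, keyed):
    # Stand-in for a human arbitrator; here it simply sides with the employee.
    return keyed


agreed = verify_capture("Tremblay", "Tremblay", prefer_keyed)
resolved = verify_capture("19?4", "1984", prefer_keyed)
print(agreed, resolved)
```

In practice the arbitrator is a person looking at the scanned image, so `prefer_keyed` only marks where that decision would plug in.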

Edits

An interactive process of automated and manual edits is performed to ensure that potential problems and inconsistencies are identified and resolved as paper questionnaires are captured and online questionnaires are received. Automated completion editing involves checking for completeness and consistency.

Blank and minimum content: A questionnaire identified as having no information or not enough questions answered is returned to collection for non-response follow-up by census employees.

Coverage edits: The number of usual residents in each household (or collective dwelling) is determined and the type of collective dwelling is confirmed or reclassified.
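A blank and minimum content check might look like the following sketch. The minimum of two answered questions is an assumed value chosen for the example, since the text does not state the real rule, and the question names are hypothetical.

```python
def completion_edit(answers, min_answered=2):
    """Classify a questionnaire as 'blank', 'below_minimum' or 'pass'.

    `answers` maps question names to responses; None or whitespace-only
    responses count as unanswered.
    """
    answered = sum(
        1 for value in answers.values()
        if value is not None and str(value).strip()
    )
    if answered == 0:
        return "blank"
    if answered < min_answered:
        return "below_minimum"
    return "pass"


# Hypothetical questionnaires.
blank_form = {"name": "", "date_of_birth": None, "language": "  "}
sparse_form = {"name": "A. Roy", "date_of_birth": None, "language": ""}
full_form = {"name": "A. Roy", "date_of_birth": "1984-05-01", "language": "French"}

results = [completion_edit(f) for f in (blank_form, sparse_form, full_form)]
print(results)
```

Under this sketch, both the `blank` and `below_minimum` outcomes would be returned to collection for non-response follow-up.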

Failed edit follow-up

This processing stage identifies short-form questionnaire questions that require further coverage or content clarification. The coverage verification ensures that potential inconsistencies about who is included in a household are addressed. When necessary, operators in regional call centres contact households to ensure the appropriate people are enumerated and to obtain missing information. The data are then sent back to the DOC and reintegrated into the system for further processing (e.g., coding).

Coding

During the coding process, written responses are converted to numerical codes, using Statistics Canada reference files, code sets and standard classifications, before they are tabulated. The first stages of coding are automated. For the automated match process, reference files are built using actual responses from past censuses and are updated with new codes for the current census. Specially trained coders and experts resolve cases that cannot be coded automatically.

In 2016, over 67.8 million write-in answers were coded. Of these, approximately 87% were coded automatically.
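The automated match process can be sketched as a lookup against a reference file, with unmatched responses set aside for coders. This is a minimal illustration that assumes exact matching after simple normalization; the reference entries and codes are invented for the example and are not real classification codes.

```python
def normalize(text):
    """Lower-case a write-in response and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def code_responses(write_ins, reference):
    """Code write-ins found in the reference file; collect the rest for coders."""
    coded, unresolved = {}, []
    for text in write_ins:
        key = normalize(text)
        if key in reference:
            coded[text] = reference[key]
        else:
            unresolved.append(text)
    return coded, unresolved


# Hypothetical reference file built from past responses (codes are made up).
reference = {"registered nurse": 101, "carpenter": 202}
write_ins = ["Registered  Nurse", "carpenter", "dragon tamer"]

coded, unresolved = code_responses(write_ins, reference)
auto_rate = len(coded) / len(write_ins)
print(coded, unresolved, round(auto_rate, 2))
```

Applied to the 2016 figures above, an automatic rate of about 87% on 67.8 million write-ins corresponds to roughly 59 million responses coded without manual intervention.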

Data loading

Once the data have successfully undergone all of the processing steps at the DOC, they are loaded into the response database. Data are loaded in three phases:

Edit and imputation

The data collected in any survey or census inevitably contain omissions or inconsistencies. These errors could be the result of respondents missing a question or could be generated during processing. The final editing process detects errors, and the imputation process corrects them.

In the edit and imputation phase, invalid or missing responses are adjusted and data are corrected. Statistics Canada’s imputation methods are consistent with internationally recognized statistical standards for large-scale imputation applications, such as a census.
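To make the idea of imputation concrete, here is a minimal sketch of donor imputation, one common family of methods in which a missing value is copied from a similar complete record. The text does not say which method Statistics Canada uses, so this is an illustrative assumption; the field names and the similarity rule (number of agreeing fields) are hypothetical.

```python
def impute_from_donor(records, target_field, match_fields):
    """Fill missing `target_field` values from the most similar complete record.

    Similarity is the number of `match_fields` on which two records agree.
    Returns new dictionaries; the input records are left unchanged.
    """
    donors = [r for r in records if r.get(target_field) is not None]
    result = []
    for record in records:
        filled = dict(record)
        if record.get(target_field) is None and donors:
            best = max(
                donors,
                key=lambda d: sum(d.get(f) == record.get(f) for f in match_fields),
            )
            filled[target_field] = best[target_field]
        result.append(filled)
    return result


# Hypothetical records; record 3 is missing its number of rooms.
records = [
    {"id": 1, "tenure": "owner", "dwelling": "house", "rooms": 6},
    {"id": 2, "tenure": "renter", "dwelling": "apartment", "rooms": 3},
    {"id": 3, "tenure": "renter", "dwelling": "apartment", "rooms": None},
]
imputed = impute_from_donor(records, "rooms", ["tenure", "dwelling"])
print(imputed[2])
```

Record 3 agrees with record 2 on both matching fields, so it receives that record's value. A production system would also apply edit rules so that the imputed record stays internally consistent.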

As part of Statistics Canada’s research, the agency is exploring the greater use of administrative data in its imputation processes.

Accessing census records

Access to historical census records has been a matter of public discussion for decades and has generated considerable interest from genealogists, historians and archivists.

In 2005, following extensive engagement with Canadians, the Government of Canada amended the Statistics Act to eliminate ambiguities relating to the confidentiality of past census records, while also providing for the release of future census records.

The Statistics Act was amended to allow for the release of historical census records from 1911 to 2001. In addition, information obtained from each census from 2021 onward is to be released to Library and Archives Canada (LAC) 92 years after it was collected (e.g., census records from 2021 will be released in 2113).

For the 2006, 2011 and 2016 censuses, Canadians could choose whether their census records would be released publicly after 92 years. The person who completed the census questionnaire was asked to consult with all household members who were included in the questionnaire before answering the consent question.
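The 92-year rule and the consent question reduce to simple arithmetic, sketched below. The function names are hypothetical, and treating every census outside 2006, 2011 and 2016 as released automatically once the delay has passed is a simplifying assumption for the example.

```python
RELEASE_DELAY_YEARS = 92
CONSENT_CENSUSES = {2006, 2011, 2016}  # censuses that asked the consent question


def release_year(census_year):
    """Year in which records from `census_year` become eligible for release."""
    return census_year + RELEASE_DELAY_YEARS


def may_be_released(census_year, consented, current_year):
    """Check whether a record can be released in `current_year`."""
    if current_year < release_year(census_year):
        return False
    if census_year in CONSENT_CENSUSES:
        return consented
    # Assumption: all other census years are released automatically.
    return True


print(release_year(1921))                 # 2013
print(release_year(2001))                 # 2093
print(may_be_released(2006, True, 2098))  # True
print(may_be_released(2006, False, 2098)) # False
```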

LAC is responsible for making census records available. This is consistent with Statistics Canada’s commitment to providing open and accessible data. Researchers, historians and genealogists require this information to conduct research and help Canadians better understand their past.

Census records up to and including the 1916 Census are available either online or as microfilm copies through LAC. The 1921 Census records have also been released to the public by LAC through www.ancestry.ca.

Preserving census records

Statistics Canada, in consultation with LAC, determines the best means for preserving census records.

A microfilm copy of the census questionnaires from 1921 to 2001 is held by Statistics Canada.

The 2006, 2011 and 2016 censuses and the 2011 National Household Survey (which replaced the census long-form questionnaire in 2011) were not microfilmed. Paper questionnaires were converted to digital images, and an archival data file containing all responses (including those submitted online) was created. The original paper questionnaires were then shredded.

For 2021, in line with government security guidelines, the original paper questionnaires for the Census of Population will be shredded once data processing is completed.
