Data and Statistics: As easy as 1-2-3?
What are we talking about?
1. Data
2. Statistics
3. Statistical Literacy

A study or a survey → Variables are measured → Data → The data is analyzed → Statistics

1. Data
○ The direct result of research (a study or a survey)
○ Not display-ready
○ Needs detailed documentation
○ Needs to be processed
○ Use data if you need to do an in-depth analysis or research

2. Statistics
○ Numeric facts and figures
○ Produced when data is analyzed
○ Quantitative evidence:
  • Provide a description
  • Make a comparison
  • Identify a relationship
○ Presentation-ready
○ Usually presented as percentages, charts, or graphs
○ Needs definitions and classifications
○ Use statistics if you need quick facts to make a point

Statistics are “real” only if they are derived from data.

3. Statistical Literacy
○ Ability to use numerical information in everyday life
○ Ability to think critically about numbers and statistics used in arguments, media, advertising, reports, etc.
○ Some knowledge of the process
○ Consider the source
○ Be aware of errors or biases
○ Ability to read, interpret, and describe numbers in statements, surveys, tables, and graphs
○ Understand terms and key concepts used
○ Conventions in creating graphs and charts; violations

Key Concepts
○ Location
○ Variables
○ Unit of Observation
○ Universe
There are 6 variables in this table.

Unit of Observation
○ The entity for which data are collected and that statistics describe or summarize
○ Business surveys:
  • Company
  • Establishment
○ Social surveys:
  • Census family
  • Economic family
  • Household
  • Individuals

Unit of Observation
Unit of observation attributes: smokers, education, age, sex.

Universe
○ Universe is the group, collection, or population from which a representative sample is drawn.
○ Members of the universe have common or defining characteristics or features.
○ The universe includes all members of the unit of observation, while the sample consists of just those members from whom data are collected.
Universe
○ Statistics Canada uses “target population” to describe each survey’s universe.
[Diagram labels: Universe (Target Population); Sample (Unit of Observation)]

[Annotated table labels]
Title
Universe: Undergraduate; Full-time; Canada
Variables: Average tuition; Discipline; Academic year; Province
Statistical Metric: Dollars
Date
Footnote
Producer

Think Critically
○ Quality of the data
  • Sampling method
  • Non-response bias
  • Leading questions
○ Presentation of the statistics
  • Misleading
  • Percentage vs. absolute numbers
  • Rankings
  • Qualifiers
Callouts: Compared to what? Says who? Since when?

Being a critical user of statistics
○ Currency
  • How recent is the statistic?
  • What time period is shown?
○ Reliability
  • Is the statistic based on quality data?
  • Is the data source provided?
○ Authority
  • Who published this statistic?
  • What is the source of the data?
○ Purpose / Point of View
  • What view of the data is shown in this statistic?
  • Fact, opinion, bias?
  • Are definitions provided (geography, time, social characteristics)?

Consider the Data Source: A “real” statistic may be derived from poor-quality data.
1. Bad sampling
  • Too small
  • Sample chosen
2. Non-response bias
3. Leading questions

Bad Sampling
[Diagram labels: Universe (Target Population); Sample (Unit of Observation)]
○ Sample can be too small
  75% of UNB students come to the library 5 days per week
  Based on a sample of 4 UNB students
○ Some groups may be over- or under-represented
  Increase my sample to 200 UNB students
  Survey was given to students who were studying in the library

Non-Response Bias
○ What makes some people respond to a survey and others not?
  • May be the survey topic
  • May be characteristics of people being surveyed

Leading Questions
○ The wording of the question may cause one response to be given more often than another
  “Do you think the government should help people who are unable to find work?”
  “Do you feel you should be taxed so some people can get paid for staying home and doing nothing?”

Good Data - Misleading Statistics
○ Numbers can be chosen to make the point you want
○ Percentage vs. absolute numbers
○ Rankings
○ Qualifiers

Percentage vs. Absolute Numbers
○ Which is more impressive?
○ Which is more useful?
Company A and Company B both lay off 10% of their employees.
Company A had 200,000 employees. News report says: “Company A lays off 20,000 employees.”
Company B had 100 employees. News report says: “Company B lays off 10% of employees.”

Rankings
○ Based on comparisons rather than specific amounts.
○ Are categories clear?
○ What do we know about the actual amounts?

Heart disease is ranked the #1 cause of death in Canada.
1. Heart Disease
2. Lung Cancer
3. Colon Cancer
4. Cerebrovascular Disease
5. Breast Cancer

Cancer is ranked the #1 cause of death in Canada.
1. Cancer
2. Heart Disease
3. Cerebrovascular Disease
4. Respiratory Disease
5. Accidents

Qualifiers
○ Some words may not seem important … but they are
○ Creating categories: “land”, “predator”
“The brown bear is the largest land predator in the world.”

Statistics are about definitions
○ Statistics are dependent on the concepts they summarize.
○ Numbers represent measurements or observations based on specific definitions.
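The Company A and Company B comparison from the "Percentage vs. Absolute Numbers" slide can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function name `layoffs` is made up for the example, and the headcounts and the 10% rate are the ones quoted on the slide.

```python
def layoffs(employees, rate):
    """Return the absolute number of people laid off (hypothetical helper)."""
    return round(employees * rate)

RATE = 0.10  # both companies lay off 10% of their employees

company_a = layoffs(200_000, RATE)  # headcount from the slide
company_b = layoffs(100, RATE)      # headcount from the slide

# Same percentage, very different absolute numbers.
print(f"Company A: {RATE:.0%} of 200,000 = {company_a:,} employees")
print(f"Company B: {RATE:.0%} of 100 = {company_b:,} employees")
```

Reporting one company as an absolute number and the other as a percentage hides the fact that the rate is identical, which is why a critical reader asks for both figures.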
Statistics are about definitions
Visible Minority Groups (15), Generation Status (4), Age Groups (9) and Sex (3) for the Population 15 Years and Over of Canada, Provinces, Territories, Census Metropolitan Areas and Census Agglomerations, 2006 Census - 20% Sample Data
Geography = Canada
Age groups (9) = Total - Age groups
Sex (3) = Total - Sex

Visible minority groups (15)                    Total - Generation status   1st generation   2nd generation   3rd generation or more
Total - Population by visible minority groups                    25664220          6124565          4006420                 15533240
Total visible minority population [5]                             3922700          3273070           551740                    97890
Not a visible minority [11]                                      21741525          2851490          3454685                 15435350

○ How is visible minority defined?
○ Are aboriginals among the visible minority in Canada?

Statistics are about definitions
http://www.statcan.gc.ca/concepts/definitions/minority-minorite1-eng.htm

Statistics involve classifications
○ Some classifications are based on standards
  • Standard Geography classifications
  • North American Industry Classification System (NAICS)
  • International Classification of Diseases (ICD)
○ Others are based on convention or practice
○ Classifications involve categories
  • Sex (Total, Male, Female)
  • Age groups
○ Classifications are shaped by their definitions

Definitions and Metadata
○ Codebook
○ User’s Guide
○ Data Dictionary
○ Survey questionnaires
○ Instructions to interviewers
[Diagram label: Metadata]

Statistical Literacy: Hands-On
Geography: Canada
Unit of measure: Canadian dollars
Variables: Average tuition; Full-time students; Discipline; Province; Academic year
Rows: 16 disciplines
Columns: 5 academic years
Cells: Average tuition fees

Write out a statement comparing the number of days spent freshwater fishing versus saltwater fishing.
http://www.census.gov/prod/2008pubs/fhw06-qkfact.pdf
1. “There were 433 million freshwater fishing days and 86 million saltwater fishing days.”
2. “There were more freshwater fishing days than saltwater fishing days.”
3. “The number of freshwater fishing days is 5 times more than the number of saltwater fishing days.”

What percentage of Great Lakes anglers fished for perch?
1. 2%
2. 36% (0.5 million / 1.4 million) OR 40% (.02/.05 = .4)
○ The percentages provided in the table do not match the percentage requested in the question.
○ “36% of all Great Lakes anglers fish for perch and 2% of all anglers fish the Great Lakes for perch.”
http://www.census.gov/prod/2008pubs/fhw06-qkfact.pdf

How many bird watchers participate in that activity both around the home and away from home?
○ Bird watching around the home and bird watching away from home are not mutually exclusive activities.
○ Answer: 41.8 + 19.9 – 47.7 = 14 million
http://www.census.gov/prod/2008pubs/fhw06-qkfact.pdf

Isaacson, M. (2012). Lost: Assessing student basic survival skills in the statistical wilderness using real data [PDF document]. Retrieved from http://www.statlit.org/StatLit2012.htm

          Reference Desk   Office   Total
2009-10             5753      200    5953
2010-11             5556     1000    6556
2011-12             4123      929    5052
2012-13             2706     2008    4714
2013-14             2059     1998    4057
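The arithmetic behind the three hands-on fishing and bird-watching answers above can be re-derived as a quick check. This is a minimal sketch, not part of the original exercise: the variable names are invented for illustration, and two readings are assumptions, namely that .05 is the share of all anglers who fish the Great Lakes, and that 41.8, 19.9, and 47.7 (millions) are the around-home, away-from-home, and total bird-watcher counts in that order.

```python
# Exercise 1: freshwater vs. saltwater fishing days (millions, from the slide).
freshwater_days = 433
saltwater_days = 86
ratio = freshwater_days / saltwater_days  # roughly 5

# Exercise 2: share of Great Lakes anglers who fished for perch.
# Using counts in millions (0.5 perch anglers out of 1.4 Great Lakes anglers):
share_from_counts = 0.5 / 1.4
# Using rounded percentages of ALL anglers (assumed meaning of .02 and .05):
share_from_rounded = 0.02 / 0.05

# Exercise 3: overlapping groups, so use inclusion-exclusion:
# both = around_home + away_from_home - total
around_home = 41.8      # assumed label
away_from_home = 19.9   # assumed label
total_watchers = 47.7   # assumed label
both = around_home + away_from_home - total_watchers

print(f"Freshwater/saltwater ratio: {ratio:.2f}")
print(f"Perch share from counts: {share_from_counts:.0%}")
print(f"Perch share from rounded percentages: {share_from_rounded:.0%}")
print(f"Watch both at home and away: {both:.1f} million")
```

The two perch figures differ only because the second starts from percentages that were already rounded, which is one reason to prefer the underlying counts when they are available.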