Data and Statistics: As easy as 1-2-3?

What are we talking about?
1. Data
2. Statistics
3. Statistical Literacy
A study or a survey -> Variables are measured -> Data -> The data is analyzed -> Statistics
1. Data
• The direct result of research (a study or a survey)
• Not display-ready
• Needs detailed documentation
• Needs to be processed
• Use data if you need to do an in-depth analysis or research
2. Statistics
• Numeric facts and figures
• Produced when data is analyzed
• Quantitative evidence:
  • Provide a description
  • Make a comparison
  • Identify a relationship
• Presentation-ready
• Usually presented as percentages, charts, or graphs
• Needs definitions and classifications
• Use statistics if you need quick facts to make a point
Statistics are “real” only if they are derived from data.
3. Statistical Literacy
• Ability to use numerical information in everyday life
• Ability to think critically about numbers and statistics used in arguments, media, advertising, reports, etc.
  • Some knowledge of the process
  • Consider the source
  • Be aware of errors or biases
• Ability to read, interpret, and describe numbers in statements, surveys, tables, and graphs
  • Understand terms and key concepts used
  • Conventions in creating graphs and charts; violations
Key Concepts
• Location
• Variables
• Unit of Observation
• Universe
There are 6 variables in this table.
Unit of Observation
• The entity for which data are collected and that statistics describe or summarize
• Business surveys:
  • Company
  • Establishment
• Social surveys:
  • Census family
  • Economic family
  • Household
  • Individuals
Unit of Observation
Unit of observation attributes: Smokers, education, age, sex.
Universe
• Universe is the group, collection, or population from which a representative sample is drawn.
• Members of the universe have common or defining characteristics or features.
• The universe includes all members of the unit of observation, while the sample consists of just those members from whom data are collected.
Universe
• Statistics Canada uses “target population” to describe each survey’s universe.
Diagram labels: Universe (Target Population); Sample (Unit of Observation)
Title
Universe: Undergraduate; Full-time; Canada
Variables: Average tuition; Discipline; Academic year; Province
Statistical Metric: Dollars
Date
Footnote
Producer
Think Critically
• Quality of the data
  • Sampling method
  • Non-response bias
  • Leading questions
• Presentation of the statistics
  • Misleading
  • Percentage vs. absolute numbers
  • Rankings
  • Qualifiers
Questions to ask: Compared to what? Says who? Since when?
Being a critical user of statistics
• Currency
  • How recent is the statistic?
  • What time period is shown?
• Reliability
  • Is the statistic based on quality data?
  • Is the data source provided?
• Authority
  • Who published this statistic?
  • What is the source of the data?
• Purpose / Point of View
  • What view of the data is shown in this statistic?
  • Fact, opinion, bias?
  • Are definitions provided (geography, time, social characteristics)?
Consider the Data Source:
A “real” statistic may be derived from poor quality data.
1. Bad sampling
  • Too small
  • Sample chosen
2. Non-response bias
3. Leading questions
Image source: http://www.cardstock.com/
Bad Sampling
Diagram labels: Universe (Target Population); Sample (Unit of Observation)
• Sample can be too small
  “75% of UNB students come to the library 5 days per week”
  Based on a sample of 4 UNB students
• Some groups may be over or under represented
  Increase my sample to 200 UNB students
  Survey was given to students who were studying in the library
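The effect of sample size on the "75% of UNB students" claim can be sketched numerically. The snippet below is a rough illustration, not part of the original slides: it assumes a simple random sample and uses the textbook normal-approximation margin of error for a proportion, which is only a crude guide at n = 4. The function name `proportion_margin_of_error` is hypothetical.

```python
import math


def proportion_margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion.

    Assumes a simple random sample and the normal approximation,
    which is unreliable for very small n (such as n = 4).
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


p_hat = 0.75  # the slide's claim: 75% come to the library 5 days per week

for n in (4, 200):
    moe = proportion_margin_of_error(p_hat, n)
    print(f"n = {n:>3}: 75% plus or minus {moe * 100:.1f} percentage points")
```

With 4 students the uncertainty is roughly 42 percentage points, and with 200 students it drops to about 6. A larger sample does not repair the second problem on the slide, though: if every respondent was found studying in the library, library users are over-represented no matter how many are surveyed.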
Non-Response Bias
• What makes some people respond to a survey and others not?
  • May be the survey topic
  • May be characteristics of people being surveyed
Leading Questions
• The wording of the question may cause one response to be given more often than another
“Do you think the government should help people who are unable to find work?”
“Do you feel you should be taxed so some people can get paid for staying home and doing nothing?”
Good Data - Misleading Statistics
• Numbers can be chosen to make the point you want
  • Percentage vs. Absolute numbers
  • Rankings
  • Qualifiers
Percentage vs. Absolute Numbers
• Which is more impressive?
• Which is more useful?
Company A and Company B both lay off 10% of their employees.
Company A had 200,000 employees. News report says: “Company A lays off 20,000 employees.”
Company B had 100 employees. News report says: “Company B lays off 10% of employees.”
Image source: http://www.dilbert.com/
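The two news reports describe the same 10% cut in different units. A minimal sketch of the arithmetic, using only the figures on the slide (the helper name `layoffs` is made up for illustration):

```python
def layoffs(employees: int, percent: float) -> int:
    """Number of people laid off when `percent` of `employees` lose their jobs."""
    return round(employees * percent / 100)


# Company sizes as given on the slide.
companies = {"Company A": 200_000, "Company B": 100}

for name, size in companies.items():
    count = layoffs(size, 10)
    print(f"{name}: 10% of {size:,} employees = {count:,} people")
```

Reporting Company A as an absolute number and Company B as a percentage hides the fact that the rate is identical, while the head counts differ by a factor of 2,000.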
Rankings
• Based on comparisons rather than specific amounts.
• Are categories clear?
• What do we know about the actual amounts?
Heart disease is ranked the #1 cause of death in Canada.
1. Heart Disease
2. Lung Cancer
3. Colon Cancer
4. Cerebrovascular Disease
5. Breast Cancer
Cancer is ranked the #1 cause of death in Canada.
1. Cancer
2. Heart Disease
3. Cerebrovascular Disease
4. Respiratory Disease
5. Accidents
Qualifiers
• Some words may not seem important … but they are
• Creating categories: “land”, “predator”
“The brown bear is the largest land predator in the world.”
Statistics are about definitions
• Statistics are dependent on the concepts they summarize.
• Numbers represent measurements or observations based on specific definitions.
Statistics are about definitions
Visible Minority Groups (15), Generation Status (4), Age Groups (9) and Sex (3) for the Population 15 Years and Over of Canada, Provinces, Territories, Census Metropolitan Areas and Census Agglomerations, 2006 Census - 20% Sample Data
Geography = Canada
Age groups (9) = Total - Age groups
Sex (3) = Total - Sex

Visible minority groups (15)                    Total - Generation status   1st generation   2nd generation   3rd generation or more
Total - Population by visible minority groups   25664220                    6124565          4006420          15533240
Total visible minority population [5]           3922700                     3273070          551740           97890
Not a visible minority [11]                     21741525                    2851490          3454685          15435350

• How is visible minority defined?
• Are aboriginals among the visible minority in Canada?
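Before interpreting a table like the one above, it helps to confirm how its categories fit together. The sketch below is an illustration only: it assumes that each row lists the total followed by the three generation-status counts, and it checks whether the parts add up to the stated total. The variable names are hypothetical.

```python
# Counts copied from the slide: (total, 1st generation, 2nd generation,
# 3rd generation or more). The row/column layout is an assumption based
# on the order in which the numbers appear.
rows = {
    "Total - Population by visible minority groups": (25_664_220, 6_124_565, 4_006_420, 15_533_240),
    "Total visible minority population": (3_922_700, 3_273_070, 551_740, 97_890),
    "Not a visible minority": (21_741_525, 2_851_490, 3_454_685, 15_435_350),
}

discrepancies = {}
for label, (total, *generations) in rows.items():
    # Difference between the published total and the sum of its parts.
    discrepancies[label] = total - sum(generations)
    print(f"{label}: total minus parts = {discrepancies[label]}")

# Share of the visible minority population that is first generation.
vm_total, vm_first, *_ = rows["Total visible minority population"]
first_generation_share = vm_first / vm_total
print(f"First-generation share of visible minority population: {first_generation_share:.1%}")
```

Two rows add up exactly and the overall total is off by 5, which would be consistent with rounding of published counts (an assumption, since the slide does not say). The share calculation only means something once "visible minority" and "1st generation" have been looked up in the definitions.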
Statistics are about definitions
http://www.statcan.gc.ca/concepts/definitions/minority-minorite1-eng.htm
Statistics involve classifications
• Some classifications are based on standards
  • Standard Geography classifications
  • North American Industry Classification System (NAICS)
  • International Classification of Diseases (ICD)
• Others are based on convention or practice
• Classifications involve categories
  • Sex (Total, Male, Female)
  • Age groups
• Classifications are shaped by their definitions
Definitions and Metadata
• Codebook
• User’s Guide
• Data Dictionary
• Survey questionnaires
• Instructions to interviewers
Metadata
Statistical Literacy: Hands-On
Geography: Canada
Unit of measure: Canadian dollars
Variables: Average tuition; Full-time students; Discipline; Province; Academic year
Rows: 16 disciplines
Columns: 5 academic years
Cells: Average tuition fees
Write out a statement comparing the number of days spent freshwater fishing versus saltwater fishing.
http://www.census.gov/prod/2008pubs/fhw06-qkfact.pdf
1. “There were 433 million freshwater fishing days and 86 million saltwater fishing days.”
2. “There were more freshwater fishing days than saltwater fishing days.”
3. “The number of freshwater fishing days is about 5 times the number of saltwater fishing days.”
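The three statements differ in how much arithmetic they do with the same two figures. A small sketch of the comparison, using only the numbers quoted on the slide (variable names are illustrative):

```python
# Fishing days in millions, as quoted in the first statement.
freshwater_days = 433
saltwater_days = 86

difference = freshwater_days - saltwater_days  # how many more days
ratio = freshwater_days / saltwater_days       # how many times as many

print(f"Difference: {difference} million more freshwater fishing days")
print(f"Ratio: freshwater days are about {ratio:.1f} times saltwater days")
```

The ratio is what supports a "5 times" statement. Phrasing it as "5 times as many" avoids the ambiguity of "5 times more than", which some readers take to mean six times as many.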
What percentage of Great Lakes anglers fished for perch?
1. 2%
2. 36% OR 40%: (0.5 million/1.4 million) OR (.02/.05 = .4)
• The percentages provided in the table do not match the percentage requested in the question.
• “36% of all Great Lakes anglers fish for perch and 2% of all anglers fish the Great Lakes for perch.”
http://www.census.gov/prod/2008pubs/fhw06-qkfact.pdf
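The gap between 2%, 36%, and 40% comes from which denominator is used and how much rounding has already happened. The sketch below reproduces the slide's two calculations. It assumes, as the slide's arithmetic suggests, that 0.5 million and 1.4 million are counts of anglers and that .02 and .05 are the table's rounded shares of all anglers; variable names are illustrative.

```python
# Counts in millions, taken from the slide's first calculation.
perch_anglers = 0.5          # Great Lakes anglers who fished for perch
great_lakes_anglers = 1.4    # all Great Lakes anglers

# Rounded shares of ALL anglers, taken from the slide's second calculation.
perch_share_of_all = 0.02
great_lakes_share_of_all = 0.05

share_from_counts = perch_anglers / great_lakes_anglers
share_from_percents = perch_share_of_all / great_lakes_share_of_all

print(f"From counts:              {share_from_counts:.0%}")
print(f"From rounded percentages: {share_from_percents:.0%}")
```

Both results answer the question asked (share of Great Lakes anglers), while the 2% printed in the table answers a different question (share of all anglers). Dividing already-rounded percentages loses precision, which is why the two routes disagree.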
How many bird watchers participate in that activity both around the home and away from home?
• Bird watching around the home and bird watching away from home are not mutually exclusive activities.
• Answer: 41.8 + 19.9 – 47.7 = 14 million
http://www.census.gov/prod/2008pubs/fhw06-qkfact.pdf
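The answer above is an instance of the inclusion-exclusion rule for two overlapping groups. In the sketch below, the assignment of 41.8 million to "around the home", 19.9 million to "away from home", and 47.7 million to all bird watchers is read off the order of the terms on the slide and should be treated as an assumption; the function name `overlap` is hypothetical.

```python
def overlap(group_a: float, group_b: float, either: float) -> float:
    """Size of the intersection of two groups, given each group's size
    and the number of members who belong to at least one of them."""
    return group_a + group_b - either


around_home = 41.8     # millions (assumed label)
away_from_home = 19.9  # millions (assumed label)
all_birders = 47.7     # millions who do at least one of the two

both = overlap(around_home, away_from_home, all_birders)
print(f"Bird watchers who do both: {both:.1f} million")
```

Adding the two groups directly would count the people who do both twice, which is why the total number of bird watchers has to be subtracted.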
Isaacson, M. (2012). Lost: Assessing student basic survival skills in the statistical wilderness using real data [PDF document]. Retrieved from http://www.statlit.org/StatLit2012.htm
Reference
Year      Desk   Office   Total
2009-10   5753   200      5953
2010-11   5556   1000     6556
2011-12   4123   929      5052
2012-13   2706   2008     4714
2013-14   2059   1998     4057
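The yearly counts above can be read with the same percentage-versus-absolute-numbers habit discussed earlier. The sketch below is only an illustration: the slide does not say what is being counted, and the column names Desk and Office are assumed from the order of the labels.

```python
# (year, desk, office, total) as listed on the slide; column meaning is assumed.
rows = [
    ("2009-10", 5753, 200, 5953),
    ("2010-11", 5556, 1000, 6556),
    ("2011-12", 4123, 929, 5052),
    ("2012-13", 2706, 2008, 4714),
    ("2013-14", 2059, 1998, 4057),
]

# Check that each published total equals desk + office.
totals_consistent = all(desk + office == total for _, desk, office, total in rows)
print(f"Totals consistent: {totals_consistent}")

# Office share of the total for each year.
office_shares = {year: office / total for year, _, office, total in rows}
for year, share in office_shares.items():
    print(f"{year}: office share = {share:.1%}")

# Relative change in the total between the first and last year.
first_total, last_total = rows[0][3], rows[-1][3]
total_change = (last_total - first_total) / first_total
print(f"Change in total, 2009-10 to 2013-14: {total_change:.1%}")
```

The absolute totals fall by about a third over the period, while the office share rises from about 3% to nearly half. Either figure alone would tell only part of the story.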