Making Inferences and Justifying Conclusions: Evaluating Models with Data and Simulation (CCSS.S-IC.2)
Common Core High School Statistics And Probability · Learn by Concept
Help Questions
A city wants to estimate the average daily screen time among teenagers. Two samples are considered:
- Sample A: Teens who volunteer at a tech club meeting at the library.
- Sample B: Teens randomly selected from a city list and contacted by mail and follow-up phone calls.
Why might Sample A be biased?
Because it takes more time to collect than Sample B
Because it uses a large number of participants
Because the teens are chosen randomly from across the city
Because it includes only teens at a tech club, who may have different screen habits than other teens
Explanation
Sample A focuses on tech club attendees, a subgroup likely to have different screen habits, so it is not representative of all teens. Random selection from the full population is more representative.
A district wants to estimate the average number of hours students spend on homework per night. Two approaches are discussed:
- Approach 1: Post an online poll on the homework help website and use the responses.
- Approach 2: Randomly select 300 students from the district enrollment list and contact them directly to respond.
Why might Approach 1 be biased?
Because the poll could reach more students quickly
Because the poll is on the internet
Because only students who visit and choose to respond to the homework site are included, and they may differ from other students
Because random selection always takes longer than an online poll
Explanation
Approach 1 is a voluntary response sample from a specific website audience, which can systematically differ from the general student population, reducing representativeness.
A school wants to estimate the percent of students who regularly eat breakfast on school days. Two methods are proposed:
- Method 1: Stand by the cafeteria door during the first breakfast period and survey the first 80 students who walk in.
- Method 2: Use a random number generator to select 80 student ID numbers from the entire student roster and survey those students.
Which method is most representative?
Method 1: survey the first 80 students who walk into the cafeteria during the first breakfast period
Method 2: randomly select 80 student IDs from the entire roster and survey those students
Ask teachers to nominate students they think skip breakfast
Post a sign-up sheet for volunteers to be surveyed
Explanation
Method 2 uses a random sample drawn from the whole student body, reducing bias and increasing representativeness. The others are convenience or volunteer samples that overrepresent certain groups.
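As a rough illustration of how Method 2 could be carried out, the sketch below draws 80 distinct IDs from a roster with Python's standard `random` module. The roster (1,200 students with IDs 1000 to 2199), the function name, and the seed are hypothetical choices made only for this example.

```python
import random

def draw_simple_random_sample(roster_ids, sample_size, seed=None):
    """Return sample_size distinct IDs, each roster ID equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(list(roster_ids), sample_size)

# Hypothetical roster: the real enrollment list would be used instead.
roster = range(1000, 2200)
sample_ids = draw_simple_random_sample(roster, 80, seed=1)
print(len(sample_ids), "students selected;", len(set(sample_ids)), "are distinct")
```

Because every ID on the roster has the same chance of selection, no group (early arrivals, volunteers, teacher nominees) is favored by the selection step itself.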
A PE teacher wants to estimate the proportion of students who can run a mile under 8 minutes. Two methods are considered:
- Method 1: Time whoever attends after-school track practice today.
- Method 2: Randomly select 100 students from the full school roster and time their mile during PE classes this week.
Which method is most representative?
Method 1: time students at after-school track practice
Method 2: randomly select 100 students from the roster and time them during PE classes
Ask for volunteers during morning announcements
Time students who are already wearing running shoes at lunch
Explanation
Randomly sampling from the entire student body during regular PE reaches a broad cross-section, making it representative. Track practice, volunteers, or convenience samples overrepresent more athletic or motivated students.
A school wants to estimate the proportion of students who bike to school. Two methods are proposed:
- Method 1: Survey every 10th student entering the main door across the whole day.
- Method 2: Survey only the members of the after-school cycling club.
Which method is most representative of all students?
Method 2, because club members know the most about biking.
Both methods are equally random.
Neither, because you must survey everyone to avoid bias.
Method 1, because it samples across the whole student body and times of day.
Explanation
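Method 1 is a systematic sample taken across the whole day, so it reaches a broad cross-section of students and reduces selection bias. Method 2 surveys only cycling club members, who are far more likely to bike to school, so it overrepresents bikers and is not representative.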
The student council wants to know what fraction of students support adding a second lunch period. Two methods are proposed:
- Method X: Survey students sitting in the cafeteria during lunch today.
- Method Y: Draw a simple random sample of 150 students from the school roster and survey them throughout the day.
Which method is most representative?
Method Y: randomly sample 150 students from the roster and survey them throughout the day
Method X: survey students sitting in the cafeteria today
Choose classes that volunteer to participate
Ask the principal to handpick students from each grade
Explanation
A simple random sample from the full roster gives all students an equal chance to be chosen, making it more representative. Cafeteria-only, volunteer, or handpicked samples introduce selection bias.
A model says a coin has $p=0.5$ for heads. In 5 flips, you observe T, T, T, T, T. Would this result alone cause you to question the model?
No; a run of 5 tails can happen by chance under a fair model.
Yes; a fair coin cannot produce five tails in a row.
Yes; a fair coin must give about half heads in any 5 flips.
Yes; runs only occur when a coin is biased.
Explanation
The probability is $(1/2)^5=1/32\approx 3.1\%$: rare but possible. A single short run is not strong evidence against $p=0.5$.
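The arithmetic in this explanation can be checked exactly with Python's `fractions` module. This is a minimal sketch; the function name `prob_all_tails` is an illustrative choice, not part of the original problem.

```python
from fractions import Fraction

def prob_all_tails(n_flips, p_heads=Fraction(1, 2)):
    """Probability that every one of n_flips independent flips lands tails."""
    return (1 - p_heads) ** n_flips

p = prob_all_tails(5)
print(p, float(p))  # 1/32 0.03125
```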
A model says $30\%$ of incoming emails are spam ($p=0.30$). In a random sample of 50 emails, 18 are spam. Is the model consistent with this result?
No; this is impossible under the model.
Yes; sample proportions vary around $0.30$, and $18/50=0.36$ is plausible.
No; the sample must have exactly 15 spam to match the model.
Yes; any sample proportion above $0.30$ proves the model.
Explanation
Under $p=0.30$ with $n=50$, $\text{E}=15$ and $\text{SD}\approx\sqrt{50\cdot0.3\cdot0.7}\approx3.2$. Getting 18 is within about 1 SD of the expected count ($z=(18-15)/3.2\approx0.9$), so this is consistent with the model.
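The same expected-count and SD calculation can be written as a small helper. This is a sketch under the binomial model stated in the question; the name `binomial_z` is hypothetical.

```python
import math

def binomial_z(observed, n, p):
    """Return (expected count, SD, z-score) for an observed count under Binomial(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd, (observed - mean) / sd

mean, sd, z = binomial_z(18, 50, 0.30)
print(f"{mean:.1f} {sd:.2f} {z:.2f}")  # 15.0 3.24 0.93
```

A z-score below 1 in absolute value means the observed count sits well inside the range of ordinary sampling variation.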
A die is claimed to be fair. In 60 rolls, the face "6" appears 4 times. Is this consistent with a fair die?
Yes; $4/60$ is exactly what a fair die predicts.
Yes; long stretches without sixes are guaranteed for a fair die.
No; a fair die must show each face exactly 10 times in 60 rolls.
No; a fair die would produce about $10$ sixes with $\text{SD}\approx\sqrt{60\cdot\tfrac{1}{6}\cdot\tfrac{5}{6}}\approx2.9$. Getting only 4 is more than 2 SDs below the mean and is unusually low, so I would question fairness.
Explanation
Expected sixes $=60\cdot\tfrac{1}{6}=10$ with $\text{SD}\approx2.9$; observing 4 is about 2.1 SDs below expectation, which is uncommon and suggests possible bias.
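One way to see how uncommon this result is under a fair die is to simulate many sets of 60 rolls and count how often 4 or fewer sixes appear. The sketch below uses only the standard library; the trial count, seed, and function name are arbitrary choices for illustration, and the estimate will vary slightly with them.

```python
import random

def simulate_low_six_rate(n_rolls=60, threshold=4, trials=20_000, seed=0):
    """Estimate P(number of sixes <= threshold) in n_rolls of a fair die by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Count the sixes in one simulated set of n_rolls fair rolls.
        sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
        if sixes <= threshold:
            hits += 1
    return hits / trials

estimate = simulate_low_six_rate()
print(f"Estimated share of fair-die runs with 4 or fewer sixes: {estimate:.3f}")
```

If only a small share of simulated fair-die runs produce a count this low, the observed data are hard to reconcile with the fair model, which is the reasoning the explanation relies on.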
A spinner model says red occurs with probability $p=0.25$ each spin. In 80 spins, red appears 30 times. Does this result make you question the model?
Yes; any outcome is equally likely in 80 spins.
Yes; because 30 is exactly the expected count.
Yes; $30$ reds is much higher than the expected $20$ and is unusually large under $p=0.25$, so I would question the model.
No; red can never appear more than 25 times under this model.
Explanation
Under $p=0.25$ with $n=80$, $\text{E}=20$ and $\text{SD}\approx\sqrt{80\cdot0.25\cdot0.75}\approx3.9$. Getting 30 is about 2.6 SDs above the mean, which is unusual enough to question the model.
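As a complement to the SD rule of thumb, the exact chance of seeing 30 or more reds under the model can be summed directly from the binomial formula. This is a minimal sketch using `math.comb` from the standard library; the function name `binomial_upper_tail` is an illustrative assumption.

```python
import math

def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) when X follows Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )

tail = binomial_upper_tail(30, 80, 0.25)
print(f"P(at least 30 reds in 80 spins | p = 0.25) = {tail:.4f}")
```

The resulting probability is small (well under 5%), which agrees with the conclusion that 30 reds is an unusual outcome if $p=0.25$ is correct.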