Data - AP Statistics

Question

You are conducting the following series of observational studies to determine if generally understood parameters about a local population are accurate:

A. You stand on the sidewalk next to a stop sign to observe and record the number of cyclists and drivers who stop or don't stop.

B. You inform customers at a local fast food restaurant of your study to determine the proportion of healthy food choices from the menu and then observe their choices.

C. From your car in the mall parking lot, you observe the proportion of shoppers who hold the door open for the next person who enters or exits the mall.

D. To estimate the proportion of males and females in the local community, you observe spectators at a hockey match and count and record the gender in a randomly selected area of the crowd.

E. You review the daily sign-in sheets at a local fitness center to estimate the average number of workouts per week for residents of the local community.

Which observational study should yield the most accurate results?

Answer

Study C takes a representative sample of local residents, and it does so without affecting the outcomes.

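Studies A and B are influenced by the person conducting the study: a visible observer at the stop sign, or customers who know they are being studied, may change their behavior. Studies D and E draw biased samples: hockey spectators and fitness center members are not representative of the local community.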

Therefore, C should yield the most accurate results.

Question

What situation would most warrant the method of observational study to produce accurate results?

Answer

An observational study does not involve any interference with the group being studied, so watching baboons in their natural habitat would be best suited to this method.

Question

On a residual plot, the x-axis displays the ______ and the y-axis displays the ______.

Answer

A residual plot shows the difference between the actual and expected value, or residual. This goes on the y-axis. The plot shows these residuals in relation to the independent variable.
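
As a rough illustration of the definition above, the sketch below computes residuals (observed minus predicted) and pairs each one with its x value, which gives the points a residual plot would display. The data, the fitted line $y = 1 + 2x$, and the function name `residuals` are all hypothetical and chosen only for illustration.

```python
def residuals(xs, ys, slope, intercept):
    """Return observed y minus predicted y for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data and a hypothetical fitted line y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3.5, 4.5, 7.5, 8.5]

res = residuals(xs, ys, slope=2.0, intercept=1.0)

# Each (x, residual) pair is one point on the residual plot:
# x (the independent variable) on the horizontal axis,
# the residual on the vertical axis.
plot_points = list(zip(xs, res))
print(plot_points)
```

Residuals that scatter around zero with no visible pattern suggest that a linear model is reasonable for the data.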

Question

What transformation should be done to the data set, with its residual shown below, to linearize the data?

Answer

When the residual plot shows a nonrandom pattern, taking the log of the data is often effective in increasing the correlation coefficient and producing a more linear relationship.
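
The effect of a log transformation can be sketched with a small example. The data below are hypothetical and exactly exponential, which is the case where taking the log of the response helps most; other curved patterns may call for a different transformation. The helper `pearson_r` is written out by hand here and is not taken from the original card.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical exponential data: y = 2^x.
xs = [1, 2, 3, 4, 5, 6]
ys = [2 ** x for x in xs]

r_raw = pearson_r(xs, ys)                         # curved relationship
r_log = pearson_r(xs, [math.log(y) for y in ys])  # after taking log(y)

print(round(r_raw, 3), round(r_log, 3))
```

For these values the correlation of the raw data falls short of 1, while the correlation after the transformation is essentially 1, because $\log(2^x)$ is linear in $x$.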

Question

Which of the following correlation coefficients indicates the strongest relationship between variables?

Answer

Correlation coefficients range from $-1$ to $1$. The closer the coefficient is to either extreme, the stronger the linear relationship. The closer it is to $0$, the weaker the relationship.
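
One way to apply this rule is to compare the absolute values of the candidate coefficients. The answer choices for this card are not available, so the list below is purely hypothetical.

```python
# Hypothetical answer choices (not the ones from the original card).
candidates = [0.35, -0.82, 0.6, -0.1]

# Strength depends only on distance from 0, so compare absolute values.
strongest = max(candidates, key=abs)
weakest = min(candidates, key=abs)

print(strongest, weakest)
```

Here the strongest relationship is the one with the negative coefficient, since its absolute value is the largest in the list.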

Question

A national study on cell phone use found the following correlations:

-The correlation between the number of texts sent each day and a person's average credit card debt is .

-The correlation between the number of texts sent each day and the number of books read each month is .

Which of the following statements are true?

i. As the number of texts sent each day increases, average credit card debt increases.

ii. Sending more texts causes people to read less.

iii. A person's average credit card debt is related more strongly to the number of texts sent each day than the number of books read each month is related to the number of texts sent each day.

Answer

i is correct because there is a positive correlation between the number of texts sent each day and average credit card debt.

ii is incorrect because the statement uses the word "cause." Correlation does not mean causation. There is a relationship between the number of texts sent each day and the number of books that a person reads each month. However, the correlation alone does not show that sending more texts causes a person to read fewer books.

iii is correct because the absolute values of the correlations indicate which correlation is stronger. is a stronger correlation than .

Question

Which of the following shows the least correlation between two variables?

Answer

The strength of a correlation is measured on an absolute value scale of $0$ to $1$, with $0$ being the least correlated and $1$ being the most correlated. The sign of the correlation coefficient simply indicates whether the correlation between the variables is positive or negative.

A correlation of $0$ means that there is no linear correlation at all between the two variables.

Question

In a medical school, it is found that there is a correlation of between the amount of coffee consumed by students and the number of hours students sleep each night. Which of the following is true?

i. There is a positive association between the two variables.

ii. There is a strong correlation between the two variables.

iii. Coffee consumption in medical school students causes students to sleep less each night.

Answer

Since the correlation is negative, there must be a negative association between the two variables (therefore statement i is incorrect). Statement ii is correct since a correlation of to on an absolute value scale of to is considered to be a strong correlation. Statement iii is incorrect since correlation does not mean causation.

Question

It is found that there is a correlation of exactly between two variables. Which of the following is incorrect?

Answer

Correlation alone never establishes causation, no matter how strong the correlation between two variables is. In this case, all other answer choices are correct.

Question

Which of the following correlation coefficients implies the strongest relationship between variables:

Answer

A high correlation coefficient regardless of sign implies a stronger relationship. In this case has a stronger negative relationship than the positive relationship described by a value of

Question

A basketball coach wants to determine if a player's height can predict the number of points the player scores in a season. Which statistical test should the coach conduct?

Answer

Linear regression is the best option for determining whether the value of one variable predicts the value of a second variable. Since that is exactly what the coach is trying to do, he should use linear regression.
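
A minimal sketch of what the coach's analysis might look like is shown below. The heights and season point totals are hypothetical, and they are chosen to be exactly linear so the arithmetic is easy to check by hand; real data would scatter around the fitted line. The function name `fit_line` is an assumption made for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: player height (cm) and points scored in a season.
heights_cm = [180, 185, 190, 195, 200]
points = [300, 340, 380, 420, 460]

slope, intercept = fit_line(heights_cm, points)

# Predicted season points for a hypothetical 192 cm player.
predicted = intercept + slope * 192
print(slope, intercept, predicted)
```

Height is the explanatory variable and points scored is the response, so the fitted slope describes the predicted change in points for each additional centimeter of height.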

Question

Use the following five number summary to determine if there are any outliers in the data set:

Minimum:

Q1:

Median:

Q3:

Maximum:

Answer

An observation is an outlier if it falls more than $1.5 \cdot IQR$ above the upper quartile or more than $1.5 \cdot IQR$ below the lower quartile.

. The minimum value is so there are no outliers in the low end of the distribution.

. The maximum value is so there are no outliers in the high end of the distribution.

Question

For a data set, the first quartile is , the third quartile is and the median is .

Based on this information, a new observation can be considered an outlier if it is greater than what?

Answer

Use the $1.5 \cdot IQR$ criterion:

This states that anything less than $Q_1 - 1.5 \cdot IQR$ or greater than $Q_3 + 1.5 \cdot IQR$ will be an outlier.

Thus, we want to find

$Q_3 + 1.5 \cdot IQR$, where $IQR = Q_3 - Q_1$.

$1.5 \cdot (70 - 40) =1.5\cdot 30=45$

$Q_3 + (1.5\cdot IQR) = 70 + 45 = 115$

Therefore, any new observation greater than 115 can be considered an outlier.
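
The same arithmetic can be checked with a short sketch. The quartile values 40 and 70 are taken from the worked calculation above; the function name `outlier_fences` and the parameter `k` are assumptions made for this example.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs from the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles used in the worked calculation above.
lower, upper = outlier_fences(40, 70)
print(lower, upper)
```

Any observation above the upper cutoff, or below the lower cutoff, would be flagged as an outlier under this rule.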

Question

Which values in the above data set are outliers?

Answer

Step 1: Recall the definition of an outlier as any value in a data set that is greater than $Q_3 + 1.5 \cdot IQR$ or less than $Q_1 - 1.5 \cdot IQR$.

Step 2: Calculate the IQR, which is the third quartile minus the first quartile, or $Q_3 - Q_1$. To find $Q_1$ and $Q_3$, first write the data in ascending order.

. Then, find the median, which is . Next, Find the median of data below , which is . Do the same for the data above to get . By finding the medians of the lower and upper halves of the data, you are able to find the value, that is greater than 25% of the data and , the value greater than 75% of the data.

Step 3: . No values less than 64.

. In the data set, 105 > 104, so it is an outlier.
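
The three steps can be sketched in code. The original data set is not available here, so the list below is hypothetical: it was constructed only so that the cutoffs come out to 64 and 104, matching the values in the answer. The quartiles are found as the medians of the lower and upper halves, which is one common convention; other software may use a slightly different quartile definition.

```python
from statistics import median


def quartiles(data):
    """Q1 and Q3 as the medians of the lower and upper halves."""
    s = sorted(data)
    half = len(s) // 2
    lower_half = s[:half]
    # For an odd count, the overall median is left out of both halves.
    upper_half = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return median(lower_half), median(upper_half)


def find_outliers(data, k=1.5):
    """Values beyond k * IQR below Q1 or above Q3."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]


# Hypothetical data set (not the one from the original card).
data = [88, 70, 105, 79, 84, 90, 78, 85, 80, 89]

print(quartiles(data))      # Q1 and Q3
print(find_outliers(data))  # values outside the cutoffs
```

With this data the quartiles are 79 and 89, so the IQR is 10 and the cutoffs are 64 and 104, which leaves 105 as the only outlier.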

Question

You are given the following information regarding a particular data set:

Q1:

Q3:

Assume that the numbers and are in the data set. How many of these numbers are outliers?

Answer

In order to find the outliers, we can use the $Q_1 - 1.5 \cdot IQR$ and $Q_3 + 1.5 \cdot IQR$ formulas.

Only two numbers are outside of the calculated range and therefore are outliers: and .

Question

Use the following five number summary to answer the question below:

Min:

Q1:

Med:

Q3:

Max:

Which of the following is true regarding outliers?

Answer

Using the $Q_1 - 1.5 \cdot IQR$ and $Q_3 + 1.5 \cdot IQR$ formulas, we can determine that both the minimum and maximum values of the data set are outliers.

This allows us to determine that there is at least one outlier in the upper side of the data set and at least one outlier in the lower side of the data set. Without any more information, we are not able to determine the exact number of outliers in the entire data set.

Question

A certain distribution has a 1st quartile of 8 and a 3rd quartile of 16. Which of the following data points would be considered an outlier?

Answer

An outlier is any data point that falls more than $1.5\ast IQR$ above the 3rd quartile or more than $1.5\ast IQR$ below the 1st quartile. The interquartile range is $16-8=8$ and $1.5\ast8=12$. The lower bound would be $8-12=-4$ and the upper bound would be $16+12=28$. The only possible answer outside of this range is .

Question

In a regression analysis, the y-variable should be the ______ variable, and the x-variable should be the ______ variable.

Answer

Regression tests seek to determine one variable's ability to predict another variable. In this analysis, one variable is dependent (the one predicted), and the other is independent (the variable that predicts). Therefore, the dependent variable is the y-variable and the independent variable is the x-variable.

Question

If a data set has a perfect negative linear correlation, has a slope of $-7$ and an explanatory variable standard deviation of $2$, what is the standard deviation of the response variable?

Answer

The key here is to utilize

$b_{1} = r\frac{s_{y}}{s_{x}}$ .

"Perfect negative linear correlation" means $r = -1$, while the rest of the problem indicates $b_{1} = -7$ and $s_{x} = 2$ . This enables us to solve for $s_{y}$ .

$b_{1} = r\frac{s_{y}}{s_{x}}$
$\frac{b_{1}}{r} = \frac{s_{y}}{s_{x}}$
$s_{y} = \frac{b_{1}}{r}{s_{x}} = \frac{-7}{-1}2 = 7\cdot 2 = 14$
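
The rearranged formula can be checked numerically with a short sketch. The function name `response_sd` is an assumption made for this example; the inputs are the values from the problem.

```python
def response_sd(slope, r, s_x):
    """Solve b1 = r * s_y / s_x for s_y."""
    if r == 0:
        raise ValueError("r must be nonzero to solve for s_y")
    return slope / r * s_x


# Values from the problem: b1 = -7, r = -1, s_x = 2.
s_y = response_sd(slope=-7, r=-1, s_x=2)
print(s_y)
```

The two negative signs cancel, which is consistent with a standard deviation never being negative.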
