



This document consists of a midterm exam from an introduction to statistics course, containing multiple choice questions on various statistical concepts such as creating histograms in Excel, parameters, robust statistics, Pareto charts, and analyzing categorical variables. The exam also includes a data analysis section where students are asked to calculate sample statistics, identify variable types, and consider potential biases in observational survey data.
Typology: Exams
Multiple Choice. Select one answer for each question. (4 points each)
1. Which of the following steps will allow you to create a histogram in Excel?
   a) Tools → Data Analysis → Histogram.
   b) Chart Wizard → Histogram.
   c) Either a or b will work.
   d) None of the above.

2. An example of a parameter is:
   a) a residual.
   b) the coefficient of variation.
   c) the sample mean.
   d) the population mean.

3. Which of the following is the most robust to outliers?
   a) the range.
   b) the interquartile range (IQR).
   c) the sample mean.
   d) the standard deviation.

4. A Pareto chart is:
   a) a bar chart that lists categories with the highest frequencies first.
   b) a pie chart that includes slices only for the 3 largest categories plus a fourth slice for "all others."
   c) a histogram skewed to the right.
   d) a more formal name for a dot plot.

5. To analyze the association between two categorical variables, most analysts:
   a) run a regression.
   b) create a contingency table.
   c) calculate the correlation.
   d) simply compare sample means and standard deviations.
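
As a quick numeric illustration of the idea behind question 3, the sketch below compares how four summary statistics react when one value in a small data set is replaced by an outlier. The data values and the helper name `spread_summary` are invented for illustration and are not part of the exam; the quartile convention is the default of Python's `statistics.quantiles`, which may differ slightly from Excel's.

```python
import statistics

def spread_summary(values):
    """Return the range, IQR, mean, and sample standard deviation of values."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }

# Hypothetical data: identical except that the largest value becomes an outlier.
base = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

before = spread_summary(base)
after = spread_summary(with_outlier)
for name in before:
    print(f"{name:>5}: {before[name]:.2f} -> {after[name]:.2f}")
```

With these assumed values, the range, mean, and standard deviation all shift when the outlier is introduced, while the IQR stays the same because it depends only on the middle half of the sorted data.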
1 Continued) Observational Survey Data
You expand your study and randomly sample 20 students on the Colgate campus. You collect data on gender and the number of alcoholic drinks consumed per weekend during the school year; the data are listed below.
F) First, you would like several sample statistics on “Drinks.” What steps would you follow in Excel to find several statistics at once? (3 points)
G) What Excel command would you use to quickly calculate the mean number of Drinks consumed? (3 points)
H) Is “Gender” a categorical, discrete, or continuous variable? What about “Drinks”? (4 points)
I) Is it possible to run a regression to analyze the association between Gender and Drinks, or do you have to employ an alternative method? Explain. (5 points)
      A        B
 1    Gender   Drinks
 2    Male     3
 3    Female   2
 4    Male     1
 5    Female   1
 6    Female   0
 7    Female   2
 8    Male     6
 9    Male     3
10    Female   3
11    Female   0
12    Female   0
13    Female   1
14    Female   2
15    Female   0
16    Male     5
17    Male     5
18    Male     0
19    Female   2
20    Female   2
21    Female   3
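
For readers who want to check their spreadsheet work outside Excel, the following is a minimal Python sketch that recomputes summary statistics for the table above. It is not the Excel procedure the questions ask about. The variable names, and the choice to code Male as 1 and Female as 0, are assumptions made here for illustration. The last part shows a standard fact about least squares: with a 0/1 indicator as the only predictor, the fitted slope equals the difference between the two group means.

```python
import statistics

# The 20 observations from the table (spreadsheet rows 2-21).
survey = [
    ("Male", 3), ("Female", 2), ("Male", 1), ("Female", 1), ("Female", 0),
    ("Female", 2), ("Male", 6), ("Male", 3), ("Female", 3), ("Female", 0),
    ("Female", 0), ("Female", 1), ("Female", 2), ("Female", 0), ("Male", 5),
    ("Male", 5), ("Male", 0), ("Female", 2), ("Female", 2), ("Female", 3),
]

drinks = [d for _, d in survey]

# Several sample statistics for Drinks at once.
overall = {
    "n": len(drinks),
    "mean": statistics.mean(drinks),
    "median": statistics.median(drinks),
    "stdev": statistics.stdev(drinks),  # sample standard deviation
    "min": min(drinks),
    "max": max(drinks),
}

# Mean number of drinks within each gender.
male_mean = statistics.mean(d for g, d in survey if g == "Male")
female_mean = statistics.mean(d for g, d in survey if g == "Female")

# Least-squares fit of Drinks on a 0/1 indicator (assumed coding: Male = 1).
x = [1 if g == "Male" else 0 for g, _ in survey]
x_bar = statistics.mean(x)
y_bar = statistics.mean(drinks)
slope = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, drinks))
    / sum((xi - x_bar) ** 2 for xi in x)
)
intercept = y_bar - slope * x_bar

print(overall)
print(f"male mean = {male_mean:.3f}, female mean = {female_mean:.3f}")
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

For these 20 rows the overall mean is 2.05 drinks and the median is 2. The intercept matches the mean of the group coded 0, and the slope matches the gap between the two group means.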
2) Statistics in the Media
The following is an excerpt from a press release sent by the Fred Hutchinson Cancer Research Center, a well-known non-profit organization in Seattle that studies cancer prevention and treatment.
Scientific Question: To what extent can a state-of-the-art, school-based, grade 3-10 tobacco-use prevention curriculum deter youth smoking throughout and beyond high school?
Study: We randomly sampled forty geographically diverse Washington state school districts and assigned them to one of two study groups: (1) the experimental group, whose teachers received training and taught the “Hutchinson Smoking Prevention Project (HSPP)” curriculum, and (2) the control group, whose teachers taught the district’s usual health curriculum. Study participants (8,388) were the entire 3rd grade enrollments of the first two years of each district’s participation. These youth were surveyed at 5th, 7th, 9th, and 12th grade, and two years post high school, even if they subsequently left their third grade school district. Of the study participants, 93% were surveyed at 12th grade, and 94% at two years post high school.
Results: The HSPP curriculum had no significant impact on youth smoking.
                                 Control   Experimental
12th Grade
  Currently smoking at all, %    38.7      39.
  Currently smoking daily, %     25.7      25.
Two Years Post High School
  Currently smoking at all, %    41.2      39.
  Currently smoking daily, %     29.1      28.
A) What is the population? (3 points)
B) The researchers chose a sampling method more complicated than simple random sampling. What is it called? What was the likely motivation for this approach? (5 points)
C) Comment on the size of the nonresponse bias at two years post high school. Why is or isn’t it large? (4 points)
D) Does this study appear to suffer from reporting bias? (3 points)
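
The arithmetic behind question C can be sketched from the two figures quoted in the excerpt: 8,388 participants and a 94% survey rate at two years post high school. Because the percentage is rounded in the source, the head counts below are approximations, and the variable names are chosen here for illustration.

```python
participants = 8388     # study participants, from the excerpt
response_rate = 0.94    # share surveyed two years post high school

nonresponse_rate = 1 - response_rate
approx_respondents = round(participants * response_rate)
approx_nonrespondents = participants - approx_respondents

print(f"nonresponse rate: {nonresponse_rate:.0%}")
print(f"approx. respondents: {approx_respondents}")
print(f"approx. nonrespondents: {approx_nonrespondents}")
```

Under these figures, roughly 500 of the 8,388 participants were not surveyed at the final follow-up.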
5) Histograms
Eli Manning is a quarterback for the National Football League’s New York Giants. In the first three games of this season, he has completed 79 passes. His passes advance the ball an average (mean) of 12 yards per completion. A histogram describing the number of yards for each of his completions is displayed below.
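[Figure: histogram of pass completions. Horizontal axis: Yards, with ticks at 2, 6, 10, 14, 18, 22, 26, 30, 34, and 38. Vertical axis: Number of Completions, with ticks from 0 to 16 in steps of 2.]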
A) Comment on the shape of this data distribution (i.e., is it normal or something else?). (3 points)
B) Is the median number of yards a pass completion earns likely to equal 12, be greater than 12, or be less than 12? (3 points)
C) For this data, what statistic is the best description of the number of yards a typical Eli Manning pass completion earns? (3 points)
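
The bar heights of the histogram are not reproduced in this text, so the relationship between mean and median that questions B and C turn on can only be illustrated with made-up numbers. The sketch below uses an invented list of completion yardages with a few long gains; it is not Eli Manning's actual data and says nothing about the real histogram's shape.

```python
import statistics

# Hypothetical yardages: mostly short gains plus a few long completions.
hypothetical_yards = [3, 4, 5, 5, 6, 7, 8, 9, 10, 12, 15, 22, 38]

mean_yards = statistics.mean(hypothetical_yards)
median_yards = statistics.median(hypothetical_yards)

print(f"mean   = {mean_yards:.2f} yards")
print(f"median = {median_yards} yards")
```

In this invented sample the few long completions pull the mean above the median, which is the usual pattern when a distribution has a long right tail.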