Fundamentals of Statistical Inference, Exams of Statistics

Saint Petersburg College (SPC), Statistics

This document covers the key concepts and techniques in statistical inference, including the mean and standard deviation, the 68-95-99.7 rule, z-scores, hypothesis testing, effect size, statistical power, confidence intervals, and linear regression. It gives an overview of the statistical methods used to draw conclusions from sample data and make inferences about the underlying population, and it stresses the importance of understanding these principles and their practical applications in fields such as psychology, economics, and the social sciences. It also discusses the assumptions and limitations of these techniques, as well as strategies for interpreting and communicating statistical findings effectively.

Typology: Exams

2024/2025

Available from 10/14/2024


St Petersburg College Florida

Statistics (Stat 2023) Exam

Quiz Module 4

Course Title and Number:Statistics (Stat 2023) Quiz module

Exam Title:[Insert Exam Title]

Exam Date:[Insert Exam Date]

Instructor:[Insert Instructor’s Name]

Student Name:[Insert Student’s Name]

Student ID:[Insert Student ID]

Examination

180 minutes

Instructions:

1. Read each question carefully.

2. Answer all questions.

3. Use the provided answer sheet to mark your responses.

4. Ensure all answers are final before submitting the exam.

Good Luck!


What term refers to a frequency distribution that follows a bell-shaped, symmetrical, and unimodal curve? - Answer>> normal distribution

In a normal distribution, the mean is located where? - Answer>> in the middle of the curve

T/F For a normal curve, the median, mean, and mode are typically equal. - Answer>> True

T/F The area below the curve is 120%. - Answer>> False. It is 100%, or 1.

T/F The greater the standard deviation, the less spread out the normal curve. - Answer>> False. The greater the standard deviation, the more spread out the normal curve. The smaller the standard deviation, the narrower the normal curve.

A normal distribution can be defined by its ___ and ______. - Answer>> mean and standard deviation

The __-__-____ rule - Answer>> 68-95-99.7. However, the rule only works when values are exactly 1, 2, or 3 standard deviations away from the mean. In order to apply the concept of proportions to other standard deviation values, such

Josef: X = 87 - 1.75(4) = 80
Marco: X = 87 - 2.50(4) = 77
Brooklyn: X = 87 + 1.25(4) = 92

z-Scores for Sample Means from a Population - Answer>> z = (sample mean - population mean) / (population standard deviation / sqrt(sample size))

Consider the example of the exam scores from Figure 4.3 (μ = 75, σ = 5). Suppose the teacher would like to know the proportion of scores on the exam that are below 65. - Answer>>
1. Transform the raw score to a z-score: z = (65 - 75) / 5 = -2.00.
2. Use the normal table to find the proportion of scores below that z-score.
The proportion of scores below z = -2.00 is 0.02275, or 2.275%. Therefore, a test score below 65 is unusual, since only 2.275% of the scores fall below 65.

A range of values that is likely to contain the true population mean. - Answer>> confidence interval

What states that the distribution of all sample means is centered on the population mean (and approaches a normal shape as the sample size grows)? This implies that a sample mean will tend to be close to the population mean. - Answer>> The Central Limit Theorem

95% confidence interval - Answer>> The 95% confidence interval means, for example, that the intervals computed from 95% of the experiments with the given treatment will contain the true population mean. Consequently, 5% (or 1 in 20) of the experiments will produce an interval that does not contain the true population mean. A 95% confidence interval implies that the researcher is 95% confident that the population mean lies in the interval that is centered around the sample mean.

Margin of error - Answer>> z crit * (population standard deviation / sqrt(sample size))
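The z-score conversions and the table lookup above can be checked numerically. The following is a minimal sketch, not part of the original notes. It reads 87 and 4 in the Josef, Marco, and Brooklyn lines as the mean and standard deviation, and it uses Python's standard-library `NormalDist` in place of a printed normal table. The function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def raw_from_z(mean, sd, z):
    """Convert a z-score back to a raw score: X = mean + z * sd."""
    return mean + z * sd


def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """z-score of a sample mean, using the standard error pop_sd / sqrt(n)."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))


def proportion_below(x, mean, sd):
    """Proportion of a normal distribution that falls below the raw score x."""
    return NormalDist(mean, sd).cdf(x)


# Josef's raw score from z = -1.75 (assumed mean 87, standard deviation 4).
print(raw_from_z(87, 4, -1.75))

# Share of exam scores below 65 when mu = 75 and sigma = 5.
print(round(proportion_below(65, 75, 5), 5))
```

`proportion_below` plays the role of the normal table: it returns the area under the curve to the left of the raw score, which is the proportion the teacher in the example is asking about.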

T/F Hypothesis testing can determine the absolute size of the effect of the treatment. It can determine whether the treatment caused a substantial effect. - Answer>> False. It cannot determine the absolute size of the effect of the treatment. In other words, it cannot determine whether the treatment caused a substantial effect.

What do researchers calculate to determine the absolute magnitude, or size, of the treatment effect? One measure for this is... - Answer>> Effect size. One measure of effect size is Cohen's d.
Cohen's d = mean difference / standard deviation = (sample mean - population mean) / standard deviation

d = 0.2, d = 0.5, d = 0.8 - Answer>> small effect, medium effect, large effect

The probability of correctly rejecting a null hypothesis that is false - Answer>> statistical power

Power = 1 - β - Answer>> In other words, if there is an effect, the power describes the likelihood that the study will provide evidence of the effect. Power is calculated prior to beginning a research study to define the probability of committing a Type II error (failing to reject a false null hypothesis), a value known as β. The statistical power of a study is measured on a scale of 0 to 1.

Thus, researchers can calculate statistical power by using the following formula: Power = 1 - β
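To make Power = 1 - β concrete, here is a small sketch under assumptions the notes do not spell out: a one-tailed (upper) z-test with a known population standard deviation, where the true effect is expressed as Cohen's d. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_one_tailed_z(d, n, alpha=0.05):
    """Power of an upper-tailed one-sample z-test for a true effect size d.

    Assumes a known population standard deviation, so the test statistic
    is normal with mean d * sqrt(n) when the effect is real.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)        # rejection boundary
    beta = std_normal.cdf(z_crit - d * sqrt(n))   # P(Type II error)
    return 1 - beta


# Hypothetical study: medium effect (d = 0.5) with 25 participants.
print(round(power_one_tailed_z(0.5, 25), 3))
```

Raising `n`, raising `d`, or loosening `alpha` each pushes the result toward 1, and with no effect at all (`d = 0`) the function returns `alpha` itself.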

A ______ test involves a directional hypothesis, whereas a ______ test involves a non-directional hypothesis. - Answer>> one-tailed, two-tailed

What is the critical area? - Answer>> The area under the curve containing extreme values that rarely occur in the distribution. If the test statistic falls in the critical region, the result is statistically significant and you reject the null hypothesis.

What is effect size? - Answer>> While hypothesis testing can determine statistical significance, effect size can tell us how meaningful or impactful a particular result is.

Which of the following factors affects statistical power?
a. Type of hypothesis test
b. Significance level
c. Effect size
d. All of the above
- Answer>> All of the above

You are conducting an experiment that is expected to increase the mean scores for participants in the population. If the population mean is μ = 75, which statement below is the correct alternative hypothesis (Ha) for a one-tailed test?
a. μ > 75
b. μ ≥ 75
c. μ < 75
d. μ ≤ 75
- Answer>> a

A sample of n = 12 is selected from a population whose mean is μ = 80 (σ = 12). After a treatment is applied to the sample, the size of the treatment effect is d = 0.25. What was the sample mean?
a. sample mean = 79
b. sample mean = 81
c. sample mean = 83
d. sample mean = 85
- Answer>> c. d = 0.25 and d = (sample mean - μ) / σ, so sample mean = d(σ) + μ = 0.25(12) + 80 = 83.
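The last question rearranges the Cohen's d formula to recover the sample mean. The sketch below, which is not from the original notes, restates that arithmetic. The function names are illustrative. Treating the benchmarks 0.2, 0.5, and 0.8 as lower cutoffs, and calling anything smaller "negligible", are assumptions made here for the sake of a runnable example.

```python
def cohens_d(sample_mean, pop_mean, sd):
    """Cohen's d: mean difference divided by the standard deviation."""
    return (sample_mean - pop_mean) / sd


def sample_mean_from_d(d, pop_mean, sd):
    """Rearranged formula: sample mean = d * sd + population mean."""
    return d * sd + pop_mean


def effect_label(d):
    """Label an effect using 0.2 / 0.5 / 0.8 as lower cutoffs (an assumption)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # label chosen for this sketch only


# Values from the question: mu = 80, sigma = 12, d = 0.25.
mean = sample_mean_from_d(0.25, 80, 12)
print(mean, effect_label(cohens_d(mean, 80, 12)))
```

Note that the sample size n = 12 given in the question never enters the calculation, which matches the point that Cohen's d does not depend on sample size.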

In a study, how does the sample size affect the rejection of the null hypothesis and Cohen's d? Assume other factors are constant.
a. A smaller sample size increases the likelihood of rejecting the null hypothesis and decreases Cohen's d.
b. A larger sample size increases the likelihood of rejecting the null hypothesis and increases Cohen's d.
c. A small sample size decreases the likelihood of rejecting the null hypothesis and does not affect Cohen's d.
d. A larger sample size increases the likelihood of rejecting the null hypothesis and does not affect Cohen's d.
- Answer>> d. A larger sample size increases the likelihood of rejecting the null hypothesis and does not affect Cohen's d.

Why is it important to measure effect size? - Answer>> While hypothesis testing can determine statistical significance, effect size can tell us how meaningful or impactful a particular result is.

What is a confidence interval? - Answer>> A confidence interval is a range of values that is likely to contain the true population mean. The 95% confidence level is the most common.

What two things are used to calculate a confidence interval? - Answer>> Point estimate and margin of error

A sample statistic that is used to estimate the population parameter, e.g. x bar - Answer>> point estimate

A range of values likely to contain the population parameter - Answer>> confidence interval

95% confidence represents - Answer>> the range of values where you would fail to reject the null hypothesis
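The last card links a 95% confidence interval to a two-tailed test at α = 0.05. One way to see that link is sketched below, assuming a z-based interval with a known population standard deviation. The function names are hypothetical. The numbers reuse μ = 80, σ = 12, and a sample mean of 83 from the earlier Cohen's d question, and n = 100 is an invented comparison case.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(sample_mean, pop_sd, n, confidence=0.95):
    """Point estimate plus or minus the margin of error (z-based)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * pop_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def fails_to_reject(null_mean, sample_mean, pop_sd, n, alpha=0.05):
    """True when a two-tailed z-test does not reject H0: mu = null_mean."""
    z = (sample_mean - null_mean) / (pop_sd / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) < z_crit


for n in (12, 100):
    low, high = confidence_interval(83, 12, n)
    inside = low < 80 < high
    print(n, round(low, 2), round(high, 2), inside, fails_to_reject(80, 83, 12, n))
```

For each sample size the two final columns agree: the null value 80 lies inside the interval exactly when the test fails to reject it. The larger sample narrows the interval enough to reject, while the effect size (83 - 80) / 12 = 0.25 is the same in both cases.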

Since this is a two-tailed test at α = 0.05, zcrit = ±1.96.

Calculate the Test Statistic

Compare and Decide
Since z = -1.57 does not exceed the critical value of ±1.96, we fail to reject the null hypothesis. The result is not statistically significant. The results should be reported as follows: z = -1.57, p > 0.05, two-tailed

2 Sample t test used for... - Answer>> Comparison of 2 group means

ANOVA used for... - Answer>> Comparison of 3 or more group means

Which chart/graph is used to show correlation? - Answer>> Scatter Plot

Correlation Overview - Answer>>
- Measures association between 2 numeric variables
- Correlation coefficient and significance
- Correlation and causality

Correlation - Answer>> The simplest measure of a relationship between 2 variables is given by the correlation.
a) If one variable increases, does the other increase too?
b) Or does one decrease when the other increases?
c) Or is there no relationship?

Correlation Coefficient - Answer>>
- "r" is the Pearson correlation coefficient
- x, y are 2 variables with means denoted by x̄ and ȳ

What is the Pearson Correlation Coefficient? - Answer>> The average product of two standardized variables

How can normal variables x and y be transformed to standard normals? - Answer>> Use the z-score transformation

Properties of the Pearson Correlation Coefficient - Answer>>
- r can take values between -1 and 1
  - r = 0: no correlation
  - 0 < r < 1: positive correlation
    - Similar relationship
    - If one goes up, so does the other
  - -1 < r < 0: negative correlation
    - Inverse relationship
    - If one goes up, the other goes down
- The definition of r is symmetric in x and y: r(x,y) = r(y,x)
  - No dependent and independent variables

4 Graphical ways the line can go - Answer>>
- (+): the line will be straight and from left to right go up
- (-): the line will be straight and from left to right go down
- No relationship: the line will just go straight across
- Nonlinear: the line will be curved
  - Can't use a Pearson correlation coefficient if the line is curved (nonlinear)

Is r (Pearson coefficient) small or large - Answer>>
- Use Cohen's guidelines:
  - Small: r in [0.1, 0.3] for a (+) correlation OR r in [-0.3, -0.1] for a (-) correlation
  - Medium: r in [0.3, 0.5] for a (+) correlation
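The description of r as "the average product of two standardized variables" translates almost directly into code. The sketch below is one reading of that definition, not the notes' own formula. It standardizes with the population standard deviation (dividing by n) so that a plain average of the products gives r exactly. The data and function names are made up for illustration.

```python
from statistics import fmean, pstdev


def z_scores(values):
    """z-score transformation using the population standard deviation."""
    centre = fmean(values)
    spread = pstdev(values)
    return [(v - centre) / spread for v in values]


def pearson_r(x, y):
    """Average product of the standardized x and y values."""
    products = [zx * zy for zx, zy in zip(z_scores(x), z_scores(y))]
    return fmean(products)


# Invented data: a positive but imperfect linear relationship.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(round(pearson_r(x, y), 3))
print(round(pearson_r(y, x), 3))  # same value: r is symmetric in x and y
```

Swapping the arguments leaves the result unchanged, which is the symmetry property listed above, and reversing the order of `y` flips the sign without changing the magnitude.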

- Causality requires more evidence than merely a significant correlation.
- For instance, two variables may be correlated due to the impact of a third variable that was not measured.
- Other examples that could be misinterpreted:
  a) Crime and number of police: They may both be increasing, but is it a high crime rate causing more police, or more police causing a higher crime rate? There is no way to determine from the correlation alone which one is causing the other (causality).
  b) Health of community and number of nurses: They are both increasing, but is it a high number of health problems causing an increase in the number of nurses, or is it a high number of nurses causing an increase in health problems? There is no way to determine from the correlation alone which one is causing the other (causality).

Basic Mathematics of a line - Answer>>
- Slope
- Intercept

Simple Linear Regression - Answer>>
- Predict one numeric variable from knowledge of another
- Example 1: Haque and Zaritsky modeled expected systolic blood pressure for a child in terms of its age ... So if you know the child's age, you know what SBP to expect
- Example 2: Mooney, Holmes, and Christie modeled the total number of flu cases expected in any year in terms of the highest weekly growth rate of flu cases

Regression - Answer>>
- The word regression is due to Francis Galton in his 1885 paper on the inheritance of stature.
- Regression refers to the phenomenon whereby descendants of parents of extreme stature tend toward the average height of the population.
- While the term regression says something about the behavior of residuals, the main idea is to fit a line to the data at hand.

What is a line? - Answer>>
- In a scatter plot, the pattern must resemble a straight line, or a cigar shape in practice.
- Algebraically, the equation of a straight line is y = b0 + b1 x, where b0 is the intercept and b1 is the slope
- In terms of the standardized variables xz and yz, it is yz = β xz, with β replacing b1 as the slope (subscript z denotes standardization)
- y = dependent variable, x = independent variable

Slope - Answer>>
- The slope provides the rate of change.
- It represents the amount by which y changes for a single unit change in x
- Slope can take any value, positive or negative or zero

Intercept - Answer>>
- The intercept is the value of y when x is zero, i.e. the value of y at which the line intersects the y-axis
- Intercept can take any value, positive or negative or zero

b1 or Beta - Answer>> "rise over run"

Other names for slope, intercept - Answer>> The slope and intercept are also called model coefficients
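The slope and intercept definitions above can be read off a fitted line directly. The short sketch below uses the Haque and Zaritsky line that these notes quote elsewhere, SBP5 = 65 + (2 x age). The function name and default arguments are hypothetical wrappers around that equation, not anything taken from the study itself.

```python
def predict_sbp5(age_years, intercept=65.0, slope=2.0):
    """Evaluate the line y = b0 + b1 * x with b0 = 65 and b1 = 2."""
    return intercept + slope * age_years


# Intercept: the value of y when x is zero.
print(predict_sbp5(0))

# Slope ("rise over run"): change in y for a one-unit change in x.
print(predict_sbp5(11) - predict_sbp5(10))

# Prediction for a 10-year-old child.
print(predict_sbp5(10))
```

The first print recovers b0 and the second recovers b1, so the two model coefficients fully determine every prediction the line makes.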

All 3 of these are squared and summed at the end

Least Squares Solution continued

Minimizing the Residual Sum of Squares - Answer>>
- The main idea is to find b0 and b1 that minimize S.
- The algebra is easiest to write and understand in terms of standardized variables
- In this scheme, we transfer over to the standardized variables, get the solution, then transfer back to the solution in terms of the original variables.

The Solution that minimizes "S" is... - Answer>>
- The standardized slope equals the correlation coefficient
- This is the Pearson "r" correlation coefficient

Results of study by Haque and Zaritsky - Answer>>
- Systolic blood pressure (5th percentile at 50th height percentile) was modeled on child's age: SBP5 = 65 + (2 x age)
- Ex: for a 10 y/o child, SBP5 = 65 + (2 x 10) = 65 + 20 = 85
- This denotes that only 5% of kids would have a SBP < 85 and that 95% would have a SBP > 85.
- SBP5 denotes systolic blood pressure at the 5th percentile at 50th height percentile
- Age is measured in years

What happens to the line when a log is used? - Answer>> It curves

Evaluating the Regression Model - Answer>>
- Hypothesis Tests:
  a) F test (ANOVA) of model
  b) t tests of slope and intercept
- Model goodness-of-fit
- Were assumptions met?
  - Residuals are normal, homoscedastic
  - Variables are linearly related
  - Observations are independent

Hypothesis tests - Answer>>
- The primary hypothesis of the simple linear regression model is about the existence of a linear relationship
  - Null hypothesis, H0: β = 0 (β, the standardized slope, is the same as the Pearson r correlation coefficient)
  - Alternative hypothesis, H1: β ≠ 0
  - F test is used
- The secondary hypotheses are about the slope and intercept
  - Null hypothesis, slope = 0
  - Null hypothesis, intercept = 0
  - t tests are used (2/3 of the tests will have the same p-values)

ANOVA for regression model - Answer>>
- The significance of the regression model is assessed by analysis of variance after decomposing the total sum of squares into regression and residual components
- Total SS = Residual SS + Regression SS
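The least squares recipe described in these notes (standardize, take the slope as r, transform back) and the sum-of-squares decomposition can be sketched together. This is an illustration under stated assumptions, not the notes' own derivation: it uses population standard deviations for the standardization, the back-transformation b1 = r * sy / sx and b0 = ȳ - b1 * x̄, and an invented five-point data set. The function names are hypothetical.

```python
from statistics import fmean, pstdev


def fit_line(x, y):
    """Least squares line via standardized variables.

    The standardized slope equals Pearson's r; transforming back gives
    b1 = r * sy / sx and b0 = mean(y) - b1 * mean(x).
    """
    mx, my = fmean(x), fmean(y)
    sx, sy = pstdev(x), pstdev(y)
    r = fmean([((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)])
    b1 = r * sy / sx
    b0 = my - b1 * mx
    return b0, b1, r


def sums_of_squares(x, y):
    """Return (total SS, regression SS, residual SS) for the fitted line."""
    b0, b1, _ = fit_line(x, y)
    my = fmean(y)
    fitted = [b0 + b1 * a for a in x]
    total = sum((b - my) ** 2 for b in y)
    regression = sum((f - my) ** 2 for f in fitted)
    residual = sum((b - f) ** 2 for b, f in zip(y, fitted))
    return total, regression, residual


x = [1, 2, 3, 4, 5]   # invented data
y = [2, 1, 4, 3, 5]
total, regression, residual = sums_of_squares(x, y)
print(fit_line(x, y))
print(total, regression, residual)
```

Up to floating-point rounding, the regression and residual components add back to the total, which is the identity Total SS = Residual SS + Regression SS that the F test builds on.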

So for F(1, 18), the sample size is 20: the first value is always 1 in simple regression, and the 18 comes from 20 - 2.
So for F(1, 58), the sample size is 60: the first value is always 1 in simple regression, and the 58 comes from 60 - 2.

t tests of coefficients - Answer>>
- The null hypotheses are that: (1) slope = 0, and (2) intercept = 0
- Reject the null if t > t(n-2, crit) or if t < -t(n-2, crit), where t(n-2, crit) is the critical value, generally the 95% probability point, of the t distribution with n - 2 degrees of freedom

95% confidence intervals - Answer>>
- The 95% confidence intervals of the regression coefficients are obtained from the standard error of each coefficient and the t distribution:
  - Slope: b1 ± t(n-2, crit) SE(b1)
  - Intercept: b0 ± t(n-2, crit) SE(b0)
  - (SE: Standard Error)

R2 goodness-of-fit - Answer>>
- R2 is a summary measure of goodness-of-fit that is widely used
- Definition: R2 = Reg SS / Total SS
- R2 is a number between 0 and 1, with high values indicating a good model
- It is the proportion of variance of the DV that is explained by the linear model in terms of the IV. (DV: Dependent Variable, IV: Independent Variable)
- Note: The IV is also called the predictor in regression models
- Remember:
  a) The denominator will ALWAYS be at least as big as the numerator, so that means...
  b) R2 can NEVER be greater than 1 and it will ALWAYS be between 0 and 1.
  c) The closer to 1, the better the model

How to judge a model using R2 - Answer>>
- In simple regression (1 DV + 1 IV), R is rigidly tied to the slope: R = β = r.
- Adapting the Cohen guidelines for r to R2 suggests that it is:
  - Small, if R2 is in [0.01, 0.09] (1% - 9%)
  - Medium, if R2 is in [0.09, 0.25] (9% - 25%)
  - Large, if R2 > 0.25 (> 25%)

Assumptions of Regression - Answer>>
1) Ratio of cases to IVs must be substantial
   - A rule of thumb is: N > 50 + 8k for multiple regression with k IVs, i.e. N > 58 for simple regression.
2) Outliers must be removed from the IVs and the DV
3) Independence: data values (both DV and IV) are independent of each other
   - Data must not be collected repeatedly from the same subject
   - Interviewer learning curve for a questionnaire
Other assumptions of regression must be tested after computing the model.
4) Normality: Residuals (errors) are normally distributed
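Several of the rules in this part of the notes are simple enough to express as small helper functions. The sketch below is illustrative only: the function names are hypothetical, the sums of squares in the example are invented, and assigning the shared boundary value 0.09 to "medium" is an assumption, since the notes list it in both the small and the medium range.

```python
def r_squared(regression_ss, total_ss):
    """R2 = Reg SS / Total SS."""
    return regression_ss / total_ss


def r_squared_label(r2):
    """Size label from the adapted Cohen guidelines for R2."""
    if r2 > 0.25:
        return "large"
    if r2 >= 0.09:   # boundary value assigned to "medium" (an assumption)
        return "medium"
    if r2 >= 0.01:
        return "small"
    return "below the small range"


def f_degrees_of_freedom(n):
    """Degrees of freedom of the F test in simple regression: (1, n - 2)."""
    return 1, n - 2


def minimum_cases(k):
    """Smallest whole number N that satisfies N > 50 + 8k for k IVs."""
    return 50 + 8 * k + 1


r2 = r_squared(6.4, 10.0)          # invented sums of squares
print(r2, r_squared_label(r2))
print(f_degrees_of_freedom(20))    # sample size 20
print(minimum_cases(1))            # simple regression, one IV
```

Because the regression sum of squares can never exceed the total, `r_squared` stays between 0 and 1 for any valid pair of inputs.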
