













STAT 2020 Exam 1 Statistics For Biologists 2025-2025 Questions With Well Solved Answers Accurately A+ Score
descriptive statistics
methods of organizing, summarizing, and presenting data in an informative way
inferential statistics
methods for drawing conclusions about a phenomenon (population) on the basis of data (sample)
what are the two branches of statistics?
descriptive and inferential
population
all subjects or items of interest; its size is denoted by N
sample
a group (or subset) selected from a population, whose size is denoted by n
parameter
a number that describes a characteristic of a population
statistic
a number that describes a characteristic of a sample
what do we use to estimate unobserved values of parameters?
observed values of a statistic (of a sample)
when is a statistic unbiased?
if the mean of its sampling distribution is the same as the parameter it is intended to estimate
individuals
objects described in a set of data, may be people, animals, plants, or things
variable
any property that characterizes an individual, could be age, gender, blood pressure, flower color
what does it mean when a distribution of data is skewed?
that it is not symmetric; it extends more to one side than the other
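left skew (negative skew)
the longer tail of the distribution extends to the left
right skew (positive skew)
the longer tail of the distribution extends to the right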
where are the mean and the median relative to the mode in a left skew?
the mean and median are to the LEFT of the mode
where are the mean and the median relative to the mode in a right skew?
the mean and median are to the RIGHT of the mode
is the mean resistant to skew and outliers?
no
is the median resistant to skew and outliers?
yes
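A quick way to see the difference in resistance is to add one extreme value to a small data set and compare both measures. The sketch below uses made-up numbers (the value 60 is a hypothetical outlier) and Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical measurements; 60 is an outlier added for illustration.
values = [4, 5, 5, 6, 7]
with_outlier = values + [60]

mean_before, mean_after = mean(values), mean(with_outlier)
median_before, median_after = median(values), median(with_outlier)

# The mean is pulled strongly toward the outlier; the median barely moves.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

With these numbers the mean moves from 5.4 to 14.5, while the median only moves from 5 to 5.5.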
in what kind of data will the mean, median, mode, and midrange be approximately the same?
in data that is approximately symmetric with only one mode
for asymmetric data, what kind of measure of center should you report?
mean and median
variation
measure of the amount that values within a data set vary among themselves
standard deviation
measure of variation of values about the mean
is the standard deviation resistant to skew or outliers?
no
variance
measure of variation equal to the square of the standard deviation
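As a small illustration of how the two measures relate, the sketch below computes the sample standard deviation and sample variance of a made-up data set with Python's `statistics` module. The data values are hypothetical; `stdev` and `variance` use the n - 1 divisor.

```python
from statistics import mean, stdev, variance

# Hypothetical data set, chosen so the mean is exactly 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(data)   # values vary about this mean
s = stdev(data)       # sample standard deviation (divides by n - 1)
s2 = variance(data)   # sample variance

# The variance is the square of the standard deviation.
print(center, s, s2, s ** 2)
```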
exploratory data analysis
process of using statistical tools to investigate data sets in order to understand their important characteristics, i.e. center, variation, distribution, outliers, and change over time
what values can an outlier have a significant effect on?
mean, standard deviation, and the scale of the histogram
bivariate (or paired) data
data with two variables measured on each individual; can be analyzed to determine whether there is an association between the two variables (here, only linear associations between quantitative variables are explored)
what does it mean when a correlation exists between two variables?
when one of them is linearly related to the other in some way
how should you begin the investigation into any association between two variables?
constructing a scatterplot!
what is r?
the linear correlation coefficient
what does the linear correlation coefficient (r) measure?
the strength and direction of a linear association
how is r related to ρ?
r is a sample statistic that estimates the population correlation coefficient ρ
what are the three requirements for making inferences about ρ using r?
the data must be a random sample, the points must approximate a straight-line pattern, and outliers must be removed if they are known to be errors
what numbers is r always between?
-1 and 1
does the value of r change if all values of either variable are converted to a different scale?
no
is the value of r affected by the choice of x and y?
no
what can we conclude when r is close to zero?
there is no significant linear correlation between x and y
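The properties of r listed above can be checked numerically. The sketch below is a minimal hand-written version of the linear correlation coefficient (the function name `pearson_r` and the data are hypothetical), computed as the average product of standardized x and y values with an n - 1 divisor.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Linear correlation coefficient of paired data x, y."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return sum((a - x_bar) / s_x * (b - y_bar) / s_y
               for a, b in zip(x, y)) / (n - 1)

# Hypothetical paired data with a strong positive linear pattern.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
r_swapped = pearson_r(y, x)                          # swap the roles of x and y
r_rescaled = pearson_r([10 * a + 3 for a in x], y)   # convert x to another scale

print(r, r_swapped, r_rescaled)  # all three values agree
```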
what is variable x in a regression equation?
the independent, predictor, or explanatory variable
what is variable y in a regression equation?
the dependent or response variable
least-squares regression line
the unique line such that the sum of the signed vertical distances (residuals) between the data points and the line is zero, and the sum of the squared vertical distances is the smallest possible
how can we interpret the slope of the regression line?
how much we expect y to change for every unit change in x
to find the slope of a regression line you must find s_y and s_x; what do these mean?
the standard deviation of the response variable y and the standard deviation of the explanatory variable x
least-squares regression is only for what kind of associations?
linear
what does the coefficient of determination (r^2) measure?
the proportion of variation in y that is explained by x
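The regression quantities above fit together in a few lines. This is a sketch on made-up data, assuming the usual textbook formulas slope = r * s_y / s_x and intercept = y_bar - slope * x_bar; all variable names are illustrative.

```python
from statistics import mean, stdev

# Hypothetical paired data (x = explanatory, y = response).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)

# Linear correlation coefficient r.
r = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / ((n - 1) * s_x * s_y)

slope = r * s_y / s_x              # expected change in y per unit change in x
intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
r_squared = r ** 2                 # proportion of variation in y explained by x

print(slope, intercept, r_squared)
```

For these numbers the line is approximately y = 2.2 + 0.6x, with r^2 of about 0.6.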
influential individual
an observation that markedly changes the regression if removed
what is the difference between an outlier and an influential individual?
when an influential individual is removed, the regression line changes substantially; when an outlier that is not influential is removed, the regression line changes very little
residuals
the signed vertical distances from each point to the least-squares regression line (observed y minus predicted y)
by definition, what is the sum of all the residuals?
zero
what kind of residual do points have that are above the least-squares regression line?
positive residual
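The facts about residuals can be verified on a small example. The sketch below fits a least-squares line to made-up data using the standard sums-of-squares formulas, then computes each residual as observed y minus predicted y.

```python
from statistics import mean

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residual = observed y minus the y predicted by the line.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

# Points above the line have positive residuals; points below have negative ones.
above_line = [(a, b) for a, b, e in zip(x, y, residuals) if e > 0]

print(residuals)
print(sum(residuals))  # zero up to floating-point rounding
print(above_line)
```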
probability sampling
individuals or units are randomly selected; the sampling process is unbiased
simple random sample (SRS)
made of randomly selected individuals; each individual in the population has the same probability of being in the sample
stratified random sample
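divide the population into groups (strata) of similar individuals, then take a separate random sample from each stratum, so that the sample has set percentages of individuals of certain types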
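The two sampling schemes can be sketched with Python's `random` module. Everything here is hypothetical: a population of 100 plants, 30 grown in shade and 70 in sun, and a sample size of 10. The stratified version assumes proportional allocation, which is only one possible way to set the percentages.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: (label, stratum) pairs.
population = [("plant%d" % i, "shade" if i < 30 else "sun") for i in range(100)]
sample_size = 10

# Simple random sample: every individual has the same chance of selection.
srs = random.sample(population, sample_size)

# Stratified random sample: group by stratum, then sample within each group.
strata = {}
for unit in population:
    strata.setdefault(unit[1], []).append(unit)

stratified = []
for name, members in strata.items():
    k = round(sample_size * len(members) / len(population))  # proportional share
    stratified.extend(random.sample(members, k))

print(srs)
print(stratified)  # always 3 shade plants and 7 sun plants
```

The SRS may over- or under-represent shade plants by chance; the stratified sample fixes their share by design.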
sample survey
an observational study that relies on a random sample drawn from the entire population
incidence
rate of new cases per year
prevalence
rate of all cases at one point in time
undercoverage or selection bias
parts of the population are systematically left out
nonresponse
some people choose not to answer/participate
wording effects
biased or leading questions, complicated/confusing statements can influence survey results
response bias
fancy term for lying or forgetting
case-control studies
start with two random samples of individuals with different outcomes and look for exposure factors in the subjects' past
cohort studies
enlist individuals of a common demographic and keep track of them over a long period of time; individuals who later develop a condition are compared with those who don't
cross-sectional studies
in cellular biology experiments what does a positive control predict?
that outcomes change
how do experiments use replication?
several or many individuals are studied
Hawthorne effect
term used to describe a bias that may occur due to behavior modification because of study enrollment (observer effect)
double blind experiment
neither the subjects nor the experimenter know which individuals received which treatment
completely randomized experimental design
individuals are randomly assigned to groups, then the groups are assigned to treatments completely at random
repeated measures and matched pairs designs
matched pairs: choose pairs of subjects that are closely matched and randomly assign which subject in each pair receives which treatment. repeated measures: give each subject the treatments over time in a random order, so we have repeated measures for each subject
two-way tables
summarize data about two categorical variables collected on the same set of individuals
how do you obtain the marginal distributions of a two-way table?
from the row totals and column totals, each divided by the grand total
conditional distribution
distribution of one factor for each level of the other factor, computed using counts within a single row or column. denominator is the corresponding row or column total
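Marginal and conditional distributions can be computed directly from the counts. The sketch below uses a hypothetical two-way table (treatment by outcome) with invented counts; the dictionary layout is just one convenient representation.

```python
# Hypothetical counts: rows = treatment, columns = outcome.
table = {
    "drug":    {"improved": 30, "not improved": 20},
    "placebo": {"improved": 15, "not improved": 35},
}

row_totals = {row: sum(cols.values()) for row, cols in table.items()}
col_totals = {}
for cols in table.values():
    for col, count in cols.items():
        col_totals[col] = col_totals.get(col, 0) + count
grand_total = sum(row_totals.values())

# Marginal distributions: row or column totals divided by the grand total.
marginal_rows = {row: t / grand_total for row, t in row_totals.items()}
marginal_cols = {col: t / grand_total for col, t in col_totals.items()}

# Conditional distribution of outcome within each treatment:
# each count divided by its own row total.
conditional = {
    row: {col: count / row_totals[row] for col, count in cols.items()}
    for row, cols in table.items()
}

print(marginal_rows)
print(marginal_cols)
print(conditional)
```

Here 60% of the drug group improved compared with 30% of the placebo group, a comparison the raw counts alone would hide if the groups had different sizes.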
random event
individual outcomes are uncertain, but there is still a regular distribution of outcomes in a large number of repetitions
probability
proportion of times that an outcome will occur in a very long series of repetitions
discrete sample space
the set of possible outcomes of a discrete variable, which can take on only certain separate values
binomial distributions
models for some categorical variables, typically representing the number of successes in a series of n independent trials
what is n in a binomial distribution?
total number of observations
what is p in a binomial distribution?
probability of success on each observation
binomial coefficient
counts number of ways in which k successes can be arranged among n observations
binomial probability
count of ways k successes can be arranged among n observations multiplied by the probability of any specific arrangement of the k successes
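The binomial probability described above can be written as C(n, k) * p^k * (1 - p)^(n - k). The sketch below is a minimal version using `math.comb` for the binomial coefficient; the function name `binomial_pmf` and the example values of n, k, and p are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent observations with success probability p."""
    ways = comb(n, k)                           # binomial coefficient
    one_arrangement = p ** k * (1 - p) ** (n - k)  # probability of one specific arrangement
    return ways * one_arrangement

# Hypothetical example: 2 successes in 5 observations with p = 0.5.
prob = binomial_pmf(2, 5, 0.5)
total = sum(binomial_pmf(k, 5, 0.5) for k in range(6))

print(prob)   # 10 arrangements * (1/32) each
print(total)  # probabilities over k = 0..5 sum to 1
```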
Poisson distributions
describe the count X of occurrences of an event in fixed, finite intervals of time or space (for example, the number of items in containers)
what is mu in Poisson probabilities?
the population mean number of occurrences in a specified interval of time or space
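For reference, the Poisson probability of exactly k occurrences is e^(-mu) * mu^k / k!. The sketch below implements that formula with the standard `math` module; the function name `poisson_pmf` and the value mu = 2 are assumptions made for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """P(X = k) when the mean number of occurrences per interval is mu."""
    return exp(-mu) * mu ** k / factorial(k)

mu = 2  # hypothetical mean count per interval

p_zero = poisson_pmf(0, mu)                                 # no occurrences
p_at_most_2 = sum(poisson_pmf(k, mu) for k in range(3))     # 0, 1, or 2 occurrences
p_total = sum(poisson_pmf(k, mu) for k in range(60))        # effectively all outcomes

print(p_zero, p_at_most_2, p_total)
```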
what numbers are involved in the rule for any normal curve?
68 - 95 - 99.7
how can you obtain the area between two z-values?
first get the area under the curve for each z-value and subtract the smaller area from the larger area
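The subtraction method can be sketched with `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later), whose `cdf` method gives the area under the curve to the left of a value. The helper name `area_between` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def area_between(z_low, z_high):
    """Area under the standard normal curve between two z-values."""
    # Area to the left of the larger z minus area to the left of the smaller z.
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

within_1 = area_between(-1, 1)
within_2 = area_between(-2, 2)
within_3 = area_between(-3, 3)

print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```

The three areas come out close to 0.68, 0.95, and 0.997, which is where the rule for normal curves gets its numbers.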
normal quantile plot
a graphical method of assessing whether a data set has an approximately normal distribution
sampling distribution of a statistic
the probability distribution of that statistic for samples of a given size n taken from a given population
central limit theorem
when randomly sampling from any population with mean μ and standard deviation σ, when n is large enough the sampling distribution of the sample mean (x bar) is approximately normal, with mean μ and standard deviation σ/√n
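A small simulation can make this concrete. The sketch below assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1), draws many samples of size n = 50, and looks at the distribution of the sample means. The population, sample size, number of repetitions, and seed are all arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

n = 50       # size of each sample
reps = 2000  # number of samples drawn

# Each entry is the mean of one random sample from the skewed population.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

mean_of_means = mean(sample_means)  # should be close to the population mean, 1
sd_of_means = stdev(sample_means)   # should be close to 1 / sqrt(n)

print(mean_of_means, sd_of_means, 1 / n ** 0.5)
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is far from normal.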
law of large numbers