








Stast 1602 Exam 1 Introduction To Data Science With Python 2025 Questions With Right Answers Verified A+ 100 %
Typology: Exams
What comprises data analysis?
Exploration (identifying patterns), prediction (making informed guesses), inference (Will the patterns appear in new observations? How accurate?)
What is the data science work flow?
Import->tidy->transform->visualize->model->communicate
What is an observational study?
Study in which investigators have no control over how data were generated
What are individuals?
Items under study
What is a treatment?
Factor we believe may cause some effect
What is an outcome?
Measured on each individual; manifestation of effect
How do you establish causality?
Observe an association then perform analysis that supports a causal relationship
What is a confounding factor?
Factor that causes treatment and control group to be systematically different
How do you avoid confounding factors?
Assign individuals to treatment and control at random; conduct an RCT
What is an RCT?
Randomized controlled trial; a study that randomly assigns individuals to treatment or control in order to establish causality; it takes more resources and must take ethics into account
When is there an association between a treatment and outcome?
If there is a systematic difference in outcomes of treatment and control group
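The idea of random assignment and of comparing outcomes between groups can be sketched in plain Python. Everything here is hypothetical: the 20 individual IDs, the simulated outcomes, and the built-in effect size `TRUE_EFFECT` are made up for illustration only.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical individuals, identified by the numbers 0..19
individuals = list(range(20))
random.shuffle(individuals)            # random order removes systematic choice
treatment = set(individuals[:10])      # first half goes to treatment
control = set(individuals[10:])        # second half goes to control

# Made-up outcomes: a baseline of 50, plus an assumed effect for the
# treatment group, plus random noise
TRUE_EFFECT = 5
outcomes = {
    person: 50 + (TRUE_EFFECT if person in treatment else 0) + random.gauss(0, 1)
    for person in range(20)
}

# A systematic difference between the group averages indicates an association
difference = mean(outcomes[p] for p in treatment) - mean(outcomes[p] for p in control)
print(round(difference, 2))
```

Because assignment is random, the two groups should differ only by chance apart from the treatment itself, so the difference in averages lands near the assumed effect.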
What is comparison?
What is output of type(1<2)?
bool (displayed as <class 'bool'>)
What is a float?
Number with decimal
What is an int?
Whole number
What is a str?
String; text
What is output of "a"+"t"?
at
What does str( ) do?
Returns a string representation of its argument
What is output of str(2+1)?
3 (as a str instead of int)
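The type questions above can be checked directly in a Python session. This is a minimal sketch; the variable names are chosen here for illustration.

```python
# type() reports the type of the value an expression evaluates to
comparison_type = type(1 < 2)   # the comparison evaluates to a bool
joined = "a" + "t"              # + concatenates two strings
as_text = str(2 + 1)            # the sum is computed first, then converted to text

print(comparison_type)          # <class 'bool'>
print(joined)                   # at
print(repr(as_text))            # '3'

# The basic numeric and text types
print(type(2.5).__name__, type(2).__name__, type("text").__name__)  # float int str
```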
What are boolean values?
Represent truth of statement as either True or False
What does bool( ) do?
Tests an object for its truth value; by default objects are True unless defined to be False; False=0, True=1
What are boolean comparison operators?
>, <, ==, is
What are boolean operators?
and, or, not
What is output of True or 2?
True
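A short sketch of truth values and the boolean operators in plain Python. The sample objects are arbitrary choices for illustration.

```python
# bool() tests an object's truth value; zero, empty containers and None are False
truthiness = {repr(obj): bool(obj) for obj in (0, 1, "", "text", [], [0], None)}
print(truthiness)

# `or` returns its first truthy operand, `and` returns the first falsy
# operand or else the last one
result_or = True or 2     # True: the first operand is already truthy
result_or_zero = 0 or 2   # 2: the first operand is falsy, so the second is returned
result_and = True and 2   # 2: both are truthy, so the last one is returned
result_not = not 0        # True: 0 is falsy

print(result_or, result_or_zero, result_and, result_not)  # True 2 2 True
print(True == 1, False == 0)                              # True True
```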
What is a python statement?
Single line of code that either returns a value or causes something to happen
What is the relation between expressions and statements?
Any expression is a statement but not every statement is an expression
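One way to see the difference, as a rough sketch: the built-in `eval` accepts only expressions, while `exec` accepts any statement. The variable names are illustrative.

```python
# An expression produces a value, so eval() can evaluate it
value = eval("1 + 2")

# An assignment is a statement; exec() runs it and stores x in the namespace
namespace = {}
exec("x = 1 + 2", namespace)

# The same assignment is not an expression, so eval() rejects it
try:
    eval("x = 1 + 2")
    assignment_is_expression = True
except SyntaxError:
    assignment_is_expression = False

print(value, namespace["x"], assignment_is_expression)  # 3 3 False
```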
What is a table?
Collection of data stored in rows and columns; each column is a numpy array
What does Table( ) do?
Creates empty table
What is a method?
Function associated with an object
What is an attribute?
Variable associated with an object
What is a property?
Special type of method that doesn't need to be called with ( ) after the name
What does num_columns do?
Returns number of columns in table it refers to
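The distinction between attribute, method, and property can be sketched with a small class. `TinyTable` is a hypothetical stand-in written for this example; it is not the Table class used in the course.

```python
class TinyTable:
    """Hypothetical stand-in for a table, storing columns in a dict."""

    def __init__(self, columns):
        self.columns = columns           # attribute: a variable on the object

    def column(self, label):             # method: a function on the object
        return self.columns[label]

    @property
    def num_columns(self):               # property: computed, but used without ()
        return len(self.columns)


tab = TinyTable({"x": [1, 2, 3], "y": [4, 5, 6]})
print(tab.column("y"))    # method call needs parentheses -> [4, 5, 6]
print(tab.num_columns)    # property access has no parentheses -> 2
```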
What is an array?
Sequence type: ordered, and a particular member can be referred to by its relative position
What does make_array( ) do?
Makes an array
What is output of make_array('a','b','y','z')?
An array containing the four strings 'a', 'b', 'y', 'z'
What is an index?
Relative position of a member of an array
What do you start counting from when determining relative position?
0
What does array.item(3) do?
Outputs 4th item in array
What is output of table.column(0).item(3)?
Creates array of column 0 and outputs 4th item
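Zero-based positions can be illustrated with a plain Python list, used here as a stand-in for an array because it follows the same counting convention. The dict `table_columns` is a hypothetical stand-in for a table and is not part of any library.

```python
# A plain list standing in for make_array('a', 'b', 'y', 'z')
letters = ["a", "b", "y", "z"]

first = letters[0]    # position 0 is the first member
fourth = letters[3]   # position 3 is the fourth member
print(first, fourth)  # a z

# Hypothetical table stored as a dict of columns; take a column, then
# pick its fourth entry
table_columns = {"letter": letters, "rank": [1, 2, 25, 26]}
fourth_rank = table_columns["rank"][3]
print(fourth_rank)    # 26
```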
What is a variable?
Feature or characteristic that varies from individual to individual
What is a quantitative variable?
Variable that assumes numerical values on which arithmetic can be done
What does tab.scatter('col1','col2' ) do?
Creates a scatter plot of points with coordinates (a,b) taken from the values in col1 and col2 of the table
What does tab.scatter( ) do?
Makes 1 point for each row of table named tab
What does tab.plot('1','2') do?
Produces a line plot through points with coordinates (a,b) taken from columns 1 and 2; used to plot relationships over time
What are categorical variables?
Can't do arithmetic with them
What are frequencies?
Number of observations that fall in each category of variable
What is a distribution table?
Table of frequencies of categories
What does tab.barh('1','2') do?
Produces a horizontal bar chart with one bar for each category in column 1; bar lengths come from column 2
What does tab.group('1') do?
Produces new table with columns 1 and count (frequency of each category in 1)
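Counting frequencies per category can be sketched with the standard library's `Counter`, used here as a stand-in for grouping a table column. The flavor data is made up for illustration.

```python
from collections import Counter

# Made-up categorical observations
flavors = ["vanilla", "chocolate", "vanilla", "strawberry", "chocolate", "vanilla"]

# Frequency = number of observations in each category
counts = Counter(flavors)

# A distribution table as (category, count) rows, sorted by category
distribution = sorted(counts.items())
for category, count in distribution:
    print(category, count)
```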
What does tab.relabel('old_name','new_name') do?
Renames the column labeled old_name to new_name
What does tab.hist('1') do?
Creates a histogram with 1 on the horizontal axis divided into 10 bins; bars have area proportional to the number of observations in each bin
What is the area of a bar in a histogram?
Area=% of entries in bin= height of bar*width of bar; sum of all bar areas=100%
What is the vertical value in a histogram?
Percent of observations in bin relative to width of bin; bar height=area/width=%entries in bin/width=density
What is the total area of a histogram?
1 (that is, 100%)
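The area rule can be worked through by hand in plain Python. The values and bin edges below are made up, and bins are treated as half-open intervals [left, right) for simplicity.

```python
# Made-up observations and deliberately unequal bin widths
values = [1, 2, 2, 3, 5, 6, 7, 12, 15, 18]
bin_edges = [0, 5, 10, 20]

bars = []
for left, right in zip(bin_edges[:-1], bin_edges[1:]):
    count = sum(left <= v < right for v in values)   # entries in this bin
    percent = 100 * count / len(values)              # area of the bar
    width = right - left
    height = percent / width                         # density: percent per unit
    bars.append((left, right, percent, height))
    print(f"[{left}, {right}): area={percent}% height={height} per unit")

# Height times width recovers each area; the areas sum to 100%
total_area = sum(height * (right - left) for left, right, _, height in bars)
print(total_area)  # 100.0
```

The last bin holds the same share of entries as the middle one but is twice as wide, so its bar is half as tall.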
What does tab.hist('1',bins=make_array(...)) do?
Customizes the histogram's number and width of bins
How do you make a function?
def func_name(thing):
    variable = thing.action()
    return ...
func_name(object)
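A concrete instance of the template, as a minimal sketch. The function `to_percent` and its argument are invented for this example.

```python
def to_percent(proportion):
    """Convert a proportion such as 0.25 into a percentage."""
    percent = proportion * 100   # the body computes an intermediate variable
    return percent               # return hands the value back to the caller


result = to_percent(0.25)        # calling the function with an argument
print(result)                    # 25.0
```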
What does tab.apply(name_func, 'column') do?
Applies the function name_func to each value in column of tab and returns an array of the results
What does tab.group('column', max) do?
Makes a table with one row per unique value in column; the first column holds those values and the remaining columns hold the max of each other column within that group
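Applying a function to a column and grouping with max can both be sketched in plain Python. The rows, the column meanings, and the function `double` are hypothetical, and lists and dicts stand in for the table.

```python
# Made-up rows of (group label, score)
rows = [("a", 3), ("b", 7), ("a", 9), ("b", 2)]


def double(x):
    """Hypothetical function to apply to every score."""
    return 2 * x


# Like applying a function to one column: one result per row
doubled = [double(score) for _, score in rows]
print(doubled)     # [6, 14, 18, 4]

# Like grouping by label and aggregating with max: one result per group
group_max = {}
for label, score in rows:
    group_max[label] = max(score, group_max.get(label, score))
print(group_max)   # {'a': 9, 'b': 7}
```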