Lecture 4: Random Variables and Distributions, Lecture notes of Probability and Statistics

There are two types of random variables: discrete and continuous.

Typology: Lecture notes

2020/2021

Uploaded on 06/11/2021

amritay
Lecture 4: Random Variables and Distributions

Goals

  • Working with distributions in R
  • Overview of discrete and continuous distributions important in genetics/genomics
  • Random Variables

Two Types of Random Variables

  • A discrete random variable has a countable number of possible values
  • A continuous random variable takes all values in an interval of numbers

Probability Distributions of RVs

Discrete

Let X be a discrete rv. Then the probability mass function (pmf), f(x), of X is:

$$f(x) = \begin{cases} P(X = x), & x \in \Omega \\ 0, & x \notin \Omega \end{cases}$$

Continuous

Let X be a continuous rv. Then the probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with a ≤ b:

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

Using CDFs to Compute Probabilities

Continuous rv: with pdf f(x), the cdf F(x) is

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$$

and probabilities of intervals follow from the cdf:

$$P(a \le X \le b) = F(b) - F(a)$$
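The link between the pdf and the cdf can be checked numerically. Below is a minimal sketch in Python (the lecture itself works in R). It assumes, purely for illustration, an exponential distribution with rate 1, which does not appear in the slides; the helper names `pdf`, `cdf`, and `integrate` are hypothetical.

```python
import math

RATE = 1.0  # assumed rate parameter, chosen only for illustration


def pdf(x):
    """pdf of an exponential rv: f(x) = RATE * exp(-RATE * x) for x >= 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def cdf(x):
    """cdf of the same rv: F(x) = 1 - exp(-RATE * x) for x >= 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


a, b = 0.5, 2.0
prob_from_cdf = cdf(b) - cdf(a)        # P(a <= X <= b) = F(b) - F(a)
prob_from_pdf = integrate(pdf, a, b)   # integral of f(x) from a to b
print(prob_from_cdf, prob_from_pdf)    # the two values should agree closely
```

Both routes give the same probability up to numerical integration error, which is why the cdf is the convenient tool once it is known in closed form.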

Expectation of Random Variables

Continuous

The expected or mean value of a continuous rv X with pdf f(x) is:

$$\mu_X = E[X] = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$$

Discrete

Let X be a discrete rv that takes on values in the set D and has a pmf f(x). Then the expected or mean value of X is:

$$\mu_X = E[X] = \sum_{x \in D} x \cdot f(x)$$

Example of Expectation and Variance

  • Let $L_1, L_2, \ldots, L_n$ be a sequence of n nucleotides and define the rv $X_i$:
    $$X_i = \begin{cases} 1, & \text{if } L_i = A \\ 0, & \text{otherwise} \end{cases}$$
  • pmf is then: $P(X_i = 1) = P(L_i = A) = p_A$ and $P(X_i = 0) = P(L_i = \text{C or G or T}) = 1 - p_A$
  • $E[X] = 1 \times p_A + 0 \times (1 - p_A) = p_A$
  • $\mathrm{Var}[X] = E[(X - \mu)^2] = E[X^2] - \mu^2 = [1^2 \times p_A + 0^2 \times (1 - p_A)] - p_A^2 = p_A(1 - p_A)$
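The expectation and variance of the indicator variable above can be reproduced directly from its pmf. This is a small Python sketch, not part of the original slides; the value `p_A = 0.3` is an assumed frequency of nucleotide A, and the function names are hypothetical.

```python
def expectation(pmf):
    """E[X] = sum of x * f(x) over all values x with pmf f."""
    return sum(x * prob for x, prob in pmf.items())


def variance(pmf):
    """Var[X] = E[X^2] - (E[X])^2."""
    mu = expectation(pmf)
    second_moment = sum(x ** 2 * prob for x, prob in pmf.items())
    return second_moment - mu ** 2


p_A = 0.3  # assumed probability that a nucleotide is A (illustrative only)
indicator_pmf = {1: p_A, 0: 1 - p_A}  # pmf of the indicator rv X_i

mean = expectation(indicator_pmf)  # equals p_A
var = variance(indicator_pmf)      # equals p_A * (1 - p_A)
print(mean, var)
```

The same two helpers work for any discrete rv whose pmf is given as a value-to-probability mapping.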

The Distributions We’ll Study

  1. Binomial Distribution
  2. Hypergeometric Distribution
  3. Poisson Distribution
  4. Normal Distribution

Binomial Distribution

pmf:

$$P\{X = x\} = \binom{n}{x} p^x (1 - p)^{n - x}$$

cdf:

$$P\{X \le x\} = \sum_{y=0}^{x} \binom{n}{y} p^y (1 - p)^{n - y}$$

E(X) = np

Var(X) = np(1 - p)
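As a sanity check on the binomial pmf, cdf, mean, and variance, the following Python sketch evaluates them by direct summation. It is an illustration only: the parameters `n = 10` and `p = 0.25` are assumed, and `binom_pmf` and `binom_cdf` are hypothetical helper names.

```python
from math import comb


def binom_pmf(x, n, p):
    """P{X = x} = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """P{X <= x}: sum of the pmf over y = 0, ..., x."""
    return sum(binom_pmf(y, n, p) for y in range(x + 1))


n, p = 10, 0.25  # assumed parameters, for illustration only

# Mean and variance computed from the pmf by direct summation.
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))

print(mean, n * p)            # both should be close to np
print(var, n * p * (1 - p))   # both should be close to np(1 - p)
```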

Binomial Distribution: Example 1

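  • A couple, who are both carriers for a recessive disease, wish to have 5 children. They want to know the probability that they will have four healthy kids. Each child is healthy with probability 0.75, so with n = 5:

$$P\{X = 4\} = \binom{5}{4}\, 0.75^4 \times 0.25^1$$

[Figure: pmf p(x) plotted for x = 0, 1, 2, 3, 4, 5]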
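The example can be worked through numerically. The Python sketch below assumes that each child is independently healthy with probability 3/4, as expected when both parents carry one copy of a recessive allele; the function name `binom_pmf` is a hypothetical helper.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial pmf: probability of exactly x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n_children = 5
p_healthy = 0.75  # assumed: 1 - 1/4 chance of inheriting both recessive alleles

prob_four_healthy = binom_pmf(4, n_children, p_healthy)
print(prob_four_healthy)  # approximately 0.3955

# Full pmf over the number of healthy children, x = 0, ..., 5.
pmf_table = [binom_pmf(x, n_children, p_healthy) for x in range(n_children + 1)]
print(pmf_table)
```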

Hypergeometric Distribution

  • The population to be sampled consists of a finite number N of individuals, objects, or elements
  • Each individual can be characterized as a success or failure, with m successes in the population
  • A sample of size k is drawn and the rv of interest is X = number of successes in the sample

Hypergeometric Distribution

  • Similar in spirit to Binomial distribution, but from a finite population without replacement
  • Example: 20 white balls out of 100 balls. If we randomly sample 10 balls, what is the probability that 7 or more are white?
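The ball-sampling question can be answered by summing the hypergeometric pmf over 7, 8, 9, and 10 white balls. The Python sketch below uses the standard pmf, P(X = x) = C(m, x) C(N - m, k - x) / C(N, k), which is not written out in the slides; the notation N, m, k follows the earlier slide, and `hypergeom_pmf` is a hypothetical helper name.

```python
from math import comb


def hypergeom_pmf(x, N, m, k):
    """P(X = x): x successes in k draws without replacement
    from a population of N items containing m successes."""
    return comb(m, x) * comb(N - m, k - x) / comb(N, k)


N, m, k = 100, 20, 10  # 100 balls, 20 white, sample of 10

# P(X >= 7) = P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10)
p_at_least_7 = sum(hypergeom_pmf(x, N, m, k) for x in range(7, min(m, k) + 1))
print(p_at_least_7)
```

Because only 20% of the balls are white, drawing 7 or more white balls in a sample of 10 is a very unlikely outcome.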

Hypergeometric Distribution

  • Extensively used in genomics to test for “enrichment”, based on four counts:
    • Number of annotated genes
    • Number of genes of interest
    • Number of genes with annotation
    • Number of genes of interest with annotation
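One way to turn these four counts into an enrichment test is a one-sided hypergeometric tail probability. The Python sketch below is an assumption about how the counts map onto the hypergeometric parameters (N = annotated genes, m = genes with the annotation, k = genes of interest, x = genes of interest with the annotation); the slides do not spell this mapping out, and all the counts used here are made up for illustration.

```python
from math import comb


def hypergeom_pmf(x, N, m, k):
    """P(X = x) for k draws without replacement from N items with m successes."""
    return comb(m, x) * comb(N - m, k - x) / comb(N, k)


def enrichment_p_value(x_observed, N, m, k):
    """One-sided tail P(X >= x_observed): chance of seeing at least this
    many annotated genes among the genes of interest by random sampling."""
    upper = min(m, k)
    return sum(hypergeom_pmf(x, N, m, k) for x in range(x_observed, upper + 1))


# Hypothetical counts, for illustration only.
N = 1000  # number of annotated genes (the population)
m = 50    # number of genes with the annotation being tested
k = 40    # number of genes of interest
x = 8     # number of genes of interest with the annotation

p_value = enrichment_p_value(x, N, m, k)
print(p_value)  # a small value suggests the annotation is over-represented
```

With these made-up counts, about 2 annotated genes would be expected among the genes of interest by chance (k × m / N), so observing 8 gives a small tail probability.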

Poisson Distribution

  • Useful in studying rare events
  • Poisson distribution also used in situations where “events” happen at certain points in time
  • Poisson distribution approximates the binomial distribution when n is large and p is small
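The approximation in the last bullet can be illustrated by comparing the two pmfs side by side. The Python sketch below uses the standard Poisson pmf, P(X = x) = e^(-λ) λ^x / x!, with λ = np; the values `n = 1000` and `p = 0.002` are assumed for illustration, and the function names are hypothetical.

```python
from math import comb, exp, factorial


def binom_pmf(x, n, p):
    """Binomial pmf: C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Poisson pmf: exp(-lam) * lam^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)


n, p = 1000, 0.002  # assumed: large n, small p
lam = n * p         # matching Poisson rate, lambda = np

# Compare the two pmfs for x = 0, ..., 10.
for x in range(11):
    print(x, binom_pmf(x, n, p), poisson_pmf(x, lam))

max_gap = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
print(max_gap)  # largest pointwise difference over this range
```

Increasing n while holding np fixed should shrink the gap further, which is the sense in which the Poisson distribution approximates the binomial.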