PROBABILITY DISTRIBUTIONS

History

How random variables came to be interpreted as maps from a sample space to the real numbers has long been a subject of interest. The modern interpretation certainly postdates the invention of sets and maps (around 1900), but, as Eremenko notes, random variables were used much earlier, before mathematicians felt the need to interpret them as maps. In 1812, Laplace published his Théorie analytique des probabilités, in which he laid down many fundamental results in probability and statistics.


Random Variable

A random variable X is a function defined on a sample space S into the real numbers R such that the inverse image of every point, subset, or interval of R is an event in S to which a probability is assigned.


Illustration 1

                 Suppose a coin is tossed once. The sample space consists of two sample points, H (head) and T (tail).

                                            That is, S = { T, H }

                                            Let X : S → R be the number of heads.

                                       Then X(T) = 0 and X(H) = 1

Thus X is a random variable that takes on the values 0 and 1. If X(ω) denotes the number of heads, then

                                            X(ω) = 0 if ω = T,  and  X(ω) = 1 if ω = H

Illustration 2

            A batch of 150 students is taken in 4 buses to an excursion. There are 38 students in the first bus, 36 in the second, 32 in the third, and the remaining 44 students in the fourth bus. When the buses arrive at the destination, one of the 150 students is chosen at random. Suppose that X denotes the number of students on the bus of that randomly chosen student. Then X takes on the values 32, 36, 38, and 44.
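The bus illustration above can be checked numerically. A minimal sketch, using exact fractions: since each of the 150 students is equally likely to be chosen, P(X = k) is k/150 for a bus carrying k students.

```python
from fractions import Fraction

# Illustration 2: the four buses carry 38, 36, 32, and 44 of the 150 students.
# X is the bus size of the randomly chosen student, so P(X = k) = k/150.
bus_sizes = [38, 36, 32, 44]
total = sum(bus_sizes)                          # 150 students in all
pmf = {k: Fraction(k, total) for k in bus_sizes}

assert sum(pmf.values()) == 1                   # the probabilities total one
```

Note that larger buses are more likely to be chosen: P(X = 44) = 44/150, while P(X = 32) = 32/150.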

Illustration 3

           A coin is tossed until a head occurs.

           The sample space is S = {H, TH, TTH, TTTH, …}.

           Suppose X denotes the number of times the coin is tossed until a head occurs.

           Then the random variable X takes on the values 1, 2, 3, …

Illustration 4

 Suppose N is the number of customers in the queue that arrive at a service desk during a time period. Then the sample space is the set of non-negative integers, S = { 0, 1, 2, 3, … }, and N is a random variable that takes on the values 0, 1, 2, 3, …

Illustration 5

                     If an experiment consists of observing the lifetime of an electric bulb, then the sample space is the set of possible lifetimes, S = [0, ∞). Suppose X denotes the lifetime of the bulb; then X is a random variable that takes on values in [0, ∞).

Illustration 6

                Let D be a disk of radius r. Suppose a point is chosen at random in D. Let X denote the distance of the point from the centre. Then the sample space is S = D and X is a random variable that takes on any value from 0 to r. That is, 0 ≤ X ≤ r.

Types of Random Variable

       Discrete random variables

                A random variable X defined on a sample space S into the real numbers R is called a discrete random variable if the range of X is countable; that is, X can assume only a finite or countably infinite number of values, where every value in the range has positive probability and the probabilities total one.

        Probability Mass Function

              If X is a discrete random variable taking the values x1, x2, x3, …, then the function denoted by f(·) or p(·) and defined by

            f(xk) = P(X = xk),    for k = 1, 2, 3, …    is called the probability mass function of X.

    Example  Two fair coins are tossed simultaneously (equivalent to tossing a fair coin twice). Find the probability mass function for the number of heads obtained.

     Sol:   The sample space S = {H,T} x {H,T}

               That is S = {TT,HT,TH,HH}

 Let X be the random variable denoting the number of heads. Therefore

                       X(TT) = 0          X(TH) = 1              X(HT) = 1            X(HH) = 2

 Then the random variable X takes on the values 0, 1 and 2

 Each of the four outcomes in S is equally likely, so each has probability 1/4.

The probabilities are given by

                f(0) = P(X = 0) = P({TT}) = 1/4

                f(1) = P(X = 1) = P({HT, TH}) = 2/4 = 1/2

                f(2) = P(X = 2) = P({HH}) = 1/4

              The function f(x) satisfies the conditions

                (i) f(x) ≥ 0 for x = 0, 1, 2

                (ii) Σ f(x) = f(0) + f(1) + f(2) = 1/4 + 1/2 + 1/4 = 1

Therefore f(x) is a probability mass function, given by

                                        f(x) = 1/4 for x = 0,   1/2 for x = 1,   1/4 for x = 2,   and 0 otherwise
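The two-coin probability mass function can be obtained by brute-force enumeration of the sample space. A minimal sketch in Python, using exact fractions:

```python
from fractions import Fraction
from itertools import product

# Enumerate the sample space {H, T} x {H, T}; each outcome has probability 1/4.
sample_space = list(product("HT", repeat=2))
pmf = {}
for outcome in sample_space:
    x = outcome.count("H")                     # X = number of heads
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(sample_space))

# f(0) = 1/4, f(1) = 1/2, f(2) = 1/4, and the total probability is 1.
assert sum(pmf.values()) == 1
```

The same enumeration pattern works for any experiment with finitely many equally likely outcomes.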

Cumulative Distribution Function (Distribution Function) of a Discrete Random Variable

       The cumulative distribution function F(x) of a discrete random variable X, taking the values x1, x2, x3, … such that x1 < x2 < x3 < … with probability mass function f(xi), is F(x) = P(X ≤ x) = Σ_{xi ≤ x} f(xi), for x ∈ R.

Cumulative Distribution Function from Probability Mass function

          Both the probability mass function and the cumulative distribution function of a discrete random variable X contain all the probabilistic information of X. The probability distribution of X is determined by either of them. In fact, the distribution function F of a discrete random variable X can be expressed in terms of the probability mass function f(x) of X and vice versa.

Example - If the probability mass function f (x) of a random variable X is

                           [Table: the probability mass function f(x) of X]

Find (i) its cumulative distribution function; hence find (ii) P(X ≤ 3) and (iii) P(X ≥ 2)

Solution – F(x) = P(X ≤ x) = 0 for x < 1

                     F(1) = P(X≤1) = P(X<1) + P(X = 1) = 0 +  =

                    F(2) = P(X≤2) = P(X<1) + P(X = 1) + P(X = 2) =

                     F(3) = P(X<1) + P(X = 1) + P(X = 2) + P(X = 3) =

                     F(4) = P(X<1) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) =

Therefore the cumulative distribution function is

                                                            [Piecewise table: the cumulative distribution function F(x)]

 

                  P(X≤3) = F(3) =

                  P(X≥2) = 1 – F(1) = 1 -

Probability Mass Function from Cumulative Distribution Function

Suppose X is a discrete random variable taking the values x1, x2, x3, … such that x1 < x2 < x3 < …, and F(xi) is the distribution function. Then the probability mass function f(xi) is given by

                    f(x1) = F(x1)   and   f(xi) = F(xi) − F(xi−1)   for i = 2, 3, …
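The round trip between a probability mass function and its cumulative distribution function can be sketched in a few lines. The data below are the two-coin pmf f = (1/4, 1/2, 1/4) from the earlier example, used purely for illustration:

```python
from fractions import Fraction
from itertools import accumulate

# Two-coin pmf from the earlier example: f(0) = 1/4, f(1) = 1/2, f(2) = 1/4.
f = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

# cdf by running sums: F(x_i) = f(x_1) + ... + f(x_i)
F = list(accumulate(f))

# pmf recovered by differencing: f(x_1) = F(x_1), f(x_i) = F(x_i) - F(x_(i-1))
recovered = [F[0]] + [F[i] - F[i - 1] for i in range(1, len(F))]
assert recovered == f
```

This confirms that the pmf and the cdf carry exactly the same probabilistic information, as the text states.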

Example -  A random variable X has the following probability mass function.

                          x    :    1      2      3      4      5      6
                          f(x) :    k      2k     6k     5k     6k     10k

Find (i) P(2 < X < 6)  (ii) P(2 ≤ X < 5)  (iii) P(X ≤ 4)  (iv) P(X > 3)

Sol :              Since the given function is a probability mass function, the total probability is one. That is,

                         k + 2k + 6k + 5k + 6k + 10k = 1

                          30k = 1

                            k = 1/30

Therefore the probability mass function is

                          x    :    1       2       3       4       5       6
                          f(x) :    1/30    2/30    6/30    5/30    6/30    10/30

        (i) P(2 < X < 6) = f(3) + f(4) + f(5) = 6/30 + 5/30 + 6/30 = 17/30

         (ii) P(2 ≤ X < 5) = f(2) + f(3) + f(4) = 2/30 + 6/30 + 5/30 = 13/30

        (iii) P(X ≤ 4) = f(1) + f(2) + f(3) + f(4) = 14/30 = 7/15

        (iv) P(X > 3) = f(4) + f(5) + f(6) = 5/30 + 6/30 + 10/30 = 21/30 = 7/10
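The normalization and the four probabilities above can be verified mechanically. A sketch, assuming the weights k, 2k, 6k, 5k, 6k, 10k are assigned to x = 1, …, 6 in that order (as in the normalization step):

```python
from fractions import Fraction

# Weights k, 2k, 6k, 5k, 6k, 10k for x = 1..6; the total probability fixes k.
weights = {1: 1, 2: 2, 3: 6, 4: 5, 5: 6, 6: 10}
k = Fraction(1, sum(weights.values()))         # 30k = 1, so k = 1/30
f = {x: w * k for x, w in weights.items()}

assert f[3] + f[4] + f[5] == Fraction(17, 30)              # P(2 < X < 6)
assert f[2] + f[3] + f[4] == Fraction(13, 30)              # P(2 <= X < 5)
assert sum(f[x] for x in range(1, 5)) == Fraction(7, 15)   # P(X <= 4)
assert f[4] + f[5] + f[6] == Fraction(7, 10)               # P(X > 3)
```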

 

Continuous Distributions

           Let S be a sample space and let X : S → R be a random variable that takes on any value in a set I of R. Then X is called a continuous random variable if P(X = x) = 0 for every x in I.

Probability density function

A non-negative real-valued function f(x) is said to be a probability density function of a continuous random variable X if, for every interval [a, b],

        P(a ≤ X ≤ b) = ∫_a^b f(x) dx,    with    ∫_{−∞}^{∞} f(x) dx = 1

Distribution function (Cumulative distribution function)

The distribution function or cumulative distribution function F(x) of a continuous random variable X with probability density function f(x) is

        F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt,    for x ∈ R

Distribution function from Probability density function

 Both the probability density function and the cumulative distribution function (or distribution function) of a continuous random variable X contain all the probabilistic information of X . The probability distribution of X is determined by either of them. Let us learn the method to determine the distribution function F of a continuous random variable X from the probability density function f (x) of X and vice versa.

Example -         If X is a random variable with probability density function f(x) given by,

     [Piecewise definition of f(x)]

       Find the distribution function F(x).

Sol:                                By definition,  F(x) = P(X ≤ x) =

                          When x<1,       F(x) = P(X ≤ x) =  = 0

                          When 1≤x<2,       F(x) = P(X ≤ x) =  +  =

                          When 2≤x<3   F(x) = P(X ≤ x) =  +  +  = 1-

                         When x≥ 3,  F(x) = P(X ≤ x) =  +  +  +   = 1

 

These give F(x) =

Probability density function from Probability distribution function

                Suppose F(x) is the distribution function of a continuous random variable X. Then the probability density function f(x) is given by

                                                f(x) = dF(x)/dx,    wherever the derivative exists.
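The relation f(x) = dF(x)/dx can be checked numerically. A minimal sketch, using an assumed standard example: the exponential distribution with cdf F(x) = 1 − e^(−x) for x ≥ 0, whose density is f(x) = e^(−x):

```python
import math

# Assumed illustrative cdf: F(x) = 1 - exp(-x) for x >= 0 (exponential, rate 1).
def F(x):
    return 1 - math.exp(-x) if x >= 0 else 0.0

# Numerical derivative of F by central differences approximates the density f.
def f_numeric(x, h=1e-6):
    return (F(x + h) - F(x - h)) / (2 * h)

# At x = 1 the numerical derivative is close to the true density exp(-1).
assert abs(f_numeric(1.0) - math.exp(-1)) < 1e-5
```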

Mathematical Expectation

              Suppose X is a random variable with probability mass (or density) function f(x). The expected value, mean, or mathematical expectation of X, denoted by E(X) or μ, is

                    E(X) = Σ x f(x)                   if X is discrete
                    E(X) = ∫_{−∞}^{∞} x f(x) dx       if X is continuous

Variance

      The variance of a random variable X, denoted by Var(X) or V(X) (or σx² or σ²), is

                  V(X) = E(X − E(X))² = E(X − µ)²

·         The square root of the variance is called the standard deviation.

Properties of Mathematical expectation and variance

              (i) E(aX+b) =  aE (X) + b ­ , where a and b are constants

Corollary 1: E (aX) = aE (X) (when b = 0 )

 Corollary 2: E (b) = b (when a = 0 )

               (ii) Var(X) ­ = E(X²) − [E(X)]²

               (iii) Var(aX +b) = a2 Var(X) where a and b are constants

Corollary 3: V (aX) = a2 V (X) (when b = 0)

 Corollary 4: V (b) = 0 (when a = 0)
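Properties (i) and (iii) can be verified on a concrete distribution. A sketch using the two-coin pmf f(0) = 1/4, f(1) = 1/2, f(2) = 1/4 from the earlier example, with illustrative constants a = 3 and b = 7:

```python
from fractions import Fraction

# Two-coin pmf: f(0) = 1/4, f(1) = 1/2, f(2) = 1/4.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Expectation of an arbitrary function g of X: E(g(X)) = sum g(x) f(x).
def E(g):
    return sum(g(x) * p for x, p in pmf.items())

mean = E(lambda x: x)                          # E(X) = 1
var = E(lambda x: (x - mean) ** 2)             # Var(X) = 1/2

a, b = 3, 7
# (i)   E(aX + b) = aE(X) + b
assert E(lambda x: a * x + b) == a * mean + b
# (iii) Var(aX + b) = a^2 Var(X), computed via property (ii): E(Y^2) - [E(Y)]^2
assert E(lambda x: (a * x + b) ** 2) - E(lambda x: a * x + b) ** 2 == a ** 2 * var
```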


Example -       Suppose that f (x) given below represents a probability mass function,

                                                               [Table: f(x) takes the values c², 2c², 3c², 4c², c, 2c]

Find (i) the value of c (ii) Mean and variance.

Sol:          (i) Since the total probability is one,

              c² + 2c² + 3c² + 4c² + c + 2c = 1

              10c² + 3c − 1 = 0,   that is,   (5c − 1)(2c + 1) = 0

             Since the probabilities must be non-negative, c = 1/5

Hence, the probability mass function is

                                                [Table: f(x) takes the values 1/25, 2/25, 3/25, 4/25, 1/5, 2/5]

 

(ii) To find mean and variance, let us use the following table

                                                          [Table: computation of Σ x f(x) and Σ x² f(x)]
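The value of c found in part (i) can be checked by solving the normalization equation exactly: 10c² + 3c − 1 = 0 factors as (5c − 1)(2c + 1) = 0, and only the positive root gives valid probabilities.

```python
from fractions import Fraction

# Both roots of 10c^2 + 3c - 1 = 0; only c = 1/5 yields non-negative probabilities.
for c in (Fraction(1, 5), Fraction(-1, 2)):
    assert 10 * c ** 2 + 3 * c - 1 == 0

c = Fraction(1, 5)
values = [c**2, 2 * c**2, 3 * c**2, 4 * c**2, c, 2 * c]
assert all(v >= 0 for v in values) and sum(values) == 1   # a valid pmf
```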

Theoretical Distributions: Some Special Discrete Distributions                        

       The One point distribution

·        The random variable X has a one point distribution if there exists a point x0 such that the probability mass function f(x) is defined as f(x0) = P(X = x0) = 1.

·        The mean and the variance are x0 and 0, respectively.

The Two point distribution

Unsymmetrical Case:  The random variable X has a two point distribution if there exist two values x1 and x2 such that

          f(x1) = P(X = x1) = p   and   f(x2) = P(X = x2) = q = 1 − p,   0 < p < 1

·        The mean and the variance are px1 + qx2 and pq(x2 − x1)², respectively.

     Symmetrical Case: When p = q = 1/2,

     the mean and variance are (x1 + x2)/2 and (x2 − x1)²/4, respectively.

    The Bernoulli distribution

      Let X be a random variable associated with a Bernoulli trial, defined by X(success) = 1 and X(failure) = 0, such that

          f(x) = p^x (1 − p)^(1−x),   x = 0, 1,   where 0 < p < 1

       X is called a Bernoulli random variable and f(x) is called the Bernoulli distribution.

·        If X is a Bernoulli random variable with parameter p, the mean μ and variance σ² of the Bernoulli distribution are

                                                                                  µ = p   and   σ² = pq,   where q = 1 − p
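The Bernoulli mean and variance follow directly from the definitions of expectation and variance. A quick check with an arbitrary illustrative parameter p = 1/3:

```python
from fractions import Fraction

# Bernoulli(p): f(1) = p, f(0) = 1 - p; mean p and variance pq with q = 1 - p.
p = Fraction(1, 3)                             # illustrative parameter, not from the text
f = {0: 1 - p, 1: p}

mean = sum(x * prob for x, prob in f.items())
var = sum((x - mean) ** 2 * prob for x, prob in f.items())
assert mean == p
assert var == p * (1 - p)
```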

 

 The Binomial Distribution

                A discrete random variable X is called binomial random variable, if X is the number of successes in n -repeated trials such that

(i) The n- repeated trials are independent and n is finite

(ii) Each trial results in only two possible outcomes, labelled as ‘success’ or ‘failure’

(iii) The probability of a success in each trial, denoted as p, remains constant.

  

·        The binomial random variable X equals the number of successes, with probability p of a success and q = 1 − p of a failure, in n independent trials; it has a binomial distribution, denoted by X ~ B(n, p). The probability mass function of X is

                       f(x) = nCx p^x q^(n−x),       x = 0, 1, 2, …, n

·        If X is a binomial random variable with parameters p and n, the mean μ and variance σ² of the binomial distribution are

                           µ = np   and   σ² = npq

 

Example - A fair die is rolled 10 times and X denotes the number of times 4 appears. Find the binomial distribution of X.

Sol:       X is a binomial random variable that takes on the values 0, 1, 2, 3, …, 10 with n = 10.

             The probability of getting a four in a single roll is p = 1/6, and q = 1 − p = 5/6.

             Therefore the binomial distribution is

                       f(x) = 10Cx (1/6)^x (5/6)^(10−x),       x = 0, 1, 2, …, 10
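The die example, with n = 10 and p = 1/6, can be checked numerically: the pmf values nCx p^x q^(n−x) must total one, and the mean must equal np.

```python
from fractions import Fraction
from math import comb

# X ~ B(10, 1/6): the number of fours in ten rolls of a fair die.
n, p = 10, Fraction(1, 6)

def binom_pmf(x):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)   # nCx p^x q^(n-x)

assert sum(binom_pmf(x) for x in range(n + 1)) == 1           # total probability
assert sum(x * binom_pmf(x) for x in range(n + 1)) == n * p   # mean np = 5/3
```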