Random variable


Understanding Random Variables

A random variable is a fundamental concept in probability and statistics. It is a variable whose value is determined by the outcome of a random phenomenon: formally, it assigns a numerical value to each possible outcome. Random variables are used to quantify uncertain events and are divided into two main types: discrete and continuous.

Discrete vs. Continuous Random Variables

| Aspect | Discrete Random Variable | Continuous Random Variable |
|---|---|---|
| Definition | Takes on a countable number of distinct values. | Takes on an uncountable number of values, often within an interval. |
| Examples | Number of heads in coin tosses, number of students in a class. | Weight of a person, time taken to run a race. |
| Probability | Probability mass function (PMF) | Probability density function (PDF) |
| Cumulative distribution | The CDF is a step function. | The CDF is a continuous curve. |
| Summarization | Sum probabilities directly over values. | Integrate the density over intervals. |

Probability Mass Function (PMF) and Probability Density Function (PDF)

Discrete Random Variables

For a discrete random variable $X$, the PMF is defined as:

$$ P(X = x) = p(x) $$

where $p(x)$ gives the probability that the random variable equals a particular value $x$.
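
As a quick illustration, the sketch below encodes the PMF of a fair six-sided die (a hypothetical example, not one discussed in this article) as a Python dictionary and checks the two defining properties of a PMF: nonnegativity and summing to 1.

```python
from fractions import Fraction

# A minimal sketch: the PMF of a hypothetical fair six-sided die,
# stored as a mapping from each possible value x to p(x) = 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Defining properties of a PMF: p(x) >= 0 for all x, and the
# probabilities sum to 1 over the whole support.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

print(pmf[3])  # P(X = 3) = 1/6
```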

Continuous Random Variables

For a continuous random variable $Y$, the PDF is a nonnegative function $f(y)$ that integrates to 1 over the real line:

$$ f(y) \geq 0, \qquad \int_{-\infty}^{\infty} f(y) \, dy = 1 $$

The probability that $Y$ takes on a value in an interval $(a, b)$ is given by:

$$ P(a < Y < b) = \int_{a}^{b} f(y) \, dy $$
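
This interval probability can be computed numerically. The sketch below does so for a hypothetical exponential density $f(y) = e^{-y}$ (for $y \geq 0$), assuming SciPy is available; the closed-form answer is printed for comparison.

```python
import math
from scipy.integrate import quad  # assumes SciPy is installed

# Hypothetical exponential density: f(y) = e^{-y} for y >= 0, else 0.
def f(y):
    return math.exp(-y) if y >= 0 else 0.0

# P(a < Y < b) is the integral of the density over (a, b).
a, b = 0.5, 2.0
prob, _err = quad(f, a, b)

print(prob)                         # numerical estimate
print(math.exp(-a) - math.exp(-b))  # closed form, for comparison
```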

Cumulative Distribution Function (CDF)

The CDF for a random variable $X$, denoted by $F(x)$, is defined as:

$$ F(x) = P(X \leq x) $$

For discrete random variables, $F(x)$ is a step function that jumps at each value in the support; for continuous random variables, it is a continuous curve.
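
The step-function behavior is easy to see in code. A minimal sketch, reusing the hypothetical fair-die PMF from above: $F(x)$ accumulates probability only at the support points and stays flat in between.

```python
from fractions import Fraction

# Step-function CDF for a discrete variable: F(x) accumulates the
# PMF over all support points t <= x (hypothetical fair die).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    return sum(p for t, p in pmf.items() if t <= x)

print(cdf(3))    # 1/2
print(cdf(3.5))  # still 1/2: F only jumps at the support points
```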

Expectation and Variance

Expectation (Mean)

The expectation or mean of a random variable is a measure of its central tendency.

For a discrete random variable $X$ with PMF $p(x)$:

$$ E[X] = \sum_{x} x \cdot p(x) $$

For a continuous random variable $Y$ with PDF $f(y)$:

$$ E[Y] = \int_{-\infty}^{\infty} y \cdot f(y) \, dy $$
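
Both formulas translate directly into code. The sketch below computes the discrete mean by summation (the hypothetical fair die again) and the continuous mean by numerical integration (the hypothetical exponential density), assuming SciPy is available.

```python
import math
from fractions import Fraction
from scipy.integrate import quad  # assumes SciPy is installed

# Discrete mean: E[X] = sum of x * p(x) (hypothetical fair die).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mean_discrete = sum(x * p for x, p in pmf.items())
print(mean_discrete)  # 7/2

# Continuous mean: E[Y] = integral of y * f(y) dy, here for the
# hypothetical exponential density f(y) = e^{-y} on [0, inf).
mean_continuous, _err = quad(lambda y: y * math.exp(-y), 0, math.inf)
print(mean_continuous)  # ~1.0
```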

Variance

The variance measures the spread of a random variable around its mean.

For a random variable $X$ with mean $\mu$:

$$ Var(X) = E[(X - \mu)^2] = E[X^2] - (E[X])^2 $$
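
The two forms of the variance formula can be checked against each other numerically, as in this sketch with the hypothetical fair-die PMF.

```python
from fractions import Fraction

# Both forms of the variance, computed for the hypothetical fair die:
# Var(X) = E[(X - mu)^2]  and  Var(X) = E[X^2] - (E[X])^2.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())

var_by_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())
var_by_shortcut = sum(x**2 * p for x, p in pmf.items()) - mu**2

assert var_by_definition == var_by_shortcut
print(var_by_definition)  # 35/12
```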

Examples

Example 1: Discrete Random Variable

Consider a random variable $X$ representing the number of heads when flipping a fair coin three times. The possible values of $X$ are 0, 1, 2, and 3. The PMF $p(x)$ is given by:

  • $P(X = 0) = \frac{1}{8}$
  • $P(X = 1) = \frac{3}{8}$
  • $P(X = 2) = \frac{3}{8}$
  • $P(X = 3) = \frac{1}{8}$

The expectation is:

$$ E[X] = 0 \cdot \frac{1}{8} + 1 \cdot \frac{3}{8} + 2 \cdot \frac{3}{8} + 3 \cdot \frac{1}{8} = 1.5 $$
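
Example 1 can be verified by brute force: enumerate all $2^3$ equally likely flip sequences, count the heads in each, and recover both the PMF and the mean. A minimal sketch:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 2^3 equally likely flip sequences (1 = heads) and
# count heads in each to recover the PMF of Example 1.
counts = Counter(sum(flips) for flips in product([0, 1], repeat=3))
pmf = {x: Fraction(c, 8) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)  # 0 1/8, 1 3/8, 2 3/8, 3 1/8

print(sum(x * p for x, p in pmf.items()))  # 3/2, i.e. 1.5
```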

Example 2: Continuous Random Variable

Let $Y$ be a random variable representing the time (in hours) it takes to complete a task, with the PDF given by:

$$ f(y) = \begin{cases} 2y & 0 \leq y \leq 1 \\ 0 & \text{otherwise} \end{cases} $$

The probability that the task takes between 0.5 and 1 hour is:

$$ P(0.5 < Y < 1) = \int_{0.5}^{1} 2y \, dy = [y^2]_{0.5}^{1} = 1 - 0.25 = 0.75 $$

The expectation is:

$$ E[Y] = \int_{0}^{1} y \cdot 2y \, dy = \int_{0}^{1} 2y^2 \, dy = \left[\frac{2}{3}y^3\right]_{0}^{1} = \frac{2}{3} $$
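
Both results in Example 2 can be reproduced by numerical integration, assuming SciPy is available:

```python
from scipy.integrate import quad  # assumes SciPy is installed

# The density from Example 2: f(y) = 2y on [0, 1], 0 elsewhere.
def f(y):
    return 2 * y if 0 <= y <= 1 else 0.0

prob, _err = quad(f, 0.5, 1)                 # P(0.5 < Y < 1)
mean, _err = quad(lambda y: y * f(y), 0, 1)  # E[Y]

print(round(prob, 6))  # 0.75
print(round(mean, 6))  # 0.666667
```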

Understanding random variables is crucial for analyzing and interpreting data in various fields, including economics, engineering, social sciences, and natural sciences. They provide a mathematical framework for dealing with uncertainty and making predictions based on probabilistic models.