# Lecture 4: Continuous Random Variables

STA237: Probability, Statistics, and Data Analysis I

Michael Jongho Moon

PhD Student, DoSS, University of Toronto

Wednesday, May 17, 2023

# Example: A broken watch

• Suppose you find a broken watch with only the hour hand in its place.
• What is the probability that the watch stopped at exactly 9 o’clock?
• Assume an equal probability for any position of the hour hand.
• Let $H$ be the random variable that represents the position of the hour hand. We want to compute
$P\left(H=9\right).$

• There are 12 hours on a watch.
• Is it then $P\left(H=9\right)=\frac{1}{12}?$

• What if the hour hand was off the mark by a very small amount?

• What if the hour hand was exactly half-way between 9 and 10?
• Position of the hand is a location on a continuous curve.
• There are infinitely many locations on a continuous curve.
• $P(H=9)=0$
• $H$ is an example of a continuous random variable - a random variable whose set of possible values is uncountable.
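A quick simulation sketch in R (the sample size is an arbitrary choice) illustrates why $P(H=9)=0$ while an interval around 9 still has positive probability:

```r
# Simulate many broken watches: hour-hand position uniform on [0, 12)
set.seed(237)
h <- runif(1e5, min = 0, max = 12)

# the event H = 9 exactly essentially never occurs
mean(h == 9)

# but an interval around 9 has positive probability, close to 1/12
mean(h >= 8.5 & h <= 9.5)
```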

### Other examples

• The height of a person. There is no next value after 176.33 cm.
• Waiting time at a restaurant. Time takes values on a continuum and cannot be counted.
• Continuous variables are also used to model quantities that can only be discrete in practice, such as a person’s annual income in CAD.
• How do we define probabilities associated with a continuous random variable?

We use intervals.

# Continuous random variable

Similar to a probability mass function, a probability density function uniquely defines (the behaviour of) a continuous random variable.

A random variable $X$ is continuous if for some function $f:\mathbb{R}\to\mathbb{R}$ and for any numbers $a$ and $b$ with $a\le b$,

$P\left(a\le X\le b\right)=\int_a^b f(x)\, dx.$

The function $f$ has to satisfy

(i) $f(x)\ge 0$ for all $x$, and
(ii) $\int_{-\infty}^\infty f(x) dx = 1$.

We call $f$ the probability density function of $X$ and the value $f(x)$ is the probability density of $X$ at $x$.

$P\left(a\le X\le b\right)=\int_a^b f(x) dx$

• $f(x)$ is NOT a probability
• Both a pmf and a pdf uniquely define a random variable, but a pmf maps to $[0,1]$ while a pdf maps to $[0,\infty)$
• $f(x)$ can be interpreted as a relative measure of likelihood around $x$
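As a sketch, the two conditions can be checked numerically in R using `integrate()`; the example density $f(x)=e^{-x}$ for $x\ge 0$ (and 0 otherwise) is chosen for illustration:

```r
# Example density: f(x) = exp(-x) for x >= 0, 0 otherwise
f <- function(x) ifelse(x >= 0, exp(-x), 0)

# (i) f is non-negative everywhere (checked on a grid)
all(f(seq(-5, 5, by = 0.01)) >= 0)

# (ii) f integrates to 1 over the real line (split at the kink at 0)
integrate(f, lower = -Inf, upper = 0)$value +
  integrate(f, lower = 0, upper = Inf)$value

# P(a <= X <= b) is the integral of f from a to b, e.g. a = 0, b = 1
integrate(f, lower = 0, upper = 1)$value  # 1 - exp(-1) ≈ 0.632
```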

# Continuous random variable

The definition of a cdf is the same for both discrete and continuous random variables.

The cumulative distribution function $F$ of a random variable $X$ is the function

$F:\mathbb{R}\to [0,1],$

defined by

$F(a)=P(X\le a)\quad$ $\quad\text{for }-\infty<a<\infty.$

## Properties of cumulative distribution functions

• For a continuous random variable $X$ with pdf $f$, we have
$F_X(a)=P(X\le a)=\int_{-\infty}^a f(x)\, dx$.
• For a discrete random variable $Y$ taking values $y_i$ with pmf $p$, we have
$F_Y(a)=P(Y\le a)=\sum_{y_i\le a}p(y_i)$.
• A cdf uniquely defines a distribution for both discrete and continuous random variables.
• Continuous random variables have continuous cdfs.
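The first property can be checked numerically in R; here a sketch using the exponential density (introduced later in this lecture) with rate 1:

```r
# F_X(a) as the integral of the pdf up to a, compared with the built-in cdf
a <- 2
integrate(dexp, lower = 0, upper = a, rate = 1)$value  # numerical integral
pexp(a, rate = 1)                                      # built-in cdf, ≈ 0.8647
```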

# Continuous random variable

This property provides an alternative definition.

A random variable is called continuous if its cumulative distribution function $F$ is continuous everywhere.

## More on cumulative distribution functions

• Any cdf is
1. non-decreasing,
2. right-continuous, and
3. approaching 0 at the left end ($\lim_{a\to-\infty}F(a)=0$) and 1 at the right end ($\lim_{a\to\infty}F(a)=1$)

## Example: Dekking et al. Quick Exercise 5.1

Suppose a random variable $X$ is defined by the following probability density function.

$f(x)=\begin{cases}\frac{1}{2\sqrt{x}} & \text{when }0<x<a \\0 &\text{otherwise}\end{cases}$

What is $a$?

• $\int_{-\infty}^\infty f(x) dx =1$

Since $f(x)=0$ outside $(0,a)$, $\implies \int_{-\infty}^\infty f(x)\, dx= \int_{0}^a 1\left/(2\sqrt{x})\right. dx$

• $\int_{0}^a 1\left/(2\sqrt{x})\right. dx=1$
• $\int_{0}^a x^{-1/2}\left/2\right. dx=1$
• $\left. x^{1/2}\right|_{0}^a =1$
• $a^{1/2}=1$
• $a = 1$
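A quick numerical sanity check of the answer $a=1$ in R:

```r
# The density 1/(2*sqrt(x)) on (0, 1) should integrate to 1
f <- function(x) 1 / (2 * sqrt(x))
integrate(f, lower = 0, upper = 1)$value  # ≈ 1
```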

# Quantile, percentile, and median

Let $X$ be a continuous random variable and $p$ a number between 0 and 1. The $p$th quantile or $100\cdot p$th percentile of the distribution of $X$ is the smallest number $q_p$ such that

$F(q_p)=P(X\le q_p)=p.$

The median of a distribution is its $50$th percentile.

# Quantile, percentile, and median

The previous definition is ambiguous for discrete random variables since there may not be a value $q$ that satisfies $F(q)=p$.

Let $X$ be a random variable with cumulative distribution function $F$. Then the quantile function of $X$ is the function $F^{-1}$ defined by

$F^{-1}(t) = \min \left\{x: F(x) \ge t \right\},$

for $0<t<1$.
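In R, the `q*` functions implement this quantile function $F^{-1}$; a sketch using the exponential distribution (introduced later in this lecture):

```r
# Median of Exp(1): the smallest x with F(x) >= 0.5
qexp(0.5, rate = 1)                  # log(2) ≈ 0.693
pexp(qexp(0.5, rate = 1), rate = 1)  # applying F recovers 0.5
```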

## Example: A broken watch

We assumed there is an equal likelihood of $H$ being between 0 and 12.

Its probability density function will be a constant, say $k$, over the interval from 0 to 12.

What is $k$?

The cumulative distribution function starts increasing from 0 at $H=0$ at a constant rate and reaches 1 at $H=12$. $F$ is continuous on $\mathbb{R}$.
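Since the density is a constant $k$ on an interval of length 12 and must integrate to 1, $k=1/12$. A sketch using R's built-in uniform distribution functions:

```r
# H ~ U(0, 12): constant density k = 1/12 on [0, 12]
dunif(9, min = 0, max = 12)            # 1/12 ≈ 0.0833

# the cdf rises linearly from 0 at H = 0 to 1 at H = 12
punif(c(0, 6, 12), min = 0, max = 12)  # 0.0 0.5 1.0

# probability the hour hand is within half an hour of 9
punif(9.5, 0, 12) - punif(8.5, 0, 12)  # 1/12
```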

# Common continuous distributions

## Uniform distribution

We use a uniform distribution to assign equal probabilities across a fixed interval.

It often models completely arbitrary experiments, or complete ignorance about the likelihood of outcomes.

A continuous random variable has a uniform distribution on interval $[\alpha, \beta]$ if its probability density function $f$ is given by

$f(x)=\begin{cases}\frac{1}{\beta-\alpha} & \alpha \le x\le \beta\\ 0 &\text{otherwise.}\end{cases}$

We denote this distribution by $U(\alpha,\beta)$.

## Uniform distribution

We use a uniform distribution to assign equal probabilities across a fixed interval.

It often models completely arbitrary experiments, or complete ignorance about the likelihood of outcomes.

$Y \sim U(\alpha, \beta)$

## Example: Air duct cleaning scam calls

Suppose Michael receives approximately $r$ air duct cleaning scam calls every year.

Let the random variable $T$ be the amount of time between two consecutive calls.

To compute the distribution of $T$, we model the calls as a Poisson process …

• divide 1 year into $n$ equal-length intervals
• make the intervals small enough that Michael can receive at most 1 call per $1/n$-year interval
• assume the $1/n$-year intervals are independent of each other and each has the same probability of containing a call

## Example: Air duct cleaning scam calls

$\vdots$

To compute the distribution of $T$, we model the calls as a Poisson process …

• divide 1 year into $n$ equal-length intervals
• make the intervals small enough that Michael can receive at most 1 call per $1/n$-year interval
• assume the $1/n$-year intervals are independent of each other and each has the same probability of containing a call

Then, $p_n=r/n$ represents the probability of getting a scam call in any $1/n$-year interval.

$P(T>t\text{ years})$

• $=P(T>t\times n\times1/n\text{-year intervals})$
• $=\left(1-p_n\right)^{t\cdot n}$
• $=\left(1-\frac{r}{n}\right)^{t\cdot n}$

Let $n\to\infty$.

• $P(T>t\text{ years})$
$= \lim_{n\to\infty}\left(1 - r\cdot\frac{1}{n}\right)^{t\cdot n}$
• $=e^{-t\cdot r}$

To compute $F_T(t)$, we can use

$F_T(t)=P(T\le t)=1-e^{-rt}.$

Taking its derivative gives its pdf.

$f_T(x) = \frac{d}{dx} \left(1-e^{-rx}\right) = re^{-rx}$

$T$ is an example of an exponential random variable.
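We can check the derived cdf against R's built-in exponential cdf; the rate of $r=3$ calls per year below is an arbitrary assumption for illustration:

```r
r <- 3    # assumed: about 3 scam calls per year (illustrative)
t <- 0.5  # half a year

1 - exp(-r * t)   # derived cdf F_T(t), ≈ 0.7769
pexp(t, rate = r) # built-in exponential cdf, same value
```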

## Exponential distribution

Exponential random variables are often used to model the time until the next event in a Poisson process. $\lambda$ is the expected rate of events.

A continuous random variable has an exponential distribution with parameter $\lambda$, $\lambda>0$, if its probability density function $f$ is given by

$f(x) = \begin{cases} \lambda e^{-\lambda x} & x\ge0\\ 0 & \text{otherwise.} \end{cases}$

We denote this distribution by $\text{Exp}(\lambda)$.

## Exponential distribution

Exponential random variables are often used to model the time until the next event in a Poisson process. $\lambda$ is the expected rate of events.

While $F_Y$ is continuous everywhere, $f_Y$ is discontinuous at $0$.

$Y \sim \text{Exp}(1)$

### Example: Customer arrivals

(Adapted from Devore & Berk)

Let $X$ be the time (hr) between two successive arrivals at the drive-up window of a local bank. Suppose $X$ has an exponential distribution with $\lambda=\lambda_0$.

What is the probability that no customer shows up for the first 2 hours after opening?

Suppose 2 hours have passed since opening without a customer. What is the probability that no customer shows up for the next 2 hours?

The probability that no customer shows up in the first 2 hours is

• $P(X>2)$
• $=1-F(2)$
• $=1-\int_0^2 \lambda_0 e^{-\lambda_0x} dx$
• $=1+\left.e^{-\lambda_0 x}\right|_0^2$
• $=1-e^{-\lambda_0\cdot 0}+e^{-\lambda_0\cdot 2}$
• $=e^{-2\lambda_0}$

The probability that no customer shows up for next 2 hours after no customer showed up for the first 2 hours is

• $P(X>4 | X>2)$
• $=\frac{P(\{X>4\}\cap\{X>2\})}{P(X>2)}$

because $\{X>4\}$ implies $\{X>2\}$.

• $=\frac{P(X>4)}{P(X>2)}=\frac{e^{-4\lambda_0}}{e^{-2\lambda_0}}=e^{-2\lambda_0}$

$P(X>4|X>2)=P(X>2)$

Whether there was a customer in the past 2 hours does not change the probability of a customer’s arrival in the next 2 hours.

### Memoryless property of exponential random variables

For any $s,t>0$,

\begin{align} & P(X>s + t | X>s) \\ = & \frac{P(X>s + t)}{P(X>s)} \\ = & \frac{1-\left(1-e^{-\lambda(s+t)}\right)}{1-\left(1-e^{-\lambda s}\right)} \\ =&\frac{e^{-\lambda s}e^{-\lambda t}}{e^{-\lambda s}} \\ = & P(X>t)\end{align}

The timing of a past event does not change the probability of the timing for the next event.
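The memoryless property can also be verified numerically in R for arbitrary $s$ and $t$:

```r
# P(X > s + t | X > s) versus P(X > t) for X ~ Exp(lambda)
lambda <- 2; s <- 1; t <- 1.5  # arbitrary choices for illustration
lhs <- (1 - pexp(s + t, rate = lambda)) / (1 - pexp(s, rate = lambda))
rhs <- 1 - pexp(t, rate = lambda)
c(lhs, rhs)  # the two probabilities agree
```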

## Gamma distribution

$\Gamma(\cdot)$ is called the gamma function and $\Gamma(n)=(n-1)!$ when $n$ is a positive integer.

A continuous random variable has a gamma distribution with parameter $\alpha$ and $\beta$, $\alpha>0$ and $\beta>0$, if its probability density function $f$ is given by

$f(x)=\frac{1}{\Gamma(\alpha)}\beta^\alpha x^{\alpha-1}e^{-\beta x}\quad\text{for }x>0.$

We denote this distribution by $\text{Gamma}(\alpha, \beta)$.
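As a sketch, the density formula can be compared against R's `dgamma()`; note that R's `rate` argument corresponds to $\beta$ here, and the parameter values below are arbitrary:

```r
alpha <- 3; beta <- 2; x <- 1.2  # arbitrary values for illustration

# density from the formula above
manual <- beta^alpha * x^(alpha - 1) * exp(-beta * x) / gamma(alpha)
# built-in gamma density
builtin <- dgamma(x, shape = alpha, rate = beta)
c(manual, builtin)  # same value

# Gamma(1, beta) reduces to Exp(beta)
dgamma(x, shape = 1, rate = beta)
dexp(x, rate = beta)
```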

## Gamma distribution

$\Gamma(\cdot)$ is called the gamma function and $\Gamma(n)=(n-1)!$ when $n$ is a positive integer.

With its two parameters, the gamma distribution is more versatile than the exponential distribution. It is used to model insurance claim amounts, rainfall, etc.

$G\sim \text{Gamma}(\alpha, \beta)$

## Normal distribution

Normal distribution, or Gaussian distribution, is central in probability theory and statistics.

It is often used to model observational errors.

A continuous random variable has a normal distribution with parameter $\mu$ and $\sigma^2$, $\sigma^2>0$, if its probability density function $f$ is given by

$f(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right\}.$

We denote the distribution by $N(\mu,\sigma^2)$.

## Normal distribution

Normal distribution, or Gaussian distribution, is central in probability theory and statistics.

It is often used to model observational errors.

A normal distribution has a symmetric shape around its centre.

$\mu$ controls the centre of the distribution (location) while $\sigma$ controls the spread of the distribution (scale).

$X_{\mu,\sigma} \sim N(\mu, \sigma^2)$

## Standard normal distribution

Standard normal distribution is a special case of normal distribution.

We can transform any normal random variable $X\sim N(\mu, \sigma^2)$ to $Z$ by $Z = \frac{X-\mu}{\sigma}.$

A normal distribution with $\mu=0$ and $\sigma^2=1$ is called the standard normal distribution.

We often denote a standard normal random variable by $Z$, $Z\sim N(0,1)$, its pdf with $\phi$, and its cdf with $\Phi$.

$\phi(z) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}z^2}$

$\Phi(a) = \int_{-\infty}^a\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}z^2}dz$
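$\Phi$ has no closed form, but R evaluates it with `pnorm()`; a sketch comparing it against direct numerical integration of $\phi$:

```r
# Phi(1.96) via the built-in cdf and via numerical integration of the pdf
pnorm(1.96)                                         # ≈ 0.975
integrate(dnorm, lower = -Inf, upper = 1.96)$value  # same value

# symmetry of the standard normal: Phi(-z) = 1 - Phi(z)
pnorm(-1) + pnorm(1)  # 1
```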

## Standard normal distribution

There is no closed-form expression for $F(a)=\int_{-\infty}^a f(x)\, dx$.

To compute probabilities for any normal random variable, we can

1. transform the variable to $Z$ and use a look-up table for $\Phi$ (sometimes $1-\Phi$), or
2. use R or similar.

$Z = \frac{X-\mu}{\sigma}.$

## Example: Computing probabilities of a normal random variable

Suppose $X\sim N(1, 4^2)$. Find

1. $P(X > 2)$
2. $P(X\le 0)$
3. $q_{0.25}$

## $X\sim N(1, 4^2)$

$P(X > 2)$

• $= P(X\ge2)$

$P(X=2)=0$

• $= P\left(\frac{X-1}{4}\ge\frac{2-1}{4}\right)$
• $= P\left(Z\ge\frac{1}{4}\right)$
• $\approx 0.4013$

## $X\sim N(1, 4^2)$

$P(X > 2)$

• $= P(X\ge2)$

$P(X=2)=0$

• $= P\left(\frac{X-1}{4}\ge\frac{2-1}{4}\right)$
• $= P\left(Z\ge\frac{1}{4}\right)$
• $\approx 0.4013$
```r
1 - pnorm(1/4) # standard normal
#> [1] 0.4012937
1 - pnorm(2, mean = 1, sd = 4) # normal with mu 1 and sigma 4
#> [1] 0.4012937
```

## $X\sim N(1, 4^2)$

$P(X \le 0)$

• $= P\left(\frac{X-1}{4}\le\frac{0-1}{4}\right)$
• $= P\left(Z\le-\frac{1}{4}\right)$
• $= P\left(Z\ge\frac{1}{4}\right)$

$Z$ is symmetric around $0$.

• $\approx 0.4013$

## $X\sim N(1, 4^2)$

$P(X \le 0)$

• $= P\left(\frac{X-1}{4}\le\frac{0-1}{4}\right)$
• $= P\left(Z\le-\frac{1}{4}\right)$
• $= P\left(Z\ge\frac{1}{4}\right)$

$Z$ is symmetric around $0$.

• $\approx 0.4013$
```r
pnorm(-1/4) # standard normal
#> [1] 0.4012937
pnorm(0, mean = 1, sd = 4) # normal with mu 1 and sigma 4
#> [1] 0.4012937
```

## $X\sim N(1, 4^2)$

$q_{0.25}$

• $0.25= F(q_{0.25})=P(X\le q_{0.25})$
• $0.25= P\left(Z\le \frac{q_{0.25} - 1}{4}\right)$
• $0.25= P\left(Z\ge -\frac{q_{0.25} - 1}{4}\right)$
• $\implies \frac{1-q_{0.25}}{4}\approx0.675$
• $q_{0.25} \approx -1.7$

## $X\sim N(1, 4^2)$

$q_{0.25}$

• $0.25= F(q_{0.25})=P(X\le q_{0.25})$
• $0.25= P\left(Z\le \frac{q_{0.25} - 1}{4}\right)$
• $0.25= P\left(Z\ge -\frac{q_{0.25} - 1}{4}\right)$
• $\implies \frac{1-q_{0.25}}{4}\approx0.675$
• $q_{0.25} \approx -1.7$
```r
1 - qnorm(0.75) * 4 # standard normal
#> [1] -1.697959
qnorm(0.25, mean = 1, sd = 4) # normal with mu 1 and sigma 4
#> [1] -1.697959
```

# R worksheet

## Install learnr and run R worksheet

1. Click here to install learnr on r.datatools.utoronto.ca

2. Follow this link to open the worksheet

If you see an error, try:

1. Find rlesson04 from the Files pane
2. Click Run Document

Other steps you may try:

1. Remove any .Rmd and .R files in the home directory of r.datatools.utoronto.ca
2. In RStudio,
   1. Click Tools > Global Options
   2. Uncheck “Restore most recently opened project at startup”
3. Run install.packages("learnr") in RStudio after the steps above or click here

# Summary

• Continuous random variables describe uncountable random outcomes using probabilities of intervals
• The probability density function and the cumulative distribution function each uniquely define the behaviour of a random variable
• Common continuous distributions include the uniform, exponential, gamma, and normal
• Standard normal distribution is a special case of normal distribution

## Practice questions

Chapter 5, Dekking et al.

• Read Section 5.4

• Quick Exercises 5.1, 5.6, 5.7

• All exercises from the chapter

• See a collection of corrections by the author here