Notes for Chapter 2: Probability and Distributions (概率与分布)


| Symbol | Meaning | Definition |
| --- | --- | --- |
| $\forall$ | for all | $\forall x: P(x)$ means $P(x)$ is true for all $x$ |

Probability 概率

Properties of a probability space 概率空间的性质


  • $P(C) = 1 - P\left(C^{c}\right)$
  • $P(\emptyset) = 0$
  • $P\left(C_{1}\right) \leq P\left(C_{2}\right)$ if $C_{1} \subset C_{2}$
  • $0 \leq P(C) \leq 1, \quad \forall C \in \mathcal{B}$
  • Inclusion-exclusion formula (容斥原理)

    $$P\left(C_{1} \cup C_{2}\right) = P\left(C_{1}\right) + P\left(C_{2}\right) - P\left(C_{1} \cap C_{2}\right)$$

    More generally:

    $$P\left(C_{1} \cup \cdots \cup C_{k}\right) = p_{1} - p_{2} + p_{3} - \cdots + (-1)^{k+1} p_{k},$$

    where

    $$p_{1} = \sum_{i=1}^{k} P\left(C_{i}\right), \quad p_{2} = \sum_{i=1}^{k} \sum_{j=i+1}^{k} P\left(C_{i} \cap C_{j}\right), \quad \ldots, \quad p_{k} = P\left(C_{1} \cap \cdots \cap C_{k}\right)$$
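As a quick sanity check, inclusion-exclusion can be verified numerically on a small finite sample space with equally likely outcomes; the three events below are arbitrary illustrative subsets:

```python
from itertools import combinations

# A small sample space with equally likely outcomes, and three
# arbitrary (illustrative) events C1, C2, C3 as subsets of it.
omega = set(range(12))
events = [{0, 1, 2, 3}, {2, 3, 4, 5}, {0, 5, 6, 7}]

def prob(s):
    """Probability of a subset under the uniform measure on omega."""
    return len(s) / len(omega)

# Left-hand side: P(C1 ∪ C2 ∪ C3), computed directly from the union.
lhs = prob(set().union(*events))

# Right-hand side: p1 - p2 + p3, where p_r sums P over all r-wise intersections.
k = len(events)
rhs = sum(
    (-1) ** (r + 1)
    * sum(prob(set.intersection(*sub)) for sub in combinations(events, r))
    for r in range(1, k + 1)
)

print(lhs, rhs)  # both equal 8/12
```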

Law of total probability 全概公式

Let $\left\{C_{1}, \ldots, C_{k}\right\}$ be a partition of the sample space, and let $C$ be any event. Then

$$P(C) = \sum_{i=1}^{k} P\left(C_{i}\right) P\left(C \mid C_{i}\right)$$

is called the law of total probability.

Bayes’ Theorem:

$$P\left(C_{j} \mid C\right) = \frac{P\left(C \cap C_{j}\right)}{P(C)} = \frac{P\left(C_{j}\right) P\left(C \mid C_{j}\right)}{\sum_{i=1}^{k} P\left(C_{i}\right) P\left(C \mid C_{i}\right)}$$
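A minimal numeric sketch of both formulas, using a hypothetical three-part partition (say, three machines producing items; all numbers are made up for illustration):

```python
# Hypothetical setup: C1, C2, C3 partition the items by machine of origin,
# and C is the event "item is defective". All numbers are made up.
prior = [0.5, 0.3, 0.2]     # P(C_i)
cond = [0.01, 0.02, 0.05]   # P(C | C_i)

# Law of total probability: P(C) = sum_i P(C_i) P(C | C_i)
p_c = sum(p * c for p, c in zip(prior, cond))

# Bayes' theorem: P(C_j | C) for each j
posterior = [p * c / p_c for p, c in zip(prior, cond)]

print(p_c)        # overall defect probability P(C)
print(posterior)  # proportional to prior * likelihood, sums to 1
```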


Bernoulli experiment (伯努利试验) and Bernoulli Distribution (伯努利分布)

A Bernoulli experiment (伯努利试验) is a random experiment performed repeatedly and independently under identical conditions; its defining feature is that each trial has only two possible outcomes: the event occurs or it does not.

Suppose the experiment is performed independently and repeatedly $n$ times; this sequence of repeated independent trials is called an $n$-fold Bernoulli experiment, or a Bernoulli scheme. A single Bernoulli trial by itself is of little interest.

Let $X$ be a random variable associated with a Bernoulli trial by defining it as follows:

$$X(\text{success}) = 1 \quad \text{and} \quad X(\text{failure}) = 0$$

The pmf of $X$ can be written as

$$p(x) = p^{x}(1-p)^{1-x}, \quad x = 0, 1$$

The Bernoulli distribution is just the familiar 0-1 distribution, i.e., the two-point distribution.

Binomial Distribution (二项分布)

Let $X$ equal the number of observed successes in $n$ Bernoulli trials; the
possible values of $X$ are $0, 1, \ldots, n$. We say that $X$ follows a binomial distribution and write $X \sim B(n, p)$. The pmf of $X$ is

$$p(x) = \begin{cases} \binom{n}{x} p^{x}(1-p)^{n-x}, & x = 0, 1, \ldots, n \\ 0, & \text{elsewhere,} \end{cases}$$

where $\binom{n}{x} = \frac{n!}{x!(n-x)!}$.
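The pmf can be evaluated directly with the standard library's `math.comb`; the check below confirms it sums to 1 and has mean $np$ (the parameter values are arbitrary):

```python
from math import comb

def binom_pmf(x, n, p):
    """pmf of B(n, p): C(n, x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # arbitrary example parameters
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                              # should be 1
mean = sum(x * q for x, q in enumerate(pmf))  # should be n * p
print(total, mean)
```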

Geometric distribution (几何分布)

Let $X$ be the number of the Bernoulli trial on which the first “yes”
appears, so $\mathcal{D}_{X} = \{1, 2, \ldots\}$.
Let $Y$ be the number of “no”s before the first “yes”, so $Y = X - 1$ and
$\mathcal{D}_{Y} = \{0, 1, \ldots\}$.

$$P(X = n) = P(Y = n - 1) = p(1-p)^{n-1}, \quad n = 1, 2, \ldots$$
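A quick numeric check of the pmf (with an arbitrary $p$): the probabilities sum to 1, and the tail probability $P(X > n)$ collapses to $(1-p)^{n}$, since $X > n$ just means the first $n$ trials were all “no”:

```python
p = 0.25  # arbitrary success probability

def geom_pmf(n):
    """P(X = n) = p (1-p)^(n-1)."""
    return p * (1 - p) ** (n - 1)

# The pmf sums to 1 (truncated far into the geometric tail).
total = sum(geom_pmf(n) for n in range(1, 500))

# P(X > n) computed from the pmf vs. the closed form (1-p)^n.
n = 5
tail = sum(geom_pmf(k) for k in range(n + 1, 500))
closed = (1 - p) ** n

print(total, tail, closed)
```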

Multinomial Distribution (多项分布)

  • This is an extension of the binomial distribution.
  • Let a random experiment be repeated $n$ independent times.
  • Each trial results in exactly one of $k$ mutually exclusive
    and exhaustive outcomes, say $C_{1}, \ldots, C_{k}$. Let $p_{i}$ be the probability that the
    outcome is an element of $C_{i}$.
  • Let $X_{i}$ be the number of outcomes that are elements of $C_{i}$. We
    have $X_{1} + X_{2} + \cdots + X_{k} = n$.
  • The joint pmf of $X_{1}, \ldots, X_{k}$ is

    $$P\left(X_{1} = n_{1}, \ldots, X_{k} = n_{k}\right) = \begin{cases} \frac{n!}{n_{1}! \cdots n_{k}!} p_{1}^{n_{1}} \cdots p_{k}^{n_{k}}, & n_{1} + \cdots + n_{k} = n \\ 0, & \text{elsewhere.} \end{cases}$$
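A direct implementation of this pmf with stdlib factorials; summing it over all count vectors with $n_1 + n_2 + n_3 = n$ should give 1 (the parameters are arbitrary):

```python
from math import factorial
from itertools import product

def multinomial_pmf(counts, probs):
    """n!/(n1!...nk!) * p1^n1 * ... * pk^nk for given counts and probs."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)  # stays an exact integer at each step
    val = float(coef)
    for c, p in zip(counts, probs):
        val *= p ** c
    return val

probs = [0.2, 0.3, 0.5]  # arbitrary cell probabilities, summing to 1
n = 4

# Sum the pmf over every (n1, n2, n3) with n1 + n2 + n3 = n.
total = sum(
    multinomial_pmf(c, probs)
    for c in product(range(n + 1), repeat=3)
    if sum(c) == n
)
print(total)  # 1 up to floating-point rounding
```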

Poisson Distribution (泊松分布)

A random variable $X$ that has a pmf

$$p(x) = \begin{cases} \frac{m^{x} e^{-m}}{x!}, & x = 0, 1, \ldots \\ 0, & \text{elsewhere,} \end{cases}$$

is said to have a Poisson distribution with parameter $m$.

Suppose $X_{1}, X_{2}, \ldots, X_{n}$ are independent random variables and suppose $X_{i}$ has a Poisson distribution with parameter $m_{i}$. Then $Y = \sum_{i=1}^{n} X_{i}$ has a Poisson distribution with parameter $\sum_{i=1}^{n} m_{i}$.
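The additivity property can be checked by convolving two Poisson pmfs directly: $P(X_1 + X_2 = y) = \sum_{j} P(X_1 = j)P(X_2 = y - j)$ should match the Poisson($m_1 + m_2$) pmf (parameter values arbitrary):

```python
from math import exp, factorial

def pois_pmf(x, m):
    """Poisson pmf: m^x e^{-m} / x!"""
    return m**x * exp(-m) / factorial(x)

m1, m2 = 2.0, 3.5  # arbitrary parameters
y = 4

# Convolution of the two independent pmfs at the point y.
conv = sum(pois_pmf(j, m1) * pois_pmf(y - j, m2) for j in range(y + 1))

print(conv, pois_pmf(y, m1 + m2))  # the two values agree
```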

Exponential Distribution (指数分布)

The exponential distribution $E(\lambda)$, with pdf

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \geq 0 \\ 0, & x < 0, \end{cases}$$

is one of the most important continuous distributions in reliability theory, queueing theory, and telephone systems.
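One reason $E(\lambda)$ is so common in simulation work: it can be sampled by inverting its CDF, $F(x) = 1 - e^{-\lambda x}$. A small sketch (seeded so the result is reproducible):

```python
import random
from math import log

def sample_exponential(lam, rng):
    """Inverse-CDF sampling: if U ~ Uniform(0,1), -log(1-U)/lam ~ E(lam)."""
    return -log(1.0 - rng.random()) / lam

rng = random.Random(0)  # fixed seed for reproducibility
lam = 2.0
xs = [sample_exponential(lam, rng) for _ in range(100_000)]

mean = sum(xs) / len(xs)
print(mean)  # close to the theoretical mean 1/lam = 0.5
```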

Gamma Distribution (伽玛分布)

The Gamma Function

The integral

$$\Gamma(\alpha) = \int_{0}^{\infty} y^{\alpha-1} e^{-y} \, dy$$

is called the gamma function of $\alpha > 0$.

  1. $\Gamma(1) = 1$
  2. $\Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1)$
  3. $\Gamma(n) = (n-1)!$ if $n$ is a positive integer
  4. $\Gamma(0^{+}) = \infty$, $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$, and the reflection formula $\Gamma(\alpha)\Gamma(1-\alpha) = \frac{\pi}{\sin(\pi\alpha)}$
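Properties 2–4 are easy to spot-check with the standard library's `math.gamma` (the test values are arbitrary):

```python
from math import gamma, pi, sin, sqrt

a = 3.7  # arbitrary test point

# Property 2: recurrence Γ(α) = (α - 1) Γ(α - 1)
lhs_rec, rhs_rec = gamma(a), (a - 1) * gamma(a - 1)

# Property 3: Γ(n) = (n - 1)! for positive integers, e.g. Γ(6) = 5! = 120
g6 = gamma(6)

# Property 4: Γ(1/2) = √π and the reflection formula at b = 0.3
g_half = gamma(0.5)
b = 0.3
refl_lhs, refl_rhs = gamma(b) * gamma(1 - b), pi / sin(pi * b)

print(lhs_rec, rhs_rec)
print(g6, g_half, sqrt(pi))
print(refl_lhs, refl_rhs)
```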

The $\Gamma$-distribution

A random variable $X$ that has a pdf of the form

$$f(x) = \begin{cases} \frac{1}{\Gamma(\alpha) \beta^{\alpha}} x^{\alpha-1} e^{-x/\beta}, & 0 < x < \infty \\ 0, & \text{elsewhere} \end{cases}$$

is said to have a gamma distribution with parameters $\alpha$ and $\beta$, where $\alpha > 0$ and $\beta > 0$. We write $X \sim \Gamma(\alpha, \beta)$ or $X \sim \operatorname{gamma}(\alpha, \beta)$.
We have

  1. $f(x) \geq 0$;
  2. $1 = \int_{0}^{\infty} \frac{1}{\Gamma(\alpha) \beta^{\alpha}} x^{\alpha-1} e^{-x/\beta} \, dx$,

as can be seen by using the substitution $y = x/\beta$ in the integral defining $\Gamma(\alpha)$:

$$\Gamma(\alpha) = \int_{0}^{\infty} \left(\frac{x}{\beta}\right)^{\alpha-1} e^{-x/\beta} \left(\frac{1}{\beta}\right) dx$$
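The same normalization can be confirmed numerically; the crude midpoint-rule integration below (arbitrary $\alpha$, $\beta$, truncated at $x = 60$, beyond which the density is negligible for these parameters) returns a value very close to 1:

```python
from math import gamma, exp

def gamma_pdf(x, alpha, beta):
    """Γ(α, β) density for x > 0."""
    return x ** (alpha - 1) * exp(-x / beta) / (gamma(alpha) * beta**alpha)

alpha, beta = 2.5, 1.5  # arbitrary parameters
h, n = 0.001, 60_000    # midpoint rule on (0, 60)

total = sum(gamma_pdf((i + 0.5) * h, alpha, beta) for i in range(n)) * h
print(total)  # approximately 1
```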

The $\Gamma$-distribution includes several useful special cases:

  1. The standard $\Gamma$-distribution $\Gamma(\alpha, 1)$ with pdf

$$f(x) = \begin{cases} x^{\alpha-1} e^{-x} / \Gamma(\alpha), & x \geq 0 \\ 0, & x < 0 \end{cases}$$

  2. The exponential distribution ($\alpha = 1$, $\lambda = 1/\beta$) with pdf

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \geq 0 \\ 0, & x < 0 \end{cases}$$

  3. The $\chi^{2}$-distribution ($\alpha = n/2$, $\beta = 2$) with pdf

$$f(x) = \begin{cases} \frac{1}{2^{n/2} \Gamma(n/2)} x^{n/2-1} e^{-x/2}, & x \geq 0 \\ 0, & x < 0 \end{cases}$$

The $\chi^{2}$-distribution

Example. If $X$ has the pdf

$$f(x) = \begin{cases} \frac{1}{4} x e^{-x/2}, & 0 < x < \infty \\ 0, & \text{elsewhere,} \end{cases}$$

then $X \sim \chi^{2}(4)$.
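Checking this example numerically: with $\alpha = n/2 = 2$ and $\beta = 2$, the general $\Gamma(\alpha, \beta)$ density reduces to $\frac{1}{4} x e^{-x/2}$:

```python
from math import gamma, exp

def gamma_pdf(x, alpha, beta):
    """Γ(α, β) density for x > 0."""
    return x ** (alpha - 1) * exp(-x / beta) / (gamma(alpha) * beta**alpha)

# χ²(4) is Γ(α = 4/2 = 2, β = 2); compare against (1/4) x e^{-x/2}
# at a few arbitrary evaluation points.
xs = [0.5, 1.0, 3.2, 7.0]
for x in xs:
    print(gamma_pdf(x, 2.0, 2.0), 0.25 * x * exp(-x / 2))  # pairs agree
```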

Let $X_{1}, X_{2}, \ldots, X_{n}$ be independent and
$X_{i} \sim \chi^{2}(n_{i})$, $i = 1, \ldots, n$. Then

$$Y = \sum_{i=1}^{n} X_{i} \sim \chi^{2}(m)$$

where $m = \sum_{i=1}^{n} n_{i}$.

The $\beta$-distribution (贝塔分布)

The beta function

$$B(a, b) = \int_{0}^{1} y^{a-1}(1-y)^{b-1} \, dy; \quad a > 0, \; b > 0$$

  1. $B(a, b) = B(b, a)$
  2. $B(a, b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$
  3. $B(a, b-a) = \int_{0}^{\infty} x^{a-1}(1+x)^{-b} \, dx$
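Identity 2 can be verified by comparing a direct numerical integration of $B(a, b)$ with the gamma-function expression (arbitrary $a$, $b$):

```python
from math import gamma

def beta_numeric(a, b, steps=200_000):
    """Midpoint-rule integration of y^(a-1) (1-y)^(b-1) over (0, 1)."""
    h = 1.0 / steps
    return sum(
        ((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
        for i in range(steps)
    ) * h

a, b = 2.5, 3.0  # arbitrary parameters
numeric = beta_numeric(a, b)
closed = gamma(a) * gamma(b) / gamma(a + b)

print(numeric, closed)  # the two values agree closely
```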

The $\beta$-distribution

Let $X_{1}$ and $X_{2}$ be two independent random variables, where $X_{1} \sim \Gamma(\alpha, 1)$ and $X_{2} \sim \Gamma(\beta, 1)$. The distribution of $B = \frac{X_{1}}{X_{1} + X_{2}}$ is called the $\beta$-distribution with parameters $\alpha$ and $\beta$, and we write $B \sim \beta(\alpha, \beta)$ or $B \sim \operatorname{beta}(\alpha, \beta)$.

Properties of the $\beta$-distribution:
The $\beta$-distribution includes

  1. The uniform distribution $= \beta(1, 1)$, with pdf 1 on $[0, 1]$ and 0 elsewhere.
  2. The arcsine distribution $= \beta(1/2, 1/2)$, with pdf

$$p(x) = \begin{cases} \frac{1}{\pi \sqrt{x(1-x)}}, & 0 \leq x \leq 1 \\ 0, & \text{elsewhere} \end{cases}$$

  3. The power distribution $= \beta(\alpha, 1)$, with pdf

$$p(x) = \begin{cases} \alpha x^{\alpha-1}, & 0 \leq x \leq 1 \\ 0, & \text{elsewhere} \end{cases}$$

Normal Distribution (正态分布)

Definition. A random variable $X$ that has a pdf

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right\}$$

is said to have a normal distribution with parameters $\mu$ and $\sigma^{2}$, and we write $X \sim N(\mu, \sigma^{2})$. When $\mu = 0$ and $\sigma^{2} = 1$, we say that $X$ follows a standard normal distribution.

Assume the random variable $X \sim N(\mu, \sigma^{2})$ with $\sigma^{2} > 0$; then the random variable $V = (X - \mu)^{2}/\sigma^{2} \sim \chi^{2}(1)$.
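The $\chi^{2}(1)$ claim follows from a change of variables: writing $Z = (X - \mu)/\sigma \sim N(0, 1)$, the density of $V = Z^{2}$ is $\varphi(\sqrt{v})/\sqrt{v}$, which is exactly the $\Gamma(1/2, 2)$ density. A pointwise check (evaluation points arbitrary):

```python
from math import exp, sqrt, pi, gamma

def normal_pdf(z):
    """Standard normal density φ(z)."""
    return exp(-z * z / 2) / sqrt(2 * pi)

def chi2_1_pdf(v):
    """χ²(1) density, i.e. Γ(α = 1/2, β = 2)."""
    return v ** (-0.5) * exp(-v / 2) / (sqrt(2) * gamma(0.5))

# Change of variables for V = Z²: the two branches ±√v each contribute
# φ(√v) * |d(√v)/dv| = φ(√v) / (2√v), so f_V(v) = φ(√v)/√v.
vs = [0.2, 1.0, 1.8, 4.0]
for v in vs:
    print(chi2_1_pdf(v), normal_pdf(sqrt(v)) / sqrt(v))  # pairs agree
```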

The $t$-distribution ($t$分布)

Let the random variables $W \sim N(0, 1)$ and $V \sim \chi^{2}(n)$ be independent.
Define a new random variable $T$ by writing

$$T = \frac{W}{\sqrt{V/n}} = \sqrt{n}\, \frac{W}{\sqrt{V}}$$

We say that $T$ follows a $t$-distribution with $n$ degrees of freedom.

The $F$-distribution ($F$分布)

Let $U \sim \chi_{m}^{2}$ and $V \sim \chi_{n}^{2}$ be independent. Then

$$F = \frac{U/m}{V/n}$$

has the pdf

$$p(f) = \begin{cases} \frac{\Gamma\left(\frac{m+n}{2}\right)(m/n)^{m/2}}{\Gamma(m/2)\Gamma(n/2)} \frac{f^{m/2-1}}{\left(1 + \frac{mf}{n}\right)^{(m+n)/2}}, & 0 < f < \infty \\ 0, & \text{elsewhere} \end{cases}$$

We say that $F$ follows an $F$-distribution with $m$ and $n$ degrees of freedom, and write $F \sim F_{m, n}$.
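As a sanity check, the pdf above integrates to 1; a crude midpoint-rule integration (arbitrary $m$, $n$, truncated at $f = 200$, where the remaining tail mass is negligible):

```python
from math import gamma

m, n = 5, 7  # arbitrary degrees of freedom

# Normalizing constant of the F(m, n) density, computed once.
c = gamma((m + n) / 2) * (m / n) ** (m / 2) / (gamma(m / 2) * gamma(n / 2))

def f_pdf(f):
    """F(m, n) density for f > 0."""
    return c * f ** (m / 2 - 1) / (1 + m * f / n) ** ((m + n) / 2)

h, steps = 0.0005, 400_000  # midpoint rule on (0, 200)
total = sum(f_pdf((i + 0.5) * h) for i in range(steps)) * h
print(total)  # approximately 1
```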