Machine Learning Probability

Random Variables
- Binary Random Variable $x \in \{0, 1\}$
- Categorical Random Variable $x \in \{1, 2, \cdots, K\}$
Probability Functions
- Probability Mass Function: returns the probability of a given outcome
- Cumulative Distribution Function: returns the probability of a value less than or equal to a given outcome
- Percent-Point Function: returns the smallest outcome whose cumulative probability is greater than or equal to a given probability (the inverse of the Cumulative Distribution Function)
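
The three functions above can be sketched for a binomial variable using only the Python standard library. The names `binom_pmf`, `binom_cdf`, and `binom_ppf` are hypothetical helpers chosen for illustration, not a library API; this is a minimal sketch that assumes the percent-point function is the inverse of the CDF.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k): sum of the PMF over the outcomes 0..k."""
    return sum(binom_pmf(i, n, p) for i in range(0, min(k, n) + 1))


def binom_ppf(q, n, p):
    """Smallest k such that P(X <= k) >= q."""
    total = 0.0
    for k in range(n + 1):
        total += binom_pmf(k, n, p)
        if total >= q:
            return k
    return n  # guards against rounding when q is very close to 1


print(binom_pmf(5, 10, 0.5))    # 0.24609375
print(binom_cdf(5, 10, 0.5))    # 0.623046875
print(binom_ppf(0.5, 10, 0.5))  # 5
```

Note that the CDF is a running sum of the PMF, and the PPF walks that running sum until it reaches the requested probability.
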
Discrete Probability Distributions
- Bernoulli Distribution
  - $x \in \{0, 1\}$
  - $P(x=1) = p$
  - $P(x=0) = 1-p$
  - binom(n=1, p=p)
- Binomial Distribution
  - The number of successes in $n$ independent Bernoulli trials (a Bernoulli process) follows a Binomial Distribution
  - binom(n=100, p=0.3)
- Multinoulli Distribution
  - $x \in \{1, 2, \cdots, K\}$
  - $P(x=1) = p_1$
  - $P(x=2) = p_2$
  - $\cdots$
  - $P(x=K) = p_K$
- Multinomial Distribution
  - The per-category counts over $n$ independent Multinoulli trials follow a Multinomial Distribution
  - multinomial(n=100, p=[0.1, 0.2, 0.7]) # the entries of p should sum to 1; the last entry may be ignored and treated as the leftover probability mass, but this should not be relied on
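
The relationships in the list above (binomial as repeated Bernoulli trials, multinomial as repeated Multinoulli trials) can be sketched by simulation with the standard library. The functions `bernoulli`, `binomial`, `multinoulli`, and `multinomial` below are hypothetical helpers written for illustration; they are not the library calls written as binom(...) and multinomial(...) above, and the seed value is an arbitrary assumption.

```python
import random
from collections import Counter


def bernoulli(p, rng):
    """One Bernoulli trial: 1 with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0


def binomial(n, p, rng):
    """Number of successes in n independent Bernoulli trials."""
    return sum(bernoulli(p, rng) for _ in range(n))


def multinoulli(probs, rng):
    """One categorical trial: returns a category in 1..K."""
    categories = list(range(1, len(probs) + 1))
    return rng.choices(categories, weights=probs, k=1)[0]


def multinomial(n, probs, rng):
    """Counts per category over n independent Multinoulli trials."""
    counts = Counter(multinoulli(probs, rng) for _ in range(n))
    return [counts[c] for c in range(1, len(probs) + 1)]


rng = random.Random(0)  # seeded so repeated runs give the same draws
successes = binomial(100, 0.3, rng)
counts = multinomial(100, [0.1, 0.2, 0.7], rng)
print(successes)  # a count between 0 and 100, typically near 30
print(counts)     # three counts that sum to 100
```

With `n=1`, `binomial` reduces to a single Bernoulli trial, which mirrors the binom(n=1, ...) entry under the Bernoulli Distribution.
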