Glossary
Maths:
Euclidean Distance
- Pronounce: “yu cly di an”
- Distance between 2 points
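A quick sketch in Python (my own toy function name): the Euclidean distance is the square root of the sum of squared coordinate differences; the stdlib `math.dist` (Python 3.8+) computes the same thing.

```python
import math

def euclidean_distance(p, q):
    # sqrt of the sum of squared coordinate differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

d = euclidean_distance((0, 0), (3, 4))
print(d)                          # 5.0
print(math.dist((0, 0), (3, 4)))  # stdlib equivalent
```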
Pythagorean theorem
- Pronounce: “py tha gor ri an”
- a^2 + b^2 = c^2
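A quick numeric check with the classic 3-4-5 right triangle; the stdlib `math.hypot` computes c directly from a and b.

```python
import math

a, b = 3, 4
c = math.sqrt(a ** 2 + b ** 2)  # a^2 + b^2 = c^2
print(c)                 # 5.0
print(math.hypot(a, b))  # same result via the stdlib
```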
Statistics:
Gaussian Distribution (a.k.a. Normal Distribution)
(Image from Wikipedia)
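A small sketch using the stdlib `random.gauss`: samples drawn from a Gaussian with mean mu and standard deviation sigma should have a sample mean and standard deviation close to those parameters.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
samples = [random.gauss(mu=0, sigma=1) for _ in range(10_000)]

print(statistics.mean(samples))    # close to 0 (the mean)
print(statistics.pstdev(samples))  # close to 1 (the standard deviation)
```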
orthogonal
- We say that 2 vectors are orthogonal if they are perpendicular to each other, i.e. the dot product of the two vectors is zero.
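A minimal check in Python (toy vectors of my own): the dot product of two perpendicular vectors is zero.

```python
def dot(u, v):
    # Sum of element-wise products
    return sum(a * b for a, b in zip(u, v))

u, v = (1, 2), (2, -1)  # perpendicular vectors in 2-D
print(dot(u, v))  # 0 -> orthogonal
print(dot(u, u))  # 5 -> a nonzero vector is never orthogonal to itself
```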
Variance
- Variance measures the average degree to which each point differs from the mean.
- Note the difference: variance is the average of the squared deviations from the mean, while standard deviation is the square root of that number
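The stdlib `statistics` module makes the difference concrete (the data set is my own example): `pvariance` is the average squared deviation from the mean, and `pstdev` is its square root.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5
var = statistics.pvariance(data)  # average squared deviation from the mean
std = statistics.pstdev(data)     # square root of the variance
print(var, std)  # variance 4, standard deviation 2.0
```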
Sample with replacement
- It means each sample drawn is put back into the pool before the next draw
- i.e. multiple draws are independent of each other
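The stdlib `random` module exposes both modes (the pool here is my own example): `random.choices` draws with replacement, `random.sample` draws without.

```python
import random

random.seed(1)
pool = ["a", "b", "c"]

# With replacement: each item goes back before the next draw,
# so 10 draws from a 3-item pool must contain repeats.
with_repl = random.choices(pool, k=10)

# Without replacement: each item can be drawn at most once.
without_repl = random.sample(pool, k=3)

print(with_repl)
print(without_repl)
```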
Stochastic
- Stochastic refers to the property of being well described by a random probability distribution.
- Stochasticity and randomness are distinct: the former refers to a modeling approach, while the latter refers to the phenomena themselves; still, the two terms are often used synonymously.
- ~= Random
Squared residual
- a.k.a. Residual sum of squares, sum of squared residuals, sum of squared estimate of errors
- It is the sum of the squares of the residuals. It measures the discrepancy between the data and an estimation model
What is a Residual? (Image from https://www.statology.org/residuals/)
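A short sketch of the computation (toy numbers of my own): each residual is an observed value minus the model's prediction, and the RSS is the sum of their squares.

```python
import math

observed  = [1.0, 2.0, 3.0]
predicted = [1.1, 1.9, 3.2]

# residual = observed - predicted, for each data point
residuals = [y - yhat for y, yhat in zip(observed, predicted)]
rss = sum(r ** 2 for r in residuals)  # residual sum of squares
print(rss)  # 0.01 + 0.01 + 0.04, up to float rounding
```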
Machine Learning
Regularizations
- Regularization is a technique that helps overcome the over-fitting problem in machine learning models. It is called regularization because it helps keep the parameters regular, or normal
- e.g. the L2 regularization in Ridge Regression
- https://medium.com/@minions.k/ridge-regression-l1-regularization-method-31b6bc03cbf
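A minimal sketch of how an L2 penalty shrinks a coefficient, using 1-D ridge regression without an intercept (a toy setup of my own, not the article's code). Minimizing sum((y - w*x)^2) + lam * w^2 gives the closed form w = sum(x*y) / (sum(x^2) + lam).

```python
def ridge_1d(xs, ys, lam):
    # Closed-form ridge solution: w = sum(x*y) / (sum(x^2) + lam)
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs, ys = [1, 2, 3], [2, 4, 6]  # exactly y = 2x
print(ridge_1d(xs, ys, 0.0))   # 2.0: no penalty recovers the true slope
print(ridge_1d(xs, ys, 1.0))   # below 2.0: the L2 penalty shrinks the weight
```

With lam = 0 this is ordinary least squares; a larger lam pulls the weight toward zero, which is the "keep the parameters regular" effect described above.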
Variance
- Note: not the variance in statistics
- Image from https://medium.com/@6453gobind/bias-variance-trade-off-87986b5b5add
- i.e. high variance = overfit
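A toy illustration of my own construction: a polynomial that interpolates every noisy training point (a high-variance model) reaches zero training error, while a simple linear fit does not. The perfect training fit is chasing the noise, which is exactly overfitting.

```python
def lagrange(xs, ys, x):
    # Degree-(n-1) polynomial that passes through every training point.
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def linear_fit(xs, ys):
    # Ordinary least-squares line y = a + b*x (closed form).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 2.3, 3.8, 6.2, 7.9, 10.1]  # roughly y = 2x plus noise

a, b = linear_fit(xs, ys)
rss_line = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
rss_interp = sum((y - lagrange(xs, ys, x)) ** 2 for x, y in zip(xs, ys))

# The interpolating polynomial has zero training error: it has memorized
# the noise. That is the high-variance / overfitting regime.
print(rss_line, rss_interp)
```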