ch2_learning
Key terms
Y = f(x1, x2, x3)
- We want to improve sales (Y) of a product.
  -> Y: output variable (dependent variable)
- We control advertising budgets: SNS (x1), streaming (x2), fliers (x3).
  -> x1, x2, x3: input variables (independent variables, predictors)
Key questions
1) What is the relationship between x1, x2, x3 and Y? -> learning
2) How accurately can we predict Y from x1, x2, x3? -> prediction
data --(learn)--> patterns, knowledge, principles (model)
data <--(apply)-- model
Formally,
- collect data: observe Yi and Xi = (Xi1, ..., Xip) for i = 1, ..., n
- assume that there is a relationship between Y and the X's
- model the relationship as Yi = f(Xi) + ei, where ei is a zero-mean random error
- statistical learning: estimate (learn) f from the data
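As a minimal sketch of this setup (the true f, the noise level, and all numbers here are hypothetical assumptions, not from the notes), simulating data from Yi = f(Xi) + ei looks like:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # hypothetical "true" relationship; in practice f is unknown
    return 2.0 + 3.0 * x

n = 100
x = rng.uniform(0, 10, size=n)       # observed predictors X_i
eps = rng.normal(0.0, 1.0, size=n)   # zero-mean random errors e_i
y = f(x) + eps                       # observed responses Y_i

# statistical learning: estimate f from (x, y) alone, without knowing f
```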
Models are useful for
1) prediction : predict Y from (new or unseen) X
2) inference : understand the relationship between X and Y
Prediction
Once we have a good model, we can predict Y from new X.
y^ (prediction, estimate) = f^(X) (estimate of f, f itself is unknown!)
(In this expression, the symbol ^ is called a "caret"; y^ is read "y hat".)
How accurate is the prediction?
Reducible vs. irreducible errors
True relationship: Y = f(X) + e
We learn f^ from data and use it for prediction: y^ = f^(X).
In general, f^ != f: this is the reducible error, which can potentially be reduced by improving f^ toward f.
But even if we knew f exactly, there would still be irreducible error: y^ = f(X) is still missing e.
Irreducible errors are caused by variables beyond the realm of X (our set of predictor variables).
Quantification of the error
- mean squared error (MSE): MSE = (1/n) * sum_i (yi - f^(xi))^2
Goal: estimate f so that the reducible error is minimized.
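Continuing the hypothetical simulation above, we can compute the MSE and see how it splits into a reducible and an irreducible part (f_hat here is a deliberately imperfect, made-up estimate):

```python
def f_hat(x):
    # a hypothetical, imperfect estimate of f
    return 1.5 + 3.2 * x

mse = np.mean((y - f_hat(x)) ** 2)            # overall error of f_hat
reducible = np.mean((f(x) - f_hat(x)) ** 2)   # gap between f_hat and the true f
irreducible = np.mean((y - f(x)) ** 2)        # ~ Var(e); even knowing f can't remove it

print(mse, reducible, irreducible)  # mse is approximately reducible + irreducible
```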
Inference
In prediction, f^ can be treated as a black box: we do not need its exact form, as long as it predicts well.
But, for inference, we want to know the exact form of f.
Understand how Y changes as a function of X1, ..., Xp.
input (X) ---> [black box f] ---> output (Y)
Inference questions
- Which predictors are associated with the response?
  e.g. among X1, ..., Xp, which are relevant?
- What is the relationship between the response and each predictor?
  e.g. does increasing x1 increase (or decrease) Y?
  e.g. does increasing x1 increase (or decrease) Y when x2 is positive?
- Can the relationship between Y and each predictor be adequately summarized with a linear equation, or is it more complicated?
Some examples of prediction vs. inference
prediction example : direct-marketing
- given 90,000 people with 400 different characteristics, want to predict how much money an individual will donate.
- should I send a mailing to a given individual?
(here we don't care how Y is estimated, only that the prediction is accurate)
x1, ..., xp: demographic data
Y: positive or negative response
Inference example : advertising
which media contribute to sales?
which media generate the biggest boost in sales? or
how much increase in sales is associated with a given increase in SNS advertising?
Inference example : Housing
How do we estimate f?
Given a set of training data {(x1, y1), (x2, y2), ..., (xn, yn)}, we want to estimate f.
Two types of approaches.
- parametric methods
- non-parametric methods
Parametric methods
- estimating f -> estimating a set of parameters
Step 1. Make an assumption about the functional form (a model) of f,
  e.g. a linear model: f(X) = β0 + β1X1 + ... + βpXp
Step 2. Use the training data to fit the model,
  e.g. estimate β0, β1, ..., βp of the linear model (using least squares).
Linear model: we only need to estimate p+1 coefficients!
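A minimal sketch of these two steps on the simulated data from earlier (np.polyfit is just one convenient way to do least squares; x and y are the hypothetical sample above):

```python
# Step 1: assume a linear form f(x) = b0 + b1 * x   (p = 1 here)
# Step 2: estimate b0, b1 by least squares
b1, b0 = np.polyfit(x, y, deg=1)   # polyfit returns the highest degree first
print(b0, b1)                      # should land near the true (2.0, 3.0)

def f_hat_linear(x_new):
    # prediction with the fitted parametric model
    return b0 + b1 * x_new
```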
Non-parametric methods
- do not make explicit assumptions about the functional form of f
- advantage: flexibility! they can fit a much wider range of shapes of f
- disadvantage: harder to learn; they require more data
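As one concrete non-parametric method (a hypothetical choice for illustration; the notes don't name a specific one), k-nearest-neighbors regression predicts at a point by averaging the responses of the k closest training points, with no assumed form for f:

```python
def knn_predict(x_new, x_train, y_train, k=5):
    # average the y values of the k training points closest to x_new
    nearest = np.argsort(np.abs(x_train - x_new))[:k]
    return y_train[nearest].mean()

print(knn_predict(5.0, x, y, k=5))  # compare with the true f(5.0) = 17.0
```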
Parametric vs. non-parametric models
- There are always parameters.
- Parametric models: the parameters are explicitly estimated (e.g. linear regression).
- Non-parametric models:
  - we choose a family of models, but we don't have direct control over the parameters
  - non-parametric models actually have far more parameters!
The bigger the better?
Q. why would we ever choose to use a more restrictive method instead of a very flexible approach?
Trade-off : flexibility vs. interpretability
[Figure: the trade-off between flexibility and interpretability across statistical learning methods. In general, as flexibility increases, interpretability decreases.]
Simple models are easier to interpret!
Back to linear regression.
y^ = β0 + β1X1 + β2X2 + ... + βpXp
βj : the average increase in Y for a one unit increase in Xj holding all other variables constant.
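A hedged multi-predictor sketch tying this back to the advertising example at the top (all budgets and effect sizes are made-up numbers):

```python
# hypothetical budgets for SNS (x1), streaming (x2), and fliers (x3)
X = rng.uniform(0, 10, size=(200, 3))
sales = 4.0 + X @ np.array([1.0, 0.5, 0.2]) + rng.normal(0, 1, size=200)

# least squares with an intercept column
A = np.column_stack([np.ones(len(X)), X])
beta_hat, *_ = np.linalg.lstsq(A, sales, rcond=None)
print(beta_hat)  # beta_hat[j] (j >= 1) estimates the average change in sales
                 # per one-unit increase in x_j, holding the other x's fixed
```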
Overfitting
A model that is too flexible ends up fitting the random error e in Y = f(X) + e rather than f itself -> poor estimation on new data.
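A quick hedged illustration on the simulated sample: a very high-degree polynomial drives the training MSE down by chasing the noise e, while its error against the true f grows (degree 12 is an arbitrary choice; np.polyfit may warn about conditioning here, which is fine for illustration):

```python
for deg in (1, 12):
    coefs = np.polyfit(x, y, deg=deg)
    y_fit = np.polyval(coefs, x)
    train_mse = np.mean((y - y_fit) ** 2)     # error against the noisy data
    true_mse = np.mean((f(x) - y_fit) ** 2)   # error against the true f
    print(deg, train_mse, true_mse)
# the flexible fit wins on train_mse but loses on true_mse: overfitting
```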
Summary
We covered the key concepts of supervised learning:
- learning: learn f from (training) data
- prediction vs. inference
- reducible vs. irreducible errors
- parametric vs. non-parametric methods for learning f
- flexibility vs. interpretability
- overfitting
References:
- "Finding the Optimal Model (Analysis: Understanding the Bias and Variance Problem)", medium.com
- "Reducible and Irreducible Errors", https://senthilkumarbala.medium.com/reducible-and-irreducible-errors-663eadace3a3
- "2.4. Linear model vs non-Linear model", wikidocs.net
- "Parametric model vs. Non-parametric model", https://process-mining.tistory.com/131