
ch2_learning2 Bias-Variance Trade-off

9taetae9 2024. 4. 16. 20:04

Bias: The gap between the real problem and our model; the expected error introduced by using a model to approximate a real-world function/relationship. In other words, the average difference between the model's predictions and the true values.

f̂(x) : predicted value

f(x) : true value

 

Variance: How much the model changes for different training data sets; the amount our predicted values would change if we had a different training dataset. It reflects the "flexibility" of our model, balanced against bias. In other words, it quantifies how much the predictions can vary across different datasets, i.e., the spread of the predicted values.
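A standard way to make the trade-off precise, consistent with the definitions above, is to decompose the expected test error at a point x_0 into squared bias, variance, and irreducible noise; a sketch in LaTeX:

```latex
% Expected test MSE at a point x_0, where y_0 = f(x_0) + \epsilon
% and the noise \epsilon has variance \sigma^2:
\mathbb{E}\!\left[\left(y_0 - \hat{f}(x_0)\right)^{2}\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\right)^{2}}_{\text{Bias}^2}
  + \underbrace{\operatorname{Var}\!\left(\hat{f}(x_0)\right)}_{\text{Variance}}
  + \underbrace{\sigma^{2}}_{\text{irreducible error}}
```

Because Bias² falls and Variance rises as flexibility increases, their sum produces the familiar U-shaped test MSE curve.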

 

more flexible -> lower bias, higher variance: as model complexity increases, bias shrinks and variance grows

 

A more flexible model can adapt more closely to the training data, reducing bias but increasing variance because it becomes sensitive to noise in the data. (possible problem: overfitting)

A highly complex model achieves high accuracy on the training data, but its performance on unseen data can suffer.

Conversely, a less flexible model may not capture complex patterns, leading to higher bias but lower variance. (possible problem: underfitting)
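As a sketch of both failure modes (hypothetical synthetic data, numpy only): fitting a rigid and a very flexible polynomial to the same noisy sample of a sine curve, then comparing training error against error on fresh test points.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    # Noisy observations of f(x) = sin(2*pi*x), noise std 0.2
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)
    return x, y

x_train, y_train = sample(20)
x_test, y_test = sample(200)

def errors(degree):
    # Fit a polynomial of the given degree on the training sample,
    # then measure mean squared error on train and test points.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

rigid_train, rigid_test = errors(1)    # high bias: a line cannot follow the sine
flex_train, flex_test = errors(12)     # high variance: enough capacity to fit noise

# The flexible model drives training error far below the rigid model's,
# but its test error stays above its own training error (overfitting).
```

The exact numbers depend on the random seed, but the ordering (flexible model: low train error, worse test error) is the point of the trade-off.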

 

 

Together, bias and variance serve as a good measure for indicating a model's learning state.

 

 

Q) Debugging a learning algorithm 

Suppose you have implemented regularized linear regression to predict housing prices. However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?

 

A1) Evaluate Bias and Variance
High Bias (Underfitting): This is typically indicated if both training error and cross-validation error are high. The model is too simple and fails to capture the underlying trend of the data.
High Variance (Overfitting): This is indicated if the training error is low, but the cross-validation error is high. The model is too complex, capturing noise instead of the underlying pattern.
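The rule of thumb above can be sketched as a small diagnostic helper (the function name, thresholds, and example numbers are all hypothetical):

```python
def diagnose(train_error, cv_error, target_error):
    """Apply the bias/variance rule of thumb.

    target_error is the error level considered acceptable for the task
    (e.g. an estimated baseline or human-level performance).
    """
    if train_error > target_error:
        # Both errors high: the model cannot even fit the training set.
        return "high bias (underfitting)"
    if cv_error > train_error and cv_error > target_error:
        # Low training error but poor generalization.
        return "high variance (overfitting)"
    return "acceptable fit"

# Example readings (hypothetical numbers):
print(diagnose(train_error=0.30, cv_error=0.32, target_error=0.10))
print(diagnose(train_error=0.02, cv_error=0.25, target_error=0.10))
```

Note that real diagnostics usually look at learning curves (error vs. training set size), not a single pair of numbers; this only encodes the qualitative rule.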

A2) Strategies Based on Diagnosis

For High Bias
Add more features or create complex features: This can help the model capture more complex patterns in the data, thus reducing bias.
Add polynomial features: Including interaction terms (e.g., x1x2) or polynomial features (e.g., x1^2, x2^2) can make the model more flexible and better at capturing complex patterns.
Decrease λ: Lowering the regularization strength can allow the learning algorithm to fit the data more closely, thus reducing bias but possibly increasing variance.
For High Variance
1) Get more training examples: More data can help the model generalize better, reducing overfitting.

2) Try smaller sets of features: Reducing the number of features can help by simplifying the model, thus reducing the chance it will capture noise in the training data.
3) Increase λ (regularization strength): Increasing the regularization parameter can penalize large weights in the model, effectively simplifying the model and reducing variance.
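A minimal sketch of the λ effect, assuming ridge regression with its closed-form solution w = (XᵀX + λI)⁻¹Xᵀy on synthetic data: larger λ penalizes large weights, shrinking them toward zero.

```python
import numpy as np

# Synthetic regression problem (hypothetical data, seed fixed).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))
true_w = rng.normal(size=10)
y = X @ true_w + rng.normal(0, 0.5, size=50)

def ridge_weights(lam):
    # Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Weight norm for increasing regularization strengths.
norms = [np.linalg.norm(ridge_weights(lam)) for lam in (0.0, 1.0, 10.0, 100.0)]

# As lam grows, the weight norm shrinks: the model becomes simpler
# (lower variance), at the cost of more bias.
```

This is the mechanism behind strategy 3): a larger λ trades variance for bias, and λ = 0 recovers ordinary least squares.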

A3) Iterative Testing
After implementing a change, re-evaluate the model's performance using a new validation set or through cross-validation. Continue adjusting features, the regularization parameter, and other aspects of the model based on these diagnostic tests until you achieve a satisfactory level of error.

A4) Cross-Validation
Utilize cross-validation to assess how the adjustments are impacting model performance on unseen data. This will guide you in understanding whether adjustments are effectively addressing bias or variance problems.
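A minimal k-fold cross-validation sketch (numpy only, hypothetical sine data), comparing the averaged validation error of a rigid and a moderately flexible model:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 60)

def cv_mse(degree, k=5):
    # Shuffle indices and split them into k roughly equal folds.
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        val = folds[i]                                    # held-out fold
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)   # fit on the rest
        errors.append(np.mean((np.polyval(coeffs, x[val]) - y[val]) ** 2))
    return float(np.mean(errors))

rigid = cv_mse(1)     # a line underfits the sine pattern
flexible = cv_mse(3)  # a cubic can follow the curve

# The cross-validated error exposes the rigid model's bias on data
# it never trained on, guiding the choice between the two.
```

The same loop can be reused after each adjustment (features added, λ changed) to check whether the bias or variance problem is actually improving.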

 

References:

https://gaussian37.github.io/machine-learning-concept-bias_and_variance/

 


https://www.rnfc.org/courses/isl/Lesson%202/Summary/

 


https://wikidocs.net/214038

 

