AI/Machine Learning
Comparative Analysis of Classification Models: Logistic Regression, Naive Bayes, LDA, and QDA
9taetae9
2024. 4. 3. 13:10
Criteria | Logistic Regression | Naive Bayes Classifier | Linear Discriminant Analysis (LDA) | Quadratic Discriminant Analysis (QDA) |
Model Type | Parametric | Parametric | Parametric | Parametric |
Assumption about Data Distribution | None on the feature distribution; assumes the log-odds are a linear function of the features | Assumes features are conditionally independent given the class, with a chosen distribution per feature and class (e.g., Gaussian, multinomial) | Assumes Gaussian classes that share one covariance matrix | Assumes Gaussian classes, each with its own covariance matrix |
Decision Boundary | Linear | Linear or non-linear, depending on the per-feature distribution chosen | Linear | Quadratic (Non-linear) |
Computational Complexity | Moderate | Low | Low to Moderate | Moderate |
Robustness to Outliers | Moderate | Low | Low | Low |
Scalability | Good | Very Good | Moderate | Limited |
Use Case | Binary classification, where interpretability is important | Baseline for text classification, when features are independent | When the Gaussian assumption holds, and a linear classifier is preferred | When the Gaussian assumption holds but classes have different covariances, allowing for non-linear decision boundaries |
Considers Covariance | No | No | Yes | Yes |
Assumes Same Variance (Covariance) Across Classes | Not Applicable | No; feature independence only makes each class covariance diagonal, and variances are typically estimated separately per class | Yes | No |
Limitations | Assumes a linear relationship between the features and the log-odds, which may not fit all datasets | The feature independence assumption is simplistic and often unrealistic for complex datasets | Sensitive to outliers because it relies on estimated means and covariance; performs poorly when the Gaussian or equal-covariance assumption is violated | Computationally intensive with high-dimensional data; sensitive to outliers; requires enough data per class to estimate each covariance matrix accurately |
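
The linear and quadratic boundaries in the LDA and QDA columns follow from the standard Gaussian discriminant scores. As a sketch, using notation that does not appear in the table and is assumed here (class mean \mu_k, class covariance \Sigma_k, shared covariance \Sigma, class prior \pi_k), a point x is assigned to the class with the largest score:

```latex
\delta_k^{\mathrm{QDA}}(x) = -\tfrac{1}{2}\log\lvert\Sigma_k\rvert
  - \tfrac{1}{2}(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k) + \log\pi_k

\delta_k^{\mathrm{LDA}}(x) = x^{\top}\Sigma^{-1}\mu_k
  - \tfrac{1}{2}\mu_k^{\top}\Sigma^{-1}\mu_k + \log\pi_k
```

When every class shares \Sigma, the term x^{\top}\Sigma^{-1}x and the log-determinant are the same for all classes and cancel when scores are compared, so the LDA score is linear in x. With class-specific \Sigma_k these terms do not cancel, which is why the QDA boundary is quadratic.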
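
One way to make the Scalability and Limitations rows concrete is to count the parameters each model has to estimate. The sketch below is illustrative only: the function names `covariance_params` and `parameter_counts` are hypothetical, Naive Bayes is assumed to be the Gaussian variant, logistic regression is assumed to be multinomial with one reference class, and class priors are left out of the counts.

```python
def covariance_params(p: int) -> int:
    """Free parameters in one symmetric p x p covariance matrix."""
    return p * (p + 1) // 2


def parameter_counts(p: int, k: int) -> dict[str, int]:
    """Approximate free-parameter counts for p features and k classes.

    Class priors are omitted so the comparison focuses on weights,
    means, variances, and covariances.
    """
    return {
        # One weight vector plus intercept per non-reference class.
        "logistic_regression": (k - 1) * (p + 1),
        # Per class: p means and p variances (diagonal covariance).
        "gaussian_naive_bayes": k * 2 * p,
        # Per class: p means; one covariance matrix shared by all classes.
        "lda": k * p + covariance_params(p),
        # Per class: p means and a full covariance matrix.
        "qda": k * p + k * covariance_params(p),
    }


# Compare how the counts grow with the number of features for 3 classes.
for p in (2, 10, 100):
    print(p, parameter_counts(p, k=3))
```

Under these assumptions, 100 features and 3 classes give 15,450 parameters for QDA versus 5,350 for LDA and 600 for Gaussian Naive Bayes, which is consistent with the table's note that QDA needs enough data per class to estimate each covariance matrix.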