[ML_8] Ensemble: Voting, Bagging, Boosting(+)

2025. 8. 22. 13:45 | python/ML

Ensemble Method Definition

An ensemble method is a machine learning approach that combines multiple individual models (often weak learners) to produce a stronger, more accurate predictive model.
The key idea is that aggregating the predictions of several models can reduce variance or bias and improve generalization compared to any single model.

 

Representative Types of Ensemble Methods

  1. Voting
    • Concept: Voting combines predictions from multiple models of different types (e.g., logistic regression, decision tree, k-nearest neighbors) that are all trained on the same dataset.
    • How it works:
      • Hard Voting: Each model votes for a class label, and the class with the majority votes is selected.
      • Soft Voting: Each model outputs class probabilities, and the probabilities are averaged to make the final decision.
    • Key Point: Voting is a model-agnostic ensemble technique since it can combine any type of algorithm.
#%% 
#### Voting classifier 
from sklearn.ensemble import VotingClassifier 
from sklearn.linear_model import LogisticRegression 
from sklearn.neighbors import KNeighborsClassifier 
from sklearn.model_selection import train_test_split 
from sklearn.metrics import accuracy_score 
from sklearn.datasets import load_breast_cancer 
import pandas as pd 

bc = load_breast_cancer() 
data_df = pd.DataFrame(bc.data, columns=bc.feature_names)  # only for inspecting the features

#%% 
# Two base models of different types, trained on the same data
lr_ml = LogisticRegression(solver='liblinear') 
kc_ml = KNeighborsClassifier(n_neighbors=8) 

# Soft voting: average the class probabilities of both models
vo_clf = VotingClassifier(estimators=[('LR', lr_ml), ('KNN', kc_ml)], voting='soft') 

X_train, X_test, y_train, y_test = train_test_split(bc.data, bc.target, test_size=0.2, random_state=156) 

vo_clf.fit(X_train, y_train) 
y_pred = vo_clf.predict(X_test) 
acc_score = accuracy_score(y_test, y_pred) 
print(f"voting acc_score : {acc_score}") 

# Compare the ensemble against each base model on its own
classifiers = [lr_ml, kc_ml] 

for classifier in classifiers: 
    classifier.fit(X_train, y_train) 
    y_pred = classifier.predict(X_test) 
    class_name = classifier.__class__.__name__ 
    print(f"class_name : {class_name}, acc_score : {accuracy_score(y_test, y_pred)}")

  2. Bagging (Bootstrap Aggregating)
    • Concept: Bagging uses the same algorithm (e.g., decision trees) but trains each model on a different bootstrapped sample of the data.
    • Bootstrapping: Sampling with replacement, which means some data points may appear multiple times in a sample while others may not appear at all.
    • How it works:
      • Each model (e.g., tree) is trained independently on its own random sample.
      • The final prediction is made by averaging (for regression) or voting (for classification) the outputs of all models.
    • Example: Random Forest is a well-known bagging-based ensemble of decision trees (see the sketch below).
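As a concrete illustration, here is a minimal sketch (not from the original post) that applies bagging to the same breast-cancer split as above. The estimator keyword assumes scikit-learn >= 1.2 (earlier versions call it base_estimator), and n_estimators=100 is an arbitrary illustrative choice.

#%% 
#### Bagging classifier (illustrative sketch) 
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier 
from sklearn.tree import DecisionTreeClassifier 
from sklearn.model_selection import train_test_split 
from sklearn.metrics import accuracy_score 
from sklearn.datasets import load_breast_cancer 

bc = load_breast_cancer() 
X_train, X_test, y_train, y_test = train_test_split(bc.data, bc.target, test_size=0.2, random_state=156) 

# 100 decision trees, each trained on its own bootstrap sample (sampling with replacement)
bag_ml = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=100, random_state=156) 
bag_ml.fit(X_train, y_train) 
print(f"bagging acc_score : {accuracy_score(y_test, bag_ml.predict(X_test))}") 

# Random Forest = bagged trees plus a random subset of features at each split
rf_ml = RandomForestClassifier(n_estimators=100, random_state=156) 
rf_ml.fit(X_train, y_train) 
print(f"random forest acc_score : {accuracy_score(y_test, rf_ml.predict(X_test))}")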

  3. Boosting
    • Concept: Boosting trains models sequentially, where each new model focuses on correcting the errors of the previous models.
    • How it works:
      • The first model is trained on the dataset.
      • Each subsequent model is trained with higher weights on the data points that previous models misclassified or predicted with high error.
      • Final predictions are made by combining all models with weighted voting or averaging.
    • Key Intuition: Boosting “boosts” weak learners by focusing more on difficult cases.
    • Examples: AdaBoost, Gradient Boosting, XGBoost, LightGBM (see the sketch below).
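The sketch below (again illustrative, not code from the post) fits two boosting ensembles from scikit-learn on the same breast-cancer split; the hyperparameter values are arbitrary assumptions.

#%% 
#### Boosting classifiers (illustrative sketch) 
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier 
from sklearn.model_selection import train_test_split 
from sklearn.metrics import accuracy_score 
from sklearn.datasets import load_breast_cancer 

bc = load_breast_cancer() 
X_train, X_test, y_train, y_test = train_test_split(bc.data, bc.target, test_size=0.2, random_state=156) 

# AdaBoost: each round up-weights the samples the previous weak learners got wrong
ada_ml = AdaBoostClassifier(n_estimators=100, random_state=156) 
ada_ml.fit(X_train, y_train) 
print(f"adaboost acc_score : {accuracy_score(y_test, ada_ml.predict(X_test))}") 

# Gradient Boosting: each new tree is fit to the remaining errors (the loss gradient) of the ensemble
gb_ml = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=156) 
gb_ml.fit(X_train, y_train) 
print(f"gradient boosting acc_score : {accuracy_score(y_test, gb_ml.predict(X_test))}")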

Ref :
- Ensemble Learning — Voting and Bagging with Python: https://medium.com/@chyun55555/ensemble-learning-voting-and-bagging-with-python-40de683b8ff0
- Comparing Model Ensembling: Bagging, Boosting, and Stacking (NBD Lite #7): https://www.nb-data.com/p/comparing-model-ensembling-bagging
