[ML_2] Let's convert a pd.DataFrame into train and test datasets.
2025. 7. 10. 06:49ㆍpython/ML
What if our dataset is a pd.DataFrame? In that case, we separate the feature columns from the label column and pass both to train_test_split, as in the following code.
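from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import pandas as pd
# Load the iris dataset (this call was missing, so iris_data was undefined)
iris_data = load_iris()
# Build a DataFrame from the feature array and append the label as the last column
iris_df = pd.DataFrame(iris_data.data, columns=iris_data.feature_names)
iris_df['target'] = iris_data.target
# Every column except the last is a feature; the last column is the label
iris_df_data = iris_df.iloc[:, :-1]
iris_df_label = iris_df.iloc[:, -1]
# Hold out 20% of the rows as the test set
X_train, X_test, y_train, y_test = train_test_split(iris_df_data, iris_df_label, test_size=0.2, random_state=121)
iris_model = DecisionTreeClassifier()
iris_model.fit(X_train, y_train)
y_pred = iris_model.predict(X_test)
# Print the accuracy so it is also visible when run as a script
print(accuracy_score(y_test, y_pred))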
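First, we import the necessary modules and create a DataFrame from the iris feature array, then add the target column to it. Because target is appended last, we can separate features and labels by position with the iloc method: iloc[:, :-1] selects all rows and every column except the last (the features), and iloc[:, -1] selects all rows of the last column only (the labels). Finally, we split the data into training and test sets with train_test_split (test_size = 0.2 holds out 20% of the rows for testing), fit the model on the training set, make predictions on the test set, and evaluate them with accuracy_score.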
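The positional split only works because target happens to be the last column. As a minimal sketch, the following compares it with a name-based split using drop(columns='target') and checks the sizes of the resulting sets. The variable names here (features_by_position, features_by_name, and so on) are made up for illustration and are not part of the original code.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Rebuild the same DataFrame as in the post: 4 feature columns plus 'target'
iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data, columns=iris_data.feature_names)
iris_df['target'] = iris_data.target

# Positional split: relies on 'target' being the last column
features_by_position = iris_df.iloc[:, :-1]
labels_by_position = iris_df.iloc[:, -1]

# Name-based split (illustrative alternative): independent of column order
features_by_name = iris_df.drop(columns='target')
labels_by_name = iris_df['target']

# Both approaches should select exactly the same data here
same_features = features_by_position.equals(features_by_name)
same_labels = labels_by_position.equals(labels_by_name)
print(same_features, same_labels)

# Same split settings as in the post: 20% of the rows go to the test set
X_train, X_test, y_train, y_test = train_test_split(
    features_by_name, labels_by_name, test_size=0.2, random_state=121
)
print(X_train.shape, X_test.shape)
```

If the DataFrame might later gain extra columns after target, the name-based version is probably the safer choice, since iloc[:, -1] would then silently pick a different column.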