Bagging regressor (the idea behind RF, random forests)
""" 一个原始数据的bagging回归 编辑代码思想的步骤: 1. 根据要实现的需求,导入数据处理和功能调用的包/模块 2. 创建数据 3. 创建变量n_tree:集成回归器棵数 4. 创建存储回归器的存储器 5. 循环1-n_tree的训练和预测: 训练 01:训练循环体中选用抽取方式并调用 训练 02:将x,y从数据表格中取出 训练 03:实例化回归器 训练 04:训练 训练 05:每循环一次回归器存储到存储器 预测 01:重新创建X,Y变量取出数据 预测 02:初始化回归器计算的总值total 预测 03:预测循环体中存储器每一次的predict() 预测 04:total/n_tree 求平均 预测 05:预测y 6. 打分 """
# Ensemble learning comes in 3 flavors: bagging (RF, random forest), boosting (AdaBoost and GBDT), and stacking
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score

df_r = pd.DataFrame([[1, 10.56], [2, 27], [3, 39.1], [4, 40.4], [5, 58],
                     [6, 60.5], [7, 79], [8, 87], [9, 90], [10, 95]],
                    columns=['X', 'Y'])
print(df_r)

n_T = 10      # number of trees in the ensemble
Models = []   # container for the fitted regressors
for i in range(n_T):
    # bootstrap: draw len(df_r) rows with replacement
    df2 = df_r.sample(frac=1.0, replace=True)
    X = df2.iloc[:, :-1]   # fit on the bootstrap sample df2, not on df_r
    Y = df2.iloc[:, -1]
    model = DecisionTreeRegressor(max_depth=1)
    model.fit(X, Y)
    Models.append(model)

# prediction: average the trees' outputs on the original data
x = df_r.iloc[:, :-1]
y = df_r.iloc[:, -1]
total = np.zeros(df_r.shape[0])
for t in range(n_T):
    total += Models[t].predict(x)
y_hat = total / n_T
print('y_hat:', y_hat)

# score
print("R:", r2_score(y, y_hat))

# for comparison: a single regressor fit once
model02 = DecisionTreeRegressor(max_depth=1)
model02.fit(x, y)
y_hat02 = model02.predict(x)
print('#' * 100)
print("y_hat02:", y_hat02)
print("One R:", r2_score(y, y_hat02))
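scikit-learn packages this same loop as `BaggingRegressor`; a minimal sketch on the same data, passing the base tree positionally since the keyword was renamed from `base_estimator` to `estimator` in scikit-learn 1.2:

```python
import pandas as pd
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score

df_r = pd.DataFrame([[1, 10.56], [2, 27], [3, 39.1], [4, 40.4], [5, 58],
                     [6, 60.5], [7, 79], [8, 87], [9, 90], [10, 95]],
                    columns=['X', 'Y'])
X = df_r.iloc[:, :-1]
y = df_r.iloc[:, -1]

# 10 depth-1 trees, each fit on its own bootstrap sample
bag = BaggingRegressor(DecisionTreeRegressor(max_depth=1),
                       n_estimators=10, random_state=0)
bag.fit(X, y)
print("R:", r2_score(y, bag.predict(X)))
```

Setting `random_state` pins the bootstrap draws, so the score is reproducible between runs, unlike the hand-rolled loop above.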
X Y
0 1 10.56
1 2 27.00
2 3 39.10
3 4 40.40
4 5 58.00
5 6 60.50
6 7 79.00
7 8 87.00
8 9 90.00
9 10 95.00
y_hat: [29.265 29.265 29.265 29.265 78.25 78.25 78.25 78.25 78.25 78.25 ]
R: 0.7622123252516444
####################################################################################################
y_hat02: [29.265 29.265 29.265 29.265 78.25 78.25 78.25 78.25 78.25 78.25 ]
One R: 0.7622123252516444

Note: the bagged result matches the single tree exactly in this run because the trees were fit on the full df_r rather than on the bootstrap sample df2. Once each tree is fit on its own bootstrap draw, y_hat and R change from run to run and generally differ from the single tree's.
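The score printed above is the coefficient of determination, R² = 1 − SS_res/SS_tot. A quick hand check of `r2_score` on tiny illustrative numbers:

```python
from sklearn.metrics import r2_score

y_true = [1.0, 2.0, 3.0]  # mean = 2.0, SS_tot = (1-2)^2 + (2-2)^2 + (3-2)^2 = 2
y_pred = [1.0, 2.0, 4.0]  # SS_res = 0 + 0 + (4-3)^2 = 1
print(r2_score(y_true, y_pred))  # 1 - 1/2 = 0.5
```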