
Gradient boosting trees: implementation, feature importance, partial dependence, and notes on the related packages (compiled)

Popularity: 98   Published: 2024-01-12 21:50:01.0

Principles:

https://www.cnblogs.com/pinard/p/6140514.html

https://zhuanlan.zhihu.com/p/108641227

 

Examples:

https://zhuanlan.zhihu.com/p/40356430

https://www.pythonf.cn/read/5079

 

Randomly splitting the data into training and test samples, with the parameters explained:

https://www.cnblogs.com/pinard/p/6143927.html

https://www.cnblogs.com/Yanjy-OnlyOne/p/11288098.html
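A minimal sketch of a random train/test split with scikit-learn's train_test_split. The synthetic data from make_regression, the 80/20 ratio, and random_state=10 are assumptions chosen for illustration; the names X_pr_train and y_pr_train simply mirror the variables used in the tuning code in this post.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic regression data (an assumption; substitute your own X and y)
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=10)

# test_size: fraction of the samples held out for testing
# random_state: fixes the shuffle so the split is reproducible
X_pr_train, X_pr_test, y_pr_train, y_pr_test = train_test_split(
    X, y, test_size=0.2, random_state=10
)

print(X_pr_train.shape, X_pr_test.shape)
```

Leaving random_state unset gives a different split on every run, which makes tuning results harder to compare.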

 

Hyperparameter tuning

Grid search with GridSearchCV: https://zhuanlan.zhihu.com/p/37310443

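# Hyperparameter tuning with an exhaustive grid search
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Candidate values to search over (3 * 5 * 3 = 45 combinations)
cv_params = {
    'learning_rate': [0.1, 0.05, 0.01],
    'max_depth': [1, 3, 5, 7, 10],
    'n_estimators': [100, 200, 300],
}
# Parameters held fixed for every candidate
ind_params = {'random_state': 10}

optimized_GBM = GridSearchCV(
    GradientBoostingRegressor(**ind_params),
    cv_params,
    scoring='neg_mean_squared_error',
    cv=5,
    n_jobs=-1,
    verbose=10,
)
optimized_GBM.fit(X_pr_train, y_pr_train)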
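Once a grid search has been fitted, the winning settings can be read from best_params_, best_score_, and best_estimator_. The sketch below is illustrative only: the synthetic data, the deliberately tiny grid, and the names small_grid, search, and best_model are assumptions made so that it runs quickly on its own.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data (an assumption; substitute your own training set)
X, y = make_regression(n_samples=120, n_features=4, noise=0.1, random_state=10)

# Deliberately tiny grid (2 * 2 = 4 candidates) so the sketch runs quickly
small_grid = {'learning_rate': [0.1, 0.05], 'max_depth': [1, 3]}
search = GridSearchCV(
    GradientBoostingRegressor(n_estimators=50, random_state=10),
    small_grid,
    scoring='neg_mean_squared_error',
    cv=3,
)
search.fit(X, y)

print("best parameters:", search.best_params_)
# The score is a negated MSE, so flip the sign to report the error itself
print("best CV MSE: %.3f" % -search.best_score_)

# With the default refit=True, this model was retrained on all of X, y
best_model = search.best_estimator_
```

Because scikit-learn always maximizes the score, error metrics are exposed in negated form; that is why the scoring string is 'neg_mean_squared_error'.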

 

Random search with RandomizedSearchCV: https://blog.csdn.net/juezhanangle/article/details/80051256

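# Hyperparameter tuning with a randomized search
from time import time

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

param_dist = {
    'learning_rate': [0.1, 0.05, 0.01],
    'max_depth': [10, 50, 100],
    'n_estimators': [100, 200, 300],
}
ind_params = {'random_state': 10}
n_iter_search = 20  # number of parameter settings sampled from param_dist

# Use a regression metric here: 'roc_auc' is a classification metric and
# does not suit GradientBoostingRegressor
random_search = RandomizedSearchCV(
    GradientBoostingRegressor(**ind_params),
    param_distributions=param_dist,
    n_iter=n_iter_search,
    cv=3,
    scoring='neg_mean_squared_error',
    n_jobs=-1,
)

start = time()
random_search.fit(X_nopr_train, y_nopr_train)
print("RandomizedSearchCV took %.2f seconds for %d candidates"
      " parameter settings." % ((time() - start), n_iter_search))
# report() is a user-defined helper that prints the top-ranked results;
# it must be defined before this call
report(random_search.cv_results_)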
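The random-search code calls report() without defining it. The sketch below is one possible definition, modeled on the helper of the same name in scikit-learn's randomized-search example; the n_top parameter and the output format are assumptions. It has to be defined before the call to report(random_search.cv_results_).

```python
import numpy as np


def report(results, n_top=3):
    """Print the n_top best-ranked parameter settings from cv_results_."""
    for rank in range(1, n_top + 1):
        # rank_test_score is 1 for the best candidate; ties share a rank
        for idx in np.flatnonzero(results["rank_test_score"] == rank):
            print("Model with rank: %d" % rank)
            print("Mean validation score: %.3f (std: %.3f)" % (
                results["mean_test_score"][idx],
                results["std_test_score"][idx],
            ))
            print("Parameters: %s" % results["params"][idx])
            print("")
```

The helper only reads the rank_test_score, mean_test_score, std_test_score, and params entries, so it works on the cv_results_ of either GridSearchCV or RandomizedSearchCV.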

 

Bayesian optimization: a better way to tune hyperparameters

https://zhuanlan.zhihu.com/p/29779000

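# Bayesian optimization
from bayes_opt import BayesianOptimization
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score


def rf_cv(n_estimators, min_samples_split, learning_rate, max_depth):
    # The optimizer proposes floats, so integer hyperparameters are cast
    val = cross_val_score(
        GradientBoostingRegressor(
            n_estimators=int(n_estimators),
            min_samples_split=int(min_samples_split),
            learning_rate=min(learning_rate, 0.999),  # float
            max_depth=int(max_depth),
            random_state=2,
        ),
        X_nopr_train, y_nopr_train,
        # Regression metric (higher is better); 'roc_auc' is for classifiers
        scoring='neg_mean_squared_error',
        cv=5,
        n_jobs=-1,  # parallelism belongs here; BayesianOptimization has no n_jobs
    ).mean()
    return val


# Build the Bayesian optimization object
rf_bo = BayesianOptimization(
    rf_cv,
    {
        'n_estimators': (100, 300),
        'min_samples_split': (2, 25),
        'learning_rate': (0.01, 0.999),
        'max_depth': (10, 150),
    },
)
rf_bo.maximize()

# Best score found and the parameters that produced it
rf_bo.max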

 

 

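Official documentation for plot_partial_dependence() (removed in scikit-learn 1.2; PartialDependenceDisplay.from_estimator is its replacement):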

https://scikit-learn.org/stable/modules/generated/sklearn.inspection.plot_partial_dependence.html
