Introduction to Boosted Trees  
Tianqi Chen 
Oct. 22 2014 
Outline 
• Review of key concepts of supervised learning 
 
• Regression Tree and Ensemble (What are we Learning) 
 
• Gradient Boosting (How do we Learn) 
 
• Summary  
Elements in Supervised Learning  
• Notations: x_i ∈ R^d denotes the i-th training example 
• Model: how to make the prediction ŷ_i given x_i 
 Linear model: ŷ_i = Σ_j w_j x_ij (includes linear/logistic regression) 
 The prediction score ŷ_i can have different interpretations 
depending on the task 
 Linear regression: ŷ_i is the predicted score 
 Logistic regression: 1/(1 + exp(−ŷ_i)) is the predicted probability 
of the instance being positive 
 Others… for example, in ranking ŷ_i can be the rank score 
• Parameters: the things we need to learn from data 
 Linear model: Θ = {w_j | j = 1, …, d} 
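As a quick sketch of the model element, the linear score and the logistic link can be written in a few lines of Python (the weights and feature values below are illustrative, not learned from data):

```python
import numpy as np

def linear_score(w, x):
    """Linear model: y_hat = sum_j w_j * x_j."""
    return float(np.dot(w, x))

def sigmoid(score):
    """Logistic link: maps a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-score))

w = np.array([0.5, -0.25])   # parameters Theta (illustrative values)
x = np.array([2.0, 4.0])     # one example x_i

score = linear_score(w, x)   # linear regression: score is the prediction
prob = sigmoid(score)        # logistic regression: P(instance is positive)
print(score)  # 0.0
print(prob)   # 0.5
```

The same raw score feeds both tasks; only the interpretation (and, later, the loss) changes.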
Elements continued: Objective Function 
• An objective function that appears everywhere: 
 Obj(Θ) = L(Θ) + Ω(Θ) 
 Training loss L(Θ) measures how well the model fits the 
training data 
 Regularization Ω(Θ) measures the complexity of the model 
• Loss on training data: L(Θ) = Σ_i l(y_i, ŷ_i) 
 Square loss: l(y_i, ŷ_i) = (y_i − ŷ_i)² 
 Logistic loss: l(y_i, ŷ_i) = y_i ln(1 + e^(−ŷ_i)) + (1 − y_i) ln(1 + e^(ŷ_i)) 
• Regularization: how complicated is the model? 
 L2 norm: Ω(w) = λ‖w‖² 
 L1 norm (lasso): Ω(w) = λ‖w‖₁ 
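The loss and regularization terms above translate directly into code; a minimal sketch (the values of w, y, ŷ, and λ are hypothetical):

```python
import numpy as np

def square_loss(y, y_hat):
    """Square loss: (y - y_hat)^2."""
    return (y - y_hat) ** 2

def logistic_loss(y, y_hat):
    """Logistic loss; y in {0, 1}, y_hat is the raw score (not a probability)."""
    return y * np.log1p(np.exp(-y_hat)) + (1 - y) * np.log1p(np.exp(y_hat))

def l2_penalty(w, lam):
    """L2 regularization: lam * ||w||^2."""
    return lam * np.sum(w ** 2)

def l1_penalty(w, lam):
    """L1 (lasso) regularization: lam * ||w||_1."""
    return lam * np.sum(np.abs(w))

w = np.array([1.0, -2.0])
obj = square_loss(3.0, 2.5) + l2_penalty(w, lam=0.1)  # Obj = L + Omega
print(obj)  # 0.25 + 0.5 = 0.75
```

Note `np.log1p` keeps the logistic loss numerically stable when the score is near zero.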
 
 
Putting known knowledge into context 
• Ridge regression: 
 Linear model, square loss, L2 regularization 
• Lasso: 
 Linear model, square loss, L1 regularization 
• Logistic regression:  
 Linear model, logistic loss, L2 regularization 
• The conceptual separation between model, parameters, and 
objective also gives you engineering benefits. 
 Think of how you can implement SGD once and use it for both 
ridge regression and logistic regression 
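A minimal illustration of that engineering benefit: one generic SGD loop that takes the per-example loss gradient as a plug-in, so ridge regression and logistic regression share the same training code (the toy data and hyperparameters below are illustrative, not from the slides):

```python
import numpy as np

def sgd(grad_fn, X, y, lam, lr=0.1, epochs=100, seed=0):
    """Generic SGD: only the per-example loss gradient differs by model;
    the L2 regularization gradient 2*lam*w is shared."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w -= lr * (grad_fn(w, X[i], y[i]) + 2 * lam * w)
    return w

def ridge_grad(w, x, y):
    """Gradient of square loss: d/dw (y - w.x)^2 = -2 (y - w.x) x."""
    return -2 * (y - w @ x) * x

def logistic_grad(w, x, y):
    """Gradient of logistic loss, y in {0, 1}: (sigmoid(w.x) - y) x."""
    return (1.0 / (1.0 + np.exp(-(w @ x))) - y) * x

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w_ridge = sgd(ridge_grad, X, np.array([1.0, 2.0, 3.0]), lam=0.01)
w_logit = sgd(logistic_grad, X, np.array([0.0, 1.0, 1.0]), lam=0.01)
```

Swapping the gradient function is the only change between the two models; the loop, regularization, and schedule are reused as-is.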
 
 
Objective and Bias Variance Trade-off 
 Obj(Θ) = L(Θ) + Ω(Θ): the training loss measures how well the 
model fits the training data; the regularization term measures 
the complexity of the model 
• Why do we want two components in the objective? 
• Optimizing the training loss encourages predictive models 
 Fitting the training data well at least gets you close to the 
training distribution, which is hopefully close to the 
underlying distribution 
• Optimizing the regularization term encourages simple models 
 Simpler models tend to have smaller variance in future 
predictions, making predictions stable 
Outline 
 
• Review of key concepts of supervised learning 
 
• Regression Tree and Ensemble (What are we Learning) 
 
• Gradient Boosting (How do we Learn) 
 
• Summary  
 
Regression Tree (CART) 
• Regression tree (also known as classification and regression 
tree, CART): 
 Decision rules are the same as in a decision tree 
 Contains one score in each leaf 
 Example: does the person like computer games? 
(inputs: age, gender, occupation, …) 
 The tree first splits on age < 15, then on "is male?"; each 
leaf holds a prediction score: +2 (age < 15 and male), 
+0.1 (age < 15 and female), −1 (age ≥ 15) 
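The example tree can be written directly as code, with one score per leaf rather than a class label (the function name and argument types are illustrative):

```python
def predict_likes_games(age, is_male):
    """The example regression tree: split on age < 15, then on gender.
    Each leaf holds a real-valued prediction score, not a class label."""
    if age < 15:
        return 2.0 if is_male else 0.1
    return -1.0

print(predict_likes_games(age=10, is_male=True))   # 2.0
print(predict_likes_games(age=10, is_male=False))  # 0.1
print(predict_likes_games(age=40, is_male=True))   # -1.0
```

Because leaves carry scores instead of labels, the outputs of several such trees can later be summed, which is what makes tree ensembles and gradient boosting possible.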