# Coding Guide

This document contains notes on the code structure of xgboost.

## Project Logical Layout

* Dependency order: io -> learner -> gbm -> tree
  - All modules depend on data.h.
* tree contains implementations of the tree construction algorithms.
* gbm is the gradient boosting interface, which takes trees and other base learners to do boosting.
  - gbm only takes gradients as sufficient statistics; it does not compute the gradient itself.
* learner is the learning module that computes the gradient for a specific objective and passes it to gbm.
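The layering above can be sketched in a few lines. This is an illustrative analogue, not xgboost's actual classes: the "learner" side turns predictions and labels into per-instance gradient statistics, and the "gbm" side consumes only those statistics (a real gbm would fit a tree to them; here a trivial base learner stands in).

```python
def squared_error_gradient(preds, labels):
    """Learner side: grad/hess of 0.5 * (p - y)^2 per instance."""
    grads = [p - y for p, y in zip(preds, labels)]
    hess = [1.0 for _ in preds]
    return grads, hess

def boost_one_round(preds, grads, hess, eta=0.1):
    """gbm side: only sees the sufficient statistics, never the labels
    or the objective. A real gbm would fit a tree to (grads, hess)."""
    return [p - eta * g / h for p, g, h in zip(preds, grads, hess)]

preds, labels = [0.0, 0.0], [1.0, -1.0]
for _ in range(5):
    g, h = squared_error_gradient(preds, labels)
    preds = boost_one_round(preds, g, h)
# preds move toward labels without boost_one_round ever seeing `labels`
```

This separation is why new objectives plug in at the learner layer without touching the boosting code.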

## File Naming Convention

* .h files contain data structures and interfaces, which are all you need in order to use the functions in that layer.
* -inl.hpp files are implementations of the interfaces, like .cpp files in most projects.
  - You only need to understand the interface files to understand the usage of a layer.
* In each folder, there can be a .cpp file that compiles the module of that layer.
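A rough Python analogue of this split (class and function names here are illustrative, not taken from the codebase): the interface and its factory play the role of the .h file, while the concrete implementation plays the role of the -inl.hpp file that callers never need to read.

```python
from abc import ABC, abstractmethod

# --- the ".h" part: interface plus factory, all a caller needs to see ---
class IObjFunction(ABC):
    @abstractmethod
    def get_gradient(self, preds, labels):
        """Return (grads, hess) for the given predictions and labels."""

def create_obj_function(name):
    """Factory: maps a registered name to an implementation."""
    if name == "reg:linear":
        return SquaredErrorObj()
    raise ValueError("unknown objective: " + name)

# --- the "-inl.hpp" part: an implementation behind the interface ---
class SquaredErrorObj(IObjFunction):
    def get_gradient(self, preds, labels):
        return ([p - y for p, y in zip(preds, labels)],
                [1.0] * len(preds))
```

Callers depend only on `IObjFunction` and the factory, mirroring how the interface file alone documents a layer's usage.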

## How to Hack the Code

* Add an objective function: add it to learner/objective-inl.hpp and register it in CreateObjFunction in learner/objective.h.
  - You can also define a custom objective directly in Python.
* Add a new evaluation metric: add it to learner/evaluation-inl.hpp and register it in CreateEvaluator in learner/evaluation.h.
* Add a wrapper for a new language: most likely you can take the functions in python/xgboost_wrapper.h, which are purely C based, and call these C functions to use xgboost.
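As a sketch of the "custom objective directly in Python" route: a custom objective is just a function returning (grads, hess), such as this logistic-loss example. The training call in the comment shows the general shape of the Python wrapper's custom-objective hook; treat it as illustrative, since in the wrapper the callback receives the DMatrix and reads labels via its label accessor rather than taking them as a plain list.

```python
import math

def logistic_obj(preds, labels):
    """grad/hess of logistic loss w.r.t. the raw margin score:
    grad = sigmoid(pred) - label, hess = sigmoid(pred) * (1 - sigmoid(pred))."""
    p = [1.0 / (1.0 + math.exp(-x)) for x in preds]
    grads = [pi - y for pi, y in zip(p, labels)]
    hess = [pi * (1.0 - pi) for pi in p]
    return grads, hess

# With the Python wrapper, a callback of this form (taking (preds, dtrain)
# and calling dtrain.get_label()) is passed to training, roughly:
#   bst = xgb.train(params, dtrain, num_round, obj=my_logistic_obj)
```

No C++ changes or recompilation are needed for this route, which makes it a quick way to prototype an objective before registering it in CreateObjFunction.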