Light Gradient Boosting Machine

LGBM is a fast, distributed, high-performance gradient boosting framework based on a popular machine learning algorithm, the decision tree. It can be used for classification, regression, and many other machine learning tasks. The algorithm grows trees leaf-wise, choosing the leaf with the maximum delta loss to grow. LightGBM uses histogram-based algorithms, whose advantages are as follows (a minimal training sketch follows this list):

  • Less memory usage
  • Reduced communication cost for parallel learning
  • Reduced cost of calculating the gain for each split in the decision tree
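As an illustration, here is a minimal sketch of training a LightGBM model with the lightgbm Python package; the synthetic dataset and parameter values are placeholders, not recommendations:

```python
# Minimal LightGBM training sketch (illustrative, untuned parameter values)
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic placeholder data
X, y = make_classification(n_samples=10_000, n_features=20, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=42)

train_set = lgb.Dataset(X_train, label=y_train)
valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)

params = {
    "objective": "binary",
    "boosting_type": "gbdt",  # gradient-boosted decision trees
    "num_leaves": 31,         # leaf-wise growth is controlled mainly by this
    "learning_rate": 0.05,
    "max_bin": 255,           # number of histogram bins per feature
}

model = lgb.train(params, train_set, num_boost_round=200, valid_sets=[valid_set])
preds = model.predict(X_valid)  # predicted probabilities for the positive class
```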

Because of this, LightGBM trains much faster, but it can also overfit in some cases. So, let us see which parameters can be tuned to get a better, more optimal model.

To get the best fit, the following parameters should be tuned:

  1. num_leaves: Since LightGBM grows leaf-wise, this value must be less than 2^(max_depth) to avoid overfitting.
  2. min_data_in_leaf: For large datasets, its value should be set in the hundreds to thousands.
  3. max_depth: A key parameter; constrain tree depth explicitly to avoid overfitting (a combined parameter sketch follows this list).
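Putting these three parameters together, a parameter dictionary that respects the num_leaves < 2^(max_depth) rule of thumb might look like the following sketch; the concrete numbers are only illustrative and depend on the dataset:

```python
# Illustrative settings to limit overfitting; the exact values depend on the dataset
max_depth = 7
params = {
    "objective": "binary",
    "max_depth": max_depth,
    "num_leaves": 2 ** max_depth - 1,  # kept below 2^(max_depth)
    "min_data_in_leaf": 500,           # hundreds to thousands for large datasets
    "learning_rate": 0.05,
}
```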

To achieve better accuracy, the following parameters can be tuned:

  1. More training data: adding more training data to the model (this can also be external, previously unseen data) can increase accuracy.
  2. num_leaves: Increasing its value increases accuracy, since splitting is done leaf-wise, but overfitting may also occur.
  3. max_bin: A higher value has a major impact on accuracy but will eventually lead to overfitting (an illustrative sketch follows this list).
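As a rough sketch, one way to trade accuracy against overfitting is to raise num_leaves and max_bin while watching a validation set; the values below are illustrative, and train_set and valid_set are assumed to be the lgb.Dataset objects from the earlier sketch:

```python
# Illustrative accuracy-oriented settings; a validation set guards against overfitting
import lightgbm as lgb

params = {
    "objective": "binary",
    "num_leaves": 63,    # more leaves -> finer leaf-wise splits, higher overfitting risk
    "max_bin": 511,      # more histogram bins per feature, slower but potentially more accurate
    "learning_rate": 0.03,
}

# Early stopping counteracts the overfitting these settings can introduce;
# train_set and valid_set are the lgb.Dataset objects from the earlier sketch.
model = lgb.train(
    params,
    train_set,
    num_boost_round=1000,
    valid_sets=[valid_set],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)
```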

LightGBM vs XGBoost – Which Algorithm is Better?

There are a lot of data enthusiasts taking part in online machine learning competitions. Everyone has their own independent approach for determining the best model and predicting accurate output for a given problem statement. In machine learning, feature engineering is an integral part of the process and also consumes most of the time. Modeling, on the other hand, becomes the important part when you cannot do much preprocessing or have certain constraints on the features. Different ensemble methods help in building strong, robust models that can give very accurate predictions. But what exactly is this buzzword 'Ensemble'? Let us understand what the term ensemble means in detail.

Ensemble: Before moving straight to a technical definition, let us take a simple real-life example. Assume that you want to buy an electronic device, a mobile phone. Your first approach would be to search the Internet for the latest smartphones and compare the prices offered by different companies, as well as the features of different models. After this first step, you would ask your friends for their opinions on the phones you have short-listed. In this fashion, you take opinions and suggestions from a couple of people, and once you get a few positive reviews about a particular phone, you go ahead and buy it. This is exactly the concept behind the term 'Ensemble'. Ensemble methods are machine learning techniques in which several base models (weak learners) are combined to produce one powerful model. Let us now go further into detail with the boosting method.
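To make the definition concrete, here is a small sketch that combines several base models into one ensemble by majority voting, using scikit-learn (which this article does not otherwise assume) and synthetic placeholder data:

```python
# Sketch: combining several base learners into one ensemble via majority voting
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1_000, random_state=0)  # placeholder data

ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=3)),
        ("logreg", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
    ],
    voting="hard",  # each model casts one vote, like the friends' opinions above
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
```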

Boosting: Boosting is a sequential ensemble technique, usually applied to data with high bias and low variance. Suppose we have a dataset D with n records. We take a random sample of 5 records from the dataset, where every record has an equal probability of being selected; say records 1, 3, 5, 7, and 25 are selected. We train a model (say, a decision tree) on this sample and then use it to classify all the records in D. Some records will be misclassified, since Model 1 is a weak learner. The misclassified records are given more weight for the next sampling step, so they have a higher probability of selection than the other records in the dataset. Say records 2, 8, 1, and 16 were misclassified, and record 5 also got selected by chance. We then train Model 2 on this new sample and again run the full dataset through it. This process is repeated for a chosen number of models, all of which are weak learners. In the end, these models are aggregated into a final model M*, which performs better because the misclassification error has been minimized.
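The reweight-and-retrain loop described above is what boosting libraries implement for you. As an illustrative sketch, scikit-learn's AdaBoostClassifier builds this kind of sequence of weak learners (depth-1 decision trees by default) and aggregates them into a final model; the numbers here are placeholders:

```python
# Sketch of sequential boosting: each new weak learner focuses on the records
# the previous ones misclassified, and the final model aggregates them all.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5_000, random_state=0)  # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

booster = AdaBoostClassifier(
    n_estimators=100,   # number of sequential weak models (depth-1 trees by default)
    learning_rate=0.5,
)
booster.fit(X_train, y_train)
print(booster.score(X_test, y_test))
```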

So, having understood what boosting is, let us discuss the competition between two popular boosting algorithms: Light Gradient Boosting Machine (LightGBM) and Extreme Gradient Boosting (XGBoost). These algorithms yield the best results in a lot of competitions and hackathons hosted on multiple platforms. Let us now understand the algorithms in depth and do a comparative study of them.



XGBOOST Algorithm:

A very popular and in-demand algorithm, often referred to as the winning algorithm for various competitions on different platforms. XGBoost stands for Extreme Gradient Boosting. It is an improved version of the gradient boosting algorithm; the base algorithm is the Gradient Boosting Decision Tree algorithm. Its strong predictive power and easy-to-implement approach have made it a fixture in many machine learning notebooks. Some key points of the algorithm are as follows:...
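Independently of those key points, here is a minimal sketch of training an XGBoost classifier with the xgboost Python package on synthetic data; the parameter values are placeholders, not recommendations:

```python
# Minimal XGBoost training sketch (illustrative, untuned parameter values)
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=20, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=42)

model = xgb.XGBClassifier(
    n_estimators=300,
    max_depth=6,            # XGBoost grows trees level-wise up to this depth
    learning_rate=0.05,
    eval_metric="logloss",
)
model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], verbose=False)
print(model.score(X_valid, y_valid))
```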