ML Algorithms: Which One Is Better?
In almost every situation we have to choose among numerous options, and machine learning is no different. Whenever we decide to work on an ML problem, we get stuck on selecting the algorithm that suits our need. It is also difficult to know every minor detail of each algorithm. So how can we decide that X is better than Y?
What helps is a common set of criteria for comparing and analyzing these algorithms. For this article I looked at the regression and classification tasks of ML.
Requirement for Space
Executing an algorithm requires memory, and an algorithm becomes impractical if it loads too much data into the working memory of a machine. The space constraint is therefore an important factor to consider while selecting an algorithm.
The amount of data available and the amount of data required for training are equally important for data analysis and machine learning. In other words, we need to consider how many training samples an algorithm needs in order to train a model that generalizes well. A neural network, for example, typically needs a larger and richer data set than a commonly used algorithm such as k-nearest neighbors (KNN). So the size and complexity of the available data help us decide which algorithm to work with.
"The two most powerful warriors are time and patience." (Leo Tolstoy, War and Peace)
When we talk about the time constraint here, please do not confuse it with the wall-clock time required. It is about computational complexity, i.e. how many elementary operations an algorithm performs. The benefit of this view is that the comparison does not depend on the computing power, architecture, or programming language a developer has chosen. It is important to note that the time cost can differ between training, validation, and prediction. Linear regression, for example, spends its time in training, after which each prediction is cheap, whereas k-nearest neighbors needs almost no training but does more work for every prediction.
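To make the idea of counting elementary operations concrete, here is a minimal sketch. The functions linear_predict and nearest_neighbor_predict, the weights, and the training points are all made up for illustration; each function counts one operation per feature it touches, which is a simplifying assumption rather than a formal cost model.

```python
def linear_predict(weights, x):
    """Predict with a fitted linear model and count multiplications."""
    ops = 0
    total = 0.0
    for w, xi in zip(weights, x):
        total += w * xi
        ops += 1
    return total, ops


def nearest_neighbor_predict(train, x):
    """Predict with brute-force 1-nearest-neighbor and count distance terms."""
    ops = 0
    best_label, best_dist = None, float("inf")
    for features, label in train:
        dist = 0.0
        for f, xi in zip(features, x):
            dist += (f - xi) ** 2
            ops += 1
        if dist < best_dist:
            best_dist, best_label = dist, label
    return best_label, ops


# Hypothetical data: four training points with two features each.
train = [([0.0, 0.0], "a"), ([1.0, 1.0], "b"),
         ([2.0, 2.0], "c"), ([3.0, 3.0], "d")]
query = [0.9, 1.2]

_, linear_ops = linear_predict([1.0, 2.0], query)
label, knn_ops = nearest_neighbor_predict(train, query)
print(linear_ops, knn_ops, label)
```

The linear model's prediction cost grows only with the number of features, while the brute-force nearest-neighbor cost also grows with the number of stored training points.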
Parallelism is mainly about how our algorithm handles tasks, much like dividing a job among several workers at the same time to speed it up. A parallel algorithm can complete more than one task at the same time. We can divide the workload within a single machine or across different machines with different processors. A sequential algorithm such as gradient boosting with decision trees builds each new tree to correct the errors of the previous ones, so it is difficult to parallelize in such cases.
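The difference between work that can be split and work that cannot can be sketched as follows. The function names score_fold and boost_like and all the numbers are hypothetical. The sketch uses threads from the Python standard library only to show the structure; for CPU-heavy work in Python one would more likely use processes, which is an assumption about deployment and not something this article prescribes.

```python
from concurrent.futures import ThreadPoolExecutor


def score_fold(fold):
    """Mean squared error of one fold of (actual, predicted) pairs."""
    return sum((y - p) ** 2 for y, p in fold) / len(fold)


# Three independent folds; the values are made up for illustration.
folds = [
    [(1.0, 1.5), (2.0, 2.5)],
    [(3.0, 3.0), (4.0, 5.0)],
    [(5.0, 4.0), (6.0, 6.0)],
]

# Parallel case: no fold depends on another, so each worker takes one.
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel_scores = list(pool.map(score_fold, folds))


def boost_like(targets, rounds=3, rate=0.5):
    """Sequential case: every round needs the prediction from the round before."""
    prediction = 0.0
    for _ in range(rounds):
        residual = sum(t - prediction for t in targets) / len(targets)
        prediction += rate * residual
    return prediction


print(parallel_scores)
print(boost_like([2.0, 4.0]))
```

The loop in boost_like cannot hand its rounds to separate workers, because round two cannot start until round one has produced its prediction.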
Online vs. Offline
Learning is basically about generating or updating the parameters that are used to predict an output. Online learning means fitting the parameters when data is first presented and then updating those values whenever a new set of data arrives. An offline learning algorithm, by contrast, needs a fresh start that includes the earlier and the latest data together in order to generate the parameters for prediction. When data arrives continuously, it is usually better to work with an algorithm that supports online learning, because real-time data can improve the model through parameter updates.
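As a minimal sketch of the two update styles, consider a "model" whose only parameter is the mean of the values seen so far. The class OnlineMean and the function offline_mean are hypothetical names chosen for this illustration, and the batches are made up.

```python
class OnlineMean:
    """Keeps a running mean and updates it one sample at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        # Incremental update: only the current parameter and the new sample are needed.
        self.count += 1
        self.mean += (x - self.mean) / self.count


def offline_mean(data):
    """Recomputes the parameter from scratch using all data at once."""
    return sum(data) / len(data)


first_batch = [2.0, 4.0]
new_batch = [6.0, 8.0]

# Online: the model sees the first batch, then is updated with the new one.
model = OnlineMean()
for x in first_batch:
    model.update(x)
for x in new_batch:
    model.update(x)

# Offline: old and new data are combined and the fit starts over.
offline = offline_mean(first_batch + new_batch)

print(model.mean, offline)
```

Both routes arrive at the same parameter here, but the online version never had to keep or revisit the first batch.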
In statistical machine learning we also consider the parametricity of an algorithm. In simple words, if the number of parameters learned by an algorithm stays fixed even when new data elements are added, the resulting model is parametric, while a non-parametric model gains additional parameters as new data comes into the picture. Assumptions about the shape of the data's probability distribution also help us choose between a parametric and a non-parametric model.
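The following sketch contrasts the two cases by looking at what each model has to store after fitting. The functions fit_line and fit_nearest_neighbor are hypothetical helpers written for this illustration, with a one-feature straight-line fit standing in for a parametric model and a nearest-neighbor model standing in for a non-parametric one.

```python
def fit_line(xs, ys):
    """Least-squares straight line: always returns exactly two parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return (slope, intercept)


def fit_nearest_neighbor(xs, ys):
    """Nearest-neighbor 'fit': the stored training points are the model."""
    return list(zip(xs, ys))


small_x, small_y = [0.0, 1.0, 2.0], [1.0, 3.0, 5.0]
large_x, large_y = [0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 5.0, 7.0, 9.0]

line_small = fit_line(small_x, small_y)
line_large = fit_line(large_x, large_y)
knn_small = fit_nearest_neighbor(small_x, small_y)
knn_large = fit_nearest_neighbor(large_x, large_y)

# The line keeps two numbers either way; the neighbor model grows with the data.
print(len(line_small), len(line_large))
print(len(knn_small), len(knn_large))
```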
Bias-Variance Trade-off
Different ML algorithms show different bias-variance trade-offs. Let us make that clearer. Bias means leaning towards a certain result more than towards others. In the same way, a bias error indicates that a model built on a particular algorithm is systematically pulled towards a particular kind of solution. For example, if we apply linear regression to non-linear data, the result will be biased. Variance is the error that results from the model's sensitivity to differences between training data sets. We calculate it as the average squared difference between the predictions made and the expected (average) prediction of the model.
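The two error terms can be computed directly once we have several predictions for the same input, each from a model trained on a different sample of data. The sketch below assumes such predictions are already available; the function bias_and_variance and all the numbers are made up for illustration.

```python
def bias_and_variance(predictions, true_value):
    """Return (squared bias, variance) for predictions of one input."""
    mean_prediction = sum(predictions) / len(predictions)
    # Bias: how far the average prediction is from the true value.
    bias_squared = (mean_prediction - true_value) ** 2
    # Variance: how much individual predictions scatter around their own average.
    variance = (sum((p - mean_prediction) ** 2 for p in predictions)
                / len(predictions))
    return bias_squared, variance


true_value = 3.0

# Hypothetical rigid model: predictions agree with each other but miss the target.
rigid = [2.0, 2.5, 1.5, 2.0]
# Hypothetical flexible model: right on average, but unstable across data sets.
flexible = [1.0, 5.0, 2.0, 4.0]

print(bias_and_variance(rigid, true_value))
print(bias_and_variance(flexible, true_value))
```

In this toy example the rigid model has high bias and low variance, and the flexible model has the opposite profile, which is the trade-off in miniature.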
Objectives and Methods used
Every machine learning model has an objective. To fulfill that objective we need to select a method, so it is very important to compare the candidates on reasoned, logical grounds, i.e. to ask which optimization techniques an algorithm can use to generate a machine learning model. For example, with linear regression we minimize the squared loss between the predicted and actual values, while with Lasso regression we minimize the MSE while restricting the learned parameters through regularization to reduce overfitting.
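The two objectives can be written out as plain functions. This is a sketch under stated assumptions: the names mse and lasso_objective, the data, and the penalty strength alpha are hypothetical, and the penalty is taken to be alpha times the sum of absolute weights. Libraries differ in how they scale these terms, so the exact constants should not be read as any particular library's definition.

```python
def mse(weights, rows, targets):
    """Mean squared error of a linear model (no intercept, for brevity)."""
    total = 0.0
    for row, target in zip(rows, targets):
        prediction = sum(w * x for w, x in zip(weights, row))
        total += (prediction - target) ** 2
    return total / len(rows)


def lasso_objective(weights, rows, targets, alpha):
    """MSE plus an L1 penalty that discourages large weights."""
    penalty = alpha * sum(abs(w) for w in weights)
    return mse(weights, rows, targets) + penalty


# Made-up data in which only the first feature matters.
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [2.0, 0.0, 2.0]

exact = [2.0, 0.0]   # fits the data perfectly
other = [1.0, 1.0]   # spreads weight over both features

print(mse(exact, rows, targets), mse(other, rows, targets))
print(lasso_objective(exact, rows, targets, alpha=0.5))
print(lasso_objective(other, rows, targets, alpha=0.5))
```

Linear regression would search for the weights that minimize mse alone, while Lasso searches for the weights that minimize lasso_objective, so it pays a price for every unit of weight it uses.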
In conclusion, an algorithm can be analyzed on several different grounds. Working through these comparisons helps us choose an algorithm that leads to a better machine learning model for prediction.