Machine Learning

Briefing in Python

Machine Learning (ML) is the field of computer science that enables computer systems to make sense of data in much the same way as human beings do. It is a type of artificial intelligence that extracts patterns from raw data by using an algorithm or method. The main focus of Machine Learning is to allow computer systems to learn from experience without being explicitly programmed and without human intervention.

Need for Machine Learning

Humans are the most intelligent and advanced species on earth because they can think, evaluate and solve complex problems. AI, on the other hand, is still in its initial stage and has not surpassed human intelligence in many aspects. Then why do we need Machine Learning? The most suitable answer to this question is, “to make decisions, based on data, with efficiency and scale”.

Organizations are investing heavily in newer technologies like Artificial Intelligence, Machine Learning and Deep Learning to extract key information from data, perform real-world tasks and solve problems. We can call these data-driven decisions taken by machines in order to automate a process. Such data-driven decisions can be used in place of programming logic in problems that inherently cannot be solved with hand-written rules. Because real-world problems need to be solved efficiently and at a huge scale, the need for machine learning arises.

Machine Learning Model (Random Forest Classifier)

Random forest is a supervised learning algorithm that can be used for both classification and regression, although it is mainly used for classification problems. Both regression and classification are prediction tasks that work with labeled datasets. The main difference between them is that regression algorithms predict continuous values such as price, salary or age, while classification algorithms predict discrete values such as Male or Female, True or False, Spam or Not Spam. The random forest algorithm builds decision trees on data samples, gets a prediction from each of them, and finally selects the best solution by means of voting. It is an ensemble method that is better than a single decision tree because it reduces over-fitting by combining the results of many trees.
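To make the voting idea concrete, here is a minimal sketch in Python of how the outputs of several trees might be combined: a majority vote for classification and, for comparison, an average for regression. The function names and the five sample predictions are made up for illustration and do not come from any particular library.

```python
from collections import Counter
from statistics import mean


def aggregate_classification(tree_predictions):
    """Return the label predicted by the most trees (majority vote)."""
    return Counter(tree_predictions).most_common(1)[0][0]


def aggregate_regression(tree_predictions):
    """Return the average of the trees' numeric predictions."""
    return mean(tree_predictions)


# Hypothetical outputs of five trees for one input (illustrative values only).
class_votes = ["spam", "not spam", "spam", "spam", "not spam"]
price_estimates = [200.0, 210.0, 190.0, 205.0, 195.0]

final_label = aggregate_classification(class_votes)
final_price = aggregate_regression(price_estimates)

print(final_label)  # spam (3 votes against 2)
print(final_price)  # 200.0
```

A single tree that answers "not spam" is outvoted here, which is the sense in which the ensemble smooths over the mistakes of individual trees.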

We can understand how the Random Forest algorithm works with the help of the following steps −

Step 1 − Select random samples (drawn with replacement) from the given dataset.
Step 2 − Construct a decision tree for every sample, then get a prediction from every decision tree.
Step 3 − Perform voting over the predicted results.
Step 4 − Select the most-voted prediction as the final prediction.
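
The four steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not a full random forest: each "tree" is a one-split decision stump on a single numeric feature, the toy dataset is invented, and all function names (`bootstrap_sample`, `fit_stump`, `fit_forest`, `predict`) are hypothetical. Real random forest implementations grow deeper trees and also pick a random subset of features at each split.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Step 1: draw len(rows) rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Step 2: fit a one-split 'tree' (a stump) on (x, label) rows.

    Every observed x is tried as a threshold; the one with the fewest
    training errors is kept.
    """
    best = None
    for threshold, _ in rows:
        left = [label for x, label in rows if x <= threshold]
        right = [label for x, label in rows if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If no row falls to the right, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(label != left_label for label in left)
        errors += sum(label != right_label for label in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def fit_forest(rows, n_trees=25, seed=0):
    """Steps 1 and 2 repeated: one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def predict(forest, x):
    """Steps 3 and 4: collect one vote per tree, return the most-voted label."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Invented toy data: small x values are "low", large x values are "high".
data = [(1, "low"), (2, "low"), (3, "low"), (4, "low"),
        (7, "high"), (8, "high"), (9, "high"), (10, "high")]

forest = fit_forest(data)
print(predict(forest, 1), predict(forest, 10))  # low high
```

Because every stump sees a different bootstrap sample, individual stumps may place their split at different thresholds; the vote in `predict` is what turns those differing opinions into one answer.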

The following diagram will illustrate its working −