Feature Scaling and Hyper Parameter Tuning
Improve the Accuracy of a Model by 23.125%
Introduction :
In a Machine Learning model, accuracy is a key criterion for deciding whether the model is fit for use.
We all strive for a model that gives higher accuracy. The accuracy of a model depends on various factors such as feature selection, handling of null or missing values in the data set, the hyper parameters of the algorithm, feature scaling, and so on. This may seem a bit complicated at first, but once we actually dive in, we will see how easy it is to improve the performance of our Machine Learning model.
So let's get started.
We are going to use the Red Wine Quality data set, which contains 11 features and a target column.
Techniques to Improve Accuracy :
- Selecting appropriate Features
- Handling of Null or Missing Values
- Hyper Parameter Tuning
- Feature Scaling
1. Selecting Appropriate Features
The given data set may contain many columns, some of which are useful to us and some of which are not.
To select the appropriate feature columns, we check the correlation between every pair of features. These correlations can be visualized by creating a heat map.
The heat map in the figure shows the correlation between the features of the data set.
Code to get a heat map:
import seaborn as sns
import matplotlib.pyplot as plt
corr = df.corr()      # pairwise correlation matrix of the features
sns.heatmap(corr)     # visualize the correlations as a heat map
plt.show()
2. Handling Null or Missing Values :
Sometimes a given data set contains observations with no value at all. Such entries are called null values. There are a few common ways to deal with them: in some cases we replace the null values with a central tendency measure of that column, i.e., the mean, median, or mode; in other cases, the whole row that contains a null value is dropped.
Code to drop the rows that contain a null value:
- df = df.dropna()
Code to replace the null values in a column with its mean, median, or mode (note that mode() returns a Series, so we take its first entry):
- df[column] = df[column].fillna(df[column].mean())
- df[column] = df[column].fillna(df[column].median())
- df[column] = df[column].fillna(df[column].mode()[0])
Code to check which columns have null values:
- df.isna().sum()
The output shows the number of null values in each column.
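The steps above can be sketched on a small, made-up DataFrame (the column names here are only illustrative stand-ins for the real data set):

```python
import numpy as np
import pandas as pd

# A toy DataFrame with missing values, standing in for the real data set
df = pd.DataFrame({"fixed acidity": [7.4, np.nan, 7.8],
                   "pH": [3.51, 3.20, np.nan]})

print(df.isna().sum())        # number of null values in each column

# Fill one column's nulls with that column's mean
df["fixed acidity"] = df["fixed acidity"].fillna(df["fixed acidity"].mean())

# Drop any remaining rows that still contain a null value
df = df.dropna()
```

After these two steps the DataFrame contains no null values at all.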
Before going into the next topic, Hyper Parameter Tuning, let us build our model without applying any of the techniques mentioned above.
We are going to build a KNN classification model that predicts the quality of red wine. The data set, shown in the 1st figure, contains 11 independent features and a dependent target column, "Quality".
Step 1: Import Pandas and Numpy and read the Data set
Step 2: Build the KNN model with a selected n_neighbors value
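The two steps above can be sketched as follows. Scikit-learn's built-in wine data set is used here purely as a stand-in so the sketch is self-contained; with the real data you would instead read the CSV with pandas and take the "Quality" column as the target.

```python
# Baseline KNN model, without any of the techniques discussed above.
# load_wine is a stand-in for pd.read_csv on the Red Wine Quality file.
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=5)   # an arbitrary starting value of k
knn.fit(X_train, y_train)
accuracy = accuracy_score(y_test, knn.predict(X_test))
print(f"Baseline accuracy: {accuracy:.3%}")
```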
Accuracy of Basic Model :
From the picture above, the accuracy of the KNN model is 48.125%, which is quite low.
From Out[4] in the figure, we can see parameters such as algorithm, leaf_size, metric, metric_params, n_jobs, n_neighbors, p, and weights.
These parameters are called the HYPER PARAMETERS of the model.
3. Hyper Parameter Tuning :
Hyper Parameter Tuning is the selection of the best combination of a model's hyper parameters from a given set of candidate values. To do this, scikit-learn provides two utilities, RandomizedSearchCV and GridSearchCV. These select the best parameters by trying out combinations of the given values, with accuracy as the selection criterion: the combination of parameters that yields the highest accuracy is the one chosen.
So let's dive into the step-by-step process of Hyper Parameter Tuning.
Step 1 : Import RandomizedSearchCV
- from sklearn.model_selection import RandomizedSearchCV
Step 2 : Give Hyper Parameters in dictionaries
In the figure above I have selected only some parameters: n_neighbors, weights, and algorithm. I have given a range of values for n_neighbors, along with every available option for the weights and algorithm parameters. These parameters are given in the form of a dictionary.
RandomizedSearchCV then picks the best combination of values for these parameters and gives them as output. The best combination is the one that yields the highest accuracy.
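A dictionary of this shape might look like the following. The exact values used in the original figure are not reproduced here, so this grid is illustrative: a range of k values plus every available option for weights and algorithm.

```python
# Candidate hyper parameter values for KNN, keyed by parameter name
params = {
    "n_neighbors": list(range(1, 31)),                    # k from 1 to 30
    "weights": ["uniform", "distance"],                   # all weight options
    "algorithm": ["auto", "ball_tree", "kd_tree", "brute"],  # all algorithms
}
```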
Step 3 : Instantiate the RandomizedSearchCV and fit the data
Inside RandomizedSearchCV we pass our estimator, which is KNN, along with the parameters dictionary, here called params. cv=5 means 5-fold cross-validation is performed for each combination, and n_jobs=-1 engages all CPU cores in finding the best combination.
The data is then fit to the instantiated search.
Step 4 : Find Best Estimator and Parameters, Apply them to the KNN model
From Out[67], we can see the parameters that constitute the best estimator. By setting our model's parameters to these values, we can achieve higher accuracy.
Let us set the parameter values of our KNN classifier to the values above, fit the data, and find the accuracy of our model.
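Steps 3 and 4 can be sketched end-to-end as below. As before, scikit-learn's built-in wine data stands in for the Red Wine Quality CSV so that the sketch is self-contained, and the parameter grid is illustrative.

```python
# Run the randomized search, then refit a fresh KNN with the best parameters
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

params = {"n_neighbors": list(range(1, 31)),
          "weights": ["uniform", "distance"],
          "algorithm": ["auto", "ball_tree", "kd_tree", "brute"]}

search = RandomizedSearchCV(
    estimator=KNeighborsClassifier(),  # the model whose parameters we tune
    param_distributions=params,
    n_iter=20,        # number of random combinations to try
    cv=5,             # 5-fold cross-validation for each combination
    n_jobs=-1,        # use all CPU cores
    random_state=42)
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)

# Assign the best parameter values to a fresh model and measure test accuracy
best_knn = KNeighborsClassifier(**search.best_params_)
best_knn.fit(X_train, y_train)
tuned_accuracy = accuracy_score(y_test, best_knn.predict(X_test))
print(f"Tuned accuracy: {tuned_accuracy:.3%}")
```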
Step 5 : Assign the best parameter values to the model and predict
The accuracy of the model has increased to 65%. Before Hyper Parameter Tuning it was 48.125%, so it has improved by 16.875 percentage points.
Varying n_neighbors and plotting the accuracy obtained for each value of k:
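Such a plot can be sketched as follows (the built-in wine data again stands in for the Red Wine Quality CSV, and the file name for the saved figure is arbitrary):

```python
# Train one KNN model per value of k and plot test accuracy against k
import matplotlib
matplotlib.use("Agg")                 # render off-screen (no display needed)
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

ks = list(range(1, 31))
accuracies = []
for k in ks:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    accuracies.append(accuracy_score(y_test, knn.predict(X_test)))

plt.plot(ks, accuracies, marker="o")
plt.xlabel("n_neighbors (k)")
plt.ylabel("Test accuracy")
plt.savefig("knn_accuracy_vs_k.png")
```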
4. Feature Scaling :
Feature Scaling is a technique in which all the features of the data set are brought onto the same scale. It normalizes the range of the features and is also called data normalization; it is generally performed during the data pre-processing step.
Feature Scaling is needed because it brings all the features into the same range of magnitudes.
Not all algorithms require Feature Scaling. Algorithms such as KNN, Linear Regression, Logistic Regression, K-Means Clustering, SVM, and Neural Networks do require it: algorithms that use gradient descent as an optimization technique need the data to be scaled, and so do distance-based algorithms. Our KNN model computes the Euclidean distance between points, so it requires the data to be scaled.
There are mainly two types of Feature Scaling Techniques.
- Min-Max Scaler / Normalized Scaling
- Standard Scaler
1. Normalization :
It is a scaling technique in which values are shifted and re-scaled so that they end up ranging between 0 and 1.
The formula for Normalization is X' = (X − Xmin) / (Xmax − Xmin), where Xmin and Xmax are the minimum and maximum values of the column.
- When the value of X is the minimum value in the column, the numerator will be 0, and hence X' is 0.
- When the value of X is the maximum value in the column, the numerator is equal to the denominator and thus the value of X' is 1.
- If the value of X is between the minimum and the maximum value, then the value of X' is between 0 and 1.
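The three cases above can be checked with scikit-learn's MinMaxScaler on a toy column of values (the numbers here are made up for illustration):

```python
# Min-max normalization of a single toy column: 1 -> 0, 10 -> 1, 5 -> 4/9
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[1.0], [5.0], [10.0]])     # one feature, three observations
scaled = MinMaxScaler().fit_transform(data)  # rescales to the range [0, 1]
print(scaled.ravel())
```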
2. Standardization :
Standardization is a technique in which the values are scaled so that the mean is zero and the standard deviation is one. The formula is X' = (X − μ) / σ, where μ is the mean and σ is the standard deviation of the column.
In this case the values are not restricted to a particular range.
Which one to use? Normalization or Standardization
The best way to find which method is better for your model is to apply both and compare the accuracy, because there is no precise theory stating that Standardization should be used for some models and Normalization for others.
That said, Normalization tends to work well when the given data does not follow a Gaussian distribution. In algorithms like KNN and Neural Networks, where the model makes no assumption about the distribution of the data, Normalization can yield better results.
Standardization, on the other hand, can be helpful when the data follows a Gaussian distribution, though this does not have to be strictly true. Also, unlike Normalization, Standardization has no bounding range, so outliers in the data have less influence on the scaled values.
Step 1 : Import StandardScaler and Instantiate
- from sklearn.preprocessing import StandardScaler
- scaler = StandardScaler()
Step 3 : Perform Hyper Parameter Tuning by fitting Scaled features to the Model
Step 5 : Check the Accuracy of model
Hence, the Final Accuracy of the model is 71.25%.
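The full scaled-and-tuned pipeline can be sketched end-to-end as below. As throughout, the built-in wine data is a self-contained stand-in for the Red Wine Quality CSV, and the parameter grid is illustrative; note that the scaler is fit on the training split only and then applied to both splits, to avoid leaking test information.

```python
# StandardScaler + RandomizedSearchCV + KNN, end to end
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training split only, then transform both splits
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

params = {"n_neighbors": list(range(1, 31)),
          "weights": ["uniform", "distance"]}
search = RandomizedSearchCV(KNeighborsClassifier(), params,
                            n_iter=20, cv=5, n_jobs=-1, random_state=42)
search.fit(X_train_scaled, y_train)

# search.predict uses the best estimator refit on the whole training split
final_accuracy = accuracy_score(y_test, search.predict(X_test_scaled))
print(f"Final accuracy: {final_accuracy:.3%}")
```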
Final Step : Plot the graph of accuracy against n_neighbors for this model
End Notes
As promised in the headline of the blog, the accuracy of the model has improved from 48.125% to 71.25%. This was achieved by using Hyper Parameter Tuning and Feature Scaling.
Keep in mind that there is no correct answer to when to use normalization over standardization and vice-versa. It all depends on your data and the algorithm you are using.
As a next step, I encourage you to try out feature scaling with other algorithms and figure out what works best – normalization or standardization? And don't forget to share your Insights in the comments :)