What is Normalization?
Feature scaling is an essential step before modeling when solving prediction problems in data science. Many machine learning algorithms, particularly those that rely on distances or gradient descent, work better when the input variables lie on a small, comparable scale.
This is where normalization comes into the picture. Normalization techniques reduce the scale of the variables and, in some cases, also make their statistical distribution easier for a model to work with.
In the following sections, we look at some of the techniques for normalizing data values.
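To make the motivation concrete, here is a minimal sketch of how a variable with large values can dominate a distance calculation. The data frame `df` and its columns `income` and `age` are hypothetical and chosen only for illustration; `dist()` is base R's Euclidean distance function.

```r
# Hypothetical data: two features on very different scales (not from the article).
df <- data.frame(
  income = c(25000, 48000, 51000, 120000),  # values in the tens of thousands
  age    = c(23, 45, 31, 52)                # values in the tens
)

# Euclidean distances between rows, using both columns.
dist_both <- dist(df)

# Distances using the income column alone.
dist_income <- dist(df["income"])

# The two sets of distances are almost identical: age barely contributes.
print(round(dist_both, 2))
print(round(dist_income, 2))
print(max(abs(dist_both - dist_income)))
```

Because the differences in `income` are thousands of times larger than the differences in `age`, the second variable has almost no influence on the result until both are brought onto a comparable scale.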
Normalize data in R - Log Transformation
In real-world scenarios, we often come across datasets whose values are unevenly distributed. That is, they are skewed or do not lie on a common scale.
In such cases, a simple way to bring the values onto a manageable scale is to replace each value with its logarithm. Note that this works only for positive values.
In the example below, we scale the large values stored in ‘data’ (converted to a data frame) using the log() function from base R, which computes the natural logarithm by default.
Example:
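rm(list = ls())  # clear the workspace
data <- c(1200, 34567, 3456, 12, 3456, 985, 1211)
summary(data)    # spread of the raw values
# log() takes the natural logarithm of every value in the data frame
log_scale <- log(as.data.frame(data))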
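As a rough check of what the transformation does, the sketch below applies `log()` to the same values as a plain vector and compares the range before and after. The variable name `log_data` and the `log1p()` line are additions for illustration, not part of the original example.

```r
# Same values as the example above (0985 written as 985).
data <- c(1200, 34567, 3456, 12, 3456, 985, 1211)

# Natural logarithm, applied element-wise to the vector.
log_data <- log(data)

# Compare the spread before and after the transformation.
print(range(data))      # 12 to 34567
print(range(log_data))  # roughly 2.48 to 10.45

# log(0) is -Inf and log() of a negative number gives NaN with a warning,
# so log1p(x), which computes log(1 + x), is a common choice when zeros can occur.
print(log1p(c(0, 9, 99)))
```

The raw values span more than three orders of magnitude, while the logged values fall within a range of about eight units, which is the scale reduction the section describes.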