May 2020 Issue Vol.10 No.5
Pradeepan P#1, Gladston Raj S#2
#1 Varavilathoppu, Poovar, Trivandrum, Kerala, India
#2 Associate Professor & Head of Computer Science, Government College Nedumangad, Trivandrum, Kerala, India
Abstract: The curse of dimensionality refers to the problems that arise when working with high-dimensional data that do not exist in lower dimensions. As the number of features increases, so does the number of samples required: the more features we have, the more samples are needed to have all combinations of feature values well represented. A larger number of features also makes the machine learning model more complex, and the more features there are, the greater the risk of overfitting. A model trained on a large number of features becomes increasingly dependent on the data it was trained on and, in turn, overfitted, leading to poor performance on real data and defeating its purpose; avoiding overfitting is therefore a major motivation for performing dimensionality reduction. This paper presents a review and systematic comparison of different dimensionality reduction techniques and explains the strengths and weaknesses of each.
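The following is a minimal sketch, not taken from the paper, illustrating the abstract's argument: with many features and few samples a model fits noise, and projecting the data onto fewer dimensions (here with PCA, one common dimensionality reduction technique) narrows the gap between training and test performance. The dataset sizes, the use of scikit-learn, and the choice of PCA with logistic regression are illustrative assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Few samples, many features: only 10 features carry signal, the rest are noise.
X, y = make_classification(n_samples=200, n_features=500, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,
                                                    random_state=0)

# Model trained on all 500 original features.
full = LogisticRegression(max_iter=5000).fit(X_train, y_train)

# Same model after reducing the data to 10 principal components.
reduced = make_pipeline(PCA(n_components=10),
                        LogisticRegression(max_iter=5000)).fit(X_train, y_train)

for name, model in [("all 500 features", full), ("PCA, 10 components", reduced)]:
    print(f"{name}: train accuracy = {model.score(X_train, y_train):.2f}, "
          f"test accuracy = {model.score(X_test, y_test):.2f}")

Under these assumptions, the full-feature model typically shows near-perfect training accuracy but noticeably lower test accuracy, while the reduced model generalizes better, which is the overfitting behaviour the abstract describes.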