Comparison of Performance of Data Imputation Methods for Numeric Dataset

Jadhav, Anil and Pramod, Dhanya and Ramanathan, Krishnan (2019) Comparison of Performance of Data Imputation Methods for Numeric Dataset. Applied Artificial Intelligence, 33 (10). pp. 913-933. ISSN 0883-9514

Comparison of Performance of Data Imputation Methods for Numeric Dataset.pdf - Published Version

Abstract

Missing data is a common problem faced by researchers and data scientists, and it must be handled appropriately to obtain accurate results from data analysis. The objective of this paper is to provide a better understanding of data missingness mechanisms and data imputation methods, and to assess the performance of widely used data imputation methods for numeric datasets. It will help practitioners and data scientists select an appropriate imputation method for numeric datasets when performing data mining tasks. In this paper, we comprehensively compare seven data imputation methods: mean imputation, median imputation, kNN imputation, predictive mean matching, Bayesian linear regression (norm), non-Bayesian linear regression (norm.nob), and random sample. We used five numeric datasets obtained from the UCI Machine Learning Repository to analyze and compare the performance of the imputation methods. Performance is assessed using the Normalized Root Mean Square Error (NRMSE). The results show that the kNN imputation method outperforms the other methods. We also found that the performance of a data imputation method is independent of the dataset and of the percentage of missing values in the dataset.
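
The evaluation described in the abstract (hide known values, impute them, score the imputed values against the originals) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy data, and the choice to normalize the RMSE by the standard deviation of the true column are all assumptions made for illustration, and the kNN variant shown is a simple unweighted average over Euclidean nearest neighbours.

```python
import math
import random
import statistics


def mask_values(column, fraction, seed=0):
    """Copy `column`, replacing roughly `fraction` of entries with None
    chosen uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    n_missing = max(1, round(fraction * len(column)))
    missing_idx = set(rng.sample(range(len(column)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(column)]


def impute(column, strategy="mean"):
    """Fill None entries with the mean or median of the observed values."""
    observed = [v for v in column if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in column]


def knn_impute(features, target, k=2):
    """Fill None entries of `target` with the average target value of the
    k rows whose feature vectors are closest (Euclidean distance).
    Assumes `features` has no missing values."""
    donors = [(f, t) for f, t in zip(features, target) if t is not None]
    result = []
    for f, t in zip(features, target):
        if t is None:
            nearest = sorted(donors, key=lambda d: math.dist(f, d[0]))[:k]
            t = statistics.mean([v for _, v in nearest])
        result.append(t)
    return result


def nrmse(true, masked, imputed):
    """RMSE over the masked positions, divided by the population standard
    deviation of the true column (normalization is an assumption here)."""
    idx = [i for i, v in enumerate(masked) if v is None]
    mse = sum((true[i] - imputed[i]) ** 2 for i in idx) / len(idx)
    return math.sqrt(mse) / statistics.pstdev(true)


# Toy usage: hide 20% of a made-up column, then compare two strategies.
true_column = [1.0, 2.0, 3.0, 4.0, 100.0, 5.0, 6.0, 7.0, 8.0, 9.0]
masked_column = mask_values(true_column, fraction=0.2, seed=1)
for strategy in ("mean", "median"):
    score = nrmse(true_column, masked_column, impute(masked_column, strategy))
    print(strategy, round(score, 4))
```

Lower NRMSE means the imputed values are closer to the hidden originals. The method labels norm, norm.nob, and the predictive mean matching and random sample methods named in the abstract are not reproduced in this sketch.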

Item Type: Article
Subjects: Grantha Library > Computer Science
Depositing User: Unnamed user with email support@granthalibrary.com
Date Deposited: 19 Jun 2023 10:20
Last Modified: 12 Sep 2024 04:25
URI: http://asian.universityeprint.com/id/eprint/1244
