Dealing with Missing Data in Machine Learning Problems

Three popular techniques for dealing with missing values in data science and machine learning:

  1. Dropping the columns with missing values: The simplest option, but unless a column is mostly missing, it can discard a lot of information that would be useful for training your ML model.
  2. Imputation, i.e. filling each missing value with some number (for example, the column mean): Not very accurate, but it usually yields better results than dropping the entire column.
  3. Imputing the missing values and adding a column that indicates where they were in the dataset: For some datasets this leads to improved results when compared to (2) above.
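
The three techniques can be sketched in a few lines of plain Python. This is only an illustrative sketch: the toy dataset, the column names, the function names, and the choice of the column mean as the fill value are all assumptions made for the example, not something prescribed above. In practice you would more likely use a library such as pandas or scikit-learn for the same steps.

```python
from statistics import mean

# Hypothetical toy dataset: column name -> list of values.
# None marks a missing value.
data = {
    "rooms": [3, None, 4, 2],
    "area": [70.0, 90.0, None, 50.0],
    "year": [1990, 2005, 2010, 1985],
}


def drop_columns_with_missing(table):
    """Technique 1: keep only the columns that have no missing values."""
    return {name: list(col) for name, col in table.items() if None not in col}


def impute_mean(table):
    """Technique 2: replace each missing value with the mean of its column.

    Assumes every column has at least one observed (non-missing) value.
    """
    result = {}
    for name, col in table.items():
        observed = [v for v in col if v is not None]
        fill = mean(observed)
        result[name] = [fill if v is None else v for v in col]
    return result


def impute_mean_with_indicator(table):
    """Technique 3: impute as in (2) and add a flag column for each
    column that had missing values, marking where they were."""
    result = impute_mean(table)
    for name, col in table.items():
        if None in col:
            result[name + "_was_missing"] = [v is None for v in col]
    return result


dropped = drop_columns_with_missing(data)     # only "year" survives
imputed = impute_mean(data)                   # gaps filled with column means
flagged = impute_mean_with_indicator(data)    # imputed values plus flag columns
```

On this toy data, technique (1) keeps only the `year` column, which shows how much information can be lost, while technique (3) keeps every column and records which entries were filled in, so a model can still learn from the fact that a value was missing.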
