Dealing with Missing Data in Machine Learning Problems

Three popular techniques for dealing with missing values in data science and Machine Learning:

  1. Dropping the columns with missing values: the simplest option, but unless a column is mostly missing, it can discard a lot of information that would be useful for training your ML model.
  2. Imputation, i.e. filling each missing value with some number (for example, the column mean): the filled-in value is rarely exactly right, but this usually yields better results than dropping the entire column.
  3. Imputing the missing values and adding a column that indicates where they were in the dataset: for some datasets this leads to improved results compared to (2) above.
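
The three options can be sketched in a few lines of pandas. This is a minimal illustration rather than a definitive recipe: the toy table, its column names (rooms, area, age), the "_was_missing" suffix, and the choice of the column mean as the fill value are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
X = pd.DataFrame({
    "rooms": [3.0, np.nan, 4.0, 2.0],
    "area": [70.0, 55.0, np.nan, 40.0],
    "age": [10.0, 5.0, 20.0, 8.0],
})

# (1) Drop every column that contains at least one missing value.
X_dropped = X.dropna(axis=1)

# (2) Impute: fill each missing value with the mean of its column
#     (the mean is an assumed choice; median or a constant also work).
X_imputed = X.fillna(X.mean())

# (3) Impute and record where the gaps were: one boolean indicator
#     column per original column that had missing values.
cols_with_missing = [c for c in X.columns if X[c].isna().any()]
flags = X[cols_with_missing].isna().add_suffix("_was_missing")
X_flagged = pd.concat([X_imputed, flags], axis=1)

print(X_dropped.columns.tolist())
print(X_flagged)
```

When the data is split into training and validation sets, the fill values should be computed on the training set only and then reused for the validation set. scikit-learn's SimpleImputer supports that fit/transform workflow, and its add_indicator option covers option (3).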
