Dealing with Missing Data in Machine Learning Problems

Three popular techniques for dealing with missing values in data science and machine learning are:

  1. Dropping the columns with missing values: The simplest option, but unless most of a column's values are missing, it can discard a lot of information that would be useful for training your ML model.
  2. Imputation, i.e. filling each missing value with some number (for example, the column mean): The filled-in values are rarely exact, but this usually yields better results than dropping the entire column.
  3. Imputing the missing values and adding a column that indicates which entries were originally missing: For some datasets this leads to improved results compared to (2) above.
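
The three techniques above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not a reference implementation: the toy dataset, the column names, the `_was_missing` suffix, and the choice of the mean as the fill value are all assumptions made for the example, and each column is assumed to contain at least one observed value.

```python
from statistics import mean

# Hypothetical toy dataset: column name -> values, where None marks a missing entry.
data = {
    "rooms": [3, 2, 4, 3],
    "area": [70.0, None, 120.0, 90.0],
    "age": [10, 25, None, 5],
}


def drop_missing_columns(columns):
    """Technique 1: keep only the columns that have no missing values."""
    return {
        name: list(values)
        for name, values in columns.items()
        if all(v is not None for v in values)
    }


def impute_mean(columns):
    """Technique 2: replace each missing value with the mean of its column."""
    result = {}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        fill = mean(observed)  # assumes at least one observed value per column
        result[name] = [fill if v is None else v for v in values]
    return result


def impute_mean_with_indicator(columns):
    """Technique 3: impute, then add a boolean column marking the filled entries."""
    result = impute_mean(columns)
    for name, values in columns.items():
        if any(v is None for v in values):
            # The "_was_missing" suffix is an illustrative naming choice.
            result[name + "_was_missing"] = [v is None for v in values]
    return result


dropped = drop_missing_columns(data)        # only "rooms" survives
imputed = impute_mean(data)                 # gaps in "area" and "age" are filled
flagged = impute_mean_with_indicator(data)  # filled, plus two indicator columns
```

In practice the same ideas are usually expressed with library calls, for example `DataFrame.dropna(axis=1)` in pandas for (1) and scikit-learn's `SimpleImputer`, whose `add_indicator=True` option covers (3). Which variant works best depends on the dataset, so it is worth comparing them on a validation set.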
