
Data Pre-processing in Machine Learning

By Ben Othmen Rabeb

In this article, we will cover the main data pre-processing steps needed to convert raw data into a form that is ready for analysis.


Data Pre-processing

Data pre-processing includes the steps we need to take to transform or encode the data so that it can be easily analyzed by the machine.



For a model to make accurate and precise predictions, the algorithm must be able to easily interpret the characteristics of the data. Making that possible is the main goal of pre-processing.


Data pre-processing steps





A. Data Cleaning


Data cleaning is the part of data pre-processing that fixes problems in the raw data: filling in missing values, smoothing out noisy data, resolving inconsistencies, and removing outliers.
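To make two of these operations concrete, here is a minimal sketch that fills missing values with the median and then removes outliers with the common 1.5 x IQR rule. It uses only the Python standard library. The `ages` list, the function names, and the choice of median and IQR are assumptions made for illustration, not a prescribed method.

```python
from statistics import median, quantiles


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def remove_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the data
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical column with one missing value and one obvious outlier.
ages = [25, 27, None, 29, 31, 30, 28, 120]

filled = fill_missing(ages)        # None becomes the median, 29
cleaned = remove_outliers(filled)  # 120 falls outside the IQR fences

print(cleaned)  # [25, 27, 29, 29, 31, 30, 28]
```

The median is used rather than the mean because a single extreme value such as 120 would pull the mean upward and distort the filled value.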


B. Data Integration


Data integration is the data pre-processing step that merges data from multiple sources into a single, larger data store, such as a data warehouse.
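As a rough illustration, the sketch below merges records from two sources that describe the same customers but name the key column differently. The two sources (`crm_rows`, `billing_rows`), their field names, and the `integrate` helper are all hypothetical; a real warehouse load would typically rely on a database or a library such as pandas.

```python
def integrate(left, right, left_key, right_key):
    """Outer-join two lists of dicts into one record per key value."""
    merged = {}
    for row in left:
        merged.setdefault(row[left_key], {}).update(row)
    for row in right:
        # Store the key under the left-hand name so the schema is consistent.
        record = merged.setdefault(row[right_key], {left_key: row[right_key]})
        record.update({k: v for k, v in row.items() if k != right_key})
    return [merged[key] for key in sorted(merged)]


# Hypothetical source 1: a CRM export.
crm_rows = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]
# Hypothetical source 2: a billing system that calls the key "cust_id".
billing_rows = [
    {"cust_id": 1, "total_spent": 120.0},
    {"cust_id": 3, "total_spent": 45.5},
]

combined = integrate(crm_rows, billing_rows, "customer_id", "cust_id")
for record in combined:
    print(record)
```

Customers that appear in only one source are kept with the fields that are available, so the gaps remain visible and can be handled during data cleaning.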


C. Data Transformation


After cleaning the data, we need to consolidate the quality data into other forms by changing the value, structure, or format of the data. The main data transformation strategies are generalization, standardization, attribute selection, and aggregation.
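Three of these strategies can be sketched in a few lines of standard-library Python. Standardization is shown here as a z-score, generalization as mapping exact ages to broad groups, and aggregation as a sum per group. The function names, the age thresholds, and the sample data are assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit variance (z-scores)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mu) / sigma for v in values]


def generalize_age(age):
    """Replace an exact age with a broader category (thresholds are illustrative)."""
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


def aggregate_sum(rows, group_key, value_key):
    """Sum value_key over all rows that share the same group_key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, population std 2
print(standardize(scores))  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]

print([generalize_age(a) for a in (12, 40, 70)])  # ['minor', 'adult', 'senior']

sales = [
    {"region": "north", "sales": 10.0},
    {"region": "south", "sales": 5.0},
    {"region": "north", "sales": 2.5},
]
print(aggregate_sum(sales, "region", "sales"))  # {'north': 12.5, 'south': 5.0}
```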


D. Data Reduction


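The size of the dataset in a data warehouse may be too large to be handled by data analysis and data mining algorithms. Data reduction addresses this by producing a smaller representation of the dataset that still preserves most of the information needed for analysis.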
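The article does not single out a reduction technique, so the sketch below assumes two simple ones: random sampling to cut the number of rows, and dropping columns that hold a single constant value to cut the number of attributes. The helper names, the 10% sampling fraction, and the generated `rows` are hypothetical.

```python
import random


def sample_rows(rows, fraction, seed=0):
    """Return a random subset of rows (seeded so the result is reproducible)."""
    rng = random.Random(seed)
    k = max(1, round(len(rows) * fraction))
    return rng.sample(rows, k)


def drop_constant_columns(rows):
    """Remove columns whose value is identical in every row."""
    keep = [key for key in rows[0] if len({row[key] for row in rows}) > 1]
    return [{key: row[key] for key in keep} for row in rows]


# Hypothetical dataset: 1000 rows, where "country" never varies.
rows = [{"id": i, "country": "TN", "score": i % 7} for i in range(1000)]

reduced = drop_constant_columns(sample_rows(rows, fraction=0.1))

print(len(rows), "->", len(reduced), "rows")
print(sorted(reduced[0]))  # remaining column names
```

Sampling trades some accuracy for speed, so it is worth checking that summary statistics on the sample stay close to those of the full dataset.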

