Connectionist AI models require well-prepared data to learn effectively. Data preparation, also called data preprocessing, is the critical phase where we transform raw data into a clean and usable format for training. In simple terms, data preprocessing involves evaluating, filtering, manipulating, and encoding data so that a machine learning algorithm can understand it. This step is vital because, as the saying goes, “if garbage goes in, garbage comes out”: a model’s success depends on the quality of its input data. Below is a breakdown of the key parts of data preparation.

Data Cleaning

Data Labeling

Feature Scaling and Normalization

Data Splitting

Data Augmentation
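
To make the parts listed above concrete, here is a minimal sketch in plain Python (standard library only) that chains them on a tiny made-up dataset. Every function name, the toy rows, the 25% test fraction, and the Gaussian-jitter augmentation are illustrative assumptions, not a prescribed method. Labeling is assumed to have been done already, so the sketch only encodes the existing labels as integers.

```python
import random
import statistics

def clean(rows):
    """Data cleaning: drop rows with missing values and exact duplicates."""
    seen, cleaned = set(), []
    for features, label in rows:
        if label is None or any(v is None for v in features):
            continue  # incomplete record
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            cleaned.append((list(features), label))
    return cleaned

def encode_labels(rows):
    """Map each label string to an integer id (labels are assumed to exist already)."""
    index = {name: i for i, name in enumerate(sorted({label for _, label in rows}))}
    return [(f, index[label]) for f, label in rows], index

def split(rows, test_fraction=0.25, seed=0):
    """Data splitting: shuffle, then hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_scaler(train_rows):
    """Compute per-feature mean and standard deviation from the training set only."""
    columns = list(zip(*(f for f, _ in train_rows)))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard against zero spread
    return means, stds

def scale(rows, means, stds):
    """Feature scaling: standardize each feature using the fitted statistics."""
    return [([(v - m) / s for v, m, s in zip(f, means, stds)], label) for f, label in rows]

def augment(train_rows, noise=0.01, seed=0):
    """Data augmentation: append a slightly jittered copy of every training row."""
    rng = random.Random(seed)
    jittered = [([v + rng.gauss(0.0, noise) for v in f], label) for f, label in train_rows]
    return list(train_rows) + jittered

# Hypothetical toy data: (features, label) pairs with one duplicate and one missing value.
raw = [
    ([1.0, 200.0], "cat"),
    ([2.0, 180.0], "dog"),
    ([2.0, 180.0], "dog"),   # exact duplicate
    ([None, 150.0], "cat"),  # missing feature value
    ([3.0, 220.0], "cat"),
    ([4.0, 160.0], "dog"),
    ([5.0, 210.0], "cat"),
]

cleaned = clean(raw)
encoded, label_index = encode_labels(cleaned)
train, test = split(encoded)
means, stds = fit_scaler(train)  # statistics come from the training rows only
train, test = scale(train, means, stds), scale(test, means, stds)
train = augment(train)           # augment the training set, never the test set
print(len(cleaned), label_index, len(train), len(test))
```

Two ordering choices in this sketch are worth noting: the scaler is fitted after the split so that test-set statistics do not leak into training, and augmentation is applied only to the training rows so that the test set still reflects real data.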

https://lakefs.io/blog/data-preprocessing-in-machine-learning/#:~:text=Data preprocessing is the process,useful for machine learning purposes

https://www.ibm.com/think/topics/data-labeling#:~:text=Data labeling involves identifying raw,them to make accurate predictions

https://milvus.io/ai-quick-reference/how-do-you-preprocess-data-for-a-neural-network

https://www.datacamp.com/tutorial/complete-guide-data-augmentation