This component is a universal solution for supervised machine learning problems. Supervised machine learning can be defined as learning a function that maps the independent characteristics describing objects or phenomena to the expected outcomes (categories, values, etc.). The function is inferred from training examples: model fitting consists of feeding the data into the model and iteratively adjusting the model's weights until the model fits the data appropriately.
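The iterative adjustment of weights described above can be illustrated with a minimal sketch. It fits a one-feature linear model by gradient descent on the mean squared error. The function name `fit_linear`, the learning rate, and the toy data are assumptions made for illustration only; the brick's internal algorithms are not described here and may differ.

```python
def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors with the current weights.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Move the weights a small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy training set generated from y = 2x + 1 (features -> target).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear(xs, ys)
```

After fitting, `w` and `b` should be close to the slope and intercept that generated the data, which is what "fitted appropriately" means in this simplified setting.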
Supervised learning requires a training set that represents the connections between input parameters (features) and the desired output (target). The type of the target variable defines the type of the corresponding data mining problem (classification or regression):
Classification is the supervised learning problem in which the target variable is discrete. It consists of finding a function that recognizes an object's class based on its features.
Examples: email spam detection, image recognition, credit scoring, user compliance categorization.
Regression is the supervised learning problem in which the target variable is continuous. Regression is used to understand the relationship between dependent and independent variables.
Examples: market trends prediction, time-series forecasting, risk assessment.
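As a rough illustration of how the target variable determines the problem type, the sketch below applies a simple heuristic: non-numeric targets, or numeric targets with only a few distinct values, are treated as classification, and everything else as regression. The function `infer_problem_type` and the `max_classes` threshold are hypothetical; the actual detection rule used by the brick is not documented in this section.

```python
def infer_problem_type(target_values, max_classes=10):
    """Guess the problem type from the values of the target column.

    Hypothetical heuristic: a numeric target with many distinct values
    is treated as continuous (regression); anything else as discrete
    (classification).
    """
    distinct = set(target_values)
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in distinct
    )
    if is_numeric and len(distinct) > max_classes:
        return "regression"
    return "classification"


spam_labels = ["spam", "ham", "spam", "ham"]   # discrete target
prices = [100.0 + 0.5 * i for i in range(50)]  # continuous target

spam_problem = infer_problem_type(spam_labels)
price_problem = infer_problem_type(prices)
```

A heuristic like this can misjudge borderline cases, for example a numeric rating with few levels, which is one reason an explicit problem selector is useful.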
The Predictive Model brick automatically detects the type of the supervised learning problem based on the selected target variable and selects the input features that are appropriate for modeling. There are two modes of Predictive Model brick settings:
Bricks → Analytics → AutoML → Predictive Model
Bricks → Analytics → Data Mining / ML → Classification Models → Predictive Model
Bricks → Analytics → Data Mining / ML → Regression Models → Predictive Model
Target Variable
The column that we want the model to predict. This variable can be either continuous or discrete (categorical), which defines the type of the data mining problem (regression or classification, respectively).
Quick run
A binary flag that determines the model fitting scenario. If True, the model is fitted with the default parameters, without hyper-parameter tuning. If False, the hyper-parameters are tuned, sacrificing computational performance in favor of model precision.
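The trade-off behind this flag can be sketched with a tiny nearest-neighbours classifier. With `quick_run=True` a single default value of `k` is used; with `quick_run=False` several candidate values are evaluated and the best one is kept, at the cost of more computation. All names here (`knn_predict`, `accuracy`, `fit`, `default_k`, `candidates`) and the toy data are assumptions for illustration; the models and search strategy used by the brick are not specified in this section.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Return the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, valid, k):
    """Share of validation points that are classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


def fit(train, valid, quick_run, default_k=1, candidates=(1, 3, 5)):
    """Return (k, validation accuracy) for the chosen fitting scenario."""
    if quick_run:
        # Quick run: one fit with the default hyper-parameter.
        return default_k, accuracy(train, valid, default_k)
    # Full run: score every candidate and keep the most accurate one.
    scored = [(accuracy(train, valid, k), k) for k in candidates]
    best_acc, best_k = max(scored, key=lambda item: item[0])
    return best_k, best_acc


# One-feature toy data with two noisy labels (0.35 and 0.65).
train = [(0.1, "a"), (0.2, "a"), (0.3, "a"), (0.35, "b"),
         (0.65, "a"), (0.7, "b"), (0.8, "b"), (0.9, "b")]
valid = [(0.15, "a"), (0.34, "a"), (0.66, "b"), (0.85, "b")]

quick_k, quick_acc = fit(train, valid, quick_run=True)
tuned_k, tuned_acc = fit(train, valid, quick_run=False)
```

Because the default value is among the candidates, the tuned score on the validation data cannot be worse than the quick-run score, but the full run fits the model once per candidate.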
Select Problem
Advanced option. A drop-down menu that allows selecting the desired data mining problem.
Filter Columns Settings
Advanced option. Defines the composition of the list of predictors.
Columns
List of columns available for selection. Several columns can be chosen for filtering by clicking the '+' button in the brick settings, and the way they are processed can be specified:
Remove all except selected
A binary flag that determines how the selected columns are treated: when set, all columns except the selected ones are removed.
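The column filter can be pictured with the following minimal sketch. It assumes that when the flag is set only the selected columns are kept, and that when it is not set the selected columns are the ones dropped; the second behavior is an assumption inferred from the option name. The function `filter_columns` and the sample rows are hypothetical.

```python
def filter_columns(rows, selected, remove_all_except_selected):
    """Filter the columns of a list of row dictionaries.

    Assumed semantics: if the flag is True, keep only the selected
    columns; if False, drop the selected columns and keep the rest.
    """
    selected = set(selected)
    filtered = []
    for row in rows:
        if remove_all_except_selected:
            kept = {col: val for col, val in row.items() if col in selected}
        else:
            kept = {col: val for col, val in row.items() if col not in selected}
        filtered.append(kept)
    return filtered


rows = [{"age": 31, "city": "Kyiv", "income": 1200}]

only_selected = filter_columns(rows, ["age", "income"], True)
without_selected = filter_columns(rows, ["age", "income"], False)
```

In this sketch `only_selected` retains the `age` and `income` columns, while `without_selected` retains only `city`.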
Inputs
The brick takes a data set with a target column that meets the requirements of a supervised machine learning problem.