General Information

This component is a universal solution for supervised machine learning problems. Supervised machine learning can be defined as learning a function that maps the independent characteristics describing objects or phenomena to the expected outcomes (categories, values, etc.). The function is inferred from training examples: fitting the model means feeding the data into it and iteratively adjusting the model's weights until the model fits the data appropriately.

Supervised learning requires a training set that captures the relationship between the input parameters (features) and the desired output (target). The type of the target variable defines the type of the corresponding data mining problem (classification or regression):

Classification is the supervised learning problem in which the target variable is discrete. It consists of finding a function that recognizes an object's class based on its features.

Examples: email spam detection, image recognition, credit scoring, user compliance categorization.
Regression is the supervised learning problem in which the target variable is continuous. It is used to model the relationship between the dependent variable and the independent variables.

Examples: market trend prediction, time-series forecasting, risk assessment.
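
The distinction between a discrete and a continuous target can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name infer_problem_type, the max_classes threshold, and the heuristic itself are hypothetical and are not the brick's documented detection rule.

```python
from numbers import Real


def infer_problem_type(target_values, max_classes=20):
    """Guess the problem type from the values of the target column.

    Assumed heuristic (hypothetical, not the brick's documented rule):
    non-numeric or boolean targets, and numeric targets with only a few
    distinct whole-number values, are treated as discrete; everything
    else is treated as continuous.
    """
    values = [v for v in target_values if v is not None]
    if not values:
        raise ValueError("target column has no usable values")

    # bool is a subclass of int, so it is excluded from "numeric" explicitly.
    all_numeric = all(
        isinstance(v, Real) and not isinstance(v, bool) for v in values
    )
    if not all_numeric:
        return "classification"

    distinct = set(values)
    all_integral = all(float(v).is_integer() for v in distinct)
    if all_integral and len(distinct) <= max_classes:
        return "classification"
    return "regression"


print(infer_problem_type(["spam", "ham", "spam"]))
print(infer_problem_type([0, 1, 1, 0]))
print(infer_problem_type([12.5, 13.1, 9.75]))
```

A threshold such as max_classes is a judgment call: an integer-coded target with many distinct values (for example, a count) is usually better treated as continuous.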

The Predictive Model brick automatically detects the type of the supervised learning problem from the selected target variable and selects the input features that are appropriate for modeling. The brick settings have two modes:

Simple mode - the user defines only the target variable, and the rest is done automatically: the component determines the data mining problem, chooses the list of predictors, selects the appropriate model, and tunes it.
Advanced mode - in addition to defining the target variable, the user may select the type of data mining problem and compose the list of predictors.

Description

Brick Location

Bricks → Analytics → AutoML → Predictive Model

Bricks → Analytics → Data Mining / ML → Classification Models → Predictive Model

Bricks → Analytics → Data Mining / ML → Regression Models → Predictive Model

Brick Parameters

Target Variable

The column that the model should predict. This variable can be either continuous or discrete (categorical), which defines the type of the data mining problem (regression or classification, respectively).
Quick run

A binary flag that determines the model fitting scenario. If True, the model is fitted with the default parameters, without hyper-parameter tuning. If False, the hyper-parameters are tuned, sacrificing computational performance in favor of model precision.
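
The trade-off behind the Quick run flag can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not the brick's implementation: it uses a one-dimensional k-nearest-neighbours classifier, and the names fit, DEFAULT_K, and candidate_ks are hypothetical. With quick_run=True a single default value is evaluated; with quick_run=False every candidate is evaluated and the best one is kept, which costs more computation.

```python
DEFAULT_K = 3  # hypothetical default hyper-parameter value


def knn_predict(train, x, k):
    """Return the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    labels = [label for _, label in nearest]
    # sorted() makes tie-breaking deterministic (alphabetically first label).
    return max(sorted(set(labels)), key=labels.count)


def accuracy(train, valid, k):
    """Share of validation points whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


def fit(train, valid, quick_run=True, candidate_ks=(1, 3, 5, 7)):
    """Return (k, validation accuracy).

    quick_run=True  -> keep the default hyper-parameter (one evaluation).
    quick_run=False -> evaluate every candidate and keep the best one.
    """
    if quick_run:
        return DEFAULT_K, accuracy(train, valid, DEFAULT_K)
    scored = [(accuracy(train, valid, k), k) for k in candidate_ks]
    best_score, best_k = max(scored, key=lambda pair: pair[0])
    return best_k, best_score


train = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
         (6.0, "high"), (6.5, "high"), (7.0, "high")]
valid = [(1.2, "low"), (6.8, "high"), (3.9, "low"), (4.2, "high")]

print("quick run:", fit(train, valid, quick_run=True))
print("tuned run:", fit(train, valid, quick_run=False))
```

Because the default value is among the candidates, the tuned run can never score worse on the validation data than the quick run; it only takes longer.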
Select Problem

Advanced option. A drop-down menu for selecting the desired data mining problem.
Filter Columns Settings

Advanced option. Defines the composition of the predictor list.

Columns

List of columns available for selection. Several columns can be chosen for filtering by clicking the '+' button in the brick settings. The selected columns are processed in one of two ways:
- remove all selected columns from the dataset and use the remaining ones as predictors
- use only the selected columns as predictors
Remove all except selected

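A binary flag that determines how the selected columns are treated: when enabled, only the selected columns are kept as predictors; when disabled, the selected columns are removed and the remaining ones are used as predictors.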
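
The two column-filtering behaviours can be sketched as follows. The function select_predictors and its argument names are hypothetical, and the reading of the flag (True keeps only the selected columns, False drops them) is an assumption based on the parameter's name rather than a documented guarantee.

```python
def select_predictors(all_columns, target, selected, remove_all_except_selected):
    """Return the predictor columns implied by the filter settings.

    Assumption: the target column is never used as a predictor.
    """
    unknown = [c for c in selected if c not in all_columns]
    if unknown:
        raise KeyError(f"columns not in dataset: {unknown}")

    candidates = [c for c in all_columns if c != target]
    chosen = set(selected)
    if remove_all_except_selected:
        # Keep only the selected columns, preserving dataset order.
        return [c for c in candidates if c in chosen]
    # Drop the selected columns and keep the rest.
    return [c for c in candidates if c not in chosen]


columns = ["age", "income", "city", "churn"]
print(select_predictors(columns, "churn", ["city"], remove_all_except_selected=False))
print(select_predictors(columns, "churn", ["city"], remove_all_except_selected=True))
```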

Brick Inputs/Outputs

Inputs

The brick takes a data set with a target column that meets the requirements of a supervised machine learning problem.