Machine Learning key terms that you must know!
Features:
• Features are the fields used as input.
• A feature is one column of the data in your input set.
• For instance, if you're trying to predict the type of pet
someone will choose, your input features might include age, home region, family
income, etc.
• A feature is a property of your training data.
• A feature is an input that you feed to the model or system.
• In simple linear regression, the values of the x variable are the features.
Label:
• A label is the output you want your model to produce; in training data, it is the known answer for each example.
• A label is the thing we're predicting.
• For example, the value of the y variable in a simple linear regression model is the label.
• Suppose you give your model data such as a person’s age, height, and hair length, and the model predicts whether the person is male or female. Then male or female is the label.
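To make the two terms concrete, here is a minimal Python sketch that separates features from labels. The records and the field names (age, height_cm, hair_length_cm, sex) are invented for illustration; they only mirror the example above.

```python
# Hypothetical toy records; all values are made up for illustration.
records = [
    {"age": 25, "height_cm": 180, "hair_length_cm": 5, "sex": "male"},
    {"age": 31, "height_cm": 165, "hair_length_cm": 40, "sex": "female"},
    {"age": 47, "height_cm": 172, "hair_length_cm": 3, "sex": "male"},
]

feature_names = ["age", "height_cm", "hair_length_cm"]
label_name = "sex"

# X holds one row of feature values per person (the model's input).
X = [[row[name] for name in feature_names] for row in records]

# y holds the label for each person (the thing we want to predict).
y = [row[label_name] for row in records]

print(X)
print(y)
```

Each row of X lines up with one entry of y, which is the shape most ML libraries expect.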
Machine Learning Model:
• A model is the relationship between features and the label.
• An ML model is a mathematical model that generates predictions by finding patterns in your data.
• ML models generate predictions using the patterns extracted from the input data.
• A model represents what was learned by a machine learning algorithm.
• The model is the “thing” that is saved after running a machine learning algorithm on training data and represents the rules, numbers, and any other algorithm-specific data structures required to make predictions.
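As a rough illustration of a model being the saved "numbers", the sketch below fits a simple linear regression by least squares and stores the result as JSON. The function names (fit_line, predict) and the data are assumptions made for this example, not part of any particular library.

```python
import json

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # These two numbers are everything the algorithm learned.
    return {"slope": slope, "intercept": intercept}

def predict(model, x):
    """Apply the learned relationship to a feature value."""
    return model["slope"] * x + model["intercept"]

xs = [1.0, 2.0, 3.0, 4.0]   # features
ys = [3.0, 5.0, 7.0, 9.0]   # labels; this toy data follows y = 2x + 1

model = fit_line(xs, ys)

# Saving and reloading the model only requires its learned numbers.
saved = json.dumps(model)
restored = json.loads(saved)

print(predict(restored, 5.0))  # 11.0
```

The training data is no longer needed once the slope and intercept are saved; the restored dictionary is enough to make predictions.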
1. Data collection:
Data collection is the process of gathering and measuring information from many different sources, such as databases, files, and external repositories. It is a critical first step. Before starting the data collection process, it’s important to articulate the problem you want to solve with an ML model.
2. Data preparation:
Data preparation (also referred to as “data preprocessing”) is the process of transforming raw data so that data scientists and analysts can run it through machine learning algorithms to uncover insights or make predictions. It generally involves adding, deleting, or transforming training set data. Since the collected data may be in an undesired format, unorganized, or extremely large, further steps are needed to improve its quality. The three common steps for preprocessing data are
· formatting,
· cleaning,
· and sampling.
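The three preprocessing steps named in this section (formatting, cleaning, and sampling) can be sketched in plain Python as follows. The raw rows, the field names, and the helper format_row are hypothetical; real projects usually rely on a data library, but the idea is the same.

```python
import random

# Hypothetical raw rows as they might arrive from a file: everything is text.
raw_rows = [
    {"age": "34", "income": "52000"},
    {"age": " 29 ", "income": "61000"},
    {"age": "", "income": "48000"},      # missing age
    {"age": "41", "income": "n/a"},      # unparseable income
    {"age": "56", "income": "75000"},
    {"age": "23", "income": "39000"},
]

def format_row(row):
    """Formatting: convert text fields to numbers, or None if that fails."""
    out = {}
    for key, value in row.items():
        try:
            out[key] = float(value.strip())
        except ValueError:
            out[key] = None
    return out

formatted = [format_row(r) for r in raw_rows]

# Cleaning: drop rows that have any missing value.
cleaned = [r for r in formatted if all(v is not None for v in r.values())]

# Sampling: keep a random subset when the full data set is too large.
rng = random.Random(0)  # fixed seed so the sample is reproducible
sample = rng.sample(cleaned, k=2)

print(len(raw_rows), len(cleaned), len(sample))  # 6 4 2
```

Dropping incomplete rows is only one cleaning strategy; filling in missing values is another common choice.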
3. Choose an ML model:
Different ML models suit different purposes, so the right choice depends on your needs. Factors include the problem statement and the kind of output you want, the type and size of the data, the available computational time, and the number of features and observations in the data.
4. Train the model:
Training an ML model involves providing an ML algorithm (that is, the learning algorithm) with training data to learn from. Let's say that you want to train an ML model to predict whether an email is spam or not spam. You would provide the algorithm with training data that contains emails for which you know the target (that is, a label that tells whether an email is spam or not spam). Training on this data produces a model that attempts to predict whether a new email will be spam or not spam.
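The spam example can be sketched with a tiny multinomial naive Bayes classifier using add-one smoothing. This is one possible learning algorithm chosen for illustration, not the only way to build a spam filter, and the four training emails are invented.

```python
import math
from collections import Counter

def train(emails, labels):
    """Learn per-class word counts from labeled emails."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    class_counts = Counter(labels)
    for text, label in zip(emails, labels):
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["not spam"])
    return {"word_counts": word_counts, "class_counts": class_counts, "vocab": vocab}

def predict(model, text):
    """Return the class with the higher log-probability for this text."""
    total = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    scores = {}
    for label, counts in model["word_counts"].items():
        score = math.log(model["class_counts"][label] / total)  # class prior
        n_words = sum(counts.values())
        for word in text.lower().split():
            if word in model["vocab"]:  # words never seen in training are skipped
                # Add-one smoothing avoids zero probabilities.
                score += math.log((counts[word] + 1) / (n_words + vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training data: emails with known labels.
emails = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "lunch on monday with the team",
]
labels = ["spam", "spam", "not spam", "not spam"]

model = train(emails, labels)
print(predict(model, "claim your free prize"))    # spam
print(predict(model, "team meeting on monday"))   # not spam
```

Here train plays the role of the learning algorithm, and the dictionary it returns is the trained model.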
5. Evaluate the model:
Model evaluation is the process of assessing how well a model performs on test data, which consists of data points the model has not seen before. Two common evaluation methods in data science are Hold-Out and Cross-Validation. Both evaluate the model on data that was not used for training, so that a model that has overfit the training data is not mistaken for a good one.
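Here is a minimal sketch of both evaluation methods in plain Python. The toy data set, the threshold learner train_threshold, and the split helpers are all assumptions made for this example; libraries such as scikit-learn provide their own versions of these utilities.

```python
import random

def holdout_split(data, test_fraction=0.25, seed=0):
    """Hold-out: shuffle once and set aside a fraction as the test set."""
    items = list(data)
    random.Random(seed).shuffle(items)
    n_test = max(1, int(len(items) * test_fraction))
    return items[n_test:], items[:n_test]  # (train, test)

def k_fold_splits(data, k=4, seed=0):
    """Cross-validation: each item appears in exactly one test fold."""
    items = list(data)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]

def train_threshold(train):
    """Toy learner: put a threshold halfway between the two class means."""
    pos = [x for x, y in train if y]
    neg = [x for x, y in train if not y]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x >= threshold

def accuracy(model, examples):
    """Fraction of examples whose prediction matches the label."""
    return sum(1 for x, y in examples if model(x) == y) / len(examples)

# Hypothetical data: the label is True when the feature is at least 5.
data = [(x, x >= 5) for x in range(12)]

train, test = holdout_split(data)
holdout_acc = accuracy(train_threshold(train), test)

fold_accs = [accuracy(train_threshold(tr), te) for tr, te in k_fold_splits(data, k=4)]
cv_acc = sum(fold_accs) / len(fold_accs)

print(holdout_acc, cv_acc)
```

Hold-out gives a single score from one split, while cross-validation averages over several splits and so depends less on which points happened to land in the test set.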
6. Parameter tuning:
Each model has its own set of hyperparameters, settings chosen before training rather than learned from the data, that need to be tuned to get the best output. For every model, our goal is to minimize the error, that is, to make predictions as close as possible to the actual values. This is the main objective of hyperparameter tuning. Three common approaches to hyperparameter tuning are:
• Manual Search
• Random Search
• Grid Search
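Grid search and random search can be sketched as below. The function validation_error is a hypothetical stand-in for "train a model with these settings and measure its error on validation data", and the hyperparameter names learning_rate and depth are assumptions chosen for the example.

```python
import itertools
import random

def validation_error(learning_rate, depth):
    """Hypothetical stand-in for training a model and measuring validation error."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 4) ** 2

def grid_search(grid):
    """Try every combination in the grid; return (best_params, best_error)."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        err = validation_error(**params)
        if best is None or err < best[1]:
            best = (params, err)
    return best

def random_search(n_trials, seed=0):
    """Try randomly drawn settings; return (best_params, best_error)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.001, 1.0),
            "depth": rng.randint(1, 10),
        }
        err = validation_error(**params)
        if best is None or err < best[1]:
            best = (params, err)
    return best

grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}
grid_params, grid_err = grid_search(grid)
rand_params, rand_err = random_search(n_trials=20)

print(grid_params)  # {'learning_rate': 0.1, 'depth': 4}
print(rand_params)
```

Grid search is exhaustive over the values you list, so its cost grows quickly with the number of hyperparameters; random search trades that guarantee for a fixed budget of trials. Manual search is the same loop with a person choosing each setting.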
7. Make predictions:
“Prediction” refers to the output of an algorithm after it has been trained on a historical dataset. Machine learning has two main goals: prediction and inference. After you have a model, you can use it to generate predictions, which means giving the model inputs it has never seen before and obtaining the answer it predicts. In addition to making predictions on new data, you can use machine-learning models to better understand the relationships between the input features and the output target, which is known as inference.
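The difference between prediction and inference can be shown with one small fitted model. The study-hours data below is invented, and fit_line is a hand-written least-squares helper assumed for this sketch.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical historical data: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 61, 63, 70]

slope, intercept = fit_line(hours, scores)

# Prediction: ask the model about an input it has never seen.
predicted_score = slope * 6 + intercept
print(f"Predicted score for 6 hours: {predicted_score}")

# Inference: read the learned relationship itself.
print(f"Each extra hour is associated with about {slope} more points")
```

The same fitted numbers serve both goals: plugging in a new input is prediction, while interpreting the slope is inference.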