Machine Learning Fundamentals in Azure

Exam weight: 15–20%
Study roadmap for this topic:
ML models: features and labels, training, and validation
ML scenarios: regression, classification, and clustering
ML capabilities: automation, data/compute services, model management, and deployment
You’ve probably heard about Machine Learning before. In simple terms, it’s the process of teaching machines to learn patterns from data. The goal is to use data to build predictive models that can be embedded into real-world applications.
Today, data scientists and software developers work together to create predictive models that estimate unknown values. For example:
An ice cream shop owner can use an application that combines historical sales data with weather records to predict how many ice creams they are likely to sell on a given day based on the weather forecast.
A doctor can use clinical data from previous patients to run automated tests that predict whether a new patient is at risk of diabetes, based on factors like weight, blood glucose level, and other measurements.
A researcher in Antarctica can use past observations to automatically identify different penguin species (such as Adelie, Gentoo, or Chinstrap) based on flipper measurements, beak size, and other physical attributes.
Machine Learning (ML) Models
A machine learning model is a software application that contains a mathematical function capable of producing an output value based on input values.
The process of defining how this function calculates results is called training.
Once trained, the model can be used to predict new values — this process is known as inference.
To train a model, we need historical observations so it can learn how to predict future outcomes. In mathematical terms:
X = features (inputs)
Y = labels (output)
In most real scenarios, each observation has several features. Therefore, X is usually a vector (an array of multiple values) like [x1, x2, x3, …], and Y represents the desired result.
Examples:
Ice cream sales scenario:
The goal is to train a model that predicts ice cream sales based on weather conditions. Weather measurements for the day (temperature, rainfall, wind speed, and so on) are the features (x), and the number of ice creams sold is the label (y).
Features (x) = weather
Label (y) = number of ice creams sold
Medical scenario:
The goal is to predict whether a patient is at risk of diabetes based on clinical measurements. Patient data (weight, blood glucose level, etc.) are the features (x), and the diabetes risk (for example, 1 for at risk, 0 for not at risk) is the label (y).
Features (x) = clinical measurements
Label (y) = diabetes risk
Antarctic research scenario:
The goal is to predict the species of a penguin based on its physical attributes. Measurements such as flipper length and beak width are the features (x), and the species (for example, 0 for Adelie, 1 for Gentoo, or 2 for Chinstrap) is the label (y).
Features (x) = physical measurements
Label (y) = species
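To make the idea of features and labels concrete, here is a minimal sketch of how the ice cream scenario might look as data. The numbers and the variable names (observations, X, Y) are invented for illustration and do not come from any real dataset.

```python
# Hypothetical observations for the ice cream scenario (values invented for illustration).
# Each row pairs a feature vector x = [temperature_c, rainfall_mm, wind_kmh]
# with a label y = number of ice creams sold that day.
observations = [
    ([30.0, 0.0, 10.0], 120),
    ([22.0, 2.5, 18.0], 65),
    ([15.0, 8.0, 25.0], 20),
]

# Separate the features (X) from the labels (Y), as a training algorithm expects.
X = [features for features, _ in observations]
Y = [label for _, label in observations]

print("X =", X)
print("Y =", Y)
```

Each entry in X is one feature vector, and the entry at the same position in Y is its label.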
Simplified steps of training and inference
Obtain historical data that will be used as input for training
Prepare and transform the data
Train the model using an appropriate algorithm
Use the trained model to make predictions (inference)
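The four steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sales history is invented, a single feature (temperature) is used, and ordinary least squares stands in for "an appropriate algorithm". The function names train and predict are hypothetical, not part of any Azure service.

```python
# Step 1: obtain historical data (hypothetical values, invented for illustration).
history = [
    {"temperature_c": 18, "sales": 40},
    {"temperature_c": 22, "sales": 60},
    {"temperature_c": 26, "sales": 80},
    {"temperature_c": 30, "sales": 100},
]

# Step 2: prepare the data by separating features (x) from labels (y).
xs = [float(row["temperature_c"]) for row in history]
ys = [float(row["sales"]) for row in history]

# Step 3: train. Ordinary least squares finds the slope and intercept
# of the line y = slope * x + intercept that best fits the data.
def train(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = train(xs, ys)

# Step 4: inference. Apply the trained function to a new feature value.
def predict(temperature_c):
    return slope * temperature_c + intercept

print(f"Predicted sales at 24 degrees: {predict(24):.0f}")
```

With this invented data the fitted line is sales = 5 * temperature - 50, so the prediction for 24 degrees is 70.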
Types of Machine Learning Models
There are several types of learning, and it’s important to choose the one that best fits your purpose.
Supervised Learning
Supervised learning is when we train a model using historical data that includes both features (x) and labels (y).
Regression
Regression is a machine learning approach where the label is a numeric value.
Examples:
Predicting the number of ice creams sold based on temperature, rainfall, and wind speed
Predicting the selling price of a property based on its size, number of rooms, and socioeconomic metrics
Predicting fuel efficiency (miles per gallon) based on engine size, weight, width, height, and length
Learn more about regression here.
Classification
Classification is a machine learning approach where the label represents a category or class, which can be binary or multiclass.
Binary Classification
Binary classification predicts one of two mutually exclusive outcomes, such as true or false (often encoded as 1 or 0).
Examples:
Predicting whether a patient is at risk of diabetes based on clinical metrics
Predicting whether a bank customer will default on a loan based on income, credit history, and age
Predicting whether a customer will respond positively to a marketing offer
Learn more about binary classification here.
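As a rough illustration of binary classification, the sketch below learns a single cut-off value on one feature and predicts 1 or 0 depending on which side of the cut-off a new value falls. This is a deliberately simple stand-in, not the algorithm a real service would use, and the patient values and function names are invented for the example.

```python
# Hypothetical training data: (blood glucose in mg/dL, label), where
# label 1 = at risk of diabetes and 0 = not at risk. Values are invented.
patients = [(82, 0), (90, 0), (101, 0), (118, 1), (125, 1), (140, 1)]

def train_threshold(data):
    """Pick the glucose threshold that misclassifies the fewest training rows."""
    values = sorted(x for x, _ in data)
    # Candidate thresholds: midpoints between neighbouring feature values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum((1 if x > threshold else 0) != y for x, y in data)

    return min(candidates, key=errors)

threshold = train_threshold(patients)

def predict(glucose):
    """Return 1 (at risk) or 0 (not at risk)."""
    return 1 if glucose > threshold else 0

print(threshold, predict(95), predict(130))
```

Whatever the algorithm, the defining property is the same: the label has exactly two possible values.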
Multiclass Classification
In multiclass classification, the output is one of three or more possible classes, and each observation is assigned to exactly one of them.
Examples:
Predicting the species of a penguin (Adelie, Gentoo, or Chinstrap)
Predicting the genre of a movie (comedy, horror, romance, adventure, or science fiction)
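The penguin example can be sketched with a nearest-centroid classifier: training computes the average measurements of each species, and prediction picks the species whose average is closest. This is one simple technique chosen for illustration. The measurements are invented, and the class encoding (0 = Adelie, 1 = Gentoo, 2 = Chinstrap) follows the one used earlier in these notes.

```python
import math

# Hypothetical training data: ([flipper length mm, beak length mm], class label).
penguins = [
    ([181.0, 39.0], 0), ([190.0, 38.0], 0),
    ([217.0, 47.0], 1), ([221.0, 48.0], 1),
    ([196.0, 49.0], 2), ([198.0, 51.0], 2),
]
SPECIES = {0: "Adelie", 1: "Gentoo", 2: "Chinstrap"}

def train_centroids(data):
    """Compute the mean feature vector (centroid) of each class."""
    grouped = {}
    for features, label in data:
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

centroids = train_centroids(penguins)

def predict(features):
    """Return the class whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

print(SPECIES[predict([219.0, 46.0])])
```

Unlike the binary case, the prediction here is one label out of three, but each penguin still receives exactly one class.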
Unsupervised Learning
Unsupervised learning involves training models using data that contains only features, with no known labels. The algorithm finds patterns on its own.
Clustering
The most common use of unsupervised learning is clustering, which identifies similarities between observations and groups them accordingly.
Examples:
Grouping similar flowers based on size, number of leaves, and number of petals
Identifying groups of similar customers based on demographic attributes and purchasing behavior
In clustering, there are no predefined labels. The algorithm groups data purely based on feature similarity.
In some cases, clustering is used first to identify groups, and then classification is applied later using those group labels.
What is clustering, in simple terms?
Imagine a big box full of mixed toys: cars, balls, dolls, building blocks, and more.
Clustering is the process of organizing these toys into groups based on similarities — without anyone telling you how to do it.
The initial mess
You observe all the items and start noticing patterns: balls are round, cars have wheels, dolls have human shapes.
Creating the groups (clusters)
Group 1: All balls
Group 2: All toy cars
Group 3: All dolls
The result
Each cluster contains items that are very similar to each other and different from items in other clusters.
Learn more about clustering here.
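One common clustering algorithm is k-means, sketched below in plain Python. Treat it as a simplified illustration: the flower measurements are invented, the number of clusters is chosen by hand, and the starting centroids are simply the first k points (an assumption made so the result is reproducible; real implementations usually pick them at random).

```python
import math

def k_means(points, k, iterations=10):
    """Group points into k clusters using the basic k-means procedure."""
    # Simplification: start from the first k points instead of random picks.
    centroids = [list(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Hypothetical unlabeled data: [number of petals, number of leaves] per flower.
flowers = [[5, 2], [21, 9], [6, 3], [20, 8], [4, 2], [22, 10]]
assignments, centroids = k_means(flowers, k=2)
print(assignments)
```

Note that no labels are passed in. The cluster numbers in the output are arbitrary identifiers that the algorithm assigns based only on feature similarity.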
Deep Learning
Deep learning is an advanced form of machine learning that attempts to emulate how the human brain learns.
It is based on artificial neural networks that simulate the electrochemical activity of biological neurons using mathematical functions.
Deep neural networks, composed of multiple layers, are used for many machine learning problems, including regression, classification, and natural language processing.
Training involves iteratively feeding data, calculating outputs, validating the model, and adjusting weights to minimize loss.
Learn more here.
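The iterative loop described above (feed data, calculate outputs, adjust weights to reduce loss) can be sketched with a single artificial neuron trained by gradient descent. This is far from a deep network and it skips the validation step; the data, learning rate, and number of epochs are assumptions chosen only to keep the example small.

```python
# Hypothetical training data for the relationship y = 2x + 1 (invented values).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

weight, bias = 0.0, 0.0      # parameters the training loop will adjust
learning_rate = 0.05

def mean_squared_loss(w, b):
    """Average squared difference between predictions and labels."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

initial_loss = mean_squared_loss(weight, bias)

for epoch in range(2000):
    # Feed the data forward and accumulate the gradient of the loss.
    grad_w = grad_b = 0.0
    for x, y in data:
        error = (weight * x + bias) - y
        grad_w += 2 * error * x / len(data)
        grad_b += 2 * error / len(data)
    # Adjust the parameters a small step in the direction that lowers the loss.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

final_loss = mean_squared_loss(weight, bias)
print(f"loss: {initial_loss:.3f} -> {final_loss:.6f}, weight={weight:.2f}, bias={bias:.2f}")
```

A deep neural network repeats the same idea across many neurons and layers, with the gradients computed layer by layer.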
Introduction to Machine Learning in Azure
We constantly generate historical data through games, shopping apps, social networks, and more. These datasets enable modern AI applications to generate insights and solve real-world problems.
Below are the main steps that define a machine learning workflow:
Define the problem
Obtain and prepare the data
Train the model
Deploy and integrate the model
1. Defining the problem
At this stage, you define what problem will be solved, what the model’s output should be, which ML task will be used, and how success will be measured.
Common task types include:
Classification
Regression
Time series forecasting
Computer vision
Natural Language Processing (NLP)
Understanding the data and the problem before training ensures better results.
2. Obtaining and preparing the data
Data can come from many sources: CRMs, databases, images, and more.
Azure provides services such as:
Azure Synapse Analytics
Azure Databricks
Azure Machine Learning
These services help extract, transform, and prepare data for training.
3. Training the model
The service you choose depends on:
The type of model
The level of control you need
The time you can invest
Existing tools in your organization
Your programming language preference
Azure options include:
Azure Machine Learning
Azure Databricks
Microsoft Fabric
Azure AI Services
4. Deploying and integrating the model
Once the model is ready, it must be deployed to an endpoint for real-time or batch predictions.
Real-time predictions: immediate responses, such as product recommendations on a website
Batch predictions: executed at specific times, such as weekly sales forecasts
Compute options
Smaller tabular datasets usually work well with CPUs.
Larger datasets or unstructured data (images, text) benefit from GPUs, which handle the highly parallel calculations involved much faster than CPUs.
Consider compute costs
Always consider infrastructure costs when deploying models.
Real-time endpoints need compute that stays active continuously, so they incur cost around the clock, while batch jobs only use (and are billed for) compute while they are running.
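A quick back-of-the-envelope comparison shows why this matters. The hourly rate and job duration below are hypothetical numbers chosen only to show the arithmetic; they are not actual Azure prices.

```python
# All rates and durations below are hypothetical, chosen only to show the arithmetic.
HOURLY_RATE_USD = 0.50          # assumed price of one compute instance per hour
HOURS_PER_WEEK = 24 * 7

# A real-time endpoint keeps its compute running around the clock.
real_time_weekly_cost = HOURLY_RATE_USD * HOURS_PER_WEEK

# A batch job only uses compute while it runs, e.g. one 2-hour run per week.
BATCH_RUN_HOURS = 2
batch_weekly_cost = HOURLY_RATE_USD * BATCH_RUN_HOURS

print(f"Real-time: ${real_time_weekly_cost:.2f} per week")
print(f"Batch:     ${batch_weekly_cost:.2f} per week")
```

Under these assumptions the always-on endpoint costs 84 dollars per week against 1 dollar for the weekly batch run, so real-time deployment is worth it only when immediate responses are actually needed.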
Exercise and module links
The documentation includes an exercise to help you reinforce this; just follow the instructions here.
The module link is here.



