Natural Language Processing (NLP) in Azure


Exam weight: 15–20%

Study roadmap for this topic:

  • Entity recognition

  • Language detection

  • Sentiment analysis

  • Key phrase extraction

  • Speech recognition and synthesis

  • Language models

Natural Language Processing (NLP)

Natural Language Processing (NLP) is a field of artificial intelligence focused on enabling computers to understand, interpret, and respond to human language.
In this chapter, we take a deeper look at what NLP models are capable of when it comes to understanding human communication.

The Azure AI Language text analysis features include:

  • Named Entity Recognition (NER), which identifies people, locations, events, and more

  • Entity linking, which identifies known entities and returns references (for example, Wikipedia links)

  • PII detection (Personally Identifiable Information)

  • Language detection, returning the language name, ISO code, and confidence score

  • Sentiment analysis and opinion mining

  • Automatic text summarization

  • Key phrase extraction from unstructured text

Entity recognition and linking

This service identifies elements in text that refer to people, places, ages, dates, time, duration, and more.

Examples:

  • “Bill Gates”, “Roberto Carlos”, “Maria Bethânia” — Person type

  • “Paris”, “New York”, “Belo Horizonte” — Location type

  • “6” or “six” — Quantity type, numeric subtype

Language detection

Azure’s language service can detect more than 150 languages.
For each document analyzed, it returns:

  • The language name (e.g., “English”)

  • The ISO 639-1 language code (e.g., “en”)

  • A confidence score

Consider a scenario where restaurant customers leave reviews about the service.
Suppose the following customer reviews:

Review 1:

“Um lugar fantástico para o almoço. A sopa estava deliciosa.”

Review 2:

“Comida maravillosa y gran servicio.”

Review 3:

“O croque monsieur avec frites foi fantástico. Bon appetit!”

After analysis, the service returns the following results:

Document    Language      ISO code    Score
Review 1    Portuguese    pt          1.0
Review 2    Spanish       es          1.0
Review 3    Portuguese    pt          0.9

Note that Review 3 mixes Portuguese and French. The service returns only the predominant language, and the mixed content shows up as a confidence score below 1.0.

Sentiment analysis and opinion mining

Azure AI Language text analysis features can evaluate text and return sentiment labels and scores for each sentence.
This functionality is widely used to detect positive and negative sentiment in social media, customer reviews, discussion forums, and more.

The service returns sentiment scores across three categories:

  • Positive

  • Neutral

  • Negative

Each category receives a score between 0 and 1.

Example: restaurant reviews

Review 1:

“We dined at this restaurant last night and the first thing I noticed was how polite the staff was. We were greeted warmly and taken to our table immediately. The table was clean, the chairs were comfortable, and the food was amazing.”

  • Positive score: 0.90

  • Neutral score: 0.10

  • Negative score: 0.00

Review 2:

“Our dining experience at this restaurant was one of the worst I’ve ever had. The service was slow, and the food was terrible. I will never eat at this establishment again.”

  • Positive score: 0.00

  • Neutral score: 0.00

  • Negative score: 0.99

Key phrase extraction

The key phrase extraction service identifies the main points of a text — similar to asking an AI to summarize content into core topics.

For example, consider the following customer review:

“We had dinner here for a birthday celebration and had a fantastic experience. We were welcomed by a friendly hostess and taken to our table immediately. The atmosphere was relaxed, the food was amazing, and the service was outstanding. If you enjoy great food and attentive service, you should try this place.”

Key phrase extraction may return:

  • birthday celebration

  • fantastic experience

  • friendly hostess

  • great food

  • attentive service

  • dinner

  • table

  • atmosphere

  • place

Speech recognition and synthesis

Conversational AI enables dialogue between humans and AI systems.
This service makes it possible for chatbots to significantly reduce the human workload required to answer customer questions.

Question-answering features can respond instantly, address concerns accurately, and interact naturally with users across multi-turn conversations.

Azure AI Language supports CLU (Conversational Language Understanding), allowing applications to understand commands, questions, and user intent.
When combined with the Speech Service, it becomes possible to interpret voice commands such as:

  • “Turn on the lights”

  • “Turn on the TV”

How NLP interprets language in a human-like way

Previously, we explored how Large Language Models (LLMs) understand vocabulary semantically — through speech, text, data extraction, and more.
Now, let’s dive deeper into how NLP systems interpret language in a way similar to humans.

NLP is used for:

  • Speech-to-text and text-to-speech conversion

  • Automatic translation

  • Text classification

  • Entity extraction

  • Question answering

  • Text summarization

How language is processed

For AI to work semantically with language and understand intent, the source material is first gathered into a body of text called a corpus.
From this corpus, the system can infer what a document is about.

The first step is tokenization, where the corpus is divided into tokens.
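For example, take the Portuguese sentence below (in English: “the school is a few meters away”):

“a escola está a alguns metros”

Each distinct word becomes a token with its own numeric ID:

  1. a

  2. escola

  3. está

  4. alguns

  5. metros

Note that token 1 (“a”) appears twice in the sentence, so the sentence can be represented as the following sequence of token IDs:
{1, 2, 3, 1, 4, 5}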
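
To make the idea concrete, here is a minimal Python sketch of tokenization. It assumes the simplest possible rule (lowercase the text and split on whitespace) and assigns IDs in order of first appearance. The function names are made up for this illustration, and real tokenizers handle punctuation, subwords, and much more.

```python
def tokenize(text):
    """Split text into lowercase word tokens (whitespace only, no punctuation handling)."""
    return text.lower().split()


def encode(tokens):
    """Give each distinct token an ID (starting at 1) and return the ID sequence."""
    vocab = {}
    ids = []
    for token in tokens:
        if token not in vocab:
            # First time we see this token: assign the next free ID.
            vocab[token] = len(vocab) + 1
        ids.append(vocab[token])
    return vocab, ids


tokens = tokenize("a escola está a alguns metros")
vocab, ids = encode(tokens)
print(vocab)  # {'a': 1, 'escola': 2, 'está': 3, 'alguns': 4, 'metros': 5}
print(ids)    # [1, 2, 3, 1, 4, 5]
```

The repeated word “a” maps to the same ID both times, which is exactly what the sequence above shows.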

Statistical techniques for NLP

Two statistical techniques laid much of the groundwork for NLP:

  • Naïve Bayes

  • Term Frequency–Inverse Document Frequency (TF-IDF)

Naïve Bayes

Naïve Bayes was originally used to filter emails and distinguish between spam and non-spam messages.
Using probability, the technique estimates which category tokens are most likely associated with, based on previously labeled data.

For example, an email containing words like promotion and discount has a high probability of being spam.
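
The spam example can be sketched in a few lines of Python. This is a toy multinomial Naïve Bayes classifier with add-one (Laplace) smoothing, written from scratch. The four training emails, the labels, and the function names are invented for illustration; a real filter would learn from thousands of labeled messages.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict_naive_bayes(model, text):
    """Return a dict of label -> probability for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label in label_counts:
        # Start from the prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing avoids zero probabilities for unseen (word, label) pairs.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        log_scores[label] = score
    # Convert log scores back into probabilities that sum to 1.
    top = max(log_scores.values())
    exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {label: value / norm for label, value in exp_scores.items()}


# Hypothetical labeled emails, made up for this example.
training_emails = [
    ("big promotion discount now", "spam"),
    ("discount promotion for you", "spam"),
    ("meeting notes for monday", "ham"),
    ("lunch with the team monday", "ham"),
]
model = train_naive_bayes(training_emails)
probs = predict_naive_bayes(model, "promotion discount today")
print(max(probs, key=probs.get), probs)
```

With this tiny dataset, an email mentioning promotion and discount comes out as spam, because those words only ever appeared in the spam examples.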

TF-IDF (Term Frequency – Inverse Document Frequency)

TF-IDF is also based on word frequencies: it compares how often a word appears in one document with how often it appears across the entire corpus.

Its purpose is to measure how important a word is within a specific document relative to a collection of documents.
Words that appear frequently in a particular document but rarely elsewhere are considered more relevant.
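
Here is a small sketch of that calculation in Python. It assumes one common formulation: term frequency is the word's share of the document, and inverse document frequency is the natural log of (number of documents / number of documents containing the word). Other variants exist, and the three short reviews are invented for the example.

```python
import math
from collections import Counter


def tf_idf(term, doc_tokens, corpus_tokens):
    """Score how important `term` is to one document relative to the whole corpus."""
    # Term frequency: share of the document's tokens that are this term.
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    # Document frequency: how many documents contain the term at all.
    df = sum(1 for tokens in corpus_tokens if term in tokens)
    if df == 0:
        return 0.0
    # Inverse document frequency: rarer terms get a larger weight.
    idf = math.log(len(corpus_tokens) / df)
    return tf * idf


# Hypothetical mini-corpus of three reviews.
corpus = [
    "the soup was delicious",
    "the service was slow",
    "the pizza was cold",
]
tokenized = [doc.lower().split() for doc in corpus]

for term in ("the", "soup"):
    print(term, round(tf_idf(term, tokenized[0], tokenized), 3))
# the 0.0
# soup 0.275
```

“the” appears in every review, so its score is zero, while “soup” appears only in the first review and stands out as relevant to it.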

Semantic language models

As NLP evolved, powerful deep learning models emerged.
At the core of these models is the encoding of tokens as vectors, enabling AI to understand the meaning of words rather than just recognizing letters or phrases.

While a simple model only “sees” text, a semantic model attempts to understand context.

For example, the sentences:

  • “The dog ran after the ball”

  • “The dog chased the ball”

Have nearly the same meaning. A semantic model can recognize this similarity even though the words differ.

Each word or phrase is converted into a list of numbers (called a vector) that represents its meaning.
These vectors encode semantic direction and distance: similar concepts point in similar directions.

Examples:

  • “King” and “Queen” are close

  • “King” and “Banana” are far apart

Consider the following vectors:

  • 4 (“dog”): [10, 3, 2]

  • 8 (“cat”): [10, 3, 1]

  • 9 (“puppy”): [5, 2, 1]

  • 10 (“skateboard”): [-3, 3, 2]

In a three-dimensional space, dog and puppy follow a similar direction, cat is close, while skateboard points in a completely different semantic direction.
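
One common way to compare the direction of two vectors is cosine similarity: values near 1 mean the vectors point the same way, and values near 0 or below mean they do not. The sketch below applies it to the four example vectors above. Choosing cosine similarity is an assumption made for this illustration, since the text only speaks of direction in general terms.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# The example vectors from the text.
embeddings = {
    "dog": [10, 3, 2],
    "cat": [10, 3, 1],
    "puppy": [5, 2, 1],
    "skateboard": [-3, 3, 2],
}

for word in ("puppy", "cat", "skateboard"):
    score = cosine_similarity(embeddings["dog"], embeddings[word])
    print(f"dog vs {word}: {score:.3f}")
```

Running it gives scores very close to 1 for puppy and cat, and a negative score for skateboard, which matches the intuition described above.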

Machine learning for text classification

Another useful text analysis technique is training classification algorithms, such as logistic regression.

These models are commonly used to classify text as positive or negative, enabling sentiment analysis and opinion mining.

Using these models, AI can generate reports — for example, customer satisfaction reports based on online reviews and social media posts.
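
As a rough illustration of the idea, the sketch below trains a tiny logistic regression classifier from scratch on four made-up reviews, using word counts as features and plain batch gradient descent. The data, function names, learning rate, and number of epochs are all assumptions chosen for the example. In practice you would use a library and far more training data.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def featurize(text, vocab):
    """Turn text into a bag-of-words count vector over the known vocabulary."""
    vec = [0.0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec


def train_logistic_regression(examples, epochs=200, lr=0.5):
    """Fit weights with batch gradient descent on (text, label) pairs, label 1 = positive."""
    vocab = {}
    for text, _ in examples:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    features = [featurize(text, vocab) for text, _ in examples]
    labels = [label for _, label in examples]
    weights = [0.0] * len(vocab)
    bias = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = [0.0] * len(vocab)
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xj for w, xj in zip(weights, x)))
            error = p - y  # gradient of the log loss with respect to the raw score
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return vocab, weights, bias


def predict_positive(model, text):
    """Return the estimated probability that the text is positive."""
    vocab, weights, bias = model
    x = featurize(text, vocab)
    return sigmoid(bias + sum(w * xj for w, xj in zip(weights, x)))


# Hypothetical labeled reviews (1 = positive, 0 = negative).
reviews = [
    ("the food was amazing", 1),
    ("great service and amazing food", 1),
    ("the food was terrible", 0),
    ("slow service and terrible food", 0),
]
model = train_logistic_regression(reviews)
print(predict_positive(model, "amazing service"))    # above 0.5
print(predict_positive(model, "terrible and slow"))  # below 0.5
```

The model ends up giving positive weight to words like “amazing” and negative weight to words like “terrible”, while words that appear equally in both classes, such as “food”, carry almost no weight.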

Azure AI Foundry portal

The Azure AI Foundry portal allows developers to integrate language capabilities into applications.
To use these features, you must provision the appropriate resource in your Azure subscription: an Azure AI Language resource, an Azure AI Translator resource, or a multi-service Azure AI services resource.

Azure AI Foundry provides a unified platform for AI operations, model building, and application development, allowing you to experiment with services in a sandbox environment.
The portal also helps organize projects and resources efficiently.

🔗 Azure AI Foundry portal:
https://ai.azure.com/

Useful links

https://learn.microsoft.com/training/modules/get-started-language-azure/
https://learn.microsoft.com/training/modules/introduction-language/

