DP-203: Data Engineering on Microsoft Azure

€295.00

________________________________________________________________

Do you want to take this course remotely or in person?

Contact us by email: info@nanforiberica.com, phone: +34 91 031 66 78, WhatsApp: +34 685 60 05 91, or contact our offices.

________________________________________________________________

Microsoft will retire the DP-203: Data Engineering on Microsoft Azure course on December 31, 2025. Please note that the certification was retired on March 31, 2025. It will be replaced by the DP-700: Microsoft Fabric Data Engineer course.

DP-203 Course: Data Engineering on Microsoft Azure

In this course, students learn data engineering for batch and real-time analytics solutions built on Azure data platform technologies. They begin with the core compute and storage technologies used to build an analytics solution and learn how to interactively explore data stored in files in a data lake. They then cover the ingestion techniques available for loading data: the Apache Spark capabilities included in Azure Synapse Analytics or Azure Databricks, and Azure Data Factory or Azure Synapse pipelines. Students also learn how to transform data using the same technologies used for ingestion, and why security must be implemented so that data is protected both at rest and in transit. Finally, they learn how to build a real-time analytics system.

Intended audience

The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies in Microsoft Azure. The secondary audience for this course includes data analysts and data scientists who work with Microsoft Azure-based analytical solutions.

Elements of the DP-203 training

Introduction to data engineering in Azure (3 units)
Creating data analytics solutions with Azure Synapse serverless SQL pools (4 units)
Performing data engineering tasks with Apache Spark pools in Azure Synapse (3 units)
Data transfer and transformation using Azure Synapse Analytics pipelines (2 units)
Implementing a data analytics solution with Azure Synapse Analytics (6 units)
Working with data warehouses using Azure Synapse Analytics (4 units)
Using hybrid transactional and analytical processing solutions with Azure Synapse Analytics (3 units)
Implementing a data streaming solution with Azure Stream Analytics (3 units)
Implementing a data lakehouse analytics solution with Azure Databricks (6 units)

DP-203 course content

Module 1: Exploring the processing and storage options for data engineering workloads

This module provides an overview of the Azure compute and storage technology options available to data engineers building analytical workloads. It teaches how to structure a data lake and optimize files for exploration, streaming, and batch workloads. Students will learn to organize the data lake into levels of data refinement as they transform files through batch and stream processing. They will then learn how to create indexes on datasets such as CSV, JSON, and Parquet files, and use them to potentially accelerate queries and workloads.

Lessons

Introduction to Azure Synapse Analytics
Description of Azure Databricks
Introduction to Azure Data Lake Storage
Description of the Delta Lake architecture
Working with data streams using Azure Stream Analytics

Lab: Exploring processing and storage options for data engineering workloads

Combine batch and stream processing in the same pipeline
Organize the data lake into file transformation tiers
Index the data lake storage to accelerate queries and workloads

After completing this module, students will be able to do the following:

Describe Azure Synapse Analytics
Describe Azure Databricks
Describe Azure Data Lake Storage
Describe the architecture of Delta Lake
Describe Azure Stream Analytics

Module 2: Running interactive queries with serverless SQL pools in Azure Synapse Analytics

In this module, students will learn to work with files stored in the data lake and in external file sources using T-SQL statements executed by a serverless SQL pool in Azure Synapse Analytics. They will query Parquet files stored in a data lake, as well as CSV files stored in an external data store. They will then create Azure Active Directory security groups and control access to the data lake files using role-based access control (RBAC) and access control lists (ACLs).

Lessons

Exploring the capabilities of Azure Synapse serverless SQL pools
Querying data in the lake using Azure Synapse serverless SQL pools
Creating metadata objects in Azure Synapse serverless SQL pools
Data protection and user management in Azure Synapse serverless SQL pools

Lab: Running interactive queries with serverless SQL pools

Query Parquet data with serverless SQL pools
Create external tables for Parquet and CSV files
Create views with serverless SQL pools
Secure access to data in a data lake when using serverless SQL pools
Configure data lake security through role-based access control (RBAC) and access control lists (ACLs)

After completing this module, students will be able to do the following:

Describe the capabilities of Azure Synapse serverless SQL pools
Query data in the lake using Azure Synapse serverless SQL pools
Create metadata objects in Azure Synapse serverless SQL pools
Protect data and manage users in Azure Synapse serverless SQL pools

Module 3: Data Exploration and Transformation in Azure Databricks

This module teaches students how to use Apache Spark DataFrame methods to explore and transform data in Azure Databricks. They will start with standard DataFrame methods and then move on to more advanced tasks, such as removing duplicate data, manipulating date and time values, renaming columns, and aggregating data.

Lessons

Description of Azure Databricks
Reading and writing data in Azure Databricks
Working with DataFrames in Azure Databricks
Working with advanced DataFrame methods in Azure Databricks

Lab: Performing data explorations and transformations in Azure Databricks

Use DataFrames in Azure Databricks to explore and filter data
Cache DataFrames to speed up subsequent queries
Remove duplicate data
Manipulate date and time values
Remove and rename DataFrame columns
Aggregate data stored in a DataFrame

After completing this module, students will be able to do the following:

Describe Azure Databricks
Read and write data in Azure Databricks
Work with DataFrames in Azure Databricks
Work with advanced DataFrame methods in Azure Databricks

Module 4: Exploring, transforming, and loading data into data stores with Apache Spark

This module teaches how to explore data stored in a data lake, transform that data, and load it into a relational data warehouse. Students will explore Parquet and JSON files and use techniques to query and transform JSON files with hierarchical structures. They will then use Apache Spark to load data into the data warehouse and join Parquet data in the data lake with data from a dedicated SQL pool.

Lessons

Defining big data engineering with Apache Spark in Azure Synapse Analytics
Data ingestion with Apache Spark notebooks in Azure Synapse Analytics
Data transformation with DataFrames in Apache Spark pools in Azure Synapse Analytics
Integrating SQL pools and Apache Spark pools in Azure Synapse Analytics

Lab: Exploring, transforming, and loading data into data stores with Apache Spark

Perform data explorations in Synapse Studio
Ingest data with Spark notebooks in Azure Synapse Analytics
Transform data with DataFrames in Apache Spark pools in Azure Synapse Analytics
Integrate SQL pools and Apache Spark pools in Azure Synapse Analytics

After completing this module, students will be able to do the following:

Describe big data engineering with Apache Spark in Azure Synapse Analytics
Ingest data with Apache Spark notebooks in Azure Synapse Analytics
Transform data with DataFrames in Apache Spark pools in Azure Synapse Analytics
Integrate SQL pools and Apache Spark pools in Azure Synapse Analytics
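
As an illustration only (not part of the official course material), the idea of transforming JSON files with hierarchical structures can be sketched in plain Python. The course itself uses Apache Spark for this; the `flatten` helper, the `_` separator, and the sample `order` record below are hypothetical and serve only to show what turning nested fields into flat columns means.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten a nested dict into a single-level dict.

    Nested keys are joined with `sep`, so {"a": {"b": 1}} becomes {"a_b": 1}.
    Lists are left untouched in this simplified sketch.
    """
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path as a prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical hierarchical record, as it might appear in a JSON file.
order = {"id": 1, "customer": {"name": "Ana", "address": {"city": "Madrid"}}}
flat_order = flatten(order)
print(flat_order)
```

Each key of the flattened record can then map to one column of a relational table, which is the shape a data warehouse load typically expects.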

Module 5: Data Ingestion and Loading into the Data Warehouse

This module teaches students how to ingest data into the data warehouse using T-SQL scripts and Synapse Analytics integration pipelines. Students will learn how to load data into Synapse dedicated SQL pools with PolyBase and COPY using T-SQL. They will also learn how to use workload management together with a copy activity in an Azure Synapse pipeline for petabyte-scale data ingestion.

Lessons

Using best practices for loading data into Azure Synapse Analytics
Ingest at petabyte scale with Azure Data Factory

Lab: Data ingestion and loading into the data warehouse

Perform petabyte-scale ingestions with Azure Synapse pipelines
Import data with PolyBase and COPY using T-SQL
Use best practices for loading data into Azure Synapse Analytics

After completing this module, students will be able to do the following:

Use best practices for loading data into Azure Synapse Analytics
Ingest at petabyte scale with Azure Data Factory

Module 6: Data transformation with Azure Data Factory or Azure Synapse pipelines

This module teaches students how to create data integration pipelines to ingest data from multiple data sources, transform data using mapping data flows, and move data into one or more data sinks.

Lessons

Data integration with Azure Data Factory or Azure Synapse pipelines
Performing code-free transformations at scale with Azure Data Factory or Azure Synapse pipelines

Lab: Data transformation using Azure Data Factory or Azure Synapse pipelines

Run code-free transformations at scale with Azure Synapse pipelines
Create a data pipeline to import poorly formatted CSV files
Create mapping data flows

After completing this module, students will be able to do the following:

Perform data integrations with Azure Data Factory
Perform code-free transformations at scale with Azure Data Factory

Module 7: Orchestrating data movements and transformations in Azure Synapse pipelines

In this module, students will learn how to create linked services and orchestrate the movement and transformation of data using notebooks in Azure Synapse pipelines.

Lessons

Orchestration of data movements and transformations in Azure Data Factory

Lab: Orchestrating data movements and transformations in Azure Synapse pipelines

Integrate notebook data with Azure Data Factory or Azure Synapse pipelines

After completing this module, students will be able to do the following:

Orchestrate data movements and transformations in Azure Synapse pipelines

Module 8: Comprehensive Security with Azure Synapse Analytics

In this module, students will learn how to secure a Synapse Analytics workspace and its supporting infrastructure. They will examine the SQL Active Directory admin, manage IP firewall rules, manage secrets with Azure Key Vault, and access those secrets through a Key Vault linked service and pipeline activities. They will also learn how to implement column-level security, row-level security, and dynamic data masking using dedicated SQL pools.

Lessons

Securing a data warehouse in Azure Synapse Analytics
Configuring and managing secrets in Azure Key Vault
Implementation of compliance controls for confidential data

Lab: Comprehensive security with Azure Synapse Analytics

Protect the infrastructure behind Azure Synapse Analytics
Protect the workspace and managed services of Azure Synapse Analytics
Protect Azure Synapse Analytics workspace data

After completing this module, students will be able to do the following:

Secure a data warehouse in Azure Synapse Analytics
Configure and manage secrets in Azure Key Vault
Implement compliance controls for confidential data
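
As an illustration only (not part of the official course material), the idea behind dynamic data masking can be sketched in plain Python: the stored value is unchanged, and what a reader sees depends on their permissions. In the course this is configured declaratively on a dedicated SQL pool, not written as application code. The function names, the `can_unmask` flag, and the mask format below are assumptions made for this sketch.

```python
def mask_email(email):
    """Return a masked form of an email: first character kept, rest hidden.

    The 'aXXX@XXXX.com' shape is an assumption chosen for this sketch.
    """
    if not email:
        return email
    return email[0] + "XXX@XXXX.com"


def read_row(row, can_unmask):
    """Return a copy of `row`, masking the email unless the reader may unmask."""
    result = dict(row)  # never modify the stored row itself
    if not can_unmask:
        result["email"] = mask_email(row["email"])
    return result


# Hypothetical stored row.
stored = {"name": "Ana", "email": "ana@example.com"}
print(read_row(stored, can_unmask=False))
print(read_row(stored, can_unmask=True))
```

The key point is that masking is applied when the data is read, so the underlying data stays intact for users who are allowed to see it.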

Module 9: Supporting hybrid transactional and analytical processing (HTAP) with Azure Synapse Link

In this module, students will learn how Azure Synapse Link enables seamless connectivity between an Azure Cosmos DB account and a Synapse workspace. Students will see how to enable and configure Synapse Link and then how to query the Azure Cosmos DB analytical store using Apache Spark and serverless SQL.

Lessons

Designing hybrid transactional and analytical processing using Azure Synapse Analytics
Configuring Azure Synapse Link with Azure Cosmos DB
Querying Azure Cosmos DB with Apache Spark pools
Querying Azure Cosmos DB with serverless SQL pools

Lab: Supporting hybrid transactional and analytical processing (HTAP) with Azure Synapse Link

Configure Azure Synapse Link with Azure Cosmos DB
Query Azure Cosmos DB with Apache Spark for Synapse Analytics
Query Azure Cosmos DB with serverless SQL pools for Azure Synapse Analytics

After completing this module, students will be able to do the following:

Design hybrid transactional and analytical processing using Azure Synapse Analytics
Configure Azure Synapse Link with Azure Cosmos DB
Query Azure Cosmos DB with Apache Spark for Azure Synapse Analytics
Query Azure Cosmos DB with serverless SQL for Azure Synapse Analytics

Module 10: Real-time Stream Processing with Stream Analytics

In this module, students will learn to process stream data using Azure Stream Analytics. They will ingest vehicle telemetry data from Event Hubs and then process it in real time using various window-based functions in Azure Stream Analytics. They will send the data to Azure Synapse Analytics. Finally, students will learn how to scale Stream Analytics workloads to increase throughput.

Lessons

Enabling reliable messaging for big data applications with Azure Event Hubs
Working with data streams using Azure Stream Analytics
Ingesting data streams with Azure Stream Analytics

Lab: Real-time stream processing with Stream Analytics

Use Stream Analytics to process real-time data from Event Hubs
Use Stream Analytics window-based functions to create aggregates and send them to Synapse Analytics
Scale Azure Stream Analytics jobs to increase performance through partitioning
Repartition the stream input to optimize parallelization

After completing this module, students will be able to do the following:

Enable reliable messaging for big data applications with Azure Event Hubs
Work with data streams using Azure Stream Analytics
Ingest data streams with Azure Stream Analytics
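
As an illustration only (not part of the official course material), the window-based aggregation covered in this module can be sketched in plain Python. The example shows a tumbling window, that is, fixed-size, non-overlapping time buckets. In the course this is expressed as a Stream Analytics query, not Python. The function name, the 10-second window, and the sample telemetry are assumptions made for this sketch.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """Average the values of (timestamp_seconds, value) events per window.

    Each event falls into exactly one window, identified by its start time.
    Returns {window_start: average_value}, ordered by window start.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in events:
        # Round the timestamp down to the start of its window.
        window_start = (ts // window_seconds) * window_seconds
        sums[window_start] += value
        counts[window_start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Hypothetical vehicle telemetry: (seconds since start, speed in km/h).
telemetry = [(1, 50.0), (4, 70.0), (11, 80.0), (14, 100.0), (21, 60.0)]
averages = tumbling_window_avg(telemetry, window_seconds=10)
print(averages)
```

A real streaming engine computes these aggregates incrementally as events arrive and emits each window when it closes; the sketch processes a finished list only to keep the bucketing logic visible.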

Module 11: Creating a stream processing solution with Event Hubs and Azure Databricks

In this module, students will learn how to ingest and process data streams at scale using Event Hubs and Spark Structured Streaming in Azure Databricks. Students will learn about the uses and key features of Structured Streaming. They will implement sliding windows to aggregate over chunks of data and apply watermarks to remove stale data. Finally, students will connect to Event Hubs to read and write streams.

Lessons

Streaming data processing with Azure Databricks Structured Streaming

Lab: Creating a stream processing solution with Event Hubs and Azure Databricks

Analyze the uses and key characteristics of Structured Streaming
Stream data from a file and write it to a distributed file system
Use sliding windows to aggregate over chunks of data rather than all the data
Apply watermarks to remove stale data
Connect to Event Hubs to read and write streams

After completing this module, students will be able to do the following:

Process streaming data with Azure Databricks Structured Streaming
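
As an illustration only (not part of the official course material), the watermark concept from this module can be sketched in plain Python. A watermark sets how late an event may arrive before it is treated as stale and discarded. This is a simplified model and does not reproduce how Spark Structured Streaming evaluates watermarks internally; the function name, the 5-unit delay, and the sample events are assumptions made for this sketch.

```python
def apply_watermark(events, max_delay):
    """Split (event_time, payload) events, given in arrival order.

    An event is dropped when its event time is older than the latest event
    time seen so far minus `max_delay`. Returns (accepted, dropped).
    """
    latest = None
    accepted, dropped = [], []
    for event_time, payload in events:
        # Track the highest event time observed so far.
        if latest is None or event_time > latest:
            latest = event_time
        if event_time < latest - max_delay:
            dropped.append((event_time, payload))  # too late, treated as stale
        else:
            accepted.append((event_time, payload))
    return accepted, dropped


# Hypothetical events in arrival order; event "e" arrives long after "d".
arrivals = [(10, "a"), (12, "b"), (11, "c"), (20, "d"), (13, "e"), (16, "f")]
accepted, dropped = apply_watermark(arrivals, max_delay=5)
print(accepted)
print(dropped)
```

Event "c" is slightly out of order but still within the allowed delay, so it is kept, while "e" falls behind the watermark and is discarded. Bounding lateness this way is what lets a streaming engine release the state of old windows.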

Prerequisites

Successful students begin this course with knowledge of cloud computing and data fundamentals, and professional experience with data solutions.

Specifically:

AZ-900: Azure Fundamentals
DP-900: Data Fundamentals in Microsoft Azure

Language

Course: English / Spanish
Labs: English / Spanish

Microsoft Associate Certification: Azure Data Engineer Associate

Microsoft Certified: Azure Data Engineer Associate

Demonstrate understanding of common data engineering tasks to implement and manage data engineering workloads on Microsoft Azure using a range of Azure services.

Level: Intermediate
Role: Data Engineer
Product: Azure
Subject: Data and AI

Information related to training

Training support: Always by your side

Do you need another training modality?

Self Learning - Virtual - In-person - Telepresence

Bonuses for companies

For companies

Invitations to Nanfor events and activities
