DP-3027: Implement a data engineering solution with Azure Databricks

€495.00

________________________________________________________________

Are you interested in this course in online or in-person format?
Contact us

📧info@nanforiberica.com • 📞+34 91 031 66 78 • 📱 +34 685 60 05 91 (WhatsApp) • 🏢 Our Offices

________________________________________________________________

NEW COURSE: The launch of the DP-3027 material, Implement a data engineering solution with Azure Databricks, has been delayed; the release date is still to be determined.

Course DP-3027: Implement a data engineering solution with Azure Databricks

In this course, you will learn how to leverage Apache Spark and the powerful clusters of the Azure Databricks platform to run large-scale data engineering workloads in the cloud.

Level: Beginner - Role: Data Analyst, Data Engineer, Data Scientist - Product: Azure - Subject: Data Engineering

Who this course is aimed at

This course is aimed at data engineers, data scientists, and ELT developers who want to learn how to leverage Apache Spark and Azure Databricks clusters to run large-scale data engineering workloads in the cloud.

Objectives of the official course DP-3027

  • Understand the Azure Databricks architecture: become familiar with the platform's key components and how they integrate with other Azure services.

  • Implement data ingestion techniques: capture data from multiple sources using tools such as Structured Streaming and Delta Lake.

  • Perform data transformations and processing: use Apache Spark to clean, transform, and prepare data for analysis or storage.

  • Develop scalable ETL flows: build efficient, reusable data pipelines that handle large volumes of data.

  • Optimize process performance: apply tuning, autoscaling, and observability strategies to improve workflow efficiency.

  • Implement streaming architectures with Delta Live Tables: design real-time solutions for continuous data processing.

  • Automate tasks with Azure Databricks Jobs: orchestrate and schedule workflows to reduce manual intervention and accelerate the delivery of insights.

  • Apply CI/CD in data environments: integrate continuous integration and delivery practices to maintain the quality and stability of data solutions.

Content of the official Azure Databricks DP-3027 course

Module 1: Perform incremental processing with Spark Structured Streaming

  • Introduction
  • Configuring real-time data sources for incremental processing
  • Optimize Delta Lake for incremental processing on Azure Databricks
  • Handling late data and out-of-order events in incremental processing
  • Performance monitoring and tuning strategies for incremental processing in Azure Databricks
  • Exercise: Real-time data ingestion and processing with Delta Live Tables using Azure Databricks
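The late-data handling covered in this module is based on the watermark idea: the stream tracks the maximum event time seen so far, and events that arrive too far behind it are discarded. The following is a minimal pure-Python sketch of that concept only, not the Spark Structured Streaming API (in Spark you would use `withWatermark` on a streaming DataFrame); timestamps and the lateness threshold are illustrative.

```python
# Sketch of watermark-based late-event handling (concept only).
# The watermark trails the maximum event time seen so far by a
# fixed allowed lateness; events behind the watermark are dropped.

ALLOWED_LATENESS = 10  # seconds of tolerated lateness (illustrative)

def process(events):
    """Accept (timestamp, payload) events unless behind the watermark."""
    max_event_time = 0
    accepted, dropped = [], []
    for ts, payload in events:
        max_event_time = max(max_event_time, ts)
        watermark = max_event_time - ALLOWED_LATENESS
        if ts >= watermark:
            accepted.append(payload)
        else:
            dropped.append(payload)  # arrived after the watermark passed it
    return accepted, dropped

# An out-of-order but tolerable event is kept; a very late one is not.
accepted, dropped = process(
    [(100, "a"), (105, "b"), (97, "late-but-ok"), (85, "too-late")]
)
```

Spark applies the same rule per aggregation window, which is what lets it bound the state it keeps for incremental processing.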

Module 2: Implementing streaming architecture patterns with Delta Live Tables

  • Introduction
  • Event-driven architectures with Delta Live Tables
  • Data ingestion with structured streaming
  • Maintaining data consistency and reliability with structured streaming
  • Scaling streaming workloads with Delta Live Tables
  • Exercise: End-to-end streaming pipeline with Delta Live Tables

Module 3: Performance Optimization with Spark and Delta Live Tables

  • Introduction
  • Performance optimization with Spark and Delta Live Tables
  • Cost-based optimization and query tuning
  • Use of Change Data Capture (CDC)
  • Use of enhanced autoscaling
  • Implement observability and data quality metrics
  • Exercise: Optimizing data pipelines to improve performance in Azure Databricks
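Change Data Capture (CDC), listed above, means merging a stream of insert/update/delete events into a keyed target table instead of reprocessing everything. The sketch below illustrates the merge semantics in plain Python; it is a conceptual illustration only, not the Databricks `APPLY CHANGES INTO` / `MERGE` syntax, and the keys and rows are made up.

```python
# Sketch of applying CDC events to a target keyed by primary key.
# Each event is (operation, key, row); events are assumed ordered.

def apply_changes(target, changes):
    """Merge ordered CDC events into a dict acting as the target table."""
    for op, key, row in changes:
        if op in ("insert", "update"):
            target[key] = row          # upsert the latest image of the row
        elif op == "delete":
            target.pop(key, None)      # remove the row if present
    return target

table = {}
events = [
    ("insert", 1, {"name": "Ada"}),
    ("insert", 2, {"name": "Alan"}),
    ("update", 1, {"name": "Ada L."}),
    ("delete", 2, None),
]
table = apply_changes(table, events)  # only the final image of key 1 remains
```

The efficiency win of CDC is exactly this: the target is updated row by row from the change feed rather than rebuilt from the full source on every run.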

Module 4: Implementing CI/CD workflows in Azure Databricks

  • Introduction
  • Implementation of version control and Git integration
  • Performing unit tests and integration tests
  • Environment administration and configuration
  • Implementation of rollback and roll-forward strategies
  • Exercise: Implementation of CI/CD workflows
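The unit-testing step above usually means testing transformation logic in isolation so a CI pipeline can run the tests on every commit. Here is a small sketch in that style; the transformation, field names, and sample records are illustrative, not taken from the course materials.

```python
# Sketch of unit-testing a data transformation, the kind of check a
# CI/CD pipeline would run on every commit before deployment.

def clean_emails(rows):
    """Normalize e-mail addresses and drop rows without one."""
    return [
        {**r, "email": r["email"].strip().lower()}
        for r in rows
        if r.get("email")
    ]

def test_clean_emails():
    raw = [
        {"id": 1, "email": " Ada@Example.COM "},
        {"id": 2, "email": None},  # no address: should be dropped
    ]
    assert clean_emails(raw) == [{"id": 1, "email": "ada@example.com"}]

test_clean_emails()  # a CI runner (e.g. pytest) would discover this itself
```

Keeping transformations as plain functions like this, separate from notebook glue code, is what makes them testable outside a cluster.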

Module 5: Automating workloads with Azure Databricks Jobs

  • Introduction
  • Implementation of job scheduling and automation
  • Workflow optimization with parameters
  • Managing job dependencies
  • Implementation of error control and retry mechanisms
  • Exploring best practices and guidelines
  • Exercise: Automation of data processing and ingestion
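The "error control and retry mechanisms" item refers to the standard retry-with-exponential-backoff pattern that job schedulers apply to transient failures. Below is a minimal sketch of that pattern; the delays and the flaky task are illustrative, and a real scheduler would typically also add jitter and alerting.

```python
# Sketch of retry with exponential backoff for transient task failures.
import time

def run_with_retries(task, max_retries=3, base_delay=1.0):
    """Run task(); on failure, wait exponentially longer and retry."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise                              # retries exhausted
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# A task that fails twice before succeeding (base_delay=0 to skip waits).
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky, max_retries=3, base_delay=0)
```

Databricks Jobs expose the same idea declaratively (maximum retries and retry intervals per task), so hand-rolled loops like this are only needed for logic inside the task itself.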

Module 6: Privacy Management and Data Governance with Azure Databricks

  • Introduction
  • Implementing data encryption techniques in Azure Databricks
  • Managing access controls in Azure Databricks
  • Implementing data masking and anonymization in Azure Databricks
  • Using compliance frameworks and secure data sharing in Azure Databricks
  • Use of data lineage and metadata management
  • Implementing governance automation in Azure Databricks
  • Exercise: Implementing Unity Catalog
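Masking and anonymization, as covered above, typically combine two moves: redacting the part of a value that identifies a person, and replacing the full value with a stable pseudonym so records can still be joined. This is a generic sketch of both ideas, not a Databricks feature; the salt, field names, and masking rule are assumptions for illustration, and a real deployment would keep the salt in a secret manager.

```python
# Sketch of data masking and salted-hash pseudonymization.
import hashlib

SALT = b"example-salt"  # assumption: in practice, stored as a secret

def pseudonymize(value):
    """Replace an identifier with a deterministic salted SHA-256 digest."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()

def mask_email(email):
    """Keep the domain for analytics; hide the identifying local part."""
    local, _, domain = email.partition("@")
    return "***@" + domain

email = "ada@example.com"
masked = mask_email(email)       # safe to show in reports
token = pseudonymize(email)      # stable join key without the raw value
```

Because the hash is deterministic for a given salt, the same person maps to the same token across tables, which is what keeps joins possible after anonymization.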

Module 7: Using SQL Warehouses in Azure Databricks

  • Introduction
  • Introduction to SQL Warehouses
  • Creating databases and tables
  • Creating queries and dashboards
  • Exercise: Using a SQL Warehouse in Azure Databricks
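The "creating tables" and "creating queries" steps boil down to standard SQL DDL and aggregate queries. The sketch below shows the shape of those statements using Python's built-in `sqlite3` module so it is runnable anywhere; a SQL Warehouse would run equivalent statements over Delta tables, and the table and column names here are invented for illustration.

```python
# Sketch of table creation and an aggregate query in standard SQL,
# run through sqlite3 for portability (not the Databricks SQL engine).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('EMEA', 120.0), ('EMEA', 80.0), ('APAC', 50.0);
""")

# The kind of grouped query a dashboard panel would be built on.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```

In a SQL Warehouse the same query would be saved as a named query and attached to a dashboard visualization.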

Module 8: Running Azure Databricks notebooks with Azure Data Factory

  • Introduction
  • Description of Azure Databricks notebooks and pipelines
  • Creating a linked service for Azure Databricks
  • Using a Notebook activity in a pipeline
  • Using parameters in a notebook
  • Exercise: Running an Azure Databricks notebook with Azure Data Factory

Prerequisites

None

Language

  • Course: English / Spanish

💡 Did you know this course is included in LaaS Cert?

Take this course and many more with our LaaS Cert annual license. Unlimited training for only €1,295!

✅ Microsoft, Linux-LPI, SCRUM, ITIL and Nanfor technical courses

✅ Personalized support always by your side

✅ 100% online, official and updated

Get your license now!

LaaS Cert: unlimited training

Information related to training

Training support: always by your side

Training modalities: Self Learning - Virtual - In-person - Telepresence

Bonuses: for companies