________________________________________________________________
Do you want to take this course remotely or in person?
Contact us by email: info@nanforiberica.com, phone: +34 91 031 66 78, WhatsApp: +34 685 60 05 91, or contact our offices
________________________________________________________________
Important: This course will be available on 07/18/25
Course DP-3027: Implement a data engineering solution with Azure Databricks
In this course, you will learn how to harness Apache Spark and powerful clusters running on the Azure Databricks platform to run large-scale data engineering workloads in the cloud.
Level: Beginner - Role: Data Analyst, Data Engineer, Data Scientist - Product: Azure - Subject: Data Engineering
Course aimed at
Data engineers, data scientists, and ELT developers who want to leverage Apache Spark and powerful clusters on the Azure Databricks platform to run large-scale data engineering workloads in the cloud.
Objectives of the official course DP-3027
- Understand Azure Databricks architecture: familiarize yourself with the platform's key components and how they integrate with other Azure services.
- Implement data ingestion techniques: capture data from multiple sources using tools such as Structured Streaming and Delta Lake.
- Perform data transformation and processing: use Apache Spark to cleanse, transform, and prepare data for analysis or storage.
- Develop scalable ETL workflows: build efficient, reusable data pipelines that support large volumes of data.
- Optimize pipeline performance: apply tuning, autoscaling, and observability strategies to improve workflow efficiency.
- Implement streaming architectures with Delta Live Tables: design real-time solutions for continuous data processing.
- Automate tasks with Azure Databricks Jobs: orchestrate and schedule workflows to reduce manual intervention and accelerate insight delivery.
- Apply CI/CD to data environments: integrate continuous-delivery practices to maintain the quality and stability of data solutions.
Contents of the official Azure Databricks DP-3027 course
Module 1: Performing Incremental Processing with Spark Structured Streaming
- Introduction
- Configuring real-time data sources for incremental processing
- Optimizing Delta Lake for incremental processing in Azure Databricks
- Handling late-arriving and out-of-order events in incremental processing
- Performance monitoring and tuning strategies for incremental processing in Azure Databricks
- Exercise: Real-time ingestion and processing with Delta Live Tables in Azure Databricks
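The late-event handling covered in this module rests on the idea of a watermark: the stream tracks how far event time has progressed, and events arriving behind the watermark are dropped rather than reprocessed. Below is a toy plain-Python illustration of that rule (not PySpark code; the function and threshold names are ours, sketching what `withWatermark` does conceptually):

```python
# Toy illustration of watermark-based late-event handling, the idea
# behind Spark Structured Streaming's withWatermark(). Events that
# arrive after the watermark has passed their event time are dropped.

def process_events(events, delay_threshold):
    """events: list of (event_time, payload) in arrival order.
    delay_threshold: how far the watermark trails the max event time
    seen so far (like withWatermark('ts', '10 minutes'))."""
    watermark = float("-inf")
    accepted, dropped = [], []
    for event_time, payload in events:
        if event_time < watermark:
            dropped.append((event_time, payload))   # too late: discard
        else:
            accepted.append((event_time, payload))
            watermark = max(watermark, event_time - delay_threshold)
    return accepted, dropped

accepted, dropped = process_events(
    [(100, "a"), (112, "b"), (95, "c"), (105, "d")], delay_threshold=10)
# "c" at t=95 arrives after the watermark advanced to 102, so it is dropped
```

In real Structured Streaming the watermark also bounds how long aggregation state is retained, which is what keeps incremental jobs from growing without limit.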
Module 2: Implementing Streaming Architecture Patterns with Delta Live Tables
- Introduction
- Event-driven architectures with Delta Live Tables
- Data ingestion with structured streaming
- Maintaining data consistency and reliability with structured streaming
- Scaling Streaming Workloads with Delta Live Tables
- Exercise: End-to-End Streaming Pipeline with Delta Live Tables
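Delta Live Tables pipelines like the one in this module are typically layered bronze → silver → gold, each table defined as a transformation of the previous one. The following is a hand-rolled plain-Python analogy of that dataflow (real DLT code uses `@dlt.table` functions on Databricks; the data and function names here are invented):

```python
# Plain-Python analogy of a three-layer (medallion) pipeline:
# bronze = raw events, silver = cleansed, gold = aggregated.
# In Delta Live Tables each stage would be an @dlt.table function.

raw_events = [
    {"user": "ana", "amount": "19.50"},
    {"user": "", "amount": "5.00"},      # invalid: missing user
    {"user": "ana", "amount": "30.25"},
    {"user": "luis", "amount": "7.50"},
]

def bronze(events):
    return list(events)  # ingest as-is

def silver(events):
    # cleanse: drop rows failing expectations (like @dlt.expect_or_drop)
    return [{"user": e["user"], "amount": float(e["amount"])}
            for e in events if e["user"]]

def gold(events):
    # aggregate: total spend per user
    totals = {}
    for e in events:
        totals[e["user"]] = totals.get(e["user"], 0.0) + e["amount"]
    return totals

totals = gold(silver(bronze(raw_events)))
# → {"ana": 49.75, "luis": 7.5}
```

What DLT adds over this sketch is the declarative part: it infers the dependency graph between the layers, runs them incrementally, and enforces the data-quality expectations for you.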
Module 3: Performance Optimization with Spark and Delta Live Tables
- Introduction
- Optimizing Performance with Spark and Delta Live Tables
- Performing cost-based optimization and query tuning
- Using Change Data Capture (CDC)
- Using enhanced autoscaling
- Implementing observability and data quality metrics
- Exercise: Optimizing data pipelines to improve performance in Azure Databricks
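The Change Data Capture topic above boils down to replaying a stream of insert/update/delete events onto a target table, which on Databricks is what `APPLY CHANGES INTO` or a Delta `MERGE` performs. A toy in-memory version of that apply step (all names are ours):

```python
# Toy CDC "apply changes" step: replay insert/update/delete events
# onto a target table keyed by primary key, in commit order. This is
# conceptually what APPLY CHANGES INTO does against a Delta table.

def apply_changes(target, changes):
    """target: dict mapping key -> row.
    changes: list of (op, key, row) events in commit order."""
    for op, key, row in changes:
        if op == "delete":
            target.pop(key, None)
        else:  # "insert" and "update" both upsert
            target[key] = row
    return target

table = {1: {"name": "ana", "city": "Madrid"}}
table = apply_changes(table, [
    ("insert", 2, {"name": "luis", "city": "Bilbao"}),
    ("update", 1, {"name": "ana", "city": "Sevilla"}),
    ("delete", 2, None),
])
# table now holds only key 1, with city updated to "Sevilla"
```

The real Databricks feature adds what this sketch omits: ordering by a sequence column, out-of-order tolerance, and optional SCD type 2 history.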
Module 4: Implementing CI/CD Workflows in Azure Databricks
- Introduction
- Implementing version control and Git integration
- Performing unit tests and integration tests
- Environment administration and configuration
- Implementing rollback and roll-forward strategies
- Exercise: Implementing CI/CD Workflows
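The unit-testing topic in this module usually comes down to one pattern: factor transformation logic into plain functions that run off-cluster, so CI can test them without provisioning Databricks. A minimal sketch of that pattern (function and test names are ours):

```python
# CI/CD-friendly pattern: keep transformation logic in a pure function
# so it can be unit-tested without a Spark cluster, then apply the same
# function to DataFrame rows in the real pipeline.

def normalize_record(rec):
    """Trim whitespace, lower-case the email, coerce amount to float."""
    return {
        "email": rec["email"].strip().lower(),
        "amount": float(rec["amount"]),
    }

def test_normalize_record():
    out = normalize_record({"email": "  Ana@Example.COM ", "amount": "12.5"})
    assert out == {"email": "ana@example.com", "amount": 12.5}

test_normalize_record()  # in CI this would run under pytest
```

Integration tests then exercise the same function through a small Spark job against sample data, so the two test layers share one implementation.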
Module 5: Automating Workloads with Azure Databricks Jobs
- Introduction
- Implementing job scheduling and automation
- Optimizing workflows with parameters
- Managing task dependencies
- Implementing error handling and retry mechanisms
- Exploring best practices and guidelines
- Exercise: Automating data processing and ingestion
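On the error-handling topic: Databricks Jobs can retry failed tasks automatically, and the same retry-with-backoff idea applies inside task code when calling flaky external systems. A sketch of the pattern in plain Python (helper and task names are ours):

```python
import time

# Sketch of a retry helper with exponential backoff, the pattern
# behind the automatic task retries a Databricks Job can configure.

def with_retries(fn, max_attempts=3, base_delay=0.01):
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise            # retries exhausted: surface the failure
            time.sleep(base_delay * 2 ** (attempt - 1))  # back off

# Simulated flaky task: fails twice, then succeeds.
calls = {"n": 0}
def flaky_task():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"

result = with_retries(flaky_task)
# → "done" after 3 attempts
```

In a real job you would usually prefer the platform-level retry setting, reserving in-code retries for fine-grained operations such as individual API calls.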
Module 6: Managing Data Privacy and Governance with Azure Databricks
- Introduction
- Implementing data encryption techniques in Azure Databricks
- Managing access controls in Azure Databricks
- Implementing data masking and anonymization in Azure Databricks
- Using compliance frameworks and secure data sharing in Azure Databricks
- Using data lineage and metadata management
- Implementing governance automation in Azure Databricks
- Exercise: Implementing Unity Catalog
Module 7: Using SQL Warehouses in Azure Databricks
- Introduction
- Introduction to SQL Warehouses
- Creating databases and tables
- Creating queries and dashboards
- Exercise: Using a SQL Warehouse in Azure Databricks
Module 8: Running Azure Databricks Notebooks with Azure Data Factory
- Introduction
- Understanding Azure Databricks Notebooks and Pipelines
- Creating a Linked Service for Azure Databricks
- Using a Notebook Activity in a Pipeline
- Using parameters in a notebook
- Exercise: Running an Azure Databricks Notebook with Azure Data Factory
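Module 8's parameter flow works in two halves: Azure Data Factory passes values through the Notebook activity's `baseParameters`, and the notebook reads them with `dbutils.widgets.get`. Below is a sketch of the ADF side expressed as the JSON the pipeline definition would carry, written as a Python dict (the activity name, linked service name, and notebook path are invented for illustration):

```python
# Sketch of an ADF "DatabricksNotebook" activity with baseParameters,
# as it would appear in the pipeline JSON. Names and the notebook path
# are placeholders; real values come from your workspace.

notebook_activity = {
    "name": "RunProcessNotebook",
    "type": "DatabricksNotebook",
    "linkedServiceName": {
        "referenceName": "AzureDatabricksLinkedService",
        "type": "LinkedServiceReference",
    },
    "typeProperties": {
        "notebookPath": "/Shared/process_data",
        "baseParameters": {
            # Inside the notebook: folder = dbutils.widgets.get("folder")
            "folder": "2025-07-18",
        },
    },
}
```

On the notebook side, a matching widget must exist (or be created with `dbutils.widgets.text`) for each key in `baseParameters`; unmatched parameters are simply ignored.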
Prerequisites
None
Language
- Course: English / Spanish