2024 Mastering dbt (Data Build Tool) - From Beginner To Pro
Last updated 2/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 kHz
Language: English | Size: 3.32 GB | Duration: 7h 26m
Hands-on Analytics Engineering Bootcamp With: Theory, Building a dbt Project from Scratch, and Deploying to dbt Cloud
What you'll learn
How to build a complete dbt project from scratch
The main benefits of dbt, and a bit of background as to how it came about
All of the dbt fundamentals: sources, models, tests, documentation, snapshots, seeds, macros, hooks, and operations
How to structure a dbt project: staging, intermediate, and mart models - and naming conventions
How to version control changes to your code with GitHub and VSCode
Advanced dbt testing - creating your own custom singular & generic tests, setting severity, and setting warn/error thresholds
Advanced dbt data modelling - model materialisation and governance (access, contracts, and versions)
Advanced dbt commands - how to use different selectors, different profiles, tags, and indirect test selection, and how to build a local dbt documentation site
Advanced dbt jinja & macros - creating your own macros to use in hooks / functions / operations, using jinja for loops and variables, and the target function
How to deploy your project on dbt Cloud, how to use the dbt Cloud UI, and using environment variables
How to use tests & macros from external packages to supercharge your dbt project
Best practices to use when running a dbt project (based on lots of experience!)
How to create a complete setup for Mac or Windows: installing all of the tools and getting a dbt-specific VSCode setup!
Requirements
Basic SQL
No Python experience needed
Mac / Windows machine which is capable of installing Python, Git, and VSCode (we'll run through all of this in the course!)
Description
A complete course to help anyone with basic SQL skills learn advanced dbt, a key tool for Analytics Engineering!
Welcome to the 2024 Mastering dbt (data build tool) course! This course runs through everything from the theory behind dbt to building an advanced dbt project (from scratch) and deploying it on dbt Cloud.
I have over 8 years of experience across Analytics / Analytics Engineering / Data Science, including 4 years using dbt on a daily basis. I was also involved in the rollout of dbt in my time at Monzo Bank! In this course I've taken everything I've learnt over the past 4 years, and what I use on a daily basis, and condensed it to take anyone who knows SQL to an advanced level of dbt as quickly as possible.
MY APPROACH TO THIS COURSE:
We'll cover everything you need to know about dbt: from basic data modelling right through to all of the advanced features such as creating custom tests and macros. We'll be doing this step by step, building from the basics upwards.
It's focused on practical outcomes - we won't be spending ages on database theory, or going into lots of detail on the eCommerce dataset we'll be using. Instead, we'll be aiming to get you up to an advanced dbt level as quickly as possible.
For every video where we're writing code, I've created lesson attachments with the final outputs. This means you can either code as you go along, or watch the videos and look at the handouts afterwards! I've also included some theory with these handouts to help hammer home the points made in the videos.
There's also a public GitHub repository (which you'll be using for this course) that contains a model final project you can reference throughout.
This course isn't static! I'd love to hear your feedback and will be updating this course on an ongoing basis.
COURSE STRUCTURE:
This course focuses on first getting a good understanding of what problems dbt solves, then building a basic dbt project, before layering on more advanced concepts and finally deploying our project with dbt Cloud.
Introduction
Some theory (<1 hour) around dbt, what problems existed in the data stack before it came along, and how it solves them.
Tool setup
Getting set up with Python, GitHub, Google BigQuery, VSCode, and of course dbt! If you're familiar with any of these tools already then you are more than welcome to skip the appropriate lessons.
We'll also be exploring the fictional eCommerce dataset that we'll be using throughout the course.
Building our basic dbt project
This section focuses on creating our project from scratch, including how we will structure it. We'll be building out staging (stg), intermediate (int), and mart data models, including documentation & testing with the out-of-the-box dbt tests.
Advanced dbt testing
We'll start to build on our basic dbt project by setting test severity & thresholds, using the dbt-utils and dbt-expectations external packages for their excellent selection of tests, creating our own custom singular & generic tests, and testing the freshness of our source data.
Advanced data modelling with dbt
Next, we'll be looking at how we can create reusable documentation, seed files (version controlled .csv files), snapshots (capturing changes to data tables), and materialisation methods.
Most of this section will be focused on the last part - the materialisation methods: ephemeral, view, table, and incremental. By this point we'll have encountered view & table models and we will be building both an incremental and an ephemeral model - and you will gain an understanding of what to use and when.
This section includes all model governance features from dbt version 1.5: model access, groups, contracts, and versions.
Advanced dbt commands
This section will focus less on changing our dbt project and more on all of the major dbt commands and how (and when) to use them.
Advanced Jinja & macros
The final changes to our project will involve using Jinja - a core feature of dbt and arguably its most complex but most powerful feature - and using it to create our own macros.
This section will run through how you can use Jinja macros for hooks, operations, and as reusable functions in your SQL models. It'll also run through some theory around Jinja, common mistakes, and what I (personally) find it most useful for!
dbt Cloud
Finally, we'll be exploring how to take our project and deploy it on dbt Cloud - including how to schedule it to run on a regular basis. We'll also be looking at dbt Cloud itself and its main benefits.
Overview
Section 1: Introduction
Lecture 1 Instructor Introduction
Lecture 2 Course outline
Lecture 3 Course Introduction
Lecture 4 A Brief History of the Data Stack
Lecture 5 Benefits of dbt - Inferring Dependencies
Lecture 6 Benefits of dbt - Documentation & Testing
Lecture 7 Benefits of dbt - Python-Like Functionality
Lecture 8 How dbt Has Solved a Lot of Problems in the Data Stack
Lecture 9 How dbt Fits in the Data Stack
Lecture 10 dbt Core vs. dbt Cloud
Lecture 11 Section Recap
Section 2: Getting Set Up with Your Tools
Lecture 12 Section Overview
Lecture 13 Note on Continual Course Updates
Lecture 14 Help If You Get Stuck During This Course
Lecture 15 Creating a Gmail Account
Lecture 16 Setting up a BigQuery Project With Billing
Lecture 17 (Optional) If You Have Issues With BigQuery Billing
Lecture 18 The BigQuery UI
Lecture 19 The Dataset You'll Be Using
Lecture 20 (Mac) Installing Python 3.10
Lecture 21 (Windows) Installing Python 3.10
Lecture 22 Downloading VSCode and Setting Up Shortcuts
Lecture 23 Creating a GitHub account
Lecture 24 Forking Vs. Cloning
Lecture 25 Forking the Repository
Lecture 26 (Optional) If You Have Issues Syncing Your Forked Repository
Lecture 27 Installing the recommended VSCode Extensions
Lecture 28 What's a Virtual Environment (venv)?
Lecture 29 Setting Up Our Virtual Environment and Installing Packages
Lecture 30 Setting Up dbt for BigQuery
Lecture 31 Trialling Our Model dbt Project
Lecture 32 (Optional) Setting Up dbt Autocomplete
Lecture 33 Run Through of How Our Final Project Will Look
Lecture 34 Section Recap
Section 3: Building the Basic dbt Project
Lecture 35 Section Overview
Lecture 36 The dbt init Command
Lecture 37 Version Control with GitHub
Lecture 38 Setting up dbt Power User
Lecture 39 How We'll Structure Our Project
Lecture 40 Creating Our First Source (src) yml File
Lecture 41 (Windows) Issues with the dbt Power User extension
Lecture 42 Creating Our First Staging (stg) SQL Model
Lecture 43 Running Our First Staging (stg) SQL Model
Lecture 44 Creating Our First Model yml File
Lecture 45 Adding Tests to Our First Model yml File
Lecture 46 Setting Up Our Models to Materialise as Tables Instead of Views
Lecture 47 Getting the Rest of Our Staging (stg) SQL Models Set Up
Lecture 48 Using dbt clean to Get Table Materialisation Working
Lecture 49 Getting the Rest of the Staging (stg) yml Files Set Up
Lecture 50 Taking Stock of Our Staging (stg) Data Models
Lecture 51 The Target Folder
Lecture 52 Getting Our First Intermediate (int) SQL Model Set Up
Lecture 53 Getting Our First Intermediate (int) yml File Set Up
Lecture 54 Getting Our Mart SQL Model Set Up
Lecture 55 Getting Our Mart yml File Set Up
Lecture 56 Our Basic dbt Project Is Now Complete!
Lecture 57 Section Recap
Section 4: Advanced dbt: Testing
Lecture 58 Section Overview
Lecture 59 Setting Default Test Severity
Lecture 60 Setting Test Severity and Thresholds
Lecture 61 The External dbt Packages We'll Be Using
Lecture 62 dbt_utils and dbt_expectations
Lecture 63 Custom Singular Tests
Lecture 64 Custom Generic Tests
Lecture 65 Applying Advanced Tests to Our Whole Project
Lecture 66 Source Freshness Tests
Lecture 67 Section Recap
Section 5: Advanced dbt: Data Modelling
Lecture 68 Section Overview
Lecture 69 The doc Function
Lecture 70 Seed Files
Lecture 71 dbt Snapshots
Lecture 72 Materialisation Types
Lecture 73 Materialisation: Ephemeral Models
Lecture 74 Materialisation: Incremental Models
Lecture 75 (Optional) Partitioning a Table in BigQuery
Lecture 76 Model Governance Overview
Lecture 77 Model Governance - Access & Groups
Lecture 78 Model Governance - Contracts
Lecture 79 Model Governance - Versions
Lecture 80 Section Recap
Section 6: Advanced dbt: Commands and Selectors
Lecture 81 Section Overview
Lecture 82 Commands For a Clean dbt run
Lecture 83 Using Different dbt Profiles
Lecture 84 Selectors
Lecture 85 Tags
Lecture 86 Indirect Test Selection
Lecture 87 dbt test With --warn-error
Lecture 88 dbt build
Lecture 89 dbt docs generate / serve
Lecture 90 Section Recap
Section 7: Advanced dbt: Jinja and Macros
Lecture 91 Section Overview
Lecture 92 Jinja Comments, Statements, and Expressions
Lecture 93 The 3 Types of Macro: Functions, Hooks, Operations
Lecture 94 (Optional) dbt Jinja Function Reference
Lecture 95 Macros: Operations
Lecture 96 Macros: Functions (Building a Basic Macro)
Lecture 97 Macros: Hooks
Lecture 98 Jinja Statements: for Loops and Setting Variables
Lecture 99 (Optional) Jinja: Using the Target Function
Lecture 100 Section Recap
Section 8: dbt Cloud
Lecture 101 Section Overview
Lecture 102 Creating a dbt Cloud Account
Lecture 103 Setting Up a Service Account
Lecture 104 Connecting GitHub to dbt Cloud
Lecture 105 The dbt Cloud IDE
Lecture 106 Deploying Jobs on dbt Cloud
Lecture 107 Section Recap
Who this course is for
Data Analysts, Data Scientists, Analytics Engineers, Data Engineers, BI Professionals, Anyone interested in getting into data!