
Databricks Exam Databricks Certified Data Engineer Professional Topic 5 Question 17 Discussion

Actual exam question for Databricks's Databricks Certified Data Engineer Professional exam
Question #: 17
Topic #: 5
[All Databricks Certified Data Engineer Professional Questions]

A team of data engineers is adding tables to a DLT pipeline that contain repetitive expectations for many of the same data quality checks.

One member of the team suggests reusing these data quality rules across all tables defined for this pipeline.

What approach would allow them to do this?

Suggested Answer: A

Maintaining data quality rules in a centralized Delta table allows the same rules to be reused across all tables defined in a DLT (Delta Live Tables) pipeline. By storing the rules outside the pipeline's target schema and passing the schema name as a pipeline parameter, the team can apply an identical set of data quality checks to every table in the pipeline. This keeps validations consistent and avoids replicating the same rules in each DLT notebook or file.


Databricks Documentation on Delta Live Tables: Delta Live Tables Guide
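
For reference, here is a minimal sketch of the approach in the suggested answer: the rules live in a Delta table keyed by a tag, and a small helper loads them and applies them as expectations on each table. The rules table name (ops.data_quality_rules), the pipeline parameter key (rules_table), the tag value, and the source/target table names are hypothetical examples, not part of the question.

import dlt
from pyspark.sql.functions import col

# `spark` is provided by the DLT runtime inside a pipeline notebook.

def get_rules(tag):
    # Load expectation name -> SQL constraint pairs for a given tag from a
    # centralized Delta table kept outside the pipeline's target schema.
    # The table name is read from a pipeline configuration parameter
    # (hypothetical key "rules_table").
    rules_table = spark.conf.get("rules_table", "ops.data_quality_rules")
    rows = spark.table(rules_table).filter(col("tag") == tag).collect()
    return {row["name"]: row["constraint"] for row in rows}

@dlt.table
@dlt.expect_all_or_drop(get_rules("validity"))  # same rule set reused by every table
def orders_clean():
    return spark.readStream.table("orders_raw")

Because get_rules() is evaluated when the pipeline graph is built, updating the rules table and re-running the pipeline should refresh the checks for every table that references it, without touching the pipeline code.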

Contribute your Thoughts:

Luke
10 months ago
I'm just picturing the team arguing over which option is best, like a bunch of data ninjas fighting over the perfect data quality kata.
upvoted 0 times
Fernanda
9 months ago
D) Maintain data quality rules in a separate Databricks notebook that each DLT notebook or file references.
upvoted 0 times
Anglea
9 months ago
A) Maintain data quality rules in a Delta table outside of this pipeline's target schema, providing the schema name as a pipeline parameter.
upvoted 0 times
Alba
10 months ago
Option D is the one for me! Keeping the data quality rules in a separate notebook is like a data engineer's version of 'Keep Calm and Carry On.'
upvoted 0 times
Sharita
9 months ago
I agree, having a separate notebook for data quality rules makes it easier to manage.
upvoted 0 times
Nieves
9 months ago
Option D is a good choice. It helps keep things organized.
upvoted 0 times
Zachary
10 months ago
Using global Python variables (option B) feels a bit hacky. I'd prefer a more structured approach like option A or D.
upvoted 0 times
Lashawnda
9 months ago
D) Maintain data quality rules in a separate Databricks notebook that each DLT notebook or file references.
upvoted 0 times
Leanora
9 months ago
A) Maintain data quality rules in a Delta table outside of this pipeline's target schema, providing the schema name as a pipeline parameter.
upvoted 0 times
Jose
10 months ago
I agree with Val, option A seems like the most practical solution.
upvoted 0 times
Doug
10 months ago
I'm feeling option C. Adding constraints through an external job with access to the pipeline config seems like a robust solution.
upvoted 0 times
Kandis
9 months ago
Let's go with option C then, it seems like the most practical approach.
upvoted 0 times
Becky
9 months ago
It would definitely streamline the process and make it easier to manage.
upvoted 0 times
Dominque
9 months ago
I agree, having an external job handle the constraints seems efficient.
upvoted 0 times
Delbert
9 months ago
Option C sounds like a good idea. It would centralize the data quality rules.
upvoted 0 times
Val
10 months ago
But with option A, we can easily maintain and update the data quality rules.
upvoted 0 times
Mike
10 months ago
I disagree, I believe option D would be more efficient.
upvoted 0 times
Val
10 months ago
I think option A is the best approach.
upvoted 0 times
Antonio
11 months ago
Option A is the way to go! Maintaining data quality rules in a separate Delta table is a clean and organized approach.
upvoted 0 times
Lennie
10 months ago
That sounds like a smart solution to ensure consistency and efficiency in the data quality checks.
upvoted 0 times
Detra
10 months ago
I agree, it would make it easier to manage and update the data quality rules for all tables in the pipeline.
upvoted 0 times
Tequila
10 months ago
Option A is the way to go! Maintaining data quality rules in a separate Delta table is a clean and organized approach.
upvoted 0 times
