Databricks Certified Data Engineer Professional Exam Questions

Exam Name: Databricks Certified Data Engineer Professional Exam
Exam Code: Databricks Certified Data Engineer Professional
Related Certification(s): Databricks Data Engineer Professional Certification
Certification Provider: Databricks
Actual Exam Duration: 120 Minutes
Number of Databricks Certified Data Engineer Professional practice questions in our database: 215 (updated: Jun. 23, 2026)
Expected Databricks Certified Data Engineer Professional Exam Topics, as suggested by Databricks:
  • Topic 1: Databricks Tooling: The Databricks Tooling topic encompasses the various features and functionalities of Delta Lake. This includes understanding the transaction log, Optimistic Concurrency Control, Delta clone, indexing optimizations, and strategies for partitioning data for optimal performance in the Databricks SQL service.
  • Topic 2: Data Processing: The topic covers understanding partition hints, partitioning data effectively, controlling part-file sizes, updating records, leveraging Structured Streaming and Delta Lake, and implementing stream-static joins and deduplication. Additionally, it delves into utilizing Change Data Capture and addressing performance issues related to small files.
  • Topic 3: Data Modeling: It focuses on understanding the objectives of data transformations, using Change Data Feed, applying Delta Lake cloning, and designing multiplex bronze tables. It also covers implementing incremental processing and data quality enforcement, lookup tables, and Slowly Changing Dimension (SCD) Type 0, 1, and 2 tables.
  • Topic 4: Security & Governance: It discusses creating dynamic views to accomplish data masking and using dynamic views to control access to rows and columns.
  • Topic 5: Monitoring & Logging: This topic includes understanding the Spark UI, inspecting event timelines and metrics, drawing conclusions from various UIs, designing systems to control cost and latency SLAs for production streaming jobs, and deploying and monitoring both streaming and batch jobs.
  • Topic 6: Testing & Deployment: It discusses adapting notebook dependencies to use Python file dependencies, leveraging Wheels for imports, repairing and rerunning failed jobs, creating jobs based on common use cases, designing systems to control cost and latency SLAs, configuring the Databricks CLI, and using the REST API to clone a job, trigger a run, and export the run output.
Discuss Databricks Databricks Certified Data Engineer Professional Topics, Questions or Ask Anything Related

Margaret Stewart

4 days ago
I managed to pass after focusing on governance details like Unity Catalog permissions, credential passthrough concepts, and secure access patterns. Several questions were subtle about who can do what and where policies actually apply, so I reviewed real workspace scenarios.
upvoted 0 times
...

Data Processing Johnson

21 days ago
Expect scenario questions about Structured Streaming that probe watermarking, stateful aggregations, and exactly-once semantics when late data arrives. Practice building streaming pipelines, checkpoint behavior, and idempotent sinks so you can explain tradeoffs. I passed the exam after repeatedly testing those failure modes.
upvoted 0 times
...

Emily Campbell

1 month ago
I passed the Databricks Certified Data Engineer Professional exam by drilling Spark SQL and DataFrame transformations until I could reason about performance tradeoffs quickly. The trickiest part was knowing when to use built-in functions versus custom logic without overcomplicating pipelines.
upvoted 0 times
...

Data Modeling Johnson

2 months ago
Many questions give a business requirement and ask you to choose between star schema, wide denormalized tables, or normalized models based on query patterns and update frequency. They loved slowly changing dimension scenarios. Study star and snowflake designs, Delta Lake schema evolution, and partitioning impacts on reads and writes. I practiced those patterns and it helped me pass the exam. Thanks, Pass4Success, for a concise set of practice questions that sped up my prep.
upvoted 0 times

Data Processing Garcia

1 month ago
Expect performance tuning questions that show a Spark job and ask why a stage is slow or how to reduce shuffle, often with multiple valid options and one best tradeoff. Focus on explain plans, join strategies like broadcast versus shuffle, caching and partitioning strategy, and run a few jobs to see physical plans in action. For Security and Governance, you'll see scenario questions about access control where you must determine the minimal permissions, or how Unity Catalog and workspace permissions interact with table-level grants. Learn identity federation, fine-grained privileges, and encryption options so you can reason about least privilege and audit trails in exam scenarios.
upvoted 0 times

Databricks Tooling Thompson

17 days ago
Exam items often present a pipeline and ask which Databricks feature to use for job orchestration, secret management, or ML experiment tracking, testing your practical knowledge of the platform. Review cluster types and autoscaling, the Jobs API, DBFS paths, workspace structure, and MLflow basics so you can choose the most operationally sound solution. For Testing and Deployment, look out for questions that require designing a CI/CD pipeline for Spark jobs or validating deployments with unit and integration tests under different environments. Practice writing pytest-based Spark tests, dbx or Terraform deployment flows, and the rollback and monitoring strategies used in production.
upvoted 0 times

Steven Adams

2 months ago
During the exam the question about Delta Lake merge semantics for CDC and handling overlapping upserts was surprisingly tricky. Practicing edge cases with small notebooks helped.
upvoted 0 times

Michael Flores

2 months ago
Interestingly, the performance questions forced me to weigh partitioning and join strategy tradeoffs instead of selecting an obvious choice.
upvoted 0 times

Ryan Bell

2 months ago
For pragmatic advice, learning to read the Spark UI and identify expensive shuffles resolved many performance guessing games.
upvoted 0 times

Crystal Brown

1 month ago
Another confusing area was workspace security because some scenarios mixed cluster policies with table ACLs and required careful reasoning.
upvoted 0 times

Gary Walker

2 months ago
Also, writing unit tests that simulate late-arriving records and conflicting keys made the merge behavior click for me.
upvoted 0 times
...

Donald Collins

2 months ago
Honestly, when I practiced on Databricks, hands-on labs around structured streaming watermarking clarified how late events and windowing interact under failure conditions.
upvoted 0 times

Alishia

3 months ago
Performance tuning and cost optimization in Delta Lake was rough, but the practice exams highlighted the right knobs to tweak and how to justify them.
upvoted 0 times
...

Dulce

3 months ago
I was anxious about complex data pipelines, but pass4success helped me break topics into manageable chunks and simulate exam conditions. You can do this—study smart and stay calm.
upvoted 0 times
...

Pearlie

3 months ago
Successfully passing the Databricks Certified Data Engineer Professional exam was a great experience, and Pass4Success practice questions were a big help. There was a tricky question about the different Databricks tools for data engineering. I was a bit confused about their specific use cases, but I managed to pass.
upvoted 0 times
...

Evelynn

4 months ago
I recently passed the Databricks Certified Data Engineer Professional exam, and the Pass4Success practice questions were invaluable. One question I remember was about setting up role-based access control (RBAC) for different users in Databricks. I wasn't entirely sure about the best practices, but I still succeeded.
upvoted 0 times
...

Yoko

4 months ago
Before the test, I doubted my timing and understanding; Pass4Success offered structured practice and actionable feedback that boosted my momentum. Keep practicing, your breakthrough is near.
upvoted 0 times
...

Louvenia

4 months ago
Don't underestimate the importance of understanding the fundamentals. The pass4success practice tests really drilled down into the core concepts.
upvoted 0 times
...

Glory

4 months ago
I felt the pressure rising as the date approached, yet Pass4Success provided focused reviews and strategic tips that turned doubt into clarity. Stay steady, and you'll conquer it like I did.
upvoted 0 times
...

Mona

5 months ago
Data compliance scenarios were presented. Understand how to implement data masking, auditing, and access controls for sensitive information in Databricks.
upvoted 0 times
...

Mattie

5 months ago
Passing the Databricks Certified Data Engineer Professional exam was a significant achievement, thanks to Pass4Success practice questions. A challenging question involved the different types of data processing, including batch and incremental processing. I was unsure about some optimization techniques, but I managed to pass.
upvoted 0 times
...

Lavonda

5 months ago
Revise your notes thoroughly. The Pass4Success practice questions mirrored the exam format, so I knew exactly what to expect.
upvoted 0 times
...

Antonio

5 months ago
Unity Catalog data discovery features were emphasized. Know how to implement and use data search, lineage, and tagging functionalities.
upvoted 0 times
...

Billye

6 months ago
The tricky part was understanding job orchestration in Airflow vs Databricks workflows; Pass4Success questions mirrored the exact decision points I faced.
upvoted 0 times
...

Rosio

6 months ago
The most challenging topic was streaming ETL and state management; the practice sets simulated burst loads and edge cases, which was invaluable.
upvoted 0 times
...

Kimbery

6 months ago
SQL window functions and complex joins were brutal, yet Pass4Success practice exposed the common missteps and gave me solid examples to memorize.
upvoted 0 times
...

Noe

6 months ago
I passed the Databricks Certified Data Engineer Professional exam, and the Pass4Success practice questions were a big help. One question I found difficult was about optimizing batch processing jobs in Databricks. I wasn't sure about the best optimization techniques, but I managed to pass.
upvoted 0 times
...

Sharen

7 months ago
Observability and monitoring scenarios came up. Understand how to set up alerts and dashboards for Databricks jobs and clusters.
upvoted 0 times
...

Mitsue

7 months ago
My nerves almost got the best of me, but pass4success gave me a proven study roadmap and realistic mock exams that built real confidence. You're capable—believe in your prep and acing is within reach.
upvoted 0 times
...

Lacresha

7 months ago
I struggled with Delta Lake ACID transactions and time travel, but the practice exams drilled those scenarios with real-world twists, making the tricky questions feel manageable.
upvoted 0 times
...

Domitila

7 months ago
Confidence is key! The Pass4Success practice exams boosted my self-assurance and allowed me to tackle the real exam with ease.
upvoted 0 times
...

Cassi

8 months ago
I was jittery before the Databricks exam, unsure I'd tackled the toughest topics; Pass4Success offered structured practice, clear explanations, and timed drills, and I walked out confident. To anyone still prepping: you've got this, keep at it and trust the process.
upvoted 0 times
...

Chau

8 months ago
Manage your time wisely during the exam. The Pass4Success practice tests gave me a great sense of the pacing and question types I'd encounter.
upvoted 0 times
...

Nadine

8 months ago
External data source integration was tested. Practice connecting to and querying various data sources like Redshift, Snowflake from Databricks.
upvoted 0 times
...

Sharee

8 months ago
The hardest part for me was optimizing Spark jobs and understanding Catalyst optimizations; pass4success drills helped me see the common pitfalls in query plans and how to tune shuffles.
upvoted 0 times
...

Niesha

9 months ago
Passing the Databricks Certified Data Engineer Professional exam was a game-changer for me. The Pass4Success practice exams really helped me identify my weak areas and focus my study efforts.
upvoted 0 times
...

Mary

9 months ago
Passing the Databricks Certified Data Engineer Professional exam was a milestone for me, and Pass4Success practice questions were crucial. A question that stood out was about creating star and snowflake schemas in data modeling. I was unsure about when to use each schema, but I still passed.
upvoted 0 times
...

Ming

9 months ago
I am thrilled to have passed the Databricks Certified Data Engineer Professional exam, with the help of Pass4Success practice questions. One challenging question involved the steps for deploying Databricks jobs using CI/CD pipelines. I wasn't entirely sure about the best practices, but I succeeded.
upvoted 0 times
...

Dante

9 months ago
Data encryption questions were included. Know how to implement encryption at rest and in transit, including key management in Databricks.
upvoted 0 times
...

Margot

10 months ago
Successfully passing the Databricks Certified Data Engineer Professional exam was a great experience, and Pass4Success practice questions were a big help. There was a tricky question about the different Databricks tools for data engineering. I was a bit confused about their specific use cases, but I managed to pass.
upvoted 0 times
...

Lindsey

10 months ago
Nailed the Databricks Data Engineer exam thanks to Pass4Success. Their prep was invaluable!
upvoted 0 times
...

Ryan

10 months ago
Cluster cost optimization scenarios were presented. Understand autoscaling, spot instances, and how to balance performance with cost in Databricks.
upvoted 0 times
...

Fernanda

12 months ago
Delta Live Tables questions appeared. Know how to design and implement end-to-end streaming pipelines with built-in quality checks.
upvoted 0 times
...

Stacey

1 year ago
Unity Catalog metadata management was a focus. Understand how to organize and discover data assets across multiple workspaces.
upvoted 0 times
...

Rosann

1 year ago
Databricks exam success! Pass4Success materials were spot-on and time-efficient.
upvoted 0 times
...

Marti

1 year ago
Exam tested knowledge on handling large-scale data processing. Study techniques for optimizing shuffle operations and managing skew in Spark.
upvoted 0 times
...

Ellen

1 year ago
Just became a Databricks Certified Data Engineer! Pass4Success, you're a game-changer for quick study.
upvoted 0 times
...

Emmett

1 year ago
Databricks API usage scenarios were included. Practice automating common tasks like job scheduling and cluster management via REST API.
upvoted 0 times
...

Cherry

1 year ago
Questions on data lake design principles came up. Understand bronze, silver, gold architecture and how to implement it using Delta Lake.
upvoted 0 times
...

Alana

1 year ago
Pass4Success made Databricks exam prep a breeze. Passed with confidence!
upvoted 0 times
...

Jovita

1 year ago
CI/CD pipeline design for Databricks projects was tested. Know best practices for version control and automated testing of notebooks and jobs.
upvoted 0 times
...

Beatriz

1 year ago
Passed the Databricks cert! Pass4Success questions were eerily similar to the real thing.
upvoted 0 times
...

Leslie

1 year ago
Multi-cloud scenarios were presented. Understand how to design portable Databricks solutions that can run on different cloud platforms.
upvoted 0 times
...

Michael

1 year ago
Performance tuning questions appeared. Study techniques for optimizing Spark jobs, including partitioning, bucketing, and Z-ordering in Delta tables.
upvoted 0 times
...

Laurena

1 year ago
Thanks to Pass4Success, I conquered the Databricks Data Engineer exam in no time. Highly recommend!
upvoted 0 times
...

Remedios

1 year ago
Data quality checks were emphasized. Know how to implement and automate data validation using Delta expectations and quality rules.
upvoted 0 times
...

Dana

1 year ago
MLflow integration was tested. Understand how to track experiments, log metrics, and deploy models using MLflow within Databricks.
upvoted 0 times
...

Brittni

1 year ago
Databricks certification achieved! Pass4Success, you're the real MVP for quick and effective prep.
upvoted 0 times
...

Laurel

1 year ago
I recently passed the Databricks Certified Data Engineer Professional exam, and the Pass4Success practice questions were invaluable. One question I remember was about setting up monitoring and logging for Databricks jobs. I wasn't completely confident in my answer, but I still succeeded.
upvoted 0 times
...

Nidia

1 year ago
Complex ETL scenarios using Databricks notebooks were presented. Practice designing multi-step transformations with error handling and notifications.
upvoted 0 times
...

Lezlie

2 years ago
Data governance questions were prevalent. Familiarize yourself with ACID properties in Delta Lake and how they enhance data reliability.
upvoted 0 times
...

Dana

2 years ago
Pass4Success nailed it with their Databricks exam prep. Passed on my first try!
upvoted 0 times
...

Renato

2 years ago
Passing the Databricks Certified Data Engineer Professional exam was a significant achievement, thanks to Pass4Success practice questions. A challenging question involved the different types of data processing, including batch and incremental processing. I was unsure about some optimization techniques, but I managed to pass.
upvoted 0 times
...

Yaeko

2 years ago
Cluster configuration scenarios were tricky. Know how to size and configure clusters for various workloads, including ML training and ETL jobs.
upvoted 0 times
...

Dean

2 years ago
I am excited to have passed the Databricks Certified Data Engineer Professional exam, with the help of Pass4Success practice questions. One question that puzzled me was about implementing security and governance policies in Databricks. I wasn't entirely sure about the best practices, but I still passed.
upvoted 0 times
...

Son

2 years ago
Structured Streaming questions popped up. Understand windowing functions, watermarking, and how to handle late-arriving data in Databricks.
upvoted 0 times
...

Alex

2 years ago
Couldn't have passed the Databricks Data Engineer exam without Pass4Success. Their questions were so relevant!
upvoted 0 times
...

Effie

2 years ago
Passing the Databricks Certified Data Engineer Professional exam was a milestone for me, and Pass4Success practice questions played a crucial role. There was a question about setting up monitoring and logging for Databricks clusters. I was a bit uncertain about the specific tools and configurations, but I succeeded.
upvoted 0 times
...

Maybelle

2 years ago
Cloud integration is key. Be prepared to design solutions that leverage Azure Data Factory or AWS Glue for orchestration with Databricks workflows.
upvoted 0 times
...

Stefany

2 years ago
I passed the Databricks Certified Data Engineer Professional exam, and the Pass4Success practice questions were a big help. One question I found difficult was about optimizing batch processing jobs in Databricks. I wasn't sure about the best optimization techniques, but I managed to pass.
upvoted 0 times
...

Heike

2 years ago
Unity Catalog permissions were a hot topic. Know how to manage access control at table, view, and column levels. Practice scenarios involving multiple catalogs and metastores.
upvoted 0 times
...

Gearldine

2 years ago
Databricks exam was tough, but Pass4Success prep made it manageable. Passed with flying colors!
upvoted 0 times
...

Misty

2 years ago
Successfully passing the Databricks Certified Data Engineer Professional exam was made easier with Pass4Success practice questions. A question that stood out was about the different Databricks tools available for data engineering tasks. I was unsure about the specific use cases for some tools, but I still passed.
upvoted 0 times
...

Charlesetta

2 years ago
Encountered questions on data modeling best practices. Understand star schema vs. snowflake schema trade-offs and when to use each in Databricks environments.
upvoted 0 times
...

Alesia

2 years ago
I am thrilled to have passed the Databricks Certified Data Engineer Professional exam, and the Pass4Success practice questions were a key resource. One challenging question involved the steps for deploying a Databricks job using CI/CD pipelines. I wasn't completely confident in my answer, but I managed to get through.
upvoted 0 times
...

Aretha

2 years ago
Wow, aced the Databricks cert in record time! Pass4Success materials were a lifesaver.
upvoted 0 times
...

Gary

2 years ago
Exam focus: Databricks SQL warehouse optimization. Be ready to interpret query plans and suggest improvements. Study execution modes and caching strategies.
upvoted 0 times
...

Mozell

2 years ago
Passing the Databricks Certified Data Engineer Professional exam was a great achievement for me, thanks to the Pass4Success practice questions. There was a tricky question about creating star and snowflake schemas in data modeling. I was a bit confused about when to use each schema, but I still succeeded.
upvoted 0 times
...

Sharen

2 years ago
I recently passed the Databricks Certified Data Engineer Professional exam, and the Pass4Success practice questions were incredibly helpful. One question I remember was about setting up role-based access control (RBAC) for different users in Databricks. I wasn't entirely sure about the best practices for implementing RBAC, but I managed to pass the exam.
upvoted 0 times
...

Isabella

2 years ago
Just passed the Databricks Certified Data Engineer Professional exam! Grateful to Pass4Success for their spot-on practice questions. Tip: Know your Delta Lake operations inside out, especially MERGE and time travel features.
upvoted 0 times
...

Sheridan

2 years ago
Just passed the Databricks Data Engineer Professional exam! Thanks Pass4Success for the spot-on practice questions.
upvoted 0 times
...

Adolph

2 years ago
Passing the Databricks Certified Data Engineer Professional exam was a rewarding experience, and I owe a big thanks to Pass4Success for their helpful practice questions. The exam covered topics like controlling part-file sizes and implementing stream-static joins. One question that I recall was about deduplicating data efficiently using Delta Lake. It required a good grasp of deduplication techniques, but I managed to tackle it successfully.
upvoted 0 times
...

Jaime

2 years ago
My exam experience was great, thanks to Pass4Success practice questions. I found the topics of Delta Lake and Structured Streaming to be particularly challenging. One question that I remember was about leveraging Change Data Capture to track changes in data over time. It required a deep understanding of how CDC works, but I was able to answer it confidently.
upvoted 0 times
...

Elmira

2 years ago
Just became a Databricks Certified Data Engineer Professional! Pass4Success's prep materials were crucial. Thanks for the efficient study resource!
upvoted 0 times
...

Jesusita

2 years ago
I recently passed the Databricks Certified Data Engineer Professional exam with the help of Pass4Success practice questions. The exam covered topics like Databricks Tooling and Data Processing. One question that stood out to me was related to optimizing performance in the Databricks SQL service by utilizing indexing optimizations. It was a bit tricky, but I managed to answer it correctly.
upvoted 0 times
...

Richelle

2 years ago
Just passed the Databricks Certified Data Engineer Professional exam! Pass4Success's questions were spot-on and saved me tons of prep time. Thanks!
upvoted 0 times
...

Denny

2 years ago
Wow, that exam was tough! Grateful for Pass4Success's relevant practice questions. Couldn't have passed without them!
upvoted 0 times
...

Alysa

2 years ago
Passed the Databricks cert! Pass4Success's exam prep was a lifesaver. Highly recommend for quick, effective studying.
upvoted 0 times
...

Herman

2 years ago
Success! Databricks Certified Data Engineer Professional exam done. Pass4Success, your questions were invaluable. Thank you!
upvoted 0 times
...

Thad

2 years ago
Databricks SQL warehouses were a significant focus. Questions involved scaling and performance tuning. Familiarize yourself with cluster configurations and caching mechanisms. Pass4Success's practice questions were spot-on for this topic.
upvoted 0 times
...

Free Databricks Databricks Certified Data Engineer Professional Exam Actual Questions

Note: Premium Questions for Databricks Certified Data Engineer Professional were last updated on Jun. 23, 2026 (see below)

Question #1

A junior data engineer has been asked to develop a streaming data pipeline with a grouped aggregation using DataFrame df. The pipeline needs to calculate the average humidity and average temperature for each non-overlapping five-minute interval. Events are recorded once per minute per device.

Streaming DataFrame df has the following schema:

"device_id INT, event_time TIMESTAMP, temp FLOAT, humidity FLOAT"

Code block:

Choose the response that correctly fills in the blank within the code block to complete this task.

Correct Answer: B

This is the correct answer because the window function is used to group streaming data by time intervals. The window function takes two arguments: a time column and a window duration. The window duration specifies how long each window is, and must be a multiple of 1 second. In this case, the window duration is "5 minutes", which means each window will cover a non-overlapping five-minute interval. The window function also returns a struct column with two fields: start and end, which represent the start and end time of each window. The alias function is used to rename the struct column as "time". Verified Reference: [Databricks Certified Data Engineer Professional], under "Structured Streaming" section; Databricks Documentation, under "WINDOW" section. https://www.databricks.com/blog/2017/05/08/event-time-aggregation-watermarking-apache-sparks-structured-streaming.html
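
The original code block and answer options are not shown here, so the following is only a plain-Python sketch of what a non-overlapping ("tumbling") five-minute window does to the rows of df. It is not Spark code: in PySpark the grouping would typically come from `window(col("event_time"), "5 minutes")`, whereas the helper names `window_start` and `average_by_window` below are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1)  # windows are aligned to the epoch in this sketch


def window_start(event_time):
    """Floor a timestamp to the start of its five-minute window."""
    return event_time - (event_time - EPOCH) % WINDOW


def average_by_window(events):
    """Average temp and humidity per window.

    events: iterable of (device_id, event_time, temp, humidity) tuples.
    Returns {(start, end): (avg_temp, avg_humidity)}; each window is
    the half-open interval [start, end).
    """
    buckets = defaultdict(list)
    for _device_id, event_time, temp, humidity in events:
        start = window_start(event_time)
        buckets[(start, start + WINDOW)].append((temp, humidity))
    return {
        key: (
            sum(t for t, _ in rows) / len(rows),
            sum(h for _, h in rows) / len(rows),
        )
        for key, rows in buckets.items()
    }


if __name__ == "__main__":
    sample = [
        (1, datetime(2024, 1, 1, 12, 0), 20.0, 40.0),
        (1, datetime(2024, 1, 1, 12, 4), 22.0, 44.0),
        (1, datetime(2024, 1, 1, 12, 5), 30.0, 50.0),  # falls in the next window
    ]
    for (start, end), averages in sorted(average_by_window(sample).items()):
        print(start, end, averages)
```

Because each window is half-open, an event stamped exactly 12:05 belongs to the 12:05 to 12:10 window and not the one before it, which is what makes the intervals non-overlapping.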


Question #2

Which statement describes integration testing?

Correct Answer: A

This is the correct answer because it describes integration testing. Integration testing is a type of testing that validates interactions between subsystems of your application, such as modules, components, or services. Integration testing ensures that the subsystems work together as expected and produce the correct outputs or results. Integration testing can be done at different levels of granularity, such as component integration testing, system integration testing, or end-to-end testing. Integration testing can help detect errors or bugs that may not be found by unit testing, which only validates behavior of individual elements of your application. Verified Reference: [Databricks Certified Data Engineer Professional], under "Testing" section; Databricks Documentation, under "Integration testing" section.
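
To make the distinction concrete, here is a small sketch with two hypothetical pipeline stages (`parse_reading` and `average_by_device` are invented for this example). The unit test checks one stage in isolation; the integration test checks that the output of one stage is accepted by the next and that the combined result is correct.

```python
def parse_reading(line):
    """Parse a 'device,temp' text line into a (device, temp) tuple."""
    device, temp = line.split(",")
    return device.strip(), float(temp)


def average_by_device(readings):
    """Average the temperature per device from (device, temp) tuples."""
    totals = {}
    for device, temp in readings:
        count, total = totals.get(device, (0, 0.0))
        totals[device] = (count + 1, total + temp)
    return {device: total / count for device, (count, total) in totals.items()}


def test_parse_reading_unit():
    # Unit test: exercises a single function on its own.
    assert parse_reading("a, 20.5") == ("a", 20.5)


def test_pipeline_integration():
    # Integration test: exercises the interaction between both stages.
    lines = ["a,20", "a,22", "b,30"]
    result = average_by_device(parse_reading(line) for line in lines)
    assert result == {"a": 21.0, "b": 30.0}


if __name__ == "__main__":
    test_parse_reading_unit()
    test_pipeline_integration()
    print("all tests passed")
```

Both test functions follow the pytest naming convention, so they could be collected by pytest as-is, but nothing in the sketch depends on it.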


Question #3

The data engineering team is migrating an enterprise system with thousands of tables and views into the Lakehouse. They plan to implement the target architecture using a series of bronze, silver, and gold tables. Bronze tables will almost exclusively be used by production data engineering workloads, while silver tables will be used to support both data engineering and machine learning workloads. Gold tables will largely serve business intelligence and reporting purposes. While personal identifying information (PII) exists in all tiers of data, pseudonymization and anonymization rules are in place for all data at the silver and gold levels.

The organization is interested in reducing security concerns while maximizing the ability to collaborate across diverse teams.

Which statement exemplifies best practices for implementing this system?

Correct Answer: A

This is the correct answer because it exemplifies best practices for implementing this system. By isolating tables in separate databases based on data quality tiers, such as bronze, silver, and gold, the data engineering team can achieve several benefits. First, they can easily manage permissions for different users and groups through database ACLs, which allow granting or revoking access to databases, tables, or views. Second, they can physically separate the default storage locations for managed tables in each database, which can improve performance and reduce costs. Third, they can provide a clear and consistent naming convention for the tables in each database, which can improve discoverability and usability. Verified Reference: [Databricks Certified Data Engineer Professional], under "Lakehouse" section; Databricks Documentation, under "Database object privileges" section.
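
As a rough illustration of per-tier database ACLs, the sketch below generates one pair of GRANT statements per group and database. The group names and the tier-to-group mapping are assumptions made up for this example, and the `USAGE` and `SELECT` privilege names follow the legacy table access control syntax; Unity Catalog uses different privilege names, so treat the SQL text as indicative only.

```python
# Hypothetical mapping of tier databases to the groups that may read them.
TIER_READERS = {
    "bronze": ["data_engineers"],
    "silver": ["data_engineers", "ml_engineers"],
    "gold": ["data_engineers", "bi_analysts"],
}


def grant_statements(tier_readers):
    """Build read-only GRANT statements, one database per quality tier."""
    statements = []
    for database, groups in tier_readers.items():
        for group in groups:
            # USAGE lets the group reference the database; SELECT lets it read.
            statements.append(f"GRANT USAGE ON DATABASE {database} TO `{group}`")
            statements.append(f"GRANT SELECT ON DATABASE {database} TO `{group}`")
    return statements


if __name__ == "__main__":
    for statement in grant_statements(TIER_READERS):
        print(statement)
```

Keeping each tier in its own database means a whole tier can be opened to or closed off from a team with a couple of statements, without touching individual tables.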


Question #4

A data engineering team has a time-consuming data ingestion job with three data sources. Each notebook takes about one hour to load new data. One day, the job fails because a notebook update introduced a new required configuration parameter. The team must quickly fix the issue and load the latest data from the failing source.

Which action should the team take?

Correct Answer: A

The repair run capability in Databricks Jobs allows re-execution of failed tasks without re-running successful ones. When a parameterized job fails due to missing or incorrect task configuration, engineers can perform a repair run to fix inputs or parameters and resume from the failed state.

This approach saves time, reduces cost, and ensures workflow continuity by avoiding unnecessary recomputation. Additionally, updating the task definition with the missing parameter prevents future runs from failing.

Running the job manually (B) loses run context; (C) alone does not prevent recurrence; (D) delays resolution. Thus, A follows the correct operational and recovery practice.
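
A repair run can also be requested programmatically. The sketch below only builds the HTTP request for the Jobs API `runs/repair` endpoint and does not send it; the host, token, run ID, task key, and parameter name are all placeholders, and the field names (`run_id`, `rerun_tasks`, `notebook_params`) reflect the Jobs API 2.1 as I understand it, so check them against the current API reference before relying on them.

```python
import json
from urllib.request import Request


def build_repair_request(host, token, run_id, failed_task_key, notebook_params):
    """Build (but do not send) a request that re-runs only the failed task.

    notebook_params carries the newly required configuration parameter
    so the repaired task starts with the corrected inputs.
    """
    payload = {
        "run_id": run_id,
        "rerun_tasks": [failed_task_key],
        "notebook_params": notebook_params,
    }
    return Request(
        url=f"{host}/api/2.1/jobs/runs/repair",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    request = build_repair_request(
        host="https://example.cloud.databricks.com",
        token="<personal-access-token>",
        run_id=123,
        failed_task_key="load_source_c",
        notebook_params={"new_required_param": "value"},
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would submit it; the two tasks that already succeeded are left alone, which is where the time saving comes from.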


Question #5

The data engineering team has been tasked with configuring connections to an external database that does not have a supported native connector with Databricks. The external database already has data security configured by group membership. These groups map directly to user groups already created in Databricks that represent various teams within the company.

A new login credential has been created for each group in the external database. The Databricks Utilities Secrets module will be used to make these credentials available to Databricks users.

Assuming that all the credentials are configured correctly on the external database and group membership is properly configured on Databricks, which statement describes how teams can be granted the minimum necessary access to use these credentials?

Correct Answer: C

In Databricks, the Secrets module allows for secure management of sensitive information such as database credentials. Secret access control is applied at the scope level, so granting 'Read' permission on a secret scope that contains only the credentials for a specific team ensures that only members of that team can access these credentials. This approach aligns with the principle of least privilege, granting users the minimum level of access required to perform their jobs, thus enhancing security.


Databricks Documentation on Secret Management: Secrets
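
One way to picture the least-privilege setup is one secret scope per team, with that team's group granted READ on its own scope only. The sketch below builds the request bodies for the Secrets API ACL endpoint (`/api/2.0/secrets/acls/put`, with `scope`, `principal`, and `permission` fields, as far as I know); the team and scope names are invented for this example, and nothing is sent.

```python
ACL_ENDPOINT = "/api/2.0/secrets/acls/put"

# Hypothetical mapping: Databricks group name -> secret scope holding
# only that team's external-database credential.
TEAM_SCOPES = {
    "finance_team": "finance_db_creds",
    "marketing_team": "marketing_db_creds",
}


def build_secret_acl_payloads(team_scopes):
    """Build one READ grant per team, limited to that team's own scope."""
    return [
        {"scope": scope, "principal": team, "permission": "READ"}
        for team, scope in sorted(team_scopes.items())
    ]


if __name__ == "__main__":
    for payload in build_secret_acl_payloads(TEAM_SCOPES):
        print("POST", ACL_ENDPOINT, payload)
```

READ is the lowest of the three secret permission levels (READ, WRITE, MANAGE), so each team can fetch its credential with the Secrets utilities but cannot change it or grant it to others.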

