Databricks Certified Data Engineer Professional Exam - Topic 2 Question 32 Discussion

Actual exam question for Databricks's Databricks Certified Data Engineer Professional exam
Question #: 32
Topic #: 2

A data engineer wants to run unit tests using common Python testing frameworks on Python functions defined across several Databricks notebooks currently used in production.

How can the data engineer run unit tests against functions that work with data in production?

Suggested Answer: A

The best practice for unit testing functions that work with data is to run the tests against a non-production dataset that closely mirrors production. The data engineer can then validate the logic of each function without any risk of changing or corrupting production data. The sample should be representative enough to include the edge cases found in production, so that functions which pass the tests also behave correctly in the production environment.
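
As a rough illustration of the idea above, here is a minimal sketch using Python's built-in `unittest`. Everything in it is an assumption made for the example and not part of the exam question: the function `add_order_total`, the row schema, and the hand-built `SAMPLE_ORDERS` data that stands in for a non-production copy of a table. The helper `run_tests` is used instead of `unittest.main()` so the tests can be triggered from a notebook cell without exiting the interpreter.

```python
import unittest


def add_order_total(rows):
    """Hypothetical function under test: return new rows with a 'total' field."""
    return [{**row, "total": row["quantity"] * row["unit_price"]} for row in rows]


# Hand-built sample that mimics the shape of production rows (assumed schema),
# including an edge case (zero quantity). No production table is read.
SAMPLE_ORDERS = [
    {"order_id": 1, "quantity": 2, "unit_price": 5.0},
    {"order_id": 2, "quantity": 0, "unit_price": 9.99},
]


class AddOrderTotalTest(unittest.TestCase):
    def test_total_is_quantity_times_price(self):
        result = add_order_total(SAMPLE_ORDERS)
        self.assertEqual([r["total"] for r in result], [10.0, 0.0])

    def test_input_rows_are_not_mutated(self):
        add_order_total(SAMPLE_ORDERS)
        self.assertNotIn("total", SAMPLE_ORDERS[0])


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddOrderTotalTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` returns a result object whose `wasSuccessful()` method reports whether every test passed. In a real workspace the sample rows would more likely come from a development or staging table with the same schema as production; the in-memory list here only keeps the sketch self-contained.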


Databricks Documentation on Testing: Testing and Validation of Data and Notebooks

Adelle
2 months ago
D is a good approach if you want to keep things organized.
upvoted 0 times
...
Jerlene
2 months ago
Wait, can you really run tests in production notebooks? That seems sketchy!
upvoted 0 times
...
Amie
3 months ago
I disagree, B could be more efficient with version control.
upvoted 0 times
...
Gladys
3 months ago
C sounds risky, mixing tests with production code isn't ideal.
upvoted 0 times
...
Boris
3 months ago
A is the safest option for testing without risking production data.
upvoted 0 times
...
Carline
3 months ago
I think importing unit test functions from a separate notebook is a good way to keep tests modular, but I wonder if it complicates the setup too much.
upvoted 0 times
...
Delisa
4 months ago
I feel like defining unit tests in the same notebook could lead to confusion, but I can't recall if that's a common practice or not.
upvoted 0 times
...
Ruby
4 months ago
I'm a bit unsure about the best approach, but I think using Files in Repos could help keep things organized, like we practiced in our last session.
upvoted 0 times
...
Antonio
4 months ago
I remember we discussed the importance of using non-production data for testing to avoid any issues, so I think option A makes sense.
upvoted 0 times
...
Eloisa
4 months ago
I'm a bit confused on the best way to approach this. Should I just define everything in the same notebook? Or is it better to separate the unit tests and functions? I'll need to review the options carefully.
upvoted 0 times
...
Desire
4 months ago
Okay, I've got a plan. I'll use non-production data that closely mirrors the real thing, and define the unit tests and functions in separate notebooks. That way I can test everything thoroughly without impacting the live system.
upvoted 0 times
...
Nieves
5 months ago
Hmm, I think defining the functions and unit tests in separate files or notebooks might be the way to go. That way I can keep everything organized and easily reusable.
upvoted 0 times
...
Ilene
5 months ago
This looks like a tricky one. I'm not sure if I should use production data or not for the unit tests. I'll have to think carefully about the pros and cons of each approach.
upvoted 0 times
...
Tamesha
11 months ago
Option D is the clear winner here. Separate the tests from the actual code - that's the professional way to do it. Unless, of course, you're a mad scientist who enjoys mixing it all together. *cue evil laughter*
upvoted 0 times
Vanna
9 months ago
Rolande: That's a valid point, but separating them can help with debugging and maintenance.
upvoted 0 times
...
Izetta
9 months ago
I prefer having everything in one notebook for easier access.
upvoted 0 times
...
Rolande
10 months ago
Agreed, it's important to maintain clean and organized code.
upvoted 0 times
...
Jettie
10 months ago
Option D is definitely the way to go. Keep the tests separate from the code.
upvoted 0 times
...
...
Una
11 months ago
I think option C is the most practical solution.
upvoted 0 times
...
Thaddeus
11 months ago
Importing the unit test functions from a separate notebook is the way to go. Nice and modular, just like how I like my code.
upvoted 0 times
...
Jolene
11 months ago
I prefer option B, it's easier to manage.
upvoted 0 times
...
Hubert
11 months ago
I disagree, I believe option D is more efficient.
upvoted 0 times
...
Marci
11 months ago
Putting the unit tests in the same notebook as the functions? That just seems messy and hard to manage. Definitely going with option D.
upvoted 0 times
Cristen
10 months ago
Defining and unit testing functions using Files in Repos might provide a more organized way to manage the tests.
upvoted 0 times
...
Remedios
10 months ago
Running unit tests against non-production data that closely mirrors production could also be a good approach.
upvoted 0 times
...
Eden
10 months ago
I agree, putting unit tests in the same notebook can get messy. Option D seems like a better choice.
upvoted 0 times
...
...
Evan
11 months ago
I think option A is the best choice.
upvoted 0 times
...
Mayra
11 months ago
Defining functions in a repository and unit testing them there sounds like a good approach to me. Keeps things organized and maintainable.
upvoted 0 times
Glory
10 months ago
Defining functions in a repository and unit testing them there sounds like a good approach to me. Keeps things organized and maintainable.
upvoted 0 times
...
Andree
10 months ago
B) Define and unit test functions using Files in Repos
upvoted 0 times
...
Dion
10 months ago
A) Run unit tests against non-production data that closely mirrors production
upvoted 0 times
...
...
Sonia
11 months ago
Option A makes the most sense. Running tests against non-production data ensures we don't disrupt the actual production environment.
upvoted 0 times
Trevor
11 months ago
B) Define and unit test functions using Files in Repos
upvoted 0 times
...
Helga
11 months ago
That's a good point. We definitely don't want to mess with the production data.
upvoted 0 times
...
Carmela
11 months ago
A) Run unit tests against non-production data that closely mirrors production
upvoted 0 times
...
...
