A data engineer has created a new database using the following command:
CREATE DATABASE IF NOT EXISTS customer360;
In which of the following locations will the customer360 database be located?
dbfs:/user/hive/warehouse, which places the database directory at dbfs:/user/hive/warehouse/customer360.db
The location of the customer360 database depends on the value of the spark.sql.warehouse.dir configuration property, which specifies the default location for managed databases and tables. If the property is not set, the default value is dbfs:/user/hive/warehouse. Therefore, the customer360 database will be located in dbfs:/user/hive/warehouse/customer360.db. However, if the property is set to a different value, such as dbfs:/user/hive/database, then the customer360 database will be located in dbfs:/user/hive/database/customer360.db. Thus, more information is needed to determine the correct response.
Option A is not correct, as dbfs:/user/hive/database/customer360 is not the default location for managed databases and tables, unless the spark.sql.warehouse.dir property is explicitly set to dbfs:/user/hive/database.
Option B is not correct, as dbfs:/user/hive/warehouse is the default root directory for managed databases and tables, not the location of a specific database. The database name, with the suffix .db, is appended to the directory path, as in dbfs:/user/hive/warehouse/customer360.db.
Option C is not correct, as dbfs:/user/hive/customer360 is not a valid location for a managed database: it does not follow the directory structure specified by the spark.sql.warehouse.dir property.
Reference: Databricks Data Engineer Professional Exam Guide
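The location can be checked directly rather than inferred. The statements below are a minimal sketch, assuming a workspace that uses the legacy Hive metastore; with Unity Catalog the reported path may differ. The second database name and its LOCATION path are hypothetical and only illustrate an explicit override.

```sql
-- Show the warehouse root configured for the current session.
SET spark.sql.warehouse.dir;

-- Create the database without a LOCATION clause, as in the question.
CREATE DATABASE IF NOT EXISTS customer360;

-- The Location row in the output reports where the database directory lives.
DESCRIBE DATABASE EXTENDED customer360;

-- Hypothetical example: an explicit LOCATION takes the place of the default root.
CREATE DATABASE IF NOT EXISTS customer360_custom
LOCATION 'dbfs:/user/hive/database/customer360_custom.db';
```

Comparing the two DESCRIBE outputs (with and without LOCATION) is a quick way to see which part of the path comes from the warehouse directory and which from the database name.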
Which of the following SQL keywords can be used to convert a table from a long format to a wide format?
References: Reshaping Data - Long vs Wide Format | Databricks on AWS; TRANSFORM | Databricks on AWS; SUM | Databricks on AWS
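In Spark SQL, the clause that reshapes rows into columns is PIVOT. The sketch below illustrates it on a small, made-up temporary view; the view name, column names, and values are all hypothetical.

```sql
-- Hypothetical long-format data: one row per (region, quarter).
CREATE OR REPLACE TEMPORARY VIEW sales_long AS
SELECT * FROM VALUES
  ('EMEA', 'Q1', 100),
  ('EMEA', 'Q2', 150),
  ('APAC', 'Q1', 80),
  ('APAC', 'Q2', 120)
AS t(region, quarter, amount);

-- Long to wide: each listed quarter value becomes its own column,
-- filled with the aggregated amount; remaining columns (region) group the rows.
SELECT *
FROM sales_long
PIVOT (
  SUM(amount) FOR quarter IN ('Q1' AS q1, 'Q2' AS q2)
);
```

PIVOT requires an aggregate function because several long-format rows may map to the same wide-format cell.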
Which tool is used by Auto Loader to process data incrementally?
Auto Loader in Databricks uses Spark Structured Streaming to process data incrementally. It detects new files as they arrive in cloud storage and ingests them at scale, either continuously or in scheduled batch-style runs. Structured Streaming is the underlying engine; on top of it, Auto Loader adds capabilities such as schema inference and file notification mode, which help when the set of incoming files and their schemas change over time.
Reference: Databricks documentation on Auto Loader: Auto Loader Overview
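As a rough illustration in SQL, a streaming table can read files incrementally through read_files. This is a sketch that assumes an environment where streaming tables are available (for example Databricks SQL or a declarative pipeline); the table name and source path are hypothetical. In Python notebooks, Auto Loader is usually reached through the cloudFiles source of a Structured Streaming reader instead.

```sql
-- Hypothetical table name and source directory, for illustration only.
CREATE OR REFRESH STREAMING TABLE raw_orders
AS SELECT *
FROM STREAM read_files(
  '/Volumes/main/default/landing/orders',  -- assumed landing location
  format => 'json'
);
```

The STREAM keyword is what makes the read incremental: each refresh picks up only files that have not been processed yet.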
A data engineer wants to create a new table containing the names of customers who live in France.
They have written the following command:
CREATE TABLE customersInFrance
_____ AS
SELECT id,
firstName,
lastName
FROM customerLocations
WHERE country = 'FRANCE';
A senior data engineer mentions that it is organization policy to include a table property indicating that the new table includes personally identifiable information (PII).
Which line of code fills in the above blank to successfully complete the task?
To include a property indicating that a table contains personally identifiable information (PII), the TBLPROPERTIES keyword is used in SQL to add metadata to a table. The correct syntax to define a table property for PII is as follows:
CREATE TABLE customersInFrance
USING DELTA
TBLPROPERTIES ('PII' = 'true')
AS
SELECT id,
firstName,
lastName
FROM customerLocations
WHERE country = 'FRANCE';
The TBLPROPERTIES ('PII' = 'true') line correctly sets a table property that tags the table as containing personally identifiable information. This is in accordance with organizational policies for handling sensitive information.
Reference: Databricks documentation on Delta Lake: Delta Lake on Databricks
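Once the table exists, the property can be inspected or added after the fact. The statements below are a brief sketch that assumes the customersInFrance table from the example above has already been created.

```sql
-- List all properties stored on the table.
SHOW TBLPROPERTIES customersInFrance;

-- Look up only the PII property.
SHOW TBLPROPERTIES customersInFrance ('PII');

-- Set the property on an existing table that was created without it.
ALTER TABLE customersInFrance SET TBLPROPERTIES ('PII' = 'true');
```

Table properties are free-form key-value metadata, so the key name 'PII' is a team convention rather than a reserved keyword.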