Question 1: A data engineer has developed a data pipeline to ingest data from a JSON source using Auto Loader, but has not provided any type inference or schema hints in the pipeline. Upon reviewing the data, the data engineer notices that every column in the target table is of the string type, even though some of the fields contain only float or boolean values.
Which of the following describes why Auto Loader inferred all of the columns to be of the string type?
A. JSON data is a text-based format
B. Auto Loader cannot infer the schema of ingested data
C. Auto Loader only works with string data
D. All of the fields had at least one null value
E. There was a type mismatch between the specific schema and the inferred schema
Correct answer: A
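As a hedged illustration of the behaviour in Question 1: for JSON sources, Auto Loader leaves every column as a string unless type inference is switched on or schema hints are supplied. The sketch below only assembles reader options. `cloudFiles.inferColumnTypes` and `cloudFiles.schemaHints` are Auto Loader options; the helper names, the paths, and the hinted column `price` are hypothetical, and `read_json_stream` assumes a Databricks runtime that provides `spark`.

```python
def auto_loader_options(schema_location, infer_types=True, schema_hints=None):
    """Build reader options for an Auto Loader JSON stream (hypothetical helper)."""
    options = {
        "cloudFiles.format": "json",
        # Where Auto Loader persists the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # Off by default for JSON, which is why all columns arrive as strings.
        "cloudFiles.inferColumnTypes": "true" if infer_types else "false",
    }
    if schema_hints:
        # For example "price DOUBLE" pins the type of a known column.
        options["cloudFiles.schemaHints"] = schema_hints
    return options


def read_json_stream(spark, source_path, options):
    """Return a streaming DataFrame; only runs where Auto Loader is available."""
    return spark.readStream.format("cloudFiles").options(**options).load(source_path)
```

Keeping the options in a plain dictionary makes the configuration easy to inspect before any stream is started.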
Question 2: A data engineer is running code in a Databricks Repo that is cloned from a central Git repository. A colleague informs the data engineer that changes have been made and synced to the central Git repository. The data engineer now needs to sync their Databricks Repo to get those changes from the central Git repository.
Which of the following Git operations does the data engineer need to run to accomplish this task?
A. Pull
B. Merge
C. Push
D. Clone
E. Commit
Correct answer: A
Question 3: Which of the following describes a benefit of creating an external table from Parquet rather than CSV when using a CREATE TABLE AS SELECT statement?
A. Parquet files have a well-defined schema
B. Parquet files can be partitioned
C. CREATE TABLE AS SELECT statements cannot be used on files
D. Parquet files have the ability to be optimized
E. Parquet files will become Delta tables
Correct answer: A
Question 4: A data engineer wants to execute the final block of a Python program only if the Python variable day_of_week is equal to 1 and the Python variable review_period is True.
Which of the following control flow statements should the data engineer use to begin this conditionally executed code block?
A. if day_of_week = 1 and review_period:
B. if day_of_week == 1 and review_period:
C. if day_of_week = 1 and review_period = "True":
D. if day_of_week = 1 & review_period: = "True":
E. if day_of_week == 1 and review_period == "True":
Correct answer: B
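A minimal sketch of the condition in Question 4, assuming review_period holds a Python bool. The function name is hypothetical and exists only to make the check easy to call.

```python
def should_run_final_block(day_of_week, review_period):
    """Return True only when both conditions from the question hold."""
    # `==` compares values; a single `=` inside an `if` is a SyntaxError.
    # review_period is a bool, so it is tested directly, not against "True".
    if day_of_week == 1 and review_period:
        return True
    return False


# The bool True and the string "True" are different values.
bool_equals_string = (True == "True")
```

Comparing review_period with the string "True" would never match a real boolean, which is why the options that quote "True" are wrong even where the syntax is valid.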
Question 5: A single Job runs two notebooks as two separate tasks. A data engineer has noticed that one of the notebooks is running slowly in the Job's current run. The data engineer asks a tech lead for help in identifying why this might be the case.
Which of the following approaches can the tech lead use to identify why the notebook is running slowly as part of the Job?
A. They can navigate to the Runs tab in the Jobs UI and click on the active run to review the processing notebook.
B. They can navigate to the Tasks tab in the Jobs UI to immediately review the processing notebook.
C. There is no way to determine why a Job task is running slowly.
D. They can navigate to the Tasks tab in the Jobs UI and click on the active run to review the processing notebook.
E. They can navigate to the Runs tab in the Jobs UI to immediately review the processing notebook.
Correct answer: A
Question 6: A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which of the following commands could the data engineering team use to access sales in PySpark?
A. spark.table("sales")
B. spark.sql("sales")
C. SELECT * FROM sales
D. There is no way to share data between PySpark and SQL.
E. spark.delta.table("sales")
Correct answer: A
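A hedged sketch of how a Python test might reach the same table as in Question 6. `spark.table` returns the registered table as a DataFrame; the function name and the `amount` column are hypothetical, and a real `spark` session is assumed to be passed in.

```python
def count_null_amounts(spark, table_name="sales", column="amount"):
    """Load a registered table as a DataFrame and count nulls in one column."""
    df = spark.table(table_name)  # the same table the analysts query in SQL
    # DataFrame.filter accepts a SQL expression string.
    return df.filter(f"{column} IS NULL").count()
```

A data-quality test could then assert that the returned count is zero.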
Question 7: A data engineer runs a statement every day to copy the previous day's sales into the table transactions. Each day's sales are in their own file in the location "/transactions/raw".
Today, the data engineer runs the following command to complete this task:
After running the command today, the data engineer notices that the number of records in table transactions has not changed.
Which of the following describes why the statement might not have copied any new records into the table?
A. The names of the files to be copied were not included with the FILES keyword.
B. The previous day's file has already been copied into the table.
C. The COPY INTO statement requires the table to be refreshed to view the copied rows.
D. The format of the files to be copied were not included with the FORMAT_OPTIONS keyword.
E. The PARQUET file format does not support COPY INTO.
Correct answer: B
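The answer to Question 7 rests on COPY INTO being idempotent: files that were already loaded are skipped on later runs. The toy model below imitates that bookkeeping in plain Python. It is not Databricks code, and the class name, file name, and rows are all hypothetical.

```python
class IdempotentLoader:
    """Toy model of a loader that skips files it has already ingested."""

    def __init__(self):
        self.loaded_files = set()
        self.rows = []

    def copy_into(self, files):
        """`files` maps file name to a list of rows; returns rows added."""
        added = 0
        for name, rows in files.items():
            if name in self.loaded_files:
                continue  # already ingested on an earlier run
            self.rows.extend(rows)
            self.loaded_files.add(name)
            added += len(rows)
        return added


loader = IdempotentLoader()
source = {"sales_day_1.parquet": ["row-a", "row-b", "row-c"]}  # hypothetical file
first_run = loader.copy_into(source)   # loads the three rows
second_run = loader.copy_into(source)  # same file again, so nothing is added
```

In this model, running the load a second time over the same file leaves the row count unchanged, which matches what the data engineer observed.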
Databricks Databricks-Certified-Data-Engineer-Associate certification exam topics:
Topic | Exam scope |
---|---|
Topic 1 | ELT with Apache Spark: Focuses on extracting data, identifying the data-type prefix after the FROM keyword, creating a view, deduplicating rows, creating a new table, using dot syntax to extract nested fields, parsing JSON, and defining a SQL UDF. It also covers describing the security model for SQL UDFs, identifying the location of a function, and identifying the PIVOT clause. |
Topic 2 | Databricks Lakehouse Platform: Covers the relationship between the data lakehouse and the data warehouse, the improvement in data quality, comparing and contrasting silver and gold tables, elements of the Databricks Platform Architecture, and differentiating between all-purpose clusters and jobs clusters. It also covers how cluster software is versioned, how clusters can be filtered, how clusters are terminated, how to use multiple languages, how to run one notebook, how notebooks can be shared, Git operations, limitations in Databricks Notebooks, and how Databricks Repos enables CI/CD workflows. |
Topic 3 | Data Governance: Covers identifying one of the four areas of data governance, Unity Catalog securables, and the cluster security modes. It also covers how to create a UC-enabled all-purpose cluster, how to create a DBSQL warehouse, and how to implement data object access control. |
Topic 4 | Production Pipelines: Focuses on identifying the advantages of using multiple tasks in Jobs, a suitable scenario where a predecessor task should be set up, CRON as a scheduling option, and how an alert can be sent via email. It also covers setting up a predecessor task in Jobs, reviewing a task's execution history, debugging a failed task, setting up a retry policy in case of failure, and creating an alert in the case of a failed task. |
Topic 5 | Incremental Data Processing: Covers identifying Delta Lake, the benefits of ACID transactions, a scenario for using an external table, the location of a table, the benefits of Z-ordering, the kind of files that OPTIMIZE compacts, CTAS as a solution, the impact of ON VIOLATION DROP ROW and ON VIOLATION FAIL UPDATE, and the necessary component to create a new DLT pipeline. It also covers the directory structure of Delta Lake files, generated columns, adding a table comment, and the benefits of the MERGE command. |
Reference: https://www.databricks.com/learn/certification/data-engineer-associate