A company's data analyst needs to ensure that queries executed in Amazon Athena cannot scan more than a prescribed amount of data for cost control purposes. Queries that exceed the prescribed threshold must be canceled immediately.
What should the data analyst do to achieve this?
https://docs.aws.amazon.com/athena/latest/ug/manage-queries-control-costs-with-workgroups.html
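The linked page covers per-query data usage controls on Athena workgroups, which cancel a query once it scans more than the configured number of bytes. A minimal sketch of setting such a limit with boto3 follows. The helper names `workgroup_request` and `create_limited_workgroup`, the workgroup name, and the 1 GB threshold are illustrative assumptions rather than anything given in the question; `create_work_group` and `BytesScannedCutoffPerQuery` are the boto3 Athena names.

```python
# Sketch only: helper names and the example values are hypothetical.
MIN_CUTOFF_BYTES = 10_000_000  # Athena's minimum per-query limit (10 MB)


def workgroup_request(name: str, cutoff_bytes: int) -> dict:
    """Build keyword arguments for the Athena create_work_group call."""
    if cutoff_bytes < MIN_CUTOFF_BYTES:
        raise ValueError("per-query limit must be at least 10 MB")
    return {
        "Name": name,
        "Configuration": {
            # Queries in this workgroup are canceled once they scan
            # more than this many bytes.
            "BytesScannedCutoffPerQuery": cutoff_bytes,
        },
    }


def create_limited_workgroup(name: str, cutoff_bytes: int) -> None:
    """Create the workgroup; needs boto3 and AWS credentials at call time."""
    import boto3  # imported here so the builder above works without boto3

    athena = boto3.client("athena")
    athena.create_work_group(**workgroup_request(name, cutoff_bytes))


if __name__ == "__main__":
    # Print the request for a 1 GB limit without calling AWS.
    print(workgroup_request("analysts", 1_000_000_000))
```

Queries are subject to the limit only when they run in that workgroup, so analysts would need to select it when submitting queries.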
A retail company has 15 stores across 6 cities in the United States. Once a month, the sales team requests a visualization in Amazon QuickSight that provides the ability to easily identify revenue trends across cities and stores. The visualization also helps identify outliers that need to be examined with further analysis.
Which visual type in QuickSight meets the sales team's requirements?
A company currently uses Amazon Athena to query its global datasets. The regional data is stored in Amazon S3 in the us-east-1 and us-west-2 Regions. The data is not encrypted. To simplify the query process and manage it centrally, the company wants to use Athena in us-west-2 to query data from Amazon S3 in both Regions. The solution should be as low-cost as possible.
What should the company do to achieve this goal?
A company receives datasets from partners at various frequencies. The datasets include baseline data and incremental data. The company needs to merge and store all the datasets without reprocessing the data.
Which solution will meet these requirements with the LEAST development effort?
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analytics. It can process datasets from various sources and formats, including JDBC sources, Amazon S3, and Amazon RDS.
AWS Glue job bookmarks are a feature that helps AWS Glue track data that has already been processed during a previous run of an ETL job. This prevents old data from being reprocessed and lets a job that is rerun on a schedule process only the new data. Job bookmarks can handle both baseline data and incremental data from different sources.
Amazon S3 is a highly scalable, durable, and secure object storage service that can store any amount and type of data. It can be used as a data lake to store the merged and processed datasets from AWS Glue. It also integrates with other AWS services, such as Amazon Athena, Amazon Redshift Spectrum, and Amazon EMR, for further analysis and processing.
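As an illustration of the job bookmark feature described above, the sketch below builds a Glue job definition with bookmarks enabled through the `--job-bookmark-option` job argument. The helper name `bookmark_job_request` and the job name, role ARN, and script path in the usage lines are hypothetical placeholders; `create_job`, `DefaultArguments`, and the `job-bookmark-enable` value are the boto3 and Glue names.

```python
# Sketch only: the helper name and all example values are hypothetical.
def bookmark_job_request(name: str, role_arn: str, script_location: str) -> dict:
    """Build keyword arguments for the Glue create_job call."""
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Enable job bookmarks so reruns skip already-processed data.
            "--job-bookmark-option": "job-bookmark-enable",
        },
    }


def create_bookmarked_job(name: str, role_arn: str, script_location: str) -> None:
    """Create the job; needs boto3 and AWS credentials at call time."""
    import boto3  # imported here so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_job(**bookmark_job_request(name, role_arn, script_location))


if __name__ == "__main__":
    # Print the request without calling AWS.
    print(bookmark_job_request(
        "merge-partner-data",                           # placeholder job name
        "arn:aws:iam::123456789012:role/GlueJobRole",   # placeholder role ARN
        "s3://example-bucket/scripts/merge.py",         # placeholder script path
    ))
```

For bookmarks to take effect, the job script itself also needs to call `job.init()` and `job.commit()` and pass a `transformation_ctx` to its sources.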
A company is using an AWS Lambda function to run Amazon Athena queries against a cross-account AWS Glue Data Catalog. A query returns the following error:
HIVE METASTORE ERROR
The error message states that the response payload size exceeds the maximum allowed payload size. The queried table is already partitioned, and the data is stored in an Amazon S3 bucket in the Apache Hive partition format.
Which solution will resolve this error?