A company receives data sets from external providers on Amazon S3. Data sets from different providers are dependent on one another. Data sets will arrive at different times and in no particular order.
A data architect needs to design a solution that enables the company to do the following:
* Rapidly perform cross-data-set analysis as soon as the data becomes available
* Manage dependencies between data sets that arrive at different times
Which architecture strategy offers a scalable and cost-effective solution that meets these requirements?