IAPP AIGP Exam - Topic 2 Question 5 Discussion

Actual exam question for IAPP's AIGP exam

Question #: 5
Topic #: 2

You are the chief privacy officer of a medical research company that would like to collect and use sensitive data about cancer patients, such as their names, addresses, race and ethnic origin, medical histories, insurance claims, pharmaceutical prescriptions, eating and drinking habits and physical activity.

The company will use this sensitive data to build an AI algorithm that will spot common attributes that will help predict if seemingly healthy people are more likely to get cancer. However, the company is unable to obtain consent from enough patients to sufficiently collect the minimum data to train its model.

Which of the following solutions would most efficiently balance privacy concerns with the lack of available data during the testing phase?

A. Deploy the current model and recalibrate it over time with more data.

B. Extend the model to multi-modal ingestion with text and images.

C. Utilize synthetic data to offset the lack of patient data.

D. Refocus the algorithm to patients without cancer.

Suggested Answer: C

Utilizing synthetic data to offset the lack of patient data is an efficient solution that balances privacy concerns with the need for sufficient data to train the model. Synthetic data can be generated to simulate real patient data while avoiding the privacy issues associated with using actual patient data. This approach allows for the development and testing of the AI algorithm without compromising patient privacy, and it can be refined with real data as it becomes available. Reference: AIGP Body of Knowledge on Data Privacy and AI Model Training.
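To make the idea of generating records that "simulate real patient data" a little more concrete, here is a minimal sketch in Python. It is not taken from the AIGP Body of Knowledge: the column names, the tiny `real` sample, and the helper names `fit_marginals` and `sample_synthetic` are all made up for illustration. It also uses the simplest possible approach, sampling each column independently from its own distribution, which throws away correlations between columns and gives no formal privacy guarantee by itself.

```python
import random
import statistics


def fit_marginals(records):
    """Summarize each column on its own (hypothetical helper).

    Numeric columns are reduced to a mean and standard deviation;
    other columns keep their observed values so they can be resampled.
    """
    columns = {}
    for key in records[0]:
        values = [r[key] for r in records]
        if all(isinstance(v, (int, float)) for v in values):
            columns[key] = ("numeric", statistics.mean(values), statistics.stdev(values))
        else:
            columns[key] = ("categorical", values)
    return columns


def sample_synthetic(columns, n, seed=0):
    """Draw n synthetic rows, sampling every column independently."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for key, spec in columns.items():
            if spec[0] == "numeric":
                # Draw from a normal distribution fitted to the real column.
                row[key] = rng.gauss(spec[1], spec[2])
            else:
                # Resample observed categories at their observed frequencies.
                row[key] = rng.choice(spec[1])
        rows.append(row)
    return rows


# Made-up stand-in for the few consented patient records (not real data).
real = [
    {"age": 54, "weekly_exercise_hours": 2.0, "smoker": "yes"},
    {"age": 61, "weekly_exercise_hours": 0.5, "smoker": "no"},
    {"age": 47, "weekly_exercise_hours": 4.0, "smoker": "no"},
    {"age": 69, "weekly_exercise_hours": 1.0, "smoker": "yes"},
]

synthetic = sample_synthetic(fit_marginals(real), n=100)
```

Because the columns are sampled independently, a model tested on `synthetic` would not see any relationship between, say, smoking and age. That is one reason several commenters below question how realistic synthetic data is, and why the explanation above mentions refining the model with real data as it becomes available.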

by Billy at Jun 19, 2024, 03:28 PM

Karl

5 months ago

I think focusing on non-cancer patients is a good idea too!

upvoted 0 times

...

Tenesha

5 months ago

Wait, can synthetic data really replace real patient info?

upvoted 0 times

...

Stephanie

6 months ago

Extending to multi-modal ingestion could be interesting.

upvoted 0 times

...

Margot

6 months ago

I disagree, we need real patient data for accuracy.

upvoted 0 times

...

Lajuana

6 months ago

Using synthetic data sounds like a smart move!

upvoted 0 times

...

Coral

6 months ago

Refocusing the algorithm to patients without cancer seems like a way to sidestep the consent issue, but I wonder if that would really help in predicting cancer risks.

upvoted 0 times

...

Louisa

6 months ago

I feel like extending the model to include more data types could complicate things. We might lose focus on the core issue of data privacy.

upvoted 0 times

...

Shantay

7 months ago

I think using synthetic data could be a good option, but I'm a bit unsure about how realistic it would be for training the model effectively.

upvoted 0 times

...

Nobuko

7 months ago

I remember we discussed the importance of patient consent in our last class, so I'm not sure if deploying the current model without enough data is ethical.

upvoted 0 times

...

Nickolas

7 months ago

Refocusing the algorithm to patients without cancer seems like it could be a good way to sidestep the privacy issues, but I'm not sure if that would still meet the company's goals. I'll need to really understand the problem and the options.

upvoted 0 times

...

Carri

7 months ago

I think the key here is finding a way to balance the privacy needs with the data requirements. Maybe a combination of approaches could work, like using some synthetic data but also trying to get more consent.

upvoted 0 times

...

Shonda

7 months ago

Hmm, using synthetic data could be an interesting approach, but I'm not sure if that would be considered a robust enough solution. I'll have to think that one through.

upvoted 0 times

...

Chan

7 months ago

This is a tricky one. I'm not sure if I have a clear strategy yet, but I'll need to carefully weigh the privacy concerns against the need for data.

upvoted 0 times

...

Ira

7 months ago

I'm a little confused by this question. Blockchains don't use any data structures? That doesn't sound right. I'll have to review my blockchain knowledge before answering this.

upvoted 0 times

...

Rebbecca

7 months ago

Okay, let's see here. The key is that the web service needs to return the exponential of the scored label as the predicted price, and the input data shouldn't include the price column. I think I've got a plan!

upvoted 0 times

...

Curtis

2 years ago

Hey, at least they're not asking us to just deploy the current model and 'recalibrate it over time'! That would be like playing cancer prediction roulette.

upvoted 0 times

Chara

2 years ago

B) Extend the model to multi-modal ingestion with text and images.

upvoted 0 times

...

Tamala

2 years ago

C) Utilize synthetic data to offset the lack of patient data.

upvoted 0 times

...

Deeann

2 years ago

A) Deploy the current model and recalibrate it over time with more data.

upvoted 0 times

...

Rolland

2 years ago

That's a good point, using synthetic data could help balance privacy concerns while still training the model effectively.

upvoted 0 times

...

Alva

2 years ago

C) Utilize synthetic data to offset the lack of patient data.

upvoted 0 times

...

Garry

2 years ago

A) Deploy the current model and recalibrate it over time with more data.

upvoted 0 times

...

Carissa

2 years ago

Refocusing the algorithm to patients without cancer is an interesting idea, but I'm not sure it would give us the insights we need to predict cancer risk effectively.

upvoted 0 times

Audria

2 years ago

C) Utilize synthetic data to offset the lack of patient data.

upvoted 0 times

...

Johanna

2 years ago

A) Deploy the current model and recalibrate it over time with more data.

upvoted 0 times

...

Cyril

2 years ago

I'm not sure extending the model to multi-modal ingestion is the best idea. That could just end up making the privacy concerns even worse.

upvoted 0 times

Jani

2 years ago

D) Refocus the algorithm to patients without cancer.

upvoted 0 times

...

Carol

2 years ago

I agree, using synthetic data could be a good solution to balance privacy concerns.

upvoted 0 times

...

Amie

2 years ago

C) Utilize synthetic data to offset the lack of patient data.

upvoted 0 times

...

Nu

2 years ago

A) Deploy the current model and recalibrate it over time with more data.

upvoted 0 times

...

Hubert

2 years ago

I see your point, Joye, but I think option D could also be a viable solution.

upvoted 0 times

...

Estrella

2 years ago

Using synthetic data to offset the lack of patient data seems like the most ethical solution here. It allows us to build the model without compromising patient privacy.

upvoted 0 times