Case study: Balancing accuracy and interpretability in Medistat's journey in healthcare diagnostics.

AI model selection is crucial for striking the delicate balance between accuracy and interpretability, influencing both the effectiveness and the acceptance of AI systems. In this case study, we explore the experiences of Medistat, a healthcare analytics company, as it tackles the interplay between these two critical dimensions while developing an AI diagnostic tool.

Dr. Emily Carter, chief data scientist at Medistat, faces a pivotal decision: her team is developing an AI model to predict the likelihood of patients developing type 2 diabetes. The goal is to provide early diagnosis and intervention, potentially improving patient outcomes. The challenge is to choose between a highly accurate but opaque deep learning model and a more interpretable but slightly less accurate logistic regression model.

As the team analyzes data from various electronic health records, they find that the deep learning model achieves a 95% accuracy rate in predicting diabetes onset, while the logistic regression model achieves an 88% accuracy rate. Dr. Carter must decide which model to recommend for deployment. This decision raises the first critical question: how important is the marginal increase in accuracy compared to the need for interpretability in the healthcare context?

The team knows that healthcare professionals, including doctors and nurses, must understand how AI models make predictions in order to trust and effectively use them. This requirement leads to another question: can the complex deep learning model be simplified or made interpretable enough to meet regulatory and ethical standards? The regulatory environment in healthcare demands transparency, especially when decisions impact patient care. Dr. Carter ponders whether tools like LIME (local interpretable model-agnostic explanations) or SHAP (Shapley additive explanations) could provide sufficient transparency for the deep learning model.

While considering this, another concern surfaces: potential biases within the model. The team needs to ensure that the chosen model does not perpetuate biases inherent in the training data, prompting the question: what measures can be taken to identify and mitigate bias in the chosen AI model? Dr. Carter recalls studies showing that models trained on biased data can lead to unfair treatment of certain patient groups, undermining their trust and the overall efficacy of the AI system.
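To make Dr. Carter's idea concrete, here is a minimal, self-contained sketch of the kind of post-hoc explanation SHAP can produce for an otherwise opaque classifier. The synthetic data, feature names, and MLP stand-in below are illustrative assumptions, not Medistat's actual records or model; per-feature attributions like these are also one place suspicious, bias-like patterns can surface.

```python
# A hedged sketch: SHAP explanations for an opaque model's predictions.
# All data and names here are synthetic stand-ins, not Medistat's.
import shap
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Stand-in for the EHR-derived training set (three hypothetical features).
X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
feature_names = ["age", "bmi", "blood_sugar"]

# Stand-in for the "opaque" deep learning model.
model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                      random_state=0).fit(X, y)

# KernelExplainer is model-agnostic: it needs only a prediction function
# and a small background sample to keep the estimation tractable.
explainer = shap.KernelExplainer(lambda d: model.predict_proba(d)[:, 1],
                                 shap.sample(X, 50))

# Per-feature contributions to one patient's predicted diabetes risk.
shap_values = explainer.shap_values(X[:1])
for name, value in zip(feature_names, shap_values[0]):
    print(f"{name}: {value:+.3f}")
```

A positive value means the feature pushed this patient's predicted risk up; a negative value pulled it down. That per-prediction rationale is exactly what the clinicians in the pilot say is missing.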
To explore these concerns, the team deploys both models in a pilot phase across two hospitals. They collect feedback from clinicians and patients, revealing that while the deep learning model's predictions are highly accurate, clinicians are hesitant to act on them due to the lack of a clear rationale behind the predictions. This feedback underscores the importance of transparency and leads to another key question: how can user feedback be effectively incorporated into the model selection process to ensure accuracy and interpretability are adequately balanced?

In parallel, the team also evaluates the performance of the logistic regression model in real-world scenarios. Although less accurate, it provides clear insights into how different features, such as age, BMI, and blood sugar levels, contribute to the prediction. This clarity boosts clinician confidence, supporting patient trust and adherence to recommended interventions. The pilot results suggest that, despite the lower accuracy, the logistic regression model might be more practical for real-world application in this context.

To further analyze the trade-offs, the team examines cases where the logistic regression model made incorrect predictions. They find that in many instances the errors were due to the model's inability to capture complex interactions between features. This finding raises another important question: can the accuracy of the logistic regression model be improved by incorporating non-linear relationships without significantly compromising interpretability?

Dr. Carter considers generalized additive models (GAMs), which extend logistic regression by allowing non-linear relationships while maintaining interpretability. By experimenting with GAMs, the team achieves an accuracy rate of 92%, a notable improvement over the logistic regression model. The GAMs also provide clearer insights into the influence of each feature, striking a better balance between accuracy and interpretability; a brief sketch of such an experiment appears below.

With these findings, Dr. Carter prepares a detailed report for the board, highlighting the trade-offs and recommending the adoption of GAMs for their AI diagnostic tool. The board, comprising clinicians, data scientists, and AI governance professionals, reviews the report. They deliberate on another essential question: what additional steps should be taken to ensure the selected model remains transparent and robust over time?
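As promised above, here is a minimal sketch of the GAM experiment, using the pygam library as one possible implementation (an assumption; the case study does not name a toolkit). The synthetic data and feature names are stand-ins for Medistat's records.

```python
# A hedged sketch of fitting a logistic GAM: one smooth spline term per
# feature captures non-linear effects while each feature's contribution
# remains individually inspectable.
from pygam import LogisticGAM, s
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the EHR data (three hypothetical features).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gam = LogisticGAM(s(0) + s(1) + s(2)).fit(X_train, y_train)
print(f"held-out accuracy: {gam.accuracy(X_test, y_test):.2f}")

# Partial dependence shows how each feature bends the predicted risk,
# which is the transparency the pilot clinicians asked for.
for term_idx, name in enumerate(["age", "bmi", "blood_sugar"]):
    grid = gam.generate_X_grid(term=term_idx)
    effect = gam.partial_dependence(term=term_idx, X=grid)
    print(f"{name}: effect ranges {effect.min():+.2f} to {effect.max():+.2f}")
```

Unlike plain logistic regression, each spline can curve, so a feature like blood sugar can have little effect at normal levels and a steep effect beyond a threshold, yet the model is still read one feature at a time.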
The discussion centers on implementing continuous monitoring and updating of the model to ensure it adapts to new data and maintains fairness. They agree to set up a dedicated team for ongoing evaluation and to use model-agnostic interpretability tools to periodically reassess the model's decision-making process. This proactive approach ensures the model remains both accurate and interpretable, aligning with regulatory standards and ethical considerations.

In conclusion, Medistat's journey underscores the complex interplay between accuracy and interpretability in AI model selection. By piloting both the deep learning and logistic regression models, the team gathers valuable insights that inform its decision to adopt generalized additive models. This choice offers a balanced solution, enhancing patient outcomes while fostering trust and compliance among healthcare professionals.

The analysis reveals that prioritizing interpretability in high-stakes domains like healthcare is crucial for gaining user trust and meeting regulatory requirements. Techniques such as LIME and SHAP, combined with inherently interpretable models like GAMs, provide practical solutions to these challenges. Continuous monitoring and iterative improvement further ensure that the AI system remains transparent, fair, and effective over time.

By reflecting on the thought-provoking questions posed throughout the process, students can better understand how to navigate the trade-offs between accuracy and interpretability in AI model selection. This case study illustrates the importance of considering context, regulatory requirements, and stakeholder needs when developing AI systems, ultimately guiding professionals toward responsible and impactful AI deployment.
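As a closing illustration of the continuous-monitoring plan the board agreed to, here is a minimal sketch of a periodic batch check. The `outcome` column, group column, and alert thresholds are illustrative assumptions, not Medistat's actual policy.

```python
# A hedged sketch of ongoing monitoring: flag accuracy decay and widening
# subgroup gaps (a simple fairness signal) in each new batch of cases.
import pandas as pd
from sklearn.metrics import accuracy_score

def monitor_batch(model, batch: pd.DataFrame, group_col: str,
                  min_acc: float = 0.85, max_gap: float = 0.05) -> list[str]:
    """Return alerts for a new data batch.

    `batch` is assumed to hold the model's input features, a ground-truth
    `outcome` column, and a demographic column named by `group_col`.
    """
    preds = pd.Series(
        model.predict(batch.drop(columns=["outcome", group_col])),
        index=batch.index,
    )
    alerts = []
    overall = accuracy_score(batch["outcome"], preds)
    if overall < min_acc:
        alerts.append(f"overall accuracy {overall:.2f} fell below {min_acc}")
    # Compare accuracy across patient subgroups for a basic fairness check.
    by_group = {
        group: accuracy_score(sub["outcome"], preds.loc[sub.index])
        for group, sub in batch.groupby(group_col)
    }
    if max(by_group.values()) - min(by_group.values()) > max_gap:
        alerts.append(f"subgroup accuracy gap exceeds {max_gap}: {by_group}")
    return alerts
```

Run on a schedule against freshly labeled cases, a check like this gives the dedicated evaluation team an early warning before drift or unfairness reaches patients.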