Case study. Enhancing customer churn prediction: a multi-algorithm approach at Technova.

Sarah, a data scientist at Technova, was tasked with improving the accuracy of their customer churn prediction model. This model used historical customer data to predict whether a customer would leave the service in the near future. Sarah's objective was to refine the model to better support the company's retention strategies. She began by assessing the current model, which was based on logistic regression, chosen for its simplicity and interpretability.

The first question Sarah considered was: how could the model be improved by exploring other supervised learning algorithms? She knew that logistic regression was effective for binary classification but might not capture complex relationships in the data. Given the rich set of features in the dataset, she decided to experiment with support vector machines and neural networks. SVMs could help by creating a non-linear boundary that better separated the churners from the non-churners, while a neural network might uncover intricate patterns in the data due to its multiple layers of neurons.

Sarah collected and pre-processed the data, handling missing values and normalizing numerical features to ensure each feature contributed equally to the model.
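The preprocessing steps just described can be sketched in a few lines of NumPy: mean imputation for missing values followed by z-score normalization. The feature matrix below is a toy illustration, not Technova's actual data.

```python
import numpy as np

# Toy feature matrix: rows are customers, columns are numeric features
# (e.g. monthly charges, tenure in months). NaN marks a missing value.
X = np.array([
    [70.0, 12.0],
    [np.nan, 24.0],
    [50.0, np.nan],
    [90.0, 6.0],
])

# 1. Impute missing values with the per-column mean of the observed entries.
col_means = np.nanmean(X, axis=0)
X_imputed = np.where(np.isnan(X), col_means, X)

# 2. Z-score normalization, so every feature contributes on the same scale.
mu = X_imputed.mean(axis=0)
sigma = X_imputed.std(axis=0)
X_scaled = (X_imputed - mu) / sigma
```

After this, each column has mean roughly 0 and standard deviation 1, which is the "equal contribution" property the narration refers to.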
She wondered: what is the impact of feature engineering on model performance? To explore this, she used recursive feature elimination to identify the most relevant features and created new interaction features based on domain knowledge, such as combining service usage patterns with customer demographics.

After preparing the data, Sarah trained the SVM model using a radial basis function kernel. The model performed better than logistic regression, but Sarah noticed some overfitting on the training data. This led her to consider what regularization techniques could be applied to mitigate overfitting. She applied L2 regularization to the SVM, adjusting the penalty parameter to balance bias and variance, which improved the model's generalization.

To further validate the model, Sarah implemented k-fold cross-validation, splitting the data into ten subsets and rotating the validation set across these folds. This helped her evaluate the model's robustness and ensured it did not rely too heavily on any single subset of the data.

Sarah then asked herself: how does cross-validation improve the reliability of model performance metrics compared to a simple train-test split? The cross-validation results provided a more stable estimate of the model's performance, mitigating the risk of overestimating accuracy due to a favorable split.
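The ten-fold rotation Sarah used can be sketched in plain Python. The splitting logic below is a generic illustration of k-fold cross-validation, not her actual pipeline; a real workflow would typically use a library implementation such as scikit-learn's `KFold`.

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices, partition them into k roughly equal folds,
    and yield (train, validation) index lists with the validation fold
    rotating across the k subsets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    fold_size, remainder = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = fold_size + (1 if i < remainder else 0)
        folds.append(idx[start:start + size])
        start += size
    for i in range(k):
        validation = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, validation

# Every sample serves as validation exactly once across the k rotations,
# which is what makes the averaged metric more stable than a single split.
splits = list(kfold_indices(100, k=10))
```

Averaging a metric over all ten validation folds is what gives the "more stable estimate" the narration describes: no single lucky or unlucky split dominates the result.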
After observing promising results with the SVM, Sarah turned her attention to neural networks. She constructed a deep neural network with multiple hidden layers, each containing several neurons with activation functions. She understood that the choice of hyperparameters, such as the number of layers and neurons, significantly influences the model's performance. Thus, she questioned what methods can be used to efficiently tune hyperparameters. Sarah used a combination of grid search and Bayesian optimization to find the optimal set of hyperparameters, balancing computation time and performance. With the hyperparameter-tuned DNN performing even better than the SVM, Sarah decided to deploy the model.

However, she knew that continuous monitoring was crucial. She wondered: how can model performance be maintained over time in a dynamic environment? By setting up automated pipelines to periodically retrain the model with new data and monitor key performance indicators, she ensured the model adapted to changes in customer behavior.

Sarah's efforts were validated when the updated model successfully predicted churn with higher accuracy, leading to targeted retention efforts that reduced overall churn rates. This experience highlighted several critical insights for machine learning practitioners.
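Grid search, the first of the two tuning methods mentioned, can be illustrated compactly: enumerate every combination in a parameter grid and keep the one with the best validation score. The hyperparameter names and the scoring function below are hypothetical stand-ins; in a real pipeline the score would come from cross-validated training runs.

```python
from itertools import product

# Hypothetical search space for a small DNN.
grid = {
    "hidden_layers": [1, 2, 3],
    "neurons": [16, 32, 64],
    "learning_rate": [0.01, 0.001],
}

def validation_score(params):
    # Stand-in for "train the model, evaluate on held-out data".
    # Shaped so the best illustrative combination is 2 layers of 32 neurons.
    return -abs(params["hidden_layers"] - 2) - abs(params["neurons"] - 32) / 64

best_params, best_score = None, float("-inf")
for combo in product(*grid.values()):
    params = dict(zip(grid.keys(), combo))
    score = validation_score(params)
    if score > best_score:
        best_params, best_score = params, score
```

Grid search is exhaustive, so its cost grows multiplicatively with each added hyperparameter; that is exactly why the narration pairs it with Bayesian optimization, which samples the space adaptively instead of enumerating it.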
First, exploring various algorithms beyond the initial choice can uncover models better suited to the problem's complexity. In Sarah's case, SVMs and DNNs outperformed the initial logistic regression model by capturing more nuanced patterns in the data.

Second, feature engineering plays a vital role in enhancing model performance. Techniques like recursive feature elimination and creating new features based on domain knowledge significantly improved the predictive power of the models.

Third, regularization techniques such as L2 regularization are essential for preventing overfitting, especially in complex models like SVMs and neural networks.

Fourth, cross-validation provides a more reliable estimate of model performance by reducing the variance associated with a single train-test split.

Fifth, efficient hyperparameter tuning methods such as grid search and Bayesian optimization are critical for optimizing model performance without excessive computational costs.

Finally, continuous monitoring and maintenance are crucial for sustaining model accuracy in production environments. Regularly updating the model with new data and adjusting hyperparameters helps adapt to changing patterns, ensuring the model remains relevant and accurate over time.
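The final insight, monitoring in production, can be sketched as a simple degradation check: track a key performance indicator over time and flag the model for retraining when it drops below an acceptable floor. The threshold and the weekly accuracy readings below are illustrative values, not figures from the case study.

```python
RETRAIN_THRESHOLD = 0.80  # illustrative minimum acceptable accuracy

def needs_retraining(recent_accuracy, threshold=RETRAIN_THRESHOLD):
    """Flag the model for retraining when its live accuracy degrades
    below the agreed floor."""
    return recent_accuracy < threshold

# Simulated weekly accuracy readings from an automated monitoring pipeline.
weekly_accuracy = [0.86, 0.84, 0.83, 0.79]
alerts = [week for week, acc in enumerate(weekly_accuracy)
          if needs_retraining(acc)]
```

In practice this check would sit inside the automated pipeline the narration describes, alongside data drift metrics, so retraining is triggered by evidence of changing customer behavior rather than on a fixed calendar alone.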
In summary, Sarah's journey to improve Technova's churn prediction model encapsulates the core principles of machine learning. By exploring various supervised learning algorithms, engaging in thorough feature engineering, applying regularization, conducting cross-validation, tuning hyperparameters effectively, and maintaining the model, she demonstrated a holistic approach to developing robust machine learning models. These strategies can be broadly applied to different domains, enabling professionals to harness the power of machine learning to drive innovation and solve complex problems.