Correction to filling data with Scikit-Learn

The next video covers techniques for dealing with missing data and for turning categorical (non-numerical) data into numbers using Scikit-Learn.

All of the code in the video is correct; however, there is one improvement worth noting.

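In a nutshell, the video shows filling and transforming the entire dataset (X) at once. The code works and runs, but it's best to split the data first and then fill and transform the training and test sets separately. Otherwise, values computed from the whole dataset (such as a column mean used to fill gaps) carry information from the test set into the training set.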
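To make the idea concrete, here is a minimal sketch of the split-first approach. It is not the exact code from the video or the notebooks: the toy DataFrame, its column names (`Make`, `Odometer`, `Price`) and the chosen fill strategies are assumptions made up for illustration. The point it shows is that the imputers and encoder are fitted on the training set only and then applied to the test set.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data with missing values (not the dataset from the video)
df = pd.DataFrame({
    "Make": ["Toyota", "Honda", np.nan, "BMW", "Toyota", "Honda", "BMW", np.nan],
    "Odometer": [35000.0, np.nan, 120000.0, 80000.0, np.nan, 60000.0, 15000.0, 99000.0],
    "Price": [15000, 12000, 5000, 22000, 9000, 11000, 30000, 7000],
})
X = df.drop("Price", axis=1)
y = df["Price"]

# 1. Split FIRST, before any filling or transforming
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# 2. Describe how each column should be filled and transformed
cat_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="constant", fill_value="missing")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocessor = ColumnTransformer(
    [
        ("cat", cat_pipeline, ["Make"]),
        ("num", SimpleImputer(strategy="mean"), ["Odometer"]),
    ],
    sparse_threshold=0,  # always return a dense array, for easy inspection
)

# 3. Learn fill values and categories from the training set only...
X_train_filled = preprocessor.fit_transform(X_train)
# ...then reuse them on the test set (transform, not fit_transform)
X_test_filled = preprocessor.transform(X_test)
```

Calling `fit_transform` on the training set and plain `transform` on the test set is what keeps the two apart: the mean used to fill `Odometer` in the test set comes from the training rows alone.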

I've fixed the code on GitHub for both notebooks to reflect this (all previous links to these notebooks will still work), and I've also created an end-to-end Colab notebook that includes the change:

The main takeaways:

Keep these in mind when you watch the upcoming video, and remember, full working code is available in the links above.

Thank you, Robert, for pointing this out on the Q&A forums.