Learning under Different Training and Testing Distributions
Tutorial, IADS: Data Science and Decision Making Summer School, University of Essex, 2021, Colchester, UK
In this talk, I will focus on learning under different training and testing distributions. Systems based on machine learning methods often face a major challenge when applied to real-world datasets: the conditions under which a system was developed will differ from those in which it is used. Examples include sophisticated systems that take a few years to develop, such as email spam filtering, stock prediction, health diagnostics, and brain-computer interface (BCI) systems. Will such a system remain usable, or will it need to be adapted because the distribution has changed since the system was first built? Almost any form of real-world data analysis faces such problems, which arise for reasons ranging from sample selection bias to operating in non-stationary environments. This tutorial will focus on dataset shift (e.g. covariate shift, prior-probability shift, and concept shift) and will cover transfer learning as a way to learn a satisfactory model when the training and testing distributions differ.