Learning Under Dataset Shifts


Dataset shift is a challenging situation where the joint distribution of inputs and outputs differs between the training and test stages. Covariate shift is a simpler particular case of dataset shift where only the input distribution changes (covariate denotes input), while the conditional distribution of the outputs given the inputs p(y|x) remains unchanged. Dataset shift is present in most practical applications or reasons ranging from the bias introduced by experimental design, to the mere irreproducibility of the testing conditions at training time. For example, in an image classification task, training data might have been recorded under controlled laboratory conditions, whereas the test data may show different lighting conditions.