Question 1

Suppose that you have a credit scoring task in which you must build an ML model that approximates an expert's evaluation of an individual's creditworthiness. Which of the following could be a source of data leakage? Select all that apply.

Correct answers:

Incorrect answers:

Question 2

What is the most foolproof way to set up a time series competition?

Correct answers:

Incorrect answers:

Question 3

Suppose that you have a binary classification task evaluated with the logloss metric. You know that there are 10000 rows in the public part of the test set and that a constant prediction of 0.3 yields a public score of 1.01. The mean of the target variable in the training set is 0.44. What is the mean of the target variable in the public part of the test data (up to 4 decimal places)?

Correct answers:

Question 4

Suppose that you are solving an image classification task. What is the label of this picture?

The correct answer is 3. Check the image name!