Instacart Data Science Challenge


Welcome! Instacart runs on data and on our awesome shoppers. One of the important ways we make customers happy is by delivering their groceries on time. To do this, we start by asking: “How long will a shopping trip take?” Let's find out.

This data can be useful!

Your goal is to predict the shopping time (the time elapsed between shopping_started_at and shopping_ended_at, in seconds) for trips in the test set. The shopping time only includes the time it takes for a shopper to pick the items in the store. It does not include driving or delivery time.
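As a rough illustration of the target definition, the sketch below computes the shopping time in seconds from the two timestamps. Only the field names shopping_started_at and shopping_ended_at come from the task; the timestamp format ("YYYY-MM-DD HH:MM:SS") and the sample values are assumptions made for the example.

```python
from datetime import datetime


def shopping_time_seconds(started_at: str, ended_at: str) -> float:
    """Return the shopping duration in seconds between two timestamps."""
    # Assumption: timestamps are ISO-like strings such as "2015-09-01 07:03:56".
    start = datetime.fromisoformat(started_at)
    end = datetime.fromisoformat(ended_at)
    return (end - start).total_seconds()


# Hypothetical trip record; only the two field names come from the task.
trip = {
    "shopping_started_at": "2015-09-01 07:03:56",
    "shopping_ended_at": "2015-09-01 07:30:56",
}
duration = shopping_time_seconds(
    trip["shopping_started_at"], trip["shopping_ended_at"]
)
print(duration)  # 27 minutes -> 1620.0
```

If the real data stores timestamps in another format, the parsing line is the only part that should need to change.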

  1. Perform any data cleaning, exploratory analysis and visualizations you may need to understand the data.
  2. Construct a predictive model and discuss why you chose your approach.
  3. Assess the performance of your model, and discuss any alternatives you considered or concerns you may have.
  4. Generate a CSV file called predictions.csv containing the predictions for the test trips in the following format:



Note: shopping_time must be in seconds.
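Steps 3 and 4 could be sketched roughly as follows: score a simple reference model on held-out trips, then write the predictions as CSV. This is only a sketch under stated assumptions. The median baseline is a reference point to compare a real model against, not a suggested approach; mean absolute error is one possible metric among several; the trip_id column name and all numbers are hypothetical, and only the shopping_time column name is taken from the note above.

```python
import csv
import io
from statistics import median


def mean_absolute_error(actual, predicted):
    """Average absolute gap, in seconds, between actual and predicted times."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def write_predictions(rows, stream):
    """Write (trip_id, shopping_time) pairs as CSV to an open text stream."""
    writer = csv.writer(stream, lineterminator="\n")
    # Assumption: "trip_id" is a placeholder; use the column names the
    # challenge actually requires.
    writer.writerow(["trip_id", "shopping_time"])
    writer.writerows(rows)


# Made-up shopping times in seconds, split into training and held-out trips.
train_times = [1200, 1620, 2400]
holdout_times = [1500, 1800]

# Baseline: predict the training median for every trip.
baseline = median(train_times)
holdout_mae = mean_absolute_error(holdout_times, [baseline] * len(holdout_times))
print(f"baseline MAE on held-out trips: {holdout_mae:.0f} s")

# Hypothetical test trip ids, each given the baseline prediction.
test_trip_ids = [101, 102]
buffer = io.StringIO()
write_predictions([(trip_id, baseline) for trip_id in test_trip_ids], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To create the actual submission file, pass a file opened with open("predictions.csv", "w", newline="") as the stream instead of the in-memory buffer.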

We want you to have the greatest chance of succeeding in this challenge, so please do the following:

  • Use either Python or R (these are the primary tools we use) and any open-source libraries you'd like. Submissions in any other language will not be reviewed
  • Make sure your output follows exactly the format specified above
  • Include a written and/or visual summary of your work (such as an R Markdown file, a Jupyter notebook, or even just a text file or Google Doc) in addition to your code


IMPORTANT: Please put predictions.csv, your code files and your written summary in a zip file named <your-name>



  • predictions.csv
  • code (.py, .r, .rmd, .ipynb, etc.)
  • summary.txt


If you have any questions, do not hesitate to contact us at

Have fun, and good luck!


