Provincialeweg 11, Zaandam, Netherlands

Innovation Center for AI Meetup

ICAI AIRLab Edition - Search and recommendation

We share recent research on AI and recommender systems for retail in two talks: one from the data science team at Bol.com and one from the University of Amsterdam.

14.45 Doors open

15.00 Opening

15.15 Rolf Jagerman (U. Amsterdam) will talk about Non-stationary Offline Evaluation of Recommender Systems

15.45 Barrie Kersbergen and Pim Nauts (Bol.com) will talk about From Reco to Relevance

16.15 Drinks

Talk 1.
Title: Non-stationary Offline Evaluation of Recommender Systems
Speaker: Rolf Jagerman (U. Amsterdam)
Abstract: Collecting online metrics via A/B testing is a gold standard for evaluating a new policy, e.g., a new feature or recommendation model. However, A/B tests are not without pitfalls. They can be expensive in terms of engineering or logistical overhead, and they can even harm the user experience, because an untested policy is exposed to the end users of the system.
Methods from off-policy evaluation tell us how historical interaction data can be leveraged to estimate the performance of a new policy offline, alleviating some of the problems surrounding A/B testing. However, existing off-policy estimators do not work well in non-stationary environments, that is, environments where users’ behavior changes over time.
I will provide a brief introduction to the non-stationary off-policy evaluation problem and introduce two new estimators that work significantly better than standard estimators in non-stationary environments. Our findings open the way for off-policy estimators to be applied in practical real-world settings where non-stationarity is prevalent.
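
For readers new to off-policy evaluation, the sketch below shows the standard inverse propensity scoring (IPS) estimator, a common baseline for estimating a new policy's average reward from logged interactions. It is not one of the two new estimators presented in the talk. The log format, the function names, and the toy data are assumptions made purely for illustration.

```python
def ips_estimate(logs, target_prob):
    """Standard IPS estimate of a target policy's average reward.

    logs: list of (context, action, reward, logging_prob) tuples, where
          logging_prob is the probability with which the logging policy
          chose the action (hypothetical log format).
    target_prob: function (context, action) -> probability of that action
          under the new policy being evaluated.
    """
    total = 0.0
    for context, action, reward, logging_prob in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action.
        weight = target_prob(context, action) / logging_prob
        total += weight * reward
    return total / len(logs)


# Toy data (assumed): a uniform logging policy over two actions,
# where action 0 always yields reward 1 and action 1 yields reward 0.
logs = [
    (None, 0, 1.0, 0.5),
    (None, 1, 0.0, 0.5),
    (None, 0, 1.0, 0.5),
    (None, 1, 0.0, 0.5),
]


def always_first(context, action):
    # Hypothetical new policy that always picks action 0.
    return 1.0 if action == 0 else 0.0


estimate = ips_estimate(logs, always_first)
print(estimate)
```

Note that this estimator averages over the whole log with equal weight per interaction, which implicitly assumes that user behavior did not change while the log was collected. That is the stationarity assumption the abstract calls into question.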

Talk 2.
Title: From Reco to Relevance
Speakers: Barrie Kersbergen & Pim Nauts (Bol.com)
Abstract: Smart is one thing, scale is another. What are the challenges we face when computing the interest graphs that help our customers discover the products they want, when they want them? In this talk, Barrie and Pim will share our approach to recommending products at scale, from the data we collect and use to the operational nitty-gritty of dealing with a catalog of tens of millions of products and 8 million users.