2nd Workshop: Perspectives on the Evaluation of Recommender Systems
Workshop at ACM Recommender Systems 2022


Evaluation is essential when conducting rigorous research in recommender systems (RS). It may span the evaluation of early ideas and approaches up to elaborate systems in operation, and it may target a wide spectrum of aspects. Naturally, we do (and have to) take various perspectives on the evaluation of RS. The term “perspective” may, for instance, refer to the various purposes of a RS, the various stakeholders affected by a RS, or the potential risks that ought to be minimized. Further, we have to consider that different methodological approaches and experimental designs represent different perspectives on evaluation. A perspective on the evaluation of RS may also be substantially shaped by the available resources, which will likely differ for, say, PhD students compared to established researchers in industry.

Acknowledging that there are various perspectives on the evaluation of RS, we want to put up for discussion whether there is a “golden standard” for the evaluation of RS and, if so, whether it is indeed “golden” in any sense. We postulate that the various perspectives are valid and reasonable, and we reach out to the community to discuss and reason about them.

The goal of the workshop is to capture the current state of evaluation and to gauge whether there is, or should be, a different target that RS evaluation should strive for. The workshop addresses the question of where we should go from here as a community, and aims to come up with concrete steps for action.

We have a particularly strong commitment to inviting and integrating researchers at the beginning of their careers, and we equally want to involve established researchers and practitioners, from industry and academia alike. It is our particular concern to give a voice to the various perspectives involved.

Call for Papers

Topics of interest include, but are not limited to, the following:

  • Case studies of difficult, hard-to-evaluate scenarios
  • Evaluations with contradicting results
  • Showcasing (structural) problems in RS evaluation
  • Integration of offline and online experiments
  • Multi-stakeholder evaluation
  • Divergence between evaluation goals and what is actually captured by the evaluation
  • Nontrivial and unexpected experiences from practitioners

We deliberately solicit papers reporting problems and (negative) experiences with RS evaluation, as we consider reflection on unsuccessful, inadequate, or insufficient evaluations a fruitful source for yet another perspective on RS evaluation, one that can spark discussion at the workshop. This also includes papers reporting negative study results. Accordingly, submissions may also address the following themes:

(a) “lessons learned” papers describing the successful application of RS evaluation, or “post mortem” analyses of specific evaluation strategies that failed to uncover decisive elements,
(b) “overview papers” analyzing patterns of challenges or obstacles to evaluation,
(c) “solution papers” presenting solutions for specific evaluation scenarios, and
(d) “visionary papers” discussing novel and future evaluation aspects.


We solicit two forms of contributions. First, we solicit paper submissions that will undergo peer review. Accepted papers will be published and presented at the workshop. Second, we offer the opportunity to present ideas without a paper submission. In this case, we call for the submission of abstracts that will be reviewed by the workshop organizers. Accepted abstracts will be presented at the workshop, but not published.

Paper Submissions

We solicit papers of 4 to 12 pages (excluding references). In line with this year’s call for papers for the main conference, we do not distinguish between full and short (or position) papers. Papers should be formatted using CEURART’s single-column template.

Submitted papers must not be under review at any other conference, workshop, or journal at the time of submission. Papers should be submitted through the workshop’s EasyChair page.

Submissions will undergo single-blind peer review by at least three program committee members and will be selected based on quality, novelty, clarity, and relevance. Authors of accepted papers will be invited to present their work during the workshop, and accepted papers will be published as open-access workshop proceedings. At least one author of each accepted paper must attend the workshop and present the work.

Abstract Submissions

We solicit abstracts of 200–350 words, to be submitted through the workshop’s EasyChair page.

The workshop organizers will select abstracts based on quality, clarity, relevance, and their potential to spark interesting discussion during the workshop. Authors of accepted abstracts will be invited to present their work during the workshop.

Important Dates

  • Paper submission: July 29th, 2022 (AoE); extended deadline: August 4th, 2022 (AoE)
  • Author notification: August 26th, 2022
  • Camera-ready version: September 9th, 2022
  • Workshop (on-site meeting): September 22, 2022, 9:00–17:30, Seattle, WA, USA (as part of RecSys 2022), Ballroom B
  • Workshop (virtual): September 22, 2022, 9:00–12:30 (PDT, UTC/GMT -7 hours) (as part of RecSys 2022), Room Cedar


Please watch the videos of the accepted papers before the workshop takes place. There will be no time at the workshop to view videos; instead, we will focus on discussion and active participation.

Thursday, September 22nd, 2022, 09:00-17:00

Part 1 (hybrid): Ballroom B

09:00-09:20 Welcome & Introduction
09:20-10:10 Keynote & Q&A: Kim Falk
10:10-10:30 Topic pitch (5 min pitch, 5 min questions)

10:10-10:20 Topic 1 - What aspects of recommendation should be tested and evaluated
10:20-10:30 Topic 2 - Challenges, Approaches and Evaluation Metrics

10:30-11:00 Break

11:00-12:15 Group discussions (topic 1 and topic 2) on site (breakout rooms on Zoom)
12:15-12:30 Wrap up

12:30-14:00 Lunch Break

Part 2 (onsite only), Room Cedar

14:00-16:30 Open round-table discussion
16:30-17:00 Wrap up

Times are in Pacific Daylight Time (PDT) (Seattle, WA, local time).


Keynote: What is the end goal and how do we evaluate it?

Kim Falk

Kim Falk is a Staff Recommender Engineer at Shopify, where he is the technical lead of the Product Recommendations team. Kim has broad experience in machine learning and specializes in recommender systems. He has previously worked on recommenders for clients such as BT and RTL+, and on personalization at IKEA. He is the author of Practical Recommender Systems.


Recording of the Keynote

Accepted Contributions

All Teaser Videos on a single page.

Extended abstract about the workshop as part of the RecSys proceedings.

Proceedings (ceur-ws).

Accepted Papers

Learning Choice Models for Simulating Users' Interactions with Recommender Systems
Naieme Hazrati, Francesco Ricci

Towards Comparing Recommendation to Multiple-Query Search Sessions for Talent Search
Mesut Kaya, Toine Bogers

Recommender Systems Are Not Everything: Towards a Broader Perspective in Recommender Evaluation
Benedikt Loepp

Diversifying Sentiments in News Recommendation
Mete Sertkan, Sophia Althammer, Sebastian Hofstätter, Julia Neidhardt

Are We Forgetting Something? Correctly Evaluate a Recommender System With an Optimal Training Window
Robin Verachtert, Lien Michiels, Bart Goethals

CaMeLS: Cooperative Meta-Learning Service for Recommender Systems
Lukas Wegmeth, Joeran Beel

Accepted Abstracts

Should Algorithm Evaluation Extend to Testing? We Think So
Lien Michiels, Robin Verachtert, Kim Falk, Bart Goethals

Multi-domain news recommender systems at Globo: Challenges, Approaches and Evaluation Metrics
Joel Pinho Lucas, Letícia Freire de Figueiredo, Felipe Alves Ferreira

Program Committee


  • Alejandro Bellogin (Universidad Autónoma de Madrid, Spain)
  • Toine Bogers (Aalborg University Copenhagen, Denmark)
  • Paolo Cremonesi (Politecnico di Milano, Italy)
  • Amra Delić (University of Sarajevo, Bosnia and Herzegovina)
  • Mehdi Elahi (University of Bergen, Norway)
  • Andrés Ferraro (McGill University & Mila (Quebec AI Institute), Canada)
  • Bruce Ferwerda (Jönköping University, Sweden)
  • Hanna Hauptmann (Utrecht University, The Netherlands)
  • Dietmar Jannach (AAU Klagenfurt, Austria)
  • Mesut Kaya (Aalborg University Copenhagen, Denmark)
  • Jaehun Kim (Delft University of Technology, The Netherlands)
  • Bart Knijnenburg (Clemson University, USA)
  • Dominik Kowald (Know-Center Graz, Austria)
  • Julia Neidhardt (TU Wien, Austria)
  • Maria Soledad Pera (Boise State University, USA)
  • Manel Slokom (TU Delft, The Netherlands)
  • Marko Tkalčič (University of Primorska, Slovenia)
  • Martijn C. Willemsen (Eindhoven University of Technology, The Netherlands)
  • Markus Zanker (Free University of Bozen-Bolzano, Italy)