CD Laboratory for Advancing the State-of-the-Art of Recommender Systems in Multi-Domain Settings

Head of research unit

Ass.Prof. Mag. Dr. Julia Neidhardt

Technische Universität Wien
Institute of Information Systems Engineering / Research Unit E-Commerce
Karlsplatz 13
1040 Wien

Commercial Partners

Preisvergleich Internet Services AG, Falter-Verlags-Gesellschaft m.b.H., YKMB Software GmbH

Duration

01.01.2022 - 31.12.2028

Thematic Cluster

Mathematics, Computer Sciences, Electronic Engineering

(c) snyGGG, visualaddiction, sdecoret | stock.adobe.com: Similar product recommendations can quickly feel monotonous. This can be changed by making the recommendations more diverse and adding an element of surprise. The overall behaviour of the system should also be taken into account, however, since feedback loops can lead to more bias and filter bubbles.

In recent years, recommender systems and personalisation have gained more and more attention in science, industry and society. The number of websites, social media platforms and online systems that support users' information needs and product searches with personalised suggestions is growing rapidly. With the increasing spread of systems based on artificial intelligence and machine learning, there is also more and more discussion about their role and impact. The extensive public discourse on filter bubbles, fake news, micro-targeting and related phenomena during the COVID crisis and recent elections shows a strong societal interest in this topic. Personalisation and recommender systems can be held responsible for some of these problems - at least in part.

Many traditional recommendation techniques are optimised for accuracy, i.e. they aim to predict how a user will rate certain products based on log files of previous interactions. As a result, the recommendations these systems make to a particular user tend to resemble products that the user has previously purchased or consumed. The "accuracy" of these systems is therefore often quite high, i.e. the system's recommendations are correct, but users are rarely surprised by what is suggested to them. A strong focus on "accuracy" can also make the system increasingly unfair, i.e. it exhibits an increasingly strong bias. This is due to iterative reinforcement: the system adapts its recommendations more and more closely to the user's feedback, which leads to an ever narrower recommendation space ("filter bubble"). In general, however, it is undisputed today that the quality of a recommender system goes far beyond "accuracy". This is why other dimensions are increasingly being taken into account ("beyond accuracy" measures, e.g. novelty, diversity or serendipity). Although this has led to significant improvements, many key challenges remain unresolved. In particular, a deeper understanding of how the "accuracy" of a system is affected by optimising for "beyond accuracy" measures is lacking. Furthermore, there is a clear lack of research into how both accuracy and "beyond accuracy" measures are dynamically related to bias, e.g. whether systems that optimise for diversity also end up being unfair. In addition, user models are usually very simple, and "beyond accuracy" measures are adapted neither to the needs or preferences of specific users or user groups nor to different domains (e.g. news vs. fashion). However, studies clearly show that these aspects should be taken into account, including when analysing fairness or bias.
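The tension between accuracy and diversity or novelty can be illustrated with a minimal Python sketch. It is not the laboratory's method: the item names, feature vectors, popularity shares and the specific metric definitions (precision at k, mean pairwise cosine distance as diversity, mean self-information as novelty) are assumptions chosen purely for illustration.

```python
import math

def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommended items that the user found relevant."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k

def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def intra_list_diversity(recommended, features):
    """Mean pairwise cosine distance between the items of one list."""
    pairs = [(a, b) for i, a in enumerate(recommended) for b in recommended[i + 1:]]
    if not pairs:
        return 0.0
    return sum(cosine_distance(features[a], features[b]) for a, b in pairs) / len(pairs)

def novelty(recommended, popularity):
    """Mean self-information -log2(p) of the items; rarer items score higher."""
    return sum(-math.log2(popularity[item]) for item in recommended) / len(recommended)

# Hypothetical toy data (assumption, not taken from the project).
features = {"phone_a": [1.0, 0.0], "phone_b": [1.0, 0.0], "novel": [0.0, 1.0]}
popularity = {"phone_a": 0.5, "phone_b": 0.5, "novel": 0.125}  # share of users who interacted
relevant = {"phone_a", "phone_b"}  # items the user actually liked

similar_list = ["phone_a", "phone_b"]  # more of the same
mixed_list = ["phone_a", "novel"]      # one familiar item, one unfamiliar item

for name, rec in [("similar", similar_list), ("mixed", mixed_list)]:
    print(
        name,
        "precision@2 =", precision_at_k(rec, relevant, 2),
        "diversity =", intra_list_diversity(rec, features),
        "novelty =", novelty(rec, popularity),
    )
```

On this toy data the list of near-identical items reaches perfect precision but has no diversity, while the mixed list gives up some precision and gains both diversity and novelty.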

This CD Laboratory addresses such questions. Novel approaches are being developed that are better adapted to the domain and to the preferences and needs of different users or user groups, while at the same time detecting bias. To this end, "beyond accuracy" measures such as novelty, diversity and serendipity are taken into consideration. A multi-level user model is introduced that describes users on three levels, namely the individual level, the group level and the network level. To make these levels jointly accessible and usable in recommendation algorithms, rich and multi-layered features are extracted and used for learning user and product embeddings in a common metric space. Furthermore, in order to obtain a robust user model that actually helps to systematically investigate "beyond accuracy" measures for different domains, machine learning and data mining are combined with frameworks from the social sciences (in particular social network analysis and Bourdieu's theory of social space). Another very important and under-researched goal is to model the temporal dynamics of the different interrelated outcome dimensions (i.e. accuracy, diversity, novelty and serendipity) and bias. Based on this, a framework will be developed to increase the fairness of such systems and minimise bias depending on the domain.
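What a common metric space makes possible can be sketched in a few lines of Python: once users and products are represented as vectors in the same space, a recommendation reduces to a nearest-neighbour lookup. The vectors below are hand-written placeholders rather than learned embeddings, and the item names, the two dimensions and the choice of Euclidean distance are all assumptions; the source does not specify how the laboratory learns or compares its embeddings.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def recommend(user_vec, item_vecs, k):
    """Return the names of the k items closest to the user in the shared space."""
    ranked = sorted(item_vecs, key=lambda name: euclidean(user_vec, item_vecs[name]))
    return ranked[:k]

# Hypothetical two-dimensional embeddings (assumption: not learned, not from the project).
user = [0.9, 0.1]
items = {
    "news_politics": [1.0, 0.0],
    "news_sports": [0.6, 0.4],
    "fashion_coat": [0.0, 1.0],
}

top2 = recommend(user, items, 2)
print(top2)
```

Because users and products share one space, the same distance function could also be used to compare users with each other or to measure how far apart the items in a recommendation list are.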

Links

CD Laboratory website

Christian Doppler Forschungsgesellschaft

Boltzmanngasse 20/1/3 | 1090 Wien | Tel: +43 1 5042205 | Fax: +43 1 5042205-20 | office@cdg.ac.at
