1. Background
Prior research on ranking, recommendation, and retrieval systems has focused on maximizing relevance, that is, on surfacing the most relevant items for the consumer based solely on the consumer’s objective. However, some research shows that machine learning techniques tend to “unfairly” favor certain individuals or groups. This paper identifies pitfalls that existing approaches miss: "provider utility beyond position-based exposure, spillover effects, induced strategic incentives, and the effect of statistical uncertainty". In addition, it names data and legal bottlenecks as factors that hinder fair ranking.
2. Research Questions
- How do we define and measure 'fair' ranking?
- What are the pitfalls of existing ranking models?
3. Data and Methods
This paper mainly reviews other relevant papers and supports or criticizes them.
An appendix provides worked examples that illustrate the concepts and situations discussed in the paper, in the form of hypothetical scenarios presented as tables. The paper also reviews other research that used datasets such as COMPAS and German Credit.
4. Findings
Fair ranking literature uses probability-based and exposure-based approaches to define "fairness".
Probability-based approaches focus on the fairness of the top-k ranking positions. They provide a minimum proportion of items/individuals from protected groups.
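As a minimal sketch of the probability-based idea, the check below tests whether the top-k positions of a ranking contain at least a minimum share of protected items. The function name `meets_min_proportion`, its parameters, and the example ranking are hypothetical and chosen only for illustration. They are not taken from the paper or from any specific fair ranking method.

```python
from typing import Sequence


def meets_min_proportion(
    is_protected: Sequence[bool], k: int, min_proportion: float
) -> bool:
    """Return True if the top-k positions hold at least min_proportion protected items.

    is_protected lists the ranking from best position to worst; True marks an
    item or individual from a protected group (an assumed encoding).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = is_protected[:k]
    if not top_k:
        return False
    protected_share = sum(top_k) / len(top_k)
    return protected_share >= min_proportion


# Hypothetical ranking, best position first.
ranking = [False, False, True, False, True, True]
print(meets_min_proportion(ranking, k=4, min_proportion=0.25))  # 1 of 4 protected
print(meets_min_proportion(ranking, k=2, min_proportion=0.25))  # 0 of 2 protected
```

The check looks only at who occupies the top-k positions, which is exactly the position-centered view discussed in this section.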
In contrast, exposure-based approaches emphasize fair allocation of exposure among providers. This assumes exposure is measured mainly by the provider’s position in the ranking.
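The exposure-based view can be sketched as follows, assuming a logarithmic position discount of the kind used in DCG-style metrics. The discount, the function names, and the provider labels are assumptions made for illustration. The paper does not prescribe this particular exposure model.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence


def position_exposure(rank: int) -> float:
    """Exposure of 1-indexed position `rank` under an assumed logarithmic discount."""
    return 1.0 / math.log2(rank + 1)


def exposure_by_provider(ranking: Sequence[str]) -> Dict[str, float]:
    """Sum position-based exposure for each provider appearing in the ranking."""
    totals: Dict[str, float] = defaultdict(float)
    for rank, provider in enumerate(ranking, start=1):
        totals[provider] += position_exposure(rank)
    return dict(totals)


# Hypothetical ranking: each entry is the provider of the item at that position.
ranking = ["A", "B", "A", "C"]
exposure = exposure_by_provider(ranking)
total = sum(exposure.values())
for provider, value in sorted(exposure.items()):
    print(f"{provider}: exposure={value:.3f}, share={value / total:.3f}")
```

An exposure-based fairness criterion would then compare these shares against some target allocation, for example one proportional to relevance.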
Both approaches center on position when measuring fairness.
However, this paper argues that provider utility may not be proportional to position. It claims that both approaches miss context-specific factors, because of which higher exposure does not necessarily lead to increased user attention. Context, time, and the particular user to whom an item is shown are further factors that need to be considered in addition to position.
Spillover effects, or externalities, should also be considered in fair ranking. Items that are exposed early gain a disproportionate long-term advantage: they are over-rewarded simply for having been shown first, and they tend to stay highly ranked afterward. The positive spillover that items receive when third-party services recommend similar items is also not taken into account. Lastly, items that users reach by entering through search engines are not scored fairly.
Current fair ranking algorithms often fail to consider that providers may be strategic players who act to maximize their own profit. For example, a provider may duplicate its listings, with some copies posing as a different provider, in order to be scored higher.
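The duplicate-listing incentive can be illustrated with a small sketch, again assuming a logarithmic position discount. The listing names, the `owner_of` mapping, and the function `exposure_by_owner` are hypothetical. The point is only that an algorithm which treats an alias as a separate provider underestimates the exposure its true owner receives.

```python
import math
from typing import Dict, Sequence


def exposure_by_owner(
    ranking: Sequence[str], owner_of: Dict[str, str]
) -> Dict[str, float]:
    """Sum assumed log-discounted exposure per owner.

    owner_of maps a listing name to its true owner; listings missing from the
    mapping are treated as their own owner.
    """
    totals: Dict[str, float] = {}
    for rank, listing in enumerate(ranking, start=1):
        owner = owner_of.get(listing, listing)
        totals[owner] = totals.get(owner, 0.0) + 1.0 / math.log2(rank + 1)
    return totals


# Hypothetical ranking in which provider B also lists under the alias "B_alias".
ranking = ["A", "B", "B_alias", "C"]

# What a ranking algorithm sees if it cannot link the alias to B.
apparent = exposure_by_owner(ranking, owner_of={})
# What B actually receives once the alias is attributed to it.
actual = exposure_by_owner(ranking, owner_of={"B_alias": "B"})

print(f"apparent exposure of B: {apparent['B']:.3f}")
print(f"actual exposure of B:   {actual['B']:.3f}")
```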
The last pitfall the paper points out is that machine-learned models and statistical techniques are trained in settings and contexts that differ from the real world.
In addition, the paper mentions data and legal bottlenecks. Most of the datasets in use (e.g., COMPAS or German Credit) are far from the contexts in which fair ranking algorithms would be deployed. Such datasets are useful for advancing the conceptual state of the art in algorithmic fairness research, but whether results obtained on them hold in real-world scenarios is doubtful. The legal hurdles consist mainly of restrictions on acquiring data.