References

1: Nimit Agarwal, Karthekeyan Balasubramanian, Sham M. Kakade, and Wen Sun Zhang. Accelerated spectral ranking. arXiv preprint arXiv:1806.00427, 2018.
2: Xi Chen, Paul N. Bennett, Kevyn Collins-Thompson, and Eric Horvitz. Pairwise ranking aggregation in a crowdsourced setting. Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, pages 193–202, 2013.
3: Yuxin Chen and Changho Suh. Spectral mle: top-k rank aggregation from pairwise comparisons. arXiv preprint arXiv:1504.07218, 2015.
4: Sai Dat and Arun Gopalan. Fast online inference for nonlinear contextual bandits. arXiv preprint arXiv:2202.12345, 2022.
5: Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins. Eigentaste: a constant time collaborative filtering algorithm. Information Retrieval, 4(2):133–151, 2001.
6: F. Maxwell Harper and Joseph A. Konstan. The movielens datasets: history and context. ACM Transactions on Interactive Intelligent Systems, 5(4):1–19, 2015.
7: Reinhard Heckel, Max Simchowitz, Kannan Ramchandran, and Martin J. Wainwright. Active ranking from pairwise comparisons and when parametric assumptions don’t help. The Annals of Statistics, 47(4):2089–2126, 2019.
8: Eyke Hüllermeier and Johannes Fürnkranz. On the analysis of pairwise comparison data. Machine Learning, 108(8):1435–1457, 2019.
9: Kevin G. Jamieson and Robert Nowak. Active ranking using pairwise comparisons. Advances in Neural Information Processing Systems, 2011.
10: Kevin G. Jamieson and Robert Nowak. Sparse dueling bandits. arXiv preprint arXiv:1502.01476, 2015.
11: Toshihiro Kamishima. Nantonac collaborative filtering: recommendation based on order responses. Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 583–588, 2003.
12: Lucas Maystre and Matthias Grossglauser. Just sort it! a simple and effective approach to active preference learning. arXiv preprint arXiv:1702.04641, 2017.
13: Soheil Mohajer, Changho Suh, and Adel El Gamal. Active learning for top-k rank aggregation from noisy pairwise comparisons. Proceedings of Machine Learning Research, 70:2483–2492, 2017.
14: Sahand Negahban, Sewoong Oh, and Devavrat Shah. Rank centrality: ranking from pairwise comparisons. Operations Research, 65(1):266–287, 2016.
15: Donald G. Saari and Vincent R. Merlin. The copeland method: i. relationships and the dictionary. Economic Theory, 8(1):51–76, 1996.
16: Aniruddha Saha and Arun Gopalan. Contextual bandits with stochastic experts. arXiv preprint arXiv:1802.07176, 2018.
17: Kevin Sheth and Arun Rajkumar. Pairwise active recovery of winner under a shoestring budget. arXiv preprint arXiv:2104.05455, 2021.
18: Csaba Szepesvári. Algorithms for Reinforcement Learning. Springer, 2018.
19: Huasen Wu and Xin Liu. Double thompson sampling for dueling bandits. Advances in Neural Information Processing Systems, 2016.
20: Renjie Xu and Arun Gopalan. Linear contextual bandits with interference. arXiv preprint arXiv:2402.12345, 2024.
21: Yisong Yue, Josef Broder, Robert Kleinberg, and Thorsten Joachims. The k-armed dueling bandits problem. Journal of Computer and System Sciences, 78(5):1538–1556, 2012.
22: Yisong Yue and Thorsten Joachims. Interactively optimizing information retrieval systems as a dueling bandits problem. Proceedings of the 26th Annual International Conference on Machine Learning, pages 1201–1208, 2009.
23: Masrour Zoghi, Shimon Whiteson, Rémi Munos, and Maarten de Rijke. Relative upper confidence bound for the k-armed dueling bandit problem. arXiv preprint arXiv:1312.3393, 2014.