Aussie AI

Top-k Vector Algorithms

Last Updated 11 June, 2025

by David Spuler, Ph.D.

What is the Top-k Vector Algorithm?

The top-k algorithm on an array of numbers is a well-known theoretical algorithm in Computer Science theory. Longstanding theory examines the top-k algorithm in sequential algorithms, with both sorting and non-sorting versions. More recent theory has examined parallel top-k numeric algorithms using GPU-accelerated execution.

Sequential top-k algorithms have a significant body of theory. The simplest algorithm is to sort the array first, and then choose the first k elements from the sorted array. Sorting an array is an algorithm that also has its own huge body of theory, and the better sorting algorithms can be chosen. This has complexity O(n log n) for most sorting algorithms, and there is only an additional O(1) complexity for choosing k elements from a sorted list. However, sorting the entire array is somewhat wasteful and unnecessary, since we only want a subset, and there are several efficient top-k algorithms without sorting that optimize this overhead away.

Top-K Computation on GPUs

Papers on computation of vector top-k in parallel using GPU hardware:

Xi Xie, Yuebo Luo, Hongwu Peng, Caiwen Ding, 1 Sep 2024, RTop-K: Ultra-Fast Row-Wise Top-K Algorithm and GPU Implementation for Neural Networks, https://arxiv.org/abs/2409.00822v1
Anil Gaihre, Da Zheng, Scott Weitze, Lingda Li, Shuaiwen Leon Song, Caiwen Ding, Xiaoye S Li, Hang Liu, 16 Sep 2021, Dr. Top-k: Delegate-Centric Top-k on GPUs, https://arxiv.org/abs/2109.08219
Jingrong Zhang, Akira Naruse, Xipeng Li, and Yong Wang. 2023. Parallel Top-K Algorithms on GPU: A Comprehensive Study and New Methods. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '23). Association for Computing Machinery, New York, NY, USA, Article 76, 1–13. https://doi.org/10.1145/3581784.3607062 https://dl.acm.org/doi/abs/10.1145/3581784.3607062 PDF: https://dl.acm.org/doi/pdf/10.1145/3581784.3607062
Yifei Li, Bole Zhou, Jiejing Zhang, Xuechao Wei, Yinghan Li, and Yingda Chen. 2024. POSTER: RadiK: Scalable Radix Top-K Selection on GPUs. In Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP '24). Association for Computing Machinery, New York, NY, USA, 472–474. https://doi.org/10.1145/3627535.3638478 https://dl.acm.org/doi/abs/10.1145/3627535.3638478
Yifei Li, Bole Zhou, Jiejing Zhang, Xuechao Wei, Yinghan Li, and Yingda Chen. 2024. RadiK: Scalable and Optimized GPU-Parallel Radix Top-K Selection. In Proceedings of the 38th ACM International Conference on Supercomputing (ICS '24). Association for Computing Machinery, New York, NY, USA, 537–548. https://doi.org/10.1145/3650200.3656596 https://dl.acm.org/doi/abs/10.1145/3650200.3656596

Parallel GPU Sorting Research

Sorting algorithms have fascinated researchers for over fifty years. More recent papers have generalized this to GPU sorting, and these sorting methods could be used in a top-k algorithm, as in the papers below:

Stehle, E., and Jacobsen, H.-A., A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs. Proceedings of the 2017 ACM International Conference on Management of Data (2017), ACM, pp. 417–432. https://dl.acm.org/doi/10.1145/3035918.3064043, https://arxiv.org/pdf/1611.01137.pdf (Parallel sorting algorithms with hybrid radix-sort on GPU.)
Bell, N., and Hoberock, J. Thrust: A Productivity-Oriented Library for CUDA. GPU computing gems, Jade edition. Elsevier, 2012, pp. 359–371. https://research.nvidia.com/publication/2011-10_thrust-productivity-oriented-library-cuda, PDF: https://research.nvidia.com/sites/default/files/pubs/2011-10_Thrust-A-Productivity-Oriented/Thrust%20-%20A%20Productivity-Oriented%20Library%20for%20CUDA.pdf (CUDA-based parallel on-GPU sorting.)
D. Merrill, A. Grimshaw, Revisiting sorting for gpgpu stream architectures, Technical Report CS2010-03, University of Virginia, Department of Computer Science, Charlottesville, VA, 2010, https://dl.acm.org/doi/10.1145/1854273.1854344, https://mvdirona.com/jrh/TalksAndPapers/RadixSortTRv2.pdf (Parallel radix sorting on GPUs.)
N. Satish, M. Harris, M. Garland, Designing efficient sorting algorithms for manycore GPUs, in Proceedings 23rd IEEE Int’l Parallel & Distributed Processing Symposium, IEEE Computer Society, Washington, DC, 2009. https://ieeexplore.ieee.org/document/5161005, PDF: https://www.nvidia.com/docs/IO/67073/nvr-2008-001.pdf (Fast sorting on GPUs including parallel radix sort and mergesort.)
Dmitri I. Arkhipov, Di Wu, Keqin Li, Amelia C. Regan, 2017, Sorting with GPUs: A Survey, https://arxiv.org/abs/1709.02520
Obeya, O., Kahssay, E., Fan, E., and Shun, J., Theoretically-efficient and Practical Parallel In-place Radix Sorting. In The 31st ACM Symposium on Parallelism in Algorithms and Architectures (2019), pp. 213–224. https://dl.acm.org/doi/10.1145/3323165.3323198, PDF Slides: https://people.csail.mit.edu/jshun/6886-s20/lectures/lecture5-2.pdf
Alabi, T., Blanchard, J. D., Gordon, B., and Steinbach, R., 2012, Fast 𝑘-Selection Algorithms for Graphics Processing Units. Journal of Experimental Algorithmics 17 (2012), 4–2, PDF: https://blanchard.math.grinnell.edu/Research/ABGS_KSelection.pdf (Examines sorting-based and 3 new algorithms for on-GPU top-k without sorting.)
M Kruliš, H Osipyan, S Marchand-Maillet, 2015, Optimizing sorting and top-k selection steps in permutation based indexing on GPUs, ADBIS 2015: New Trends in Databases and Information Systems, pp 305–317, https://link.springer.com/chapter/10.1007/978-3-319-23201-0_33
Sam Olesker-Taylor, 7 Feb 2025, A Randomised Approach to Distributed Sorting, https://arxiv.org/abs/2502.05082

Parallel GPU Top-k Algorithm Research without Sorting

There has been recent research attention on optimizing the standard top-k algorithms to run in parallel using GPU acceleration, but without using a full sort algorithm:

Wang, L., Wu, W., Zhang, J., Liu, H., Bosilca, G., Herlihy, M., and Fonseca, R., FFT-based Gradient Sparsification for the Distributed Training of Deep Neural Networks. Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing (2020), pp. 113–124, https://dl.acm.org/doi/abs/10.1145/3369583.3392681, https://arxiv.org/pdf/1811.08596.pdf (FFT-based top-k algorithm.)
Guo, C., Chen, H., Li, C., and Wu, T., A Memory Access Reduced Sort on Multi-core GPU. In International Conference on High Performance Computing and Communications; International Conference on Smart City; International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (2018), IEEE, pp. 586–593. https://ieeexplore.ieee.org/document/8622846
A Shanbhag, H Pirk, S Madden, May 2018, Efficient top-k query processing on massively parallel hardware, SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data, Pages 1557–1570, https://doi.org/10.1145/3183713.3183735, https://dl.acm.org/doi/abs/10.1145/3183713.3183735, https://www.doc.ic.ac.uk/~hlgr/pdfs/MassivelyParallelTopK.pdf
Alabi, T., Blanchard, J. D., Gordon, B., and Steinbach, R., 2012, Fast 𝑘-Selection Algorithms for Graphics Processing Units. Journal of Experimental Algorithmics 17 (2012), 4–2, PDF: https://blanchard.math.grinnell.edu/Research/ABGS_KSelection.pdf (Examines sorting-based and 3 new algorithms for on-GPU top-k without sorting.)
V Zois, VJ Tsotras, WA Najjar, 2018, GPU accelerated top-k selection with efficient early stopping, https://par.nsf.gov/servlets/purl/10123866 PDF: https://adms-conf.org/2019-camera-ready/zois_adms19.pdf (Examines heap-based top-k algorithms and data partitioning methods on GPUs.)
M Kruliš, H Osipyan, S Marchand-Maillet, 2015, Optimizing sorting and top-k selection steps in permutation based indexing on GPUs, ADBIS 2015: New Trends in Databases and Information Systems, pp 305–317, https://link.springer.com/chapter/10.1007/978-3-319-23201-0_33
Anil Gaihre, Da Zheng, Scott Weitze, Lingda Li, Shuaiwen Leon Song, Caiwen Ding, Xiaoye S Li, Hang Liu, Sep 2021, Dr. Top-k: Delegate-Centric Top-k on GPUs, https://arxiv.org/abs/2109.08219

Classic Sequential Top-k Algorithm Research

This is older research examining non-parallel algorithms for computing top-k from an array of numbers:

Abbar S, Amer-Yahia S, Indyk P, Mahabadi S, 2013, Real-time recommendation of diverse related articles. Proceedings of the 22nd international conference on World Wide Web, pp. 1–12, https://www.mit.edu/~mahabadi/papers/2013_AAIM.pdf (Examines dLSH hashing, MMR, and greedy max-min algorithms for top-k.)
Agrawal R, Gollapudi S, Halverson A, Ieong S, 2009, Diversifying search results. Proceedings of the second ACM international conference on web search and data mining, pp. 5–14, https://dl.acm.org/doi/10.1145/1498759.1498766 (A general algorithm for top-k.)
Carbonell J, Goldstein J, 1998, The use of MMR, diversity-based reranking for reordering documents and producing summaries, Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 335–33, PDF: https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf (Early paper on MMR algorithm for top-k.)
Fraternali P, Martinenghi D, Tagliasacchi M, 2012, Top-k bounded diversification. Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pp. 421–432, https://dl.acm.org/doi/10.1145/2213836.2213884 https://www.researchgate.net/publication/229440521_Top-k_bounded_diversification (Extension of the MMR algorithm called SPP for top-k calculation.)
Gollapudi S, Sharma A, 2009, An axiomatic approach for result diversification. Proceedings of the 18th international conference on World wide web, pp. 381–390, PDF: https://theory.stanford.edu/~aneeshs/papers/diversification.pdf (Max-Sum diversification algorithm for top-k.)
Md Mouinul Islam, Mahsa Asadi, Sihem Amer-Yahia, Senjuti Basu Roy, Oct 2023, A Generic Framework for Efficient Computation of Top-k Diverse Results, https://hal.science/hal-04239842/document (Algorithm uses an m-ary tree data structure, compared to MMR, GMM, and SWAP algorithms for top-k.)
CPP Reference, Sep 2023, std::partial_sort, https://en.cppreference.com/w/cpp/algorithm/partial_sort (std::partial_sort in C++ probably uses the "heap select" or "heap sort" algorithms in the code.)
Wikipedia, 2023, Selection algorithm, https://en.wikipedia.org/wiki/Selection_algorithm
Chunbin Lin, Jiaheng Lu, Zhewei Wei, Jianguo Wang, Xiaokui Xiao, 2017, Optimal algorithms for selecting top-k combinations of attributes: theory and applications, The VLDB Journal DOI 10.1007/s00778-017-0485-2, PDF: https://www.cs.purdue.edu/homes/csjgwang/pubs/VLDBJ18_TopK.pdf (Generalizes top-k of an array to top-k of combinations of results.)
Bruno, N., Gravano, L., Marian, A., 2002, Evaluating top-k queries over web-accessible databases. ICDE, pp. 369–380, PDF: http://www1.cs.columbia.edu/~gravano/Papers/2004/tods04.pdf (Analysis of several top-k algorithms.)
Chang, K.C.-C., Hwang, S.-W., 2002, Minimal probing: supporting expensive predicates for top-k queries. SIGMOD, pp. 346–357, https://dl.acm.org/doi/10.1145/564691.564731, PDF: http://delab.csd.auth.gr/~manolopo/oikonomiko/chang.pdf
Fagin, R., Kumar, R., Sivakumar, D., 2003, Comparing top k lists. SIDMA 17(1), 134–160, https://dl.acm.org/doi/10.5555/644108.644113
Fagin, R., Lotem, A., Naor, M., 2003, Optimal aggregation algorithms for middleware. JCSS 66(4), 614–656, https://arxiv.org/abs/cs/0204046
Li, C., Chen-Chuan Chang, K., Ilyas, I.F., 2006, Supporting ad-hoc ranking aggregates. SIGMOD, pp. 61–72, https://dl.acm.org/doi/abs/10.1145/1142473.1142481
Soliman, M.A., Ilyas, I.F., Chang, K.C.-C., 2008, Probabilistic top-k and ranking-aggregate queries. TODS 33(3), 13, https://dl.acm.org/doi/10.1145/1386118.1386119
Ilyas, I.F., Beskales, G., Soliman, M.A., 2008, A survey of top-k query processing techniques in relational database systems. CSUR 40(4), 11, https://dl.acm.org/doi/10.1145/1391729.1391730
Ilyas, I.F., Aref, W.G., Elmagarmid, A. K., 2002, Joining ranked inputs in practice. VLDB, pp. 950–961, https://dl.acm.org/doi/10.5555/1287369.1287453
Michel, S., Triantafillou, P., Weikum, G., 2005, Klee: a framework for distributed top-k query algorithms. VLDB, pp. 637–648, https://dl.acm.org/doi/abs/10.5555/1083592.1083667, PDF: https://www.vldb.org/conf/2005/papers/p637-michel.pdf
Theobald, M., Schenkel, R., Weikum, G., 2005, An efficient and versatile query engine for topx search. VLDB, pp. 625–636, https://dl.acm.org/doi/10.5555/1083592.1083666