Aussie AI

Model Selection

Last Updated 17 November, 2025

by David Spuler, Ph.D.

Research on Model Selection

Research papers include:

Bodun Hu, Le Xu, Jeongyoon Moon, Neeraja J. Yadwadkar, Aditya Akella, 27 Oct 2023, MOSEL: Inference Serving Using Dynamic Modality Selection, https://arxiv.org/abs/2310.18481 (Multi-modal model with dynamic selection of modality.)
M Sponner, B Waschneck, A Kumar, 2024, Adapting Neural Networks at Runtime: Current Trends in At-Runtime Optimizations for Deep Learning, ACM Computing Surveys, PDF: https://dl.acm.org/doi/pdf/10.1145/3657283 (Survey of various adaptive inference optimization techniques with much focus on image and video processing optimization for LLMs.)
Can Wang, Bolin Zhang, Dianbo Sui, Zhiying Tu, Xiaoyu Liu, Jiabao Kang, 1 Mar 2024 (v2), A Survey on Effective Invocation Methods of Massive LLM Services, https://arxiv.org/abs/2402.03408 (Deployment of LLMs as LLM-as-a-Service or LLMaaS architectures including prompt compression, semantic caching and model selection based on scoring inputs.)
Yuyi Mao, Xianghao Yu, Kaibin Huang, Ying-Jun Angela Zhang, Jun Zhang, Dec 2023, Green Edge AI: A Contemporary Survey, https://arxiv.org/abs/2312.00333
David Spuler, March 2024, Chapter 54. Ensemble Multi-Model Architectures, Generative AI in C++: Coding Transformers and LLMs, https://www.amazon.com/dp/B0CXJKCWX9
Steven Kolawole, Don Dennis, Ameet Talwalkar, Virginia Smith, 2 Jul 2024, Revisiting Cascaded Ensembles for Efficient Inference, https://arxiv.org/abs/2407.02348
Ziheng Wang, Pedro Reviriego, Farzad Niknia, Javier Conde, Shanshan Liu, Fabrizio Lombardi, 26 Aug 2024, Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things, https://arxiv.org/abs/2408.14528 (Running a small quantized model and then determining whether to run the full non-quantized model.)
Sean Michael Kerner, September 17, 2024, Model routing: The secret weapon for maximizing AI efficiency in enterprises, https://venturebeat.com/ai/why-accenture-and-martian-see-model-routing-as-key-to-enterprise-ai-success/
Isaac Ong, Amjad Almahairi, Vincent Wu, Wei-Lin Chiang, Tianhao Wu, Joseph E. Gonzalez, M Waleed Kadous, Ion Stoica, 21 Jul 2024 (v3), RouteLLM: Learning to Route LLMs with Preference Data, https://arxiv.org/abs/2406.18665
Dujian Ding, Ankur Mallick, Chi Wang, Robert Sim, Subhabrata Mukherjee, Victor Ruhle, Laks V.S. Lakshmanan, Ahmed Hassan Awadallah, 22 Apr 2024, Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing, ICLR 2024, https://arxiv.org/abs/2404.14618
Noah Martin, Abdullah Bin Faisal, Hiba Eltigani, Rukhshan Haroon, Swaminathan Lamelas, Fahad Dogar, 4 Oct 2024, LLMProxy: Reducing Cost to Access Large Language Models, https://arxiv.org/abs/2410.11857 (Deploying a proxy between user and LLM, with handling of conversational history context and caching.)
Lingjiao Chen, Matei Zaharia, James Zou, 9 May 2023, FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance, https://arxiv.org/abs/2305.05176
Dimitris Stripelis, Zijian Hu, Jipeng Zhang, Zhaozhuo Xu, Alay Dilipbhai Shah, Han Jin, Yuhang Yao, Salman Avestimehr, Chaoyang He, 23 Oct 2024 (v3), TensorOpera Router: A Multi-Model Router for Efficient LLM Inference, https://arxiv.org/abs/2408.12320
Zesen Zhao, Shuowei Jin, Z. Morley Mao, 23 Sep 2024, Eagle: Efficient Training-Free Router for Multi-LLM Inference, https://arxiv.org/abs/2409.15518
Tao Feng, Yanzhen Shen, Jiaxuan You, 4 Oct 2024, GraphRouter: A Graph-based Router for LLM Selections, https://arxiv.org/abs/2410.03834 https://github.com/ulab-uiuc/GraphRouter
Kaushal Kumar Maurya, KV Aditya Srivatsa, Ekaterina Kochmar, 16 Aug 2024, SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models, https://arxiv.org/abs/2408.08545
Quang H. Nguyen, Duy C. Hoang, Juliette Decugis, Saurav Manchanda, Nitesh V. Chawla, Khoa D. Doan, 24 Jul 2024 (v2), MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs, https://arxiv.org/abs/2407.10834
Keming Lu, Hongyi Yuan, Runji Lin, Junyang Lin, Zheng Yuan, Chang Zhou, Jingren Zhou, 15 Nov 2023, Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models, https://arxiv.org/abs/2311.08692
Małgorzata Łazuka, Andreea Anghel, Thomas Parnell, 3 Oct 2024, LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services, https://arxiv.org/abs/2410.02425
Pranjal Aggarwal, Aman Madaan, Ankit Anand, Srividya Pranavi Potharaju, Swaroop Mishra, Pei Zhou, Aditya Gupta, Dheeraj Rajagopal, Karthik Kappaganthu, Yiming Yang, Shyam Upadhyay, Manaal Faruqui, Mausam, 28 Jun 2024 (v4), AutoMix: Automatically Mixing Language Models, https://arxiv.org/abs/2310.12963
Josef Pichlmeier, Philipp Ross, Andre Luckow, 8 Oct 2024 (v2), Performance Characterization of Expert Router for Scalable LLM Inference, https://arxiv.org/abs/2404.15153
Ou, Anthony C., Feb 2024, Large Language Model Routing with Benchmark Datasets, Master's Thesis, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, https://dspace.mit.edu/handle/1721.1/153846
KV Aditya Srivatsa, Kaushal Kumar Maurya, Ekaterina Kochmar, 1 May 2024, Harnessing the Power of Multiple Minds: Lessons Learned from LLM Routing, https://arxiv.org/abs/2405.00467
David Farr, Nico Manzonelli, Iain Cruickshank, Kate Starbird, Jevin West, 16 Oct 2024, LLM Chain Ensembles for Scalable and Accurate Data Annotation, https://arxiv.org/abs/2410.13006
Xiangxiang Dai, Jin Li, Xutong Liu, Anqi Yu, John C.S. Lui, 2 Oct 2024 (v2), Cost-Effective Online Multi-LLM Selection with Versatile Reward Models, https://arxiv.org/abs/2405.16587
Grant Wilkins, Srinivasan Keshav, Richard Mortier, 4 Jul 2024, Offline Energy-Optimal LLM Serving: Workload-Based Energy Models for LLM Inference on Heterogeneous Systems, https://arxiv.org/abs/2407.04014
Wenchao Xu, Jinyu Chen, Peirong Zheng, Xiaoquan Yi, Tianyi Tian, Wenhui Zhu, Quan Wan, Haozhao Wang, Yunfeng Fan, Qinliang Su, Xuemin Shen, 18 Dec 2024, Deploying Foundation Model Powered Agent Services: A Survey, https://arxiv.org/abs/2412.13437 (A survey of not just deployment, but many inference optimization techniques.)
Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Matei Zaharia, James Zou, Ion Stoica, 20 Feb 2025, Optimizing Model Selection for Compound AI Systems, https://arxiv.org/abs/2502.14815
Zhijun Chen, Jingzheng Li, Pengpeng Chen, Zhuoran Li, Kai Sun, Yuankai Luo, Qianren Mao, Dingqi Yang, Hailong Sun, Philip S. Yu, 25 Feb 2025, Harnessing Multiple Large Language Models: A Survey on LLM Ensemble, https://arxiv.org/abs/2502.18036 https://github.com/junchenzhi/Awesome-LLM-Ensemble
Xinyuan Wang, Yanchi Liu, Wei Cheng, Xujiang Zhao, Zhengzhang Chen, Wenchao Yu, Yanjie Fu, Haifeng Chen, 9 Feb 2025, MixLLM: Dynamic Routing in Mixed Large Language Models, https://arxiv.org/abs/2502.18482
Xiaoye Qu, Yafu Li, Zhaochen Su, Weigao Sun, Jianhao Yan, Dongrui Liu, Ganqu Cui, Daizong Liu, Shuxian Liang, Junxian He, Peng Li, Wei Wei, Jing Shao, Chaochao Lu, Yue Zhang, Xian-Sheng Hua, Bowen Zhou, Yu Cheng, 27 Mar 2025, A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond, https://arxiv.org/abs/2503.21614
Avinash Kumar, Shashank Nag, Jason Clemons, Lizy John, Poulami Das, 14 Apr 2025, HELIOS: Adaptive Model And Early-Exit Selection for Efficient LLM Inference Serving, https://arxiv.org/abs/2504.10724
Jianfei Li, Kevin Kam Fung Yuen, 18 Jul 2025, CPC-CMS: Cognitive Pairwise Comparison Classification Model Selection Framework for Document-level Sentiment Analysis, https://arxiv.org/abs/2507.14022
Judy Long, Tao Liu, Sean Alexander Woznicki, Miljana Marković, Oskar Marko, Molly Sears, 10 Aug 2025, From Time-series Generation, Model Selection to Transfer Learning: A Comparative Review of Pixel-wise Approaches for Large-scale Crop Mapping, https://arxiv.org/abs/2507.12590
Justin Kay, Grant Van Horn, Subhransu Maji, Daniel Sheldon, and Sara Beery, 31 Jul 2025, Consensus-Driven Active Model Selection, https://arxiv.org/abs/2507.23771
Lorenzo Volpi, Alejandro Moreo, Fabrizio Sebastiani, 30 Jul 2025, Transductive Model Selection under Prior Probability Shift, https://arxiv.org/abs/2507.22647
Basile Lewandowski, Robert Birke, Lydia Y. Chen, 14 Aug 2025, Match & Choose: Model Selection Framework for Fine-tuning Text-to-Image Diffusion Models, https://arxiv.org/abs/2508.10993
Andrea Napoli, Paul White, 17 Aug 2025, Clustering-Based Validation Splits for Model Selection under Domain Shift, https://arxiv.org/abs/2405.19461
Chongyu Qu, Allen J. Luna, Thomas Z. Li, Junchao Zhu, Junlin Guo, Juming Xiong, Kim L. Sandler, Bennett A. Landman, Yuankai Huo, 20 Aug 2025, Cohort-Aware Agents for Individualized Lung Cancer Risk Prediction Using a Retrieval-Augmented Model Selection Framework, https://arxiv.org/abs/2508.14940
Jialiang Wang, Hanmo Liu, Shimin Di, Zhili Wang, Jiachuan Wang, Lei Chen, Xiaofang Zhou, 21 Jul 2025, Beyond Model Base Selection: Weaving Knowledge to Master Fine-grained Neural Network Design, https://arxiv.org/abs/2507.15336
Prateek Chanda, Saral Sureka, Parth Pratim Chatterjee, Krishnateja Killamsetty, Nikhil Shivakumar Nayak, Ganesh Ramakrishnan, 7 Aug 2025, Learning What Matters: Probabilistic Task Selection via Mutual Information for Model Finetuning, https://arxiv.org/abs/2507.12612
Bohan Yang, Gang Liu, Yang Zhong, Rirao Dao, Yujia Qian, Ke Shi, Anke Tang, Yong Luo, Qi Kong, Jingnan Liu, 7 Aug 2025, Unsupervised deep learning model for fast energy layer pre-selection of delivery-efficient proton arc therapy plan optimization of nasopharyngeal carcinoma, https://arxiv.org/abs/2506.15803
Chenghui Zheng, Garvesh Raskutti, 19 Aug 2025, Comparing Model-agnostic Feature Selection Methods through Relative Efficiency, https://arxiv.org/abs/2508.14268
Courtney Ford and Mark T. Keane, 26 Aug 2025, Feature-Guided Neighbor Selection for Non-Expert Evaluation of Model Predictions, https://arxiv.org/abs/2507.06029
Ayaka Tsutsumi, Guang Li, Ren Togo, Takahiro Ogawa, Satoshi Kondo, Miki Haseyama, 28 Aug 2025, Dual-Model Weight Selection and Self-Knowledge Distillation for Medical Image Classification, https://arxiv.org/abs/2508.20461
Radha Kodali, Venkata Rao Dhulipalla, Venkata Siva Kishor Tatavarty, Madhavi Nadakuditi, Bharadwaj Thiruveedhula, Suryanarayana Gunnam, Durga Prasad Bavirisetti and Gogulamudi Pradeep Reddy, 28 Aug 2025, Interpretation of Deep Learning Model in Embryo Selection for In Vitro Fertilization (IVF) Treatment, https://arxiv.org/abs/2506.06680
Florian Frommlet, Jon Lachmann, Geir Storvik, Aliaksandr Hubin, 31 Aug 2025, FBMS: An R Package for Flexible Bayesian Model Selection and Model Averaging, https://arxiv.org/abs/2509.00753
Jorge-Humberto Urrea-Quintero, David Anton, Laura De Lorenzis, Henning Wessels, 19 Sep 2025, Automated Constitutive Model Discovery by Pairing Sparse Regression Algorithms with Model Selection Criteria, https://arxiv.org/abs/2509.16040
Argimiro Arratia, Alejandra Cabaña, Ernesto Mordecki, Gerard Rovira-Parra, 15 Sep 2025, The Morgan-Pitman Test of Equality of Variances and its Application to Machine Learning Model Evaluation and Selection, https://arxiv.org/abs/2509.12185
Yannis Belkhiter, Seshu Tirupathi, Giulio Zizzo, Sachin Sharma, John D. Kelleher, 2 Oct 2025, Pre-Hoc Predictions in AutoML: Leveraging LLMs to Enhance Model Selection and Benchmarking for Tabular datasets, https://arxiv.org/abs/2510.01842
Tiago da Silva Barros, Frédéric Giroire, Ramon Aparicio-Pardo and Joanna Moulierac, 2 Oct 2025, Small is Sufficient: Reducing the World AI Energy Consumption Through Model Selection, https://arxiv.org/abs/2510.01889
Binh H. Ho, Long Nguyen Chi, TrungTin Nguyen, Binh T. Nguyen, Van Ha Hoang, Christopher Drovandi, 14 Oct 2025, A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random, https://arxiv.org/abs/2505.19093
Emilio Cruciani, Roberto Verdecchia, 24 Sep 2025, Choosing to Be Green: Advancing Green AI via Dynamic Model Selection, https://arxiv.org/abs/2509.19996
Xuran Li, Jingyi Wang, 23 Oct 2025, An Empirical Study of Sample Selection Strategies for Large Language Model Repair, https://arxiv.org/abs/2510.20428
Sofoklis Kitharidis, Cor J. Veenman, Thomas Bäck, Niki van Stein, 25 Oct 2025, Visual Model Selection using Feature Importance Clusters in Fairness-Performance Similarity Optimized Space, https://arxiv.org/abs/2510.22209
Jie Hao, Rui Yu, Wei Zhang, Huixia Wang, Jie Xu, Mingrui Liu, 8 Oct 2025, BLISS: A Lightweight Bilevel Influence Scoring Method for Data Selection in Language Model Pretraining, https://arxiv.org/abs/2510.06048
Jie Li, Andrew McCarthy, Zhizhuo Zhang, Stephen Young, 2 Oct 2025, Uncertainty-Guided Model Selection for Tabular Foundation Models in Biomolecule Efficacy Prediction, https://arxiv.org/abs/2510.02476
Dylan Sam, Ayan Chakrabarti, Afshin Rostamizadeh, Srikumar Ramalingam, Gui Citovsky, Sanjiv Kumar, 21 Oct 2025, Analyzing Similarity Metrics for Data Selection for Language Model Pretraining, https://arxiv.org/abs/2502.02494
Ruibo Chen, Sheng Zhang, Yihan Wu, Tong Zheng, Peihua Mai, Heng Huang, 29 Sep 2025, Model Correlation Detection via Random Selection Probing, https://arxiv.org/abs/2509.24171
Yavuz Durmazkeser, Patrik Okanovic, Andreas Kirsch, Torsten Hoefler, Nezihe Merve Gürel, 10 Oct 2025, Active Model Selection for Large Language Models, https://arxiv.org/abs/2510.09418
Pai Liu, Lingfeng Zhao, Shivangi Agarwal, Jinghan Liu, Audrey Huang, Philip Amortila, Nan Jiang, 24 Oct 2025, Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol, https://arxiv.org/abs/2502.08021
Ting-Wei Li, Ruizhong Qiu, Hanghang Tong, 24 Oct 2025, Graph Data Selection for Domain Adaptation: A Model-Free Approach, https://arxiv.org/abs/2505.17293
Liman Wang, Hanyang Zhong, Tianyuan Wang, Shan Luo, and Jihong Zhu, 10 Oct 2025, MLLM-Fabric: Multimodal Large Language Model-Driven Robotic Framework for Fabric Sorting and Selection, https://arxiv.org/abs/2507.04351
Hong-Jie Dai, Zheng-Hao Li, An-Tai Lu, Bo-Tsz Shain, Ming-Ta Li, Tatheer Hussain Mir, Kuang-Te Wang, Min-I Su, Pei-Kang Liu, Ming-Ju Tsai, 23 Sep 2025, Model selection meets clinical semantics: Optimizing ICD-10-CM prediction via LLM-as-Judge evaluation, redundancy-aware sampling, and section-aware fine-tuning, https://arxiv.org/abs/2509.18846
Wenqian Li, Youjia Yang, Ruoxi Jia, Yan Pang, 9 Sep 2025, Data Valuation and Selection in a Federated Model Marketplace, https://arxiv.org/abs/2509.18104
Animesh Jha, Harshit Gupta, Ananjan Nandi, 30 Sep 2025, RL-Guided Data Selection for Language Model Finetuning, https://arxiv.org/abs/2509.25850
Mohamed Bal-Ghaoui, Fayssal Sabri, 7 Oct 2025, LLM-FS-Agent: A Deliberative Role-based Large Language Model Architecture for Transparent Feature Selection, https://arxiv.org/abs/2510.05935