References
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems 30 (2017).
- Y. H. Yeo, J. S. Samaan, W. H. Ng, P.-S. Ting, H. Trivedi, A. Vipani, W. Ayoub, J. D. Yang, O. Liran, B. Spiegel, et al., Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma, Clinical and Molecular Hepatology 29 (3) (2023) 721.
- Y. Ge, W. Hua, K. Mei, J. Tan, S. Xu, Z. Li, Y. Zhang, et al., OpenAGI: When LLM meets domain experts, Advances in Neural Information Processing Systems 36 (2024).
- M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. d. O. Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman, et al., Evaluating large language models trained on code, arXiv preprint arXiv:2107.03374 (2021).
- X. Wu, H. Zhao, Y. Zhu, Y. Shi, F. Yang, T. Liu, X. Zhai, W. Yao, J. Li, M. Du, et al., Usable XAI: 10 strategies towards exploiting explainability in the LLM era, arXiv preprint arXiv:2403.08946 (2024).
- X. Fang, W. Xu, F. A. Tan, J. Zhang, Z. Hu, Y. Qi, S. Nickleach, D. Socolinsky, S. Sengamedu, C. Faloutsos, Large language models on tabular data – a survey, arXiv preprint arXiv:2402.17944 (2024).
- T. Dinh, Y. Zeng, R. Zhang, Z. Lin, M. Gira, S. Rajput, J.-Y. Sohn, D. Papailiopoulos, K. Lee, LIFT: Language-interfaced fine-tuning for non-language machine learning tasks, Advances in Neural Information Processing Systems 35 (2022) 11763–11784.
- Y. Hou, J. Zhang, Z. Lin, H. Lu, R. Xie, J. McAuley, W. X. Zhao, Large language models are zero-shot rankers for recommender systems, in: European Conference on Information Retrieval, Springer, 2024, pp. 364–381.
- B. Zhao, C. Ji, Y. Zhang, W. He, Y. Wang, Q. Wang, R. Feng, X. Zhang, Large language models are complex table parsers, arXiv preprint arXiv:2312.11521 (2023).
- J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou, et al., Chain-of-thought prompting elicits reasoning in large language models, Advances in Neural Information Processing Systems 35 (2022) 24824–24837.
- C. Yuan, Q. Xie, J. Huang, S. Ananiadou, Back to the future: Towards explainable temporal reasoning with large language models, arXiv preprint arXiv:2310.01074 (2023).
- Y. Sui, M. Zhou, M. Zhou, S. Han, D. Zhang, Table meets LLM: Can large language models understand structured table data? A benchmark and empirical study, in: Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024, pp. 645–654.
- H. Xue, F. D. Salim, PromptCast: A new prompt-based learning paradigm for time series forecasting, IEEE Transactions on Knowledge and Data Engineering (2023).
- S. Pan, L. Luo, Y. Wang, C. Chen, J. Wang, X. Wu, Unifying large language models and knowledge graphs: A roadmap, IEEE Transactions on Knowledge and Data Engineering (2024).
- Q. Li, Y. Liang, Y. Diao, C. Xie, B. Li, B. He, D. Song, Tree-as-a-prompt: Boosting black-box large language models on few-shot classification of tabular data (2024). https://openreview.net/forum?id=SJTSvRtGsN
- N. Ziems, G. Liu, J. Flanagan, M. Jiang, Explaining tree model decisions in natural language for network intrusion detection, arXiv preprint arXiv:2310.19658 (2023).
- Y. Zhuang, L. Liu, C. Singh, J. Shang, J. Gao, Learning a decision tree algorithm with transformers, arXiv preprint arXiv:2402.03774 (2024).
- L. Breiman, Random forests, Machine Learning 45 (2001) 5–32.
- S. M. Lundberg, S.-I. Lee, A unified approach to interpreting model predictions, Advances in Neural Information Processing Systems 30 (2017).
- J. Zhou, T. Lu, S. Mishra, S. Brahma, S. Basu, Y. Luan, D. Zhou, L. Hou, Instruction-following evaluation for large language models, arXiv preprint arXiv:2311.07911 (2023).
- M. Romaszewski, P. Sekuła, Poster: Explainable classification of multimodal time series using LLMs, in: Polish Conference on Artificial Intelligence (PP-RAI’2024), Warsaw, Poland, 2024.
- T. Wei, J. Luan, W. Liu, S. Dong, B. Wang, CMATH: Can your language model pass Chinese elementary school math test?, arXiv preprint arXiv:2306.16636 (2023).
- J. Ahn, R. Verma, R. Lou, D. Liu, R. Zhang, W. Yin, Large language models for mathematical reasoning: Progresses and challenges, arXiv preprint arXiv:2402.00157 (2024).
- S. Imani, L. Du, H. Shrivastava, MathPrompter: Mathematical reasoning using large language models, arXiv preprint arXiv:2303.05398 (2023).
- R. Yamauchi, S. Sonoda, A. Sannai, W. Kumagai, LPML: LLM-prompting markup language for mathematical reasoning, arXiv preprint arXiv:2309.13078 (2023).
- X. Wang, J. Wei, D. Schuurmans, Q. Le, E. Chi, S. Narang, A. Chowdhery, D. Zhou, Self-consistency improves chain of thought reasoning in language models, arXiv preprint arXiv:2203.11171 (2022).
- L. Yu, W. Jiang, H. Shi, J. Yu, Z. Liu, Y. Zhang, J. T. Kwok, Z. Li, A. Weller, W. Liu, MetaMath: Bootstrap your own mathematical questions for large language models, arXiv preprint arXiv:2309.12284 (2023).
- L. C. Magister, J. Mallinson, J. Adamek, E. Malmi, A. Severyn, Teaching small language models to reason, arXiv preprint arXiv:2212.08410 (2022).
- A. K. Singh, D. Strouse, Tokenization counts: The impact of tokenization on arithmetic in frontier LLMs, arXiv preprint arXiv:2402.14903 (2024).
- M. Muffo, A. Cocco, E. Bertino, Evaluating transformer language models on arithmetic operations using number decomposition, arXiv preprint arXiv:2304.10977 (2023).
- T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, Advances in Neural Information Processing Systems 33 (2020) 1877–1901.
- T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, Y. Iwasawa, Large language models are zero-shot reasoners, Advances in Neural Information Processing Systems 35 (2022) 22199–22213.
- Z. Yu, L. He, Z. Wu, X. Dai, J. Chen, Towards better chain-of-thought prompting strategies: A survey, arXiv preprint arXiv:2310.04959 (2023).
- X. Cheng, J. Li, W. X. Zhao, J.-R. Wen, ChainLM: Empowering large language models with improved chain-of-thought prompting, arXiv preprint arXiv:2403.14312 (2024).
- Y. Shen, K. Song, X. Tan, D. Li, W. Lu, Y. Zhuang, HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face, in: A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, S. Levine (Eds.), Advances in Neural Information Processing Systems, Vol. 36, Curran Associates, Inc., 2023, pp. 38154–38180.
- Y. Fu, H. Peng, A. Sabharwal, P. Clark, T. Khot, Complexity-based prompting for multi-step reasoning, in: The Eleventh International Conference on Learning Representations, 2022.
- X. Liu, T. Pang, C. Fan, Federated prompting and chain-of-thought reasoning for improving LLMs answering, in: International Conference on Knowledge Science, Engineering and Management, Springer, 2023, pp. 3–11.
- R. R. Hoffman, S. T. Mueller, G. Klein, J. Litman, Metrics for explainable AI: Challenges and prospects, arXiv preprint arXiv:1812.04608 (2018).
- R. Dwivedi, D. Dave, H. Naik, S. Singhal, R. Omer, P. Patel, B. Qian, Z. Wen, T. Shah, G. Morgan, et al., Explainable AI (XAI): Core ideas, techniques, and solutions, ACM Computing Surveys 55 (9) (2023) 1–33.
- S. Gawde, S. Patil, S. Kumar, P. Kamat, K. Kotecha, S. Alfarhood, Explainable predictive maintenance of rotating machines using LIME, SHAP, PDP, ICE, IEEE Access 12 (2024) 29345–29361.
- B. H. Van der Velden, H. J. Kuijf, K. G. Gilhuijs, M. A. Viergever, Explainable artificial intelligence (XAI) in deep learning-based medical image analysis, Medical Image Analysis 79 (2022) 102470.
- G. Srivastava, R. H. Jhaveri, S. Bhattacharya, S. Pandya, P. K. R. Maddikunta, G. Yenduri, J. G. Hall, M. Alazab, T. R. Gadekallu, et al., XAI for cybersecurity: State of the art, challenges, open issues and future directions, arXiv preprint arXiv:2206.03585 (2022).
- E. Cambria, L. Malandri, F. Mercorio, M. Mezzanzanica, N. Nobani, A survey on XAI and natural language explanations, Information Processing & Management 60 (1) (2023) 103111.
- H. Luo, L. Specia, From understanding to utilization: A survey on explainability for large language models, arXiv preprint arXiv:2401.12874 (2024).
- J. Yu, A. I. Cristea, A. Harit, Z. Sun, O. T. Aduragba, L. Shi, N. Al Moubayed, INTERACTION: A generative XAI framework for natural language inference explanations, in: 2022 International Joint Conference on Neural Networks (IJCNN), IEEE, 2022, pp. 1–8.
- V. B. Nguyen, J. Schlötterer, C. Seifert, From black boxes to conversations: Incorporating xai in a conversational agent, in: World Conference on Explainable Artificial Intelligence, Springer, 2023, pp. 71–96.
- J. W. Tukey, Exploratory data analysis, Addison-Wesley, Reading, MA, 1977.
- N. Dave, D. Kifer, C. L. Giles, A. Mali, Investigating symbolic capabilities of large language models, arXiv preprint arXiv:2405.13209 (2024).
- F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12 (2011) 2825–2830.
- C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, Journal of Machine Learning Research 21 (140) (2020) 1–67.
- S. Longpre, L. Hou, T. Vu, A. Webson, H. W. Chung, Y. Tay, D. Zhou, Q. V. Le, B. Zoph, J. Wei, et al., The Flan collection: Designing data and methods for effective instruction tuning, in: International Conference on Machine Learning, PMLR, 2023, pp. 22631–22648.
- S. Zhang, L. Dong, X. Li, S. Zhang, X. Sun, S. Wang, J. Li, R. Hu, T. Zhang, F. Wu, et al., Instruction tuning for large language models: A survey, arXiv preprint arXiv:2308.10792 (2023).
- N. Muennighoff, T. Wang, L. Sutawika, A. Roberts, S. Biderman, T. L. Scao, M. S. Bari, S. Shen, Z.-X. Yong, H. Schoelkopf, et al., Crosslingual generalization through multitask finetuning, arXiv preprint arXiv:2211.01786 (2022).
- Y. Wang, S. Mishra, P. Alipoormolabashi, Y. Kordi, A. Mirzaei, A. Arunkumar, A. Ashok, A. S. Dhanasekaran, A. Naik, D. Stap, et al., Super-NaturalInstructions: Generalization via declarative instructions on 1600+ NLP tasks, arXiv preprint arXiv:2204.07705 (2022).
- E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-rank adaptation of large language models, arXiv preprint arXiv:2106.09685 (2021).
- T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, et al., Transformers: State-of-the-art natural language processing, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020, pp. 38–45.
- K. H. Brodersen, C. S. Ong, K. E. Stephan, J. M. Buhmann, The balanced accuracy and its posterior distribution, in: 2010 20th International Conference on Pattern Recognition, IEEE, 2010, pp. 3121–3124.
- P. Głomb, M. Cholewa, W. Koral, A. Madej, M. Romaszewski, Detection of emergent leaks using machine learning approaches, Water Supply 23 (6) (2023) 2370–2386.
- L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al., Training language models to follow instructions with human feedback, Advances in Neural Information Processing Systems 35 (2022) 27730–27744.