
appendix C Choosing an LLM

 

This appendix outlines the key features of the most popular large language models (LLMs) available at the time of writing. It also highlights the criteria to consider when selecting the most suitable LLM for your project.

C.1 Popular large language models

Many LLMs are available today. Let’s take a look at some of the most popular on the market.

C.1.1 OpenAI GPT series

OpenAI’s GPT-4, released in March 2023, marked a turning point as the first widely recognized frontier model to demonstrate advanced reasoning capabilities. It also brought multimodality into the mainstream: the original model accepted text and image inputs, and later models in the family, such as GPT-4o, added audio and video, which opened the door to a wide range of applications. Although OpenAI hasn’t disclosed the architecture, GPT-4 was widely reported to use a Mixture-of-Experts (MoE) design, possibly scaling to trillions of parameters, an approach aimed at improving accuracy and efficiency across varied tasks.

C.1.2 Gemini

C.1.3 Gemma

C.1.4 Claude

C.1.5 Cohere

C.1.6 Llama

C.1.7 Falcon

C.1.8 Mistral

C.1.9 Qwen

C.1.10 Grok

C.1.11 Phi

C.1.12 DeepSeek

C.2 How to choose a model

C.2.1 Model purpose

C.2.2 Proprietary vs. open source

C.2.3 Model size (number of parameters)

C.2.4 Context window size

C.2.5 Multilingual support

C.2.6 Accuracy vs. speed

C.2.7 Cost and hardware requirements

C.2.8 Task suitability (standard benchmarks)

C.2.9 Safety and bias

C.2.10 A practical example