11 Using open-source LLMs
This chapter covers
- Advantages of open-source LLMs: flexibility, transparency, and control.
- Performance benchmarks and key features of leading open-source LLMs.
- Challenges of local deployment and strategies to address them.
- Selecting an optimal inference engine for your use case.
In earlier chapters, you worked with OpenAI’s public REST API. It’s a straightforward way to build LLM applications because you don’t need to host an LLM yourself. After signing up with OpenAI and generating an API key, you can send requests to its endpoints and access LLM capabilities. This quick setup lets you work with state-of-the-art models such as GPT-4o, GPT-4o-mini, or the o1 and o3 reasoning models. The main drawback is cost: running examples like the summarization ones might cost a few cents or even a few dollars. If you’re working on projects for your company, privacy may also be a concern, and some employers block OpenAI entirely to avoid the risk of leaking sensitive or proprietary data.
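As a reminder of what that request flow looks like, here is a minimal sketch using only Python’s standard library. The endpoint and payload shape follow OpenAI’s public Chat Completions API; the `build_chat_request` helper and the prompt are illustrative, not part of any library:

```python
import json
import os
import urllib.request

# OpenAI's public Chat Completions endpoint
API_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_request(prompt, model="gpt-4o-mini", api_key=None):
    """Assemble (but do not send) an HTTP request for the Chat Completions API."""
    api_key = api_key or os.environ.get("OPENAI_API_KEY", "")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )

req = build_chat_request("Summarize this paragraph: ...")
print(req.full_url)  # https://api.openai.com/v1/chat/completions

# Actually sending the request requires a valid key and incurs
# per-token charges -- exactly the cost this chapter aims to avoid:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Every call like this leaves your machine and is billed per token, which is precisely why the local, open-source alternatives in this chapter are attractive.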
This chapter introduces open-source LLMs, a practical solution for reducing costs and addressing privacy concerns. These models are especially appealing to individuals and organizations that prioritize data confidentiality or are new to AI. I’ll guide you through the most popular open-source LLM families, their features, and the advantages they offer. The focus will be on running these models, ranging from high-performance, advanced setups to user-friendly tools ideal for learning and experimentation.