2 “AI”, Large Language Models, Coding, and Data Analytics


This chapter covers

  • How LLMs work, at a basic level, as a tool for coding
  • The basics of prompt engineering
  • The coding capabilities of LLMs

This chapter explores the use of LLMs in coding and data analytics tasks: their benefits, their limitations, and several conceptual approaches to integrating them into your R coding workflow. Most of the chapter lays the groundwork for using LLMs in coding tasks, supported by examples and code snippets that illustrate the concepts. You will become familiar with the key parameters and settings of the OpenAI LLMs we will be working with in the following chapters, gain a basic understanding of prompt engineering, and learn how to bring these concepts and practices into your daily work with R. By the end of this chapter, you should have a good understanding of how LLMs can be used in coding and data analytics tasks, and how to integrate them into your workflow to boost productivity and efficiency.
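To give you a first, concrete picture of the "parameters and settings" mentioned above, here is a minimal sketch of what a raw request to the OpenAI chat completions API looks like from R. It assumes the httr2 package and an API key stored in the OPENAI_API_KEY environment variable; the model name, temperature value, and example messages are illustrative placeholders rather than the configuration used later in the book. The point is simply to show where the system message, the user prompt, and the sampling parameters live in such a request.

# Minimal sketch (illustrative, not the book's own code): calling the
# OpenAI chat completions API from R with httr2. Assumes OPENAI_API_KEY
# is set; model name and temperature are placeholder values.
library(httr2)

req <- request("https://api.openai.com/v1/chat/completions") |>
  req_headers(
    Authorization = paste("Bearer", Sys.getenv("OPENAI_API_KEY"))
  ) |>
  req_body_json(list(
    model = "gpt-4o-mini",   # assumed model name, used only for illustration
    temperature = 0.2,       # lower values give more deterministic completions
    messages = list(
      list(role = "system",
           content = "You are an assistant that writes tidyverse-style R code."),
      list(role = "user",
           content = "Write a ggplot2 scatter plot of mpg versus wt for mtcars.")
    )
  ))

resp <- req_perform(req)
# Extract the model's reply from the parsed JSON response
cat(resp_body_json(resp)$choices[[1]]$message$content)

Later sections unpack each of these pieces: the system message and prompt (2.1.3 and 2.2), prompt wording (2.3), and the different ways such calls can be wired into your workflow (2.5).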

2.1 Understanding Large Language Models

2.1.1 Training and mechanics behind LLMs

2.1.2 Inference and completion

2.1.3 Prompt, system message, and context window

2.2 Concrete example: data visualization with ggplot2 and ChatGPT

2.2.1 System message

2.2.2 Adding the “system message” to ChatGPT

2.2.3 Prompts

2.3 Prompt engineering: the art of conversation to write R code

2.3.1 The art and science of prompt engineering

2.3.2 The language of prompts

2.3.3 Basic prompting strategies

2.3.4 Advanced prompting techniques

2.4 Extending LLM capabilities with external data: key concepts

2.5 Approaches to integrating LLMs in coding

2.5.1 Query ChatGPT for help

2.5.2 API-based integration and custom-built solutions

2.5.3 IDE plugins for code autocompletion

2.5.4 Command line AI pair programming

2.5.5 Agent-based coding assistance

2.6 Summary

2.7 References
