5 Misuse and adversarial attacks: Challenges and responsible testing
This chapter covers
- Exploitation of generative AI models for intentional misuse
- Causes of LLM hallucinations and techniques to reduce them
- Unintentional misuse of chatbots in specialized knowledge fields
- Red-teaming strategies for uncovering vulnerabilities and strengthening system defenses
Since ChatGPT was made available to the public in November 2022, people have shared malicious use cases they’ve observed or tested successfully and have speculated about how else the tool might be misused in the future. “AI Is About to Make Social Media (Much) More Toxic,” argued a story in The Atlantic [1]. “People are already trying to get ChatGPT to write malware,” reported ZDNET about a month after the tool’s release [2]. Because anyone could chat with the model, many of these discoveries came not from AI experts but from members of the general public, who shared their findings on X (formerly Twitter) and Reddit. As we’ve seen in the worlds of cybersecurity and disinformation, people are endlessly creative when it comes to using new tools to achieve their ends.