5 Misuse and adversarial attacks: Challenges and responsible testing


This chapter covers

  • Exploitation of generative AI models for intentional misuse
  • Causes of LLM hallucinations and techniques to reduce them
  • Unintentional misuse of chatbots in specialized knowledge fields
  • Red-teaming strategies for uncovering vulnerabilities and strengthening system defenses

Since ChatGPT was made available to the public in November 2022, people have shared malicious use cases they’ve observed or successfully tested, and have speculated about how else the model might be misused in the future. “AI Is About to Make Social Media (Much) More Toxic,” argued a story in The Atlantic [1]. “People are already trying to get ChatGPT to write malware,” reported ZDNET about a month after the tool’s release [2]. Because anyone could chat with the model, many of these revelations came not from AI experts but from the general public, who shared their findings on X (formerly Twitter) and Reddit. As we’ve seen in the worlds of cybersecurity and disinformation, people are endlessly creative when it comes to using new tools to achieve their ends.

5.1 Intentional misuse

5.1.1 Cybersecurity and social engineering

5.1.2 Illicit and harmful applications

5.1.3 Adversarial narratives

5.1.4 Political manipulation and electioneering

5.2 Hallucinations

5.2.1 Why do LLMs hallucinate?

5.2.2 Misuse of LLMs in the professional world

5.3 Red teaming LLMs

5.4 Conclusion

5.5 Summary