chapter twenty
20 Product, UX, and model character
Frontiers in RLHF and post-training show how these techniques are used within companies to make leading products. As RLHF becomes more established, the problems it is used to address are becoming more nuanced. In this chapter, we discuss a series of use-cases that leading AI laboratories consider RLHF and post-training for that are largely unstudied in the academic literature.
20.1 Character Training
Character training is the subset of post-training designed around crafting traits within the model in the manner of its response, rather than the content. Character training, while being important to the user experience within language model chatbots, is effectively unstudied in the public domain.