chapter twenty

20 Product, UX, and model character

 

Frontiers in RLHF and post-training show how these techniques are used within companies to make leading products. As RLHF becomes more established, the problems it is used to address are becoming more nuanced. In this chapter, we discuss a series of use-cases that leading AI laboratories consider RLHF and post-training for that are largely unstudied in the academic literature.

20.1 Character Training

Character training is the subset of post-training designed around crafting traits within the model in the manner of its response, rather than the content. Character training, while being important to the user experience within language model chatbots, is effectively unstudied in the public domain.

20.2 Model Specifications

20.3 Product Cycles, UX, and RLHF