Part 4: Sequence-to-sequence models and attention