appendix C Exercise solutions
The complete code for the exercise solutions can be found in the supplementary GitHub repository at https://github.com/rasbt/LLMs-from-scratch.
Chapter 2
Exercise 2.1
You can obtain the individual token IDs by passing one string at a time to the tokenizer's encode method:
print(tokenizer.encode("Ak"))
print(tokenizer.encode("w"))
# ... 
   
This prints
[33901]
[86]
# ...
You can then use the following code to assemble the original string:
print(tokenizer.decode([33901, 86, 343, 86, 220, 959]))
This returns
'Akwirw ier'
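To see why decoding the concatenated IDs recovers the original string, the following minimal sketch mimics the decode step with a hand-written lookup table. The table `toy_vocab` and the helper `toy_decode` are hypothetical. The ID-to-substring pairs are inferred from the outputs above, and the split of " ier" into the IDs 220 and 959 is an assumption. The real tokenizer performs this lookup over its full byte pair encoding vocabulary.

```python
# Hypothetical lookup table covering only the IDs that appear in this exercise.
toy_vocab = {
    33901: "Ak",
    86: "w",
    343: "ir",
    220: " ",    # assumed: a single space
    959: "ier",  # assumed: the remainder of the string
}


def toy_decode(token_ids):
    """Map each ID back to its substring and concatenate the pieces."""
    return "".join(toy_vocab[token_id] for token_id in token_ids)


decoded = toy_decode([33901, 86, 343, 86, 220, 959])
print(decoded)
```

Because decoding is a per-token lookup followed by concatenation, the ID 86 can appear twice and still map to "w" both times.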
Exercise 2.2
dataloader = create_dataloader(
    raw_text, batch_size=4, max_length=2, stride=2
) 
   
It produces batches of the following format:
tensor([[  40,  367],
        [2885, 1464],
        [1807, 3619],
        [ 402,  271]]) 
   
The second data loader, with max_length=8 and stride=2, is created as follows:
dataloader = create_dataloader(
    raw_text, batch_size=4, max_length=8, stride=2
) 
   
An example batch looks like this:
tensor([[   40,   367,  2885,  1464,  1807,  3619,   402,   271],
        [ 2885,  1464,  1807,  3619,   402,   271, 10899,  2138],
        [ 1807,  3619,   402,   271, 10899,  2138,   257,  7026],
        [  402,   271, 10899,  2138,   257,  7026, 15632,   438]]) 
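The relationship between max_length, stride, and the rows of each batch can be reproduced with plain Python lists. This is a minimal sketch, assuming the data loader slices the token IDs with a sliding window as described in chapter 2. The helper `sliding_windows` is hypothetical and not part of the book's code. It builds only the input rows (no shifted target rows), and `token_ids` holds just the 14 IDs visible in the batches above.

```python
# The 14 token IDs that appear in the example batches above.
token_ids = [40, 367, 2885, 1464, 1807, 3619, 402, 271,
             10899, 2138, 257, 7026, 15632, 438]


def sliding_windows(ids, max_length, stride):
    """Return windows of max_length tokens, starting every stride tokens."""
    last_start = len(ids) - max_length
    return [ids[i:i + max_length] for i in range(0, last_start + 1, stride)]


# max_length=2, stride=2: consecutive rows share no tokens.
pairs = sliding_windows(token_ids, max_length=2, stride=2)
print(pairs[:4])

# max_length=8, stride=2: each row repeats 6 tokens of the previous row.
overlapping = sliding_windows(token_ids, max_length=8, stride=2)
for row in overlapping:
    print(row)
```

When stride equals max_length, the rows do not overlap. With a smaller stride, each row repeats max_length - stride tokens of the previous one.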
   
Chapter 3
Exercise 3.1
The correct weight assignment is