appendix D  Using larger LLMs

 

This book uses the 0.6-billion-parameter (0.6B) Qwen3 base model because it is the smallest model in the Qwen3 family and therefore the easiest to run on consumer hardware. But the same Qwen3Model implementation from appendix C is not limited to the 0.6B checkpoint. We can also use it to load larger Qwen3 checkpoints with the same PyTorch code we built from scratch.

In practice, this means that once we understand how to work with the 0.6B model, moving to a larger model mainly involves three changes:

  • Selecting the matching configuration dictionary
  • Downloading the larger checkpoint from Hugging Face
  • Loading the appropriate tokenizer for the base or reasoning variant

This appendix illustrates this process using the Qwen3 4B model as an example because it is meaningfully stronger than the 0.6B model while still being easier to handle than the larger 8B, 14B, and 32B variants.
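To make "easier to handle" more concrete, the following minimal sketch estimates how much memory the model weights alone would occupy at different sizes. It rests on stated assumptions: the parameter counts are the nominal figures from the model names (the exact counts differ somewhat), the weights are stored at 2 bytes per parameter in bfloat16 or 4 bytes in float32, and activations and the KV cache are ignored. The names NOMINAL_PARAMS and weight_memory_gb are hypothetical helpers for this illustration, not part of the book's library.

```python
# Rough weight-only memory estimate for dense Qwen3 model sizes.
# Assumption: nominal parameter counts taken from the model names;
# the real checkpoints differ somewhat from these round numbers.
NOMINAL_PARAMS = {
    "0.6B": 0.6e9,
    "1.7B": 1.7e9,
    "4B": 4e9,
    "8B": 8e9,
    "14B": 14e9,
    "32B": 32e9,
}

# Storage cost per parameter for two common PyTorch dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gb(size, dtype="bfloat16"):
    """Return the approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return NOMINAL_PARAMS[size] * BYTES_PER_PARAM[dtype] / 1e9


for size in NOMINAL_PARAMS:
    print(f"{size:>5}: ~{weight_memory_gb(size):.1f} GB in bfloat16")
```

Under these assumptions, the 4B weights need roughly 8 GB in bfloat16, whereas the 32B weights need roughly 64 GB. Actual memory use during text generation is higher.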

D.1 Larger dense Qwen3 configurations

The book’s repository includes configuration dictionaries for several larger Qwen3 models in the reasoning_from_scratch.appendix_c Python module; table D.1 lists them. You can also view the source code directly at https://github.com/rasbt/reasoning-from-scratch/blob/main/reasoning_from_scratch/appendix_c.py.

Table D.1 Qwen3 configurations (larger than 0.6B)

Model size    Configuration Python dictionary
1.7B          QWEN3_CONFIG_1_7B
4B            QWEN3_CONFIG_4B
8B            QWEN3_CONFIG_8B
14B           QWEN3_CONFIG_14B
32B           QWEN3_CONFIG_32B
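The mapping in table D.1 can be sketched as a small lookup helper. This is an illustration rather than code from the book's repository: CONFIG_NAMES, config_name_for, and load_config are hypothetical names, and load_config assumes that the reasoning_from_scratch package is installed and that its appendix_c module exposes the dictionaries under the names listed in the table.

```python
import importlib

# Dictionary names copied from table D.1; the size labels used as keys
# are a convention chosen for this sketch.
CONFIG_NAMES = {
    "1.7B": "QWEN3_CONFIG_1_7B",
    "4B": "QWEN3_CONFIG_4B",
    "8B": "QWEN3_CONFIG_8B",
    "14B": "QWEN3_CONFIG_14B",
    "32B": "QWEN3_CONFIG_32B",
}


def config_name_for(size):
    """Return the configuration dictionary name for a size label such as '4B'."""
    key = size.strip().upper()
    if key not in CONFIG_NAMES:
        raise ValueError(
            f"Unknown model size {size!r}; expected one of {sorted(CONFIG_NAMES)}"
        )
    return CONFIG_NAMES[key]


def load_config(size):
    """Look up the configuration dictionary itself.

    Assumption: reasoning_from_scratch is installed and appendix_c
    defines the names from table D.1.
    """
    module = importlib.import_module("reasoning_from_scratch.appendix_c")
    return getattr(module, config_name_for(size))


print(config_name_for("4b"))  # prints QWEN3_CONFIG_4B
```

With the package installed, calling load_config("4B") would then return the dictionary to pass to the Qwen3Model implementation from appendix C.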

D.2 Downloading larger checkpoints overview

D.3 Loading a larger base model

D.4 Loading a larger reasoning variant

D.5 Practical recommendations