6 Words count: Reading files and STDIN, iterating lists, formatting strings

 

“I love to count!”

--Count von Count

Counting things is a surprisingly important programming skill. Maybe you’re trying to find out how many pizzas were sold each quarter or how many times you see certain words in a set of documents. Usually the data we deal with in computing comes to us in files, so in this chapter, we’re going to push a little further into reading files and manipulating strings.


We’re going to write a Python version of the venerable wc (“word count”) program. Ours will be called wc.py, and it will count the lines, words, and bytes in each input file, where the files are given as one or more positional arguments. Each count will be printed in a column eight characters wide, followed by a space and the name of the file. For instance, here is what wc.py should print for one file:

$ ./wc.py ../inputs/scarlet.txt
    7035   68061  396320 ../inputs/scarlet.txt
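One way to picture the counting and the eight-character columns is the minimal sketch below. It is not the chapter's solution: the function names count_stream and format_counts are hypothetical, and an in-memory io.BytesIO stands in for a real file so the example runs on its own. The handle is read as bytes so that the last column is a true byte count.

```python
import io


def count_stream(fh):
    """Return (lines, words, bytes) for an open binary file handle."""
    num_lines = num_words = num_bytes = 0
    for line in fh:                      # iterate the handle one line at a time
        num_lines += 1
        num_words += len(line.split())   # split on runs of whitespace
        num_bytes += len(line)           # length of a bytes object is in bytes
    return num_lines, num_words, num_bytes


def format_counts(num_lines, num_words, num_bytes, name):
    """Right-align each count in an 8-character column, then add the name."""
    return f"{num_lines:8}{num_words:8}{num_bytes:8} {name}"


# Hypothetical sample input standing in for a file on disk.
sample = io.BytesIO(b"The quick brown fox\njumps over the lazy dog.\n")
print(format_counts(*count_stream(sample), "sample.txt"))
```

The format spec `8` right-aligns integers by default, which is what produces the leading spaces seen in the output above.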

When counting multiple files, there will be an additional “total” line summing each column:

$ ./wc.py ../inputs/const.txt ../inputs/sonnet-29.txt
     865    7620   44841 ../inputs/const.txt
      17     118     661 ../inputs/sonnet-29.txt
     882    7738   45502 total
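The “total” line can be sketched as a column-wise sum that is only printed when there is more than one input. The function name report and the tuple layout are assumptions made for illustration; the numbers are the ones from the example above.

```python
def report(counts):
    """Format rows for a list of (lines, words, bytes, name) tuples.

    A "total" row is appended only when more than one input was counted.
    """
    rows = [f"{n_lines:8}{n_words:8}{n_bytes:8} {name}"
            for n_lines, n_words, n_bytes, name in counts]
    if len(counts) > 1:
        # zip(*...) turns the per-file tuples into per-column tuples.
        total_lines, total_words, total_bytes = (
            sum(column) for column in zip(*(c[:3] for c in counts))
        )
        rows.append(f"{total_lines:8}{total_words:8}{total_bytes:8} total")
    return rows


counts = [
    (865, 7620, 44841, "../inputs/const.txt"),
    (17, 118, 661, "../inputs/sonnet-29.txt"),
]
print("\n".join(report(counts)))
```

Keeping the running totals separate from the per-file counts means a single-file run prints one row and nothing else, matching the first example.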

6.1 Writing wc.py

6.1.1 Defining file inputs

6.1.2 Iterating lists

6.1.3 What you’re counting

6.1.4 Formatting your results

6.2 Solution

6.3 Discussion

6.3.1 Defining the arguments

6.3.2 Reading a file using a for loop

6.4 Going further

Summary
