chapter six

6 Words Count: Reading files/STDIN, iterating lists, formatting strings

 

"I love to count!" — Count von Count

Counting things is a surprisingly important programming skill. Maybe you’re trying to find how many pizzas were sold each quarter or how many times you see certain words in a set of documents. Usually the data we deal with in computing comes to us in files, so we’re going to push a little further into reading files and manipulating strings by writing a Python version of the venerable Unix wc ("word count") program.

We’re going to write a program called wc.py that will count the lines, words, and bytes found in each input. The counts will appear in columns 8 characters wide, followed by the name of the file. The inputs may be given as one or more positional arguments. For instance, here is what it should print for one file:

$ ./wc.py ../inputs/scarlet.txt
    7035   68061  396320 ../inputs/scarlet.txt
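To make the three counts and the 8-character columns concrete, here is a minimal sketch of the core idea. It is not the chapter's solution. The function names `count_text` and `format_counts` and the in-memory sample text are hypothetical, and the byte count assumes the text is encoded as UTF-8.

```python
import io


def count_text(handle):
    """Return (lines, words, bytes) for an open text file handle."""
    num_lines, num_words, num_bytes = 0, 0, 0
    for line in handle:
        num_lines += 1
        # str.split() with no arguments splits on any run of whitespace
        num_words += len(line.split())
        # Assumption: the input is UTF-8, so encode the line to count bytes
        num_bytes += len(line.encode("utf-8"))
    return num_lines, num_words, num_bytes


def format_counts(num_lines, num_words, num_bytes, name):
    """Right-justify each count in a column 8 characters wide."""
    return f"{num_lines:8}{num_words:8}{num_bytes:8} {name}"


# A StringIO object stands in for a real file so the sketch runs anywhere
sample = io.StringIO("Hello there\nGeneral Kenobi\n")
print(format_counts(*count_text(sample), "sample.txt"))
```

Iterating over the handle reads one line at a time, so the whole file never has to fit in memory. The format specifier `:8` right-justifies integers by default, which is what produces the aligned columns.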

When counting multiple files, there will be an additional "total" line summing each column:

$ ./wc.py ../inputs/const.txt ../inputs/sonnet-29.txt
     865    7620   44841 ../inputs/const.txt
      17     118     661 ../inputs/sonnet-29.txt
     882    7738   45502 total
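One way to produce the "total" line is to keep a running sum for each column while looping over the inputs. The sketch below illustrates that idea under a few assumptions: the function name `report` is hypothetical, the inputs are in-memory stand-ins for real files, and bytes are counted as UTF-8.

```python
import io


def report(inputs):
    """Build wc-style lines for (name, handle) pairs.

    A "total" row is appended only when there is more than one input.
    """
    rows = []
    totals = [0, 0, 0]  # running sums of lines, words, bytes
    for name, handle in inputs:
        counts = [0, 0, 0]
        for line in handle:
            counts[0] += 1
            counts[1] += len(line.split())
            counts[2] += len(line.encode("utf-8"))  # assumes UTF-8 text
        # Add this input's counts to the running totals, column by column
        totals = [t + c for t, c in zip(totals, counts)]
        rows.append(f"{counts[0]:8}{counts[1]:8}{counts[2]:8} {name}")
    if len(rows) > 1:
        rows.append(f"{totals[0]:8}{totals[1]:8}{totals[2]:8} total")
    return rows


# Hypothetical file names paired with in-memory text
inputs = [
    ("a.txt", io.StringIO("one two\nthree\n")),
    ("b.txt", io.StringIO("four\n")),
]
print("\n".join(report(inputs)))
```

The same check applies to the example above: 865 + 17 = 882 lines, 7620 + 118 = 7738 words, and 44841 + 661 = 45502 bytes.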

6.1  Writing wc.py

6.1.1  Defining file inputs

6.1.2  Iterating lists

6.1.3  What you’re counting

6.1.4  Formatting your results

6.2  Solution

6.3  Discussion

6.3.1  Defining the arguments

6.3.2  Reading a file using a for loop

6.4  Review

6.5  Going further