Chapter 20. Basic file wrangling
This chapter covers
- Moving and renaming files
- Compressing and encrypting files
- Selectively deleting files
This chapter deals with the basic operations you can use when you have an ever-increasing collection of files to manage. Those files might be log files, or they might be from a regular data feed, but whatever their source, you can’t simply discard them immediately. How do you save them, manage them, and ultimately dispose of them according to a plan, but without manual intervention?
Many systems generate a continuous series of data files. These files might be the log files from an e-commerce server or a regular process; they might be a nightly feed of product information from a server; they might be automated feeds of items for online advertising; historical data of stock trades; or they might come from a thousand other sources. They’re often flat text files, uncompressed, with raw data that’s either an input or a byproduct of other processes. In spite of their humble nature, however, the data they contain has some potential value, so the files can’t be discarded at the end of the day—which means that every day, their numbers grow. Over time, files accumulate until dealing with them manually becomes unworkable and until the amount of storage they consume becomes unacceptable.