
About This Book
This book is intended to be a comprehensive introduction to gnuplot: from the basics to the power features and beyond. Besides providing a tutorial on gnuplot itself, it demonstrates how to apply and use gnuplot to extract insight from data.
The gnuplot program has always had complete and detailed reference documentation, but what was missing was a continuous presentation that tied all the different bits and pieces of gnuplot together and demonstrated how to use them to achieve certain tasks. This book attempts to fill that gap.
The book should also serve as a handy reference for more advanced gnuplot users, and as an introduction to graphical ways of knowledge discovery.
And finally, this book tries to show you how to use gnuplot to achieve surprisingly nifty effects that will make everyone say, “How did you do that?”
This book is divided into four parts.
The first part provides a tutorial introduction to gnuplot and some of the things you can do with it. If you’re new to gnuplot, start here. Even if you already know gnuplot, I suggest you at least skim the chapters in this part: you might pick up a few new tricks. (For example, did you know that you can use gnuplot to plot the Unix password file? No? Thought so.)
The second part is about polishing and describes the ways that we can influence the appearance of a plot: by using different styles (chapter 5); using labels, arrows, and other decorations (chapter 6); and by changing the axes and the overall appearance of a graph (chapter 7). The material in these chapters has the character of a reference—this is the place to look up some detail when you need it.
In part 3, we move on to more advanced concepts. Here’s where we talk about fun topics such as three-dimensional plots (chapter 8) and color (chapter 9). Chapter 10 introduces more specialized topics, such as plots-within-a-plot, polar coordinates, and curve fitting. In chapter 11 we’ll talk about gnuplot terminals and ways to export our work to common file formats, and chapter 12 is about ways to use gnuplot in conjunction with, or instead of, a programming language.
In the last part, I’ll take gnuplot for granted, and focus instead on the things you can do with it. In chapter 13 I’ll present fundamental types of graphs and discuss when and how to use them. I’ll also show you how to generate such graphs with gnuplot. In the remaining two chapters, I focus on the discovery and analysis process itself, and describe some techniques that I’ve found helpful.
There are three appendixes. Appendix A describes how to obtain and install gnuplot if you don’t already have it. It also contains some pointers in case you want to build gnuplot from source.
Appendix B is a command and options reference, grouped by topic, not alphabetically. So if you know that you want to change the appearance of the tic labels, but you’ve forgotten which option to use, this appendix should point you in the right direction quickly.
In appendix C, I list some additional resources (books and web sites) that you might find helpful. I also give a brief overview of a few tools that are comparable to gnuplot.
I’ve tried to be comprehensive in my coverage of gnuplot’s features, with two exceptions. I don’t cover obsolete or redundant features. I also don’t discuss features that would only be of interest to a very limited group of users: all material in this book should (at least potentially) be useful to all readers, no matter what their situation. Where appropriate, I refer to the standard gnuplot reference documentation for details not discussed here.
As far as the examples are concerned, I’ve tried to present fresh or at least unfamiliar data sets. This means you won’t find a plot of the Challenger O-Ring data here, nor Napoleon’s march, nor Playfair’s charts regarding trade with the West Indies. (The one classic I would’ve liked to include is Anscombe’s Quartet, but I couldn’t find a suitable context in which to present it. If you’ve never seen it before, go and look it up yourself.)
This book presents a continuing narrative, and the material is arranged as if the reader were going to read the book sequentially, cover to cover.
But I know that most people reach for a piece of documentation when they need to “get something done, now!” Therefore, I tried to make this book as diveable as possible: once you’ve mastered the essential gnuplot basics, you should be able to open this book to any chapter that’s relevant to your current task and start reading, without loss of continuity.
While the chapters are conceived as largely independent of each other, each chapter presents a continuous, progressive exposition, which is best read in order, from start to finish. The nature of the topic demands that some concepts be introduced early in a chapter and not brought to completion until the end, after the necessary circumstantial material has been introduced.
My advice to you is that you should feel free to pick any chapter you’re interested in, but that you should attempt to read each chapter in its entirety, end-to-end, to get the maximum out of this book. I know that the temptation is great to just read a relevant figure caption and then to take it from there, but I’d advise you against that. Gnuplot has many odd quirks, and many useful little tricks as well, which you will not learn about by just skimming the headlines and the captions. I tried to keep the chapters short—try to take them in as a whole.
One caveat: gnuplot is very connected, and explaining one feature often requires knowledge of some other feature. The most proper way to introduce gnuplot would have been to follow a strict bottom-up approach: first, introduce string handling and number formats, followed by the syntax for option management and styles, and finally, in the last chapter, bring it all together by talking about the plot command. This would’ve been easy to write, perfectly organized—and excruciatingly boring to read!
I take a different approach: explain the most common use early, and leave more exotic variants and applications of commands for later. The price we have to pay is an increased number of forward references. This is an In Action book: I want to get you going quickly, without burdening you with unnecessary details early on.
This book was written with two groups of people in mind: those who already know gnuplot, and those who don’t.
If you already know gnuplot, I hope that you’ll still find it a useful reference, particularly for some of the more advanced topics in the second half of this book. I’ve tried to provide the big-picture explanations and the examples that have always been missing from the standard gnuplot reference documentation.
If you’re new to gnuplot, I think you’ll find it easy enough to pick up—in fact, I can promise you that by the end of chapter 2 you’ll be productive with gnuplot, and by the end of chapter 3 you’ll be well equipped for most day-to-day data graphing tasks that may come your way. A flat learning curve was one of the design objectives of the original gnuplot authors, and the ease with which you can get started is one of the great strengths of gnuplot today.
This book doesn’t require a strong background in mathematical methods, and none at all in statistics: anybody with some college-level (or just high-school) math should be able to read this book without difficulty. (Some familiarity with basic calculus is advantageous, but by no means required.)
This book should be accessible and helpful to anybody who tries to understand data. This includes scientists and engineers—in other words the kinds of people who’ve always been using gnuplot. If this describes you, I think you’ll find this book a helpful reference and handbook for gnuplot.
But I think this book will also be helpful to readers who don’t have a primary background in analytical methods, yet need to deal with data as part of their jobs: business analysts, technical managers, software engineers. If this is your situation, you may find the discussions on graphical methods in part 4 particularly helpful.
I spell the name of the program in all lowercase, except at the beginning of a sentence, when I capitalize normally. This is in accordance with the usage recommended in the gnuplot FAQ.
The gnuplot documentation is extensive and I refer to it occasionally, for additional detail on topics covered briefly or not at all here. Traditionally, the gnuplot documentation has been called the online help or online documentation, owing to the fact that it’s available “online” during a gnuplot session. But since the advent of the Internet, the word online seems to suggest network connectivity—falsely in this context. To avoid confusion, I’ll always refer to the standard gnuplot reference documentation instead.
Gnuplot commands are shown using a typewriter font, like this: plot sin(x). Single command lines can get long; to make them fit on a page, I occasionally had to break them across multiple lines. If so, a gray arrow (➥) has been placed at the beginning of the next line, to indicate that it is the continuation of the previous one:
The break in the original line is not indicated separately. When using gnuplot in an interactive session, your terminal program should wrap a line that is too long automatically. Alternatively, you can break lines by escaping the newline with a backslash as usual. This is useful in command files for batch processing, but you don’t want to do this during an interactive session, since it messes with the command history feature.
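As a minimal sketch of the backslash continuation in a command file: the file name data.txt and its column layout are assumptions made up for this illustration. The backslash has to be the very last character on the line.

```gnuplot
# Hypothetical data file "data.txt" with at least three columns.
# The trailing backslash escapes the newline, so gnuplot reads
# both physical lines as a single plot command.
plot "data.txt" using 1:2 title "first series" with lines, \
     "data.txt" using 1:3 title "second series" with lines
```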
Gnuplot has a large number of options, and keeping all of them, and their suboptions and optional parameters, straight is a major theme running through this book. Throughout the text, and in the reference appendix B, you’ll find summaries of gnuplot commands and their options and suboptions.
Within these summaries, I use a few syntactic conventions. My intent is to stay close to the usage familiar from the standard gnuplot reference documentation, but also to follow more general conventions (such as those used for Unix man pages):
For parameters supplied by the user, it’s not always clear from the context what kind of information the command expects: is it a string or a number? If it’s a number, is it an index into some array or a numerical factor? And so on. I’ve tried to clarify this situation by prefixing each user-supplied input parameter with a type indicator, terminated by a colon. I summarize the prefixes and their meanings in table 1.
Table 1. Type indicators for user-supplied parameters
Prefix | Description |
---|---|
str: | A string |
int: | An integer number |
flt: | A floating-point number |
idx: | An integer number, which is interpreted as index into an existing array |
clr: | A color specification (for example, rgbcolor "red" or rgb "#FFFF00") |
pos: | A pair of coordinates, comma separated, optionally containing coordinate system specifiers (for example, 0,0 or first 1.1, screen 0.9) |
enum: | A gnuplot keyword as unquoted string |
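To make the less obvious type indicators concrete, here is a small sketch of actual values that could be supplied for an int:, str:, pos:, and clr: parameter. The label texts and tag numbers are arbitrary choices for this illustration, not examples taken from later chapters.

```gnuplot
# int: the label tag (1)
# str: the label text ("peak")
# pos: x in the first coordinate system, y in screen coordinates
# clr: a named color
set label 1 "peak" at first 1.1, screen 0.9 textcolor rgbcolor "red"

# A color can also be given as a hex string (here: yellow),
# and a position as a plain pair of coordinates.
set label 2 "note" at 0,0 textcolor rgb "#FFFF00"

plot sin(x)
```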
Many gnuplot options and directives have abbreviated forms, some of which I use frequently in the latter parts of the book. Table 2 lists both the abbreviated and the full forms. Also keep in mind that an empty filename ("") inside the plot command reuses the most recently named file from the same command line.
Table 2. Abbreviations for the frequently used directives to the plot command and for the most important options
Abbreviation | Full |
---|---|
i | index |
ev | every |
u | using |
s | smooth |
s acs | smooth acsplines |
t | title |
w l | with lines |
w linesp or w lp | with linespoints |
w p | with points |
set t | set terminal |
set o | set output |
set logsc | set logscale |
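As an illustration of how the abbreviations read in practice, the two commands below are meant to be equivalent. This is only a sketch: the file name data.txt and its contents are assumed for the example. Note how the empty filename in the second part of each command refers back to data.txt.

```gnuplot
# Full forms: plot the raw points, then a smoothed curve
# through the same (hypothetical) data set.
plot "data.txt" index 0 using 1:2 title "raw" with points, \
     "" index 0 using 1:2 smooth acsplines title "smoothed" with lines

# The same command written with the abbreviations from table 2.
plot "data.txt" i 0 u 1:2 t "raw" w p, \
     "" i 0 u 1:2 s acs t "smoothed" w l
```

The abbreviated form is quicker to type in an interactive session, while the full form is usually easier to read in a command file that you plan to keep.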
This book describes version 4.2.x of gnuplot, which was initially released in March 2007. The most current bug-fix release at the time of this writing is version 4.2.5, released in March 2009.
After being stagnant for a long time, gnuplot development has picked up again in the last few years, so that things have changed significantly since gnuplot version 3.7. I won’t explain obsolete or deprecated features, and only make cursory remarks (if that) regarding backward compatibility.
Some installations and distributions still use gnuplot 4.0 (or older). Not all examples in this book will work with version 4.0 of gnuplot or earlier. If this is your situation, you should upgrade, either by installing a precompiled binary of version 4.2, or by compiling gnuplot from source. Appendix A tells you how to do it.
The current development version is gnuplot 4.3, which will be released eventually as minor gnuplot release 4.4 (or potentially as major 5.0 release). Except for some features that I’ve worked on myself (such as the smooth cumul and smooth kdens features I’ll introduce in chapter 14), I won’t have much to say about upcoming features in the next gnuplot release.
I assume you have access to a reasonably modern computer (not older than five years or so), running any flavor of Unix/Linux, a recent release of MS Windows, or Mac OS X. Although gnuplot has been ported to many other platforms in the past, most of them are by now obsolete, and I won’t talk about them in this book.
My education is in physics, and I’ve worked as a technology consultant, software engineer, technical lead, and project manager, for small startups and in large corporate environments, both in the U.S. and overseas.
I first started using gnuplot when I was a graduate student, and it has become an indispensable part of my toolbox: one of the handful of programs I can’t do without. Recently, I’ve also started to contribute a few features to the gnuplot development version.
I provide consulting services specializing in corporate metrics, business intelligence, data analysis, and mathematical modeling through my company, Principal Value, LLC (www.principal-value.com). I also teach classes on software design and data analysis at the University of Washington.
I hold a Ph.D. in theoretical physics from the University of Washington.
Purchase of Gnuplot in Action includes free access to a private web forum run by Manning Publications where you can make comments about the book, ask technical questions, and receive help from the author and from other users. To access the forum and subscribe to it, point your web browser to www.manning.com/GnuplotinAction. This page provides information on how to get on the forum once you are registered, what kind of help is available, and the rules of conduct on the forum. It also provides links to the source code for the examples in the book, errata, and other downloads.
Manning’s commitment to our readers is to provide a venue where a meaningful dialog between individual readers and between readers and the author can take place. It is not a commitment to any specific amount of participation on the part of the author, whose contribution to the Author Online forum remains voluntary (and unpaid). We suggest you try asking the author some challenging questions lest his interest stray!
The Author Online forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.
By combining introductions, overviews, and how-to examples, the In Action books are designed to help learning and remembering. According to research in cognitive science, the things people remember are things they discover during self-motivated exploration.
Although no one at Manning is a cognitive scientist, we are convinced that for learning to become permanent it must pass through stages of exploration, play, and, interestingly, retelling of what is being learned. People understand and remember new things, which is to say they master them, only after actively exploring them. Humans learn in action. An essential part of an In Action guide is that it’s example-driven. It encourages the reader to try things out, to play with new code, and explore new ideas.
There is another, more mundane, reason for the title of this book: our readers are busy. They use books to do a job or solve a problem. They need books that allow them to jump in and jump out easily and learn just what they want just when they want it. They need books that aid them in action. The books in this series are designed for such readers.
The figure on the cover of Gnuplot in Action is captioned “A peer of France.” The title of Peer of France was held by the highest-ranking members of the French nobility. It was an extraordinary honor granted only to a few dukes, counts, and princes of the church. The illustration is taken from a 19th-century edition of Sylvain Maréchal’s four-volume compendium of regional dress customs published in France. Each illustration is finely drawn and colored by hand.
The rich variety of Maréchal’s collection reminds us vividly of how culturally apart the world’s towns and regions were just 200 years ago. Isolated from each other, people spoke different dialects and languages. In the streets or in the countryside, it was easy to identify where they lived and what their trade or station in life was just by their dress.
Dress codes have changed since then and the diversity by region, so rich at the time, has faded away. It is now hard to tell apart the inhabitants of different continents, let alone different towns or regions. Perhaps we have traded cultural diversity for a more varied personal life, and certainly for a more varied and fast-paced technological life.
At a time when it is hard to tell one computer book from another, Manning celebrates the inventiveness and initiative of the computer business with book covers based on the rich diversity of regional life of two centuries ago, brought back to life by Maréchal’s pictures.