concept histogram in category gnuplot

This is an excerpt from Manning's book Gnuplot in Action.
Although all of these smoothing methods are invoked using the smooth keyword, they fall into three broad groups that do different things and apply to different situations. Methods 1–5 construct a smooth curve through a set of (x,y) coordinate pairs. Methods 6–8 treat the data set as a one-dimensional (or univariate) point distribution and generate visual representations of this distribution. Methods 9 and 10 remove multiple entries from a data set. As it turns out, method 10 (frequency) can also be used to count data points in order to construct a histogram.
Figure 13.9. An alternative to histograms: kernel density estimates using smooth kdensity. Curves for three different bandwidths are shown. A bandwidth of 0.3 seems to give the best trade-off between smoothing action and retention of details. Note how it brings out the secondary cluster near x=3.5. Individual data points are represented through the rug plot along the bottom.
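A plot of the kind described in the caption could be produced along the following lines. This is only a sketch under stated assumptions: the `$Data` block is hypothetical sample data invented for illustration (inline data blocks need gnuplot 5 or later), and of the three bandwidths only 0.3 is named in the caption, so 0.1 and 1.0 are placeholder choices. Each point is given weight 1/n so that the estimated density integrates to one.

```gnuplot
# Hypothetical sample: a main cluster near x = 1.5 and a smaller one near x = 3.5
$Data << EOD
0.9
1.2
1.3
1.5
1.6
1.8
2.1
3.4
3.5
3.7
EOD

stats $Data using 1 nooutput    # sets STATS_records to the number of points
n = STATS_records

set yrange [0:*]
# Three kernel density estimates, then the rug plot (short impulses at each data point).
# The bandwidths 0.1 and 1.0 are assumptions; only 0.3 is mentioned in the caption.
plot $Data using 1:(1.0/n) smooth kdensity bandwidth 0.1 title "bandwidth 0.1", \
     $Data using 1:(1.0/n) smooth kdensity bandwidth 0.3 title "bandwidth 0.3", \
     $Data using 1:(1.0/n) smooth kdensity bandwidth 1.0 title "bandwidth 1.0", \
     $Data using 1:(0.02) with impulses lc rgb "black" notitle
```

A small bandwidth follows individual points closely, while a large one blurs the two clusters together, which is the trade-off the caption refers to.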
The next step when investigating the properties of a distribution usually involves drawing a histogram. To create a histogram, you assign data points to buckets or bins and count how many events fall into each bin. It’s easiest to make all bins have equal width, but with proper normalization per bin, you can make a histogram containing bins of differing widths. This is sometimes useful out in the tails of a distribution where the number of events per bin is small.
Gnuplot doesn’t have an explicit histogramming function, but you can use the smooth frequency functionality (see section 3.5.3) to good effect. Recall: smooth frequency sorts the x values by size and then plots the sum of y values per x value. That’s what you need to build a histogram.
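As a small illustration of that behavior, the sketch below feeds a data set with repeated x values through smooth frequency. The `$Events` block is hypothetical data invented for this example and requires gnuplot 5 or later.

```gnuplot
# Hypothetical data: x in column 1, y in column 2; some x values repeat
$Events << EOD
1 2
3 1
1 5
2 4
3 3
EOD

# smooth frequency sorts by x and sums the y values that share an x:
#   x = 1 -> 2 + 5 = 7,   x = 2 -> 4,   x = 3 -> 1 + 3 = 4
plot $Events using 1:2 smooth frequency with linespoints title "sum of y per x"
```

If every x value is first mapped to the bin it belongs to, the same summation counts the points per bin.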
The smooth frequency feature forms the sum of all y values falling into each bin. If all you care about is the overall shape of the histogram, you may supply any constant, such as (1); but if you want to obtain a normalized histogram (one with a total area equal to unity), you need to take into account the number of points in the sample and the bin width. You can convince yourself easily that the proper y value for a normalized histogram is $1/(N w)$, where $N$ is the number of points in the sample and $w$ is the bin width.
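One way to check the normalization, using notation chosen here rather than quoted from the book: write $N$ for the number of points, $w$ for the (equal) bin width, and $n_k$ for the number of points falling into bin $k$. If every point contributes $1/(N w)$, bin $k$ reaches a height of $n_k/(N w)$, and summing height times width over all bins gives

$$
\sum_k \frac{n_k}{N w}\, w \;=\; \frac{1}{N} \sum_k n_k \;=\; \frac{N}{N} \;=\; 1 .
$$

The same argument suggests that, for bins of differing widths $w_k$, each point in bin $k$ should contribute $1/(N w_k)$ instead.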
You can use the with boxes style to draw a histogram (see figure 13.8), but you should fix the width of the boxes in the graph so that it coincides with the bin width. (By default, the boxes expand to touch their neighbors, which leads to a faulty graphical representation if some of the internal bins are empty.) Choose a bin width of 0.1, and use the stats command to find the number of points in the data set. The plot command uses the binc() function, because the with boxes style positions its boxes centered at the supplied position.
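The listing this paragraph refers to is not part of the excerpt. A minimal sketch of such a script might look as follows, under several assumptions: the `$Data` block is hypothetical inline data (gnuplot 5 or later), and `binc()` is defined here as one plausible bin-center function, since the book's own definition is not shown. `floor()` is used rather than `int()` so that negative values would also land in the correct bin.

```gnuplot
# Hypothetical sample data; no value falls into the bin [0.2, 0.3)
$Data << EOD
0.12
0.18
0.31
0.35
0.38
0.52
0.95
EOD

w = 0.1                                  # bin width
binc(x, s) = s * (floor(x / s) + 0.5)    # center of the bin containing x (assumed definition)

stats $Data using 1 nooutput             # sets STATS_records to the number of points
n = STATS_records

set boxwidth w                           # fixed box width, so empty bins show up as gaps
set style fill solid 0.5
set yrange [0:*]

# Each point contributes 1/(w*n); smooth frequency adds up the contributions per bin center
plot $Data using (binc($1, w)):(1.0 / (w * n)) smooth frequency with boxes notitle
```

Replacing the second `using` expression with `(1)` would give raw counts per bin instead of a normalized histogram.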

This is an excerpt from Manning's book Gnuplot in Action: Understanding Data with Graphs.
New plot styles: filled curves and boxes, histograms, and vectors.
Figure 13.10. An alternative to histograms: kernel density estimates using smooth kdensity. Curves for three different bandwidths are shown. A bandwidth of 0.3 seems to give the best trade-off between smoothing action and retention of details. Note how it brings out the secondary cluster near x=3.5.
The next step when investigating the properties of a distribution usually involves drawing a histogram. To create a histogram, we assign data points to buckets or bins and count how many events fall into each bin. It’s easiest to make all bins have equal width, but with proper normalization per bin, we can make a histogram containing bins of differing widths. This is sometimes useful out in the tails of a distribution where the number of events per bin is small.
Gnuplot doesn’t have an explicit histogramming function, but we can use the smooth frequency functionality (see section 3.2) to good effect. Recall: smooth frequency sorts the x values by size, and then plots the sum of y values per x value. That’s what we need to build a histogram.