Histogram plot
histogram(X) creates a histogram plot of X. The histogram function uses an automatic binning algorithm that returns bins with a uniform width, chosen to cover the range of elements in X and reveal the underlying shape of the distribution. histogram displays the bins as rectangles such that the height of each rectangle indicates the number of elements in the bin.
histogram(C), where C is a categorical array, plots a histogram with a bar for each category in C.
histogram(C,Categories) plots only the subset of categories specified by Categories.
histogram('Categories',Categories,'BinCounts',counts) manually specifies categories and associated bin counts. histogram plots the specified bin counts and does not do any data binning.
histogram(___,Name,Value) specifies additional options with one or more Name,Value pair arguments using any of the previous syntaxes. For example, you can specify 'BinWidth' and a scalar to adjust the width of the bins, or 'Normalization' with a valid option ('count', 'probability', 'countdensity', 'pdf', 'cumcount', or 'cdf') to use a different type of normalization.
histogram(ax,___) plots into the axes specified by ax instead of into the current axes (gca). The option ax can precede any of the input argument combinations in the previous syntaxes.
Generate 10,000 random numbers and create a histogram. The histogram function automatically chooses an appropriate number of bins to cover the range of values in x and show the shape of the underlying distribution.
x = randn(10000,1);
h = histogram(x)
h = 
  Histogram with properties:
             Data: [10000×1 double]
           Values: [1×37 double]
          NumBins: 37
         BinEdges: [1×38 double]
         BinWidth: 0.2000
        BinLimits: [-3.8000 3.6000]
    Normalization: 'count'
        FaceColor: 'auto'
        EdgeColor: [0 0 0]
  Use GET to show all properties
When you specify an output argument to the histogram function, it returns a histogram object. You can use this object to inspect the properties of the histogram, such as the number of bins or the width of the bins.
Find the number of histogram bins.
nbins = h.NumBins
nbins = 37
Plot a histogram of 1,000 random numbers sorted into 25 equally spaced bins.
x = randn(1000,1);
nbins = 25;
h = histogram(x,nbins)
h = 
  Histogram with properties:
             Data: [1000×1 double]
           Values: [1×25 double]
          NumBins: 25
         BinEdges: [1×26 double]
         BinWidth: 0.2800
        BinLimits: [-3.4000 3.6000]
    Normalization: 'count'
        FaceColor: 'auto'
        EdgeColor: [0 0 0]
  Use GET to show all properties
Find the bin counts.
counts = h.Values
counts =
  Columns 1 through 13
     1     3     0     6    14    19    31    54    74    80    92   122   104
  Columns 14 through 25
   115    88    80    38    32    21     9     5     5     5     0     2
Generate 1,000 random numbers and create a histogram.
X = randn(1000,1);
h = histogram(X)
h = 
  Histogram with properties:
             Data: [1000×1 double]
           Values: [1×23 double]
          NumBins: 23
         BinEdges: [1×24 double]
         BinWidth: 0.3000
        BinLimits: [-3.3000 3.6000]
    Normalization: 'count'
        FaceColor: 'auto'
        EdgeColor: [0 0 0]
  Use GET to show all properties
Use the morebins function to coarsely adjust the number of bins. Each call increases the bin count, so the example calls it twice.
Nbins = morebins(h);
Nbins = morebins(h)
Nbins = 29
Adjust the bins at a fine grain level by explicitly setting the number of bins.
h.NumBins = 31;
Generate 1,000 random numbers and create a histogram. Specify the bin edges as a vector with wide bins on the edges of the histogram to capture the outliers that do not satisfy $|x| < 2$. The first vector element is the left edge of the first bin, and the last vector element is the right edge of the last bin.
x = randn(1000,1);
edges = [-10 -2:0.25:2 10];
h = histogram(x,edges);
Specify the Normalization property as 'countdensity' to flatten out the bins containing the outliers. Now, the area of each bin (rather than the height) represents the frequency of observations in that interval.
h.Normalization = 'countdensity';
Create a categorical vector that represents votes. The categories in the vector are 'yes', 'no', or 'undecided'.
A = [0 0 1 1 1 0 0 0 0 NaN NaN 1 0 0 0 1 0 1 0 1 0 0 0 1 1 1 1];
C = categorical(A,[1 0 NaN],{'yes','no','undecided'})
C = 
  Columns 1 through 9
     no      no      yes      yes      yes      no      no      no      no 
  Columns 10 through 16
     undecided      undecided      yes      no      no      no      yes 
  Columns 17 through 25
     no      yes      no      yes      no      no      no      yes      yes 
  Columns 26 through 27
     yes      yes 
Plot a categorical histogram of the votes, using a relative bar width of 0.5.
h = histogram(C,'BarWidth',0.5)
h = 
  Histogram with properties:
             Data: [1×27 categorical]
           Values: [11 14 2]
       Categories: {'yes'  'no'  'undecided'}
    Normalization: 'count'
     DisplayStyle: 'bar'
        FaceColor: 'auto'
        EdgeColor: [0 0 0]
  Use GET to show all properties
Generate 1,000 random numbers and create a histogram using the 'probability' normalization.
x = randn(1000,1);
h = histogram(x,'Normalization','probability')
h = 
  Histogram with properties:
             Data: [1000×1 double]
           Values: [1×23 double]
          NumBins: 23
         BinEdges: [1×24 double]
         BinWidth: 0.3000
        BinLimits: [-3.3000 3.6000]
    Normalization: 'probability'
        FaceColor: 'auto'
        EdgeColor: [0 0 0]
  Use GET to show all properties
Compute the sum of the bar heights. With this normalization, the height of each bar is equal to the probability of selecting an observation within that bin interval, and the height of all of the bars sums to 1.
S = sum(h.Values)
S = 1
Generate two vectors of random numbers and plot a histogram for each vector in the same figure.
x = randn(2000,1);
y = 1 + randn(5000,1);
h1 = histogram(x);
hold on
h2 = histogram(y);
Since the sample size and bin width of the histograms are different, it is difficult to compare them. Normalize the histograms so that all of the bar heights add to 1, and use a uniform bin width.
h1.Normalization = 'probability';
h1.BinWidth = 0.25;
h2.Normalization = 'probability';
h2.BinWidth = 0.25;
Generate 1,000 random numbers and create a histogram. Return the histogram object to adjust the properties of the histogram without recreating the entire plot.
x = randn(1000,1);
h = histogram(x)
h = 
  Histogram with properties:
             Data: [1000×1 double]
           Values: [1×23 double]
          NumBins: 23
         BinEdges: [1×24 double]
         BinWidth: 0.3000
        BinLimits: [-3.3000 3.6000]
    Normalization: 'count'
        FaceColor: 'auto'
        EdgeColor: [0 0 0]
  Use GET to show all properties
Specify exactly how many bins to use.
h.NumBins = 15;
Specify the edges of the bins with a vector. The first value in the vector is the left edge of the first bin. The last value is the right edge of the last bin.
h.BinEdges = [-3:3];
Change the color of the histogram bars.
h.FaceColor = [0 0.5 0.5];
h.EdgeColor = 'r';
Generate 5,000 normally distributed random numbers with a mean of 5 and a standard deviation of 2. Plot a histogram with Normalization set to 'pdf' to produce an estimation of the probability density function.
x = 2*randn(5000,1) + 5;
histogram(x,'Normalization','pdf')
In this example, the underlying distribution for the normally distributed data is known. You can, however, use the 'pdf' histogram plot to determine the underlying probability distribution of the data by comparing it against a known probability density function.
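The probability density function for a normal distribution with mean $\mu$, standard deviation $\sigma$, and variance $\sigma^2$ is
$$f(x,\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$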
Overlay a plot of the probability density function for a normal distribution with a mean of 5 and a standard deviation of 2.
hold on
y = -5:0.1:15;
mu = 5;
sigma = 2;
f = exp(-(y-mu).^2./(2*sigma^2))./(sigma*sqrt(2*pi));
plot(y,f,'LineWidth',1.5)
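As a quick sanity check on the density used in the overlay, the following sketch integrates f numerically over the plotted range. Because the range spans five standard deviations on each side of the mean, the result should be very close to 1. The variable name total is chosen here purely for illustration.

```matlab
% Evaluate the normal pdf (mean 5, standard deviation 2) on the overlay grid.
y = -5:0.1:15;
mu = 5;
sigma = 2;
f = exp(-(y-mu).^2./(2*sigma^2))./(sigma*sqrt(2*pi));

% Trapezoidal integration over [-5, 15]; a valid density integrates to about 1.
total = trapz(y,f);
fprintf('Integral of f over [-5,15]: %.6f\n', total);
```

A histogram drawn with 'Normalization','pdf' has the same property: its bar areas sum to 1, which is why the two can be compared on one set of axes.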
X — Data to distribute among bins
Data to distribute among bins, specified as a vector, matrix, or multidimensional array. If X is not a vector, then histogram treats it as a single column vector, X(:), and plots a single histogram.
histogram ignores all NaN values. Similarly, histogram ignores Inf and -Inf values, unless the bin edges explicitly specify Inf or -Inf as a bin edge.
Note:
If |
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical
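The treatment of NaN, Inf, and -Inf described for X can be checked numerically without drawing a plot. This sketch uses histcounts and assumes it applies the same binning rules as histogram; the data and edge vectors are made up for illustration.

```matlab
X = [-Inf -1 0 1 NaN Inf];

% Finite edges: NaN, Inf, and -Inf fall outside every bin and are not counted.
nFinite = histcounts(X, [-2 0 2]);

% Infinite outer edges: -Inf and Inf are now counted; NaN is still ignored.
nInfinite = histcounts(X, [-Inf 0 Inf]);

disp(nFinite)
disp(nInfinite)
```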
C — Categorical data
Categorical data, specified as a categorical array. histogram ignores undefined categorical values.
Data Types: categorical
nbins — Number of bins
Number of bins, specified as a positive integer. If you do not specify nbins, then histogram automatically calculates how many bins to use based on the values in X.
Example: histogram(X,15) creates a histogram with 15 bins.
edges — Bin edges
Bin edges, specified as a vector. edges(1) is the left edge of the first bin, and edges(end) is the right edge of the last bin.
The value X(i) is in the kth bin if edges(k) ≤ X(i) < edges(k+1). The last bin also includes the right bin edge, so that it contains X(i) if edges(end-1) ≤ X(i) ≤ edges(end).
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical
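The edge rule for edges (each bin is closed on the left and open on the right, except the last bin, which is closed on both sides) can be written out by hand. This is a minimal sketch with made-up data. It is not how histogram is implemented, only a restatement of the rule.

```matlab
X = [0 0.5 1 1.5 2 2.5 3];   % hypothetical sample data
edges = [0 1 2 3];           % three bins: [0,1), [1,2), [2,3]

nb = numel(edges) - 1;
counts = zeros(1, nb);
for k = 1:nb
    if k < nb
        % Interior bins include the left edge and exclude the right edge.
        inBin = X >= edges(k) & X < edges(k+1);
    else
        % The last bin also includes its right edge.
        inBin = X >= edges(k) & X <= edges(k+1);
    end
    counts(k) = sum(inBin);
end
disp(counts)
```

Note that the value 3 lands in the last bin only because that bin is closed on the right; the values 1 and 2 go to the bin that starts at them.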
Categories — Categories included in histogram
Note: This option only applies to categorical histograms.
Categories included in histogram, specified as a cell array of character vectors or categorical vector.
If you specify an input categorical array C, then by default, histogram plots a bar for each category in C. In that case, use Categories to specify a unique subset of the categories instead.
If you specify bin counts, then Categories specifies the associated category names for the histogram.
Example: h = histogram(C,{'Large','Small'}) plots only the categorical data in the categories 'Large' and 'Small'.
Example: histogram('Categories',{'Yes','No','Maybe'},'BinCounts',[22 18 3]) plots a histogram that has three categories with the associated bin counts.
Example: h.Categories queries the categories that are in histogram object h.
Data Types: cell | categorical
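To see which elements a Categories subset keeps, the counts can be computed without plotting. This sketch uses histcounts on a small made-up categorical array; the category names are illustrative only.

```matlab
% Hypothetical categorical data with three categories.
C = categorical({'Large','Small','Medium','Large','Small','Large'});

% Count only the elements that belong to 'Large' and 'Small'.
[N, cats] = histcounts(C, {'Large','Small'});

disp(cats)
disp(N)
```

The 'Medium' element is not counted, which mirrors how histogram(C,{'Large','Small'}) omits that bar.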
counts — Bin counts
Bin counts, specified as a vector. Use this input to pass bin counts to histogram when the bin counts calculation is performed separately and you do not want histogram to do any data binning.
The length of counts must be equal to the number of bins.
For numeric histograms, the number of bins is length(edges)-1.
For categorical histograms, the number of bins is equal to the number of categories.
Example: histogram('BinEdges',-2:2,'BinCounts',[5 8 15 9])
Example: histogram('Categories',{'Yes','No','Maybe'},'BinCounts',[22 18 3])
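One common way to arrive at precomputed counts is to bin the data once with histcounts and pass the result to histogram. The sketch below assumes that workflow, and the random data is only a placeholder.

```matlab
x = randn(1000,1);                 % placeholder data

% Bin the data separately; edges has one more element than counts.
[counts, edges] = histcounts(x);

% Plot the precomputed counts; histogram does no binning here.
h = histogram('BinEdges', edges, 'BinCounts', counts);
```

This separation can be useful when the binning step is expensive or performed elsewhere, since the plot can be rebuilt from counts and edges alone.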
ax — Axes object
Axes object. If you do not specify an axes, then the histogram function uses the current axes (gca).
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: histogram(X,'BinWidth',5)
The histogram properties listed here are only a subset. For a complete list, see Histogram Properties.
'BarWidth' — Relative width of categorical bars
0.9 (default) | scalar in range [0,1]
Note: This option only applies to histograms of categorical data.
Relative width of categorical bars, specified as a scalar value in the range [0,1]. Use this property to control the separation of categorical bars within the histogram. The default value is 0.9, which means that the bar width is 90% of the space from the previous bar to the next bar, with 5% of that space on each side.
If you set this property to 1, then adjacent bars touch.
Example: 0.5
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
'BinLimits' — Bin limits
Bin limits, specified as a two-element vector, [bmin,bmax]. This option plots a histogram using the values in the input array, X, that fall between bmin and bmax inclusive. That is, X(X>=bmin & X<=bmax).
This option does not apply to histograms of categorical data.
Example: histogram(X,'BinLimits',[1,10]) plots a histogram using only the values in X that are between 1 and 10 inclusive.
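The selection expression X(X>=bmin & X<=bmax) given for 'BinLimits' can be tried directly on a small vector. The values below are made up for illustration.

```matlab
X = [-5 0 1 2.5 10 11 NaN];   % hypothetical data
bmin = 1;
bmax = 10;

% Keep only the values inside the inclusive limits.
% Comparisons with NaN are false, so NaN is dropped as well.
kept = X(X >= bmin & X <= bmax);
disp(kept)
```

Both endpoints are retained because the comparison is inclusive on each side.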
'BinLimitsMode' — Selection mode for bin limits
'auto' (default) | 'manual'
Selection mode for bin limits, specified as 'auto' or 'manual'. The default value is 'auto', so that the bin limits automatically adjust to the data.
If you explicitly specify either BinLimits or BinEdges, then BinLimitsMode is automatically set to 'manual'. In that case, specify BinLimitsMode as 'auto' to rescale the bin limits to the data.
This option does not apply to histograms of categorical data.
'BinMethod' — Binning algorithm
'auto' (default) | 'scott' | 'fd' | 'integers' | 'sturges' | 'sqrt'
Binning algorithm, specified as one of the values in this table.
Value | Description |
---|---|
'auto' | The default 'auto' algorithm chooses a bin width to cover the data range and reveal the shape of the underlying distribution. |
'scott' | Scott's rule is optimal if the data is close to being normally distributed. This rule is appropriate for most other distributions, as well. It uses a bin width of 3.5*std(X(:))*numel(X)^(-1/3). |
'fd' | The Freedman-Diaconis rule is less sensitive to outliers in the data, and might be more suitable for data with heavy-tailed distributions. It uses a bin width of 2*IQR(X(:))*numel(X)^(-1/3), where IQR is the interquartile range of X. |
'integers' | The integer rule is useful with integer data, as it creates a bin for each integer. It uses a bin width of 1 and places bin edges halfway between integers. To avoid accidentally creating too many bins, this rule is limited to 65536 bins (2^16). If the data range is greater than 65536, then the integer rule uses wider bins instead. |
'sturges' | Sturges' rule is popular due to its simplicity. It chooses the number of bins to be ceil(1 + log2(numel(X))). |
'sqrt' | The Square Root rule is widely used in other software packages. It chooses the number of bins to be ceil(sqrt(numel(X))). |
This option does not apply to histograms of categorical data.
Note:
If you set the |
Example: histogram(X,'BinMethod','integers') creates a histogram with the bins centered on integers.
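The formulas quoted for the 'scott', 'sturges', and 'sqrt' methods are simple enough to evaluate by hand. The sketch below does so for a made-up sample of 1,000 evenly spaced values; the variable names are illustrative. The bins histogram actually draws may differ somewhat from these raw values, for example if it rounds edges to convenient numbers.

```matlab
X = linspace(-3,3,1000)';     % hypothetical sample
n = numel(X);

% Scott's rule gives a bin width.
widthScott = 3.5*std(X(:))*n^(-1/3);

% Sturges' rule and the Square Root rule give a number of bins.
nbinsSturges = ceil(1 + log2(n));
nbinsSqrt = ceil(sqrt(n));

fprintf('Scott width: %.4f\n', widthScott);
fprintf('Sturges bins: %d, sqrt bins: %d\n', nbinsSturges, nbinsSqrt);
```

For the same sample size, the Square Root rule asks for roughly three times as many bins as Sturges' rule, which shows how much the choice of method can change the picture.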
'BinWidth' — Width of bins
Width of bins, specified as a scalar. When you specify BinWidth, then histogram can use a maximum of 65,536 bins (or 2^16). If instead the specified bin width requires more bins, then histogram uses a larger bin width corresponding to the maximum number of bins.
This option does not apply to histograms of categorical data.
Example: histogram(X,'BinWidth',5) uses bins with a width of 5.
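The 65,536-bin cap on 'BinWidth' can be pictured with a little arithmetic. The sketch below is an assumption about how such a cap could work: it spreads the data range evenly over the maximum number of bins. It is not histogram's documented procedure. All names and numbers are hypothetical.

```matlab
maxBins = 65536;          % 2^16, the cap stated for BinWidth
dataRange = 1e6;          % hypothetical span of the data, max(X) - min(X)
requestedWidth = 1;       % hypothetical 'BinWidth' value

neededBins = ceil(dataRange / requestedWidth);
if neededBins > maxBins
    % Too many bins requested: widen the bins so maxBins cover the range.
    effectiveWidth = dataRange / maxBins;
else
    effectiveWidth = requestedWidth;
end
fprintf('Effective bin width: %.4f\n', effectiveWidth);
```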
'DisplayStyle' — Histogram display style
'bar' (default) | 'stairs'
Histogram display style, specified as either 'bar' or 'stairs'. Specify 'stairs' to display a stairstep plot, which displays the outline of the histogram without filling the interior.
The default value of 'bar' displays a histogram bar plot.
Example: histogram(X,'DisplayStyle','stairs') plots the outline of the histogram.
'EdgeAlpha' — Transparency of histogram bar edges
1 (default) | scalar value between 0 and 1 inclusive
Transparency of histogram bar edges, specified as a scalar value between 0 and 1 inclusive. A value of 1 means fully opaque and 0 means completely transparent (invisible).
Example: histogram(X,'EdgeAlpha',0.5) creates a histogram plot with semi-transparent bar edges.
'EdgeColor' — Histogram edge color
[0 0 0] or black (default) | 'none' | 'auto' | RGB triplet or color name
Histogram edge color, specified as one of these values:
'none' — Edges are not drawn.
'auto' — Color of each edge is chosen automatically.
RGB triplet or a color name — Edges use the specified color.
An RGB triplet is a three-element row vector whose elements specify the intensities of the red, green, and blue components of the color. The intensities must be in the range [0,1]; for example, [0.4 0.6 0.7]. This table lists the long and short color name options and the equivalent RGB triplet values.
Long Name | Short Name | RGB Triplet |
---|---|---|
'yellow' | 'y' | [1 1 0] |
'magenta' | 'm' | [1 0 1] |
'cyan' | 'c' | [0 1 1] |
'red' | 'r' | [1 0 0] |
'green' | 'g' | [0 1 0] |
'blue' | 'b' | [0 0 1] |
'white' | 'w' | [1 1 1] |
'black' | 'k' | [0 0 0] |
Example: histogram(X,'EdgeColor','r') creates a histogram plot with red bar edges.
'FaceAlpha' — Transparency of histogram bars
0.6 (default) | scalar value between 0 and 1 inclusive
Transparency of histogram bars, specified as a scalar value between 0 and 1 inclusive. histogram uses the same transparency for all the bars of the histogram. A value of 1 means fully opaque and 0 means completely transparent (invisible).
Example: histogram(X,'FaceAlpha',1) creates a histogram plot with fully opaque bars.
'FaceColor' — Histogram bar color
'auto' (default) | 'none' | RGB triplet or color name
Histogram bar color, specified as one of these values:
'none' — Bars are not filled.
'auto' — Histogram bar color is chosen automatically (default).
RGB triplet or a color name — Bars are filled with the specified color.
An RGB triplet is a three-element row vector whose elements specify the intensities of the red, green, and blue components of the color. The intensities must be in the range [0,1]; for example, [0.4 0.6 0.7]. This table lists the long and short color name options and the equivalent RGB triplet values.
Long Name | Short Name | RGB Triplet |
---|---|---|
'yellow' | 'y' | [1 1 0] |
'magenta' | 'm' | [1 0 1] |
'cyan' | 'c' | [0 1 1] |
'red' | 'r' | [1 0 0] |
'green' | 'g' | [0 1 0] |
'blue' | 'b' | [0 0 1] |
'white' | 'w' | [1 1 1] |
'black' | 'k' | [0 0 0] |
If you specify DisplayStyle as 'stairs', then histogram does not use the FaceColor property.
Example: histogram(X,'FaceColor','g') creates a histogram plot with green bars.
'LineStyle' — Line style
'-' (default) | '--' | ':' | '-.' | 'none'
Line style, specified as one of the line styles listed in this table.
Line Style | Description |
---|---|
'-' | Solid line |
'--' | Dashed line |
':' | Dotted line |
'-.' | Dash-dotted line |
'none' | No line |
'LineWidth' — Width of bar outlines
0.5 (default) | positive value
Width of bar outlines, specified as a positive value in point units. One point equals 1/72 inch.
Example: 1.5
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
'Normalization' — Type of normalization
'count' (default) | 'probability' | 'countdensity' | 'pdf' | 'cumcount' | 'cdf'
Type of normalization, specified as one of the values in this table.
Value | Description |
---|---|
'count' | Default normalization scheme. The height of each bar is the number of observations in each bin. The sum of the bar heights is the total number of observations. For categorical histograms, the sum of the bar heights is either the total number of elements or, if Categories specifies a subset, the number of elements in the specified categories. |
'probability' | The height of each bar is the relative number of observations, (number of observations in bin / total number of observations). The sum of the bar heights is 1. For categorical histograms, the height of each bar is (number of elements in category / total number of elements in all categories). The sum of the bar heights is 1. |
'countdensity' | The height of each bar is (number of observations in bin / width of bin). The area (height * width) of each bar is the number of observations in the bin. The sum of the bar areas is the total number of observations. For categorical histograms, this is the same as 'count'. |
'pdf' | Probability density function estimate. The height of each bar is (number of observations in the bin) / (total number of observations * width of bin). The area of each bar is the relative number of observations. The sum of the bar areas is 1. For categorical histograms, this is the same as 'probability'. |
'cumcount' | The height of each bar is the cumulative number of observations in each bin and all previous bins. The height of the last bar is the total number of observations. For categorical histograms, the height of each bar is equal to the cumulative number of elements in each category and all previous categories. The height of the last bar is the total number of elements in the plotted categories. |
'cdf' | Cumulative distribution function estimate. The height of each bar is equal to the cumulative relative number of observations in the bin and all previous bins. The height of the last bar is 1. For categorical data, the height of each bar is equal to the cumulative relative number of observations in each category and all previous categories. The height of the last bar is 1. |
Example: histogram(X,'Normalization','pdf') plots an estimate of the probability density function for X.
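Each 'Normalization' option is a simple transformation of the raw bin counts and bin widths. The sketch below works them out for made-up counts on non-uniform bins; the variable names are illustrative and no plotting is involved.

```matlab
counts = [2 5 3];       % hypothetical raw counts ('count')
edges = [0 1 3 4];      % non-uniform bins with widths 1, 2, 1

w = diff(edges);        % width of each bin
n = sum(counts);        % total number of observations

probability = counts / n;             % bar heights sum to 1
countdensity = counts ./ w;           % bar areas sum to n
pdfEstimate = counts ./ (n * w);      % bar areas sum to 1
cumcount = cumsum(counts);            % last bar equals n
cdfEstimate = cumsum(counts) / n;     % last bar equals 1

disp(sum(pdfEstimate .* w))
disp(cumcount)
```

With bins of unequal width, 'probability' and 'pdf' differ in the wide middle bin (0.5 versus 0.25), which is the distinction between bar height and bar area.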
'Orientation' — Orientation of bars
'vertical' (default) | 'horizontal'
Orientation of bars, specified as 'vertical' or 'horizontal'.
Example: histogram(X,'Orientation','horizontal') creates a histogram plot with horizontal bars.
h — Histogram
Histogram, returned as an object. For more information, see histogram.
This function supports tall arrays with the limitations:
Some input options are not supported. The allowed options are:
'BinWidth'
'BinLimits'
'Normalization'
'DisplayStyle'
'BinMethod' — The 'auto' and 'scott' bin methods are the same. The 'fd' bin method is not supported.
'EdgeAlpha'
'EdgeColor'
'FaceAlpha'
'FaceColor'
'LineStyle'
'LineWidth'
'Orientation'
Additionally, there is a cap on the maximum number of bars. The default maximum is 100.
The morebins and fewerbins methods are not supported.
For more information, see Tall Arrays.
Histogram plots created using histogram have a context menu in plot edit mode that enables interactive manipulations in the figure window. For example, you can use the context menu to interactively change the number of bins, align multiple histograms, or change the display order.
discretize | fewerbins | histcounts | histcounts2 | histogram2 | morebins