Stack data from multiple variables into single variable
converts
the table or timetable, S
= stack(U
,vars
)U
, into an equivalent table
or timetable, S
, that is stacked. The stack
function
stacks up multiple variables from U
, specified
by vars
, into a single variable in S
.
In general, S
contains fewer variables, but more
rows, than U
.
The output argument, S
, contains a new categorical
variable to indicate which variable in U
the stacked
data in each row came from. stack
replicates data
from the variables in U
that are not stacked.
If U
is a timetable, then stack
also
replicates the row times. You cannot stack the row times of U
.
converts
the table, S
= stack(U
,vars
,Name,Value
)U
, with additional options specified
by one or more Name,Value
pair arguments.
For example, you can specify variable names for the new and
stacked variables in U
.
Create a table containing test scores from three separate tests. The table is in unstacked format.
Test1 = [93;57;87;89]; Test2 = [89;77;92;86]; Test3 = [95;62;89;91]; U = table(Test1,Test2,Test3)
U = Test1 Test2 Test3 _____ _____ _____ 93 89 95 57 77 62 87 92 89 89 86 91
The table contains four rows and three variables.
Stack the test scores into a single variable.
S = stack(U,1:3)
S = Test1_Test2_Test3_Indicator Test1_Test2_Test3 ___________________________ _________________ Test1 93 Test2 89 Test3 95 Test1 57 Test2 77 Test3 62 Test1 87 Test2 92 Test3 89 Test1 89 Test2 86 Test3 91
S
contains twelve rows and two variables. S
is in stacked format.
The categorical variable, Test1_Test2_Test3_Indicator
, identifies which test corresponds to the score in the stacked data variable, Test1_Test2_Test3
.
Create a table indicating the amount of snowfall at three locations from five separate storms.
Storm = [1;2;3;4;5]; Date = {'12/25/11';'1/2/12';'1/23/12';'2/7/12';'2/15/12'}; Natick = [20;5;13;0;17]; Boston = [18;9;21;5;12]; Worcester = [26;10;16;3;15]; U = table(Storm,Date,Natick,Boston,Worcester)
U = Storm Date Natick Boston Worcester _____ __________ ______ ______ _________ 1 '12/25/11' 20 18 26 2 '1/2/12' 5 9 10 3 '1/23/12' 13 21 16 4 '2/7/12' 0 5 3 5 '2/15/12' 17 12 15
The variables Storm
and Date
contain data that is constant at each location.
Stack the variables Natick
, Boston
, and Worcester
into a single variable. Name the variable containing the stacked data, Snowfall
, and name the new indicator variable, Town
.
S = stack(U,{'Natick','Boston','Worcester'},... 'NewDataVariableName','Snowfall',... 'IndexVariableName','Town')
S = Storm Date Town Snowfall _____ __________ _________ ________ 1 '12/25/11' Natick 20 1 '12/25/11' Boston 18 1 '12/25/11' Worcester 26 2 '1/2/12' Natick 5 2 '1/2/12' Boston 9 2 '1/2/12' Worcester 10 3 '1/23/12' Natick 13 3 '1/23/12' Boston 21 3 '1/23/12' Worcester 16 4 '2/7/12' Natick 0 4 '2/7/12' Boston 5 4 '2/7/12' Worcester 3 5 '2/15/12' Natick 17 5 '2/15/12' Boston 12 5 '2/15/12' Worcester 15
S
contains three rows for each storm, and stack
repeats the data in the constant variables, Storm
and Date
, accordingly.
The categorical variable, Town
, identifies which variable in U
contains the corresponding Snowfall
data.
Create a table containing estimated influenza rates along the east coast of the United States. Create a different variable for the Northeast, Mid Atlantic, and South Atlantic. Data Source: Google Flu Trends (http://www.google.org/flutrends).
Month = {'October';'November';'December';... 'January';'February';'March'}; Year = [2005*ones(3,1); 2006*ones(3,1)]; NE = [1.1902; 1.3610; 1.5003; 1.7772; 2.1350; 2.2345]; MidAtl = [1.1865; 1.4120; 1.6043; 1.8830; 2.1227; 1.9920]; SAtl = [1.2730; 1.5820; 1.8625; 1.9540; 2.4803; 2.0203]; fluU = table(Month,Year,NE,MidAtl,SAtl)
fluU = Month Year NE MidAtl SAtl __________ ____ ______ ______ ______ 'October' 2005 1.1902 1.1865 1.273 'November' 2005 1.361 1.412 1.582 'December' 2005 1.5003 1.6043 1.8625 'January' 2006 1.7772 1.883 1.954 'February' 2006 2.135 2.1227 2.4803 'March' 2006 2.2345 1.992 2.0203
The variables Month
and Year
contain data that is constant across the row.
Stack the variables NE
, MidAtl
, and SAtl
into a single variable called FluRate
. Name the new indicator variable Region
and output an index vector, ifluU
, to indicate the correspondence between rows in the input unstacked table, fluU
, and the output stacked table, fluS
.
[fluS,ifluU] = stack(fluU,3:5,... 'NewDataVariableName','FluRate',... 'IndexVariableName','Region')
fluS = Month Year Region FluRate __________ ____ ______ _______ 'October' 2005 NE 1.1902 'October' 2005 MidAtl 1.1865 'October' 2005 SAtl 1.273 'November' 2005 NE 1.361 'November' 2005 MidAtl 1.412 'November' 2005 SAtl 1.582 'December' 2005 NE 1.5003 'December' 2005 MidAtl 1.6043 'December' 2005 SAtl 1.8625 'January' 2006 NE 1.7772 'January' 2006 MidAtl 1.883 'January' 2006 SAtl 1.954 'February' 2006 NE 2.135 'February' 2006 MidAtl 2.1227 'February' 2006 SAtl 2.4803 'March' 2006 NE 2.2345 'March' 2006 MidAtl 1.992 'March' 2006 SAtl 2.0203 ifluU = 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6
ifluU(5)
is 2
. The fifth row in the output table, fluS
, contains data from the second row in the input table fluU
.
U
— Input tableInput table, specified as a table or a timetable.
vars
— Variables in U
to stackVariables in U
to stack, specified as a positive
integer, vector of positive integers, character vector, cell array
of character vectors, or logical vector.
Specify optional comma-separated pairs of Name,Value
arguments.
Name
is the argument
name and Value
is the corresponding
value. Name
must appear
inside single quotes (' '
).
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN
.
'NewDataVariableName','StackedData'
names
the new data variable StackedData
.'ConstantVariables'
— Variables other than vars
to include in the outputVariables other than vars
to include in the
output, specified as the comma-separated pair consisting of 'ConstantVariables'
and
a positive integer, vector of positive integers, character vector,
cell array of character vectors, or logical vector. stack
replicates
the data from the constant variables for each stacked entry from a
row.
The default is all the variables in U
not
specified by vars
. You can specify the 'ConstantVariables'
name-value
pair argument to exclude variables not specified by vars
or 'ConstantVariables'
from
the output table, S
.
'NewDataVariableName'
— Name for the new data variable in S
Name for the new data variable in S
, specified
as the comma-separated pair consisting of 'NewDataVariableName'
and
a character vector. The default is a concatenation of the names of
the variables from U
that are stacked up.
'IndexVariableName'
— Name for the new indicator variable in S
Name for the new indicator variable in S
,
specified as the comma-separated pair consisting of 'IndexVariableName'
and
a character vector. The default is a name based on NewDataVariableName
.
S
— Output tableStacked table, returned as a table or a timetable. S
contains
a stacked data variable, a categorical indicator variable, and any
constant variables.
You can store additional metadata such as descriptions, variable
units, variable names, and row names in the table. For more information,
see Table Properties
.
stack
assigns the variable units and variable
description property values from the first variable listed in vars
to
the corresponding S.Properties.VariableUnits
and S.Properties.VariableDescrisciptions
values
for the new data variable.
iu
— Index to U
Index to U
, returned as a column vector.
The index vector, iu
, identifies the row in the
input table, U
, containing the corresponding data. stack
creates
the j
th row in the output table, S
,
using U(iu(j),vars)
.
This function supports tall arrays with the limitation:
The two-output syntax [S,iu] = stack(...)
is
not supported.
For more information, see Tall Arrays.
You can specify more than one group of data variables
in U
, and each group becomes a stacked data variable
in S
. Use a cell array to contain multiple values
for vars
, and a cell array of character vectors
to contain multiple values for the 'NewDataVariableName'
name-value
pair argument. All groups must contain the same number of variables.