This example shows how to convert a variable in a table from a cell array of character vectors to a categorical array.
Load sample data gathered from 100 patients.
load patients
whos
  Name                            Size     Bytes  Class      Attributes

  Age                           100x1        800  double
  Diastolic                     100x1        800  double
  Gender                        100x1      12212  cell
  Height                        100x1        800  double
  LastName                      100x1      12416  cell
  Location                      100x1      15008  cell
  SelfAssessedHealthStatus      100x1      12340  cell
  Smoker                        100x1        100  logical
  Systolic                      100x1        800  double
  Weight                        100x1        800  double
Store the patient data from Age, Gender, Height, Weight, SelfAssessedHealthStatus, and Location in a table. Use the unique identifiers in the variable LastName as row names.
T = table(Age,Gender,Height,Weight,...
    SelfAssessedHealthStatus,Location,...
    'RowNames',LastName);
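Because the last names serve as row names, a single patient can be looked up by name. The following is a minimal sketch of that idea; the choice of 'Lee' and of the two variables is only for illustration.

```matlab
% Build the table as in the example so this sketch runs on its own.
load patients
T = table(Age,Gender,Height,Weight,...
    SelfAssessedHealthStatus,Location,...
    'RowNames',LastName);

% Parentheses with a row name return a one-row table.
leeRow = T('Lee',{'Age','Height'})

% Curly braces with a row name return the underlying value.
leeAge = T{'Lee','Age'}
```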
The cell arrays of character vectors, Gender and Location, contain discrete sets of unique values.
Convert Gender and Location to categorical arrays.
T.Gender = categorical(T.Gender);
T.Location = categorical(T.Location);
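To confirm what the conversion produced, you can list the category names and count the elements in each category. This is a sketch that rebuilds the table first so that it runs on its own; the output variable names are chosen here for illustration.

```matlab
% Rebuild the table and convert the two variables, as in the example.
load patients
T = table(Age,Gender,Height,Weight,...
    SelfAssessedHealthStatus,Location,...
    'RowNames',LastName);
T.Gender = categorical(T.Gender);
T.Location = categorical(T.Location);

% categories returns the category names as a cell array of character vectors.
genderNames = categories(T.Gender)
locationNames = categories(T.Location)

% countcats returns the number of elements in each category,
% in the same order as the names returned by categories.
genderCounts = countcats(T.Gender)
```

When categorical is called without a list of categories, the category names are taken from the sorted unique values of the input, so Female precedes Male here.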
The variable SelfAssessedHealthStatus contains four unique values: Excellent, Fair, Good, and Poor.
Convert SelfAssessedHealthStatus to an ordinal categorical array, such that the categories have the mathematical ordering Poor < Fair < Good < Excellent.
T.SelfAssessedHealthStatus = categorical(T.SelfAssessedHealthStatus,...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);
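One way to check that the ordering took effect is to query the array directly. The sketch below works on the workspace variable rather than the table so that it stands alone; the name health is an assumption made for this illustration.

```matlab
% Create the ordinal categorical array from the sample data.
load patients
health = categorical(SelfAssessedHealthStatus,...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);

% isordinal returns logical 1 (true) for an ordinal categorical array.
tf = isordinal(health)

% categories lists the names in the order that was specified,
% not in alphabetical order.
names = categories(health)

% min and max follow the category order.
lowest = min(health)
highest = max(health)
```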
Use summary to view the data type, description, units, and other descriptive statistics for each variable in the table.
format compact
summary(T)
Variables:

    Age: 100×1 double
        Values:
            min       25
            median    39
            max       50

    Gender: 100×1 categorical
        Values:
            Female    53
            Male      47

    Height: 100×1 double
        Values:
            min       60
            median    67
            max       72

    Weight: 100×1 double
        Values:
            min       111
            median    142.5
            max       202

    SelfAssessedHealthStatus: 100×1 ordinal categorical
        Values:
            Poor         11
            Fair         15
            Good         40
            Excellent    34

    Location: 100×1 categorical
        Values:
            County General Hospital      39
            St. Mary's Medical Center    24
            VA Hospital                  37
The table variables Gender, SelfAssessedHealthStatus, and Location are categorical arrays. For each of them, the summary reports the number of elements in each category. For example, the summary indicates that 53 of the 100 patients are female and 47 are male.
Create a subtable, T1, containing the age, height, and weight of all female patients who were observed at County General Hospital. Because Gender and Location are categorical arrays, you can compare them directly with category names to create a logical vector.
rows = T.Location=='County General Hospital' & T.Gender=='Female';
rows is a 100-by-1 logical vector with logical true (1) for the table rows where the gender is female and the location is County General Hospital.
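Before indexing, it can help to count how many rows each condition selects. The sketch below does this on the workspace variables so that it runs on its own; the names isCounty and isFemale are introduced here for illustration.

```matlab
% Convert the two variables to categorical arrays.
load patients
Gender = categorical(Gender);
Location = categorical(Location);

% Each comparison yields a 100-by-1 logical vector.
isCounty = Location == 'County General Hospital';
isFemale = Gender == 'Female';

% The element-wise AND keeps only rows that satisfy both conditions.
rows = isCounty & isFemale;

% nnz counts the true elements in each vector.
nCounty = nnz(isCounty)   % 39, matching the summary
nFemale = nnz(isFemale)   % 53, matching the summary
nBoth = nnz(rows)         % 19, the number of rows selected for T1
```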
Define the subset of variables.
vars = {'Age','Height','Weight'};
Use parentheses to create the subtable, T1.
T1 = T(rows,vars)
T1 =
                  Age    Height    Weight
                  ___    ______    ______
    Brown         49     64        119
    Taylor        31     66        132
    Anderson      45     68        128
    Lee           44     66        146
    Walker        28     65        123
    Young         25     63        114
    Campbell      37     65        135
    Evans         39     62        121
    Morris        43     64        135
    Rivera        29     63        130
    Richardson    30     67        141
    Cox           28     66        111
    Torres        45     70        137
    Peterson      32     60        136
    Ramirez       48     64        137
    Bennett       35     64        131
    Patterson     37     65        120
    Hughes        49     63        123
    Bryant        48     66        134
T1 is a 19-by-3 table.
Since ordinal categorical arrays have a mathematical ordering for their categories, you can perform element-wise comparisons of them with relational operations, such as greater than and less than.
Create a subtable, T2, of the gender, age, height, and weight of all patients who assessed their health status as poor or fair.
First, define the subset of rows to include in table T2.
rows = T.SelfAssessedHealthStatus<='Fair';
Then, define the subset of variables to include in table T2.
vars = {'Gender','Age','Height','Weight'};
Use parentheses to create the subtable T2.
T2 = T(rows,vars)
T2 =
                 Gender    Age    Height    Weight
                 ______    ___    ______    ______
    Johnson      Male      43     69        163
    Jones        Female    40     67        133
    Thomas       Female    42     66        137
    Jackson      Male      25     71        174
    Garcia       Female    27     69        131
    Rodriguez    Female    39     64        117
    Lewis        Female    41     62        137
    Lee          Female    44     66        146
    Hall         Male      25     70        189
    Hernandez    Male      36     68        166
    Lopez        Female    40     66        137
    Gonzalez     Female    35     66        118
    Mitchell     Male      39     71        164
    Campbell     Female    37     65        135
    Parker       Male      30     68        182
    Stewart      Male      49     68        170
    Morris       Female    43     64        135
    Watson       Female    40     64        127
    Kelly        Female    41     65        127
    Price        Male      31     72        178
    Bennett      Female    35     64        131
    Wood         Male      32     68        183
    Patterson    Female    37     65        120
    Foster       Female    30     70        124
    Griffin      Male      49     70        186
    Hayes        Male      48     66        177
T2 is a 26-by-4 table.
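The same relational operators can be combined to select a range of categories. The sketch below is an illustration on the workspace variable, not part of the original example; the names health, atMostFair, and fairOrGood are assumptions made here.

```matlab
% Create the ordinal categorical array from the sample data.
load patients
health = categorical(SelfAssessedHealthStatus,...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);

% Poor or Fair: the condition used to build T2.
atMostFair = health <= 'Fair';

% Fair or Good: combine a lower and an upper bound.
fairOrGood = health >= 'Fair' & health <= 'Good';

nAtMostFair = nnz(atMostFair)   % 26 = 11 Poor + 15 Fair
nFairOrGood = nnz(fairOrGood)   % 55 = 15 Fair + 40 Good
```

These comparisons work only because the array is ordinal. Applying < or > to a categorical array that is not ordinal produces an error, although == and ~= are still allowed.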