Create table from file
creates
a table by reading column oriented data from a file.T
= readtable(filename
)
readtable
determines the file format from
the file extension:
.txt
, .dat
,
or .csv
for delimited text files
.xls
, .xlsb
, .xlsm
, .xlsx
, .xltm
, .xltx
,
or .ods
for spreadsheet files
readtable
creates one variable in T
for
each column in the file and reads variable names from the first row
of the file. By default, the variables created are double
when
the entire column is numeric, or cell arrays of character vectors
when any element in a column is not numeric.
creates
a table from a file with additional options specified by one or more
name—value pair arguments.T
= readtable(filename
,Name,Value
)
For example, you can specify whether readtable
reads
the first row of the file as variable names or as data.
creates
a table using the import options and with additional options specified
by one or more of these name—value pair arguments: T
= readtable(filename
,opts
,Name,Value
)ReadVariableNames
, ReadRowNames
, DatetLocale
, FileEncoding
, Sheet
,
and Basic
.
Create a file, myCsvTable.dat
, that
contains the following comma-separated column oriented data.
LastName,Gender,Age,Height,Weight,Smoker Smith,M,38,71,176,1 Johnson,M,43,69,163,0 Williams,F,38,64,131,0 Jones,F,40,67,133,0 Brown,F,49,64,119,0
Create a table from the comma-separated text file. The
resulting table T
contains one variable for each
column in the file and readtable
treats the entries
in the first line of the file as variable names.
T = readtable('myCsvTable.dat')
T = LastName Gender Age Height Weight Smoker __________ ______ ___ ______ ______ ______ 'Smith' 'M' 38 71 176 1 'Johnson' 'M' 43 69 163 0 'Williams' 'F' 38 64 131 0 'Jones' 'F' 40 67 133 0 'Brown' 'F' 49 64 119 0
Create a file, mySpaceDelimTable.txt
,
that contains the following space delimited, column oriented data.
M 45 45 NY true F 41 32 CA false M 40 34 MA false
Create a table from the space delimited text file that does not contain variable names as column headings.
T = readtable('mySpaceDelimTable.txt',... 'Delimiter',' ','ReadVariableNames',false)
T = Var1 Var2 Var3 Var4 Var5 ____ ____ ____ ____ _______ 'M' 45 45 'NY' 'true' 'F' 41 32 'CA' 'false' 'M' 40 34 'MA' 'false'
T
contains default variable names.
Create a file, myCsvTable.dat
, that
contains the following comma-separated column oriented data.
LastName,Gender,Age,Height,Weight,Smoker Smith,M,38,71,176,1 Johnson,M,43,69,163,0 Williams,F,38,64,131,0 Jones,F,40,67,133,0 Brown,F,49,64,119,0
Create a table from the comma-separated text file. Import
the first two columns as character vectors, the third column as uint32
,
and the next two columns as double-precision, floating-point numbers.
Import the entries of the last column as character vectors.
T = readtable('myCsvTable.dat','Format','%s%s%u%f%f%s')
T = LastName Gender Age Height Weight Smoker __________ ______ ___ ______ ______ ______ 'Smith' 'M' 38 71 176 '1' 'Johnson' 'M' 43 69 163 '0' 'Williams' 'F' 38 64 131 '0' 'Jones' 'F' 40 67 133 '0' 'Brown' 'F' 49 64 119 '0'
The conversion specifiers are %s
for a cell
array of character vectors, %f
for double
,
and %u
for uint32
.
Read German dates from a file and add them to a table as English dates.
Create a sample file named myfile.txt
that
contains comma-separated values. The first column of values contains
dates in German and the second and third columns are numeric values.
fileID = fopen('myfile.txt','w','n','ISO-8859-15'); fprintf(fileID,'1 Januar 2014, 20.2, 100.5 \n'); fprintf(fileID,'1 Februar 2014, 21.6, 102.7 \n'); fprintf(fileID,'1 März 2014, 20.7, 99.8 \n'); fclose(fileID);
The sample file looks like this:
1 Januar 2014, 20.2, 100.5 1 Februar 2014, 21.6, 102.7 1 März 2014, 20.7, 99.8
Read the sample file using readtable
.
The conversion specifiers are %D
for a date and %f
for
floating-point values. Specify the file encoding using the FileEncoding
name-value
pair argument. Specify the format and locale of the dates using the DateLocale
name-value
pair argument.
T = readtable('myfile.txt','ReadVariableNames',false,... 'Format','%{dd MMMM yyyy}D %f %f',... 'FileEncoding','ISO-8859-15',... 'DateLocale','de_DE')
T = Var1 Var2 Var3 ________________ ____ _____ 01 January 2014 20.2 100.5 01 February 2014 21.6 102.7 01 March 2014 20.7 99.8
Create a table from a spreadsheet that contains variable names in the first row and row names in the first column.
T = readtable('patients.xls','ReadRowNames',true);
Display the first five rows and first four variables of the table.
T(1:5,1:4)
ans = Gender Age Location Height ________ ___ ___________________________ ______ Smith 'Male' 38 'County General Hospital' 71 Johnson 'Male' 43 'VA Hospital' 69 Williams 'Female' 38 'St. Mary's Medical Center' 64 Jones 'Female' 40 'VA Hospital' 67 Brown 'Female' 49 'County General Hospital' 64
View the DimensionNames
property of the table.
T.Properties.DimensionNames
ans = 1×2 cell array 'LastName' 'Variables'
'LastName'
is the name in the first column of the first row of the spreadsheet.
Create a table using data from a specified
region of the spreadsheet patients.xls
. Use the
data from the 5-by-3 rectangular region between the corners C2
and E6
.
Do not use the first row of this region as variable names.
T = readtable('patients.xls',... 'Range','C2:E6',... 'ReadVariableNames',false)
T = Var1 Var2 Var3 ____ ___________________________ ____ 38 'County General Hospital' 71 43 'VA Hospital' 69 38 'St. Mary's Medical Center' 64 40 'VA Hospital' 67 49 'County General Hospital' 64
T
contains default variable names.
Create import options, tailor the data types for multiple variables, and then read the data.
Create an import options object from a text file.
opts = detectImportOptions('airlinesmall.csv')
opts = DelimitedTextImportOptions with properties: Format Properties: Delimiter: {','} Whitespace: '\b\t ' LineEnding: {'\n' '\r' '\r\n'} CommentStyle: {} ConsecutiveDelimitersRule: 'split' LeadingDelimitersRule: 'keep' EmptyLineRule: 'skip' Encoding: 'US-ASCII' Replacement Properties: MissingRule: 'fill' ImportErrorRule: 'fill' ExtraColumnsRule: 'addvars' Variable Import Properties: Set types by name with setvartype VariableNames: {1×29 cell} VariableTypes: {1×29 cell} SelectedVariableNames: {1×29 cell} Location Properties: DataLine: 2 VariableNamesLine: 1 RowNamesColumn: 0 VariableUnitsLine: 0 VariableDescriptionsLine: 0 VariableOptions: [1×29 VariableOptions] Access VariableOptions sub-properties using setvaropts/getvaropts
Examine the Type
property of variables TaxiIn
and TaxiOut
.
getvaropts(opts,{'TaxiIn','TaxiOut'})
ans = 1×2 TextVariableImportOptions array with properties: WhitespaceRule Type FillValue Name QuoteRule TreatAsMissing
Change the type of the variables TaxiIn
and TaxiOut
to double
.
opts = setvartype(opts,{'TaxiIn','TaxiOut'},'double');
Specify the subset of variables to import and examine.
opts.SelectedVariableNames = {'TaxiIn','TaxiOut'};
Use the readtable
function along with the options object to import the selected variables. Display a summary of the table.
T = readtable('airlinesmall.csv',opts);
summary(T)
Variables: TaxiIn: 123523×1 double Values: min 0 median 5 max 1451 NaNs 37383 TaxiOut: 123523×1 double Values: min 0 median 13 max 755 NaNs 37364
Create import options for a spreadsheet file, specify the variables to import, and then read the data.
Create an import options object from a file.
opts = detectImportOptions('patients.xls')
opts = SpreadsheetImportOptions with properties: Sheet Properties: Sheet: '' Replacement Properties: MissingRule: 'fill' ImportErrorRule: 'fill' Variable Import Properties: Set types by name with setvartype VariableNames: {1×10 cell} VariableTypes: {1×10 cell} SelectedVariableNames: {1×10 cell} Range Properties: DataRange: 'A2' VariableNamesRange: 'A1' RowNamesRange: '' VariableUnitsRange: '' VariableDescriptionsRange: '' VariableOptions: [1×10 VariableOptions] Access VariableOptions sub-properties using setvaropts/getvaropts
Modify the options object to specify which variables to import.
opts.SelectedVariableNames = {'Systolic','Diastolic'};
Use readtable
along with the options object to import the specified variables.
T = readtable('patients.xls',opts);
summary(T)
Variables: Systolic: 100×1 double Values: min 109 median 122 max 138 Diastolic: 100×1 double Values: min 68 median 81.5 max 99
filename
— Name of file to readName of the file to read, specified as a character vector. If filename
includes
the file extension, then readtable
determines the
file format from the extension. Otherwise, you must specify the 'FileType'
name-value
pair arguments to indicate the type of file.
On Windows® systems with Microsoft® Excel® software, readtable
reads
any Excel spreadsheet file format recognized by your version
of Excel. If your system does not have Excel for Windows, readtable
operates
in basic
import mode, and reads only .xls,
.xlsx, .xlsm, .xltx, and .xltm
files.
For a delimited text file, readtable
converts
empty fields in the file to either NaN
(for a numeric
variable) or an empty character vector (for a text variable). All
lines in the text file must have the same number of delimiters. readtable
ignores
insignificant white space in the file.
Example: 'myFile.xlsx'
opts
— File import optionsSpreadsheetImportOptions
| DelimitedtextImportOptions
File import options, specified as a SpreadsheetImportOptions
object
or a DelimitedTextImportOptions
object created
by the detectImportOptions
function.
The opts
object contains properties that control
the data import process. For more information on the properties of
each object, see the appropriate object page.
Type of Files | Output |
---|---|
Spreadsheet files | SpreadsheetImportOptions object |
Text files | DelimitedTextImportOptions object |
Specify optional comma-separated pairs of Name,Value
arguments.
Name
is the argument
name and Value
is the corresponding
value. Name
must appear
inside single quotes (' '
).
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN
.
'ReadVariableNames',false
indicates
that the first row of the file does not correspond to variable names.When reading:
Text files, only these parameter names apply: FileType
, ReadVariableNames
, ReadRowNames
, TreatAsEmpty
, DatetimeType
, Delimiter
,
HeaderLines
, Format
, DateLocale
,
and FileEncoding
.
Spreadsheet files, only these parameter names apply:
FileType
, ReadVariableNames
, ReadRowNames
, TreatAsEmpty
, DatetimeType
, Sheet
,
Range
, Basic
, and TextType
.
Text or Spreadsheet files with the opts
import
options, only these parameter names apply: ReadVariableNames
, ReadRowNames
, DatetLocale
, FileEncoding
, Sheet
,
and Basic
.
'FileType'
— Type of file'text'
| 'spreadsheet'
Type of file, specified as the comma-separated pair consisting
of 'FileType'
and a character vector containing 'text'
or 'spreadsheet'
.
Specify the 'FileType'
name-value pair argument
when the filename
does not include the file extension
or if the extension is other than one of the following.
.txt
, .dat
,
or .csv
for delimited text files
.xls
, .xlsb
, .xlsm
, .xlsx
, .xltm
, .xltx
,
or .ods
for spreadsheet files
Example: 'FileType','text'
'ReadVariableNames'
— Read first row as variable namestrue
| false
| 1
| 0
Indicator for reading the first row as variable names, specified
as the comma-separated pair consisting of 'ReadVariableNames'
and
either true
, false
, 1
,
or 0
. If unspecified, readtable
automatically
detects the presence of variable names.
Indicator | Description |
---|---|
| Use when the first row of the region to read contains
the variable names for the table. |
| Use when the first row of the region to read contains
data in the table. |
unspecified | When left unspecified, readtable automatically
detects true or false and proceeds
accordingly. |
Note: If both the 'ReadVariableNames'
and 'ReadRowNames'
logical
indicators are true
, then readtable
saves
the name in the first column of the first row of the region to read
as the first dimension name in the property, T.Properties.DimensionNames
.
If you specify the ReadVariableNames
argument
in addition to opts
the import options, then the readtable
behavior
changes based on the specification:
If ReadVariableNames
is true
,
then read the variable names from the specified file by using the VariableNamesRange
or
the VariableNamesLine
property of the import options
object.
If ReadVariableNames
is false
,
then read the variable names from the VariableNames
property
of the import options object.
'ReadRowNames'
— Indicator for reading the first column as row namesfalse
(default) | true
| 0
| 1
Indicator for reading first column as row names, specified as
the comma-separated pair consisting of 'ReadRowNames'
and
either false
, true
, 0
,
or 1
.
Indicator | Description |
---|---|
| Use when the first column of the region to read contains data, and not the row names for the table. |
| Use when the first column of the region to read contains the row names for the table. |
unspecified | When left unspecified, readtable assumes false . |
Note: If both the 'ReadVariableNames'
and 'ReadRowNames'
logical
indicators are true
, then readtable
saves
the name in the first column of the first row of the region to read
as the first dimension name in the property, T.Properties.DimensionNames
.
If you specify the ReadRowNames
argument
in addition to opts
the import options , then the readtable
behavior
changes based on the specification:
If ReadRowNames
is true
,
then read the row names from the specified file by using the RowNamesRange
or
the RowNameColumn
property of the import options
object.
If ReadRowNames
is false
,
then do not import row names.
'TreatAsEmpty'
— Placeholder text to treat as empty valuePlaceholder text to treat as an empty value, specified as the
comma-separated pair consisting of 'TreatAsEmpty'
and
a single character vector or a cell array of character vectors. Table
elements corresponding to these characters are set to NaN
.
'TreatAsEmpty'
only applies to numeric columns
in the file, and readtable
does not accept numeric
literals, such as '-99'
.
Example: 'TreatAsEmpty','N/A'
sets N/A
within
numeric columns to NaN
.
Example: 'TreatAsEmpty',{'.','NA','N/A'}
sets .
, NA
and N/A
within
numeric columns to NaN
.
'TextType'
— Output data type of text'char'
(default) | 'string'
Output data type of text, specified as the comma-separated pair
consisting of 'TextType'
and either 'char'
or 'string'
.
If you specify the value 'char'
, then readtable
returns
text as a cell array of character vectors. If you specify the value 'string'
,
then readtable
returns text as an array of type string
.
'DatetimeType'
— Type for imported date and time data'datetime'
(default) | 'text'
| 'exceldatenum'
(spreadsheet files only)Type for imported date and time data, specified as the comma-separated
pair consisting of 'DatetimeType'
and one of these
values: 'datetime'
, 'text'
,
or 'exceldatenum'
. The value 'exceldatenum'
is
applicable only for spreadsheet files, and is not valid for text files.
Value | Type for Imported Date and Time Data |
---|---|
'datetime' | MATLAB® For more information, see |
'text' | Cell array of character vectors |
'exceldatenum' | Excel serial date numbers A serial date
number is a single number equal to the number of days from a given
reference date. Excel serial date numbers use a different reference
date than MATLAB serial date numbers. For more information on Excel dates,
see |
Example: 'DatetimeType','text'
'Delimiter'
— Field delimiter characterField delimiter character, specified as the comma-separated
pair consisting of 'Delimiter'
and one of the following
specifiers.
Specifier | Field Delimiter |
---|---|
| Comma |
| Space |
| Tab |
| Semicolon |
| Vertical bar |
unspecified | If unspecified, |
Example: 'Delimiter','space'
'HeaderLines'
— Lines to skipLines to skip at beginning of the file, specified as the comma-separated
pair consisting of 'HeaderLines'
and a positive
integer. If unspecified, readtable
automatically
detects the number of lines to skip.
Data Types: single
| double
'Format'
— Column formatColumn format of the file, specified as the comma-separated
pair consisting of 'Format'
and a character vector
of one or more conversion specifiers. The conversion specifiers are
the same as the specifiers accepted by the textscan
function.
Specifying the format can significantly improve speed for some
large files. If you do not specify a value for Format
,
then readtable
uses %q
to
interpret nonnumeric columns. The %q
specifier
reads the text and omits double quotation marks ("
)
if appropriate.
By default, the variables created are either double
or
cell array of character vectors, depending on the data. If the entire
column is numeric, variables are imported as double
.
If any element in a column is not numeric, the variables are imported
as cell arrays of character vectors.
If you specify the 'Format'
name-value pair
argument, you can also specify any of the name-value pair arguments
accepted by the textscan
function.
'DateLocale'
— Locale for reading datesLocale for reading dates, specified as the comma-separated pair
consisting of 'DateLocale'
and a character vector.
The character vector takes the form
where:xx
_YY
YY
is an uppercase ISO
3166-1 alpha-2 code indicating a country
xx
is a lowercase ISO 639-1
two-letter code indicating a language
For a list of common values for the locale, see the Locale
name-value
pair argument for the datetime
function.
When using the %D
format specifier to read
text as datetime
values, use DateLocale
to
specify the locale in which readtable
should interpret
month and day-of-week names and abbreviations.
If you specify the DateLocale
argument in
addition to opts
the import options, then the readtable
function
uses the specified value for DateLocale
argument,
overriding the locale defined in the import options.
Example: 'DateLocale','ja_JP'
'FileEncoding'
— Character encoding scheme'UTF-8'
| 'ISO-8859-1'
| 'windows-1251'
| 'windows-1252'
| ...Character encoding scheme associated with the file, specified
as the comma-separated pair consisting of 'FileEncoding'
and
one of these values.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
|
The default encoding is system-dependent.
If you specify the FileEncoding
argument
in addition to opts
the import options, then the readtable
function
uses the specified value for FileEncoding
argument,
overriding the encoding defined in the import options.
Data Types: char
'Sheet'
— Worksheet to read1
(default) | positive integer | character vectorWorksheet to read, specified as the comma-separated pair consisting
of 'Sheet'
and a positive integer indicating the
worksheet index or a character vector containing the worksheet name.
The worksheet name cannot contain a colon (:
).
To determine the names of sheets in a spreadsheet file, use [status,sheets]
= xlsfinfo(filename)
. For more information, see xlsfinfo
.
If you specify the Sheet
argument in addition
to opts
the import options, then the readtable
function
uses the specified value for Sheet
argument, overriding
the sheet name defined in the import options.
Example: 'Sheet'
, 2
'Range'
— Portion of worksheet to readPortion of the worksheet to read, indicated as a rectangular
area specified by a comma separated pair consisting of 'Range'
and
a character vector in one of the following forms.
Ways to specify Range | Description |
---|---|
Rectangular Range | Specify the range using the syntax Example: |
Unspecified or Empty | If unspecified, Example: Note: Used Range refers
to the rectangular portion of the spreadsheet that actually contains
data. |
Row Range | You can identify range by specifying the beginning and
ending rows using Excel row designators. Then Example: |
Column Range | You can identify range by specifying the beginning and
ending columns using Excel column designators. Then Example: |
Excel's Named Range | In Excel, you can create names to identify ranges
in the spreadsheet. For instance, you can select a rectangular portion
of the spreadsheet and call it Example: |
'Basic'
— Indicator for reading in basic
modetrue
| false
| 1
| 0
Indicator for reading in basic
mode, specified
as the comma-separated pair consisting of 'Basic'
and
either true
, false
, 1
,
or 0
. On Windows without Excel, Mac,
and Linux®, the only value allowed is the default value true
.
When operating in this mode, readtable
functionality
differs in the support of file formats and treatment of date and time
data.
Basic |
|
|
---|---|---|
Supported file formats |
|
|
Date/time import type and format | Text representing dates and times Dates and times, when represented with text in a character vector, are composed of fields related to a specific date and/or time. The level of detail in the imported dates and times depends on the values in the imported range:
To convert text representing dates and
times to the MATLAB date and time data type, see | Excel serial date numbers A serial date
number is a single number equal to the number of days from a given
reference date. Excel serial date numbers use a different reference
date than MATLAB serial date numbers. For more information on Excel dates,
see To
convert Excel serial date numbers to MATLAB date and time
data type, see |
Support for interactive features, such as formulas and macros | Yes | No |
T
— Output tableOutput table, returned as a table. The table can store metadata
such as descriptions, variable units, variable names, and row names.
For more information, see Table Properties
.
detectImportOptions
| Import Tool | table
| textscan
| writetable