fileDatastore

Create FileDatastore object for collections of custom files

Syntax

  • fds = fileDatastore(location,'ReadFcn',@fcn)
  • fds = fileDatastore(location,Name,Value)

Description

fds = fileDatastore(location,'ReadFcn',@fcn) creates a datastore from the collection of files specified by location and uses the function fcn to read the data from the files. A datastore is a repository for collections of data that are too large to fit in memory. After creating a FileDatastore object, you can read and process the data in various ways. See FileDatastore for more information.

fds = fileDatastore(location,Name,Value) specifies additional parameters for fds using one or more name-value pair arguments. For example, you can specify which files to include in the datastore depending on their extensions with fileDatastore(location,'ReadFcn',@customreader,'FileExtensions',{'.exts','.extx'}).

Examples

Create a datastore containing all .mat files within the MATLAB® demos folder, specifying the load function to read the file data.

fds = fileDatastore(fullfile(matlabroot,'toolbox','matlab','demos'),'ReadFcn',@load,'FileExtensions','.mat')
fds = 

  FileDatastore with properties:

      Files: {
             ' ...\matlab\toolbox\matlab\demos\accidents.mat';
             ' ...\matlab\toolbox\matlab\demos\airfoil.mat';
             ' ...\matlab\toolbox\matlab\demos\airlineResults.mat'
              ... and 35 more
             }
    ReadFcn: @load

Read the first file in the datastore, and then read the second file.

data1 = read(fds);
data2 = read(fds);

Read all files in the datastore with a single call and store the result.

dataall = readall(fds);

Initialize a cell array to hold the data and a counter i.

dataarray = cell(numel(fds.Files), 1);
i = 1;

Reset the datastore to the first file and read the files one at a time until there is no data left. Assign the data to the array dataarray.

reset(fds);
while hasdata(fds)
    dataarray{i} = read(fds);
    i = i+1;
end

Input Arguments

Files or folders to include in the datastore, specified as a character vector or cell array of character vectors. If the files are not in the current folder, then location must specify full or relative paths. Files within subfolders of the specified folder are not automatically included in the datastore.

If the files are not available locally, then the full path of the files or folders must be an internationalized resource identifier (IRI) of the form
hdfs://hostname:portnumber/path_to_file.

Before reading from HDFS™, set the HADOOP_HOME, HADOOP_PREFIX, or MATLAB_HADOOP_INSTALL environment variable to the folder where Hadoop® is installed. For more information, see Read from HDFS.

    Note:   When reading from HDFS or when reading Sequence files locally, the datastore function calls the javaaddpath command. This command does the following:

    • Clears the definitions of all Java® classes defined by files on the dynamic class path

    • Removes all global variables and variables from the base workspace

    • Removes all compiled scripts, functions, and MEX-functions from memory

    To prevent persistent variables, code files, or MEX-files from being cleared, use the mlock function.

You can use the wildcard character (*) when specifying location. This character indicates that all matching files or all files in the matching folders are included in the datastore.

Example: 'file1.ext'

Example: '../dir/data/file1.ext'

Example: {'C:\dir\data\file1.exts','C:\dir\data\file2.extx'}

Example: 'C:\dir\data\*.ext'

Data Types: char | cell
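
The wildcard form of location can be tried on a throwaway folder. The following is a minimal sketch, not part of the original reference page: the folder comes from tempname, the file names are hypothetical, and fileread stands in for a custom read function.

```matlab
% Build a temporary folder holding three small files (hypothetical names).
dataDir = tempname;
mkdir(dataDir);
names = {'file1.ext','file2.ext','notes.txt'};
for k = 1:numel(names)
    fid = fopen(fullfile(dataDir,names{k}),'w');
    fprintf(fid,'sample %d',k);
    fclose(fid);
end

% The wildcard limits the datastore to files whose names match *.ext.
fds = fileDatastore(fullfile(dataDir,'*.ext'),'ReadFcn',@fileread);
numMatched = numel(fds.Files);   % notes.txt is not counted

% Remove the temporary folder and its contents.
rmdir(dataDir,'s');
```

Only the two .ext files should appear in fds.Files; notes.txt does not match the pattern.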

Function that reads the file data, specified as a function handle. At a minimum, the function must accept a file name as input and return the corresponding file data. For example, if customreader is the specified function to read the file, then it must have a signature similar to the following:

function data = customreader(filename)
...
end

If there is more than one output argument, then the datastore uses only the first argument and ignores the rest.

Example: @customreader

Data Types: function_handle
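
As an illustration of the read-function contract, here is a minimal sketch that uses an anonymous function in place of a function file. The reader name customreader, the file sample.txt, and its comma-separated contents are assumptions made for this example only.

```matlab
% Write one small text file to read back (hypothetical folder and file name).
dataDir = tempname;
mkdir(dataDir);
fid = fopen(fullfile(dataDir,'sample.txt'),'w');
fprintf(fid,'alpha,beta,gamma');
fclose(fid);

% Hypothetical reader: accepts a file name and returns the
% comma-separated fields of that file as a cell array.
customreader = @(filename) strsplit(fileread(filename),',');

% The datastore calls customreader once for each file it reads.
fds = fileDatastore(dataDir,'ReadFcn',customreader);
fields = read(fds);

% Remove the temporary folder and its contents.
rmdir(dataDir,'s');
```

Because read returns whatever the read function returns, fields is a cell array here rather than raw text.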

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: fds = fileDatastore('C:\dir\data','ReadFcn',@customreader,'FileExtensions',{'.exts','.extx'})

Subfolder inclusion flag, specified as the comma-separated pair consisting of 'IncludeSubfolders' and true, false, 0, or 1. Specify true to include all files and subfolders within each folder or false to include only the files within each folder.

If you do not specify 'IncludeSubfolders', then the default value is false.

Example: 'IncludeSubfolders',true

Data Types: logical | double
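
The effect of 'IncludeSubfolders' can be checked by comparing file counts. This is a rough sketch under assumed names: the folder layout (one file at the top level, one in a subfolder called nested) is hypothetical, and fileread again stands in for a custom read function.

```matlab
% Create a top-level folder with one nested subfolder (hypothetical layout).
topDir = tempname;
subDir = fullfile(topDir,'nested');
mkdir(topDir);
mkdir(subDir);

% Place one file at the top level and one inside the subfolder.
paths = {fullfile(topDir,'a.txt'), fullfile(subDir,'b.txt')};
for k = 1:numel(paths)
    fid = fopen(paths{k},'w');
    fprintf(fid,'%d',k);
    fclose(fid);
end

% Default behavior: only files directly inside topDir are included.
fdsTop = fileDatastore(topDir,'ReadFcn',@fileread);

% With the flag set, files in subfolders are included as well.
fdsAll = fileDatastore(topDir,'ReadFcn',@fileread,'IncludeSubfolders',true);

numTop = numel(fdsTop.Files);
numAll = numel(fdsAll.Files);

% Remove the temporary folder tree.
rmdir(topDir,'s');
```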

Custom format file extensions, specified as the comma-separated pair consisting of 'FileExtensions' and a character vector or cell array of character vectors. You can use '' to represent files without extensions. If you do not specify 'FileExtensions', then fileDatastore automatically includes all files within a folder.

Example: 'FileExtensions','.ext'

Example: 'FileExtensions',{'.exts','.extx'}

Data Types: char | cell
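
To see the extension filter at work, the sketch below fills a temporary folder with files of mixed extensions and keeps only two of them. The file names and the extensions .exts and .extx are placeholders borrowed from the examples above, not real formats.

```matlab
% Create a temporary folder with files of several extensions (hypothetical names).
dataDir = tempname;
mkdir(dataDir);
names = {'a.exts','b.extx','c.txt'};
for k = 1:numel(names)
    fid = fopen(fullfile(dataDir,names{k}),'w');
    fprintf(fid,'sample %d',k);
    fclose(fid);
end

% Keep only the files whose extensions are listed in 'FileExtensions'.
fds = fileDatastore(dataDir,'ReadFcn',@fileread, ...
    'FileExtensions',{'.exts','.extx'});
numKept = numel(fds.Files);

% Collect the extensions of the files that were kept.
[~,~,keptExts] = cellfun(@fileparts,fds.Files,'UniformOutput',false);

% Remove the temporary folder and its contents.
rmdir(dataDir,'s');
```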

Output Arguments

Datastore for custom file collections, returned as a FileDatastore object. The Files property is a cell array of character vectors, where each character vector is an absolute path to a file resolved from the location argument. See FileDatastore for more information.

Introduced in R2016a