write

Write tall array to disk for checkpointing

Syntax

write(location,tA)

Description

write(location,tA) calculates the values in tall array tA and then writes the array to files in the folder specified by location. The data is stored in an efficient binary format suitable for reading back using datastore(location).

Examples

Write and Reconstruct Tall Array

Write a tall array to disk, then recover it by creating a new datastore for the written files. This process is useful for saving your work or sharing a tall array with a colleague.

Create a datastore for the airlinesmall.csv data set. Select only the Month, Year, and UniqueCarrier variables, and treat 'NA' values as missing data. Convert the datastore into a tall table.

ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};
tt = tall(ds)

tt =

  M×3 tall table 

    Month    Year    UniqueCarrier
    _____    ____    _____________

    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    :        :       :
    :        :       :

Sort the data in descending order by year and extract the top 25 rows. The resulting tall table is unevaluated.

tt_new = topkrows(tt,25,'Year')

tt_new =

  M×3 tall table 

    Month    Year    UniqueCarrier
    _____    ____    _____________

    ?        ?       ?            
    ?        ?       ?            
    ?        ?       ?            
    :        :       :
    :        :       :

Save the results to a new folder named ExampleData on the C:\ disk. (You might want to specify a different write location, especially if you are not using a Windows® computer.) The write function evaluates the tall array prior to writing the files, so there is no need to use the gather function prior to saving the data.

location = 'C:\ExampleData';
write(location,tt_new)

Writing tall data to folder C:\ExampleData
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 0 sec
Evaluation completed in 0 sec

Clear tt and ds from the workspace. To recover the tall table that was written to disk, first create a new datastore that references the same folder. Then convert the datastore into a tall table. Because the tall table was evaluated before being written to disk, the display now includes a preview of the values.

clear tt ds
ds2 = datastore(location);
tt2 = tall(ds2)

tt2 =

  M×3 tall table 

    Month    Year    UniqueCarrier
    _____    ____    _____________

    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    :        :       :
    :        :       :
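
As a possible follow-up, the recovered table can be brought into memory to confirm that the round trip preserved the data. This sketch assumes the example above has already been run, so that the files exist in the folder named by location. The variable name t is chosen here for illustration.

```matlab
% Assumes the files written by the example above exist in this folder.
location = 'C:\ExampleData';
tt2 = tall(datastore(location));

% The table holds only the rows selected by topkrows, so it fits in memory.
t = gather(tt2);

% Inspect the size of the in-memory table.
size(t)
```

Use gather in this way only when the result is small enough to fit in memory, as it is here.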

Input Arguments

`location` — Folder location to write data
character vector | string

Folder location to write data, specified as a character vector or string. location can specify a full or relative path. The specified folder can be either of these options:

Existing folder that contains no files
New folder that write creates
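
One way to check these conditions before calling write is sketched below. The folder name and the variable folderIsEmpty are hypothetical, and the check itself is an illustration rather than something write requires you to do.

```matlab
% Hypothetical output folder; replace with your own path.
location = fullfile(tempdir,'ExampleData');

if exist(location,'dir') == 7
    % dir can list '.' and '..' for a folder, so ignore those entries.
    listing = dir(location);
    names = {listing.name};
    folderIsEmpty = all(ismember(names,{'.','..'}));
else
    % The folder does not exist yet, so write can create it.
    folderIsEmpty = true;
end

if ~folderIsEmpty
    warning('Folder %s already contains files. Choose another location.',location)
end
```

If folderIsEmpty is true, location meets the conditions listed above and can be passed to write.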

Additional considerations apply for Hadoop® and Apache Spark™:

If the folder is not available locally, then the full path of the folder must be an internationalized resource identifier (IRI) of the form:
hdfs://hostname:portnumber/path_to_file.
Before writing to HDFS™, set the HADOOP_HOME, HADOOP_PREFIX, or MATLAB_HADOOP_INSTALL environment variable to the folder where Hadoop is installed.
Before writing to Apache Spark, set the SPARK_HOME environment variable to the folder where Apache Spark is installed.

Example: location = 'hdfs://myHadoopCluster/some/output/folder'

Example: location = '../../dir/data'

Example: location = 'C:\Users\MyName\Desktop'

Data Types: char | string
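
The Hadoop considerations above might look like the following in practice. This is a sketch only: the installation folder /usr/lib/hadoop is a hypothetical value, the cluster address is taken from the example above, and tA is assumed to be a tall array that already exists in the workspace.

```matlab
% Hypothetical Hadoop installation folder; substitute your own.
setenv('HADOOP_HOME','/usr/lib/hadoop');

% Cluster address from the example above; substitute your own.
location = 'hdfs://myHadoopCluster/some/output/folder';

% tA is assumed to be an existing tall array in the workspace.
write(location,tA)
```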

`tA` — Input array
tall array

Input array, specified as a tall array.

More About

Tips

Use the write function to create checkpoints or snapshots of your data as you work, especially when working with huge data sets. This practice allows you to reconstruct tall arrays directly from files on disk rather than reexecuting all of the commands that produced the tall array.
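
A minimal checkpointing sketch, assuming the airlinesmall.csv data set used in the example above and a hypothetical folder name: if the checkpoint folder already exists, the tall table is rebuilt from the files on disk. Otherwise, the table is computed, written, and then reloaded. The sketch also assumes that an existing checkpoint folder was produced by an earlier call to write and is not empty.

```matlab
% Hypothetical checkpoint folder; replace with your own path.
checkpointDir = fullfile(tempdir,'tallCheckpoint');

if exist(checkpointDir,'dir') == 7
    % Checkpoint found: rebuild the tall table from the written files.
    tt = tall(datastore(checkpointDir));
else
    % No checkpoint yet: run the processing steps, then save the result.
    ds = datastore('airlinesmall.csv');
    ds.TreatAsMissing = 'NA';
    ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};
    tt = topkrows(tall(ds),25,'Year');

    % write evaluates the tall table before saving it.
    write(checkpointDir,tt)

    % Continue from the evaluated copy on disk.
    tt = tall(datastore(checkpointDir));
end
```

Reloading from the checkpoint after writing means later operations start from the stored values rather than from the original chain of unevaluated commands.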

Tall Arrays

Introduced in R2016b
