The datastore function creates a datastore, which is a repository for collections of data that are too large to fit in memory. A datastore allows you to read and process multiple files as a single entity. If the files are too large to fit in memory, you can manage the incremental import of data, create a tall array to work with the data, or use the datastore as an input to mapreduce for further processing. For more information, see Getting Started with Datastore.
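As a rough illustration, the following sketch creates a datastore for a folder of delimited text files and reads a small portion of the data; the folder name myCSVData is a placeholder for your own collection of files.

% Treat all CSV files in the folder as one dataset.
ds = datastore(fullfile('myCSVData','*.csv'));
preview(ds)      % look at a small sample without importing everything
t = read(ds);    % read one chunk of the data as a table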
tabularTextDatastore | Create TabularTextDatastore object for collections of tabular text data
imageDatastore | Create ImageDatastore object for collections of image data
spreadsheetDatastore | Create SpreadsheetDatastore object for collections of spreadsheet data
fileDatastore | Create FileDatastore object for collections of custom files
datastore | Create datastore for large collections of data
TabularTextDatastore | Datastore for tabular text files
ImageDatastore | Datastore for image data
SpreadsheetDatastore | Datastore for spreadsheet files
KeyValueDatastore | Datastore for key-value pair data
FileDatastore | Datastore for custom format files
TallDatastore | Datastore for checkpointing tall arrays
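As a quick sketch of two of these constructors (the folder names are placeholders), imageDatastore points at a folder of images, while fileDatastore needs a ReadFcn that can read one of your custom-format files:

% Image collection: one datastore entry per image file.
imds = imageDatastore('myImages','IncludeSubfolders',true);

% Custom files: here every MAT-file is read with the load function.
fds = fileDatastore('myMatFiles','ReadFcn',@load,'FileExtensions','.mat');
data = readall(fds);   % cell array with one element per file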
Getting Started with Datastore
A datastore is an object for reading a single file or a collection of files or data.
Read and Analyze Large Tabular Text File
This example shows how to create a datastore for a large text file containing tabular data, and then read and process the data one chunk at a time or one file at a time.
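A minimal sketch of the chunk-at-a-time pattern, assuming a large text file bigdata.csv with a numeric variable named Delay (both placeholders):

ds = tabularTextDatastore('bigdata.csv');
ds.SelectedVariableNames = {'Delay'};      % import only the variable you need
total = 0;
n = 0;
while hasdata(ds)
    t = read(ds);                          % one chunk as a table
    total = total + sum(t.Delay,'omitnan');
    n = n + sum(~isnan(t.Delay));
end
avgDelay = total/n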
Read and Analyze Image Files
This example shows how to create a datastore for a collection of images, read the image files, and find the images with the maximum average hue, saturation, and brightness (HSV).
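A rough sketch of that workflow, assuming a folder of RGB JPEG images named myImages (a placeholder):

imds = imageDatastore('myImages','FileExtensions','.jpg');
meanHSV = zeros(numel(imds.Files),3);
for k = 1:numel(imds.Files)
    hsv = rgb2hsv(readimage(imds,k));             % convert one image to HSV
    meanHSV(k,:) = squeeze(mean(mean(hsv,1),2))'; % average H, S, and V
end
[~,idx] = max(meanHSV(:,1));                      % image with maximum average hue
imds.Files{idx}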
Read and Analyze MAT-File with Key-Value Data
This example shows how to create a datastore for key-value pair data in a MAT-file that is the output of mapreduce.
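A minimal sketch, assuming mapredout.mat is a MAT-file previously produced by mapreduce (the file name is a placeholder):

kvds = datastore('mapredout.mat','Type','keyvalue');
kv = readall(kvds)    % table with Key and Value variables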
Read and Analyze Hadoop Sequence File
This example shows how to create a datastore for a sequence file containing key-value data. You can create a datastore for a collection of text files or sequence files that reside on the Hadoop® Distributed File System (HDFS™) using the datastore function.
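A sketch only; the HDFS path below is a placeholder, and reading it requires a configured Hadoop environment:

ds = datastore('hdfs:///myData/output/*.seq', ...
    'Type','keyvalue','FileType','seq');
preview(ds)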
Compute Maximum Average HSV of Images with MapReduce
This example shows how to use ImageDatastore and mapreduce to find images with maximum hue, saturation, and brightness values in an image collection.
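A rough sketch of that pattern, with a placeholder folder name and a simplified mapper and reducer that track only the maximum average brightness (the V channel):

imds = imageDatastore('myImages');
outds = mapreduce(imds, @brightnessMapper, @maxReducer);
readall(outds)

function brightnessMapper(data, ~, intermKVStore)
    % data is one image; emit its mean brightness under a single key
    hsv = rgb2hsv(data);
    v = hsv(:,:,3);
    add(intermKVStore, 'maxBrightness', mean(v(:)));
end

function maxReducer(key, valueIter, outKVStore)
    % Keep the largest brightness value seen for this key
    m = -inf;
    while hasnext(valueIter)
        m = max(m, getnext(valueIter));
    end
    add(outKVStore, key, m);
end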
Tall Arrays
Learn about tall arrays and perform an example calculation.
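A minimal sketch of the tall-array workflow, again assuming a large text file bigdata.csv with a numeric variable Delay (both placeholders):

ds = tabularTextDatastore('bigdata.csv');
tt = tall(ds);                    % tall table backed by the datastore
m  = mean(tt.Delay,'omitnan');    % deferred: nothing is read yet
gather(m)                         % evaluates the computation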