To create an IMDC with the API, use the method InMemoryDataCollectionRepository::createInMemoryDataCollection, specifying the IMDC name, schema, and (optionally) amount of memory (MB) to use for each partition.
The schema specification is a list of either SQL data types or column definitions. A column definition is a column name followed by its SQL data type. If you specify a list of n SQL data types, the columns are assigned the names Col<0>, Col<1>, …, Col<n-1>.
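The positional naming rule for a type-only schema can be sketched as follows. This is a minimal illustration, not part of the Aster API: the class `DefaultColumnNames` and its method are hypothetical, the SQL types are stood in for by plain strings, and the sketch assumes zero-based numbering (so n types get indices 0 through n-1) and that the angle brackets in the text are placeholder notation rather than literal characters in the name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of the Aster API: models how positional
// default names could be derived for a schema given only as data types.
public class DefaultColumnNames {

    // Returns one name per type, numbered from 0 to n-1.
    // ASSUMPTION: the generated name is "Col" followed by the index.
    public static List<String> defaultNames(List<String> sqlTypes) {
        List<String> names = new ArrayList<>(sqlTypes.size());
        for (int i = 0; i < sqlTypes.size(); i++) {
            names.add("Col" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        // Type names are plain strings here, standing in for SqlType values.
        List<String> types = Arrays.asList("integer", "varchar", "date");
        System.out.println(defaultNames(types)); // prints [Col0, Col1, Col2]
    }
}
```

If the generated names matter to downstream code, specifying the schema with column definitions avoids relying on this convention.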
The default amount of memory to use for each partition is 10 MB. If you add more data to the IMDC than fits in the partitions, data overflows to disk.
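As a rough mental model of the per-partition limit, the sketch below estimates how much data would not fit in a partition's memory budget. It is not part of the Aster API: the class `PartitionMemoryEstimate` is hypothetical, and it assumes 1 MB = 1024 × 1024 bytes and ignores any per-row or bookkeeping overhead the product may add.

```java
// Hypothetical model, NOT part of the Aster API.
public class PartitionMemoryEstimate {

    // Default per-partition memory stated in the text, in MB.
    public static final int DEFAULT_MEMORY_MB = 10;

    // Returns the number of bytes that exceed the partition's memory budget.
    // ASSUMPTION: 1 MB = 1024 * 1024 bytes and no per-row overhead.
    public static long overflowBytes(long dataBytes, int memoryInMB) {
        long capacityBytes = (long) memoryInMB * 1024L * 1024L;
        return Math.max(0L, dataBytes - capacityBytes);
    }

    public static void main(String[] args) {
        long twelveMB = 12L * 1024L * 1024L;
        // With the 10 MB default, 2 MB of a 12 MB load would overflow.
        System.out.println(overflowBytes(twelveMB, DEFAULT_MEMORY_MB)); // prints 2097152
        // With a 16 MB budget, the same load fits entirely in memory.
        System.out.println(overflowBytes(twelveMB, 16)); // prints 0
    }
}
```

An estimate like this can help decide whether to pass an explicit memory value when creating the IMDC instead of relying on the default.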
You cannot create an IMDC during contract negotiation.
-
Import the necessary Java classes:
import java.util.List;
import java.util.ArrayList;
import com.asterdata.ncluster.sqlmr.RowFunction;
import com.asterdata.ncluster.sqlmr.data.RowEmitter;
import com.asterdata.ncluster.sqlmr.data.RowIterator;
import com.asterdata.ncluster.sqlmr.InMemoryDataCollectionRepository;
import com.asterdata.ncluster.sqlmr.data.InMemoryDataCollection;
import com.asterdata.ncluster.sqlmr.data.SqlType;
-
[Optional] If you want to specify the schema with a list of column definitions, import the ColumnDefinition class:
import com.asterdata.ncluster.sqlmr.data.ColumnDefinition;
-
Specify the IMDC name:
...
public void operateOnSomeRows(RowIterator inputIterator, RowEmitter outputEmitter) {
    ...
    // Specify the IMDC name
    String IMDC_name = "IMDC_name";
    ...
-
Specify the IMDC schema:
- To specify the schema with a list of SQL data types:
List<SqlType> IMDC_schema = new ArrayList<SqlType>(n);
IMDC_schema.add(SqlType.type_0());
IMDC_schema.add(SqlType.type_1());
...
IMDC_schema.add(SqlType.type_n());
- To specify the schema with a list of column definitions:
List<ColumnDefinition> IMDC_schema = new ArrayList<ColumnDefinition>(n);
IMDC_schema.add(new ColumnDefinition("column_name_0", SqlType.type_0()));
IMDC_schema.add(new ColumnDefinition("column_name_1", SqlType.type_1()));
...
IMDC_schema.add(new ColumnDefinition("column_name_n", SqlType.type_n()));
-
Create the IMDC:
- To use the default amount of memory for each partition:
InMemoryDataCollection myIMDC =
    InMemoryDataCollectionRepository.createInMemoryDataCollection(
        IMDC_name, IMDC_schema);
...
}
- To specify the amount of memory for each partition:
// Specify the amount of memory (MB) to use for each partition
int memoryInMB = m;
// Create the IMDC
InMemoryDataCollection myIMDC =
    InMemoryDataCollectionRepository.createInMemoryDataCollection(
        IMDC_name, IMDC_schema, memoryInMB);
...
}
The IMDC now exists in the appendable state, and you can add data to it.