When designing a table for the Object File System on a VantageCloud Lake system, keep in mind the considerations and tradeoffs described in Object File System Considerations, and pay attention to the following items:
- An Object File System table has no primary index (NoPI), does not support a partitioned primary index, and is always defined as multiset.
  The Object File System DDL supports an ORDER BY clause that provides functionality similar to that of a partitioned primary index in the file system.
- There is less need for secondary indexes:
  - Use of Parquet (a columnar format) in the Object File System:
    Columnar format offers the ability to partition a table by column. In its simplest form, each column in the table becomes its own column partition. Values from the same column partition, taken from multiple logical rows, are packed together into a physical space where they can be easily accessed together. A column partition in a columnar table offers advantages similar to those of a secondary index.
  - Internal indexes:
    When a table is loaded into the Object File System, metadata structures similar to indexes are generated to capture the range of values for specified columns in each object. There may be many such metadata structures for a given Object File System table. These system-defined indexes can be used to speed up access, replacing the need for additional user-defined indexes.
- Although Object File System tables are not supported for a multiple-table join index, you can create a single-table join index (STJI) on an Object File System table. See Join Index.
- No need for compression.
- See Data Storage for information about data placement options in different storage types to achieve a balance of performance and cost: Block Storage on the primary cluster or using the Object File System.
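The internal indexes mentioned in the list above record the range of values for specified columns in each object. The following Python sketch illustrates the general idea of using such per-object minimum and maximum values to skip objects during a range query. It is a conceptual illustration only, not the Object File System implementation: the `StoredObject` class, the `build_object` and `scan_with_pruning` functions, and the `order_id` column are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    """One stored object plus the min/max of a tracked column (hypothetical)."""
    name: str
    min_value: int
    max_value: int
    rows: list


def build_object(name, rows, column):
    """Capture the value range of `column` when the object is written."""
    values = [row[column] for row in rows]
    return StoredObject(name, min(values), max(values), rows)


def scan_with_pruning(objects, column, low, high):
    """Return matching rows and the names of the objects actually read."""
    matches, objects_read = [], []
    for obj in objects:
        # Skip the object when its value range cannot overlap [low, high].
        if obj.max_value < low or obj.min_value > high:
            continue
        objects_read.append(obj.name)
        matches.extend(r for r in obj.rows if low <= r[column] <= high)
    return matches, objects_read


# Three objects holding non-overlapping ranges of the assumed order_id column.
objects = [
    build_object("obj_a", [{"order_id": i} for i in range(1, 101)], "order_id"),
    build_object("obj_b", [{"order_id": i} for i in range(101, 201)], "order_id"),
    build_object("obj_c", [{"order_id": i} for i in range(201, 301)], "order_id"),
]

rows, read = scan_with_pruning(objects, "order_id", 120, 130)
print(len(rows), read)
```

In this sketch only one of the three objects is read for the query range 120 to 130. Pruning of this kind works best when the data is ordered on the queried column so that the per-object ranges overlap as little as possible.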
Row-Based Format and Column-Based Format
- Columnar format typically achieves a higher degree of compression than row format.
- Columnar format allows more granular partitioning, which leads to more effective partition elimination.
- Row format is a better choice for row-at-a-time updates.
- Tactical (PI access) applications perform more efficiently when the table is in row format.
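One reason columnar format tends to compress better is that values from the same column, which are often repetitive, are stored next to each other. The Python sketch below illustrates this with a simple run-length count over a small, made-up set of rows. It is a toy model under stated assumptions, not a description of how the Object File System or Parquet encodes data: the sample rows and the `run_count` helper are hypothetical, and real encodings are more elaborate.

```python
from itertools import groupby


def run_count(values):
    """Number of runs a run-length encoder would store for this sequence."""
    return sum(1 for _ in groupby(values))


# Hypothetical sample rows: (id, country, status).
rows = [
    (1, "US", "open"),
    (2, "US", "open"),
    (3, "US", "closed"),
    (4, "DE", "closed"),
]

# Row layout: values of different columns are interleaved in storage.
row_layout = [value for row in rows for value in row]

# Columnar layout: each column's values are stored contiguously.
column_layout = [list(column) for column in zip(*rows)]

row_runs = run_count(row_layout)
column_runs = sum(run_count(column) for column in column_layout)
print(row_runs, column_runs)
```

For this sample, the columnar layout needs fewer runs than the row layout because repeated country and status values sit side by side. The same grouping makes a single-row update touch several column partitions, which is consistent with row format being the better fit for row-at-a-time updates.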