Vertica Storage Model
Vertica implements a unique data storage model, as shown in the image below. The same model is used on every Vertica node.
It is built on three components:
- WOS (Write-Optimized Store)
- ROS (Read-Optimized Store)
- Tuple Mover
Let's discuss each component in detail.
Write Optimized Store (WOS)
- The WOS is a memory-resident data storage structure optimized for low-latency data loading.
- WOS efficiently supports INSERT, UPDATE, DELETE, and COPY operations (without the DIRECT hint) by storing the data in memory.
- Records in WOS are stored without data compression or indexing to support faster loading.
- WOS is not optimized for read operations because the data (in a projection) is sorted only when it is queried.
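The trade-off described above can be illustrated with a toy model. This is only a sketch, not Vertica's implementation: the class name `ToyWOS` and its methods are hypothetical, and rows are plain Python tuples. It shows that appends are cheap because nothing is sorted or compressed, while an ordered read has to sort at query time.

```python
class ToyWOS:
    """Toy in-memory write store (hypothetical, for illustration only)."""

    def __init__(self):
        # Rows are kept unsorted, uncompressed, and unindexed.
        self.rows = []

    def insert(self, row):
        # Constant-time append: no sort, no compression, no index upkeep.
        self.rows.append(row)

    def scan_sorted(self, key):
        # The sort cost is paid only when a query needs ordered data.
        return sorted(self.rows, key=key)


wos = ToyWOS()
for row in [(3, "c"), (1, "a"), (2, "b")]:
    wos.insert(row)

print(wos.rows)                              # rows in arrival order
print(wos.scan_sorted(key=lambda r: r[0]))   # rows sorted at read time
```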
Read Optimized Store (ROS)
- ROS is a highly optimized, read-oriented, disk storage structure.
- Most of the data in a Vertica cluster stays in ROS.
- ROS is optimized for fast reads because its data is sorted, indexed, and compressed.
- ROS data is stored in groups of files called ROS containers.
- A container is just a set of rows (a file) created by moveout, mergeout, DML statements, or COPY DIRECT statements.
- Data can be loaded directly into ROS using COPY DIRECT statements, or INSERT, UPDATE, and DELETE statements with the /*+DIRECT*/ hint.
Vertica is optimized for both writes and reads because it has two different data storage structures (WOS and ROS).
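To see why sorted, compressed storage helps reads, here is a minimal sketch of a single column stored the way the ROS bullets describe. It is an assumption-laden toy: run-length encoding stands in for "compressed", a binary search over the sorted runs stands in for "indexed", and the function names `build_container` and `count_of` are hypothetical.

```python
from bisect import bisect_left
from itertools import groupby


def build_container(values):
    """Sort a column, then run-length encode it as (value, run_length) pairs."""
    ordered = sorted(values)
    return [(value, len(list(group))) for value, group in groupby(ordered)]


def count_of(container, value):
    """Binary-search the sorted runs and return how many rows hold `value`."""
    keys = [v for v, _ in container]
    i = bisect_left(keys, value)
    if i < len(keys) and keys[i] == value:
        return container[i][1]
    return 0


column = ["US", "IN", "US", "DE", "IN", "US"]
container = build_container(column)

print(container)                   # [('DE', 1), ('IN', 2), ('US', 3)]
print(count_of(container, "IN"))   # 2
```

Six values collapse into three runs, and a lookup touches only the run list rather than every row. Both savings depend on the data being sorted first.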
Tuple Mover
- Vertica moves data from WOS to ROS using the Tuple Mover.
- The Tuple Mover runs in the background, performing some tasks automatically at time intervals determined by its configuration parameters.
- The Tuple Mover performs two operations:
Moveout:
- During moveout operations, the Tuple Mover moves data from memory (WOS) to a new ROS container.
- The ROS container setup allows for faster movement of data from WOS to ROS because newer data doesn't need to be merged with existing ROS data immediately.
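A moveout can be sketched as follows. This is a simplified model under stated assumptions: the WOS is a plain list, each ROS container is a sorted list, and the `moveout` function is hypothetical. The point it illustrates is that the drained rows become a new container, so existing containers are never rewritten.

```python
def moveout(wos, ros_containers, key):
    """Drain the WOS into one new sorted container; leave old containers alone."""
    if not wos:
        return
    # Sort once, write as a brand-new container, then empty the WOS.
    ros_containers.append(sorted(wos, key=key))
    wos.clear()


ros = [[(1, "a"), (4, "d")]]   # one existing ROS container
wos = [(3, "c"), (2, "b")]     # unsorted rows waiting in memory

moveout(wos, ros, key=lambda r: r[0])

print(wos)   # []
print(ros)   # [[(1, 'a'), (4, 'd')], [(2, 'b'), (3, 'c')]]
```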
Mergeout:
- During mergeout operations, the Tuple Mover combines small ROS containers created by moveout operations or COPY DIRECT statements over time into larger ones.
- Vertica keeps data from different partitions separate on disk. During mergeout, it adheres to this policy and never merges ROS containers that belong to different partitions.
- It also purges data that is marked for deletion.
- Check here for more detail on mergeout.
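The mergeout rules above can be sketched as a toy function. The assumptions are loud ones: each container is a `(partition, sorted_rows)` pair, deleted rows are tracked in a plain set, and all containers of a partition are merged into one, which is simpler than whatever policy the real Tuple Mover uses to pick merge candidates. The `mergeout` name is hypothetical.

```python
from collections import defaultdict
from heapq import merge


def mergeout(containers, deleted):
    """Merge sorted containers per partition and drop rows marked as deleted."""
    by_partition = defaultdict(list)
    for partition, rows in containers:
        by_partition[partition].append(rows)

    merged = []
    for partition in sorted(by_partition):
        # heapq.merge combines already-sorted inputs without a full re-sort.
        rows = [r for r in merge(*by_partition[partition]) if r not in deleted]
        merged.append((partition, rows))
    return merged


containers = [
    ("2023", [1, 5]),
    ("2024", [2, 9]),
    ("2023", [3, 4]),
]
result = mergeout(containers, deleted={4})

print(result)   # [('2023', [1, 3, 5]), ('2024', [2, 9])]
```

The two "2023" containers become one, the "2024" container is never mixed with them, and row 4 disappears because it was marked for deletion.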
How Data Movement Happens in Vertica:
By default, data is loaded into WOS first when a normal COPY, INSERT, or UPDATE statement is used.
The Tuple Mover then performs a moveout operation, moving the data from WOS to ROS at a time interval specified in its configuration.
Later, the Tuple Mover merges ROS containers and purges data marked for deletion during a mergeout operation.
Data can also be loaded directly into ROS by using the COPY DIRECT, INSERT DIRECT or UPDATE DIRECT options.
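The routing rule in this section can be summarized with a small sketch. It is deliberately naive: it only looks for the word DIRECT in the statement text (which covers both `COPY ... DIRECT` and the `/*+DIRECT*/` hint) and would be fooled by a table name or file path containing that word. The `load_target` function, the `sales` table, and the file path are all hypothetical.

```python
import re

# Naive check: any standalone DIRECT token counts as a direct load.
_DIRECT = re.compile(r"\bDIRECT\b", re.IGNORECASE)


def load_target(statement):
    """Return "ROS" for a direct load, otherwise "WOS" (the default path)."""
    return "ROS" if _DIRECT.search(statement) else "WOS"


print(load_target("COPY sales FROM '/data/sales.csv'"))                 # WOS
print(load_target("COPY sales FROM '/data/sales.csv' DIRECT"))          # ROS
print(load_target("INSERT /*+DIRECT*/ INTO sales VALUES (1, 'a')"))     # ROS
```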