Vertica implements a unique data storage model consisting of the ROS, the WOS, and the Tuple Mover.
Vertica Storage Model
Vertica implements a unique data storage model supported by three components.
The model is the same on every Vertica node.
- WOS (Write-Optimized Store)
- ROS (Read-Optimized Store)
- Tuple Mover
Let's discuss each component in detail.
Write Optimized Store (WOS)
- The WOS is a memory-resident data storage structure optimized for low-latency data loading.
- The WOS efficiently supports INSERT, UPDATE, DELETE, and COPY operations (without the DIRECT hint) by storing data in memory.
- Records in the WOS are stored without compression or indexing to support faster loading.
- The WOS is not optimized for reads because the data (in a projection) is sorted only when it is queried.
Read Optimized Store (ROS)
- The ROS is a highly optimized, read-oriented, disk storage structure.
- Most of the data in a Vertica cluster resides in the ROS.
- The ROS is optimized for fast reads because its data is sorted, indexed, and compressed.
- ROS data is stored in groups of files called ROS containers.
- A container is simply a set of rows (stored as files) created by moveout or mergeout operations, or by DML and COPY DIRECT statements.
- Data can be loaded directly into the ROS using COPY, INSERT, UPDATE, and DELETE statements that specify DIRECT (for example, the /*+DIRECT*/ hint).
Vertica is optimized for both writes and reads by maintaining two different data storage structures (WOS and ROS).
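The trade-off between the two stores can be illustrated with a small Python toy model. This is only a sketch, not Vertica's implementation: the `WOS` class, `build_ros_container`, and `scan_ros_container` are hypothetical names, and run-length encoding stands in for whatever encodings a real projection uses. It shows an append-only, unsorted write store next to a container that is sorted and compressed once when it is created.

```python
from itertools import groupby


class WOS:
    """Toy write-optimized store: rows kept in arrival order, uncompressed."""

    def __init__(self):
        self.rows = []

    def insert(self, value):
        # Cheap append: no sorting, no encoding.
        self.rows.append(value)

    def scan_sorted(self):
        # The sort cost is paid on every read.
        return sorted(self.rows)


def build_ros_container(values):
    """Toy read-optimized container: sort once, then run-length encode."""
    return [(value, len(list(group))) for value, group in groupby(sorted(values))]


def scan_ros_container(container):
    """Decode the (value, count) pairs back into sorted values."""
    return [value for value, count in container for _ in range(count)]


wos = WOS()
for state in ["NY", "CA", "NY", "TX", "CA", "NY"]:
    wos.insert(state)

container = build_ros_container(wos.rows)
print(wos.rows)    # ['NY', 'CA', 'NY', 'TX', 'CA', 'NY']
print(container)   # [('CA', 2), ('NY', 3), ('TX', 1)]
```

Both structures return the same sorted rows when scanned; the difference is when the sorting work is done and how much space the stored data takes.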
How Data Movement Happens in Vertica:
- Vertica moves data from WOS to ROS using the Tuple Mover.
- The Tuple Mover runs in the background, performing some tasks automatically at time intervals determined by its configuration parameters.
- The Tuple Mover performs two operations, moveout and mergeout:
- During moveout operations, the Tuple Mover moves data from memory (WOS) into a new ROS container.
  - Writing to a new container makes the move from WOS to ROS faster, because newer data does not need to be merged with existing ROS data immediately.
- During mergeout operations, the Tuple Mover combines the small ROS containers that moveout operations or COPY DIRECT statements create over time into larger ones.
  - Vertica keeps data from different partitions separate on disk. Mergeout follows this policy and never merges ROS containers that belong to different partitions.
  - Mergeout also purges data that is marked for deletion.
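The two Tuple Mover operations described above can be sketched as a toy Python simulation. Everything here is an illustrative assumption rather than Vertica's code: `Projection`, `Container`, and their methods are hypothetical names, a deleted row is modeled as a simple flag, and the purge ignores any retention rules a real system applies before removing deleted data.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Container:
    partition: str
    rows: list  # sorted list of (key, deleted_flag) tuples


@dataclass
class Projection:
    wos: list = field(default_factory=list)  # (partition, key) pairs in arrival order
    ros: list = field(default_factory=list)  # list of Container objects

    def moveout(self):
        """Flush the WOS into new containers; existing containers are not touched."""
        by_partition = defaultdict(list)
        for partition, key in self.wos:
            by_partition[partition].append((key, False))
        for partition, rows in by_partition.items():
            self.ros.append(Container(partition, sorted(rows)))
        self.wos.clear()

    def delete(self, partition, key):
        """Mark matching rows as deleted instead of rewriting the container."""
        for container in self.ros:
            if container.partition == partition:
                container.rows = [(k, d or k == key) for k, d in container.rows]

    def mergeout(self):
        """Merge containers per partition and drop rows marked as deleted."""
        merged = defaultdict(list)
        for container in self.ros:
            merged[container.partition].extend(r for r in container.rows if not r[1])
        self.ros = [Container(p, sorted(rows))
                    for p, rows in sorted(merged.items()) if rows]


p = Projection()
p.wos += [("2023", 7), ("2024", 3), ("2023", 1)]
p.moveout()
p.wos += [("2023", 4), ("2024", 9)]
p.moveout()
count_before = len(p.ros)  # four small containers, two per partition

p.delete("2023", 7)
p.mergeout()
for container in p.ros:
    print(container.partition, [key for key, _ in container.rows])
```

After the two moveouts there are four small containers. Mergeout leaves one container per partition, and the row marked for deletion is gone. Containers from partitions "2023" and "2024" are never combined with each other.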
By default, data is loaded into the WOS first when ordinary COPY, INSERT, or UPDATE statements are used.
The Tuple Mover then performs a moveout operation, moving the data from WOS to ROS at an interval set in the configuration.
Later, during a mergeout operation, the Tuple Mover merges ROS containers and purges data marked for deletion.
Data can also be loaded directly into the ROS by using the COPY DIRECT, INSERT DIRECT, or UPDATE DIRECT options.
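The choice between the two load paths summarized above can be sketched in a few lines of Python. This is a hypothetical model, not a Vertica API: the `Store` class and its `direct` flag are invented for illustration and only mimic the effect of specifying DIRECT on a load.

```python
class Store:
    """Toy model of the default and DIRECT load paths."""

    def __init__(self):
        self.wos = []             # unsorted rows held in memory
        self.ros_containers = []  # each container is a sorted list of rows

    def load(self, rows, direct=False):
        if direct:
            # DIRECT path: skip the WOS and write a new sorted container.
            self.ros_containers.append(sorted(rows))
            return "ROS"
        # Default path: buffer the rows in the WOS as they arrive.
        self.wos.extend(rows)
        return "WOS"

    def moveout(self):
        # Background step: flush buffered rows into a new container.
        if self.wos:
            self.ros_containers.append(sorted(self.wos))
            self.wos = []


store = Store()
small_target = store.load([3, 1, 2])             # "WOS"
bulk_target = store.load([9, 8], direct=True)    # "ROS"
store.moveout()
print(store.ros_containers)  # [[8, 9], [1, 2, 3]]
```

In this sketch the direct load is readable from a container immediately, while the default load only reaches a container after the moveout step runs.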