Range Segmentation

Hash segmentation is the preferred method of segmentation in Vertica 2.0 and later. Refer to the Hash Segmentation section of this document and the CREATE PROJECTION command in the SQL Reference Manual for detailed information about using hash segmentation in a projection.

Range segmentation allows you to specify a list of nodes, each of which stores a specific range of data values, except for the MAXVALUE node, which has no upper limit. Refer to the CREATE PROJECTION command in the SQL Reference Manual for detailed information about using range segmentation in a projection.
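
The lookup that range segmentation implies can be sketched in a few lines of Python. This is only an illustration of the idea, not Vertica's implementation: the node names, the boundary values, and the choice of exclusive upper bounds are all assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical upper bounds (exclusive) for the first three nodes.
# The fourth node has no upper bound and plays the MAXVALUE role.
UPPER_BOUNDS = [100, 200, 300]
NODES = ["node01", "node02", "node03", "node04"]


def node_for_value(value):
    """Return the node whose range contains the given value."""
    # bisect_right counts how many bounds are <= value, which is the
    # index of the first node whose upper bound is greater than value.
    return NODES[bisect_right(UPPER_BOUNDS, value)]
```

Under these assumed boundaries, any value of 300 or more lands on the last node, no matter how large it is.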

Unlike hash segmentation, range segmentation does not automatically distribute data evenly across some or all nodes in a cluster. Use range segmentation only when your projection includes a column whose data is known to be suitable for use as a segmentation expression. In other words, avoid columns whose values cause skewed distribution and execution (some nodes consistently storing more data and working harder than others). Also take into account data that does not yet exist but could cause skew if loaded in the future.

In particular, avoid using a date/time column for range segmentation because it causes temporal skew. For example, consider a fact table in which each row contains a timestamp representing the point in time at which the fact was established. Once the timestamps pass the highest range boundary, all new fact table rows are stored on the MAXVALUE node. The resulting skew increases over time and degrades query performance.
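
The temporal skew described above can be reproduced with a small Python simulation. It is a sketch under stated assumptions: the node names and date boundaries are hypothetical, and CRC32 merely stands in for a hash function (it is not the hash that Vertica uses).

```python
from bisect import bisect_right
from collections import Counter
from datetime import date, timedelta
import zlib

NODES = ["node01", "node02", "node03", "node04"]
# Hypothetical range boundaries, fixed when the projection was created.
# The last node has no upper bound (the MAXVALUE role).
UPPER_BOUNDS = [date(2020, 1, 1), date(2021, 1, 1), date(2022, 1, 1)]


def range_node(day):
    """Assign a date to a node by range boundaries."""
    return NODES[bisect_right(UPPER_BOUNDS, day)]


def hash_node(day):
    """Assign a date to a node by hashing it (CRC32 as a stand-in)."""
    return NODES[zlib.crc32(day.isoformat().encode()) % len(NODES)]


def distribution(assign, days):
    """Count how many rows each node receives under an assignment."""
    counts = Counter(assign(day) for day in days)
    return {node: counts.get(node, 0) for node in NODES}


# One new fact per day for a year, all later than the last boundary.
new_days = [date(2023, 1, 1) + timedelta(days=i) for i in range(365)]
range_counts = distribution(range_node, new_days)
hash_counts = distribution(hash_node, new_days)

print("range:", range_counts)
print("hash: ", hash_counts)
```

With these assumed boundaries, every new row falls on the last node under range assignment, while the hash assignment spreads the same rows over more than one node.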

The Database Designer is a tool that analyzes a logical schema definition, sample queries, and sample data and generates a set of projections in the form of an SQL script to be executed after you create the tables but before you load any data. The script creates a minimal set of superprojections to ensure K-Safety, and optionally pre-join projections. In most cases, the projections created by the Database Designer provide excellent query performance within physical constraints. You can, however, write a custom projection script should the Database Designer not meet your needs.