Book Contents

Book Index

Next Topic

Home

hash-segmentation-clause

Hash segmentation allows you to segment a projection based on a built-in hash function that provides even distribution of data across some or all of the nodes in a cluster, resulting in optimal query execution.

Hash segmentation is the preferred method of segmentation in Vertica 2.0 and later.

Syntax

SEGMENTED BY expression

[ ALL NODES [ OFFSET offset ] | NODES node [ ,... ] ]

Semantics

 

SEGMENTED BY expression

can be a general SQL expression but there is no reason to use anything other than the built-in HASH function with one or more NOT NULL columns as arguments. Choose columns that have a large number of unique data values and acceptable skew in the data distribution. Primary key columns that meet the criteria may be an excellent choice for hash segmentation.

ALL NODES

automatically distributes the data evenly across all nodes at the time the CREATE PROJECTION statement is executed. The ordering of the nodes is fixed.

OFFSET offset

is an integer that specifies the node within the ordered sequence on which to start the segmentation distribution, relative to 0. See example below.

NODES node [ ,... ]

specifies a subset of the nodes in the cluster over which to distribute the data. You can use a specific node only once in any projection. For a list of the nodes in a database, use the View Database command in the Administration Tools.

Notes

Examples

CREATE PROJECTION ... SEGMENTED BY HASH(C1,C2,...) ALL NODES;

CREATE PROJECTION ... SEGMENTED BY HASH(C1,C2,...) ALL NODES OFFSET 1;

The example produces two hash-segmented buddy projections that form part of a K-Safe design. The projections can use different sort orders.