Adding Primary Key and Foreign Key Constraints
Primary Key Constraints
A primary key is a column or combination of columns that uniquely identify a row in a table. A PRIMARY KEY constraint is a rule that states that a column (or a set of columns used as a compound key) cannot be null and cannot (as a whole) contain duplicate data.
In Vertica, all dimension tables (unsegmented tables) are required to have a PRIMARY KEY column-constraint or table-constraint in order to be joined to other tables.
Primary key constraints are enforced only when loading data into a
pre-join projection.
Foreign Key Constraints
A foreign key is a column that is used to join a table to other tables. A FOREIGN KEY constraint is a rule that states that a column cannot be null and contains only values from the PRIMARY KEY column of a specific dimension table.
In Vertica, fact table join columns are required to have FOREIGN KEY constraints in order to participate in pre-join projections, which provide optimal query performance.
However, fact table join columns cannot have FOREIGN KEY constraints when used in outer join queries that produce expected results. If the fact table join column has a PRIMARY KEY constraint, outer join queries produce the same result set as inner join queries. See the Using Outer Joins section of the SQL Programmer's Guide for more information.
FOREIGN KEY constraints are enforced only when loading data into a pre-join projection. Thus, if a table does not participate in a pre-join projection, it is possible to load erroneous data into that table that can cause a constrain violation error when:
- an inner join query is based on that FOREIGN KEY
- an outer join being treated as an inner join due to the presence of the constraint is based on that FOREIGN KEY
- a new pre-join projection based on that FOREIGN KEY is
refreshedA refresh operation ensures that all projections on a node are up-to-date (can participate in query execution). This process may take a long time, depending on how much data is in the table(s).
Defining Constraints
There are two SQL commands that define a primary key or foreign key constraints:
- CREATE TABLE allows you to define a
column-constraint on a single column.Adds a
referential integrity constraint to the metadata of a column. See Adding Join Constraints in the Database Administrator's Guide.
Referential integrity in Vertica consists of a set of constraints (logical schema objects) that define primary key and foreign key columns. In a star schema or snowflake schema:
- Each dimension table must have a PRIMARY KEY constraint.
- The fact table must contain columns that can be used to join the fact table to dimension tables.
- Fact table join columns must have FOREIGN KEY constraints in order to participate in pre-join projections.
- Outer join queries produce expected results only when the fact table join column used in the query does not have a FOREIGN KEY constraint.
Syntax
[ CONSTRAINT constraint-name ]
{ [ NOT ] NULL
| PRIMARY KEY
| REFERENCES table-name [ ( column-name ) ]
[ DEFAULT constant ]
}
Semantics
CONSTRAINT
constraint-name
|
optionally assigns a name to the constraint. Vertica recommends that you name all constraints.
|
NULL
|
(default) specifies that the column is allowed to contain null values.
|
NOT NULL
|
specifies that the column must receive a value during INSERT and UPDATE operations. If no DEFAULT value is specified and no value is provided, the INSERT or UPDATE statement returns an error because no default value exists.
|
PRIMARY KEY
|
adds a referential integrity constraint defining the column as the primary key.
Referential integrity in Vertica consists of a set of constraints (logical schema objects) that define primary key and foreign key columns. In a star schema or snowflake schema:
- Each dimension table must have a PRIMARY KEY constraint.
- The fact table must contain columns that can be used to join the fact table to dimension tables.
- Fact table join columns must have FOREIGN KEY constraints in order to participate in pre-join projections.
- Outer join queries produce expected results only when the fact table join column used in the query does not have a FOREIGN KEY constraint.
|
REFERENCES
|
adds a referential integrity constraint defining the column as a foreign key.
If column is omitted, the default is the primary key of table.
|
table-name
|
specifies the table to which the REFERENCES constraint applies.
|
column-name
|
specifies the column to which the REFERENCES constraint applies. If column is omitted, the default is the primary key of table-name.
|
DEFAULT constant
|
specifies a constant to be used by default in any INSERT operation that does not specify a value for the column. The data type of the constant must match the data type of the column. If no default value is specified, the default is null.
|
Notes
- You must define primary key and foreign key constraints in all tables that participate in joins. See Adding Constraints.
- You must specify NOT NULL constraints on columns that will be given PRIMARY and REFERENCES constraints.
- Vertica does not support expressions in the DEFAULT clause.
- ALTER TABLE allows you to define a
table-constraint on multiple columns (a compound key).Adds a join constraint or a constraint describing a
functional dependency to the metadata of a table. See Adding Constraints in the Database Administrator's Guide.
A functional dependency is a relationship between two sets of column values in a table where one column value can determine the other. It represents knowledge of the data that cannot otherwise be expressed in the logical schema. In Vertica functional dependencies are expressed as CORRELATION constraints and are used by the Database Designer to produce more efficient physical schema designs.
Syntax
[ CONSTRAINT constraint_name ]
{ PRIMARY KEY ( column [ , ... ] )
| FOREIGN KEY ( column [ , ... ] )
REFERENCES table [ ( column [ , ... ] ) ]
| CORRELATION ( column1 ) DETERMINES ( column2 )
}
Semantics
CONSTRAINT constraint-name
|
optionally assigns a name to the constraint. Vertica recommends that you name all constraints.
|
PRIMARY KEY
( column [ , ... ] )
|
adds a referential integrity constraint defining one or more NOT NULL numeric columns as the primary key.
Referential integrity in Vertica consists of a set of constraints (logical schema objects) that define primary key and foreign key columns. In a star schema or snowflake schema:
- Each dimension table must have a PRIMARY KEY constraint.
- The fact table must contain columns that can be used to join the fact table to dimension tables.
- Fact table join columns must have FOREIGN KEY constraints in order to participate in pre-join projections.
- Outer join queries produce expected results only when the fact table join column used in the query does not have a FOREIGN KEY constraint.
|
FOREIGN KEY
( column [ , ... ] )
|
adds a referential integrity constraint defining one or more NOT NULL numeric columns as a foreign key.
Referential integrity in Vertica consists of a set of constraints (logical schema objects) that define primary key and foreign key columns. In a star schema or snowflake schema:
- Each dimension table must have a PRIMARY KEY constraint.
- The fact table must contain columns that can be used to join the fact table to dimension tables.
- Fact table join columns must have FOREIGN KEY constraints in order to participate in pre-join projections.
- Outer join queries produce expected results only when the fact table join column used in the query does not have a FOREIGN KEY constraint.
|
REFERENCES
table [ ( column [ , ... ] )
|
specifies the table to which the FOREIGN KEY constraint applies. If column is omitted, the default is the primary key of table.
|
CORRELATION
|
describes a functional dependency. Given a tuple and the set of values in column1, one can determine the corresponding value of column2.
A functional dependency is a relationship between two sets of column values in a table where one column value can determine the other. It represents knowledge of the data that cannot otherwise be expressed in the logical schema. In Vertica functional dependencies are expressed as CORRELATION constraints and are used by the Database Designer to produce more efficient physical schema designs.
|
Notes
- Use the ALTER TABLE command to add a table constraint. The CREATE TABLE statement does not allow table constraints.
- You must define primary key and foreign key constraints in all tables that participate in inner joins. See Adding Join Constraints.
Examples
CORRELATION (Product_Description) DETERMINES (Category_Description)
The Retail Sales Example Database described in the Quick Start contains a table
Product_Dimension in which products have descriptions and categories. For example, the description "Seafood Product 1" exists only in the "Seafood" category. You can define several similar correlations between columns in the Product Dimension table.
The Product Dimension table describes all products sold by the grocery chain since its beginning. Typically, individual stores only carry a subset of the products. The generated data file contains data for 60,000 products by default.
Column Name
|
Data Type
|
Example
|
Product_Key
|
integer
|
1
|
Product_Description
|
varchar(128)
|
Seafood Product 1
|
SKU_Number
|
char(32)
|
SKU-#1
|
Category_Description
|
char(32)
|
Food
|
Department_Description
|
char(32)
|
Seafood
|
Package_Type_Description
|
char(32)
|
Box
|
Package_Size
|
char(32)
|
18 ox
|
Fat_Content
|
integer
|
89
|
Diet_Type
|
char(32)
|
South Beach
|
Weight
|
integer
|
50
|
Weight_Units_of_Measure
|
char(32)
|
gram
|
Shelf_Width
|
integer
|
2
|
Shelf_Height
|
integer
|
4
|
Shelf_Depth
|
integer
|
4
|
Examples are shown below. See the SQL Reference Manual for detailed information about these commands.
Defining Column Constraints
To define a PRIMARY KEY constraint on a single column:
CREATE TABLE Dim1 (
C1 INTEGER CONSTRAINT Dim1pk PRIMARY KEY,
C2 INTEGER);
Because a constraint name is optional, you can write the example as:
CREATE TABLE Dim1 (
C1 INTEGER PRIMARY KEY,
C2 INTEGER);
To define a FOREIGN KEY constraint on a single column:
CREATE TABLE Fact1 (
C1 INTEGER CONSTRAINT Fact1Dim1fk FOREIGN KEY REFERENCES DIM1(C1),
C2 INTEGER);
Again, the constraint name is optional.
Defining Table (Multi-Column) Constraints
To specify multi-column (compound) keys, you must use an ALTER TABLE statement in addition to a CREATE TABLE statement. For example:
CREATE TABLE Dim1 (
C1 INTEGER,
C2 INTEGER);
ALTER TABLE Dim1
ADD CONSTRAINT Dim1pk PRIMARY KEY (C1, C2);
The matching FOREIGN KEY constraint would look like this:
CREATE TABLE Fact1 (
C1 INTEGER,
C2 INTEGER);
ALTER TABLE Fact1
ADD CONSTRAINT Fact1fk FOREIGN KEY REFERENCES Dim1(C1,C2);
Vertica recommends that you use only named table constraints.