Book Contents

Book Index

Next Topic

Home

Projections

Vertica's physical storage consists of collections of table columns called projections. A projection is a special case of a materialized view. A materialized view is similar to a standard SQL view with one major exception: the result set is stored on disk rather than computed each time the view is used in a query. The result set is automatically refreshed whenever data values are changed.

Query Performance

Because projections are stored in sort order, a single disk read can fetch many rows at once. Disks read data extremely quickly once positioned, but can take many milliseconds to seek and fetch a single row. For example, if the WHERE clause of a query is (X=1 AND Y=2) and a projection is already sorted on (X,Y), the query runs almost instantaneously.

Another important performance factor is data compression. Sorted data contains repeated sequences that compress very well using, for example, run-length encoding. Compression also makes multiple projections fit within your disk space budget, and thus more extremely fast queries.

Creating Projections

Vertica's Database Designer tool (discussed in a section to follow):

  1. analyzes your logical schema, sample data, and sample queries
  2. creates a physical schema (projections) in the form of an SQL script that you execute

The query performance of the physical schema produced by the Database Designer depends on the items you provide. For example, if your logical schema includes dimension tables, your sample queries must include at least a minimal set of joins.

Alternatively, you can write your own custom projections as described in the Database Administrator's Guide (Advanced). This option is intended for advanced users only and should be used as advised by Technical Support.

Superprojections

A superprojection is a projection that contains every column of a table in the Logical Schema. A table can have multiple superprojections with different sort orders.

Pre-Join Projections

A pre-join projection stores the result set obtained by joining a single fact table in the logical schema (the anchor table) with one or more dimension tables. The result set is typically sorted for optimal query execution performance. Most query-specific projections are pre-joins.

The anchor table of a join can be:

The diagram below shows a pre-join projection consisting of four columns, one from the fact table and three from dimension tables.

Projection (Simple)