Overview
As a Vertica SQL programmer, most of your tasks will be very similar to those required by other relational database management systems.
Connecting to a Vertica Database
There is nothing unusual about connecting to a Vertica database. You can connect interactively using
vsql or connect from an application program using
JDBC or
ODBC.
Open DataBase Connectivity is a standard application programming interface (API) for access to database management systems.
JDBC (Java Database Connectivity) is a call-level API that provides connectivity between Java programs and data sources (SQL databases and other non-relational data sources, such as spreadsheets or flat files). JDBC is included in the Java 2 standard and enterprise editions.
vsql is the Vertica implementation of psql, a character-based, interactive, front-end that is part of PostgreSQL and used by other database management systems. It allows you to type in SQL statements and see the results. It also provides a number of meta-commands and various shell-like features to facilitate writing scripts and automating a wide variety of tasks.
Writing Queries
Vertica is designed to execute queries that are suitable for a
star schema or
snowflake schema. You may need to modify existing normalized schema queries in order to execute them againSt a Vertica database.
A snowflake schema is the same as a star schema except that a dimension table can be normalized (hierarchically decomposed) into additional dimension tables. Every dimension table participates in a 1::n join with the fact table or another dimension table.
The star schema (sometimes called a star join schema) is the simplest
data warehouse schema. In a star schema design there is a central fact table with a large number of tuples, optionally surrounded by a collection of dimension tables, each with a lesser number of tuples. Every dimension table participates in a 1::n join with the fact table.
A data warehouse is a relational database that is designed for query and analysis rather than transaction processing. Data warehouses:
- are often subjected to a heavy load of periodic and ad-hoc queries.
- contain historical information that enables analysis of correlations and trends over long periods of time.
- integrate data from various production (transactional) databases. Extraction, transformation, and loading (
ETL) software converts the data to a common format and copies it into a data warehouse at regular intervals. ETL (Extract, Transform, Load) is a process in data warehousing that involves extracting data from outside sources, transforming it to fit a specific schema, and ultimately loading it into the database.
- typically consist of one or more star or snowflake schemas.