The ClickStream Example Database is a simple star schema that represents a record of the clicks made by a user on a web site. This data can be analyzed and used, for example, for business/marketing purposes or the detection of malicious activities on the website. Each table is described in a separate section.
The Clickstream schema is focused towards discovering interesting and useful information from Web content and usage. This schema can be used for
The data in the ClickStream schema is populated from parsing Web Server logs, users browsing activities and habits etc. This data can be used for tracking malicious and fraudulent activities in real time. The schema is focused towards recognizing patterns either by using statistical models, by manual offline analysis or by SQL queries.
The schema is intended to answer following queries for fraud detection or other purposes
Table Name
|
Default Number of Rows
|
ClickStream_Fact
Each tuple in the fact table represents a summary of the user clicks done during browser session.
Field Name
|
Data Type
|
Description/Example
|
Date_Key
|
Integer
|
Date Key
|
Session_Key
|
Integer
|
Foreign Key, references Session_Dimension table
|
Customer_Key
|
Integer
|
Foreign Key, references Customer_Dimension Table
|
ClientIP_Key
|
Integer
|
Client IP Address, Foreign Key, references IPAddress_Dimension Table
|
ServerIP_Key
|
Integer
|
WebServer IP Address Foreign Key, references IPAddress_Dimension Table
|
UserAgent_ID
|
Integer
|
Foreign Key, references UserAgent_Dimension table
|
Page_Id
|
Integer
|
Foreign Key, references Page_Dimension table
|
Referrer_Page_id
|
Integer
|
Referring Page id.
|
CreditCard_ID
|
Integer
|
Foreign Key, references CreditCard_Dimension Table
|
Num_Errors
|
Integer
|
Number of Errors encountered while browsing
|
KBytes_Downloaded
|
Integer
|
Amount of Data downloaded at client machine
|
Browsing_Time_Per_Page
|
Integer
|
browsing time in minutes
|
|
5000000
|
Customer_Dimension
This table describes the user demographic information. Data in this table is populated from parsing strings from web logs of server.
Field Name
|
Data Type
|
Description/Example
|
Customer_Key
|
Integer
|
Primary key
|
Name
|
Varchar
|
Name of customer
|
Email_ID
|
Varchar
|
Unique Mail id of customer
|
Sex
|
Char
|
Sex of the customer
|
Age
|
Integer
|
Age of customer
|
Annual_income
|
Integer
|
Annual income of the customer e.g. 50000 ($50000)
|
City
|
Varchar
|
Home city of customer
|
State
|
Varchar
|
Home state of customer
|
Country
|
Varchar
|
Home country of customer
|
|
5000
|
Session_Dimension
This table details user browsing session information.
Field Name
|
Data Type
|
Description/Example
|
Session_Key
|
Integer
|
Primary Key
|
Session_Start_Time
|
Varchar
|
Session Start Time
|
Session_End_Time
|
Varchar
|
Session End Time
|
Duration
|
Integer
|
Duration of the session in minutes
|
Server_IP
|
Varchar
|
IP address of Server
|
Client_IP
|
Varchar
|
IP address of Client
|
|
50000
|
UserAgent_Dimension
This table describes user agent types for all machine types.
A user agent is a client application program used to access resources on networks such as the World Wide Web. User agents include web browsers, search engine crawlers, PDAs, cell phones, and so forth.
Field Name
|
Data Type
|
Description/Example
|
UserAgent_Key
|
Integer
|
Primary key
|
Browser_Type
|
Varchar
|
Mozilla
|
Browser_Version
|
Varchar
|
4.7
|
Operating_System
|
Varchar
|
WinNT/Linux
|
Operating_System_Version
|
Varchar
|
4.0/5.0 etc
|
Agent_Language
|
Varchar
|
English/French etc
|
|
500
|
IPAddress _Dimension
This table describes the customer demographic information. Data in this table is populated from parsing strings from web logs of server.
Field Name
|
Data Type
|
Description/Example
|
IPAddress_Key
|
Integer
|
Primary Key
|
IPAddress_Val
|
Varchar
|
IP Address value in dotted decimal e.g. 172.16.0.1
|
City
|
Varchar
|
City part of IP Address
|
State
|
Varchar
|
State part of IP Address
|
Country
|
Varchar
|
Country part of IP Address
|
|
1000
|
Page_Dimension
This table describes each page's domain relationships.
Field Name
|
Data Type
|
Description/Example
|
Page_Key
|
Integer
|
Primary Key
|
Page_Name
|
Varchar
|
Page Description and Name
|
Page_Sub_Domain
|
Varchar
|
Page Sub Domain
|
Page_Domain
|
Varchar
|
Page Domain
|
|
5000
|
CreditCard_Dimension
This table describes the all domain pages.
Field Name
|
Data Type
|
Description/Example
|
Card_Key
|
Integer
|
Primary Key
|
CardHolder_Name
|
Varchar
|
Varchar
|
Card_Type
|
Varchar
|
MasterCard/Visa/Amex
|
Card_Expiration_Date
|
Date
|
Date
|
|
5000
|