Building a Data Warehouse: With Examples in SQL Server
Contents at a Glance
Contents
About the Author
Preface
Introduction to Data Warehousing
What Is a Data Warehouse?
Retrieves Data
Consolidates Data
Periodically
Dimensional Data Store
Normalized Data Store
History
Query
Business Intelligence
Reporting
Online Analytical Processing (OLAP)
Data Mining
Other Analytical Activities
Updated in Batches
Other Definitions
Data Warehousing Today
Business Intelligence
Customer Relationship Management
Data Mining
Master Data Management (MDM)
Customer Data Integration
Future Trends in Data Warehousing
Unstructured Data
Search
Service-Oriented Architecture (SOA)
Real-Time Data Warehouse
Summary
Data Warehouse Architecture
Data Flow Architecture
Single DDS
NDS + DDS
ODS + DDS
Federated Data Warehouse
System Architecture
Case Study
Summary
Data Warehouse Development Methodology
Waterfall Methodology
Iterative Methodology
Summary
Functional and Nonfunctional Requirements
Identifying Business Areas
Understanding Business Operations
Defining Functional Requirements
Defining Nonfunctional Requirements
Conducting a Data Feasibility Study
Summary
Data Modeling
Designing the Dimensional Data Store
Dimension Tables
Date Dimension
Slowly Changing Dimension
Product, Customer, and Store Dimensions
Subscription Sales Data Mart
Supplier Performance Data Mart
CRM Data Marts
Data Hierarchy
Source System Mapping
Designing the Normalized Data Store
Summary
Physical Database Design
Hardware Platform
Storage Considerations
Configuring Databases
Creating DDS Database Structure
Creating the Normalized Data Store
Using Views
Summary Tables
Partitioning
Indexes
Summary
Data Extraction
Introduction to ETL
ETL Approaches and Architecture
General Considerations
Extracting Relational Databases
Whole Table Every Time
Incremental Extract
Fixed Range
Related Tables
Testing Data Leaks
Extracting File Systems
Extracting Other Source Types
Extracting Data Using SSIS
Memorizing the Last Extraction Timestamp
Extracting from Files
Summary
Populating the Data Warehouse
Stage Loading
Data Firewall
Populating NDS
Using SSIS to Populate NDS
Upsert Using SQL and Lookup
Normalization
Practical Tips on SSIS
Populating DDS Dimension Tables
Populating DDS Fact Tables
Batches, Mini-batches, and Near Real-Time ETL
Pushing the Data In
Summary
Assuring Data Quality
Data Quality Process
Data Cleansing and Matching
Cross-checking with External Sources
Data Quality Rules
Action: Reject, Allow, Fix
Logging and Auditing
Data Quality Reports and Notifications
Summary
Metadata
Metadata in Data Warehousing
Data Definition and Mapping Metadata
Data Structure Metadata
Source System Metadata
ETL Process Metadata
Data Quality Metadata
Audit Metadata
Usage Metadata
Maintaining Metadata
Summary
Building Reports
Data Warehouse Reports
When to Use Reports and When Not to Use Them
Report Wizard
Report Layout
Report Parameters
Grouping, Sorting, and Filtering
Simplicity
Spreadsheets
Multidimensional Database Reports
Deploying Reports
Managing Reports
Managing Report Security
Managing Report Subscriptions
Managing Report Execution
Summary
Multidimensional Database
What a Multidimensional Database Is
Online Analytical Processing
Creating a Multidimensional Database
Processing a Multidimensional Database
Querying a Multidimensional Database
Administering a Multidimensional Database
Multidimensional Database Security
Processing Cubes
Backup and Restore
Summary
Using Data Warehouse for Business Intelligence
Business Intelligence Reports
Business Intelligence Analytics
Business Intelligence Data Mining
Business Intelligence Dashboards
Business Intelligence Alerts
Business Intelligence Portal
Summary
Using Data Warehouse for Customer Relationship Management
Single Customer View
Campaign Segmentation
Permission Management
Delivery and Response Data
Customer Analysis
Customer Support
Personalization
Customer Loyalty Scheme
Summary
Other Data Warehouse Usage
Customer Data Integration
Unstructured Data
Search in Data Warehousing
Summary
Testing Your Data Warehouse
Data Warehouse ETL Testing
Functional Testing
Performance Testing
Security Testing
User Acceptance Testing
End-to-End Testing
Migrating to Production
Summary
Data Warehouse Administration
Monitoring Data Warehouse ETL
Monitoring Data Quality
Managing Security
Managing Databases
Making Schema Changes
Updating Applications
Summary
Normalization Rules
First Normal Form
Second Normal Form
Third Normal Form
Boyce-Codd Normal Form
Higher Normal Forms
Index