Important Information
Reference
Technical Support
Introduction
Welcome to TIBCO Spotfire Miner™ 8.2
Key Features and Benefits of Spotfire Miner
System Requirements and Installation
How Spotfire Miner Does Data Mining
Define Goals
Access Data
Explore Data
Select and Transform Variables
Model Data
Deploy Model
The Spotfire S+ Library
Help, Support, and Learning Resources
Online Help
Online Manuals
Data Mining References
Typographic Conventions
Data Input and Output
Overview
Data Types in Spotfire Miner™
Categorical Data
Strings
Reading/Writing Data Sets with Long Column Names
Reading Long Strings
Dates
Worksheet Options for Dates
Date Parsing Formats
Date Display Formats
Limitations
Working with External Files
Reading External Files and Databases
Using Absolute and Relative Paths
Data Input
Read Text File
General Procedure
Properties
Using the Viewer
Read Fixed Format Text File
General Procedure
Properties
Using the Viewer
Read Spotfire Data
General Procedure
Properties
Using the Viewer
Read SAS File
General Procedure
Properties
Using the Viewer
Read Excel File
General Procedure
Properties
Using the Viewer
Read Other File
General Procedure
Properties
Using the Viewer
Read Database ODBC
The ODBC Data Source Administrator
ODBC Drivers
Defining a Data Source
General Procedure
Properties
Using the Viewer
Read DB2 Native
DB2 Client
General Procedure
Properties
Using the Viewer
Read Oracle Native
Oracle Client
General Procedure
Properties
Using the Viewer
Read SQL Native
Microsoft SQL Server Client
General Procedure
Properties
Using the Viewer
Read Sybase Native
Sybase Client
General Procedure
Properties
Using the Viewer
Read Database JDBC
Data Output
Write Text File
General Procedure
Properties
Using the Viewer
Write Fixed Format Text File
General Procedure
Properties
Using the Viewer
Write SAS File
General Procedure
Properties
Using the Viewer
Write Spotfire Data
General Procedure
Properties
Using the Viewer
Write Excel File
General Procedure
Properties
Using the Viewer
Write Other File
General Procedure
Properties
Using the Viewer
Write Database ODBC
General Procedure
Properties
Using the Viewer
Write DB2 Native
General Procedure
Properties
Using the Viewer
Write Oracle Native
General Procedure
Properties
Using the Viewer
Write SQL Native
General Procedure
Properties
Using the Viewer
Write Sybase Native
General Procedure
Properties
Using the Viewer
Write Database JDBC
The TIBCO Spotfire Miner™ Interface
Overview
The Main Menu
File Menu Options
Edit Menu Options
View Menu Options
Tools Menu Options
Window Menu Options
Help Menu Options
The Toolbar
The Explorer Pane
The Main Library
The Spotfire S+ Library
The User Library
Copying Nodes to Libraries
Deleting Library Components
Library Manager
Other Library Operations
The Desktop Pane
The Message Pane
The Command Line Pane
The Spotfire Miner™ Working Environment
Worksheet Directories
The Examples Folder
Building and Editing Networks
Building a Network
Adding Nodes
Navigating in a Worksheet
Annotations
Deleting Nodes
Linking Nodes
Deleting Links
Viewing the Data in Links
Link Line Style
Copying Nodes
Model Ports
Specifying Properties for Nodes
Specifying File Names
Collapsing Nodes
Creating Customized Components
Running and Stopping a Network
Running Nodes and Networks
Node Priority
Status Indicators
Data Caches
Invalidating Nodes
Stopping a Running Network
Common Features of Network Nodes
Shortcut Menus
Properties Dialogs
Opening the Properties Dialog
Sorting in Dialog Fields
Visual Cues in Dialog Fields
Advanced Page
Viewers
Launching a Viewer
Closing Viewers
The Table Viewer
Data Exploration
Overview
Creating One-Dimensional Charts
General Procedure
Chart Types
Pie Charts
Bar Charts
Column Charts
Dot Charts
Histograms
Box Plots
The Order of Levels in Categorical Variables
Properties
The Properties Page
The Options Page
Conditioned Charts
Using the Viewer
Selecting Charts
Viewing Charts
Enlarging Charts
Formatting Charts
Saving, Printing, and Copying Charts
An Example
Computing Correlations and Covariances
General Procedure
Definitions
Properties
Using the Viewer
Output
An Example
Crosstabulating Categorical Data
General Procedure
Properties
Using the Viewer
An Example
Computing Descriptive Statistics
General Procedure
Properties
Using the Viewer
Comparing Data
General Procedure
Properties
The Properties Page
The Output Page
Using the Viewer
Viewing Tables
General Procedure
Using the Viewer
Data Cleaning
Overview
Missing Values
General Procedure
Properties
The Properties Page
Using the Viewer
An Example
Duplicate Detection
General Procedure
Background
Properties
The Properties Page
The Output Page
Using the Viewer
An Example
Outlier Detection
General Procedure
Background
Properties
The Properties Page
The Output Page
Using the Viewer
An Example
Interpreting the Results
Technical Details
Why Robust Distances Are Preferable
Algorithm Specifics
References
Data Manipulation
Overview
Manipulating Rows
Aggregate
General Procedure
Properties
Using the Viewer
Append
General Procedure
Properties
Using the Viewer
Filter Rows
General Procedure
Properties
Using the Viewer
Partition
General Procedure
Properties
Using the Viewer
Sample
General Procedure
Properties
Using the Viewer
Shuffle
General Procedure
Using the Viewer
Sort
General Procedure
Properties
Using the Viewer
Split
General Procedure
Properties
Using the Viewer
Stack
General Procedure
Properties
Using the Viewer
Unstack
General Procedure
Properties
Using the Viewer
Manipulating Columns
Bin
General Procedure
Properties
Vary By Column
Using the Viewer
Create Columns
General Procedure
Properties
Using the Viewer
Filter Columns
General Procedure
Properties
Using the Viewer
Recode Columns
General Procedure
Properties
Using the Viewer
Example
Join
General Procedure
Properties
Using the Viewer
Modify Columns
General Procedure
Properties
Using the Viewer
Normalize
General Procedure
Properties
Using the Viewer
Reorder Columns
General Procedure
Properties
Using the Viewer
Transpose
General Procedure
Properties
Using the Viewer
Using the Spotfire Miner™ Expression Language
Value Types
NA Handling
Error Handling
Column References
Double and String Constants
Operators
Functions
Conversion Functions
Numeric Functions
String Functions
Date Manipulation Functions
Data Set Functions
Miscellaneous Functions
Classification Models
Overview
General Procedure
Selecting Dependent and Independent Variables
Sorting Column Names
Selecting Output
Creating Predict Nodes
Logistic Regression Models
Mathematical Definitions
Properties
The Properties Page
The Options Page
The Output Page
Using the Viewer
Creating a Filter Column Node
A Cross-Sell Example
Importing and Exploring the Data
Manipulating the Data
Modeling the Data
Predicting from the Model
Technical Details
Classification Trees
Background
Growing a Tree
Pruning a Tree
Ensemble Trees
Trees in Spotfire Miner
Properties
The Properties Page
The Options Page
The Single Tree Page
The Ensemble Page
The Output Page
The Advanced Page
Using the Viewer
A Cross-Sell Example (Continued)
Importing, Exploring, and Manipulating the Data
Modeling the Data
Predicting from the Model
Classification Neural Networks
Background
Properties
The Properties Page
The Options Page
The Output Page
Using the Viewer
A Cross-Sell Example (Continued)
Importing, Exploring, and Manipulating the Data
Modeling the Data
Predicting from the Model
Technical Details
Learning Algorithms
Initialization of Weights
Naive Bayes Models
Background
Properties
The Properties Page
The Output Page
Using the Viewer
A Promoter Gene Sequence Example
Technical Details
References
Regression Models
Overview
General Procedure
Selecting Dependent and Independent Variables
Sorting Column Names
Selecting Output
Creating Predict Nodes
Linear Regression Models
Mathematical Definitions
Properties
The Properties Page
The Output Page
Using the Viewer
Creating a Filter Column Node
A House Pricing Example
Importing and Exploring the Data
Manipulating the Data
Exploring and Manipulating the Data Again
Modeling the Data
Technical Details
Algorithm Specifics
The Coding of Levels in Categorical Variables
Regression Trees
Background
Growing a Tree
Ensemble Trees
Trees in Spotfire Miner
Properties
The Properties Page
The Options Page
The Single Tree Page
The Ensemble Page
The Output Page
The Advanced Page
Using the Viewer
A House Pricing Example (Continued)
Regression Neural Networks
Background
Properties
The Properties Page
The Options Page
The Output Page
Using the Viewer
A House Pricing Example (Continued)
Technical Details
Learning Algorithms
Initialization of Weights
References
Clustering
Overview
The K-Means Component
General Procedure
Properties
Properties Page
Options Page
Output Page
Tips for Better Cluster Results
Technical Details
Scalable K-Means Algorithm
Coding of Categorical Variables
Example
K-Means Clustering Example
References
Dimension Reduction
Overview
Principal Components
General Procedure
Properties
The Properties Page
The Output Page
Using the Viewer
An Example Using Principal Components
Technical Details
Association Rules
Overview
Association Rules Node Options
Properties Page
Options Page
Output Page
Definitions
Support
Confidence
Lift
Data Input Types
Groceries Example
Setting the Association Rules
Survival
Introduction
Basic Survival Models Background
General Procedure
Properties
The Properties Page
Time Varying Covariates
The Options Page
The Output Page
Using the Viewer
A Banking Customer Churn Example
A Time Varying Covariates Example
Technical Details for Cox Regression Models
Mathematical Definitions
Computational Details
Time-Dependent Covariates
Tied Events
Strata
Survival Function
References
Model Assessment
Overview
Properties
Assessing Classification Models
General Procedure
Classification Agreement
Confusion Matrices
Using the Viewer
Lift Chart
Chart Types
Assessing Regression Models
General Procedure
Definitions
Using the Viewer
Deploying Models
Overview
Predictive Modeling Markup Language
PMML Conformance
Import/Export Compatibility
Export PMML
General Procedure
Properties
Using the Viewer
Import PMML
General Procedure
Properties
Using the Viewer
Export Report
General Procedure
Properties
Transform
Using the Viewer
Advanced Topics
Overview
Pipeline Architecture
The Advanced Page
Worksheet Advanced Options
Max Rows Per Block
Max Megabytes Per Block
Order of Operations
Caching
Random Seed
Worksheet Random Seeds Option
Notes on Data Blocks and Caching
Deleting Data Caches
Worksheet Data Directories
Memory Intensive Functions
Size Recommendations for Spotfire Miner™
Worst-case Scenario Assumptions
Upper Limit Estimation for .wsd Disk Space
Command Line Options
Running Spotfire Miner in Batch
Increasing Java Memory
Importing and Exporting Data with JDBC
JDBC Example Workflow
The S-PLUS Library
Overview
S-PLUS Data Nodes
Read S-PLUS Data
General Procedure
Properties
Using the Viewer
Write S-PLUS Data
General Procedure
Properties
Using the Viewer
S-PLUS Chart Nodes
Overview
General Procedure
Using the Graph Window
Graph Options
One Column - Continuous
Data Page
Density Plot
Histogram
QQ Math Plot
One Column - Categorical
Data Page
Bar Chart
Dot Plot
Pie Chart
Two Columns - Continuous
Data Page
Hexbin Plot
Scatter Plot
Two Columns - Mixed
Data Page
Box Plot
Strip Plot
QQ Plot
Three Columns
Data Page
Contour Plot
Level Plot
Surface Plot
Cloud Plot
Multiple Columns
Multiple 2-D Plots
Data Page
Hexbin Matrix
Scatterplot Matrix
Parallel Plot
Time Series
Time Series Line Plot
Time Series High-Low Plot
Time Series Stacked Bar Plot
Common Pages
Titles Page
Axes Page
Multipanel Page
File Page
Advanced Page
S-PLUS Data Manipulation Nodes
Evaluating S-PLUS Expressions
Data Types in Spotfire Miner and Spotfire S+
Spotfire S+ Column Names
S-PLUS Create Columns
General Procedure
Properties
Using the Viewer
S-PLUS Filter Rows
General Procedure
Properties
Using the Viewer
S-PLUS Split
General Procedure
Properties
Using the Viewer
S-PLUS Script Node
General Procedure
Properties
The Properties Page
The Options Page
The Parameters Page
Processing Multiple Data Blocks
The Test Phase
Input List Elements
Output List Elements
Size of the Input Data Frames
Date and String Values
Interpreting Min/Max Values
Debugging
Processing Data Using the Execute Big Data Script Option
Reading and Writing bdFrames
Passing Other Object Types using bdPackedObjects
Loading Spotfire S+ Modules
Examples Using the S-PLUS Script Node
Create Plots
Fit and Use a Generalized Additive Model
Passing Model Information to Prediction Nodes
Replace Missing Values
Use a Custom Library from Spotfire S+
Access Data from a Spotfire S+ Database
Filter Columns Using Dynamic Outputs
An Extended Example with Two S-PLUS Script Nodes
References
Index