PC Chairs’ Preface
General Chairs’ Preface
Organization
Contents – Part I
Contents – Part II
Classification
Joint Classification with Heterogeneous Labels Using Random Walk with Dynamic Label Propagation
1 Introduction
2 Related Works
2.1 Traditional Classification and Joint Classification
2.2 The Construction of MRG
3 Random Walk with Dynamic Label Propagation on MRG
4 Experiment and Result
4.1 Data Set
4.2 Compared Methods
4.3 Performance Comparison
5 Conclusions
References
Hybrid Sampling with Bagging for Class Imbalance Learning
1 Introduction
2 Related Work
3 The Proposed Method
4 Experiments
4.1 Experiment 1: Sampling Rate Verification
4.2 Experiment 2: Comparative Studies
4.3 Experiment 3: Sampling Rate Comparison
4.4 Experiment 4: Parameter Selection
5 Conclusion
References
Sparse Adaptive Multi-hyperplane Machine
1 Introduction
2 Preliminary
3 Related Work
3.1 Multi-class SVM
3.2 Multi-hyperplane Machine
3.3 Adaptive Multi-hyperplane Machine
4 Sparse Adaptive Multi-hyperplane Machine (SAMM)
4.1 Optimization Problem
4.2 Optimization Solution
4.3 Generalization Error of SAMM
5 Experiments
5.1 Experimental Settings
5.2 Evaluation on Accuracy and Time of the Proposed Method
5.3 Tuning the Sparsity and Its Influence on Performance
6 Conclusion
7 Technical Lemmas
References
Exploring Heterogeneous Product Networks for Discovering Collective Marketing Hyping Behavior
1 Introduction
2 Related Work
3 Methodology
3.1 Product Network Regularization
3.2 The Learning Algorithm
4 Experiments
4.1 Human Evaluation
4.2 Benchmark Methods
4.3 Experimental Results
4.4 Application: A Case Study
5 Conclusions
References
Optimal Training and Efficient Model Selection for Parameterized Large Margin Learning
1 Introduction
2 Large Margin Learning with Multiple Parameters
3 Deriving the Explicit Dependence
4 PDGDP for Training: Global Optimality Guarantee
5 PDGDP Based Model Selection Algorithm
6 Experimental Results
6.1 PDGDP Training Results
6.2 PDGDP Model Selection Results
7 Conclusion and Future Work
References
Locally Weighted Ensemble Learning for Regression
1 Introduction
2 Related Work
2.1 Constant Weighted Ensemble
2.2 Dynamic Weighted Ensemble
3 Locally Weighted Ensemble Learning
3.1 Objective Function of Locally Weighted Ensemble Learning
3.2 Optimization of Locally Weighted Ensemble Learning
3.3 Algorithm of Locally Weighted Ensemble Learning
4 Experiments and Analysis
4.1 Convergence of Objective Function
4.2 Prediction on UCI Datasets
5 Conclusion
References
Reliable Confidence Predictions Using Conformal Prediction
1 Introduction
2 Inductive Conformal Classification
3 Conformal Classifier Errors
3.1 Class-Conditional Conformal Classification
3.2 Utilizing Posterior Information
4 Experiments
5 Concluding Remarks
References
Grade Prediction with Course and Student Specific Models
1 Introduction
2 Definitions and Notations
3 Methods
3.1 Course-Specific Regression (CSR)
3.2 Student-Specific Regression (SSR)
3.3 Methods Based on Matrix Factorization
4 Experimental Design
4.1 Dataset
4.2 Competing Methods
4.3 Parameters and Model Selection
4.4 Evaluation Methodology and Performance Metrics
5 Experimental Results
5.1 Course-Specific Regression
5.2 Student-Specific Regression
5.3 Methods Based on Matrix Factorization
5.4 Comparison with Other Methods
6 Conclusions
References
Flexible Transfer Learning Framework for Bayesian Optimisation
1 Introduction
2 Preliminaries
2.1 Gaussian Process
2.2 Bayesian Optimisation
3 Proposed Method
4 Experiments
4.1 Experimental Setup
4.2 Experiment with Synthetic Data
4.3 Experiment with Real World Datasets
5 Conclusion
References
A Simple Unlearning Framework for Online Learning Under Concept Drifts
1 Introduction
2 Preliminaries
3 Unlearning Framework
3.1 Unlearning Test
3.2 Instance for Unlearning Test
4 Empirical Evaluation
4.1 Results and Discussion
5 Conclusion
References
User-Guided Large Attributed Graph Clustering with Multiple Sparse Annotations
1 Introduction
2 Related Work
3 Method CGMA
3.1 Framework
3.2 Complexity Analysis
4 Experiments
5 Conclusions
References
Early-Stage Event Prediction for Longitudinal Data
1 Introduction
2 Related Work
3 Preliminaries
3.1 Problem Formulation
3.2 Naive Bayes Method
3.3 Tree-Augmented Naive Bayes Method
4 The Proposed ESP Framework
4.1 Prior Probability Extrapolation
4.2 The ESP Algorithm
5 Experimental Results
5.1 Dataset Description
5.2 Performance Evaluation
5.3 Results and Discussion
6 Conclusion
References
Toxicity Prediction in Cancer Using Multiple Instance Learning in a Multi-task Framework
1 Introduction
2 Related Work and Background Knowledge
2.1 Toxicity Prediction
2.2 The Multi-instance Learning
2.3 Multi-task Learning Using Nonparametric Factor Analysis
3 The Proposed Framework
3.1 Model Description
3.2 Model Inference
4 Experiments
4.1 Synthetic Data
4.2 Real Data Description
4.3 Experiment Setting and Results
5 Conclusion
References
Shot Boundary Detection Using Multi-instance Incremental and Decremental One-Class Support Vector Machine
1 Introduction
2 Computational Framework
2.1 Overview
2.2 Feature Extraction
2.3 OCSVM
2.4 MID-OCSVM
2.5 OCSVM Divergence
3 Experimental Results
3.1 Setup
3.2 Performance Evaluation
4 Conclusion and Future Work
References
Will I Win Your Favor? Predicting the Success of Altruistic Requests
1 Introduction
2 Dataset and Features
2.1 Data
2.2 Features
3 The Proposed GPRS Model
3.1 Constructing Request Graph
3.2 Propagation-Based Optimization
4 Evaluation
5 Conclusion
References
Feature Extraction and Pattern Mining
Unsupervised and Semi-supervised Dimensionality Reduction with Self-Organizing Incremental Neural Network and Graph Similarity Constraints
1 Introduction
2 Preliminaries
2.1 Linear Dimensionality Reduction
2.2 The Single Layered SOINN
3 Dimensionality Reduction with Semi-supervised SOINN
3.1 Problem Formation and Algorithm Framework
3.2 Semi-supervised Extension to the Single Layered SOINN
4 Experiments
4.1 Artificial Datasets
4.2 The Intrusion Detection Dataset
5 Conclusions and Future Works
References
Cross-View Feature Hashing for Image Retrieval
1 Introduction
2 Problem Formulation
3 Preliminary and A Baseline Approach
3.1 Canonical Correlation Analysis
3.2 Spectral Hashing
3.3 CCA+SH Baseline Algorithm
4 CVFH: Cross-View Feature Hashing
4.1 Objective
4.2 “Bi-Partition and Match” Strategy
4.3 Hash Functions
4.4 CVFH Algorithm
5 Experiments
5.1 Results on Toy Data
5.2 Results on NUS-WIDE-LITE Image Data
6 Conclusion and Future Work
References
Towards Automatic Generation of Metafeatures
1 Introduction
2 Metalearning
3 Systematic Generation of Metafeatures
3.1 Decomposing Metafeatures
4 Experiments
4.1 Experimental Setup
4.2 Systematized vs Unsystematized
4.3 Systematized vs State-of-the-art
5 Conclusion and Future Work
References
Hash Learning with Convolutional Neural Networks for Semantic Based Image Retrieval
1 Introduction
2 Related Work
3 Methodology
3.1 Hash Layer
3.2 Hinge Softmax Loss
3.3 The Model
3.4 Hash Codes
4 Experiments
4.1 Experimental Settings
4.2 CIFAR-10
4.3 SVHN
5 Discussion
6 Conclusion
References
Bayesian Group Feature Selection for Support Vector Learning Machines
1 Introduction
2 Related Work
3 Bayesian GFS for SVL Machines
3.1 Group Sparse Model
3.2 The Proposed Framework
3.3 Computational Complexity
4 Experiments
4.1 Regression
4.2 Classification
5 Conclusion
References
Active Distance-Based Clustering Using K-Medoids
1 Introduction
2 Related Work
3 Proposed Method
4 Empirical Results
5 Conclusion
References
Analyzing Similarities of Datasets Using a Pattern Set Kernel
1 Introduction
2 Pattern Kernel
2.1 Episode Kernel: Pattern Kernel for Injective Serial Episodes
2.2 Itemset Kernel: Pattern Kernel for Itemsets
3 Pattern Set Kernel
3.1 Complexity for Finding the Pattern Set Kernel
4 Simulations
4.1 Measuring Similarity Between Sequences
4.2 Pattern Set Kernels for Classification
4.3 Change Detection in Streaming Data from Conveyor Systems
4.4 Measuring Similarity Between Transaction Data
5 Conclusion
References
Significant Pattern Mining with Confounding Variables
1 Introduction
2 Related Work
3 Significant Pattern Mining
4 Exact Logistic Regression
4.1 Logistic Regression
4.2 Exact Inference
5 Min-P Decrease Algorithm
5.1 Algorithm for K Contingency Tables
5.2 Speed and Memory Usage Improvements
6 Experiment on Synthetic Dataset
7 Experiment on Data Integration
7.1 Significant Subgraphs
7.2 Performance Evaluation
8 Conclusion
References
Building Compact Lexicons for Cross-Domain SMT by Mining Near-Optimal Pattern Sets
1 Introduction
2 Related Work
3 Framework
3.1 Formal Problem Definition
3.2 Solution Framework
3.3 Our Approach
4 Evaluation
4.1 Experimental Setup
4.2 Effect of Syntactic Completeness-Based Consensus on Pattern Extraction
4.3 Effect of Varying the Lexicon Size
4.4 Comparison of Different Approaches to Pattern-Set Extraction for Cross-Domain SMT
5 Conclusion
References
Forest CERN: A New Decision Forest Building Technique
1 Introduction
2 Literature Review
3 Our Technique
4 Experimental Results
5 Conclusion
References
Sparse Logistic Regression with Logical Features
1 Introduction
1.1 Related Work
1.2 Contributions
2 Model Formulation
3 Experiments
3.1 Experiment 1
3.2 Experiment 2
3.3 Experiment 3
4 Discussion
References
A Nonlinear Label Compression and Transformation Method for Multi-label Classification Using Autoencoders
1 Introduction and Related Work
2 Maniac – Multi-Label Classification Using Autoencoders
3 Evaluation
3.1 Implementation
4 Experimental Results
5 Conclusion and Future Work
References
Preconditioning an Artificial Neural Network Using Naive Bayes
1 Introduction
2 WANBIA^C_CLL
3 Method
4 Experimental Results
4.1 MSE Vs. CLL
4.2 WANBIA^C_MSE Vs. ANN
4.3 WANBIA^C_MSE Vs. Random Forest
5 Conclusion
References
OCEAN: Fast Discovery of High Utility Occupancy Itemsets
1 Introduction
2 Related Work
3 Problem Formulation
4 High Utility Occupancy Itemset Mining
4.1 Upper Bound of Utility Occupancy
4.2 Design and Implementation of OCEAN
5 Experiment
5.1 High Utility Occupancy Itemsets Vs. High Utility Itemsets
5.2 Efficiency of OCEAN
6 Conclusion
References
Graph and Network Data
Leveraging Emotional Consistency for Semi-supervised Sentiment Classification
1 Introduction
2 Related Work
3 Problem Statement
4 Proposed Approach
4.1 Phase 1: Label Propagation
4.2 Phase 2: Emotional Clustering Consistency
4.3 Phase 3: Target Classifier Learning
5 Performance Evaluation
5.1 Experimental Settings
5.2 Experimental Results
6 Conclusion
References
The Effect on Accuracy of Tweet Sample Size for Hashtag Segmentation Dictionary Construction
1 Introduction
2 Hashtag Segmentation Using Dynamic Programming
3 Segmentation Accuracy Distribution
4 Jaccard Similarity Distribution Parameters
5 Accuracy of the Model
6 Conclusion
A Derivation of Model Mean and Variance
References
Social Identity Link Across Incomplete Social Information Sources Using Anchor Link Expansion
Abstract
1 Introduction
2 Related Works
3 Overall Algorithm
4 Optimal Search Range
5 Identity Matching
5.1 Features Definition
5.2 Decision Model on Pairwise Similarity
6 Experiments
6.1 Experimental Setup
6.2 Experimental Results
6.3 Efficiency Evaluation
7 Conclusion
Acknowledgments
References
Discovering the Network Backbone from Traffic Activity Data
1 Introduction
2 Problem Definition
3 Related Work
4 Algorithm
4.1 The Greedy Algorithm
4.2 Speeding up the Greedy Algorithm
5 Experimental Evaluation
5.1 Quantitative Results
5.2 Comparison to Existing Approaches
6 Conclusions
References
A Fast and Complete Enumeration of Pseudo-Cliques for Large Graphs
1 Introduction
2 Preliminary and Notation
3 Maximal Connected k-Plex
4 j-Cored k-MPC
4.1 Small c-k-Plex
4.2 Medium c-k-Plex
4.3 Large c-k-Plex
4.4 Formations Revised for (j,k)-MPCs
5 Search Control Rules, Right and Left Ones
6 Algorithm for (j,k)-MPCs
7 Experiments
7.1 Computational Performance
7.2 Quality of Solutions as Pseudo-Cliques
8 Conclusion
References
Incorporating Heterogeneous Information for Mashup Discovery with Consistent Regularization
1 Introduction
2 Related Work
3 Heterogeneous Network Based Mashup Discovery
3.1 Baseline Model for Mashup Discovery
3.2 Heterogeneous Information Incorporation
3.3 Implementation
4 Experiments
4.1 Experimental Setup
4.2 Comparison
4.3 Impact of
4.4 Impact of
5 Conclusion and Future Work
References
Link Prediction in Schema-Rich Heterogeneous Information Network
1 Introduction
2 Preliminary and Problem Definition
3 The Method Description
3.1 Automatic Meta Path Generation
3.2 Integration of Meta Path
4 Experiment
4.1 Dataset
4.2 Criteria
4.3 Effectiveness Experiments
4.4 Influence of the Size of Training Set
4.5 Impact of Weight Learning
4.6 Efficiency
5 Conclusions
References
FastStep: Scalable Boolean Matrix Decomposition
1 Introduction
2 Background and Related Work
3 Proposed Method
3.1 Formal Objective
3.2 Step Matrix Decomposition
3.3 FastStep Matrix Decomposition
4 Experimental Evaluation
4.1 Scalability
4.2 Low Reconstruction Error
4.3 Discoveries
5 Conclusion
References
Applications
An Expert-in-the-loop Paradigm for Learning Medical Image Grouping
1 Introduction
2 Paradigm Initialization
3 Interface Design
4 Visualizing Image Groups
5 Expert Knowledge Constraints
5.1 Constraint on Neighboring Matrix, W
5.2 Constraint on Topic-Coefficient Matrix, C
6 Evaluation and Discussions
7 Related Work
8 Conclusions
References
Predicting Post-operative Visual Acuity for LASIK Surgeries
1 Introduction
2 Introduction to Laser Surgeries
3 Features for Post-operative UCVA Prediction
3.1 Demography Features
3.2 Pre-operative Examination Features
3.3 Surgery Settings
4 Approaches for Post-operative UCVA Prediction
5 Experiments
5.1 Dataset
5.2 Metrics
5.3 Results
6 Related Work
7 Conclusion
References
LBMF: Log-Bilinear Matrix Factorization for Recommender Systems
1 Introduction
2 Related Work
3 Preliminaries
3.1 Notations
3.2 Latent Factor Models
4 Log-Bilinear Matrix Factorization
4.1 Log-Bilinear Document Model
4.2 LBMF
5 Experiments
5.1 Compared Algorithms
5.2 Evaluation
5.3 Rating Prediction
5.4 Parameter Sensitivity
6 Conclusion
References
An Empirical Study on Hybrid Recommender System with Implicit Feedback
1 Introduction
2 Previous Work
2.1 Content-Based and Collaborative Filtering Methods
2.2 Methods with Implicit Feedback
2.3 Hybrid Model
3 Our Model
3.1 Collaborative Component
3.2 Content-Based Component
3.3 Hybrid System
4 Experimental Study
4.1 Data Description
4.2 Evaluation Methods
4.3 Results and Discussion
5 Conclusions and Future Works
References
Who Will Be Affected by Supermarket Health Programs? Tracking Customer Behavior Changes via Preference Modeling
1 Introduction
2 Related Work
3 Methodology
3.1 Extracting Customer Preferences
3.2 Constructing Temporal Model for Customer Preferences
3.3 Analyzing Customer Preference Changes
3.4 Evaluating Program Influence on Customer Segments
4 Results for Our Case Study
4.1 Visualization of Customer Preference Changes
4.2 Program Effects for Different Types of Customers
5 Conclusion
References
TrafficWatch: Real-Time Traffic Incident Detection and Monitoring Using Social Media
1 Introduction
2 Related Work
3 Methods
3.1 Filters
3.2 NLP Components
3.3 Machine Learning Processes
4 Experiments and Results
4.1 Data Set
4.2 Named-Entity Recognition
4.3 Classification
4.4 Incident Detection for Special Event
5 Live Traffic Monitoring System
6 Conclusions
References
Automated Setting of Bus Schedule Coverage Using Unsupervised Machine Learning
1 Introduction
2 Methodology
2.1 Modeling the Daily Profiles
2.2 Expectation-Maximization (EM) for Clustering Analysis
2.3 Automated Selection of Number of Schedules
3 Case Study
4 Experiments
4.1 Impact Evaluation Through a Data-Driven Simulation
4.2 Results
4.3 Discussion
5 Final Remarks
References
Effective Local Metric Learning for Water Pipe Assessment
1 Introduction
2 Related Works
3 The Proposed Method
3.1 Data Collection
3.2 Fuzzy-Based Local Metric Learning
3.3 The Proposed Kernel Density-Based Fuzzy Metric Learning
4 Experiments and Results
5 Conclusion
References
Classification with Quantification for Air Quality Monitoring
1 Introduction
2 Related Work
3 Methodology
3.1 System Architecture
3.2 Feature Extraction
3.3 Classification
3.4 Confidence Scores
3.5 Scalable Gaussian Process for Quantification
4 Experiments and Evaluation
5 Conclusions and Future Work
6 Reproducibility
References
Predicting Unknown Interactions Between Known Drugs and Targets via Matrix Completion
1 Introduction
2 Methods
2.1 Drug-Target Interaction Databases
2.2 Motivation by Data Visualization
2.3 The Drug-Target Interaction Prediction Method
2.4 Performance Metrics
3 Result
3.1 Performance Comparison of NBI, GIP, KBMF2K, PMF and Our Method on Gold Standard Datasets
3.2 Comparison with the NetLapRLs Method
4 Discussion
4.1 Validated New Pairs in the Latest Databases
4.2 Limitation
5 Conclusions
References
Author Index