
Java Data Science Cookbook.pdf

Java Data Science Cookbook
Credits
About the Author
About the Reviewer
www.PacktPub.com
Why subscribe?
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it...
How it works...
There's more...
See also
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Obtaining and Cleaning Data
Introduction
Retrieving all filenames from hierarchical directories using Java
Getting ready
How to do it...
Retrieving all filenames from hierarchical directories using Apache Commons IO
Getting ready
How to do it...
Reading contents from text files all at once using Java 8
How to do it...
Reading contents from text files all at once using Apache Commons IO
Getting ready
How to do it...
Extracting PDF text using Apache Tika
Getting ready
How to do it...
Cleaning ASCII text files using Regular Expressions
How to do it...
Parsing Comma Separated Value (CSV) Files using Univocity
Getting ready
How to do it...
Parsing Tab Separated Value (TSV) file using Univocity
Getting ready
How to do it...
Parsing XML files using JDOM
Getting ready
How to do it...
Writing JSON files using JSON.simple
Getting ready
How to do it...
Reading JSON files using JSON.simple
Getting ready
How to do it...
Extracting web data from a URL using JSoup
Getting ready
How to do it...
Extracting web data from a website using Selenium Webdriver
Getting ready
How to do it...
Reading table data from a MySQL database
Getting ready
How to do it...
2. Indexing and Searching Data
Introduction
Indexing data with Apache Lucene
Getting ready
How to do it...
How it works...
Searching indexed data with Apache Lucene
Getting ready
How to do it...
3. Analyzing Data Statistically
Introduction
Generating descriptive statistics
How to do it...
Generating summary statistics
How to do it...
Generating summary statistics from multiple distributions
How to do it...
There's more...
Computing frequency distribution
How to do it...
Counting word frequency in a string
How to do it...
How it works...
Counting word frequency in a string using Java 8
How to do it...
Computing simple regression
How to do it...
Computing ordinary least squares regression
How to do it...
Computing generalized least squares regression
How to do it...
Calculating covariance of two sets of data points
How to do it...
Calculating Pearson's correlation of two sets of data points
How to do it...
Conducting a paired t-test
How to do it...
Conducting a Chi-square test
How to do it...
Conducting the one-way ANOVA test
How to do it...
Conducting a Kolmogorov-Smirnov test
How to do it...
4. Learning from Data - Part 1
Introduction
Creating and saving an Attribute-Relation File Format (ARFF) file
How to do it...
Cross-validating a machine learning model
How to do it...
Classifying unseen test data
Getting ready
How to do it...
Classifying unseen test data with a filtered classifier
How to do it...
Generating linear regression models
How to do it...
Generating logistic regression models
How to do it...
Clustering data points using the KMeans algorithm
How to do it...
Clustering data from classes
How to do it...
Learning association rules from data
Getting ready
How to do it...
Selecting features/attributes using the low-level method, the filtering method, and the meta-classifier method
Getting ready
How to do it...
5. Learning from Data - Part 2
Introduction
Applying machine learning on data using Java Machine Learning (Java-ML) library
Getting ready
How to do it...
Classifying data points using the Stanford classifier
Getting ready
How to do it...
How it works...
Classifying data points using Massive Online Analysis (MOA)
Getting ready
How to do it...
Classifying multilabeled data points using Mulan
Getting ready
How to do it...
6. Retrieving Information from Text Data
Introduction
Detecting tokens (words) using Java
Getting ready
How to do it...
Detecting sentences using Java
Getting ready
How to do it...
Detecting tokens (words) and sentences using OpenNLP
Getting ready
How to do it...
Retrieving lemma, part-of-speech, and recognizing named entities from tokens using Stanford CoreNLP
Getting ready
How to do it...
Measuring text similarity with Cosine Similarity measure using Java 8
Getting ready
How to do it...
Extracting topics from text documents using Mallet
Getting ready
How to do it...
Classifying text documents using Mallet
Getting ready
How to do it...
Classifying text documents using Weka
Getting ready
How to do it...
7. Handling Big Data
Introduction
Training an online logistic regression model using Apache Mahout
Getting ready
How to do it...
Applying an online logistic regression model using Apache Mahout
Getting ready
How to do it...
Solving simple text mining problems with Apache Spark
Getting ready
How to do it...
Clustering using KMeans algorithm with MLlib
Getting ready
How to do it...
Creating a linear regression model with MLlib
Getting ready
How to do it...
Classifying data points with Random Forest model using MLlib
Getting ready
How to do it...
8. Learn Deeply from Data
Introduction
Creating a Word2vec neural net using Deep Learning for Java (DL4j)
How to do it...
How it works...
There's more...
Creating a Deep Belief neural net using Deep Learning for Java (DL4j)
How to do it...
How it works...
Creating a deep autoencoder using Deep Learning for Java (DL4j)
How to do it...
How it works...
9. Visualizing Data
Introduction
Plotting a 2D sine graph
Getting ready
How to do it...
Plotting histograms
Getting ready
How to do it...
Plotting a bar chart
Getting ready
How to do it...
Plotting box plots or whisker diagrams
Getting ready
How to do it...
Plotting scatter plots
Getting ready
How to do it...
Plotting donut plots
Getting ready
How to do it...
Plotting area graphs
Getting ready
How to do it...