Differential Privacy and Applications

Preface
Acknowledgments
Contents
1 Introduction
1.1 Privacy Preserving Data Publishing and Analysis
1.2 Privacy Violations
1.3 Privacy Models
1.4 Differential Privacy
1.5 Outline and Book Overview
2 Preliminary of Differential Privacy
2.1 Notations
2.2 Differential Privacy Definition
2.2.1 The Privacy Budget
2.3 The Sensitivity
2.3.1 The Global Sensitivity
2.3.2 The Local Sensitivity
2.4 The Principal Differential Privacy Mechanisms
2.4.1 The Laplace Mechanism
2.4.1.1 The Gaussian Mechanism
2.4.2 The Exponential Mechanism
2.4.2.1 Mechanism Example
2.5 Utility Measurement of Differential Privacy
3 Differentially Private Data Publishing: Settings and Mechanisms
3.1 Interactive and Non-interactive Settings
3.2 Publishing Mechanism
4 Differentially Private Data Publishing: Interactive Setting
4.1 Transaction Data Publishing
4.1.1 Laplace
4.1.2 Transformation
4.1.3 Query Separation
4.1.4 Iteration
4.1.5 Discussion
4.2 Histogram Publishing
4.2.1 Laplace
4.2.2 Partition of Dataset
4.2.3 Consistency of Histogram
4.3 Stream Data Publishing
4.3.1 Laplace
4.3.2 Partition of Dataset
4.3.3 Iteration
4.3.4 Discussion
4.4 Graph Data Publishing
4.4.1 Edge Differential Privacy
4.4.2 Node Differential Privacy
4.4.3 Discussion
4.5 Summary on Interactive Setting
5 Differentially Private Data Publishing: Non-interactive Setting
5.1 Batch Queries Publishing
5.1.1 Laplace
5.1.2 Transformation
5.1.3 Partition of Dataset
5.1.4 Iteration
5.1.5 Discussion
5.2 Contingency Table Publishing
5.2.1 Laplace
5.2.2 Iteration
5.2.3 Transformation
5.3 Anonymized Dataset Publishing
5.4 Synthetic Dataset Publishing
5.4.1 Synthetic Dataset Publishing Based on Learning Theory
5.4.1.1 Learning Theory in Differential Privacy
5.4.1.2 Synthetic Publishing
5.4.2 High Dimensional Synthetic Dataset Publishing
5.5 Summary on Non-interactive Setting
6 Differentially Private Data Analysis
6.1 Laplace/Exponential Framework
6.1.1 SuLQ and PINQ Interface
6.1.1.1 SuLQ
6.1.1.2 PINQ
6.1.2 Specific Algorithms in the Laplace/Exponential Framework
6.1.2.1 Supervised Learning
6.1.2.2 Unsupervised Learning
6.1.2.3 Frequent Itemset Mining
6.1.3 Summary on Laplace/Exponential Framework
6.2 Private Learning Framework
6.2.1 Foundation of ERM
6.2.2 Private Learning in ERM
6.2.2.1 Output Perturbation
6.2.2.2 Objective Perturbation
6.2.2.3 Risk Bound in Different Learning Algorithms
6.2.2.4 Discussion
6.2.3 Sample Complexity Analysis
6.2.3.1 Relaxing Privacy Requirement
6.2.3.2 Relaxing Hypothesis
6.2.3.3 Semi-Supervised Learning
6.2.3.4 Discussion
6.2.4 Summary on Private Learning Framework
6.3 Summary of Differentially Private Data Analysis
7 Differentially Private Deep Learning
7.1 Introduction
7.2 Preliminary
7.2.1 Deep Learning Structure
7.2.2 Stochastic Gradient Descent
7.2.2.1 Deep Auto-Encoder
7.3 Differentially Private Deep Learning
7.3.1 Basic Laplace Method
7.3.2 Private SGD Method
7.3.2.1 Norm Clipping
7.3.2.2 Grouping Batches
7.3.2.3 Privacy Composition
7.3.3 Deep Private Auto-Encoder Method
7.3.3.1 Deep Private Auto-Encoder Algorithm
7.3.3.2 Functional Mechanism
7.3.3.3 Sensitivity Measurements
7.3.4 Distributed Private SGD
7.3.4.1 Sparse Vector Technique
7.4 Experimental Methods
7.4.1 Benchmark Datasets
7.4.2 Learning Objectives
7.4.3 Computing Frameworks
7.5 Summary
8 Differentially Private Applications: Where to Start?
8.1 Solving a Privacy Problem in an Application
8.2 Challenges in Differentially Private Applications
8.2.1 High Sensitivity Challenge
8.2.2 Dataset Sparsity Challenge
8.2.3 Large Query Set Challenge
8.2.4 Correlated Data Challenge
8.2.5 Computational Complexity Challenge
8.2.6 Summary
8.3 Useful Public Datasets in Applications
8.3.1 Recommender System Datasets
8.3.2 Online Social Network Datasets
8.3.3 Location Based Datasets
8.3.4 Other Datasets
8.4 Application Settings
9 Differentially Private Social Network Data Publishing
9.1 Introduction
9.2 Preliminaries
9.3 Basic Differentially Private Social Network Data Publishing Methods
9.3.1 Node Differential Privacy
9.3.1.1 Truncation and Smooth Sensitivity
9.3.1.2 Lipschitz Extension
9.3.1.3 Iterative Based Mechanism
9.3.2 Edge Differential Privacy
9.4 Graph Update Method
9.4.1 Overview of Graph Update
9.4.2 Graph Update Method
9.4.3 Update Function
9.4.4 Privacy and Utility Analysis
9.4.4.1 Privacy Analysis
9.4.4.2 Utility Analysis
9.4.5 Experimental Evaluation
9.4.5.1 Datasets and Configuration
9.4.5.2 Performance Evaluation on Diverse Size of Query Sets
9.5 Summary
10 Differentially Private Recommender System
10.1 Introduction
10.2 Preliminaries
10.2.1 Collaborative Filtering
10.2.2 Neighborhood-Based Methods: k Nearest Neighbors
10.2.3 Model-Based Methods: Matrix Factorization
10.3 Basic Differentially Private Recommender Systems
10.3.1 Differentially Private Untrustworthy Recommender System
10.3.2 Differentially Private Trustworthy Recommender System
10.3.2.1 Matrix Factorization with Private Input Perturbation
10.3.2.2 Private Stochastic Gradient Perturbation
10.3.2.3 ALS with Output Perturbation
10.4 Private Neighborhood-Based Collaborative Filtering Method
10.4.1 KNN Attack to Collaborative Filtering
10.4.2 The Private Neighbor Collaborative Filtering Algorithm
10.4.2.1 The Private Neighbor Selection
10.4.2.2 Recommendation-Aware Sensitivity
10.4.2.3 Private Neighbor Selection Implementation
10.4.3 Privacy and Utility Analysis
10.4.3.1 Utility Analysis
10.4.3.2 Privacy Analysis
10.4.4 Experiment Analysis
10.4.4.1 Datasets and Measurements
10.4.4.2 Performance of PNCF
10.5 Summary
11 Privacy Preserving for Tagging Recommender Systems
11.1 Introduction
11.2 Preliminaries
11.2.1 Notations
11.2.2 Tagging Recommender Systems
11.2.3 Related Work
11.3 Private Tagging Publishing Method
11.3.1 User Profiles
11.3.2 Private Tagging Release Algorithm Overview
11.3.3 Private Topic Model Generation
11.3.3.1 LDA Model Construction
11.3.3.2 Private Model Generation
11.3.3.3 Topic-Based Profile Generation
11.3.4 Topic Weight Perturbation
11.3.5 Private Tag Selection
11.3.6 Privacy and Utility Analysis
11.3.6.1 Privacy Analysis
11.3.6.2 Utility Analysis
11.3.7 Experimental Evaluation
11.3.7.1 Datasets
11.3.7.2 Performance of Tagging Recommendation
11.4 Summary
12 Differential Location Privacy
12.1 Introduction
12.2 Preliminary
12.3 Basic Location Privacy Methods
12.3.1 Snapshot Location Privacy: Geo-Indistinguishability
12.3.1.1 Probabilistic Model
12.3.1.2 Geo-Indistinguishability Definition
12.3.1.3 Geo-Indistinguishability Method
12.3.2 Trajectory Privacy
12.4 Hierarchical Snapshot Location Publishing
12.4.1 Hierarchical Sensitivity
12.4.2 Overview of Private Location Release
12.4.3 Private Location Release Algorithm
12.4.3.1 Private Location Clustering
12.4.3.2 Cluster Weight Perturbation
12.4.3.3 Private Location Selection
12.4.4 Utility and Privacy
12.4.4.1 Utility Analysis
12.4.4.2 Privacy Analysis
12.4.5 Experimental Evaluation
12.4.5.1 Datasets
12.4.5.2 Estimation of Distance Error
12.5 Summary
13 Differentially Private Spatial Crowdsourcing
13.1 Introduction
13.2 Basic Method
13.2.1 Background of Crowdsourcing
13.2.2 Differentially Private Crowdsourcing Methods
13.3 Differential Privacy in Reward-Based Crowdsourcing
13.3.1 Problem Statement
13.3.2 Building a Contour Plot with DP Guarantee
13.3.3 Task Assignment
13.3.3.1 Modeling Acceptance Probability and ASP
13.3.3.2 Optimized Strategy and Radius Estimation
13.3.4 Experimental Evaluation
13.3.4.1 Performance on DE
13.3.4.2 Performance on RejR
13.3.4.3 Performance on EC
13.4 Summary
14 Correlated Differential Privacy for Non-IID Datasets
14.1 Introduction
14.2 An Example: Correlated Records in a Dataset
14.3 Basic Methods
14.3.1 Pufferfish
14.3.2 Blowfish
14.4 Correlated Differential Privacy
14.4.1 Correlated Differential Privacy Problem
14.4.2 Research Issues and Challenges
14.4.3 Correlated Dataset Analysis
14.4.4 Correlated Sensitivity
14.4.5 Correlated Iteration Mechanism
14.4.5.1 Overview of Correlated Iteration Mechanism
14.4.5.2 Correlated Update Function
14.4.5.3 Parameters Discussion
14.4.6 Mechanism Analysis
14.4.6.1 Privacy Analysis
14.4.6.2 Utility Analysis
14.4.7 Experiment and Analysis
14.4.7.1 Datasets and Configuration
14.4.7.2 The Performance of CIM
14.4.7.3 Correlated Sensitivity vs. Global Sensitivity
14.5 Summary
15 Future Directions and Conclusion
15.1 Adaptive Data Analysis: Generalization in Machine Learning
15.2 Personalized Privacy
15.3 Secure Multiparty Computations with Differential Privacy
15.4 Differential Privacy and Mechanism Design
15.5 Differential Privacy in Genetic Data
15.6 Local Differential Privacy
15.7 Learning Model Publishing
15.8 Conclusion
References
Index
Advances in Information Security 69
Tianqing Zhu • Gang Li • Wanlei Zhou • Philip S. Yu
Differential Privacy and Applications
Advances in Information Security
Volume 69
Series editor: Sushil Jajodia, George Mason University, Fairfax, VA, USA
More information about this series at http://www.springer.com/series/5576
Tianqing Zhu • Gang Li • Wanlei Zhou • Philip S. Yu
Differential Privacy and Applications
Tianqing Zhu, Deakin University, Melbourne, Australia
Wanlei Zhou, Deakin University, Melbourne, Australia
Gang Li, Deakin University, Melbourne, Australia
Philip S. Yu, University of Illinois at Chicago, Chicago, IL, USA

ISSN 1568-2633 (Advances in Information Security)
ISBN 978-3-319-62002-2
ISBN 978-3-319-62004-6 (eBook)
DOI 10.1007/978-3-319-62004-6
Library of Congress Control Number: 2017946488

© Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature.
The registered company is Springer International Publishing AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Preface

Corporations, organizations, and governments have collected, digitized, and stored information in digital forms since the invention of computers, and the speed of such data collection and the volumes of stored data have increased dramatically over the past few years, thanks to the pervasiveness of computing devices and the associated applications that are closely linked to our daily lives. For example, hospitals collect records of patients, search engines collect online behaviors of users, social network sites collect connected friends of people, e-commerce sites collect shopping habits of customers, and toll road authorities collect travel details of vehicles. Such huge datasets provide excellent opportunities for businesses and governments to improve their services and to bring economic and social benefits, especially through the use of technologies dealing with big data, including data mining, machine learning, artificial intelligence, data visualization, and data analytics. For example, releasing some statistics of hospital records may help medical research to combat diseases. However, as most of the collected datasets are personally related and contain private or sensitive information, data release may also provide fertile ground for adversaries to obtain certain private and sensitive information, even when simple anonymization techniques are used to hide such information. Privacy preservation has, therefore, become an urgent issue that needs to be addressed in the digital age.

Differential privacy is a new and promising privacy framework and has become a popular research topic in both academia and industry. It is one of the most prevalent privacy models, as it provides a rigorous and provable privacy notion that can be applied in various research areas and can potentially be implemented in various application scenarios. The goal of this book is to summarize and analyze the state-of-the-art research and investigations in the field of differential privacy and its applications in privacy-preserving data publishing and releasing, so as to provide an approachable strategy for researchers and engineers to implement this new framework in real-world applications.

This is the first book with a balanced view on differential privacy theory and its applications, as most existing books related to privacy preservation either do not touch the topic of differential privacy or only focus on the theoretical analysis of differential privacy. Instead of using abstract and complex notions to describe the concepts, methods, algorithms, and analysis of differential privacy, this book presents these difficult topics in combination with applications, in order to help students, researchers, and engineers with less mathematical background understand the new concepts and framework, enabling a wider adoption and implementation of differential privacy in the real world. The striking features of this book, which set it apart from others, can be illustrated from three basic aspects:

• A detailed coverage of differential privacy from the perspective of engineering, rather than computing theory. The most difficult part in comprehending differential privacy is the complexity and the level of abstraction of the theory. This book presents the theory of differential privacy in a more natural and easy-to-understand way.
• A rich set of state-of-the-art examples in various application areas that helps readers understand how to implement differential privacy in real-world scenarios. Each application example includes a brief introduction to the problem and its challenges, a detailed implementation of differential privacy to solve the problem, and an analysis of privacy and utility.
• A comprehensive collection of contemporary research results and issues in differential privacy. Apart from the basic theory, most of the contents of the book are drawn from publications in the last 5 years, reflecting the state of the art of research and development in the area of differential privacy.

This book intends to enable readers, especially postgraduate and senior undergraduate students, to study up-to-date concepts, methods, algorithms, and analytic skills for building modern privacy-preserving applications through differential privacy. It enables students not only to master the concepts and theories in relation to differential privacy but also to readily put the material introduced into implementation practice. Therefore, the book is divided into two parts: theory and applications. In the theory part, after an introduction to the differential privacy preliminaries, the book presents detailed descriptions, from an engineering viewpoint, of differentially private data publishing and differentially private data analysis, the areas where research on differential privacy has been conducted. In the applications part, after a summary of the steps to follow when solving a privacy-preserving problem in a particular application, the book presents a number of state-of-the-art application areas of differential privacy, including differentially private social network data publishing, differentially private recommender systems, differential location privacy, spatial crowdsourcing with differential privacy preservation, and correlated differential privacy for non-IID datasets. The book also includes a final chapter on future directions of differential privacy and its applications.
Acknowledgments

We are grateful to the many research students and colleagues at Deakin University in Melbourne and at the University of Illinois at Chicago whose comments on our presentations inspired us to write this book. We would like to acknowledge the support from the research grants we have received, in particular Australian Research Council grants DP1095498, LP120200266, and DP140103649, NSF grants IIS-1526499 and CNS-1626432, and NSFC grants 61672313 and 61502362. Some of the research results presented in the book are taken from our research papers, which were (partially) supported by these grants. We would also like to express our appreciation to the editors at Springer, especially Susan Lagerstrom-Fife, for their excellent professional support. Finally, we are grateful to our families for their consistent and persistent support. Without their support, this book might have remained a set of unpublished discussions.

Melbourne, Australia    Tianqing Zhu
Melbourne, Australia    Wanlei Zhou
Melbourne, Australia    Gang Li
Chicago, IL, USA        Philip S. Yu
May 2017