
OSDI 2018 Paper Collection

osdi18-huang
Introduction
Problem Statement
Panorama System
Overview
Abstractions and APIs
Local Observation Store
Observers
Observation Exchange
Judging Failure from Observations
Design Pattern and Observability
A Failed Case
Observability Patterns
Implications
Observability Analysis
Locate Observation Boundary
Identify Observer and Observed
Extract Observation
Handling Indirection
Implementation
Evaluation
Experiment Setup
Integration with Several Systems
Detection of Crash Failures
Detection of Gray Failures
Fault Localization
Transient Failure, Normal Operations
Performance
Discussion and Limitations
Related Work
Conclusion
osdi18-cui
Introduction
Overview
Problem Statement
Design Choices
Challenges
Irreversible Instructions
Missing Memory Writes
Concurrent Memory Writes
Design
Instruction Reversal
Irreversible Instruction Handling
Recovering Memory Writes
Data Inference Graph
Error Correction
Handling Concurrency
Implementation
Online Hardware Tracing
Offline Binary Analysis
Deployment
Evaluation
Accuracy
Single-Thread Accuracy
Multiple-Thread Accuracy
Efficiency
Effectiveness
Deployment
Discussion
Related Work
Conclusion
Acknowledgments
osdi18-mohan
Introduction
Background
Studying Crash-Consistency Bugs
B3: Bounded Black-Box Crash Testing
Overview
Bounds used by B3
Fine-grained correctness checking
Limitations
CrashMonkey and Ace
CrashMonkey
Automatic Crash Explorer (Ace)
Testing and Bug Analysis
Evaluation
Experimental Setup
Bug Finding
CrashMonkey Performance
Ace Performance
Resource Consumption
Related Work
Conclusion
osdi18-alquraan
osdi18-shan
osdi18-cutler
osdi18-khawaja
osdi18-maeng
osdi18-qin
osdi18-yang
osdi18-sriraman
Introduction
Motivation
A Taxonomy of Threading Models
Key dimensions
Synchronous models
Asynchronous models
Tune: System Design
Implementation
Experimental Setup
Evaluation
Threading model characterization
Synchronous vs. Asynchronous
Synchronous models
Asynchronous models
Load adaptation
Comparison to the state-of-the-art
Steady-state adaptation
Load transients
Discussion
Related Work
Conclusion
Acknowledgement
References
osdi18-berger
Introduction
Background and Challenges
The goal of RobinHood
Challenges of caching for tail latency
Time-varying latency imbalance
Latency is not correlated with specific queries nor with query rate
Latency depends on request structure, which varies greatly
The RobinHood Caching System
The basic RobinHood algorithm
Refining the RobinHood algorithm
RobinHood architecture
System Implementation and Challenges
Implementation and testbed
Implementation challenges
Generating experimental data
Evaluation
Competing caching systems
How much does RobinHood improve SLO violations for OneRF's workload?
How much variability can RobinHood handle?
How robust is RobinHood to simultaneous latency spikes?
How much space does RobinHood save?
What is the overhead of running RobinHood?
Discussion
Related Work
Conclusion
osdi18-gjengset
osdi18-wei
osdi18-mahajan
osdi18-hsieh
1 Introduction
2 Background and Motivation
2.1 Convolutional Neural Networks
2.2 Characterizing Real-world Videos
2.2.1 Excluding large portions of videos
2.2.2 Limited set of object classes in each video
2.2.3 Feature vectors for finding duplicate objects
3 Overview of Focus
4 Video Ingest & Querying Techniques
4.1 Approximate Index via Cheap Ingest
4.2 Video-specific Specialization of Ingest CNN
4.3 Redundant Object Elimination
4.4 Trading off Ingest Cost and Query Latency
5 Implementation
5.1 Ingest Processor
5.2 Stream Tuner
5.3 Query Processor
6 Evaluation
6.1 Methodology
6.2 End-to-End Performance
6.3 Effect of Different Focus Components
6.4 Ingest Cost vs. Query Latency Trade-off
6.5 Sensitivity to Recall/Precision Target
6.6 Sensitivity to Object Class Numbers
7 Other Applications
8 Related Work
9 Conclusion
osdi18-sigurbjarnarson
osdi18-chajed
Introduction
Related work
Goal and challenges
Approach to proving atomicity
Design of Cspec
Layers
Defining correctness
Implementation
Proving
Cspec's proof patterns
Mover pattern
Protocol pattern
Abstraction pattern
Other patterns
Implementation
Evaluation
Verifying Cmail
Verifying a counter on x86-TSO
Speedup
Effort
Patterns
Trusted computing base
Conclusion
osdi18-ileri
Introduction
Related Work
Motivation: bugs
Goal
Threat model
Challenges
Specification: data noninterference
Proof approach: sealed blocks
Formalizing sealed blocks
Proving noninterference
Code generation
Case study: File system
Specifying security
Modifying the implementation
Proving security
Implementation
Evaluation
Specification trustworthiness
Effort
Performance
Conclusion
osdi18-setty
1 Introduction
2 Problem statement and background
2.1 A prior instantiation of VSMs
2.1.1 Interacting with external resources
2.1.2 Handling state
2.2 Outlook and roadmap
3 Efficient storage operations in VSMs
3.1 SetKV: A verifiable key-value store
3.2 Building VSMs using SetKV
4 Supporting concurrent services
4.1 Executing requests concurrently
4.2 Supporting transactional semantics
5 Efficient instantiations
5.1 Parallelizing audits
5.2 Efficient cryptographic primitives
6 Implementation and applications
6.1 Applications of Spice
7 Experimental evaluation
7.1 Spice's approach to state vs. prior solutions
7.2 Benefits of Spice's concurrent execution
7.3 Performance of apps built with Spice
8 Related work
9 Discussion and summary
osdi18-lockerman
osdi18-veeraraghavan
osdi18-alagappan
osdi18-maricq
osdi18-klimovic
Introduction
Storage for Serverless Analytics
Ephemeral Storage Requirements
Existing Systems
Pocket Design
System Architecture
Application Interface
Life of a Pocket Application
Handling Node Failures
Rightsizing Resource Allocations
Rightsizing Application Allocation
Rightsizing the Storage Cluster
Implementation
Evaluation
Methodology
Microbenchmarks
Rightsizing Resource Allocations
Comparison to S3 and Redis
Discussion
Related Work
Conclusion
osdi18-annamalai
Introduction
Motivation
Capital and operational costs matter
Service request movements
Low read-write ratios
Ineffectiveness of distributed caches
Separate locality management layer
Background
Akkio Design and Implementation
Design guidelines
Requirements
Architectural overview
Akkio Location Service (ALS)
Access Counter Service
Akkio Data Placement Service (DPS)
Provisioning new µ-shards
Determining optimal µ-shard placement
µ-shard migration
Replica set collection changes
DPS fault recovery
Evaluation
Implementation metrics
Use cases analysis
ViewState
AccessState
Instagram Connection-Info
Instagram Direct
Analysis of Akkio services
Related Work
Concluding Remarks
osdi18-zuo
Introduction
Background and Motivation
Data Consistency in NVM
Hashing Index Structures for NVM
Conventional Hashing Schemes
Hashing Schemes for NVM
Resizing a Hash Table
The Level Hashing Design
Write-optimized Hash Table Structure
Cost-efficient In-place Resizing
Low-overhead Consistency Guarantee
Concurrent Level Hashing
Performance Evaluation
Experimental Setup
Experimental Results
Maximum Load Factor
Insertion Latency
Update Latency
Search Latency
Deletion Latency
Resizing Time
Concurrent Throughput
Related Work
Conclusion
osdi18-zhang
osdi18-bhagwan
osdi18-jindal
Introduction
Key Insights
Competing/similar apps are abundant
Similar apps differ in energy drain
Framework services dominate app energy drain
How to Diff Energy Profiles?
What diffing granularity?
How do app tasks manifest in call trees?
What tree structures to diff?
How to perform Eflask matching?
Need for approximate DiffProf matching
Prior tree matching algorithms
The Eflask matching algorithm
Preprocessing CCTs to facilitate effective matching
Implementation and Usage
Evaluation
Experimental setup
Diffing results
Effectiveness
Unmatched (extra) tasks
Matched tasks
Discussions
Related work
Conclusion
osdi18-zhou
osdi18-quinn
Introduction
Usage
Debugging tools
Parallel retro-logging
Continuous function evaluation
Retro-timing
Scenarios
Atomicity Violation
Apache 45605
Apache 25520
Data corruption
Wild store
Memory leak
Lock Contention
Design and implementation
Background: Deterministic record and replay
Sledgehammer API
Tracers
Tracepoints
Preparing for debugging queries
Running a parallel debugging tool
Isolation
Support for continuous function evaluation
Support for retro-timing
Aggregating results
Evaluation
Experimental Setup
Benchmarks
Scalability
Scaling bottlenecks
Benefit of parallel analysis
Isolation
Recording Overhead
Related Work
Conclusion
osdi18-moritz
osdi18-chen
Introduction
Overview
Optimizing Computational Graphs
Generating Tensor Operations
Tensor Expression and Schedule Space
Nested Parallelism with Cooperation
Tensorization
Explicit Memory Latency Hiding
Automating Optimization
Schedule Space Specification
ML-Based Cost Model
Schedule Exploration
Distributed Device Pool and RPC
Evaluation
Server-Class GPU Evaluation
Embedded CPU Evaluation
Embedded GPU Evaluation
FPGA Accelerator Evaluation
Related Work
Conclusion
osdi18-xiao
osdi18-lee
Introduction
Model Serving: State-of-the-Art and Limitations
White Box Prediction Serving: Design Principles
The Pretzel System
Off-line Phase
Flour
Oven
Object Store
On-line Phase
Runtime
Scheduler
Additional Optimizations
Evaluation
Memory
Latency
Micro-benchmark
End-to-end
Throughput
Heavy Load
Micro-benchmark
End-to-end
Limitations and Future Work
Related Work
Conclusion
osdi18-kulkarni
osdi18-yeo
osdi18-phothilimthana
osdi18-volos
osdi18-konoth
Introduction
Background
DRAM Organization
The Rowhammer Bug
Rowhammer Defenses
Threat Model
Design
Implementation
ZebRAM Prototype Components
Implementation Details
Security Evaluation
Traditional Rowhammer Exploits
ZebRAM-aware Exploits
Attacking the Unsafe Region
Attacking the Safe Region
Performance Evaluation
Related work
Discussion
Prototype
Alternative Implementations
Conclusion
osdi18-lazar
osdi18-crooks
osdi18-iyer
osdi18-wang
Introduction
Problems with State-of-the-Art Systems
Challenges and Contributions
Summary of Results
Background and Overview
Background
RStream Overview
Programming Model
RStream Implementation
Preprocessing
Join Implementation
Redundancy Removal via Automorphism Checks
Pattern Aggregation via Isomorphism Checks
Evaluation
Comparisons with Mining Systems
Comparisons with Datalog Engines
RStream Performance Breakdown
Related Work
Conclusion
osdi18-kalavri
osdi18-essertel
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation
Carlsbad, CA, USA, October 8–10, 2018
ISBN 978-1-931971-47-8
Sponsored by USENIX in cooperation with ACM SIGOPS
OSDI ’18 Sponsors
Gold Sponsors, Silver Sponsors, Bronze Sponsors, Industry Partners and Media Sponsors: ACM Queue, FreeBSD Foundation

© 2018 by The USENIX Association. All Rights Reserved.
This volume is published as a collective work. Rights to individual papers remain with the author or the author’s employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. Permission is granted to print, primarily for one person’s exclusive use, a single copy of these Proceedings. USENIX acknowledges all trademarks herein.
ISBN 978-1-931971-47-8

USENIX Supporters
USENIX Patrons: Facebook, Google, Microsoft, NetApp, Private Internet Access
USENIX Benefactors: Amazon, Bloomberg, Oracle, Squarespace, VMware
USENIX Partners: Booking.com, Can Stock Photo, Cisco Meraki, Fotosearch, Teradactyl, TheBestVPN.com
Open Access Publishing Partner: PeerJ
Symposium Organizers

Program Co-Chairs
Andrea Arpaci-Dusseau, University of Wisconsin—Madison
Geoff Voelker, University of California, San Diego

Program Committee
Rachit Agarwal, Cornell University
Marcos K. Aguilera, VMware Research
Mahesh Balakrishnan, Yale University/Facebook
Ranjita Bhagwan, Microsoft Research India
Edouard Bugnion, École Polytechnique Fédérale de Lausanne (EPFL)
Miguel Castro, Microsoft Research Cambridge
Kang Chen, Tsinghua University
Vijay Chidambaram, The University of Texas at Austin
Landon Cox, Duke University
Angela Demke Brown, University of Toronto
Sandhya Dwarkadas, University of Rochester
Sasha Fedorova, University of British Columbia
Roxana Geambasu, Columbia University
Cristiano Giuffrida, Vrije Universiteit Amsterdam
Haryadi Gunawi, University of Chicago
Andreas Haeberlen, University of Pennsylvania
Jon Howell, VMware Research
Rebecca Isaacs, Twitter
Michael Isard, Google
Frans Kaashoek, Massachusetts Institute of Technology
Manos Kapritsos, University of Michigan
Taesoo Kim, Georgia Institute of Technology
Sam King, University of California, Davis
Christoforos Kozyrakis, Stanford University
Jinyang Li, New York University
Wyatt Lloyd, Princeton University
Jay Lorch, Microsoft Research
Richard Mortier, University of Cambridge
Gilles Muller, Inria
KyoungSoo Park, Korea Advanced Institute of Science and Technology (KAIST)
Raluca Popa, University of California, Berkeley
Don Porter, The University of North Carolina at Chapel Hill
Justine Sherry, Carnegie Mellon University
Liuba Shrira, Brandeis University
Ryan Stutsman, University of Utah
Steve Swanson, University of California, San Diego
Michael Swift, University of Wisconsin—Madison
Dan Tsafrir, VMware Research and Technion—Israel Institute of Technology
Rashmi Vinayak, Carnegie Mellon University
Xi Wang, University of Washington
Andrew Warfield, Amazon
Roger Wattenhofer, ETH Zurich
Hakim Weatherspoon, Cornell University
Ming Wu, Microsoft Research
Yubin Xia, Shanghai Jiao Tong University
Ding Yuan, University of Toronto
Matei Zaharia, Stanford University
Irene Zhang, Microsoft Research
Yiying Zhang, Purdue University

Poster Session Co-Chairs
Vijay Chidambaram, The University of Texas at Austin
Yiying Zhang, Purdue University

Steering Committee
Brad Chen, Google
Jason Flinn, University of Michigan
Casey Henderson, USENIX Association
Kimberly Keeton, Hewlett Packard Labs
Hank Levy, University of Washington
James Mickens, Harvard University
Brian Noble, University of Michigan
Timothy Roscoe, ETH Zurich
Margo Seltzer, University of British Columbia
Amin Vahdat, Google and University of California, San Diego
External Reviewers
Remzi Arpaci-Dusseau, Anish Athalye, Jonathan Behrens, Gino Brunner, Tej Chajed, Nishanth Chandran, Haibo Chen, Charlie Curtsinger, Cody Cutler, Manuel Eichelberger, Jon Gjengset, Michael Gleicher, Joseph Gonzalez, Matthew Hicks, Chris Hodsdon, Atalay Mert İleri, Pankaj Khanchandani, David Lion, Haonan Lu, Yu Luo, Tony Mason, Darya Melnyk, Ellis Michael, Robert Morris, Mihir Nanavati, Luke Nelson, Khiem Ngo, Amy Ousterhout, Pál András Papp, Ali Razeen, Xiang Ren, Oliver Richter, Kirk Rodrigues, Edo Roth, Brian Sandler, Malte Schwarzkopf, Igor Smolyar, Zhenyu Song, Theano Stavrinos, Julian Steger, Simon Tanner, Yuyi Wang, Michael Wei, Tian Yang, Idan Yaniv, Yongle Zhang, Xu Zhao, Aviad Zuck
Message from the OSDI ’18 Program Co-Chairs Dear colleagues, Welcome to the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’18), held in Carlsbad, CA, USA! This year’s technical program matches OSDI ’16 in including 47 exceptionally strong papers; these papers represent the many strengths of our community and cover a wide range of topics, including file and storage systems, networking, scheduling, security, formal verification of systems, graph processing, system support for machine learning, programming languages, fault-tolerance and reliability, debugging, and, of course, operating systems design and implementation. OSDI ’18 received 257 paper submissions, which the program committee reviewed in multiple rounds. Our program committee consisted of 49 reviewers with a mixture of academic and industrial research and practical experience. The PC was divided into 24 “light” and 25 “heavy” members. All papers received three reviews in the first round; based on those reviews, 122 papers were selected to proceed to the second round. Second round papers received a minimum of two additional reviews from heavy PC members. For a small number of papers, where opinions were divided or where a paper was particularly specialized, we solicited additional expert reviews. In total, the PC and external reviewers wrote over 2^10 reviews. As in previous OSDI review cycles, this year’s process included a response period in which authors could answer reviewer questions and address factual errors in the initial reviews. Authors of 191 papers submitted a response. Responses had a measurable impact on both our online and in-person discussions, and the author responses and ensuing online discussion influenced some PC members to adjust their reviews and reconsider their ratings. Overall, we believe author responses helped improve the quality of the selected program. 
After more than a week of online discussion across the full PC, we picked 83 papers for the heavy PC members to discuss at a 1.5-day PC meeting held at the University of Wisconsin in Madison, WI, USA. Almost all heavy PC members were able to attend in person, with just one person calling in remotely. As PC chairs, we strove to ensure that all the discussed papers received full and fair consideration, coming to a consensus agreement in most cases. Papers were placed into high-level categories according to their main topic so that similar papers could be discussed together at the PC meeting. All discussed papers received a summary of the PC discussion written by a heavy PC member. In the end, the PC selected 47 papers for presentation at the conference, resulting in an 18% acceptance rate. Each of the accepted papers was allocated an additional two pages and shepherded by a member of the heavy PC to help the authors address the reviewers’ comments in their camera-ready versions. After finalizing the program, we created a separate committee to decide the Jay Lepreau Best Paper Awards composed of PC members with no conflicts with the papers under consideration. PC members could nominate papers for these awards in their reviews or directly to us. We selected six papers with at least two votes for best paper as candidates for the award. After reading the nominated papers and considering the reviews from the full PC, the awards committee agreed on three Jay Lepreau Best Paper Awards. As PC co-chairs, we stand on the shoulders of so many who did a tremendous amount of hard work to make OSDI ’18 a success. First, we thank the authors of all submitted papers for choosing to send their work to OSDI. Thanks also to the program committee for their hard work in reviewing and discussing the submissions and in shepherding the accepted papers.
We particularly thank Yiying Zhang and Vijay Chidambaram for organizing an extensive poster session of more than 83 posters to be presented across two evenings. We are also grateful to the external reviewers who provided additional perspectives. We thank the USENIX staff, who have been fundamental in organizing OSDI ’18. Finally, OSDI wouldn’t be what it is without our attendees—thank you for listening to our speakers, asking challenging and insightful questions, sharing your ideas with others, and networking with one another in the hallways! We hope you will find OSDI ’18 interesting, educational, and inspiring! Andrea Arpaci-Dusseau, University of Wisconsin-Madison Geoff Voelker, University of California, San Diego OSDI ’18 Program Co-Chairs
OSDI ’18: 13th USENIX Symposium on Operating Systems Design and Implementation
October 8–10, 2018, Carlsbad, CA, USA

Understanding Failures

Capturing and Enhancing In Situ System Observability for Failure Detection . . . 1
Peng Huang, Johns Hopkins University; Chuanxiong Guo, ByteDance Inc.; Jacob R. Lorch and Lidong Zhou, Microsoft Research; Yingnong Dang, Microsoft

REPT: Reverse Debugging of Failures in Deployed Software . . . 17
Weidong Cui and Xinyang Ge, Microsoft Research Redmond; Baris Kasikci, University of Michigan; Ben Niu, Microsoft Research Redmond; Upamanyu Sharma, University of Michigan; Ruoyu Wang, Arizona State University; Insu Yun, Georgia Institute of Technology

Finding Crash-Consistency Bugs with Bounded Black-Box Crash Testing . . . 33
Jayashree Mohan, Ashlie Martinez, Soujanya Ponnapalli, and Pandian Raju, University of Texas at Austin; Vijay Chidambaram, University of Texas at Austin and VMware Research

An Analysis of Network-Partitioning Failures in Cloud Systems . . . 51
Ahmed Alquraan, Hatem Takruri, Mohammed Alfatafta, and Samer Al-Kiswany, University of Waterloo

Operating Systems

LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation . . . 69
Yizhou Shan, Yutong Huang, Yilun Chen, and Yiying Zhang, Purdue University

The benefits and costs of writing a POSIX kernel in a high-level language . . . 89
Cody Cutler, M. Frans Kaashoek, and Robert T. Morris, MIT CSAIL

Sharing, Protection, and Compatibility for Reconfigurable Fabric with AmorphoS . . . 107
Ahmed Khawaja, Joshua Landgraf, and Rohith Prakash, UT Austin; Michael Wei and Eric Schkufza, VMware Research Group; Christopher J. Rossbach, UT Austin and VMware Research Group

Adaptive Dynamic Checkpointing for Safe Efficient Intermittent Computing . . . 129
Kiwan Maeng and Brandon Lucia, Carnegie Mellon University

Scheduling

Arachne: Core-Aware Thread Management . . . 145
Henry Qin, Qian Li, Jacqueline Speiser, Peter Kraft, and John Ousterhout, Stanford University

Principled Schedulability Analysis for Distributed Storage Systems using Thread Architecture Models . . . 161
Suli Yang, Ant Financial Services Group; Jing Liu, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau, UW-Madison

µTune: Auto-Tuned Threading for OLDI Microservices . . . 177
Akshitha Sriraman and Thomas F. Wenisch, University of Michigan

RobinHood: Tail Latency Aware Caching – Dynamic Reallocation from Cache-Rich to Cache-Poor . . . 195
Daniel S. Berger and Benjamin Berg, Carnegie Mellon University; Timothy Zhu, Pennsylvania State University; Siddhartha Sen, Microsoft Research; Mor Harchol-Balter, Carnegie Mellon University

(continued on next page)