

Practical Monitoring: Effective Strategies for the Real World
Mike Julian

Beijing, Boston, Farnham, Sebastopol, Tokyo
Practical Monitoring
by Mike Julian

Copyright © 2018 Mike Julian. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com/safari). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editors: Virginia Wilson and Nikki McDonald
Production Editor: Justin Billing
Copyeditor: Dwight Ramsey
Proofreader: Amanda Kersey
Indexer: Wendy Catalano
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

November 2017: First Edition

Revision History for the First Edition
2017-10-26: First Release

See http://oreil.ly/2y3s5AB for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Practical Monitoring, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-95735-6
[LSI]
Table of Contents

Preface

Part I. Monitoring Principles

1. Monitoring Anti-Patterns
   Anti-Pattern #1: Tool Obsession
   Monitoring Is Multiple Complex Problems Under One Name
   Avoid Cargo-Culting Tools
   Sometimes, You Really Do Have to Build It
   The Single Pane of Glass Is a Myth
   Anti-Pattern #2: Monitoring-as-a-Job
   Anti-Pattern #3: Checkbox Monitoring
   What Does “Working” Actually Mean? Monitor That.
   OS Metrics Aren’t Very Useful—for Alerting
   Collect Your Metrics More Often
   Anti-Pattern #4: Using Monitoring as a Crutch
   Anti-Pattern #5: Manual Configuration
   Wrap-Up

2. Monitoring Design Patterns
   Pattern #1: Composable Monitoring
   The Components of a Monitoring Service
   Pattern #2: Monitor from the User Perspective
   Pattern #3: Buy, Not Build
   It’s Cheaper
   You’re (Probably) Not an Expert at Architecting These Tools
   SaaS Allows You to Focus on the Company’s Product
   No, Really, SaaS Is Actually Better
   Pattern #4: Continual Improvement
   Wrap-Up

3. Alerts, On-Call, and Incident Management
   What Makes a Good Alert?
   Stop Using Email for Alerts
   Write Runbooks
   Arbitrary Static Thresholds Aren’t the Only Way
   Delete and Tune Alerts
   Use Maintenance Periods
   Attempt Automated Self-Healing First
   On-Call
   Fixing False Alarms
   Cutting Down on Needless Firefighting
   Building a Better On-Call Rotation
   Incident Management
   Postmortems
   Wrap-Up

4. Statistics Primer
   Before Statistics in Systems Operations
   Math to the Rescue!
   Statistics Isn’t Magic
   Mean and Average
   Median
   Seasonality
   Quantiles
   Standard Deviation
   Wrap-Up

Part II. Monitoring Tactics

5. Monitoring the Business
   Business KPIs
   Two Real-World Examples
   Yelp
   Reddit
   Tying Business KPIs to Technical Metrics
   My App Doesn’t Have Those Metrics!
   Finding Your Company’s Business KPIs
   Wrap-Up

6. Frontend Monitoring
   The Cost of a Slow App
   Two Approaches to Frontend Monitoring
   Document Object Model (DOM)
   Frontend Performance Metrics
   OK, That’s Great, but How Do I Use This?
   Logging
   Synthetic Monitoring
   Wrap-Up

7. Application Monitoring
   Instrumenting Your Apps with Metrics
   How It Works Under the Hood
   Monitoring Build and Release Pipelines
   Health Endpoint Pattern
   Application Logging
   Wait a Minute…Should I Have a Metric or a Log Entry?
   What Should I Be Logging?
   Write to Disk or Write to Network?
   Serverless / Function-as-a-Service
   Monitoring Microservice Architectures
   Wrap-Up

8. Server Monitoring
   Standard OS Metrics
   CPU
   Memory
   Network
   Disk
   Load
   SSL Certificates
   SNMP
   Web Servers
   Database Servers
   Load Balancers
   Message Queues
   Caching
   DNS
   NTP
   Miscellaneous Corporate Infrastructure
   DHCP
   SMTP
   Monitoring Scheduled Jobs
   Logging
   Collection
   Storage
   Analysis
   Wrap-Up

9. Network Monitoring
   The Pains of SNMP
   What Is SNMP?
   How Does It Work?
   A Word on Security
   How Do I Use SNMP?
   Interface Metrics
   Interface and Logging
   Recap
   Configuration Tracking
   Voice and Video
   Routing
   Spanning Tree Protocol (STP)
   Chassis
   CPU and Memory
   Hardware
   Flow Monitoring
   Capacity Planning
   Working Backward
   Forecasting
   Wrap-up

10. Security Monitoring
   Monitoring and Compliance
   User, Command, and Filesystem Auditing
   Setting Up auditd
   auditd and Remote Logs
   Host Intrusion Detection System (HIDS)
   rkhunter
   Network Intrusion Detection System (NIDS)
   Wrap-Up

11. Conducting a Monitoring Assessment
   Business KPIs
   Frontend Monitoring
   Application and Server Monitoring
   Security Monitoring
   Alerting
   Wrap-Up

A. An Example Runbook: Demo App

B. Availability Chart

Index