www.it-ebooks.info
TCP/IP ARCHITECTURE, DESIGN,
AND IMPLEMENTATION
IN LINUX
Copyright © 2008 by IEEE Computer Society.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. All rights reserved.
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee
to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax
(978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should
be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ
07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts
in preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be
suitable for your situation. You should consult with a professional where appropriate. Neither the
publisher nor author shall be liable for any loss of profit or any other commercial damages, including
but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support please contact our
Customer Care Department within the United States at (800) 762-2974, outside the United States at
(317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print,
may not be available in electronic formats. For more information about Wiley products, visit our web
site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available.
ISBN 978-0470-14773-3
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
CONTENTS
Preface
Acknowledgments
1 INTRODUCTION
1.1 Overview of TCP/IP Stack
1.1.1 Moving Down the Stack
1.1.2 Moving Up the Stack
1.2 Source Code Organization for Linux 2.4.20
1.2.1 Source Code Organization for Networking Code
1.3 TCP/IP Stack and Kernel Control Paths
1.4 Linux Kernel Until Version 2.4 Is Non-preemptible
1.4.1 System Call on Linux
1.4.2 Adding New System Call
1.5 Linux Process and Thread
1.5.1 fork()
1.5.2 Thread
1.5.3 Kernel Threads
1.6 Kernel Synchronization Mechanism
1.6.1 Semaphore
1.6.2 Atomic Operations
1.6.3 Spin Lock
1.7 Application Interfaces for TCP/IP Programming
1.7.1 Server Application
1.7.2 Client Application
1.7.3 Socket Options
1.7.4 Option Values
1.8 Shutdown
1.8.1 Kernel Shutdown Implementation
1.8.2 Send Shutdown
1.8.3 Receive Shutdown
1.9 I/O
1.9.1 read()
1.9.2 write()
1.9.3 recv()
1.9.4 send()
1.9.5 select()
1.10 TCP State
1.10.1 Partial Close
1.10.2 tcpdump Output for Partial Close
1.11 Summary
2 PROTOCOL FUNDAMENTALS
2.1 TCP
2.1.1 TCP Header
2.2 TCP Options (RFC 1323)
2.2.1 mss Option
2.2.2 Window-Scaling Option
2.2.3 Timestamp Option
2.2.4 Selective Acknowledgment Option
2.3 TCP Data Flow
2.3.1 ACKing of Data Segments
2.4 Delayed Acknowledgment
2.5 Nagle’s Algorithm (RFC 896)
2.6 TCP Sliding Window Protocol
2.7 Maximizing TCP Throughput
2.8 TCP Timers
2.8.1 Retransmission Timer
2.8.2 Persistent Timer
2.8.3 Keepalive Timer
2.8.4 TIME_WAIT Timer
2.9 TCP Congestion Control
2.10 TCP Performance and Reliability
2.10.1 RTTD
2.10.2 SACK/DSACK
2.10.3 Window Scaling
2.11 IP (Internet Protocol)
2.11.1 IP Header
2.12 Routing
2.13 netstat
2.14 traceroute
2.14.1 traceroute Mechanism
2.15 ICMP
2.16 ping
2.17 ARP/RARP
2.18 Summary
3 KERNEL IMPLEMENTATION OF SOCKETS
3.1 Socket Layer
3.2 VFS and Socket
3.3 Protocol Socket Registration
3.4 struct inet_protosw
3.5 Socket Organization in the Kernel
3.6 Socket
3.7 inet_create
3.7.1 Sock
3.8 Flow Diagram for Socket Call
3.9 Summary
4 KERNEL IMPLEMENTATION OF TCP CONNECTION SETUP
4.1 Connection Setup
4.1.1 Server Side Setup
4.1.2 Server Side Operations
4.2 Bind
4.2.1 Data Structures Related to Socket BIND
4.2.2 Hash Buckets for tcp Bind
4.2.3 tcp_ehash
4.2.4 tcp_listening_hash
4.2.5 tcp_bhash
4.2.6 tcp_hashinfo
4.2.7 tcp_bind_hashbucket
4.2.8 tcp_bind_bucket
4.2.9 bind()
4.2.10 sys_bind()
4.2.11 sockfd_lookup()
4.2.12 fget()
4.2.13 inet_bind()
4.2.14 tcp_v4_get_port()
4.2.15 tcp_bind_conflict()
4.3 Listen
4.3.1 sys_listen()
4.3.2 inet_listen()
4.3.3 tcp_listen_start()
4.3.4 Listen Flow
4.3.5 struct open_request
4.3.6 Accept Queue Is Full
4.3.7 Established Sockets Linked in tcp_ehash Hash Table