www.it-ebooks.info
TCP/IP ARCHITECTURE, DESIGN, 
AND IMPLEMENTATION 
IN LINUX
Copyright © 2008 by IEEE Computer Society.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. All rights reserved.
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form 
or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as 
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior 
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee 
to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax 
(978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should 
be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 
07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts 
in preparing this book, they make no representations or warranties with respect to the accuracy or 
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be 
suitable for your situation. You should consult with a professional where appropriate. Neither the 
publisher nor author shall be liable for any loss of profit or any other commercial damages, including
but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support please contact our 
Customer Care Department within the United States at (800) 762-2974, outside the United States at 
(317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print
may not be available in electronic formats. For more information about Wiley products, visit our web 
site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available.
ISBN 978-0470-14773-3
Printed in the United States of America
10  9  8  7  6  5  4  3  2  1
CONTENTS
Preface 
Acknowledgments 
  1  INTRODUCTION 
1.1  Overview of TCP/IP Stack 
1.1.1  Moving Down the Stack 
1.1.2  Moving Up the Stack 
1.2  Source Code Organization for Linux 2.4.20
1.2.1  Source Code Organization for Networking Code
1.3  TCP/IP Stack and Kernel Control Paths
1.4  Linux Kernel Until Version 2.4 Is Non-preemptible
1.4.1  System Call on Linux
1.4.2  Adding New System Call
1.5  Linux Process and Thread
1.5.1  fork()
1.5.2  Thread
1.5.3  Kernel Threads
1.6  Kernel Synchronization Mechanism
1.6.1  Semaphore
1.6.2  Atomic Operations
1.6.3  Spin Lock
1.7  Application Interfaces for TCP/IP Programming
1.7.1  Server Application
1.7.2  Client Application
1.7.3  Socket Options
1.7.4  Option Values
1.8  Shutdown
1.8.1  Kernel Shutdown Implementation
1.8.2  Send Shutdown
1.8.3  Receive Shutdown
1.9  I/O
1.9.1  read()
1.9.2  write()
1.9.3  recv()
1.9.4  send()
1.9.5  select()
1.10  TCP State 
1.10.1  Partial Close 
1.10.2  tcpdump Output for Partial Close
1.11  Summary 
  2  PROTOCOL FUNDAMENTALS 
2.1  TCP 
2.1.1  TCP Header 
2.2  TCP Options (RFC 1323) 
2.2.1  mss Option 
2.2.2  Window-Scaling Option 
2.2.3  Timestamp Option 
2.2.4  Selective Acknowledgment Option
2.3  TCP Data Flow 
2.3.1  ACKing of Data Segments 
2.4  Delayed Acknowledgment 
2.5  Nagle’s Algorithm (RFC 896) 
2.6  TCP Sliding Window Protocol 
2.7  Maximizing TCP Throughput 
2.8  TCP Timers 
2.8.1  Retransmission Timer 
2.8.2  Persistent Timer
2.8.3  Keepalive Timer 
2.8.4  TIME_WAIT Timer 
2.9  TCP Congestion Control 
2.10  TCP Performance and Reliability 
2.10.1  RTTD 
2.10.2  SACK/DSACK 
2.10.3  Window Scaling 
2.11  IP (Internet Protocol) 
2.11.1  IP Header 
2.12  Routing 
2.13  netstat 
2.14  traceroute
2.14.1  traceroute Mechanism
2.15  ICMP
2.16  ping
2.17  ARP/RARP
2.18  Summary
  3  KERNEL IMPLEMENTATION OF SOCKETS 
3.1  Socket Layer
3.2  VFS and Socket
3.3  Protocol Socket Registration
3.4  struct inet_protosw
3.5  Socket Organization in the Kernel
3.6  Socket
3.7  inet_create
3.7.1  Sock
3.8  Flow Diagram for Socket Call
3.9  Summary
  4  KERNEL IMPLEMENTATION OF TCP CONNECTION SETUP 
4.1  Connection Setup
4.1.1  Server Side Setup
4.1.2  Server Side Operations
4.2  Bind
4.2.1  Data Structures Related to Socket BIND
4.2.2  Hash Buckets for tcp Bind
4.2.3  tcp_ehash
4.2.4  tcp_listening_hash
4.2.5  tcp_bhash
4.2.6  tcp_hashinfo
4.2.7  tcp_bind_hashbucket
4.2.8  tcp_bind_bucket
4.2.9  bind()
4.2.10  sys_bind()
4.2.11  sockfd_lookup()
4.2.12  fget()
4.2.13  inet_bind()
4.2.14  tcp_v4_get_port()
4.2.15  tcp_bind_conflict()
4.3  Listen
4.3.1  sys_listen()
4.3.2  inet_listen()
4.3.3  tcp_listen_start()
4.3.4  Listen Flow
4.3.5  struct open_request
4.3.6  Accept Queue Is Full
4.3.7  Established Sockets Linked in tcp_ehash Hash Table