Introduction
Welcome to How Tomcat Works. This book is meant to be a tutorial. It dissects Tomcat
and explains the internal workings of its free, open source, and most popular servlet
container code-named Catalina. Tomcat is a complex system, consisting of many
different components. Those who want to learn how Tomcat works often find it hard to
begin. What this book does is provide the big picture and then build a simpler version of
each component to make understanding that component easier. Only after that will the
real component be explained.
Please help make this book better. If you find any technical inaccuracies, bugs in the code,
or grammatical or spelling mistakes, please inform the author at budi@brainysoftware.com.
Thank you.
You should start by reading this Introduction as it explains the structure of the book
and gives you a brief outline of the applications built. The section, Preparing Required
Software, gives you detailed instructions on what software you need to download, how to
make a directory structure for your code, etc.
This book is for anyone working with the Java technology.
* Clearly, this book is for you if you are a servlet programmer or a Tomcat user and
you are interested in knowing how a servlet container works.
* If you want to join the Tomcat development team, you need to first learn how the
existing code works. This book provides a good tutorial.
* If you've never been involved in web development but you have an interest in
software development in general, then you can learn how a large application such
as Tomcat was designed and developed.
* This book is also for those who need to customize Tomcat.
To understand the discussion in this book, you need to know about object-oriented
programming with Java and servlet programming. There are a lot of books on servlets,
including my own Java for the Web with Servlets, JSP, and EJB. To make the material
easier to understand, each chapter starts with background information that will be needed
to understand the topic under discussion.
© Budi Kurniawan 2003
How A Servlet Container Works
A servlet container is a complex system. However, basically there are three things that a
servlet container does to serve a request for a servlet:
* Creating a request object and populating it with information that may be used by the
invoked servlet, such as parameters, headers, cookies, query string, URI, etc. A
request object is an instance of the javax.servlet.ServletRequest
interface or the javax.servlet.http.HttpServletRequest interface.
* Creating a response object that the invoked servlet uses to send the response to the
web client. A response object is an instance of the
javax.servlet.ServletResponse interface or the
javax.servlet.http.HttpServletResponse interface.
* Invoking the service method of the servlet, passing the request and response
objects. Here the servlet reads values from the request object and writes to the
response object.
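The three steps above can be sketched in a few lines of Java. This is only an illustration, not Catalina's code: SimpleRequest, SimpleResponse, and SimpleServlet are hypothetical stand-ins for the javax.servlet types, used so that the example compiles without the servlet API on the class path.

```java
import java.util.HashMap;
import java.util.Map;

public class ThreeStepsSketch {

    // Hypothetical stand-in for javax.servlet.ServletRequest.
    static class SimpleRequest {
        final String uri;
        final Map<String, String> parameters = new HashMap<>();

        SimpleRequest(String uri) {
            this.uri = uri;
        }
    }

    // Hypothetical stand-in for javax.servlet.ServletResponse.
    static class SimpleResponse {
        final StringBuilder body = new StringBuilder();
    }

    // Hypothetical stand-in for javax.servlet.Servlet.
    interface SimpleServlet {
        void service(SimpleRequest request, SimpleResponse response);
    }

    static String serve(SimpleServlet servlet, String uri, String queryString) {
        // Step 1: create the request object and populate it.
        SimpleRequest request = new SimpleRequest(uri);
        if (queryString != null && !queryString.isEmpty()) {
            for (String pair : queryString.split("&")) {
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    request.parameters.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }

        // Step 2: create the response object the servlet will write to.
        SimpleResponse response = new SimpleResponse();

        // Step 3: invoke the servlet's service method with both objects.
        servlet.service(request, response);
        return response.body.toString();
    }

    public static void main(String[] args) {
        SimpleServlet hello = (req, res) ->
                res.body.append("Hello, ").append(req.parameters.get("name"));
        System.out.println(serve(hello, "/servlet/Hello", "name=Tomcat"));
    }
}
```

A real container does much more in each step (parsing headers and cookies, buffering output, and so on), but the order of operations is the same.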
As you read the chapters, you will find a detailed discussion of the Catalina servlet container.
Catalina is the servlet container in Tomcat. It is a very sophisticated piece of software
that was elegantly designed and developed. It is also modular. Based on the tasks
mentioned in the section "How A Servlet Container Works", you can view Catalina as
two main modules: the connector and the container.
Figure I.1: Catalina's main modules
The block diagram in Figure I.1 is, of course, simplistic. In the following
chapters you will uncover the smaller modules one by one.
Now, back to Figure I.1, the connector is there to connect a request with the
container. Its job is to construct a Request object and a Response object for each HTTP
request it receives. It then passes processing to the container. The container receives the
Request and Response objects from the connector and is responsible for invoking the
servlet's service method.
Bear in mind though, that the description above is only the tip of the iceberg. There
are a lot of things that a container does. For example, before it can invoke a servlet's
service method, it must load the servlet, authenticate the user (if required), update the
session for that user, etc. It's not surprising then that a container uses many different
modules for processing. For example, the manager module processes user sessions, and the
loader module loads servlets.
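The division of labor between the connector, the container, and the container's helper modules can be sketched as follows. Every class here (Connector, Container, Loader, Manager, Request, Response, Servlet) is a hypothetical, heavily simplified stand-in chosen for illustration. The real Catalina interfaces are richer and are introduced in later chapters.

```java
import java.util.HashMap;
import java.util.Map;

public class ConnectorContainerSketch {

    // Simplified request/response pair that the connector constructs.
    static class Request {
        final String servletName;
        final String sessionId;

        Request(String servletName, String sessionId) {
            this.servletName = servletName;
            this.sessionId = sessionId;
        }
    }

    static class Response {
        final StringBuilder body = new StringBuilder();
    }

    interface Servlet {
        void service(Request request, Response response);
    }

    // Loader module: responsible for finding the servlet to run.
    static class Loader {
        private final Map<String, Servlet> servlets = new HashMap<>();

        void register(String name, Servlet servlet) {
            servlets.put(name, servlet);
        }

        Servlet load(String name) {
            return servlets.get(name);
        }
    }

    // Manager module: tracks user sessions (here, just a hit count per session ID).
    static class Manager {
        private final Map<String, Integer> hits = new HashMap<>();

        int touch(String sessionId) {
            return hits.merge(sessionId, 1, Integer::sum);
        }
    }

    // Container: loads the servlet, updates the session, then invokes service.
    static class Container {
        final Loader loader = new Loader();
        final Manager manager = new Manager();

        void invoke(Request request, Response response) {
            Servlet servlet = loader.load(request.servletName);
            if (servlet == null) {
                response.body.append("404 Not Found");
                return;
            }
            manager.touch(request.sessionId);
            servlet.service(request, response);
        }
    }

    // Connector: builds the Request and Response, then hands off to the container.
    static class Connector {
        private final Container container;

        Connector(Container container) {
            this.container = container;
        }

        String handle(String servletName, String sessionId) {
            Request request = new Request(servletName, sessionId);
            Response response = new Response();
            container.invoke(request, response);
            return response.body.toString();
        }
    }

    public static void main(String[] args) {
        Container container = new Container();
        container.loader.register("Hello",
                (req, res) -> res.body.append("Hello from ").append(req.servletName));
        Connector connector = new Connector(container);
        System.out.println(connector.handle("Hello", "session-1"));
    }
}
```

Keeping the connector unaware of loaders and managers is the point of the split: the connector only speaks HTTP, and everything about servlets stays inside the container.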
This book consists of 18 chapters. The first two chapters serve as an introduction.
Chapter 1 explains how an HTTP server works and Chapter 2 features a simple servlet
container. The next two chapters focus on the connector and Chapters 5 to 14 cover each
of the components in a Tomcat 4 container. The rest of the book (Chapters 15 to 18)
details how Tomcat 5 implements the new features in the Servlet 2.4 specification. The
following is a summary of each of the chapters.
For each chapter, there is an accompanying application similar to the component being
explained.
Chapter 1 starts this book by presenting a simple HTTP server. To build a working
HTTP server, you need to know the inner workings of two classes in the java.net
package: Socket and ServerSocket. There is sufficient background information on the two
classes for you to understand how the accompanying application works.
Chapter 2 explains how a simple servlet container works. This chapter comes with an
application that functions as a servlet container. There is also a servlet that can be run
within the servlet container and that you can invoke from a Web browser.
Chapter 3 explains the details of a Catalina connector.
Chapter 4 presents the Catalina default connector. This connector has been deprecated
in favor of a faster connector called Coyote. Nevertheless, the default connector is
simpler and easier to understand. Therefore, it serves as a good learning tool.
Chapter 5 discusses the container.
Chapter 6 explains lifecycles.
Chapter 7 covers loggers.
Chapter 8 explains loaders.
Chapter 9 discusses managing sessions.
Chapter 10 covers security.
Chapter 11 explains in detail the org.apache.catalina.core.StandardWrapper class and
related classes. StandardWrapper represents a servlet in a Web application.
Chapter 12 covers the org.apache.catalina.core.StandardContext class, which
represents a context (Web application) in Catalina.
Chapter 13 presents the two other containers: Host and Engine. It also covers the
standard implementations of these two containers: org.apache.catalina.core.StandardHost
and org.apache.catalina.core.StandardEngine.
Chapter 14 covers the Server and Service components and explains how the Tomcat
configuration file works.
Chapters 15 to 18 cover the Tomcat 5 implementation of the new features in Servlet 2.4.
Because this book was designed as a tutorial, the servlet container application in each
chapter gradually evolves from the application in the previous chapter, until a fully
functional Tomcat servlet container is achieved in Chapter 18.
Several articles that provide background information on Java techniques can be found in
the Articles directory of the files accompanying this book, downloadable from
www.brainysoftware.com.
The articles are not included in the zip file for the review project.
Each chapter comes with one or more applications that focus on a specific component in
Catalina. Normally you'll find the simplified version of the component being explained or
code that explains how to use a Catalina component. All classes and interfaces in the
chapters' applications reside in the ex[chapter number].pyrmont package or its
subpackages. For example, the classes in the application in Chapter 1 are part of the
ex01.pyrmont package.
Preparing Required Software
The applications accompanying this book run with J2SE version 1.4. The zipped source
files can be downloaded from the author's web site, www.brainysoftware.com. The zip file
contains the source code for Tomcat 4.1.12 and the applications used in this book. Assuming you
have installed J2SE 1.4 and your path environment variable includes the location of the
JDK, follow these steps:
1. Extract the zip files. All extracted files will reside in a new directory called
HowTomcatWorks. HowTomcatWorks is your working directory. There will be
four subdirectories under HowTomcatWorks: lib (containing all needed
libraries), src (containing the source files), webroot (containing an HTML file
and two servlets), and webapps (containing sample applications).
2. Change directory to the working directory and compile the Java files. If you are
using Windows, run the win-compile.bat file. If you are using Linux, type the
following (don't forget to chmod the file if necessary):
./linux-compile.sh
Also note that the Catalina source code is part of the org.apache.catalina package
and its sub-packages. The interfaces in the org.apache.catalina package define the
contracts among the various components in the servlet container. You will learn about each of
these interfaces starting in Chapter 3.
This book comes with two servlets: PrimitiveServlet (to be used in Chapter 2) and
ModernServlet (for all chapters).
Chapter 1
A Simple Web Server
In this chapter you will build a simple web server. A web server is also called a Hypertext
Transfer Protocol (HTTP) server because it uses HTTP to communicate with its clients,
which are usually web browsers. A Java-based web server uses two important classes:
java.net.Socket and java.net.ServerSocket, and communications are done
through HTTP messages. Therefore, this chapter starts with a discussion of HTTP and
the two classes. Afterwards, I'll explain the simple web server application that
accompanies this chapter.
HTTP is the protocol that allows web servers and browsers to send and receive data over
the Internet. It is a request and response protocol. The client requests a file and the server
responds to the request. HTTP uses reliable TCP connections, by default on TCP port
80. The first version of HTTP was HTTP/0.9, which was then superseded by HTTP/1.0.
HTTP/1.0 was in turn replaced by HTTP/1.1, the current version, which is defined by RFC 2616
and downloadable from http://www.w3.org/Protocols/HTTP/1.1/rfc2616.pdf.
This section covers HTTP 1.1 briefly, enough to make you understand the messages sent
by the web server application. If you are interested in more details, read RFC 2616.
In HTTP, it’s always the client who initiates a transaction by establishing a
connection and sending an HTTP request. The server is in no position to contact a client
or make a callback connection to the client. Either the client or the server can prematurely
terminate a connection. For example, when using a web browser you can click the Stop
button on your browser to stop the download process of a file, effectively closing the
HTTP connection with the web server.
HTTP Requests
An HTTP request consists of three components:
* Method, URI, and Protocol/Version
* Request headers
* Entity body
An example of an HTTP request is the following:
POST /servlet/default.jsp HTTP/1.1
Accept: text/plain, text/html
Accept-Language: en-gb
Connection: Keep-Alive
Host: localhost
Referer: http://localhost/ch8/SendDetails.htm
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
Content-Length: 33
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate

LastName=Franks&FirstName=Michael
The method, URI, and protocol version appear on the first line of the request:
POST /servlet/default.jsp HTTP/1.1
where POST is the request method, /servlet/default.jsp represents the URI and
HTTP/1.1 the Protocol/Version section.
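As a minimal sketch (not the code of the application that accompanies this chapter), the first line of a request can be split into its three parts like this. The class name RequestLineSketch and the method parse are hypothetical names chosen for illustration.

```java
public class RequestLineSketch {

    // Splits "METHOD URI PROTOCOL/VERSION" into its three parts.
    static String[] parse(String requestLine) {
        String[] parts = requestLine.trim().split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + requestLine);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = parse("POST /servlet/default.jsp HTTP/1.1");
        System.out.println("Method:   " + parts[0]);
        System.out.println("URI:      " + parts[1]);
        System.out.println("Protocol: " + parts[2]);
    }
}
```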
Each HTTP request can use one of the many request methods as specified in the
HTTP standards. HTTP 1.1 defines eight request methods: GET, POST, HEAD,
OPTIONS, PUT, DELETE, TRACE, and CONNECT. GET and POST are the most commonly used in
Internet applications.
The URI specifies an Internet resource completely. A URI is usually interpreted as
being relative to the server’s root directory. Thus, it should always begin with a forward
slash /. A URL is actually a type of URI (see http://www.ietf.org/rfc/rfc2396.txt). The
protocol version represents the version of the HTTP protocol being used.
The request headers contain useful information about the client environment and the
entity body of the request. For example, they could contain the language the browser is set
for, the length of the entity body, and so on. Headers are separated from one another by a
carriage return/linefeed (CRLF) sequence.
Between the headers and the entity body, there is a blank line (CRLF) that is
important to the HTTP request format. The CRLF tells the HTTP server where the entity
body begins. In some Internet programming books, this CRLF is considered the fourth
component of an HTTP request.
In the previous HTTP request, the entity body is simply the following line:
LastName=Franks&FirstName=Michael
The entity body could easily become much longer in a typical HTTP request.
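The role of the blank line can be illustrated with a short Java sketch that separates the headers from the entity body of a raw request. The class and method names (RequestSplitSketch, headers, body) are hypothetical. The sketch also assumes the whole request is already in memory as a string, which a real server reading from a socket cannot take for granted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestSplitSketch {

    static final String CRLF = "\r\n";

    // Returns the headers found between the request line and the blank line.
    static Map<String, String> headers(String raw) {
        int end = raw.indexOf(CRLF + CRLF);
        String head = end < 0 ? raw : raw.substring(0, end);
        String[] lines = head.split(CRLF);
        Map<String, String> result = new LinkedHashMap<>();
        // Start at index 1 to skip the request line itself.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                result.put(lines[i].substring(0, colon).trim(),
                        lines[i].substring(colon + 1).trim());
            }
        }
        return result;
    }

    // Returns everything after the blank line, or "" if there is no body.
    static String body(String raw) {
        int end = raw.indexOf(CRLF + CRLF);
        return end < 0 ? "" : raw.substring(end + (CRLF + CRLF).length());
    }

    public static void main(String[] args) {
        String raw = "POST /servlet/default.jsp HTTP/1.1" + CRLF
                + "Host: localhost" + CRLF
                + "Content-Length: 33" + CRLF
                + CRLF
                + "LastName=Franks&FirstName=Michael";
        System.out.println(headers(raw));
        System.out.println(body(raw));
    }
}
```

Note that the body in the sample request is exactly 33 characters long, matching its Content-Length header.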
HTTP Responses
Similar to requests, an HTTP response also consists of three parts:
* Protocol, Status code, and Description
* Response headers
* Entity body
The following is an example of an HTTP response:
HTTP/1.1 200 OK
Server: Microsoft-IIS/4.0
Date: Mon, 3 Jan 1998 13:13:33 GMT
Content-Type: text/html
Last-Modified: Mon, 11 Jan 1998 13:23:42 GMT
Content-Length: 112

<html>
<head>
<title>HTTP Response Example</title>
</head>
<body>
Welcome to Brainy Software
</body>
</html>
The first line of the response is similar to the first line of the request. It tells you
that the protocol used is HTTP version 1.1, that the request succeeded (status code 200),
and that everything went okay (the description OK).
The response headers contain useful information similar to the headers in the request.
The entity body of the response is the HTML content of the response itself. The headers
and the entity body are separated by a blank line, that is, two consecutive CRLF sequences.
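The three parts of a response can be assembled with plain string concatenation, as in the following sketch. ResponseSketch and build are hypothetical names. The choice of UTF-8 and the two headers emitted are assumptions made for the example, not something this chapter's application necessarily does.

```java
import java.nio.charset.StandardCharsets;

public class ResponseSketch {

    static final String CRLF = "\r\n";

    // Builds a status line, two headers, the blank line, and the entity body.
    static String build(int statusCode, String description, String html) {
        // Content-Length counts bytes, not characters, so encode before measuring.
        int length = html.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 " + statusCode + " " + description + CRLF
                + "Content-Type: text/html; charset=utf-8" + CRLF
                + "Content-Length: " + length + CRLF
                + CRLF
                + html;
    }

    public static void main(String[] args) {
        System.out.println(build(200, "OK", "<h1>Welcome to Brainy Software</h1>"));
    }
}
```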
A socket is an endpoint of a network connection. A socket enables an application to read
from and write to the network. Two software applications residing on two different
computers can communicate with each other by sending and receiving byte streams over
a connection. To send a message from your application to another application, you need
to know the IP address as well as the port number of the socket of the other application.
In Java, a socket is represented by the java.net.Socket class.
To create a socket, you can use one of the many constructors of the Socket class. One
of these constructors accepts the host name and the port number: