Wavelet-based reversible watermarking for authentication
Digimarc Corporation, 19801 SW 72nd Avenue, Tualatin, OR 97062, USA
Jun Tian
ABSTRACT
In the digital information age, digital content (audio, image, and video) can be easily copied, manipulated, and
distributed. Copyright protection and content authentication of digital content has become an urgent problem to
content owners and distributors. Digital watermarking has provided a valuable solution to this problem. Based
on its application scenario, most digital watermarking methods can be divided into two categories: robust
watermarking and fragile watermarking. As a special subset of fragile watermark, reversible watermark (which
is also called lossless watermark, invertible watermark, erasable watermark) enables the recovery of the original,
unwatermarked content after the watermarked content has been detected to be authentic. Such reversibility
to get back unwatermarked content is highly desired in sensitive imagery, such as military data and medical
data. In this paper we present a reversible watermarking method based on an integer wavelet transform. We
look into the binary representation of each wavelet coefficient and embed an extra bit to “expandable” wavelet
coefficient. The location map of all “expanded” coefficients will be coded by JBIG2 compression and these
coefficient values will be losslessly compressed by arithmetic coding. Besides these two compressed bit streams,
an SHA-256 hash of the original image will also be embedded for authentication purpose.
Keywords: Reversible Watermark, Integer Wavelet Transform, JBIG2, SHA-256, Digital Watermarking, Con-
tent Authentication, Digimarc
1. INTRODUCTION
Digital watermarking has grown explosively in the last few years.
It embeds an invisible (in some cases,
visible) mark (payload) into digital content for the purpose of copyright communication and protection, content
authentication, counterfeit deterrence, forensic tracking, connected content, or broadcast monitoring, etc. For
a detailed review of digital watermarking, we refer to1–6 etc.
From the application point of review, most digital watermarking methods can be divided into two cate-
gories: robust watermarking and fragile watermarking. Robust watermarking is mainly aimed at copyright
protection. Here “robust” means the embedded watermark should be very resistant to various signal processing
operations. On the other hand, fragile watermarking is aimed at content authentication. A fragile watermark
will be altered or destroyed when the digital content is modified. As a special subset of fragile watermarking,
reversible watermarking4, 7–12 has drawn lots of attention recently. Reversible watermark, (which is also called
lossless watermark, invertible watermark, erasable watermark), has an additional advantage such that when
watermarked content has been detected to be authentic, one can remove the watermark to retrieve the original,
unwatermarked content. Such reversibility to get back unwatermarked content is highly desired in sensitive
imagery, such as military data and medical data.
In this paper, we present a reversible watermarking method of digital images. Our method can be applied
to digital audio and video as well. Compared with other reversible watermarking methods, our method employs
an integer wavelet transform to losslessly remove redundancy in a digital image to allocate space for watermark
embedding. The embedding algorithm starts with a reversible color conversion transform. Then, we apply the
integer wavelet transform to one (or more) decorrelated component. The purpose of both the reversible color
conversion transform and the integer wavelet transform is to remove irregular redundancy in the digital image,
such that we can embed regular redundancy into the digital image, for the purpose of content authentication
and original content recovery. The regular redundancy could be a hash of the image, a compressed bit stream of
Further author information:
E-mail: jtian@digimarc.com, Phone: 1-503-495-4691, Fax: 1-503-495-4606
the image, or some other image content dependent watermark. In the integer wavelet domain, we look into the
binary representation of each wavelet coefficient and embed an extra bit to “expandable” wavelet coefficient.
Besides original content retrieval bit streams, an SHA-256 hash of the original image will also be embedded for
authentication purpose.
The paper is organized as follows. We give a brief description of reversible watermarking in Sect. 2. Before
presenting our reversible watermarking method, we start with a simple example in Sect. 3. Then we move to our
method in Sect. 4, with a detailed explanation of the algorithm and its implementation. The paper is concluded
in Sect. 5.
2. THE WHAT, WHY, AND HOW OF REVERSIBLE WATERMARKING
Reversible watermark is a special subset of fragile watermark. Like all fragile watermarks, it can be used for
digital content authentication. But reversible watermark is much more than content authentication. It has an
additional advantage that when watermarked content has been detected to be authentic, one can remove the
watermark to retrieve the original, unwatermarked content.
As illustrated in Fig. 1, we embed a reversible watermark in a digital image I, and obtain the watermarked
image I’. Before sending it to the content authenticator, the image I’ might have been tampered by some
If the authenticator finds that no tampering happened in I’, i.e., I’ is
intentional attacker or might not.
authentic, then the authenticator will remove the reversible watermark from I’ and retrieve the original image,
which results in a new image I”. By definition of reversible watermark, the retrieved image I” will be exactly
the same as the original image I, pixel by pixel.
Digital
Image I
Embed
Watermark
✲
Watermarked
Image I’
Content
Authentication
✟✟✟✯
✲
❍❍❍❥
Authentic
Original
Retrieval
✲
Retrieved
Image I”
(= I)
Tampered
Figure 1: Reversible watermark diagram.
The motivation of reversible watermark is distortion-free data embedding.10 Though one basic requirement
of digital watermarking is its imperceptibility, embedding a watermark will inevitably change the original
content. Even a very slight change in pixel values may not be desirable, especially in sensitive imagery, such as
military data and medical data. In such scenario, every bit of information is important. Any change will effect
the intelligence of the digital content, and the access to the original, raw data is always required. Reversible
watermarks will provide the original, raw data when the digital content is authentic.
On reversible watermark, it might seem impossible to achieve content authentication, original retrieval, and
a very low false positive rate, all at the same time. If we take a digital grayscale image as a two dimensional
matrix, with integer entries ranging from 0 to 255, Cox, et al., have shown4 that “the only way to achieve 100%”
content authentication and original retrieval “is to allow for 100% false positives”. However, the same authors
also noted4 “in natural images, neighboring pixels are strongly correlated”, which implies that the whole set of
digital grayscale images is only a small subset of two dimensional integer matrices whose entries ranging from
0 to 255. It is the pixel value correlation exhibited in digital images that enables us to develop a reversible
watermark which provides content authentication, original retrieval, and a very low false positive rate, all at
the same time.
Let’s take a look at Fig. 2, which is the 512 × 512, 8 bit per pixel (bpp) grayscale “Lena” image. If we take
out a 3 × 3 region out of it, the pixel values may look like
205
206
208
196
200
201
207
211
213
Figure 2: 512 × 512, 8 bpp grayscale “Lena”.
where pixel values are very close to their neighbors, showing a high redundance or a strong correlation.
The earliest reference to reversible watermark we could find is the Barton patent,7 filed in 1994. In his
invention, the bits to be overlayed will be compressed and added to the bitstring, which will be embedded into
the data block. For retrieval, original bits will be decompressed and used to restore modified bits to derive
original data block. Honsinger, et al.,11 reconstruct the embedded watermark after a payload has been decoded
from a watermarked image, then subtract the watermark from the watermarked image to losslessly recover the
original image. Fridrich, et al.,9 use a JBIG lossless compression scheme for compressing some least significant
bit (LSB) planes to make space to insert an image hash. Later10 they utilize an order-2 function (whose inverse
function is itself) for high-capacity data embedding that is distortion-free. For more details of these methods
and other reversible watermark methods, we refer to,4, 7–12 etc.
Our reversible watermark method is based on an integer wavelet transform, JBIG2 compression, and arith-
metic coding. In a wide-sense, Barton’s method, Fridrich’s methods, and our method could all be categorized
as “compression-based reversible watermark”. Let’s start with a simple example which will illustrate the basic
idea behind our reversible watermark method.
Assume we have two grayscale values (x, y), where x, y ∈ Z, 0 ≤ x, y ≤ 255, and we would like to embed one
bit b, with b ∈ {0, 1} into (x, y) in a reversible way. More specifically, let’s assume
3. A SIMPLE EXAMPLE
x = 205 , y = 200 , and b = 0 .
First we compute the average and difference of x and y,
405
2
205 + 200
l =
=
x + y
2
=
2
= 202 , h = x − y = 205 − 200 = 5 ,
where the symbol · denotes the integer part of a number. For example, 2.7 = 2, −1.2 = −2. Next we
expand the difference number h into its binary representation
h = 5 = 1012 .
Then we add b into the binary representation of h at the location right after the most significant bit (MSB)
(note that the MSB is always 1),
Finally we compute the new grayscale values, based on the new difference number h
number l,
and the original average
x
= l +
h
+ 1
2
= 202 +
9 + 1
2
= 207 , y
= x
− h
= 207 − 9 = 198 .
= 1b012 = 10012 = 9 .
h
The above embedding process is also illustrated in Fig. 3.
(205, 200)
✲ Average: 202
✲ Difference: 5 = 1012
New Difference: 10012 = 9
✲
✻
Embed
Payload “0”
✲ (207,198)
Figure 3: A simple example.
From the embedded pair (x
), the watermark detector (or authenticator) can extract the embedded bit b
and get back the original pair (x, y) by a similar process as the embedding. Again we compute the average and
difference
, y
207 + 198
= 202 , h
= x
− y
= 207 − 198 = 9 .
l
=
+ y
x
2
=
2
Look into the binary representation of h
h
= 9 = 10012 ,
extract the second most significant bit, which is “0” in this case, as the embedded bit b, which leaves
h
= 1012 = 5 .
Now with the average number l
pair (x, y).
and difference number h
, we can retrieve exactly the original grayscale-valued
In the above example, although the embedded pair (207, 198) is still 8 bpp, we have embedded an extra
bit by increasing the bit length of the difference number h from 3 bits (which is number 5) to 4 bits (which
is number 9). And such embedding process is totally reversible. If we have a sequence of pairs of grayscale
values (x1, y1), (x2, y2),··· , (xn, yn), where xi, yi ∈ Z, 0 ≤ xi, yi ≤ 255, 1 ≤ i ≤ n, we can embed the payload
b = {b1, b2,··· , bn}, where bi ∈ {0, 1}, 1 ≤ i ≤ n, by repeating the above process,
li =
xi + yi
2
, hi = xi − yi , 1 ≤ i ≤ n .
For each difference number hi, expand it into its binary representation,
where ri,0 = 1 is the MSB, ri,m ∈ {0, 1}, for 1 ≤ m ≤ j(i), with j(i) + 1 as the bit length of hi in its binary
representation. Then we could embed bi into hi by
hi = ri,0ri,1 ··· ri,j(i) ,
= ri,0biri,1 ··· ri,j(i) .
hi
Alternatively, we can combine all the bits ri,m ∈ {0, 1}, with 1 ≤ m ≤ j(i), 1 ≤ i ≤ n (note that we do not
select the MSBs), and b = {bi} into a single bit stream
B = r1,1r1,2 ··· r1,j(1)r2,1r2,2 ··· r2,j(2) ··· rn,1rn,2 ··· rn,j(n)b1b2 ··· bn ,
and use a reversible mapping f (which could be encryption, lossless compression, or other invertible operation,
or combination of some of them) to form a new bit stream C
C = f(B) = c1c2 ··· ck ,
where ci ∈ {0, 1}, for 1 ≤ i ≤ k, with k as the bit length of C. Then we could embed C into the difference
numbers hi, 1 ≤ i ≤ n, by
where cs(i−1)+1cs(i−1)+2 ··· cs(i) is a truncated subsequence of C with
= ri,0cs(i−1)+1cs(i−1)+2 ··· cs(i) ,
hi
The bit length of hi
is still one more than that of hi. For detection, as f is reversible, we can get back B by
s(0) = 0 , and s(i) = s(i − 1) + j(i) + 1 .
B = f
−1(C) ,
and consequently we can get back the original pairs (x1, y1), (x2, y2),··· , (xn, yn).
The reason that we could increase the bit length of the difference number h is because of the high redundance
in pixel values of natural images. Thus in most cases h will be very small and have a short bit length in its
binary representation. In an edge area or an area containing lots of activity, the difference number h from a
pair of grayscale values could be large. For example, if x = 105, y = 22, then h = x − y = 83 = 10100112. If we
= 137,
embed a bit “0” into h, h
= −10. This will cause an underflow problem as grayscale values are restricted to the range [0, 255]. In the
y
next section, we will present the definition of “expandable” pairs, which will prevent overflow and underflow
problems. Working with these “expandable” pairs, we are assured to achieve the goal of content authentication,
original retrieval, and a very low false positive rate, all at the same time.
= 100100112 = 147. With l = 63 unchanged, the embedded pair will be x
4. ALGORITHM AND IMPLEMENTATION
As illustrated in Fig. 4, our reversible watermark embedding method consists of six steps: reversible color
conversion transform, integer wavelet transform, JBIG2 compression of the location map, arithmetic coding of
wavelet coefficient values, SHA-256 hash, and embedding. We will discuss each of them in the follow subsections.
Digital
Image
✲ Color
Conversion
✲ Wavelet
Transform
✲ Expandable
Coefficients
SHA-256
❄
Hash
✟✟✟✯
❍❍❍❥
Location
Map
JBIG2
Coefficient
Values
Arithmetic
Coding
✲
✻
Embed
Figure 4: Reversible watermark embedding algorithm.
4.1. Reversible Color Conversion
Reversible color conversion transform13 decorrelates the dependence among different color components to a
large extent. It is a lossless color transform and the transform output is still integer-valued. For a RGB color
image, the reversible color conversion transform is
Its inverse transform will be
,
,
R + 2G + B
Y r =
4
U r = R − G ,
V r = B − G .
G = Y r −
R = U r + G ,
B = V r + G .
U r + V r
4
The reversible color conversion transform maps a grayscale-valued triplet to an integer-valued triplet. It can
be thought of as an integer approximation of the CCIR 601 standard which provides a conversion to Y CrCb
space defined by the following matrix
Y
Cr
Cb
=
0.299
0.587
0.114
0.500 −0.419 −0.081
−0.169 −0.331
0.500
R
G
B
.
The RGB to Y CrCb transform matrix is not integer-valued.
It requires floating point computing. Such
transform will introduce small roundoff errors, and will be not a reversible transform. As reversible watermarking
requires original retrieval with 100% accuracy, we choose the reversible color conversion transform instead of
the RGB to Y CrCb transform.
For a grayscale image there will be no reversible color conversion transform as we apply the next step, the
integer wavelet transform, directly.
4.2. Integer Wavelet Transform
The integer wavelet transform14 maps integers to integers and allows for perfect invertibility with finite precision
arithmetic (i.e. reversible). The wavelet filters for integer wavelet transforms are dyadic rational,15 i.e., integers
or rational numbers whose denominators are powers of 2, like 13/4, -837/32. Thus the integer wavelet transform
can be implemented with only three operations, addition, subtraction, and shift, on a digital computer. The fast
multiplication-free implementation is another advantage of the integer wavelet transform over standard discrete
wavelet transform.
For example, for the Haar wavelet filter, the integer wavelet transform will be the average and difference
calculation we used in Sect. 3,
li =
x2i + x2i+1
2
, hi = x2i − x2i+1 .
And for the BCW-3 filters,15 a biorthogonal filter pair with four vanishing moments for all four filters, the
integer wavelet transform will be
(x2i + x2i+2) − 1
hi = x2i+1 −
16
(hi−1 + hi) − 1
32
(x2i−2 + x2i+4) +
(hi−2 + hi+1) +
, li = x2i +
9
16
1
2
.
1
2
9
32
In this paper, we use will the Haar integer wavelet transform as an example to illustrate our reversible watermark
method. The generalization to other integer wavelet transforms will be straightforward and is omitted here.
After the reversible color conversion transform, we apply the integer wavelet transform to one (or more)
decorrelated component. In our implementation , we choose the Y r component, which is the luminance com-
ponent. For a grayscale image, as mentioned in Sect. 4.1, we apply the integer wavelet transform directly to
the whole image.
4.3. Expandable Wavelet Coefficient
As seen in Sect. 3, for the grayscale-valued pair (105, 22) and a payload bit “0” (or “1”), a brute-force embedding
will cause an underflow problem. Now we will study how to prevent the overflow and underflow problems.
For a grayscale-valued pair (x, y), where x, y ∈ Z, 0 ≤ x, y ≤ 255, as in Sect. 3, define the average and
difference as
Then the inverse transform to get back (x, y) from the average number l and difference number h is
Thus to prevent the overflow and underflow problems, i.e., to restrict x, y in the range of [0, 255] is equivalent
to have
.
(1)
, h := x − y .
, y = l −
l :=
x + y
2
x = l +
h + 1
2
h
2
h
2
0 ≤ l +
h + 1
2
≤ 255 , 0 ≤ l −
≤ 255 .
|h| ≤ 2(255 − l) ,
|h| ≤ 2l + 1 ,
128 ≤ l ≤ 255
0 ≤ l ≤ 127
if
if
l =
x + y
2
, h = x − y .
Since both l and h are integers, one can derive that the above inequalities are equivalent to
|h| ≤ 2(255 − l) , and |h| ≤ 2l + 1 .
(2)
Condition (2) sets a limit on the absolute value of the difference number h. As long as h is in such range,
it is guaranteed that (x, y) computed from Eqn. (1) will be grayscale values. Furthermore, Condition (2) is
equivalent to
With the above condition, we now define an expandable grayscale-valued pair.
Definition 4.1. For a grayscale-valued pair (x, y), where x, y ∈ Z, 0 ≤ x, y ≤ 255, define
Then (x, y) is an expandable pair if and only if
h = 0 , and 2
log2 |h|+2 − 1 ≤ min (2(255 − l), 2l + 1) .
Note that if h = 0, the bit length of the binary representation of |h| is log2
|h| + 1. Thus 2log2 |h|+2 − 1
is the largest number whose bit length is one more than that of |h|. Thus for an expandable pair (x, y), if
we embed an extra bit (“0” or “1”) into the binary representation of the difference number h at the location
still satisfies Condition (2), that is, the new pair computed
right after the MSB, the new difference number h
from Eqn. (1) is guaranteed to be grayscale values. For simplicity, we will also call h expandable if (x, y) is an
expandable pair.
Thus from the average number l, one can tell whether or not a difference number h is expandable, i.e.,
whether or not the bit length of h could be increased by 1 without causing any overflow or underflow problem.
Further we define the changeable bits of h as
Definition 4.2. For a grayscale-valued pair (x, y), assume h = 0, and the binary representation of |h| is
|h| = r0r1 ··· rj ,
where r0 = 1, rm ∈ {0, 1}, for 1 ≤ m ≤ j, with j ≥ 0, and j + 1 is the bit length. If g ≤ j is the largest number
such that
ri2j−i
+ 2g − 1 ≤ min(2(255 − l), 2l + 1) ,
j−g
i=0
then we say (x, y), or equivalently h, has g changeable bits, and they are rj−g+1, rj−g+2,··· , rj.
Since
|h| = r0r1 ··· rj =
j
ri2j−i ,
i=0
by definition, h has g changeable bits if the last g bits in the binary representation are all changed to “1”, it
still satisfies Condition (2), or the new pair computed from Eqn. (1) is still grayscale values. Let’s look at two
extreme cases:
• If g = 0, then h has no changeable bits.
• If g = j, then all bits (excluding the MSB) in its binary representation are changeable. It is clear that if
h is expandable, then g = j. However the inverse is not true, i.e., g = j does not imply h is expandable.
The number “0” does not have a proper binary representation. We can increase it (along with all positive
numbers) by 1 to fit it into the definition of expandable and changeable. With such preparation, we extract
bits from wavelet coefficients as follows:
1. For the Y r component of a color image or a grayscale image, apply the integer wavelet transform.
2. If hi ≥ 0 and li < 255, we increase hi by 1, hi = hi + 1.
3. Construct a bit stream R, which consists of changeable bits from all hi. The scanning order of hi is
determined by a fixed pattern (for example, zigzag).
4.4. JBIG2 Compression
For a grayscale-valued pair (x, y), by Definition 4.1, we can tell whether or not it is expandable. When (x, y)
has been modified by the embedder, it will not be clear to the watermark detector whether or not the original
pair has been expanded, i.e., whether the bit length of the binary representation of the difference number has
been increased by 1 (thus larger than the original one), or it is the same as the original one. In order to remove
the watermark and retrieve the original, unwatermarked image, the detector needs to know the location of
expanded difference numbers h in the original image.
We can define a location map of expanded difference numbers by setting its value to “1” at each location
when it is expanded or “0” otherwise. The location map can be viewed as a bi-level image. To store the location
map, we can losslessly compress the bi-level image and store the compressed bit stream instead. We will employ
JBIG2, the new international standard for lossless compression of bi-level images, to compress the location map
of expanded difference numbers h. For convenience, we will denote the JBIG2 compressed bit stream of the
location map of expanded h as J . Alternatively, the location map could be compressed by run-length coding.
4.5. Arithmetic Coding
To make more room for embedding the payload, we could further losslessly compressed the collected bit stream
R, which are all the changeable bits from difference numbers h. Either arithmetic coding16 or Huffman coding
could be used for this purpose. In our implementation, we use arithmetic coding
C = ArithmeticCoding(R) ,
where C is the compressed bit stream from the arithmetic coding.
4.6. SHA-256 Hash
To authenticate a watermarked image and detect tampering, we will embed a hash of the image into itself. The
new hash algorithm SHA-256 is a 256-bit hash function that is intended to provide 128 bits of security against
collision attacks. SHA-256 is more consistent with the new encryption standard, the Advanced Encryption
Standard (AES) algorithm, than SHA-1, which provides no more than 80 bits of security against collision
attacks. We calculate the SHA-256 hash of the digital image (before the reversible color conversion transform)
and denote the hash as H.