Master's Thesis

RESEARCH ON DEPTH MAP IN 3D VIDEO CODING

Tao Zhang

Harbin Institute of Technology
June 2012

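Chinese Library Classification No.: TP391.4        School Code: 10213
Universal Decimal Classification No.: 681.39       Confidentiality: Public

Master's Thesis in Engineering

Research on Depth Map in 3D Video Coding

Master's Candidate: Tao Zhang
Supervisor: Prof. Wen Gao
Degree Applied for: Master of Engineering
Discipline: Computer Science and Technology
Affiliation: School of Computer Science and Technology
Date of Defence: June 2012
Degree-Conferring Institution: Harbin Institute of Technology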
 
Classified Index: TP391.4
U.D.C: 681.39

Dissertation for the Master Degree in Engineering

RESEARCH ON DEPTH MAP IN 3D VIDEO CODING

Candidate: Tao Zhang
Supervisor: Prof. Wen Gao
Academic Degree Applied for: Master of Engineering
Speciality: Computer Science and Technology
Affiliation: School of Computer Science and Technology
Date of Defence: June, 2012
Degree-Conferring-Institution: Harbin Institute of Technology
 
Harbin Institute of Technology Master's Thesis in Engineering
Abstract
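With the development of 3D displays and interactive multimedia systems, new 3D
video applications such as three-dimensional television (3DTV) and free
viewpoint video (FVV) are attracting growing interest. To make these
applications possible, a new 3D video format, Multiview Video plus Depth (MVD),
has been proposed; it consists of multiview video and the corresponding depth
map sequences. Using Depth Image Based Rendering (DIBR), this format can
synthesize virtual views at arbitrary positions. How to compress MVD data
efficiently is an important problem in current research on 3D video coding
standards. Multiview video has already been studied in considerable detail in
the Multiview Video Coding (MVC) standard. This thesis studies the coding and
quality restoration of the depth map sequences in MVD. Depth maps differ greatly
from conventional video: a depth value represents the distance from an object in
the scene to the camera, and a depth map consists of many smooth regions
separated by sharp edges, to which view synthesis is very sensitive. In
addition, depth maps are not presented to the user at the terminal; they are
mainly used for view synthesis. Because of the physical limitations of current
depth sensors, captured depth maps usually contain considerable blur and noise,
so their quality needs to be restored. In view of these characteristics, this
thesis proposes two techniques for depth map coding and one technique for depth
map quality restoration. The main work and contributions of this thesis are as
follows: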
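1.  A depth map coding method based on synthesized view distortion estimation is
proposed.
This thesis analyzes in depth the effect of depth map coding on the quality of
the synthesized view and proposes a distortion model for depth map coding. The
model is used to estimate the effect of depth coding on the synthesized view.
The estimated synthesized view distortion replaces the original depth map
distortion in the rate-distortion (RD) optimized mode decision process.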
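2.  A disparity-based depth map coding method is proposed.
This thesis proposes coding the disparity map that corresponds to the depth map,
which substantially reduces the bitrate needed for the depth information without
a significant effect on the quality of the synthesized view. The method
considers a particular application scenario in which the position of the virtual
view is known before encoding. In current 3D systems the virtual view position
can be obtained through a feedback network, so disparity-based depth map coding
is of practical importance.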
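3.  A sparse representation based depth map restoration method is proposed.
The method exploits the important role of sparse representation in image inverse
problems and combines it with prior knowledge about depth images, namely that
most regions of a depth map are smooth and that edges in a depth image are
strongly correlated with edges in the corresponding texture image, to restore
the depth map.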
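
The bitrate saving claimed in contribution 2 can be made concrete with a small sketch. It is not the thesis's own formulation; it assumes the inverse-depth quantization commonly used for 8-bit depth maps and a rectified 1D parallel camera pair, where disparity equals focal length times baseline divided by depth. The function names and all camera parameters below are hypothetical.

```python
def depth_level_to_disparity(v, focal_px, baseline, z_near, z_far):
    """Map an 8-bit depth level v (0..255) to a disparity in pixels.

    Assumes inverse-depth quantization: v = 255 is the nearest plane
    (z_near) and v = 0 is the farthest plane (z_far).
    """
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return focal_px * baseline * inv_z


def distinct_disparities(focal_px, baseline, z_near, z_far, precision=1.0):
    """Count the distinct disparities left after rounding to `precision` pixels."""
    return len({
        round(depth_level_to_disparity(v, focal_px, baseline, z_near, z_far)
              / precision)
        for v in range(256)
    })


# Hypothetical camera parameters, chosen only for illustration.
FOCAL_PX, BASELINE, Z_NEAR, Z_FAR = 1000.0, 0.05, 2.0, 10.0

n_integer_pel = distinct_disparities(FOCAL_PX, BASELINE, Z_NEAR, Z_FAR, 1.0)
n_quarter_pel = distinct_disparities(FOCAL_PX, BASELINE, Z_NEAR, Z_FAR, 0.25)
print(n_integer_pel, n_quarter_pel)
```

With these made-up parameters the 256 depth levels collapse to 21 distinct integer-pel disparities (81 at quarter-pel precision), which suggests why a disparity map for a known virtual view position can be cheaper to code than the depth map itself.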
 
Keywords: 3D video coding; multiview video; view synthesis; depth map; sparse representation
 
Abstract 
With the development of 3D displays and interactive multimedia systems, new 3D
video applications such as 3DTV and free viewpoint video (FVV) are attracting
significant interest. To enable these applications, a new 3D data format,
Multiview Video plus Depth (MVD), has been proposed; it consists of captured
multiview video sequences and their corresponding depth maps. With depth image
based rendering (DIBR), this format can be used to synthesize virtual views at
arbitrary positions. How to compress MVD data efficiently is an important issue
in the study of 3D video coding standards. Multiview video has already been
studied in detail in the Multiview Video Coding (MVC) standard. This thesis
focuses on the coding and quality restoration of the depth data in MVD. Depth
maps have different characteristics from conventional video signals. A value in
a depth map represents the distance between an object in the scene and the
camera. Depth maps are composed of many smooth regions separated by sharp edges,
and view synthesis is highly sensitive to errors at these edges. Moreover, depth
maps are not displayed directly but are used for view rendering. Because of the
physical limitations of depth sensors, captured depth maps are usually blurred
and noisy, so their quality needs to be enhanced. Based on these
characteristics, this thesis proposes two techniques for depth map coding and
one technique for depth map quality enhancement. The main contributions are
listed below:
1.  A depth coding method based on distortion estimation of the rendered view.
The thesis analyzes the relationship between depth coding distortion and
synthesized view distortion and proposes a new distortion model for depth
coding. The model is used to estimate the distortion of the rendered view caused
by depth coding. The estimated distortion replaces the depth distortion in the
rate-distortion (RD) optimized mode selection process of depth coding.
2.  A disparity-based depth coding method.
The thesis proposes compressing disparity maps rather than depth maps, which
greatly reduces the bitrate without significant quality degradation in the
synthesized virtual views. The method targets the specific situation in which
the virtual view position is known before encoding. In current 3D systems the
virtual view position can be obtained through a feedback system, so the proposed
disparity-based depth coding method is of practical value.
3.  A sparse representation based depth enhancement method.
The method builds on the important role of sparse representation in image
inverse problems and combines it with priors on depth maps, namely that most
areas in a depth map are smooth and that the edges of a depth map are strongly
correlated with the edges of the corresponding video, to enhance the quality of
the depth maps.
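
As a rough illustration of contribution 1, the sketch below shows how swapping the distortion term in a Lagrangian cost J = D + lambda * R can change the selected coding mode. It is a minimal sketch under stated assumptions, not the thesis's distortion model: the mode names, distortion values, rates, and the Lagrange multiplier are all hypothetical, and the synthesized-view estimate is simply given as input rather than computed.

```python
from dataclasses import dataclass


@dataclass
class ModeCandidate:
    name: str             # hypothetical mode label
    depth_sse: float      # distortion measured on the depth block itself
    synth_sse_est: float  # estimated distortion in the synthesized view
    rate_bits: float      # bits needed to code the block in this mode


def best_mode(candidates, lam, use_synth_estimate):
    """Return the candidate minimizing J = D + lam * R.

    D is the estimated synthesized-view distortion when
    use_synth_estimate is True, otherwise the depth-map distortion.
    """
    def cost(c):
        d = c.synth_sse_est if use_synth_estimate else c.depth_sse
        return d + lam * c.rate_bits

    return min(candidates, key=cost)


# Hypothetical numbers: a smooth block whose depth error is large in
# depth terms but barely moves pixels in the synthesized view.
candidates = [
    ModeCandidate("SKIP", depth_sse=400.0, synth_sse_est=50.0, rate_bits=2.0),
    ModeCandidate("INTRA_16x16", depth_sse=100.0, synth_sse_est=40.0, rate_bits=60.0),
]
lam = 4.0  # assumed Lagrange multiplier, kept equal in both runs for simplicity

print(best_mode(candidates, lam, use_synth_estimate=False).name)
print(best_mode(candidates, lam, use_synth_estimate=True).name)
```

In this toy case the depth-distortion cost favors the expensive intra mode, while the synthesized-view cost favors the cheap skip mode. A practical encoder would likely also rescale the multiplier when the distortion measure changes; that adjustment is omitted here.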
 
 
Keywords: 3D video coding, Multiview video, View synthesis, Depth map, Sparse 
representation 
 
Contents
Abstract
ABSTRACT
Chapter 1  Introduction
1.1  Background, Purpose, and Significance of the Research
1.2  Current State of 3D Video Research
1.2.1  Multiview Video Coding
1.2.2  MVD-Based 3D Video Coding
1.3  Current State of Research on Depth Map Coding and Restoration
1.3.1  Depth Map Coding Methods
1.3.2  Depth Map Restoration Methods
1.4  Main Research Content and Organization of the Thesis
Chapter 2  Depth Map Coding Based on Synthesized View Distortion Estimation
2.1  Introduction
2.2  Effect of Depth Value Distortion on Geometric Distortion in the Synthesized View
2.3  Geometric Distortion Model Based on Disparity Rounding
2.4  Distortion Estimation Model for the Synthesized View
2.5  Improved Rate-Distortion Model for Depth Map Coding
2.6  Experimental Results and Analysis
2.7  Chapter Summary
Chapter 3  Depth Map Coding Based on View Disparity
3.1  Introduction
3.2  Analysis of View Synthesis under the 1D Parallel Camera Configuration
3.3  Disparity-Based Depth Map Coding Method
3.4  Applications of Disparity-Based Depth Map Coding
3.5  Experimental Results and Analysis
3.6  Chapter Summary
Chapter 4  Sparse Representation Based Depth Map Restoration
4.1  Introduction
4.2  Image Restoration Model and Sparse Representation Theory
4.2.1  Image Restoration Model
 
4.2.2  Sparse Representation Theory
4.3  Sparse Representation Based Depth Map Restoration Method
4.3.1  Sparse Representation Based on Adaptive Sparse Domain Selection (ASDS)
4.3.2  Spatially Adaptive Regularization Based on the AR Model
4.3.3  Regularization Based on Joint Bilateral Filtering (JBF)
4.3.4  Summary of the Proposed Algorithm
4.4  Experimental Results and Analysis
4.5  Chapter Summary
Conclusions
References
Papers Published and Other Achievements during the Degree Program
Harbin Institute of Technology Thesis Statement of Originality and Authorization for Use
Acknowledgements
 
 