Improved Pixel-Based Rate Allocation for Pixel-Domain Distributed Video Coders Without Feedback Channel
Marleen Morbée¹, Josep Prades-Nebot², Antoni Roca², Aleksandra Pižurica¹ and Wilfried Philips¹

¹ TELIN-IPI-IBBT, Ghent University, Ghent, Belgium, mmorbee@telin.ugent.be
² GTS-ITEAM, Universidad Politécnica de Valencia, Valencia, Spain, jprades@dcom.upv.es
Abstract.
In some video coding applications, it is desirable to reduce the complexity of the video encoder at the expense of a more complex decoder. Distributed Video (DV) Coding is a new paradigm that aims at achieving this. To allocate a proper number of bits to each frame, most DV coding algorithms use a feedback channel (FBC). However, in some cases, a FBC does not exist. In this paper, we therefore propose a rate allocation (RA) algorithm for pixel-domain distributed video (PDDV) coders without FBC. Our algorithm estimates at the encoder the number of bits for every frame without significantly increasing the encoder complexity. For this calculation we consider each pixel of the frame individually, in contrast to our earlier work where the whole frame is treated jointly. Experimental results show that this pixel-based approach delivers better estimates of the adequate encoding rate than the frame-based approach. Compared to the PDDV coder with FBC, the PDDV coder without FBC has only a small loss in RD performance, especially at low rates.
1 Introduction
Some video applications, e.g., wireless low-power surveillance, disposable cameras, multimedia sensor networks, and mobile camera phones, require low-complexity coders. Distributed video (DV) coding is a new paradigm that fulfills this requirement by performing intra-frame encoding and inter-frame decoding [1]. Since DV decoders, and not encoders, perform motion estimation and motion-compensated interpolation, most of the computational load is moved from the encoder to the decoder.

One of the most difficult tasks in DV coding is allocating a proper number of bits to encode each video frame. This is mainly because the encoder does not have access to the motion estimation information of the decoder and because small variations in the allocated number of bits can cause large changes in distortion. Most DV coders solve this problem by using a feedback channel (FBC), which allows the decoder to request additional bits from the encoder when needed. Although this way an optimal rate is allocated, it is not a valid solution in unidirectional and offline applications, and increases the decoder complexity and latency [2].

In this paper, we propose a rate allocation (RA) algorithm for pixel-domain distributed video (PDDV) coders that do not use a FBC. Our algorithm computes the number of bits to encode each video frame without significantly increasing the encoder complexity. The proposed method is related to our previous work [3] on PDDV coders without FBC. However, in this paper, the algorithm is improved by estimating the error probabilities for each pixel separately instead of for the whole frame jointly. We also adapted the algorithm for the case of lossy (instead of lossless) coding of the key frames. The experimental results show that the RA algorithm delivers good estimates of the rate and the frame qualities provided by our algorithm are quite close to the ones provided by a FBC-based algorithm. Furthermore, we observe that the rate estimates and frame quality are significantly improved compared to our previous work [3].

The paper is organized as follows. In Section 2, we study the basics of PDDV coding. In Section 3, we study the RA problem and the advantages and inconveniences of using a FBC. Then, in Section 4, we describe the RA algorithm. Subsequently, in Section 5, we compare the performance of a DV coder using a FBC and the performance of the same DV coder using our RA algorithm. Finally, the conclusions are presented in Section 6.

This work has been partially supported by the Spanish Ministry of Education and Science and the European Commission (FEDER) under grant TEC2005-07751-C02-01. A. Pižurica is a postdoctoral research fellow of FWO, Flanders.
2 Pixel-Domain DV coding
In DV coders, the frames are organized into key frames (K-frames) and Wyner-Ziv frames (WZ-frames). The K-frames are coded using a conventional intra-frame coder. The WZ-frames are coded using the Wyner-Ziv paradigm, i.e., they are intra-frame encoded, but they are conditionally decoded using side information (Figure 1). In most DV coders, the odd frames are encoded as K-frames, and the even frames are encoded as WZ-frames [3–5]. Coding and decoding are done non-sequentially in such a way that, before decoding the WZ-frame $X$, the preceding and succeeding K-frames ($X_B$ and $X_F$) have already been transmitted and decoded. Thus, the receiver can obtain a good approximation $S$ of $X$ by interpolating its two closest decoded frames ($\hat{X}_B$ and $\hat{X}_F$). $S$ is used as part of the side information to conditionally decode $X$, as will be explained below.

Fig. 1. General block diagram of a scalable PDDV coder.

The DV coders can be divided into two classes: the scalable coders [2,3,5], and the non-scalable coders [4]. The scalable coders have the advantages that the rate can be flexibly adapted and that the rate control is easier than in the non-scalable case. In this paper, we focus on the practical scalable PDDV coder depicted in Figure 1 [2,3,5]. In this scheme, we first extract the $M$ bit planes (BPs) $X_k$ ($1 \le k \le M$) from the WZ-frame $X$. $M$ is determined by the number of bits by which the pixel values of $X$ are represented. Subsequently, the $m$ most significant BPs $X_k$ ($1 \le k \le m$, $1 \le m \le M$) are encoded independently of each other by a Slepian-Wolf (SW) coder [6]. The transmission and decoding of BPs is done in order of significance (the most significant BPs are transmitted and decoded first). The SW coding is implemented with efficient channel codes that yield parity bits of $X_k$, which are transmitted over the channel. At the receiver side, the SW decoder obtains the original BP $X_k$ from the transmitted parity bits, the corresponding BP $S_k$ extracted from the interpolated frame $S$, and the previously decoded BPs $\{X_1, \ldots, X_{k-1}\}$. Note that $S_k$ can be considered the result of transmitting $X_k$ through a noisy virtual channel. The SW decoder is a channel decoder that recovers $X_k$ from its noisy version $S_k$. Finally, the decoder obtains the reconstruction $\hat{x}$ of each pixel $x \in X$ by using the decoded bits $x_k \in X_k$ ($k = 1, \ldots, m$) and the corresponding pixel $s$ of the interpolated frame $S$ through

$$\hat{x} = \begin{cases} x_L, & s < x_L \\ s, & x_L \le s \le x_R \\ x_R, & s > x_R \end{cases} \qquad (1)$$

with

$$x_L = \sum_{i=1}^{m} x_i \, 2^{8-i} \quad \text{and} \quad x_R = x_L + 2^{8-m} - 1. \qquad (2)$$
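The reconstruction rule of Eqs. (1) and (2) can be illustrated per pixel with a minimal Python sketch. The function names `extract_bits` and `reconstruct_pixel` are hypothetical and chosen only for this illustration; the sketch assumes 8-bit pixels, as implied by the weights $2^{8-i}$ in Eq. (2).

```python
BIT_DEPTH = 8  # Eq. (2) uses weights 2^(8-i), so 8-bit pixels are assumed


def extract_bits(x, m):
    """Return the m most significant bits x_1..x_m of an 8-bit pixel value."""
    return [(x >> (BIT_DEPTH - i)) & 1 for i in range(1, m + 1)]


def reconstruct_pixel(bits, s):
    """Eq. (1)-(2): clamp the side-information pixel s to the bin [x_L, x_R]."""
    m = len(bits)
    x_left = sum(b << (BIT_DEPTH - i) for i, b in enumerate(bits, start=1))
    x_right = x_left + (1 << (BIT_DEPTH - m)) - 1
    return min(max(s, x_left), x_right)


# Pixel x = 200 with m = 2 decoded bit planes: the bin is [192, 255].
bits = extract_bits(200, 2)
print(reconstruct_pixel(bits, 180))  # side information below the bin, prints 192
print(reconstruct_pixel(bits, 210))  # side information inside the bin, prints 210
```

With all eight bit planes decoded ($m = 8$), the bin collapses to a single value and the reconstruction equals the original pixel regardless of $s$.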
3 The rate allocation problem
In PDDV coders, the optimum rate $R^*$ is the minimum rate necessary to losslessly¹ decode the BPs $X_k$ ($k = 1, \ldots, m$). The use of a rate higher than $R^*$ does not lead to a reduction in distortion, but only to an unnecessary bit expense. On the other hand, encoding with a rate lower than $R^*$ can cause the introduction of a large number of errors in the decoding of $X_k$, which can greatly increase the distortion. This is because of the threshold effect of the channel codes used in DV coders.

A common RA solution adopted in DV coders is the use of a FBC and a rate-compatible punctured turbo code (RCPTC) [7]. In this configuration, the turbo encoder generates all the parity bits for the BPs to be encoded, saves these bits in a buffer (see Figure 1), and divides them into parity bit sets. The size of a parity bit set is $N/T_{\mathrm{punc}}$, where $T_{\mathrm{punc}}$ is the puncturing period of the RCPTC and $N$ is the number of pixels in each frame. To determine the adequate number of parity bit sets to send for a certain BP $X_k$, the encoder first transmits one parity bit set from the buffer. Then, if the decoder detects that the residual error probability $Q_k$ (for the calculation see Section 4.4) is above a threshold $t$, it requests an additional parity bit set from the buffer through the FBC. This transmission-request process is repeated until $Q_k < t$. If we denote by $K_k$ the number of transmitted parity bit sets, then the encoding rate $R_k$ for BP $X_k$ is

$$R_k = r \, K_k \, \frac{N}{T_{\mathrm{punc}}}, \qquad (3)$$

with $r$ being the frame rate of the video.

However, although the FBC allows the system to allocate an optimal rate, this FBC cannot be implemented in offline applications or in those applications where communication from the decoder to the encoder is not possible. In those applications, an appropriate RA algorithm at the encoder can take over its role. In the following section, we will describe this RA algorithm to suppress the FBC in more detail.
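The transmission-request process and Eq. (3) can be sketched as follows. This is only an illustration under stated assumptions: the callback `residual_error`, the cap `max_sets`, and all numeric values (frame size, puncturing period, frame rate, and the toy error table) are hypothetical and do not come from the experiments in this paper.

```python
def parity_sets_with_fbc(residual_error, threshold, max_sets):
    """Count the parity bit sets K_k sent before Q_k drops below the threshold.

    residual_error(k) is a hypothetical callback that returns the residual
    error probability Q_k observed by the decoder after k parity bit sets.
    """
    num_sets = 1  # the encoder always starts by sending one parity bit set
    while residual_error(num_sets) >= threshold and num_sets < max_sets:
        num_sets += 1  # the decoder requests one more set through the FBC
    return num_sets


def encoding_rate(frame_rate, num_sets, num_pixels, t_punc):
    """Eq. (3): R_k = r * K_k * N / T_punc, in bits per second."""
    return frame_rate * num_sets * num_pixels / t_punc


# Toy decoder behaviour (illustrative values only): Q_k after 1..4 sets.
toy_q = {1: 0.2, 2: 0.05, 3: 0.004, 4: 0.0002}
k_sets = parity_sets_with_fbc(lambda k: toy_q[k], threshold=1e-3, max_sets=4)
print(k_sets)  # prints 4

# Assumed example: N = 176 * 144 pixels, T_punc = 32, r = 15 frames/s.
print(encoding_rate(15, k_sets, 176 * 144, 32))  # prints 47520.0
```

The loop makes the cost of the FBC explicit: every iteration corresponds to one decoding attempt and one round trip between decoder and encoder.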
4 The rate allocation algorithm
The main idea of the proposed method is to estimate at the encoder side, for each BP of the WZ-frames, the optimal (i.e. the minimal required) number of parity bits for a given residual error probability. An important aspect of the proposed approach is also avoiding underestimation of the optimal number of parity bits. Indeed, if the rate is underestimated, the decoding of the BPs of the frames will not be error-free and this will lead to a large increase in distortion.

Fig. 2. Rate allocation module at the encoder.

¹ In practical PDDV coding, SW decoders are allowed to introduce a certain small amount of errors.

Let us denote by $U$ the difference between the original frame and the side information frame: $U = X - S$. As in [3–5], we assume that a pixel value $u \in U$ follows a Laplacian distribution with a probability density function (pdf)

$$p(u) = \frac{\alpha}{2} \, e^{-\alpha |u|} \qquad (4)$$

where $\alpha = \sqrt{2}/\sigma$ and $\sigma$ is the standard deviation of the difference frame $U$.
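The Laplacian model of Eq. (4) can be made concrete with a short Python sketch. The function names are hypothetical. The helper `prob_abs_exceeds` is not part of the paper; it is the tail probability $P(|u| > d) = e^{-\alpha d}$ that follows from integrating Eq. (4), included here only to show how $\sigma$ controls the likelihood of large differences between $X$ and $S$.

```python
import math


def laplacian_alpha(sigma):
    """alpha = sqrt(2) / sigma, as defined below Eq. (4)."""
    return math.sqrt(2.0) / sigma


def laplacian_pdf(u, sigma):
    """Eq. (4): p(u) = (alpha / 2) * exp(-alpha * |u|)."""
    alpha = laplacian_alpha(sigma)
    return 0.5 * alpha * math.exp(-alpha * abs(u))


def prob_abs_exceeds(d, sigma):
    """P(|u| > d) = exp(-alpha * d) for d >= 0, from integrating Eq. (4)."""
    return math.exp(-laplacian_alpha(sigma) * d)


# Illustrative value only: sigma = 5 grey levels.
sigma = 5.0
print(laplacian_pdf(0.0, sigma))      # peak of the pdf, alpha / 2
print(prob_abs_exceeds(16.0, sigma))  # chance that |X - S| exceeds 16 levels
```

A larger $\sigma$ gives a heavier tail, so more pixels of $S$ fall outside the quantization bin of $X$ and more bit errors have to be corrected.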
As every BP of a WZ-frame $X$ is separately encoded, a different encoding rate $R_k$ must be allocated to each BP $X_k$. As the virtual channel is assumed to be a binary symmetric channel, to obtain $R_k$, we need to know the bit error probability $P_k$ of each BP $X_k$. To calculate this probability, we first make an estimate $\hat{\sigma}^2$ of the parameter $\sigma^2$ (Section 4.1). Then, for each BP $X_k$, we use $\hat{\sigma}$ to estimate $P_k$ (Section 4.2). Once $P_k$ is estimated, we can determine the encoding rate $R_k$ for BP $X_k$ by taking into account the error correcting capacity of the turbo code (Section 4.3). In Figure 2, a block diagram of the RA module is depicted.

Although we aim at an overestimation of the rate, this is not always achieved. Therefore, once the parity bits have been decoded, the residual error probability $Q_k$ is estimated at the decoder ($\hat{Q}_k$) (Section 4.4). If $\hat{Q}_k$ is above a threshold $t$, the parity bits of the considered BP are discarded and the frame is reconstructed with the available previously decoded BPs. This way, we prevent an increase in the distortion caused by an excessive number of errors in a decoded BP. In the following, we explain each step of our RA algorithm in more detail.
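The decoder-side safeguard described above can be sketched as a simple selection rule. The function name `usable_bit_planes` and the numeric values are hypothetical. The sketch also assumes that once one BP is discarded, all less significant BPs are dropped as well, since the reconstruction uses a contiguous run of the most significant BPs; the text does not state this explicitly.

```python
def usable_bit_planes(q_estimates, threshold):
    """Count the leading bit planes whose estimated residual error is acceptable.

    q_estimates[k - 1] is the decoder-side estimate of Q_k for BP X_k,
    ordered from most to least significant.
    """
    count = 0
    for q_hat in q_estimates:
        if q_hat > threshold:
            break  # discard this BP; assumption: later BPs are dropped too
        count += 1
    return count


# Illustrative estimates for four BPs; the third one exceeds the threshold.
print(usable_bit_planes([1e-5, 2e-4, 0.03, 1e-4], threshold=1e-3))  # prints 2
```

In this example the frame would be reconstructed from the first two BPs only, which trades a coarser quantization bin for the absence of badly decoded bits.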
4.1 Estimation of $\sigma^2$
We estimate $\sigma^2$ at the encoder, so the estimate should be very simple in order to avoid significantly increasing the encoder complexity. We adopt the approach of [3], but we take the coding of the K-frames into account. $\hat{\sigma}^2$ is then the mean squared error (MSE) between the current WZ-frame and the average of the two closest decoded K-frames:

$$\hat{\sigma}^2 = \frac{1}{N} \sum_{(v,w) \in X} \left( X(v,w) - \frac{\hat{X}_B(v,w) + \hat{X}_F(v,w)}{2} \right)^2 \qquad (5)$$

with $N$ denoting the number of pixels in each frame. The decoded frames are obtained by the intra-frame decoding unit at the encoder site (see Figure 1).

In general, the resulting $\hat{\sigma}^2$ is an overestimate of the real $\sigma^2$ since it is expected that the motion-compensated interpolation performed at the decoder to