Objectmix
Tags Register Mark Forums Read

Optimizing alpha blending of two 32bit images : Graphics

This is a discussion on Optimizing alpha blending of two 32bit images within the Graphics forums in Theory and Concepts category; Hi, I'm working on the alpha blending algorithm of two 32bit images. The background image is not opaque. The result should be an image that contains both the alpha channel mixed. I searched for the algorithm and I found this rule: FinalAlpha = 255-(255-Alpha1)*(255-Alpha2)/255; FinalColor = Alpha1*Color1/FinalAlpha + Alpha2*Color2*(255-Alpha1)/ FinalAlpha/255; where Color1/Alpha1 represents the foreground image, and Color2/Alpha2 represents the background. So I came up with the following code to blend them together: // pxD is the RGBA sequence for the output, // pxFG is foreground while pxBg is background BYTE *pxD, *pxFg, *pxBg; BYTE aD; for (px=0; px < ...


Object Mix > Theory and Concepts > Graphics > Optimizing alpha blending of two 32bit images

Reply

 

LinkBack Thread Tools
  #1  
Old 04-07-2008, 07:46 AM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Optimizing alpha blending of two 32bit images

Hi,

I'm working on the alpha blending algorithm of two 32bit images. The
background image is not opaque. The result should be an image that
contains both the alpha channel mixed. I searched for the algorithm
and I found this rule:

FinalAlpha = 255-(255-Alpha1)*(255-Alpha2)/255;
FinalColor = Alpha1*Color1/FinalAlpha + Alpha2*Color2*(255-Alpha1)/
FinalAlpha/255;

where Color1/Alpha1 represents the foreground image, and Color2/Alpha2
represents the background.

So I came up with the following code to blend them together:

// pxD is the RGBA sequence for the output,
// pxFG is foreground while pxBg is background
BYTE *pxD, *pxFg, *pxBg;
BYTE aD;
for (px=0; px < nTotalPx; px+=4)
{
aD = 255 - (255-pxFg[px+3]) * (255-pxBg[px+3]) / 255;
if (aD != 0 )
{
pxD[px+0] = pxFg[px+3] * pxFg[px+0] / aD + pxBg[px+0] * pxBg[px+3]
* (255 - pxFg[px+3]) / aD / 255;
pxD[px+1] = pxFg[px+3] * pxFg[px+1] / aD + pxBg[px+1] * pxBg[px+3]
* (255 - pxFg[px+3]) / aD / 255;
pxD[px+2] = pxFg[px+3] * pxFg[px+2] / aD + pxBg[px+2] * pxBg[px+3]
* (255 - pxFg[px+3]) / aD / 255;
}
pxD[px+3] = aD;
}

It works but it looks pretty complicated to me. And I have no idea how
to optimize it. I appreciate it if anyone can give me some hints. Will
MMX help in this situation?

Thanks in advance,
--
Bill Holt
Reply With Quote
  #2  
Old 04-07-2008, 08:22 AM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

Bill Holt <mailbill@21cn.com> wrote:

> Hi,
>
> I'm working on the alpha blending algorithm of two 32bit images. The
> background image is not opaque. The result should be an image that
> contains both the alpha channel mixed.


You mean: final_color = background + foreground*(bck_alpha * fore_alpha)?
Integer SIMD instructions (MMX/SSE2) will help.

w.
Reply With Quote
  #3  
Old 04-07-2008, 08:42 AM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

On Apr 7, 9:22 pm, Wojciech Muła
<wojciech_m...@poczta.null.onet.pl.invalid> wrote:
> You mean: final_color = background + foreground*(bck_alpha * fore_alpha)?
> Integer SIMD instructions (MMX/SSE2) will help.
>
> w.


Thanks, but where did you find that algorithm? The result doesn't look
right. The bright part is brighter than expected, and the dark part is
darker. What I want to simulate, is the effect that one translucent
image is layered above another translucent image in Photoshop, without
the background layer of course.

Best regards,
Bill Holt
Reply With Quote
  #4  
Old 04-07-2008, 09:21 AM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

Look at Graphics32 it has MMX optimisation
Reply With Quote
  #5  
Old 04-07-2008, 04:56 PM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

Bill Holt wrote:
> I'm working on the alpha blending algorithm of two 32bit images. The
> background image is not opaque. The result should be an image that
> contains both the alpha channel mixed. I searched for the algorithm
> and I found this rule:
>
> FinalAlpha = 255-(255-Alpha1)*(255-Alpha2)/255;
> FinalColor = Alpha1*Color1/FinalAlpha + Alpha2*Color2*(255-Alpha1)/
> FinalAlpha/255;
>
> where Color1/Alpha1 represents the foreground image, and Color2/Alpha2
> represents the background.


This rule can be derived in many different ways and I think it's probably
one of the most essential rules in digital image compositing.

It was first described in 1981 in a SIGGRAPH paper by Bruce A. Wallace and
Mark Levoy:

<http://graphics.stanford.edu/papers/merging-sig81/>

It's also been described in the following paper by Alvy Ray Smith (where he
argues that premultiplied alpha values should be used whenever possible):

<ftp://ftp.alvyray.com/Acrobat/4_Comp.pdf>


One way to approach this problem is to think of it in terms of physical
transportation of light. If we think of a single photon being transmitted
onto the foreground layer, we know that it may be either reflected or
absorbed by the layer, or it may be transmitted through it. We also know
that the probability of each of these events depends on the color and the
translucensy of this particular layer. In fact, the opacity of the layer can
be thought of as the probability by which the photon is either absorbed or
reflected.

Now, the derivation of the formula will require some understanding of
conditional probability and Bayesian statistics.

Let us start by defining the following events:

r1 -- the photon is reflected or absorbed by the foreground layer,
r2 -- the photon is reflected or absorbed by the background layer,
rx -- the photon is reflected or absorbed by either layer [r1 OR r2].

We can then define the probability P(r1), that the photon is
reflected/absorbed by layer 1 and the conditional probabilities P(r2|r1 =
false) and P(r2|r1 = true) that the photon is reflected/absorbed by layer 2,
given that it is first transmitted through layer 1:

P(r1) = a1 (a1 = foreground alpha)
P(r2|r1 = false) = a2 (a2 = background alpha)

The probability that the photon is reflected/absorbed by layer 2, P(r2), is
given as the joint probability of r1 = false and r2, i.e.:

P(r2) = P(r1 = false, r2) = P(r1 = false) * P(r2|r1 = false) = (1 - a1) *
a2.

We know that the output alpha, a_out, should yield the same reflectance
probability as the total reflectance probability of both of the unmerged
layers, and hence:

a_out = P(rx) = P(r1) + P(r2) = a1 + a2 - a1 * a2

Furthermore, we may think of the color value of a layer is the ratio between
what is reflected and what is absorbed. In order to determine the output
color, c_out, we need to know the color contribution of the foreground and
the background layer respectively. Thus, we need to know the conditional
probability that given that the photon is reflected/absorbed, it is
reflected/absorbed by either the foreground or the background, i.e. P(r1|rx)
and P(r2|rx).

This is where Bayes theorem comes to rescue. Bayes theorem states that we
can relate two random events A and B according to the following formula:

P(A|B) = P(B|A)*P(A)/P(B)

We know that given that the photon is reflected by either layer, the
probability of rx must necessarily be 1,

P(rx|r1) = P(rx|r2) = 1,

and consequently, from Bayes theorem, it follows that

P(r1|rx) = P(rx|r1)*P(r1)/P(rx) = a1 / (a1 + a2 - a1 * a2)
P(r2|rx) = P(rx|r2)*P(r2)/P(rx) = (1 - a1) * a2 / (a1 + a2 - a1 * a2)

If c1 and c2 are the colors of the foreground and the background
respectively, then the resultant color is given as sum of the contribution
of each layer:

c_out = c1 * P(r1|rx) + c2 * P(r2|rx)
= (c1 * a1 + c2 * (1 - a1) * a2) / (a1 + a2 - a1 * a2)
= (c1 * a1 + c2 * (1 - a1) * a2) / a_out

By using premultiplied opacity values, this can further be simplified to:

c_out = c1 + c2 * (1 - a1)

Mattias

Reply With Quote
  #6  
Old 04-07-2008, 11:06 PM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

On Apr 8, 5:56 am, "Mattias Andersson" <matt...@centaurix.com> wrote:
> By using premultiplied opacity values, this can further be simplified to:
>
> c_out = c1 + c2 * (1 - a1)
>
> Mattias


Thank you Mattias for the explanation. Now I understand the origin of
this rule. As you suggested, I used premultiplied alpha and your last
equation to rewrite the blending code. Evaluation showed a 30% of
performance increase. I guess premultiplication contributed a lot.
With division optimization I found somewhere else, I got another 20%
increase. Now the code looks like this:

#define DIVIDE_BY_255(thisInput)\
((thisInput+(((thisInput+128)>>8)+128))>>8 )

WORD tmpVar1, tmpVar2;
DWORD tmpVar3;
for (px=0; px < nTotalPx; px+=4)
{
tmpVar1 = pixelsFg[px+3]*pixelsBg[px+3];
pxD[px+3] = pixelsFg[px+3]+pixelsBg[px+3]-DIVIDE_BY_255(tmpVar1);

tmpVar1 = 255-pixelsFg[px+3];

tmpVar2 = pixelsFg[px+0]*pixelsFg[px+3];
tmpVar3 = pixelsBg[px+0]*pixelsBg[px+3]*tmpVar1;
pxD[px+0] =
DIVIDE_BY_255(tmpVar2)+DIVIDE_BY_255(DIVIDE_BY_255 (tmpVar3));

tmpVar2 = pixelsFg[px+1]*pixelsFg[px+3];
tmpVar3 = pixelsBg[px+1]*pixelsBg[px+3]*tmpVar1;
pxD[px+1] =
DIVIDE_BY_255(tmpVar2)+DIVIDE_BY_255(DIVIDE_BY_255 (tmpVar3));

tmpVar2 = pixelsFg[px+2]*pixelsFg[px+3];
tmpVar3 = pixelsBg[px+2]*pixelsBg[px+3]*tmpVar1;
pxD[px+2] =
DIVIDE_BY_255(tmpVar2)+DIVIDE_BY_255(DIVIDE_BY_255 (tmpVar3));
}

It's a bit hard to read. But it works perfectly.

I'm still looking for SSE or MMX hints to optimize the above code. Or
is it still worth further working?

Best regards,
Bill Holt
Reply With Quote
  #7  
Old 04-08-2008, 05:41 AM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

Bill Holt wrote:
> I'm still looking for SSE or MMX hints to optimize the above code. Or
> is it still worth further working?


Well, there is always a trade-off between performance and accuracy. In this
routine you rely on using a DIVIDE_BY_255 look up table. If you were using
MMX, this wouldn't be a feasible, since there are no instructions for
reading from multiple memory locations at once. Instead you would have to
use bitwise shift instructions, which will impact the accuracy.

In the Graphics32 library, we decided not to use MMX, since the look up
table approach was both faster and had better accuracy.

One optimization, however, would be to also take some special cases into
account, e.g.:

if (a1 == 0) return c2;
if (a1 == 1) return c1;
if (a2 == 0) return c1;
if (a2 == 1) return a1 * c1 + (1 - a1) * c2;

Because quite often pixels may be either fully transparent or fully opaque.

HTH,
Mattias

Reply With Quote
  #8  
Old 04-08-2008, 07:19 AM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

"Mattias Andersson" <mattias@centaurix.com> wrote:

> Bill Holt wrote:
> > I'm still looking for SSE or MMX hints to optimize the above code. Or
> > is it still worth further working?

>
> Well, there is always a trade-off between performance and accuracy. In this
> routine you rely on using a DIVIDE_BY_255 look up table. If you were using
> MMX, this wouldn't be a feasible, since there are no instructions for
> reading from multiple memory locations at once. Instead you would have to
> use bitwise shift instructions, which will impact the accuracy.
>
> In the Graphics32 library, we decided not to use MMX, since the look up
> table approach was both faster and had better accuracy.


MMX has too short registers and do not provide any instruction to get a
permutation of bytes (it is really hard to populate alpha values).
However SSE instruction set brings 128-bit registers, and SSSE3 (or
SSE3) has nice instruction called PSHUFB, that shuffles bytes in any
order. It is possible (in best case) to add & multiply eight 16-bit
components at the same time.

Intel has published lot of MMX application notes, I think most of them
remain feasible for SSE integer instructions:
http://softwarecommunity.intel.com/a...s/eng/1713.htm.

w.
Reply With Quote
  #9  
Old 04-08-2008, 07:21 AM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

Bill Holt <mailbill@21cn.com> wrote:

> Thanks, but where did you find that algorithm?


Sorry, I didn't understand your problem and post too fast.

w.
Reply With Quote
  #10  
Old 04-08-2008, 11:38 AM
Junior Member
 
Join Date: Nov 2009
Posts: 0
Application Development is on a distinguished road
Default Re: Optimizing alpha blending of two 32bit images

For division by 255 there is the good old Jim Blinn-Trick which encodes
the first continued fraction and adds some rounding:


ui32 iyteblend (ui8 a, ui8 b)
// returns a*b/255. Exact in all corner-cases but
// some difference in the inbetweens.
{
i32 t = a * b;
return ((t>>8) + t + 1)>>8;
}


For ARGB32 blending you might want to check out this weblink:

http://www.stereopsis.com/doubleblend.html

Nils
Reply With Quote
Reply

Thread Tools



All times are GMT -5. The time now is 01:39 PM.