Vertex arrays and unexpected behaviour - Graphics
-
Vertex arrays and unexpected behaviour
Hello,
I've been looking at options for displaying large terrains quickly and
efficiently, and come across some performance data that is the reverse of
what I would expect.
I've looked at
1) Immediate mode glBegin, glVertex etc
2) Immediate mode with arrays glVertexPointer, glDrawElements etc
3) VBO objects, glBindBufferARB etc
VBOs (3) seem to be incredibly slow, whereas (1) seems slightly faster
than (2). This is the complete reverse of what I've always been
led to believe! (Surely it should be slower -> 1, 2, 3 -> faster?)
I've ruled out how I use glEnableClientState(): I get the same behaviour
whether I turn it on just for the terrain's polygons or leave it on all the time.
All my data is aligned as floats or doubles.
I'm at a complete loss as to what's going on.
if (file->use_vbo)
{
    glBindBufferARB(GL_ARRAY_BUFFER_ARB,file->vxbuffers[BUFFER_VERTEX]);
    glVertexPointer(3,GL_DOUBLE,0,NULL);
    if (texture)
    {
        glBindBufferARB(GL_ARRAY_BUFFER_ARB,file->vxbuffers[BUFFER_UV]);
        glTexCoordPointer(2,GL_FLOAT,0,NULL);
    }
}
else
{
    glVertexPointer(3,GL_DOUBLE,0,file->vertex_buffer.Items());
    if (texture)
        glTexCoordPointer(2,GL_FLOAT,0,file->uv_buffer.Items());
}
unsigned int *inx=...
glDrawElements(mode,count,GL_UNSIGNED_INT,inx);
-
Re: Vertex arrays and unexpected behaviour
"Makhno" <root@127.0.0.1> wrote in message
news:yJudnaeSXph97E_ZRVnysw@bt.com...
> I've looked at
>
> 1) Immediate mode glBegin, glVertex etc
> 2) Immediate mode with arrays glVertexPointer, glDrawElements etc
> 3) VBO objects, glBindBufferARB etc
>
> VBOs (3) seems to be incredibly slow, whereas (1) seems slightly faster
> than (2)
It's quite possible that (1) is faster. It can well be faster than a naive
(2), which doesn't have the correct async control unless you use the "VAR &
fence" extensions (GL_NV_vertex_array_range and GL_NV_fence).
[http://www.delphi3d.net/articles/vie...=varfence.htm]
Some off-the-cuff ideas:
0) Are you sure that you are timing OpenGL accurately? That is, are you
taking an average draw rate over a fairly large number of frames?
1) How do you build the VBOs? With
glBufferDataARB(... --->GL_STATIC_DRAW_ARB<---), I hope? Or memory-mapped?
Are you *sure* you aren't loading the buffer into VRAM every draw time?
2) Try #1 and #2 with display lists just for laughs. They could be 10x
faster that way. Not that that explains (3).
But if (1) in a display list is slower, or not much faster, then you
probably aren't geometry limited at all, but rather fill-rate or
texture-fetch-rate limited, in which case vertex acceleration methods won't
help you much.
3) Try floats for coordinates. Doubles are a waste of memory and may be
the root cause of the problem. It's unlikely that your card does doubles
internally anyway. Since you only build the VBOs once (right?), a CPU
conversion to floats is just a one-time hit.
-jbw
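As a rough illustration of suggestion 3 above, here is a minimal sketch of a one-time conversion from doubles to floats at load time. The function name to_float_positions and the use of std::vector are assumptions made for the example; the original code keeps its data in its own buffer class (file->vertex_buffer), whose interface is not shown in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convert a double-precision position array (x, y, z triples) to floats.
// Intended to run once, when the terrain is loaded, not once per frame.
// NOTE: hypothetical helper, not part of the original poster's code.
std::vector<float> to_float_positions(const std::vector<double>& src)
{
    assert(src.size() % 3 == 0 && "expected x, y, z triples");

    std::vector<float> dst;
    dst.reserve(src.size());
    for (double v : src)
        dst.push_back(static_cast<float>(v));  // narrowing is intentional
    return dst;
}
```

The converted array would then be uploaded once with glBufferDataARB(GL_ARRAY_BUFFER_ARB, size, data, GL_STATIC_DRAW_ARB) and described with glVertexPointer(3,GL_FLOAT,0,NULL) in place of GL_DOUBLE.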
-
Re: Vertex arrays and unexpected behaviour
> It's very possible that 1 can be faster...It can well be faster than a
> naive 2, which doesn't have the correct async control unless you use the
> "VAR & fence" extensions. (GL_NV_vertex_array_range and GL_NV_fence)
Radeon 9000 card.
> Some off the cuff ideas;
>
> 0) are you sure that you are accurately timing OpenGL ?
I can count seconds between the frames with my VBO implementation. I do not
need to time it to know it's slow.
> 1) How do you build the VBO's ?
> (glBufferDataARB(... --->GL_STATIC_DRAW_ARB<---) , I hope ? Or
> memory-mapped ?
> Are you *sure* you aren't loading the buffer into VRAM every draw time ?
Quite sure.
> 2) Try #1 and #2 with Display Lists just for laughs. They could be 10x
> faster that way. Not that that explains (3).
I will.
> But, if (1) in a display list is slower, or not much faster, then
> you probably aren't geometry limited at all, but rather fill- or
> texture-fetch- rate limited, in which case vertex acceleration methods
> won't help you much.
Is it possible it is something to do with glDrawElements being called for
each triangle?
> 3) try floats for coordinates.
It appears you are correct: using floats does indeed speed up the VBO until
it is at least as fast as (1) was, possibly faster.
I'm using doubles because I hope that some day graphics cards will truly use
64-bit doubles. Certain applications need this, as single-precision
floats become quite inaccurate at 100km or so.
Does anybody know of any new cards that use doubles? For vertices as well
as the depth buffer?
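To put a number on the precision concern, the following sketch measures the gap between adjacent single-precision values at a given magnitude. It assumes one unit is one metre, which the thread does not state, and float_spacing is a name invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Distance from x to the next representable float above it.
// This is the finest step a vertex coordinate near x can resolve.
float float_spacing(float x)
{
    assert(std::isfinite(x));
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}
```

Under that assumption, float_spacing(100000.0f) is 0.0078125, so a coordinate 100km from the origin can only move in steps of roughly 8mm, while near the origin the step is many orders of magnitude smaller.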
-
Re: Vertex arrays and unexpected behaviour
Makhno wrote:
> glVertexPointer(3,GL_DOUBLE,0,NULL);
>
"double"? There's your problem. The hardware
can't work with doubles so the driver has
to do something horrible to get arrays to
work. Converting doubles to floats in
immediate mode is cheap by comparison.
--
<\___/>
/ O O \
\_____/ FTB. For email, remove my socks.
In science it often happens that scientists say, 'You know
that's a really good argument; my position is mistaken,'
and then they actually change their minds and you never
hear that old view from them again. They really do it.
It doesn't happen as often as it should, because scientists
are human and change is sometimes painful. But it happens
every day. I cannot recall the last time something like
that happened in politics or religion.
- Carl Sagan, 1987 CSICOP keynote address
-
Re: Vertex arrays and unexpected behaviour
"Makhno" <root@127.0.0.1> wrote in message
news:BPCdnfmuI_3AC0_ZRVnyhQ@bt.com...
>> It's very possible that 1 can be faster...It can well be faster than a
>> naive 2, which doesn't have the correct async control unless you use the
>> "VAR & fence" extensions. (GL_NV_vertex_array_range and GL_NV_fence)
>
> Radeon 9000 card.
>
>> Some off the cuff ideas;
>>
>> 0) are you sure that you are accurately timing OpenGL ?
>
> I can count seconds between the frames with my VBO implementation. I do
> not
> need to time it to know it's slow.
>
>> 1) How do you build the VBO's ?
>> (glBufferDataARB(... --->GL_STATIC_DRAW_ARB<---) , I hope ? Or
>> memory-mapped ?
>
>
>> Are you *sure* you aren't loading the buffer into VRAM every draw time ?
>
> Quite sure.
>
>> 2) Try #1 and #2 with Display Lists just for laughs. They could be 10x
>> faster that way. Not that that explains (3).
>
> I will.
It still might be interesting to see, even though the VBO problem is solved...
>
>> But, if (1) in a display list is slower, or not much faster, then
>> you probably aren't geometry limited at all, but rather fill- or
>> texture-fetch- rate limited, in which case vertex acceleration methods
>> won't help you much.
>
> Is it possible it is something to do with glDrawElements being called for
> each triangle?
That would be a heck yes. Immediate mode is faster than doing that!
>
>> 3) try floats for coordinates.
>
> It appears you are correct, using floats does indeed speed up the VBO
> until it is at least as fast as (1) was, possibly faster.
>
aha.
> I'm using doubles because I hope that some day graphics cards will truly
> use
> 64-bit doubles. For certain applications this is needed as
> single-precision
> float become quite inaccurate at 100km or so.
Well, maybe so, but consumer graphics cards won't be doing that soon, so you
need to refactor your app.
The 24-bit fixed-point depth buffer seems to be pretty entrenched at this
point. Games don't need more, so don't bet on a game card having more any
time soon.
>
> Does anybody know of any new cards that use doubles? For verticies as well
> as depth buffer?
-
Re: Vertex arrays and unexpected behaviour
>>> 2) Try #1 and #2 with Display Lists just for laughs. They could be 10x
>>> faster that way. Not that that explains (3).
>>
>> I will.
>
> Still might be interesting to see even tho the VBO problem is solved...
I will try to have a look, but architecturally it's a hard thing to do, as my
terrain is a scene graph with multiple LoDs. At what level in the hierarchy
should I turn commands into lists?
>> Is it possible it is something to do with glDrawElements being called for
>> each triangle?
>
> That would be a heck yes. Immediate mode is faster than doing that!
Unfortunately, each triangle could potentially have a different texture and
other settings (colour, material, culling, depth bias etc), which makes it
difficult to get any command to draw more than one triangle at a time.
Now that I'm using floats, VBOs are noticeably, but not considerably, faster
than immediate mode.
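One common way around the one-call-per-triangle problem is to sort triangles by their render state and issue one draw call per group. The sketch below is only an illustration of that idea, not the poster's code: Triangle, Batch, build_batches and the single state_key (standing in for texture, material, culling and so on) are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Triangle {
    unsigned int state_key;  // hypothetical id combining texture/material/etc.
    unsigned int idx[3];     // vertex indices
};

struct Batch {
    unsigned int state_key;
    std::size_t first;       // offset of the batch in the merged index array
    std::size_t count;       // number of indices in the batch
};

// Group triangles that share a state into contiguous index ranges.
std::vector<Batch> build_batches(std::vector<Triangle> tris,
                                 std::vector<unsigned int>& indices_out)
{
    std::stable_sort(tris.begin(), tris.end(),
                     [](const Triangle& a, const Triangle& b) {
                         return a.state_key < b.state_key;
                     });

    indices_out.clear();
    indices_out.reserve(tris.size() * 3);
    std::vector<Batch> batches;
    for (const Triangle& t : tris) {
        // Start a new batch whenever the state changes.
        if (batches.empty() || batches.back().state_key != t.state_key)
            batches.push_back({t.state_key, indices_out.size(), 0});
        indices_out.insert(indices_out.end(), t.idx, t.idx + 3);
        batches.back().count += 3;
    }
    assert(indices_out.size() == tris.size() * 3);
    return batches;
}
```

Each batch could then be drawn by setting its state once and calling glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_INT, ...) on its index range. How much this helps depends on how many triangles actually share a state.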
>> I'm using doubles because I hope that some day graphics cards will truly
>> use 64-bit doubles. For certain applications this is needed as
>> single-precision float become quite inaccurate at 100km or so.
>
> Well, maybe so, but consumer graphics cards won't be doing that soon, so
> you need to refactor your app.
> 24-bit fixed point depth buffer seems to be pretty entrenched at this
> point. Don't need more for games, so, don't bet on a game card having more
> any time.
Which is annoying, as even if the card were 4x as expensive, we'd still buy
it if it natively handled doubles, as it would solve a lot of our problems
with large terrain areas.
Thanks for your help - my previous reply was a little peeved as I had been
up until 2am figuring out my VBO problem.
-
Re: Vertex arrays and unexpected behaviour
>>>> 2) Try #1 and #2 with Display Lists just for laughs. They could
>>>> be 10x faster that way.
Though they appear to be running slightly faster once compiled, the actual
compilation slows the application down, and there's no way I can do this in
advance due to the size of the terrain.
Moving slowly across the terrain helps a bit, but this would sacrifice my
engine's capability for marginal (if any) performance gain.
-
Re: Vertex arrays and unexpected behaviour
Makhno wrote:
>
> I will try an have a look, but architecturally, it a hard thing to do as my
> terrain is a scene graph with multiple LoDs;
>
LODs are _almost_ a waste of time on modern cards
- most cards can now process a triangle per clock
cycle, which is usually much more than the rasterizer
can handle (i.e. the bottleneck isn't vertices).
It might be worth it for really big terrains which
page to/from disk, but with well organized VBOs the
number of triangles is pretty much irrelevant. The
latest cards will handle hundreds of millions of
triangles per second, no problem.
> Which is annoying, as even if the card was x4 expensive, we'd still buy it
> if it natively handled doubles, as it would solve a lot of our problems with
> large terrain areas.
>
I don't think that's going to happen any time soon.
Floats are good enough for 3D graphics and switching
to doubles would eat up an awful lot of silicon.
If you're having accuracy problems on big terrains
you need to chop the terrain up into tiles, translate
each tile to the origin, then put an inverse transform
into the modelview matrix when you render the tile.
This will fix any problems with floating point
precision. The tiles also come in handy for culling.
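The tiling approach described above might look roughly like the sketch below. It is a variation rather than a literal transcription: vertices are stored as floats relative to their tile's origin, and the per-tile translation is computed relative to the camera in double precision before being narrowed to float. Vec3d, Vec3f, make_tile_local and tile_offset_from_camera are names made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3d { double x, y, z; };
struct Vec3f { float x, y, z; };

// Store a tile's vertices as floats relative to the tile's own origin,
// so the values stay small and keep their precision.
std::vector<Vec3f> make_tile_local(const std::vector<Vec3d>& world,
                                   const Vec3d& tile_origin)
{
    assert(std::isfinite(tile_origin.x) && std::isfinite(tile_origin.y) &&
           std::isfinite(tile_origin.z));

    std::vector<Vec3f> local;
    local.reserve(world.size());
    for (const Vec3d& p : world) {
        // Subtract in double first, then narrow the small result to float.
        local.push_back({static_cast<float>(p.x - tile_origin.x),
                         static_cast<float>(p.y - tile_origin.y),
                         static_cast<float>(p.z - tile_origin.z)});
    }
    return local;
}

// Per-frame translation that puts the tile back in place relative to the
// viewer. The large values cancel in double before the cast to float.
Vec3f tile_offset_from_camera(const Vec3d& tile_origin, const Vec3d& camera)
{
    return {static_cast<float>(tile_origin.x - camera.x),
            static_cast<float>(tile_origin.y - camera.y),
            static_cast<float>(tile_origin.z - camera.z)};
}
```

When a tile is rendered, the offset would be applied with something like glTranslatef(off.x, off.y, off.z) on the modelview matrix, with the camera's own translation left out of the view transform. Whether that split suits a given engine is a design decision the thread does not settle.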
-
Re: Vertex arrays and unexpected behaviour
Makhno wrote:
> Though they appear to be running slightly faster once compiled, the actual
> compilation slows the application down, and there's no way I can do this in
> advance due to the size of the terrain.
>
> Moving slowly across the terrain helps a bit, but this would sacrifice my
> engine's capability for marginal (if any) performance gain.
>
Sounds like you're moving into the sort of areas
that big machines are designed to handle but PCs
are bad at.
If your terrain is paging to disk then you need
to allocate each level of detail in a separate
address space in shared memory. Then you need
a second process which tracks the position of
the viewer and guesses which bits of memory
are going to be needed "soon" and touch them
so they get paged in. This stops judders in
the main process.
-
Re: Vertex arrays and unexpected behaviour
>> Moving slowly across the terrain helps a bit, but this would sacrifice my
>> engine's capability for marginal (if any) performance gain.
>
> If your terrain is paging to disk then you need
> to allocate each level of detail in a separate
> address space in shared memory.
It's not paging: all vertices from the entire world are loaded into VBOs.
It is drawing only what the viewer sees, and placing into display lists
those things he is about to see (it takes too long to put the whole world
into a display list).
> Then you need
> a second process which tracks the position of
> the viewer and guesses which bits of memory
> are going to be needed "soon" and touch them
> so they get paged in. This stops judders in
> the main process.
The 'paging' in this case appears to be the construction of the display
list, which a second thread will not help.