Using Shared Memory Extensions with FBO Textures

Hello All!

We have a process that ‘renders to texture’ numerous 16 MB images. We use a framebuffer object (FBO) to render the images off screen and then use glReadPixels to copy the newly created textures back into user space for post-processing analysis. Since the output textures are large and our ultimate processing rate is targeted at 2 Hz, we are experiencing significant delays copying the textures from GPU to CPU memory.

We would like to use the shared memory extensions available in v4.4, specifically the BufferObject and BufferTexture implementations. In all of our experiments, BufferTexture seems to be incompatible with the FBO, as a BufferTexture really has no dimensions (it is just an n-length byte array), while all of our textures in the FBO are declared as GL_TEXTURE_2D. Output from our experiments is listed below: first the FBO structure of the original implementation (which works but is slow), then the attempt to use the BufferTexture approach. But first, a few questions…

[ul]
[li]Can we use the BufferTexture approach within a FrameBuffer Object?
[/li][li]If so, does anyone have an example?
[/li][li]If not, what would be an alternative approach?
[/li][li]Is it true that the BufferTexture is targeted for access within the shaders?
[/li][/ul]

Here is the diagnostic output from our program.

Original 2D texture
===== FBO STATUS =====
Max Number of Color Buffer Attachment Points: 8
Color Attachment 0: GL_TEXTURE, 4000x4000, GL_RED
Color Attachment 1: GL_TEXTURE, 4000x4000, GL_RED
Color Attachment 2: GL_TEXTURE, 4000x4000, GL_RED
Depth Attachment: GL_RENDERBUFFER, 4000x4000, GL_DEPTH_COMPONENT

Framebuffer complete.

With BufferTexture
===== FBO STATUS =====
Max Number of Color Buffer Attachment Points: 8
Color Attachment 0: GL_TEXTURE, 0x0, GL_RGBA
Color Attachment 1: GL_TEXTURE, 4000x4000, GL_RED
Color Attachment 2: GL_TEXTURE, 4000x4000, GL_RED
Depth Attachment: GL_RENDERBUFFER, 4000x4000, GL_DEPTH_COMPONENT

[ERROR] Framebuffer incomplete: Unknown error.

The internal layout of textures/renderbuffers is hidden in OpenGL, and therefore you cannot get raw access to it.
A BufferTexture is not a renderable texture. Even if it were possible, your GPU would be starved of bandwidth during rendering. (Ignoring onboard GPUs/APUs here.)

What I would suggest is two things.
-Use OpenGL to copy the renderbuffer into a client side buffer for you, so you don’t waste CPU time. (Pixel Buffer Objects)
-Use a ring buffer of about 3 renderbuffers/client side buffers, and render frames ahead before you try to access the data on the client side.

Render your frame, then bind your client buffer to GL_PIXEL_PACK_BUFFER and call glReadPixels. Then set a fence.
Once you have done this for the first 3 frames, you can start your normal loop.
Wait for the oldest fence, work with that frame's data on the CPU, render the next frame into the slot you just freed, and then wait for what is now the oldest fence.


                render1 pixelPackToClient1 fence1
                render2 pixelPackToClient2 fence2
                render3 pixelPackToClient3 fence3
waitFence1 cpu1 render1 pixelPackToClient1 fence1
waitFence2 cpu2 render2 pixelPackToClient2 fence2
waitFence3 cpu3 render3 pixelPackToClient3 fence3
waitFence1 cpu1 render1 pixelPackToClient1 fence1
waitFence2 cpu2 render2 pixelPackToClient2 fence2
waitFence3 cpu3 render3 pixelPackToClient3 fence3

This gives the driver/GPU enough time to render and transfer the data to your client side buffers, and your GPU and CPU can work in parallel.
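To make the ordering concrete, here is a small sketch that only generates the schedule described above; it does not touch OpenGL. The function name `schedule` and the constant `RING_SIZE` are made up for illustration, and the GL calls named in the comments are what each step would roughly map to in a real implementation, not a tested readback path.

```python
RING_SIZE = 3  # assumed ring depth: three render target / pixel pack buffer pairs


def schedule(num_frames, ring_size=RING_SIZE):
    """Return one line of operations per frame for a readback ring."""
    lines = []
    for frame in range(num_frames):
        slot = frame % ring_size + 1  # slots are numbered from 1, as in the table
        ops = []
        if frame >= ring_size:
            # The slot was filled ring_size frames ago. Wait on its fence
            # (glClientWaitSync), then map the buffer (glMapBufferRange)
            # and process the pixels on the CPU.
            ops += [f"waitFence{slot}", f"cpu{slot}"]
        # Render into the slot, start the asynchronous copy (glReadPixels with
        # a buffer bound to GL_PIXEL_PACK_BUFFER), then insert a fence
        # (glFenceSync) so the copy's completion can be detected later.
        ops += [f"render{slot}", f"pixelPackToClient{slot}", f"fence{slot}"]
        lines.append(" ".join(ops))
    return lines


if __name__ == "__main__":
    for line in schedule(9):
        print(line)
```

The first `ring_size` frames only fill the ring; after that, every frame first consumes the oldest slot and then reuses it, so the CPU is always working on data that was requested three frames earlier.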

Also, it's best to read back in a format that the GPU is most likely to use internally, so the driver does not have to convert the data during the pixel transfer. For example: BGRA.
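The readback format also decides how big each client side buffer has to be. A rough sketch of the arithmetic, assuming 8 bits per channel and the default GL_PACK_ALIGNMENT of 4 (the helper name `pack_buffer_size` is made up for illustration):

```python
def pack_buffer_size(width, height, bytes_per_pixel, pack_alignment=4):
    """Upper bound on the bytes glReadPixels writes for a width x height read.

    Each row is padded up to a multiple of pack_alignment, which mirrors
    how GL_PACK_ALIGNMENT affects the row stride.
    """
    row_bytes = width * bytes_per_pixel
    # Round the row length up to the next multiple of the alignment.
    stride = (row_bytes + pack_alignment - 1) // pack_alignment * pack_alignment
    return stride * height


if __name__ == "__main__":
    # One 4000x4000 attachment read back as a single 8-bit channel.
    print(pack_buffer_size(4000, 4000, 1))  # 16000000
    # The same attachment read back as four 8-bit channels (e.g. BGRA).
    print(pack_buffer_size(4000, 4000, 4))  # 64000000
```

For the 4000x4000 single-channel attachments in the question, a one-channel readback matches the 16 MB figure, while a four-channel readback would move four times as much data per image. So "the format the GPU uses internally" should be weighed against the channel count of what is actually stored.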