Offsetting Buffer with NVIDIA GPU Address in atomic functions - Linkage fail

Hi guys,

currently I’m having weird issues using atomic operations on buffers that are made resident and addressed by a GPU pointer using the NVIDIA GL_NV_gpu_shader5 extension.

While normal read/write access works without any problems, my compute shader fails to link when using an atomic function with an offset into the buffer - and only when trying to offset it! I don’t get any compilation errors or warnings, though.

Here’s the shader code:
http://pastie.org/9331508#51
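In case the link goes stale: here’s a minimal sketch of the pattern in question (reconstructed with made-up names, not my exact shader):

```glsl
#version 430
#extension GL_NV_gpu_shader5        : enable  // pointer types in GLSL
#extension GL_NV_shader_buffer_load : enable  // buffers addressed by GPU address

// Hypothetical uniform: GPU address of a resident buffer, uploaded from
// the host side via the NV_shader_buffer_load API.
uniform uint *lightFlags;

layout(local_size_x = 64) in;

void main() {
    uint idx = gl_GlobalInvocationID.x;

    // Normal reads/writes through the pointer compile and link fine:
    uint v = lightFlags[idx];
    lightFlags[idx] = v;

    // An atomic on the un-offset dereference also links fine...
    atomicOr(*lightFlags, v);

    // ...while an atomic with an offset into the buffer fails to link:
    atomicOr(lightFlags[idx / 2], v);
}
```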

I’m going to try to work around this using a regular buffer binding - but I have the suspicion that this might be an NVIDIA compiler issue, since I’m also having issues running Cyril Crassin’s linked-list A-buffer implementation here (compiler error on the atomicExchange functions).

The issue appears with the 340.43 beta drivers as well as with 337.88.

Can anyone confirm this or point out what’s going wrong?

Thanks a lot in advance :)

Hmm. On NVidia (albeit in a geometry shader, not a compute shader), this has been working for me for quite a while for addressing an atomic located at a specified offset into a buffer:


#extension GL_ARB_gpu_shader5            : require
#extension GL_ARB_shader_atomic_counters : require
  ...
  layout( binding = 0, offset = 4 ) uniform atomic_uint my_count0;
  ...
  atomicCounterIncrement( my_count0 );

May or may not be some clues for you there… Haven’t really done much with compute shaders.

Hey, thanks for your reply! :)

I don’t think the shader type makes any difference here; the main difference is how the buffer is addressed.

Right now I’m working around it with a regularly bound shader storage buffer:

layout(binding = 1) coherent buffer vertexLightFlags32 // workaround with regular buffer binding
{
  uint lightFlags[];
} vertexLightFlagBuffer;

...

atomicOr(vertexLightFlagBuffer.lightFlags[vertices.x / 2], write0);


And this is working fine so far. However, I’m currently refactoring my whole project in the spirit of ‘approaching zero driver overhead’ and am therefore trying to avoid state changes like buffer binds as much as possible. That’s the reason I was trying to also do the atomic operations on a resident buffer addressed by its GPU address.

I know it should be working, because of the order-independent transparency implementation by Cyril Crassin I posted above - which in fact doesn’t run on my system right now, probably for the same reasons. That’s why I’m suspecting this might be a driver issue.
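For reference, the host-side setup for such a resident buffer looks roughly like this (a sketch using the NV_shader_buffer_load entry points; `prog`, `size` and the uniform name are made up for illustration):

```cpp
// Sketch: create a buffer, make it resident, and hand its GPU address
// to a pointer uniform in the shader (NV_shader_buffer_load).
// 'prog' and 'size' are assumed to exist; names are illustrative only.
GLuint buf = 0;
glGenBuffers(1, &buf);
glBindBuffer(GL_ARRAY_BUFFER, buf);
glBufferData(GL_ARRAY_BUFFER, size, nullptr, GL_DYNAMIC_DRAW);

// Make the buffer resident so the GPU is allowed to dereference its address:
glMakeBufferResidentNV(GL_ARRAY_BUFFER, GL_READ_WRITE);

// Query the buffer's 64-bit GPU address:
GLuint64EXT addr = 0;
glGetBufferParameterui64vNV(GL_ARRAY_BUFFER, GL_BUFFER_GPU_ADDRESS_NV, &addr);

// Upload the address to the shader's pointer uniform:
GLint loc = glGetUniformLocation(prog, "lightFlags");
glProgramUniformui64NV(prog, loc, addr);
```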

By the way, the linked zip of the order-independent transparency implementation already contains a compiled binary - it’d be really interesting to know whether other people on Nvidia HW have the same issue running it (it’s Nvidia-only though, because of some vendor-specific extensions).

[QUOTE=nattfoedd;1260221]…the order-independent transparency implementation by Cyril Crassin I posted above - which in fact doesn’t run on my system right now for probably the same reasons. That’s why I’m suspecting this might be a driver issue.

By the way the linked zip of the order-independent transparency implementation already contains a compiled binary, it’d be really interesting to know if other people on Nvidia HW have the same issue running it :confused: (it’s Nvidia only though, because of some vendor specific extensions).[/QUOTE]

Downloaded Cyril’s OIT demo from here:

Specifically, the original version that doesn’t support AMD (the 2nd one allegedly does).

Had to make a few tweaks to get the C++ source compiled on Linux, but nothing big. Also, based on this post, I nuked all the "inline " references in the GLSL shader source, as these were causing compile errors with the latest NVidia drivers.

With those few mods, Cyril’s OIT demo compiles, links, and runs just fine on Linux with the NV 331.79 drivers (GPU: GTX 760).

Thanks so much for testing this!

I also read about the inline issue in the comments, though I’m kind of confused about it, since I can’t find any “inline” in any of the shader code. I’ve searched everything a couple of times now and only found inline functions in the C++ code. I feel entirely stupid now, but in which files did you remove the “inline”s?

Going to test this at work tomorrow with the Quadro drivers as well - if it’s working there it’s probably indeed a Geforce driver issue on Windows.

I just rechecked, and you’re right. And yet I was definitely getting a bunch of errors out of this code complaining about inline not being supported in shaders. I used perl to search/replace all the "inline " refs with “” and that seemed to fix it. But going back, I don’t know what was causing that or why that seemed to fix it.

Anyway, recreating exactly what I did, here’s the procedure:

Go to this page:

and download the linked ZIP file with the Windows executable, source code and VS2008 project. You can do this with wget via:

wget -U "I am not a bot" http://www.icare3d.org/files/ABufferGL4/ABufferGL4LinkedList.zip

Unzip the tree and apply this patch (this details what I changed):


diff -ruN t2/ABufferLinkedList/Matrix.h ABufferLinkedList/Matrix.h
--- t2/ABufferLinkedList/Matrix.h    2007-10-17 08:34:00.000000000 -0500
+++ ABufferLinkedList/Matrix.h    2014-06-28 20:46:21.189052045 -0500
@@ -273,11 +273,11 @@
     }
 
     ///
-    static Mat4<T> reflection(const Vector4<T> &plane) {
-        Mat4<T> res;
-        res.setReflection(v);
-        return res;
-    }
+    //static Mat4<T> reflection(const Vector4<T> &plane) {
+    //    Mat4<T> res;
+    //    res.setReflection(v);
+    //    return res;
+    //}
     void setReflection(const Vector4<T> &plane) {
         T x = plane.x;
         T y = plane.y;
@@ -342,4 +342,4 @@
 
 };
 
-#endif
\ No newline at end of file
+#endif
diff -ruN t2/ABufferLinkedList/ShadersManagment.cpp ABufferLinkedList/ShadersManagment.cpp
--- t2/ABufferLinkedList/ShadersManagment.cpp    2010-07-19 12:30:30.000000000 -0500
+++ ABufferLinkedList/ShadersManagment.cpp    2014-06-28 20:45:07.136931880 -0500
@@ -10,6 +10,7 @@
 #include <iostream>
 #include <fstream>
 #include <vector>
+#include "string.h"
 
 ///////////////////////////////////////////
 
@@ -334,4 +335,4 @@
     }
 
     return src;
-}
\ No newline at end of file
+}
diff -ruN t2/ABufferLinkedList/ShadersManagment.h ABufferLinkedList/ShadersManagment.h
--- t2/ABufferLinkedList/ShadersManagment.h    2010-06-22 21:13:12.000000000 -0500
+++ ABufferLinkedList/ShadersManagment.h    2014-06-28 20:44:53.694090200 -0500
@@ -10,6 +10,7 @@
 #ifdef WIN32
 #include <windows.h>
 #endif
+#include <stdio.h>
 
 // GLEW header for OpenGL extentions loading
 #include "GL/glew.h"
@@ -39,4 +40,4 @@
 }
 
 
-#endif
\ No newline at end of file
+#endif

Above the ABufferLinkedList directory, apply the patch with:

patch -p0 < my.patch

If you’re on UNIX/Linux, you’ll need to run dos2unix on the *.h and *.cpp files first.

Then compile/link:


g++ -o tst *.cpp -lglut -lGLEW -lGLU -lGL

Going to test this at work tomorrow with the Quadro drivers as well - if it’s working there it’s probably indeed a Geforce driver issue on Windows.

Ok. If you have any problems, feel free to post. There could still be something fishy going on here.

Thanks again for all the instructions! I’m going to try to reproduce this once I’m at home tonight.

Just for quick info: the demo does run flawlessly on a Quadro K6000 with driver version 333.11 here. And it does not run (same error as with the GTX 780) on another system here that has a Titan running driver version 335.23. So now we’ve tested 5 drivers, and all Windows GeForce drivers fail to compile/link the shader while all non-GeForce/non-Windows drivers succeed. So I guess it’s time to file a bug report with Nvidia ;)

Alright, this issue is cleared up - for anyone who happens to come across the same problem, here’s the answer I received from Cyril Crassin:

This is a regression that will be fixed in an upcoming driver. In the meantime: declare a temp variable with the expression passed.
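Applied to my case, the workaround looks something like this (a sketch; `lightFlags` is a hypothetical pointer into the resident buffer):

```glsl
// Fails to link on the affected drivers (offset expression passed
// directly to the atomic):
//     atomicOr(lightFlags[vertices.x / 2], write0);

// Workaround: hoist the expression into a temp variable first.
uint idx = vertices.x / 2;
atomicOr(lightFlags[idx], write0);
```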

Hoping that driver is coming soon ;)