Hi,
I am currently trying to understand the differences between the [var]coherent[/var] and [var]volatile[/var] qualifiers. First, some quotes (in code tags for better formatting) from the [var]ARB_shader_image_load_store[/var] extension doc:
Short description of [var]coherent[/var] and [var]volatile[/var]:
Qualifier      Meaning
------------   -------------------------------------------------
coherent       memory variable where reads and writes are coherent
               with reads and writes from other shader invocations

volatile       memory variable whose underlying value may be
               changed at any point during shader execution by
               some source other than the current shader invocation
Long description of [var]coherent[/var]:
Memory accesses to image variables declared using the "coherent" storage
qualifier are performed coherently with similar accesses from other shader
invocations. In particular, when reading a variable declared as
"coherent", the values returned will reflect the results of previously
completed writes performed by other shader invocations. When writing a
variable declared as "coherent", the values written will be reflected in
subsequent coherent reads performed by other shader invocations. As
described in the Section 2.20.X of the OpenGL Specification, shader memory
reads and writes complete in a largely undefined order. The built-in
function memoryBarrier() can be used if needed to guarantee the completion
and relative ordering of memory accesses performed by a single shader
invocation.

When accessing memory using variables not declared as "coherent", the
memory accessed by a shader may be cached by the implementation to service
future accesses to the same address. Memory stores may be cached in such
a way that the values written may not be visible to other shader
invocations accessing the same memory. The implementation may cache the
values fetched by memory reads and return the same values to any shader
invocation accessing the same memory, even if the underlying memory has
been modified since the first memory read. While variables not declared
as "coherent" may not be useful for communicating between shader
invocations, using non-coherent accesses may result in higher performance.
Long description of [var]volatile[/var]:
Memory accesses to image variables declared using the "volatile" storage
qualifier must treat the underlying memory as though it could be read or
written at any point during shader execution by some source other than the
executing shader invocation. When a volatile variable is read, its value
must be re-fetched from the underlying memory, even if the shader
invocation performing the read had previously fetched its value from the
same memory. When a volatile variable is written, its value must be
written to the underlying memory, even if the compiler can conclusively
determine that its value will be overwritten by a subsequent write. Since
the external source reading or writing a "volatile" variable may be
another shader invocation, variables declared as "volatile" are
automatically treated as coherent.
Issues section:
(26) What sort of qualifiers should we provide relevant to memory
     referenced by image variables?

RESOLVED: We will support the qualifiers "coherent", "volatile",
"restrict", and "const" to be used in image variable declarations.

"coherent" is used to ensure that memory accesses from different shader
invocations are cached coherently (i.e., one invocation will be able to
observe writes from another when the other invocation's writes
complete). This coherence may mean the use of "coherent"-qualified
image variables may perform more slowly than of otherwise equivalent
unqualified variables.

"volatile" behaves as in C, and may be needed if an algorithm requires
reading image memory that may be written asynchronously by other shader
invocations.
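
To make the quoted qualifiers concrete, here is a minimal sketch of how I assume the declarations look in a shader. It uses the core GLSL 4.20 form, where (as far as I can tell) [var]readonly[/var]/[var]writeonly[/var] take the place of the extension's "const". All image names, bindings and formats are made up for illustration.

```glsl
#version 420 core

// Hypothetical image uniforms; names, bindings and formats are illustrative only.

// Accesses are coherent with accesses from other shader invocations.
layout(binding = 0, r32ui) coherent uniform uimage2D sharedCounters;

// Every read is re-fetched and every write is performed; implies coherent.
layout(binding = 1, r32ui) volatile uniform uimage2D asyncFlags;

// Promise that this memory is only accessed through this one variable.
layout(binding = 2, rgba8) restrict writeonly uniform image2D outputImage;

out vec4 fragColor;

void main()
{
    ivec2 texel = ivec2(gl_FragCoord.xy);

    // Plain loads through the two qualified variables.
    uint counter = imageLoad(sharedCounters, texel).x;
    uint flag    = imageLoad(asyncFlags, texel).x;

    // Store through the restrict/writeonly variable.
    imageStore(outputImage, texel, vec4(float(counter), float(flag), 0.0, 1.0));

    fragColor = vec4(0.0);
}
```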
My understanding of their uses:
[var]coherent[/var]:
[ul]
[li] only useful for dependent shader invocations (e.g. fragment shader invocations generated from a complete primitive after the vertex shader has processed its vertices)
[/li][li] the [var]memoryBarrier()[/var] function goes hand in hand with this qualifier: it flushes caches/shared memory for [var]coherent[/var]-qualified variables and determines the order of memory accesses, so you can decide when the flush happens. (By the way: is there an implicit [var]memoryBarrier()[/var] call at the end of a shader that has [var]coherent[/var]-qualified variables but no explicit [var]memoryBarrier()[/var] call?)
[/li][li] variables that are not [var]coherent[/var]-qualified might be L-cached or resident in shared memory, and hence (dependent) threads spawned on other SIMD processors might not observe their values directly
[/li][li] use-case: e.g. a dependent shader invocation reads values from an image that were written by an invocation in a previous shader stage (the values might still be cached, so they have to be flushed via [var]memoryBarrier()[/var])
[/li][/ul]
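
The way I picture [var]coherent[/var] and [var]memoryBarrier()[/var] working together can be sketched as follows. This is only my reading of the quoted text, not a confirmed pattern: one invocation writes a payload, issues the barrier, and then raises a flag that another invocation could check before reading the payload. The image names, bindings and formats are hypothetical.

```glsl
#version 420 core

// Hypothetical images; names, bindings and formats are illustrative only.
layout(binding = 0, r32f)  coherent uniform image2D  dataImage;
layout(binding = 1, r32ui) coherent uniform uimage2D readyImage;

out vec4 fragColor;

void main()
{
    ivec2 texel = ivec2(gl_FragCoord.xy);

    // Write the payload through the coherent image variable.
    imageStore(dataImage, texel, vec4(1.0));

    // Ask that the store above completes before any later memory access
    // issued by this invocation (here: the flag update below).
    memoryBarrier();

    // Raise the flag; an invocation that observes 1 here is assumed to be
    // able to read the payload written above.
    imageAtomicExchange(readyImage, texel, 1u);

    fragColor = vec4(0.0);
}
```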
[var]volatile[/var]:
[ul]
[li] implies [var]coherent[/var] (variables declared [var]volatile[/var] are automatically treated as coherent)
[/li][li] always fetches values directly from global memory (no caching)
[/li][li] always writes values directly to global memory (no caching)
[/li][li] might be more expensive than [var]coherent[/var] (absolutely no temporary caching or memory access optimizations allowed)
[/li][li] use-case: e.g. atomically incrementing a texel of a [var]volatile[/var]-qualified image from independent shader invocations (which might be executed on different SIMD processors and use shared memory for the atomic increment)
[/li][li] [var]memoryBarrier()[/var] is only useful for preventing (compiler) reordering of memory accesses, since [var]volatile[/var] already guarantees a direct write to global memory
[/li][/ul]
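
For the [var]volatile[/var] case, this is roughly the kind of shader I have in mind. It is a sketch under my own assumptions (hypothetical image name, binding and texel position): every invocation atomically increments the same texel, and the two back-to-back loads are there only to show where I expect [var]volatile[/var] to forbid reusing a previously fetched value.

```glsl
#version 420 core

// Hypothetical counter image; name, binding and format are illustrative only.
layout(binding = 0, r32ui) volatile uniform uimage2D counterImage;

out vec4 fragColor;

void main()
{
    // All invocations increment the same texel; the call returns the value
    // stored before this invocation's increment.
    uint previous = imageAtomicAdd(counterImage, ivec2(0, 0), 1u);

    // Two loads of the same texel. Because the image is volatile, the second
    // load must be re-fetched from memory instead of reusing the first
    // result, so the two values may differ if other invocations have
    // incremented the counter in between.
    uint first  = imageLoad(counterImage, ivec2(0, 0)).x;
    uint second = imageLoad(counterImage, ivec2(0, 0)).x;

    fragColor = vec4(float(previous), float(first), float(second), 1.0);
}
```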
Are there other important points I forgot, or any errors in my understanding of the doc? Let us collect some more facts for a better understanding of these two qualifiers.