Customized clipping volume

According to the current specification, the clipping volume is defined by:

-Wc <= Xc <= Wc
-Wc <= Yc <= Wc
Zmin <= Zc <= Wc

where Xc, Yc, Zc, Wc are the clip coordinates produced by the vertex shader as the components of gl_Position, and Zmin is either -Wc or 0 depending on the depth mode set by the glClipControl function. After clipping, {Xc,Yc,Zc} is divided by {Wc,Wc,Wc}, producing the normalized device coordinates.

Since the upper clipping bound (which is also the division vector) is assembled from three identical values {Wc,Wc,Wc}, all three coordinates {Xc,Yc,Zc} are divided by the same value. Technically, I see no reason to require the division vector to be assembled from equal values (not necessarily Wc=Wc=Wc). By letting the vertex shader output the division vector explicitly, additional functionality can be achieved (discussed below).

So the proposed extension amends the shading language, adding an optional output to the vertex shader stage:

out vec3 gl_PositionDiv;

If the vertex shader writes to gl_PositionDiv, the clipping volume is defined by:

-gl_PositionDiv.x <= Xc <= gl_PositionDiv.x
-gl_PositionDiv.y <= Yc <= gl_PositionDiv.y
Zmin <= Zc <= gl_PositionDiv.z

where Zmin is either -gl_PositionDiv.z or 0 depending on the depth mode set by the glClipControl function. If the vertex shader does not write to gl_PositionDiv, the vector is assembled automatically as:

gl_PositionDiv = gl_Position.www

which is essentially equivalent to the current fixed functionality.
After clipping, the normalized device coordinates are calculated by dividing gl_Position.xyz componentwise by gl_PositionDiv.

The major problem this extension targets is poor z-buffer utilization due to the uniform division of the x, y and z components. Specifying separate division coefficients for xy and z opens new possibilities to control the distribution of depth values. In particular, the far clipping plane can be eliminated to allow drawing objects any distance away, and the near clipping plane can be set much closer than in conventional setups. This makes it possible to render large scenes without introducing separate cameras or an overpushed near clipping plane; special or oversized depth buffer formats would not be required either.

An example of such a setup using the proposed extension is described in that post:

NOTE: gl_PositionDiv is defined in that post as a vec4. I cannot edit the post as the time limit has expired. But it might be even better to make gl_PositionDiv a four-component vector and use the fourth component as the clipping bound for Wc the same way the first three are used (unless that has a considerable performance cost). This would further increase the functionality of the proposed extension.

Can’t sleep well having no answer for the topic - so badly I want that extension. :slight_smile:
I even reposted that on nVidia forums and placed a reference at AMD forum - dead silence! Is there any way to contact anyone who can at least tell if this extension is possible to implement at all? Anybody? Please!.. :dejection:

If you want this so badly, why don’t you divide the components of gl_Position by your special division vector, set the w component to 1.0 and see if it really is what you want.

Also, if it can be implemented in the shader by yourself and nobody else on earth needs it, why would you create a GL extension for it?

Do you even have a use case for this?

[QUOTE=Agent D;1261254]If you want this so badly, why don’t you divide the components of gl_Position by your special division vector, set the w component to 1.0 and see if it really is what you want.[/QUOTE]Division must be done after clipping. Doing it in the vertex shader is a potential source of overflow and division-by-zero for vertices that fall close to the clipping plane (the W plane, the “division plane”).

[QUOTE=Agent D;1261254]Also, if it can be implemented in the shader by yourself and nobody else on earth needs it, why would you create a GL extension for it?

Do you even have a use case for this?[/QUOTE]Right now I save the value of the negated z-component, interpolate it and write it as gl_FragDepth, implementing a linear z-buffer which allows me to draw the scene without artifacts with zFar > 200000 and zNear = 0.001 (it can be even smaller) using 24 bits of depth buffer. With this setup the z-buffer is able to distinguish surfaces 0.01 apart over the whole 200000 distance.
This is essentially equivalent to dividing the negated z-component by ‘1.0’ (instead of ‘Wc’, which is still used for the x and y components) and using Zmin=0 instead of Zmin=-Wc (which may be set via glClipControl).

Making the components of the division vector separately specifiable instead of uniformly set to the same value would let us control the distribution of z values independently from xy.

What would we get? We could draw incomparably larger scenes with an incomparably smaller constant zNear without introducing additional cameras or dynamically adjusting zNear, and even a standard 24-bit depth buffer would be sufficient.

Practically, with this extension we would be able to attach a camera object as small as a human eye right onto the character’s model, and it would capture the large open-world scene as well as the parts of the character’s model the camera is mounted on. This would let game engines render and simulate the character’s body uniformly with the other game objects, without “making it transparent” or introducing fake parts rendered individually, which causes them to not receive shadows or decals, not collide with other objects, etc.

Well, there are techniques that make it achievable even without the extension, yes, and there are games that prove it is all possible. But all those tricks have computational expenses, while the proposed extension does not add any computational overhead: there are 3 values (Xc,Yc,Zc) being divided anyway, and I see no technical difference between dividing them by a vector with different components instead of equal ones. So all it takes is to allow us to specify those components explicitly instead of having them assembled by fixed functionality. Or is there something I do not take into account?

The hardware units that deal with depth writes and testing are fixed-function units (as is color blending/writing); perhaps that’s why only fixed 16b/24b and [0,1] FP32 are used. If the full FP32 range were available for the z-buffer, I would expect to at least see an extension to glDepthRange() allowing values outside [0,1]. (I also wouldn’t mind an increased z-buffer range for some of the bad cases artists create, btw.)

I see you did not read the actual definition - just the topic’s name, eh? :wink:
The resulting NDC coordinates still fall in the range [-1…1]. Only the clipping bounds I suggest making unequal - instead of
{Wc,Wc,Wc} vs {-Wc,-Wc,Zmin} I suggest using custom, possibly unequal values for each of the coordinates:
{Wcx,Wcy,Wcz} vs {-Wcx,-Wcy,Zmin}, which are derived from the explicitly set division vector
{Wcx,Wcy,Wcz} instead of the single tripled value Wc taken from the fourth component of gl_Position.

Here is a clear explanation with examples.

Generally they tend to reduce the fixed functionality, not extend it. I would rather have them remove perspective division altogether and let shaders do it themselves if they need it.
Division is one of the more hardware-taxing operations; its cost is relatively high and it is good to avoid when not necessary.
Your proposal implies 3 divisions instead of a single one per vertex. (Note that dividing x, y and z by the same value w really means a single division (1/w) and 3 multiplications, which are far cheaper.)

[QUOTE=l_belev;1261268]Generally they tend to reduce the fixed functions, not to extend them. I would rather have them remove perspective division altogether and let the shaders do it themselves if they need it.[/QUOTE]Once again: perspective division must be done after clipping. You cannot clip primitives in the vertex shader. Division cannot be done with zero or tiny numbers in the divisor, as the result is undefined. Clipping and dividing by the same number also ensures that the result lies in the range [-1…1]. My point is that the number may differ between components, not that it must be the same for all.

[QUOTE=l_belev;1261268]Division is one of the more hardware-taxing operations, its cost is relatively high and is good to avoid when not necessary.
Your proposal implies 3 divisions instead of single one per vertex.[/QUOTE]?! O.O Sorry, but as far as I know, GPU hardware is vectorized, processing 4 items at once. Calculating 1/w is the same as calculating {1/w,1/w,1/w,1/w}. The source register may be stuffed with 4 different items to produce {1/wx,1/wy,1/wz,1/ww}, and it will take the same number of cycles as if all values were equal - it doesn’t matter what the actual contents of the source xmm register were (assuming hardware with SSE support).

That w (resp. 1/w) value is not only needed for clipping, but also for perspective correct attribute interpolation. How would that work if you get three different w values ?

At any rate, this extension idea seems terribly fishy anyways. As pointed out, normalize your w to one, then divide by those gl_PositionDiv factors. The only case where that is a problem is when some field of gl_PositionDiv is negative or close to zero. But the world has not ended: you can do this all with a geometry shader and do the clipping yourself (a triangle clipped N times produces a triangle fan of no more than N+1 triangles, by the way). Also, the entire point of dividing by the same value, w, is to do perspective. What it looks like you really want is just for depth values, which can be done by hand from the vertex shader anyways by normalizing yourself. The implicit requirement w>0 means in front of the eye (but not the near plane), which I assume you’d want anyways. Normalizing the z’s yourself is my advice.

Asking for a different divide value for each element of gl_Position.xyz would make the clipper (the fixed-function part) of a GPU take even more sand in order to keep the same triangles-per-clock performance, because various shortcuts for the most common situation (namely, all w’s positive and away from zero) would be gone. On the subject of that, most hardware (if not all) has a dedicated unit handling triangle setup and clipping rolled into one. Additionally, most implementations have guard-band logic to avoid clipping and let scissoring do the job. To be precise: if all of a triangle’s w’s are positive (and away from zero) and its z’s are in the happy range [-1,1], then scissoring essentially takes care of the clipping volume. If all the w’s are positive and some of the z’s are icky, then the triangle needs to be clipped against just the two z requirements, which produces at most 3 triangles. The really ugly case is when one or more of the w’s is negative (but not all); in that case the clipper more often than not does the painful clipping computation against all the clipping planes. That part sucks, always sucks, and uses up a fair amount of sand. There has been hardware (like old Intel GPUs) that did not have a dedicated clipper; the clipping and divide work was done by the programmable EUs. It was not happy, so they added a dedicated clipper.

My advice: likely all you want is to normalize z your own way (which is just a VS job), but if you really want the whole enchilada, make a GS to implement what you are after.

Ah, that is what I was missing! Thanks, [b]mbentrup[/b], for pointing that out. May I dare ask for more info about the interpolation process in detail? Maybe some links you could point me to, please?

Well, there are a few intuitive solutions that come to my mind.

The first solution is to make gl_PositionDiv a 4-component vector, where the first component is used for clipping and dividing gl_Position.x, the second for y, the third for z, and the fourth component of gl_PositionDiv is used for attribute interpolation.

The second solution is to use gl_Position.w for attribute interpolation as before (otherwise, what would gl_Position.w be for, right?). In other words, this is a restriction: the user can manipulate only the first three components of the division vector, while the fourth one is taken from gl_Position.w implicitly.

I vote for the first solution, even though it leaves gl_Position.w a redundant, unused component in case the vertex shader chooses to write custom values into gl_PositionDiv.

[QUOTE=kRogue;1261277]Asking for a different divide value for each element of gl_Position.xyz would make the clipper (fixed function part) of a GPU take even more sand…[/QUOTE]I am not sure I understood… With the proposed extension the normalized device coordinates will be in the [-1…1] range anyway, even with different values in the components of gl_PositionDiv. And with glClipControl the clipping of z is already unbound from the clipping of xy, because the range of Zndc can be changed to [0…1]; that means those coordinates are already processed independently. Also, whenever you write to gl_FragDepth, the clipping for z is not performed at all, is it?
That is what made me think that independent clipping of all three components is not a problem, so I came up with this extension as an alternative to DirectX w-buffers.

[QUOTE=kRogue;1261277]My advice: make a GS to implement that which you are after.[/QUOTE]Do you suggest performing the clipping manually in the GS, then doing the perspective division there as well?

Gentlemen, I’ve updated the topic related to the proposed extension to make it clear and well-defined (check the Extension part):

If there is still something “fishy”, please, point it out.

If it’s all about Z-fighting, why do you insist on fiddling with the X and Y components as well? Why is it so hard to scale the Z value in the shader if that’s all you actually want?

Because scaling Z will not produce the desired result, Agent D.
The closer a point is to the W plane, the denser the distribution of XYZ values near it. Therefore separate W planes need to be used for Z and XY to control the distribution of Z values independently from XY. But making exceptions is not the OpenGL way: if Z is separated, then all other components should be separable as well. The advantage is that we can omit the perspective matrix and let gl_PositionDiv alone handle the perspective transformation while gl_Position carries the result of the model-view transformation. In that case all 4 components of gl_PositionDiv will be used and have different values.

Um, I think some things are a touch unclear. Ahem. If all one is worried about is z, then there really is a simple way. Let’s say one wants to divide gl_Position.z by ZDiv instead of gl_Position.w to get the normalized value. One way to do this -without- geometry shaders is to enable the first 2 clip distances and write for their values:


gl_ClipDistance[0] = ZDiv - gl_Position.z;
gl_ClipDistance[1] = ZDiv + gl_Position.z; 

and then AFTER that write


gl_Position.z *= gl_Position.w/ZDiv;

and lastly, to be safe, enable depth clamping.

This will give you what you are after for z. For the whole enchilada, just use a geometry shader. There is no need for an extension.

Heh, again I hear a suggestion to multiply z by w. And again I say that the result turns out to be incorrect for clipped primitives, as the vertices inserted by the clipper get wrong depth values (did it, tested it, no-no-no!). The original vertices multiplied by w get divided by w - OK. But the inserted vertices get their Zc interpolated while it is multiplied by w, and when they get divided the result comes out incorrect - Z is higher than expected, so the primitives behind the clipped ones pop through.

Believe me, I spent a lot of time trying out many tricks (my imagination is not bad, really) - there is no easy solution. :slight_smile:
The best workaround for high zFar/zNear ratios, I think, would be:

  1. using the FP depth buffer,
  2. glDepthFunc(GL_GREATER),
  3. glClipControl(…,GL_ZERO_TO_ONE),
  4. the following projection matrix:
    | f/aspect  0      0      0    |
    |                              |
    |    0      f      0      0    |
P = |                              |
    |    0      0      0    zNear  |
    |                              |
    |    0      0     -1      0    |

where f = cot(ViewAngleVertical/2),
aspect = Viewport.x / Viewport.y,
zNear: distance to the near clipping plane; this value could be very small (in 1e-XX range).

This setup will cull all the geometry behind the near clipping plane, and the resulting depth will decrease as objects are drawn further away, asymptotically approaching 0 for infinitely distant objects.
Obviously this will only work for FP buffers, heavily exploiting the exponent part of the numbers in their small range (Xe-XXX), because 99% of the depth values will be less than 0.000…

Must admit, I have not tried it yet, though (I have a GeForce GT520 and even with the latest driver I do not have support for glClipControl, as GLView reports). But anyway, using an FP depth buffer is not a solution I would be completely happy with.

[QUOTE=kRogue;1261322]There is no need for an extension.[/QUOTE]Seems like I am asking the world to do me a personal favor. OK, let me show you why I think this extension will benefit the game industry (not just me alone). For clarity: the final goal is to make it possible for cameras to have a near clipping plane set as close as ~1e-3 or so (a few millimeters, in other words, or even less) while drawing the scene as far as ~1e+5 or so (a hundred kilometers) without artifacts caused by degraded z-buffer resolution.

Here is a typical problem everyone is accustomed to overlooking:
[ATTACH=CONFIG]725[/ATTACH] - example from Far Cry 3, dawn time, player looks down, the pole on the right has a long shadow stretching across the road.
What is wrong in that picture?
No idea?
Legs. There is no body, no legs, no shadow cast by the player.
Why? Why is it so in EVERY FPS game? Why is the character’s model never rendered the same way as every other model in the scene? Oh, because the main camera into which the whole scene is rendered has a near clipping plane set too far away to capture the close-up parts of the player’s body! :whistle:

Now imagine we got the extension I fetish so much about. Now we can set up the camera with the near clipping plane located 1 mm away, no far clipping plane - and the default 24-bit integer depth buffer will let us draw the whole open-world scene with no artifacts (as we would be able to control the distribution of z values independently of the location of the near clipping plane by setting the w plane for z at a different position). How tempting would it be to simply attach the camera to the player model’s head and make no exceptions, drawing the player’s model together with the objects in the scene? How much more realism would be achieved when the player sees objects actually colliding and interacting with his own body, seeing his own shadow reflecting each of the model’s movements? Damn, with zNear=0.001 there is enough space between the character’s eyes and the goggles to fit such a camera in between and render from that natural location! We could even put two cameras right in front of the player’s eyes and render in stereo mode from those points, even through goggles or a visor mounted on the player’s head - whatever. Well, playing in stereo is not common yet, so for a single camera the base of the nose is a good point anyway. :slight_smile:

I feel like you are not understanding what the suggested vertex shader does. That vertex shader will produce exactly the same results you are requesting with the div stuff, where the xy divisor is gl_Position.w and the z divisor is ZDiv. Let’s examine why.

First, the clip distance expressions (did you remember to enable GL_CLIP_DISTANCE0 and GL_CLIP_DISTANCE1 in API code?) are:


gl_ClipDistance[0] = gl_DivPosition.z - gl_Position.z

The clip test it imposes is gl_DivPosition.z - gl_Position.z >= 0, which is equivalent to gl_Position.z <= gl_DivPosition.z. The next one:


gl_ClipDistance[1] = gl_DivPosition.z + gl_Position.z

is gl_DivPosition.z + gl_Position.z >= 0, which is the same as -gl_DivPosition.z <= gl_Position.z. Hence the triangle is now clipped by the usual 6 clipping planes plus the two new ones you wanted. Now we normalize gl_Position.z as follows:

gl_Position.z *= gl_Position.w / gl_DivPosition.z

AFTER assigning those values. Let’s see what z after the w-divide will be at the edges of the user-defined clip distances. At the edge where ClipDistance[0] = 0, we have gl_Position.z = gl_DivPosition.z; at that extreme the normalized device coordinate for z is +1.0, as you want. At the edge where ClipDistance[1] = 0, we have gl_Position.z = -gl_DivPosition.z, and the normalized z is -1.0.
More importantly we clipped against what you are after.

However, if one wants to change the clipping volume for x and y, then more work is required, because one is also changing where the vertices are located. One can still do this in the vertex shader only, by writing to gl_Position.xyz the -normalized- device coordinates for a vertex and setting gl_Position.w to 1.0. This gives -exactly- what you want when the div factors are all positive. The catch is that one must then clip the triangle to -gl_DivPosition.x <= x <= gl_DivPosition.x and -gl_DivPosition.y <= y <= gl_DivPosition.y, so you need to add 4 more clip distances (2 for x and 2 for y).

Those user-defined clip distances are key to handling gl_DivPosition having non-positive values, and they force the triangle to get clipped to exactly that clipping volume.

However, you now need to decide how to interpolate values, as the w value is what makes the interpolation perspective-correct. Making w=1.0 essentially instructs the interpolation to be non-perspective, which does icky things to texture coordinates. Even deciding on a correct 1/w is going to be tricky, since you now have 3 different division factors: one for x, one for y and one for z. That z is different is not really a big deal, but x and y not being the same means there is no good choice. But let’s say you do figure out the 1/w at each vertex. Take a look at the formula for perspective-correct interpolation: it is a quotient of one flatly interpolated term and another flatly interpolated term that involves 1/w. Specifically, for each interpolate u, write as an output u/w, call it uDivW (where w is the w you chose for the vertex to do perspective), and also write as an interpolate 1/w, call it wInverse. Then the value at the fragment shader is uDivW/wInverse. (Since you wrote each gl_Position.w as 1.0, it does not matter whether you declare your interpolates as perspective or noperspective.)

At the end of the day, the exact functionality you want is already doable, even without geometry shaders.

I think maybe the issue is that you do not realize that what you are asking for changes the normalized x’s and y’s, and thus dramatically moves the primitives’ vertices as well.

I was thinking more on this, and the situation where the div factor is negative for some verts and positive for others, together with some x’s, y’s or z’s being negative, interacts badly (the normalized value comes out positive and does not interpolate correctly). An example that breaks the vertex-shader clip-distance magick: take two points where one has both gl_Position.z and gl_DivPosition.z positive and the other has both negative; the normalized coords stay positive, but the segment should stretch across 0… But all is not lost - here is the extension implemented as a geometry shader. The code mirrors what a dedicated triangle clipper would do anyways; the shader is written old-school style for GL_ARB_geometry_shader4:



in vec3 DivPosition[];

int count;
vec3 fan_positions[9];
vec3 fan_DivPositions[9];

float current_clip[9];

/*
  return the location on the segment p0-p1 where the linearly
  interpolated clip value crosses 0.
*/
vec3
compute_clip_location(in vec3 p0, in float v0, in vec3 p1, in float v1)
{
   /*
    solve for t so that t*v0 + (1-t)*v1 = 0
   */
   float t = v1/(v1-v0);

   return t*p0 + (1.0-t)*p1;
}

/*
  there are 6 clipping planes:

    -DivPosition.? <= gl_Position.? <= DivPosition.?   for ?=x,y,z

  we codify them as

    sgn*fan_positions[i].? + fan_DivPositions[i].? >= 0,   sgn=-1,1

  store that value for each active element of the current fan
*/
#define compute_current_clip(sgn, F) \
  do { \
     for(int i=0; i<count; ++i) { \
         current_clip[i] = (sgn)*fan_positions[i].F + fan_DivPositions[i].F; \
     } \
  } while(false)


/*
  for a convex fan there are either two indices where the sign of
  current_clip switches, or the sign does not flip at all.
  Record those two indices; if there are no switches, record -1 twice.
*/
void
find_sides(out int changes_at[2])
{
   float v;
   int aa;

   changes_at[0]=changes_at[1]=-1;

   aa=0;
   v=sign(current_clip[0]);
   for(int i=1; i<count && aa<2; ++i)
   {
       float w;
       w=sign(current_clip[i]);
       if(w!=v)
       {
          changes_at[aa] = i;
          ++aa;
          v=w;
       }
   }
}

/*
  clip the current fan against the plane whose values are stored in
  current_clip: vertices with current_clip >= 0 are kept, and an
  intersection vertex is inserted on each edge (taken cyclically) that
  crosses the plane. The division vector is interpolated exactly like
  the position, so inserted vertices get a matching divisor.
*/
void
stitch(int changes_at[2])
{
    if(changes_at[0]==-1)
    {
       /* no sign change: the fan is entirely kept or entirely culled */
       if(current_clip[0]<0.0)
         count=0;

       return;
    }

    vec3 new_positions[9];
    vec3 new_DivPositions[9];
    int new_count=0;

    for(int i=0; i<count; ++i)
    {
        int next = (i+1==count) ? 0 : i+1;

        if(current_clip[i]>=0.0)
        {
            //keep unclipped vertices
            new_positions[new_count]=fan_positions[i];
            new_DivPositions[new_count]=fan_DivPositions[i];
            ++new_count;
        }

        if(sign(current_clip[i])!=sign(current_clip[next]))
        {
            //insert the interpolated crossing point on this edge
            new_positions[new_count]=
               compute_clip_location(fan_positions[i], current_clip[i],
                                     fan_positions[next], current_clip[next]);
            new_DivPositions[new_count]=
               compute_clip_location(fan_DivPositions[i], current_clip[i],
                                     fan_DivPositions[next], current_clip[next]);
            ++new_count;
        }
    }

    for(int i=0; i<new_count; ++i)
    {
        fan_positions[i]=new_positions[i];
        fan_DivPositions[i]=new_DivPositions[i];
    }

    count=new_count;
}


#define do_it(sgn, F) \
  do { \
   int temp[2]; \
   if(count>0) \
   { \
      compute_current_clip(sgn, F); \
      find_sides(temp); \
      stitch(temp); \
   } \
  } while(false)

void
main()
{
   count=3;
   for(int i=0; i<3; ++i)
   {
       fan_positions[i]=gl_PositionIn[i].xyz;
       fan_DivPositions[i]=DivPosition[i];
   }

   /*
      clip the triangle to the six clipping equations
    */
   do_it( 1.0, x);
   do_it(-1.0, x);
   do_it( 1.0, y);
   do_it(-1.0, y);
   do_it( 1.0, z);
   do_it(-1.0, z);

   /* now send out the triangle fan, dividing each vertex by its own
      division vector and leaving w at 1.0 */
   for(int i=2; i<count; ++i)
   {
       gl_Position = vec4(fan_positions[0]/fan_DivPositions[0], 1.0);
       EmitVertex();
       gl_Position = vec4(fan_positions[i-1]/fan_DivPositions[i-1], 1.0);
       EmitVertex();
       gl_Position = vec4(fan_positions[i]/fan_DivPositions[i], 1.0);
       EmitVertex();
       EndPrimitive();
   }
}


That terribly unoptimized mess will give you your extension. It does not contain early-out optimizations. It writes the output position as the normalized device coordinates you want, with w=1.0. It needs to be augmented for interpolates (which means saving the t in compute_clip_location and using it to interpolate the attributes there as well) and should be refactored to be more optimal.

Have fun.