[Mesa-dev] [RFC PATCH 00/13] Reusing nir xfb gathering for ARB_gl_spirv, wip array of blocks, plus other fixes

Discussion:

Alejandro Piñeiro

2018-12-08 11:48:08 UTC

Hi Jason,

as we were talking about arrays of blocks, xfb and other etc, I
decided to clean up the code I have so far, and send a new RFC, so you
know more or less what I have been doing.

With this series most of the xfb tests I have wrotten pass so far
(that includes arrays of structs, arrays of arrays of basic simples).

The only pending thing is how to deal with arrays of blocks of
input/output interface blocks, because as I told you privately, right
now locations are overlapping.

It is worth to note that due the same reason, a basic test using
input/output interface blocks now are regressing. It worked before the
block being splitted, and fails now. I assume that due the same
reason.

You can also find this series here:
https://github.com/Igalia/mesa/tree/apinheiro/rfc2-xfb

And the piglit series that adds xfb tests and a non-xfb array of
blocks tests (that as I mentioned, regresses) here:
https://github.com/Igalia/piglit/tree/apinheiro/xfb

Alejandro Piñeiro (13):
spirv/nir: update Xfb decoration comment
nir: don't assert when xfb_buffer/stride is present but not xfb_offset
nir: fix output offset compute for dvec3/4
nir: add component_offset at nir_xfb_info
spirv/nir: fixing the xfb_offset for arrays of structs
nir: fixing the xfb_offset for arrays of structs
spirv/nir: use array_element as interface_type for any array
spirv/nir: interface_type should only be set for blocks, not any
structs
nir/spirv: only expose interface type for arrays of interface blocks
nir/xfb: WIP: handle xfb buffer/offset rule for block arrays
nir: adding varyings on nir_xfb_info and gather_info
nir/xfb_info: handle arrays and AoA of basic types
nir/linker: use nir_gather_xfb_info

src/compiler/glsl/gl_nir_link_xfb.c | 252 +++++++------------------
src/compiler/nir/nir_gather_xfb_info.c | 162 +++++++++++++---
src/compiler/nir/nir_xfb_info.h | 25 ++-
src/compiler/spirv/spirv_to_nir.c | 2 +-
src/compiler/spirv/vtn_variables.c | 68 ++++---
5 files changed, 270 insertions(+), 239 deletions(-)

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:09 UTC

Permalink

Now Vulkan radv driver, and ARB_gl_spirv implementation supports
transform feedback. Having said so, those decorations are handled
elsewhere.

Reviewed-by: Jason Ekstrand <***@jlekstrand.net>
---
src/compiler/spirv/spirv_to_nir.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index 4e1ffc3fcbe..d1a86210656 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -775,7 +775,7 @@ struct_member_decoration_cb(struct vtn_builder *b,

case SpvDecorationXfbBuffer:
case SpvDecorationXfbStride:
- vtn_warn("Vulkan does not have transform feedback");
+ /* Handled at vtn_variables.c, apply_var_decoration */
break;

case SpvDecorationCPacked:

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:10 UTC

Permalink

In order to allow nir_gather_xfb_info to be used on OpenGL,
specifically ARB_gl_spirv.

So, from OpenGL 4.6 spec, section 11.1.2.1, "Output Variables":

"outputs specifying both an *XfbBuffer* and an *Offset* are
captured, while outputs not specifying both of these are not
captured. Values are captured each time the shader writes to such
a decorated object."

This implies that are captured if both are present, and not if one of
those are lacking. Technically, it doesn't explicitly point that
having just one or the other is a mistake. In some cases, glslang is
adding some extra XfbBuffer without XfbOffset around, and mentioning
that technically that is not a bug (see issue#1526)

And for the case of Vulkan, as the same glslang issue mentions, it is
not clear if that should be a mistake or not. But even if it is a
mistake, it is not really needed to be checked on the driver, and we
can let the validation layers to check that.

v2: simplify explicit_xfb_buffer and explicit_offset checks (Jason).

Reviewed-by: Jason Ekstrand <***@jlekstrand.net>
---
src/compiler/nir/nir_gather_xfb_info.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/compiler/nir/nir_gather_xfb_info.c b/src/compiler/nir/nir_gather_xfb_info.c
index e282bba0081..f8d4cd833c7 100644
--- a/src/compiler/nir/nir_gather_xfb_info.c
+++ b/src/compiler/nir/nir_gather_xfb_info.c
@@ -107,11 +107,9 @@ nir_gather_xfb_info(const nir_shader *shader, void *mem_ctx)
*/
unsigned num_outputs = 0;
nir_foreach_variable(var, &shader->outputs) {
- if (var->data.explicit_xfb_buffer ||
- var->data.explicit_xfb_stride) {
- assert(var->data.explicit_xfb_buffer &&
- var->data.explicit_xfb_stride &&
- var->data.explicit_offset);
+ if (var->data.explicit_xfb_buffer &&
+ var->data.explicit_offset) {
+
num_outputs += glsl_count_attribute_slots(var->type, false);
}
}
@@ -122,8 +120,9 @@ nir_gather_xfb_info(const nir_shader *shader, void *mem_ctx)

/* Walk the list of outputs and add them to the array */
nir_foreach_variable(var, &shader->outputs) {
- if (var->data.explicit_xfb_buffer ||
- var->data.explicit_xfb_stride) {
+ if (var->data.explicit_xfb_buffer &&
+ var->data.explicit_offset) {
+
unsigned location = var->data.location;
unsigned offset = var->data.offset;
add_var_xfb_outputs(xfb, var, &location, &offset, var->type);

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:11 UTC

Permalink

The offset compute was working fine for the case of attrib_slots=1,
and updating the offset for the following varying.

But in the case of attrib_slots=2 (so dvec3/4), we are basically
splitting the comp_slots needed in two outputs. In that case we can't
add to the offset the full size of the type.

v2: added assert and some parenthesis to improve readability (Jason)

Reviewed-by: Jason Ekstrand <***@jlekstrand.net>
---
src/compiler/nir/nir_gather_xfb_info.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_gather_xfb_info.c b/src/compiler/nir/nir_gather_xfb_info.c
index f8d4cd833c7..f4f597da4f5 100644
--- a/src/compiler/nir/nir_gather_xfb_info.c
+++ b/src/compiler/nir/nir_gather_xfb_info.c
@@ -81,7 +81,12 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
output->component_mask = (comp_mask >> (s * 4)) & 0xf;

(*location)++;
- *offset += comp_slots * 4;
+ /* attrib_slots would be only > 1 for doubles. On that case
+ * comp_slots will be a multiple of 2, so the following doesn't need
+ * to use DIV_ROUND_UP or similar
+ */
+ assert(comp_slots % attrib_slots == 0);
+ *offset += (comp_slots / attrib_slots) * 4;
}
}
}

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:12 UTC

Permalink

Where component_offset here is the offset when accessing components of
a packed variable. Or in other words, location_frac on
nir.h. Different places of mesa use different names for it.

Technically nir_xfb_info consumer can get the same from the
component_mask, it seems somewhat forced to make it to compute it,
instead of providing it.
---
src/compiler/nir/nir_gather_xfb_info.c | 3 +++
src/compiler/nir/nir_xfb_info.h | 1 +
2 files changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir_gather_xfb_info.c b/src/compiler/nir/nir_gather_xfb_info.c
index f4f597da4f5..bf432583ddb 100644
--- a/src/compiler/nir/nir_gather_xfb_info.c
+++ b/src/compiler/nir/nir_gather_xfb_info.c
@@ -70,6 +70,7 @@ add_var_xfb_outputs(nir_xfb_info *xfb,

assert(var->data.location_frac + comp_slots <= 8);
uint8_t comp_mask = ((1 << comp_slots) - 1) << var->data.location_frac;
+ unsigned location_frac = var->data.location_frac;

assert(attrib_slots <= 2);
for (unsigned s = 0; s < attrib_slots; s++) {
@@ -79,6 +80,7 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
output->offset = *offset;
output->location = *location;
output->component_mask = (comp_mask >> (s * 4)) & 0xf;
+ output->component_offset = location_frac;

(*location)++;
/* attrib_slots would be only > 1 for doubles. On that case
@@ -87,6 +89,7 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
*/
assert(comp_slots % attrib_slots == 0);
*offset += (comp_slots / attrib_slots) * 4;
+ location_frac = 0;
}
}
}
diff --git a/src/compiler/nir/nir_xfb_info.h b/src/compiler/nir/nir_xfb_info.h
index 9b543df5f47..fef52ba96d8 100644
--- a/src/compiler/nir/nir_xfb_info.h
+++ b/src/compiler/nir/nir_xfb_info.h
@@ -34,6 +34,7 @@ typedef struct {
uint16_t offset;
uint8_t location;
uint8_t component_mask;
+ uint8_t component_offset;
} nir_xfb_output_info;

typedef struct {

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:13 UTC

Permalink

GLSLang computes the xfb_offset for struct members. In fact, for basic
structs, the xfb nir gathering pass expect those to be filled, as one
struct variable is lowered to several nir variables, and those need to
have the xfb offset already set. See [1].

But, as one existing comments at spirv to nir already points:

"GLSLang really likes to place decorations in the most interior
thing it possibly can. In particular, if you have a struct, it
will place the patch decorations on the struct members"

And that includes xfb offset. In fact, GLSLang not expose the xfb
offset of the full struct, as it is properly assigned to the members,
and it makes a lot of the internal checks (like offset overlapping)
easier for them. I was not able to find a spec quote saying that is
wrong, as all the individual members has the proper offset.

This affects the case of variables that are array of structs, are they
are exposed as just one nir variable output.

So this commit resets the xfb_offset for the nir_variable if any of
the members has a xfb_offset assigned.

Rant: In general, the rules for xfb offset assignment on the spec are
somewhat underspecified for the new ARB_gl_spirv/vulkan world, as it
is not clear who is the responsible to do that (in opposite to the old
GLSL world, where the answer is "always/everything should solved by
the driver"). Ideally, it would be good if glslang does it, so the
vulkan/opengl driver just need to get the info. Unfourtunately, there
are cases, like arrays of structs where the driver still need to do
the assignment. So perhaps in the end it should be the opposite,
glslang (or any other frontend), just exposing the explicit info from
the user, and let the driver do the individual
assignments. Unfourtunately, with the current spec, there isn't
anything preventing the frontend to do that, so we would need to be
defensive, and cover all aspects.

[1] https://github.com/KhronosGroup/glslang/pull/154
---
src/compiler/spirv/vtn_variables.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
index 13e8bf1fc3c..91e351187d2 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1718,6 +1718,26 @@ add_missing_member_locations(struct vtn_variable *var,
}
}

+static void
+vtn_fix_struct_array_xfb_offset(nir_variable *var)
+{
+ if (!glsl_type_is_array(var->type))
+ return;
+
+ const struct glsl_type *child_type = glsl_get_array_element(var->type);
+
+ if (!glsl_type_is_struct(child_type))
+ return;
+
+ if (var->data.explicit_offset)
+ return;
+
+ int offset = glsl_get_struct_field_offset(child_type, 0);
+ if (offset != -1) {
+ var->data.explicit_offset = 1;
+ var->data.offset = offset;
+ }
+}

static void
vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
@@ -1882,6 +1902,8 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
vtn_foreach_decoration(b, vtn_value(b, interface_type->id,
vtn_value_type_type),
var_decoration_cb, var);
+
+ vtn_fix_struct_array_xfb_offset(var->var);
break;
}

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:14 UTC

Permalink

Equivalent to previous patch (so comments applies), but implemented on
a different place. We would need to chose in which one.
---
src/compiler/nir/nir_gather_xfb_info.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/src/compiler/nir/nir_gather_xfb_info.c b/src/compiler/nir/nir_gather_xfb_info.c
index bf432583ddb..6611691b686 100644
--- a/src/compiler/nir/nir_gather_xfb_info.c
+++ b/src/compiler/nir/nir_gather_xfb_info.c
@@ -101,6 +101,27 @@ compare_xfb_output_offsets(const void *_a, const void *_b)
return a->offset - b->offset;
}

+static void
+fix_struct_array_xfb_offset(nir_variable *var)
+{
+ if (!glsl_type_is_array(var->type))
+ return;
+
+ const struct glsl_type *child_type = glsl_get_array_element(var->type);
+
+ if (!glsl_type_is_struct(child_type))
+ return;
+
+ if (var->data.explicit_offset)
+ return;
+
+ int offset = glsl_get_struct_field_offset(child_type, 0);
+ if (offset != -1) {
+ var->data.explicit_offset = 1;
+ var->data.offset = offset;
+ }
+}
+
nir_xfb_info *
nir_gather_xfb_info(const nir_shader *shader, void *mem_ctx)
{
@@ -115,6 +136,8 @@ nir_gather_xfb_info(const nir_shader *shader, void *mem_ctx)
*/
unsigned num_outputs = 0;
nir_foreach_variable(var, &shader->outputs) {
+ fix_struct_array_xfb_offset(var);
+
if (var->data.explicit_xfb_buffer &&
var->data.explicit_offset) {

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:15 UTC

Permalink

This commit removes several of the checks when assigning the
array_element as the interface_type. Reading the comment, and what
commit bb04b84114d2780307f9cbd04447216c3f2d1c0c added on top, this is
done conservatively, for only the builtin cases that makes sense at
that moment. But even if those were true, that should be already
validated on the SPIR-V shader.

Additionally, it is not clear that user-defined array of input/output
blocks are not allowed on Vulkan. And for sure, they will be allowed
on OpenGL (via ARB_gl_spirv), so that method was too restrictive.
---
src/compiler/spirv/vtn_variables.c | 29 +++++------------------------
1 file changed, 5 insertions(+), 24 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
index 91e351187d2..be3545aad47 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1664,24 +1664,6 @@ vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa,
return ptr;
}

-static bool
-is_per_vertex_inout(const struct vtn_variable *var, gl_shader_stage stage)
-{
- if (var->patch || !glsl_type_is_array(var->type->type))
- return false;
-
- if (var->mode == vtn_variable_mode_input) {
- return stage == MESA_SHADER_TESS_CTRL ||
- stage == MESA_SHADER_TESS_EVAL ||
- stage == MESA_SHADER_GEOMETRY;
- }
-
- if (var->mode == vtn_variable_mode_output)
- return stage == MESA_SHADER_TESS_CTRL;
-
- return false;
-}
-
static void
add_missing_member_locations(struct vtn_variable *var,
bool is_vertex_input)
@@ -1871,12 +1853,11 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
*/

struct vtn_type *interface_type = var->type;
- if (is_per_vertex_inout(var, b->shader->info.stage)) {
- /* In Geometry shaders (and some tessellation), inputs come
- * in per-vertex arrays. However, some builtins come in
- * non-per-vertex, hence the need for the is_array check. In
- * any case, there are no non-builtin arrays allowed so this
- * check should be sufficient.
+ if (!var->patch && glsl_type_is_array(var->type->type)) {
+ /* On Vulkan, Geometry shaders and some Tessellation, some inputs
+ * come in per-vertex arrays, so we need to check for arrays. On
+ * OpenGL we have the same, plus the possibility of user-defined
+ * inout block arrays.
*/
interface_type = var->type->array_element;
}

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:16 UTC

Permalink

Current code assumes that if the type is an struct it would behave as
a block. That is not always the case (like xfb_offset/xfb_buffer
assignment on arrays of structs vs arrays of blocks), so we need to
differentiate.
---
src/compiler/spirv/vtn_variables.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
index be3545aad47..6d7d5dfc691 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1677,9 +1677,12 @@ add_missing_member_locations(struct vtn_variable *var,
*
* “If the structure type is a Block but without a Location, then each
* of its members must have a Location decoration.”
+ *
*/
- assert(var->base_location != -1 ||
- var->var->members[i].location != -1);
+ if (var->type->block) {
+ assert(var->base_location != -1 ||
+ var->var->members[i].location != -1);
+ }

/* From the Vulkan spec:
*
@@ -1692,8 +1695,12 @@ add_missing_member_locations(struct vtn_variable *var,
else
var->var->members[i].location = location;

+ /* Below we use type instead of interface_type, because interface_type
+ * is only available when it is a Block. This code also supports
+ * input/outputs that are just structs
+ */
const struct glsl_type *member_type =
- glsl_get_struct_field(var->var->interface_type, i);
+ glsl_get_struct_field(glsl_without_array(var->type->type), i);

location +=
glsl_count_attribute_slots(member_type, is_vertex_input);
@@ -1862,9 +1869,11 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
interface_type = var->type->array_element;
}

- if (glsl_type_is_struct(interface_type->type)) {
+ if (interface_type->block) {
var->var->interface_type = interface_type->type;
+ }

+ if (glsl_type_is_struct(interface_type->type)) {
/* It's a struct. Set it up as per-member. */
var->var->num_members = glsl_get_length(interface_type->type);
var->var->members = rzalloc_array(var->var, struct nir_variable_data,

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:17 UTC

Permalink

In the same way that just a struct is not a interface block, and array
of structs is not an array of interface blocks.

At least at the NIR level.
---
src/compiler/spirv/vtn_variables.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
index 6d7d5dfc691..a8f2fdfa534 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1860,7 +1860,7 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val,
*/

struct vtn_type *interface_type = var->type;
- if (!var->patch && glsl_type_is_array(var->type->type)) {
+ if (!var->patch && glsl_type_is_array(var->type->type) && var->type->array_element->block) {
/* On Vulkan, Geometry shaders and some Tessellation, some inputs
* come in per-vertex arrays, so we need to check for arrays. On
* OpenGL we have the same, plus the possibility of user-defined

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:18 UTC

Permalink

From GLSL 4.60 spec, Section 4.4.2. Output Layout Qualifiers,
subsection Transform Feedback Layout Qualifiers:

"When a block is declared as an array, all members of block
array-element 0 are captured, as previously described, by the
declared or inherited xfb_buffer. Generally, an array of size N of
blocks is captured by N consecutive buffers, with all members of
block array-element E captured by buffer B, where B equals the
declared or inherited xfb_buffer plus E"

And although not explicitly mentioned, one conclusion for this
paragraph would be that the xfb offset remain the same for the same
member of each array-element.

WIP: xfb arrays of blocks tests still not working properly due
location overlapping.
---
src/compiler/nir/nir_gather_xfb_info.c | 57 ++++++++++++++++++++------
1 file changed, 45 insertions(+), 12 deletions(-)

diff --git a/src/compiler/nir/nir_gather_xfb_info.c b/src/compiler/nir/nir_gather_xfb_info.c
index 6611691b686..c46af311b20 100644
--- a/src/compiler/nir/nir_gather_xfb_info.c
+++ b/src/compiler/nir/nir_gather_xfb_info.c
@@ -30,28 +30,30 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
nir_variable *var,
unsigned *location,
unsigned *offset,
+ unsigned buffer,
const struct glsl_type *type)
{
if (glsl_type_is_array(type) || glsl_type_is_matrix(type)) {
unsigned length = glsl_get_length(type);
const struct glsl_type *child_type = glsl_get_array_element(type);
+
for (unsigned i = 0; i < length; i++)
- add_var_xfb_outputs(xfb, var, location, offset, child_type);
+ add_var_xfb_outputs(xfb, var, location, offset, buffer, child_type);
} else if (glsl_type_is_struct(type)) {
unsigned length = glsl_get_length(type);
for (unsigned i = 0; i < length; i++) {
const struct glsl_type *child_type = glsl_get_struct_field(type, i);
- add_var_xfb_outputs(xfb, var, location, offset, child_type);
+ add_var_xfb_outputs(xfb, var, location, offset, buffer, child_type);
}
} else {
- assert(var->data.xfb_buffer < NIR_MAX_XFB_BUFFERS);
- if (xfb->buffers_written & (1 << var->data.xfb_buffer)) {
- assert(xfb->strides[var->data.xfb_buffer] == var->data.xfb_stride);
- assert(xfb->buffer_to_stream[var->data.xfb_buffer] == var->data.stream);
+ assert(buffer < NIR_MAX_XFB_BUFFERS);
+ if (xfb->buffers_written & (1 << buffer)) {
+ assert(xfb->strides[buffer] == var->data.xfb_stride);
+ assert(xfb->buffer_to_stream[buffer] == var->data.stream);
} else {
- xfb->buffers_written |= (1 << var->data.xfb_buffer);
- xfb->strides[var->data.xfb_buffer] = var->data.xfb_stride;
- xfb->buffer_to_stream[var->data.xfb_buffer] = var->data.stream;
+ xfb->buffers_written |= (1 << buffer);
+ xfb->strides[buffer] = var->data.xfb_stride;
+ xfb->buffer_to_stream[buffer] = var->data.stream;
}

assert(var->data.stream < NIR_MAX_XFB_STREAMS);
@@ -76,7 +78,7 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
for (unsigned s = 0; s < attrib_slots; s++) {
nir_xfb_output_info *output = &xfb->outputs[xfb->output_count++];

- output->buffer = var->data.xfb_buffer;
+ output->buffer = buffer;
output->offset = *offset;
output->location = *location;
output->component_mask = (comp_mask >> (s * 4)) & 0xf;
@@ -154,9 +156,40 @@ nir_gather_xfb_info(const nir_shader *shader, void *mem_ctx)
if (var->data.explicit_xfb_buffer &&
var->data.explicit_offset) {

+ unsigned buffer = var->data.xfb_buffer;
unsigned location = var->data.location;
- unsigned offset = var->data.offset;
- add_var_xfb_outputs(xfb, var, &location, &offset, var->type);
+
+ /* The last check is needed to distinguish a block array from a block
+ * that contains an array. That becomes messy due all the
+ * nir_split_per_members passes, as at this point we are not going to
+ * receive the original block array type, but splitted
+ */
+ bool block_array = glsl_type_is_array(var->type) &&
+ var->interface_type != NULL &&
+ glsl_get_array_element(var->type) == var->interface_type;
+
+ /*
+ * From GLSL 4.60 spec, Section 4.4.2. Output Layout Qualifiers,
+ * subsection Transform Feedback Layout Qualifiers:
+ *
+ * "When a block is declared as an array, all members of block
+ * array-element 0 are captured, as previously described, by the
+ * declared or inherited xfb_buffer. Generally, an array of size N
+ * of blocks is captured by N consecutive buffers, with all members
+ * of block array-element E captured by buffer B, where B equals the
+ * declared or inherited xfb_buffer plus E"
+ *
+ * We handle it here, externally to add_var_xfb_outputs, in order to
+ * not make it overly complicated.
+ */
+ unsigned num_iterations = block_array ? glsl_get_length(var->type) : 1;
+ const struct glsl_type *top_level_type = block_array ? var->interface_type : var->type;
+
+ for (unsigned i = 0; i < num_iterations; i++, buffer++) {
+ unsigned offset = var->data.offset;
+
+ add_var_xfb_outputs(xfb, var, &location, &offset, buffer, top_level_type);
+ }
}
}
assert(xfb->output_count == num_outputs);

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:19 UTC

Permalink

In order to be used for OpenGL (right now for ARB_gl_spirv).

This commit adds two new structures:

* nir_xfb_varying_info: that identifies each individual varying. For
each one, we need to know the type, buffer and xfb_offset

* nir_xfb_buffer_info: as now for each buffer, in addition to the
stride, we need to know how many varyings are assigned to it.

At this point, the only case where num_outputs != num_varyings is with
the case of doubles, that for dvec3/4 could require more than one
output. There are more cases though, that will be handled on following
patches.

As it is somewhat more complex to know the number of varyings needed
that the number of outputs, and num_varyings will be always less that
num_outputs, we are using num_outputs as an approximation when
allocating memory. This is debatable though. One alternative would be
to allocate as needed, as the original ARB_gl_spirv custom xfb
gathering pass was doing.
---
src/compiler/nir/nir_gather_xfb_info.c | 27 +++++++++++++++++++++++---
src/compiler/nir/nir_xfb_info.h | 24 +++++++++++++++--------
2 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/src/compiler/nir/nir_gather_xfb_info.c b/src/compiler/nir/nir_gather_xfb_info.c
index c46af311b20..e3c3376fb34 100644
--- a/src/compiler/nir/nir_gather_xfb_info.c
+++ b/src/compiler/nir/nir_gather_xfb_info.c
@@ -25,6 +25,17 @@

#include <util/u_math.h>

+static nir_xfb_info *
+nir_gather_xfb_info_create(void *mem_ctx, uint16_t output_count, uint16_t varying_count)
+{
+ nir_xfb_info *xfb = rzalloc_size(mem_ctx, sizeof(nir_xfb_info));
+
+ xfb->varyings = rzalloc_size(mem_ctx, sizeof(nir_xfb_varying_info) * varying_count);
+ xfb->outputs = rzalloc_size(mem_ctx, sizeof(nir_xfb_output_info) * output_count);
+
+ return xfb;
+}
+
static void
add_var_xfb_outputs(nir_xfb_info *xfb,
nir_variable *var,
@@ -48,11 +59,11 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
} else {
assert(buffer < NIR_MAX_XFB_BUFFERS);
if (xfb->buffers_written & (1 << buffer)) {
- assert(xfb->strides[buffer] == var->data.xfb_stride);
+ assert(xfb->buffers[buffer].stride == var->data.xfb_stride);
assert(xfb->buffer_to_stream[buffer] == var->data.stream);
} else {
xfb->buffers_written |= (1 << buffer);
- xfb->strides[buffer] = var->data.xfb_stride;
+ xfb->buffers[buffer].stride = var->data.xfb_stride;
xfb->buffer_to_stream[buffer] = var->data.stream;
}

@@ -74,6 +85,12 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
uint8_t comp_mask = ((1 << comp_slots) - 1) << var->data.location_frac;
unsigned location_frac = var->data.location_frac;

+ nir_xfb_varying_info *varying = &xfb->varyings[xfb->varying_count++];
+ varying->type = type;
+ varying->buffer = var->data.xfb_buffer;
+ varying->offset = *offset;
+ xfb->buffers[var->data.xfb_buffer].varying_count++;
+
assert(attrib_slots <= 2);
for (unsigned s = 0; s < attrib_slots; s++) {
nir_xfb_output_info *output = &xfb->outputs[xfb->output_count++];
@@ -149,7 +166,11 @@ nir_gather_xfb_info(const nir_shader *shader, void *mem_ctx)
if (num_outputs == 0)
return NULL;

- nir_xfb_info *xfb = rzalloc_size(mem_ctx, nir_xfb_info_size(num_outputs));
+ /* It is complex to know how many varyings do we have beforehand. We use
+ * num_outputs as an approximation, as num_outputs should be bigger that
+ * num_varyings.
+ */
+ nir_xfb_info *xfb = nir_gather_xfb_info_create(mem_ctx, num_outputs, num_outputs);

/* Walk the list of outputs and add them to the array */
nir_foreach_variable(var, &shader->outputs) {
diff --git a/src/compiler/nir/nir_xfb_info.h b/src/compiler/nir/nir_xfb_info.h
index fef52ba96d8..71f4e87018c 100644
--- a/src/compiler/nir/nir_xfb_info.h
+++ b/src/compiler/nir/nir_xfb_info.h
@@ -29,6 +29,11 @@
#define NIR_MAX_XFB_BUFFERS 4
#define NIR_MAX_XFB_STREAMS 4

+typedef struct {
+ uint16_t stride;
+ uint16_t varying_count;
+} nir_xfb_buffer_info;
+
typedef struct {
uint8_t buffer;
uint16_t offset;
@@ -37,23 +42,26 @@ typedef struct {
uint8_t component_offset;
} nir_xfb_output_info;

+typedef struct {
+ const struct glsl_type *type;
+ uint8_t buffer;
+ uint16_t offset;
+} nir_xfb_varying_info;
+
typedef struct {
uint8_t buffers_written;
uint8_t streams_written;

- uint16_t strides[NIR_MAX_XFB_BUFFERS];
+ nir_xfb_buffer_info buffers[NIR_MAX_XFB_BUFFERS];
uint8_t buffer_to_stream[NIR_MAX_XFB_STREAMS];

+ uint16_t varying_count;
+ nir_xfb_varying_info *varyings;
+
uint16_t output_count;
- nir_xfb_output_info outputs[0];
+ nir_xfb_output_info *outputs;
} nir_xfb_info;

-static inline size_t
-nir_xfb_info_size(uint16_t output_count)
-{
- return sizeof(nir_xfb_info) + sizeof(nir_xfb_output_info) * output_count;
-}
-
nir_xfb_info *
nir_gather_xfb_info(const nir_shader *shader, void *mem_ctx);

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:20 UTC

Permalink

On OpenGL, a array of a simple type adds just one varying. So
gl_transform_feedback_varying_info struct defined at mtypes.h includes
the parameters Type (base_type) and Size (number of elements).

This commit checks this when the recursive add_var_xfb_outputs call
handles arrays, to ensure that just one is addded.

We also need to take into account AoA here.

v2: take into account basic aoa too
---
src/compiler/nir/nir_gather_xfb_info.c | 52 +++++++++++++++++++++-----
1 file changed, 43 insertions(+), 9 deletions(-)

diff --git a/src/compiler/nir/nir_gather_xfb_info.c b/src/compiler/nir/nir_gather_xfb_info.c
index e3c3376fb34..e19d4908715 100644
--- a/src/compiler/nir/nir_gather_xfb_info.c
+++ b/src/compiler/nir/nir_gather_xfb_info.c
@@ -36,25 +36,61 @@ nir_gather_xfb_info_create(void *mem_ctx, uint16_t output_count, uint16_t varyin
return xfb;
}

+static bool
+glsl_type_is_leaf(const struct glsl_type *type)
+{
+ if (glsl_type_is_struct(type) ||
+ (glsl_type_is_array(type) &&
+ (glsl_type_is_array(glsl_get_array_element(type)) ||
+ glsl_type_is_struct(glsl_get_array_element(type))))) {
+ return false;
+ } else {
+ return true;
+ }
+}
+
+static void
+add_var_xfb_varying(nir_xfb_info *xfb,
+ nir_variable *var,
+ unsigned offset,
+ const struct glsl_type *type)
+{
+ nir_xfb_varying_info *varying = &xfb->varyings[xfb->varying_count++];
+
+ varying->type = type;
+ varying->buffer = var->data.xfb_buffer;
+ varying->offset = offset;
+ xfb->buffers[var->data.xfb_buffer].varying_count++;
+}
+
static void
add_var_xfb_outputs(nir_xfb_info *xfb,
nir_variable *var,
unsigned *location,
unsigned *offset,
unsigned buffer,
- const struct glsl_type *type)
+ const struct glsl_type *type,
+ bool varying_added)
{
if (glsl_type_is_array(type) || glsl_type_is_matrix(type)) {
unsigned length = glsl_get_length(type);
+ bool local_varying_added = varying_added;
+
const struct glsl_type *child_type = glsl_get_array_element(type);
+ if (!glsl_type_is_array_of_arrays(type) &&
+ glsl_type_is_leaf(child_type)) {
+
+ add_var_xfb_varying(xfb, var, *offset, type);
+ local_varying_added = true;
+ }

for (unsigned i = 0; i < length; i++)
- add_var_xfb_outputs(xfb, var, location, offset, buffer, child_type);
+ add_var_xfb_outputs(xfb, var, location, offset, buffer, child_type, local_varying_added);
} else if (glsl_type_is_struct(type)) {
unsigned length = glsl_get_length(type);
for (unsigned i = 0; i < length; i++) {
const struct glsl_type *child_type = glsl_get_struct_field(type, i);
- add_var_xfb_outputs(xfb, var, location, offset, buffer, child_type);
+ add_var_xfb_outputs(xfb, var, location, offset, buffer, child_type, varying_added);
}
} else {
assert(buffer < NIR_MAX_XFB_BUFFERS);
@@ -85,11 +121,9 @@ add_var_xfb_outputs(nir_xfb_info *xfb,
uint8_t comp_mask = ((1 << comp_slots) - 1) << var->data.location_frac;
unsigned location_frac = var->data.location_frac;

- nir_xfb_varying_info *varying = &xfb->varyings[xfb->varying_count++];
- varying->type = type;
- varying->buffer = var->data.xfb_buffer;
- varying->offset = *offset;
- xfb->buffers[var->data.xfb_buffer].varying_count++;
+ if (!varying_added) {
+ add_var_xfb_varying(xfb, var, *offset, type);
+ }

assert(attrib_slots <= 2);
for (unsigned s = 0; s < attrib_slots; s++) {
@@ -209,7 +243,7 @@ nir_gather_xfb_info(const nir_shader *shader, void *mem_ctx)
for (unsigned i = 0; i < num_iterations; i++, buffer++) {
unsigned offset = var->data.offset;

- add_var_xfb_outputs(xfb, var, &location, &offset, buffer, top_level_type);
+ add_var_xfb_outputs(xfb, var, &location, &offset, buffer, top_level_type, false);
}
}
}

--
2.19.1

Alejandro Piñeiro

2018-12-08 11:48:21 UTC

Permalink

Instead of a custom ARB_gl_spirv xfb gather info pass.

In fact, this is not only about reusing code, but the current custom
code was not handling properly how many varyings are enumerated from
some complex types. So this change is also about fixing some corner
cases.
---
src/compiler/glsl/gl_nir_link_xfb.c | 252 ++++++++--------------------
1 file changed, 72 insertions(+), 180 deletions(-)

diff --git a/src/compiler/glsl/gl_nir_link_xfb.c b/src/compiler/glsl/gl_nir_link_xfb.c
index bcef1e1863d..b294a4885b0 100644
--- a/src/compiler/glsl/gl_nir_link_xfb.c
+++ b/src/compiler/glsl/gl_nir_link_xfb.c
@@ -22,8 +22,8 @@
*/

#include "nir.h"
+#include "nir_xfb_info.h"
#include "gl_nir_linker.h"
-#include "ir_uniform.h" /* for gl_uniform_storage */
#include "linker_util.h"
#include "main/context.h"

@@ -34,158 +34,13 @@
* particularities.
*/

-struct active_xfb_buffer {
- GLuint stride;
- GLuint num_varyings;
-};
-
-struct active_xfb_varyings {
- unsigned num_varyings;
- unsigned num_outputs;
- unsigned buffer_size;
- struct nir_variable **varyings;
- struct active_xfb_buffer buffers[MAX_FEEDBACK_BUFFERS];
-};
-
-static unsigned
-get_num_outputs(nir_variable *var)
-{
- return glsl_count_attribute_slots(var->type,
- false /* is_vertex_input */);
-}
-
-static void
-add_xfb_varying(struct active_xfb_varyings *active_varyings,
- nir_variable *var)
-{
- if (active_varyings->num_varyings >= active_varyings->buffer_size) {
- if (active_varyings->buffer_size == 0)
- active_varyings->buffer_size = 1;
- else
- active_varyings->buffer_size *= 2;
-
- active_varyings->varyings = realloc(active_varyings->varyings,
- sizeof(nir_variable*) *
- active_varyings->buffer_size);
- }
-
- active_varyings->varyings[active_varyings->num_varyings++] = var;
-
- active_varyings->num_outputs += get_num_outputs(var);
-}
-
static int
-cmp_xfb_offset(const void *x_generic, const void *y_generic)
-{
- const nir_variable *const *x = x_generic;
- const nir_variable *const *y = y_generic;
-
- if ((*x)->data.xfb_buffer != (*y)->data.xfb_buffer)
- return (*x)->data.xfb_buffer - (*y)->data.xfb_buffer;
- return (*x)->data.offset - (*y)->data.offset;
-}
-
-static void
-get_active_xfb_varyings(struct gl_shader_program *prog,
- struct active_xfb_varyings *active_varyings)
-{
- for (unsigned i = 0; i < MESA_SHADER_STAGES; ++i) {
- struct gl_linked_shader *sh = prog->_LinkedShaders[i];
- if (sh == NULL)
- continue;
-
- nir_shader *nir = sh->Program->nir;
-
- nir_foreach_variable(var, &nir->outputs) {
- if (var->data.explicit_xfb_buffer &&
- var->data.explicit_xfb_stride) {
- assert(var->data.xfb_buffer < MAX_FEEDBACK_BUFFERS);
- active_varyings->buffers[var->data.xfb_buffer].stride =
- var->data.xfb_stride;
- }
-
- if (!var->data.explicit_xfb_buffer ||
- !var->data.explicit_offset)
- continue;
-
- active_varyings->buffers[var->data.xfb_buffer].num_varyings++;
-
- add_xfb_varying(active_varyings, var);
- }
- }
-
- /* The xfb_offset qualifier does not have to be used in increasing order
- * however some drivers expect to receive the list of transform feedback
- * declarations in order so sort it now for convenience.
- */
- qsort(active_varyings->varyings,
- active_varyings->num_varyings,
- sizeof(*active_varyings->varyings),
- cmp_xfb_offset);
-}
-
-static unsigned
-add_varying_outputs(nir_variable *var,
- const struct glsl_type *type,
- unsigned location_offset,
- unsigned dest_offset,
- struct gl_transform_feedback_output *output)
+count_bits(uint8_t mask)
{
- unsigned num_outputs = 0;
-
- if (glsl_type_is_array(type) || glsl_type_is_matrix(type)) {
- unsigned length = glsl_get_length(type);
- const struct glsl_type *child_type = glsl_get_array_element(type);
- unsigned component_slots = glsl_get_component_slots(child_type);
-
- for (unsigned i = 0; i < length; i++) {
- unsigned child_outputs = add_varying_outputs(var,
- child_type,
- location_offset,
- dest_offset,
- output + num_outputs);
- num_outputs += child_outputs;
- location_offset += child_outputs;
- dest_offset += component_slots;
- }
- } else if (glsl_type_is_struct(type)) {
- unsigned length = glsl_get_length(type);
- for (unsigned i = 0; i < length; i++) {
- const struct glsl_type *child_type = glsl_get_struct_field(type, i);
- unsigned child_outputs = add_varying_outputs(var,
- child_type,
- location_offset,
- dest_offset,
- output + num_outputs);
- num_outputs += child_outputs;
- location_offset += child_outputs;
- dest_offset += glsl_get_component_slots(child_type);
- }
- } else {
- unsigned location = var->data.location + location_offset;
- unsigned location_frac = var->data.location_frac;
- unsigned num_components = glsl_get_component_slots(type);
-
- while (num_components > 0) {
- unsigned output_size = MIN2(num_components, 4 - location_frac);
-
- output->OutputRegister = location;
- output->OutputBuffer = var->data.xfb_buffer;
- output->NumComponents = output_size;
- output->StreamId = var->data.stream;
- output->DstOffset = var->data.offset / 4 + dest_offset;
- output->ComponentOffset = location_frac;
-
- dest_offset += output_size;
- num_components -= output_size;
- num_outputs++;
- output++;
- location++;
- location_frac = 0;
- }
- }
-
- return num_outputs;
+ if (mask == 0)
+ return 0;
+ else
+ return (mask & 1) + count_bits(mask >> 1);
}

void
@@ -220,36 +75,62 @@ gl_nir_link_assign_xfb_resources(struct gl_context *ctx,
free(prog->TransformFeedback.VaryingNames[i]);
free(prog->TransformFeedback.VaryingNames);

- struct active_xfb_varyings active_varyings = { 0 };
+ nir_xfb_info *xfb_info = NULL;

- get_active_xfb_varyings(prog, &active_varyings);
+ /* Find last stage before fragment shader */
+ for (int stage = MESA_SHADER_FRAGMENT - 1; stage >= 0; stage--) {
+ if (stage != MESA_SHADER_VERTEX &&
+ stage != MESA_SHADER_TESS_EVAL &&
+ stage != MESA_SHADER_GEOMETRY) {
+ continue;
+ }

- for (unsigned buf = 0; buf < MAX_FEEDBACK_BUFFERS; buf++)
- prog->TransformFeedback.BufferStride[buf] = active_varyings.buffers[buf].stride;
+ struct gl_linked_shader *sh = prog->_LinkedShaders[stage];

- prog->TransformFeedback.NumVarying = active_varyings.num_varyings;
- prog->TransformFeedback.VaryingNames =
- malloc(sizeof(GLchar *) * active_varyings.num_varyings);
+ if (sh) {
+ xfb_info = nir_gather_xfb_info(sh->Program->nir, ctx);
+ break;
+ }
+ }

struct gl_transform_feedback_info *linked_xfb =
rzalloc(xfb_prog, struct gl_transform_feedback_info);
xfb_prog->sh.LinkedTransformFeedback = linked_xfb;

+ if (!xfb_info) {
+ prog->TransformFeedback.NumVarying = 0;
+ linked_xfb->NumOutputs = 0;
+ linked_xfb->NumVarying = 0;
+ linked_xfb->ActiveBuffers = 0;
+ return;
+ }
+
+ for (unsigned buf = 0; buf < MAX_FEEDBACK_BUFFERS; buf++)
+ prog->TransformFeedback.BufferStride[buf] = xfb_info->buffers[buf].stride;
+
+ prog->TransformFeedback.NumVarying = xfb_info->varying_count;
+ prog->TransformFeedback.VaryingNames =
+ malloc(sizeof(GLchar *) * xfb_info->varying_count);
+
linked_xfb->Outputs =
rzalloc_array(xfb_prog,
struct gl_transform_feedback_output,
- active_varyings.num_outputs);
- linked_xfb->NumOutputs = active_varyings.num_outputs;
+ xfb_info->output_count);
+ linked_xfb->NumOutputs = xfb_info->output_count;

linked_xfb->Varyings =
rzalloc_array(xfb_prog,
struct gl_transform_feedback_varying_info,
- active_varyings.num_varyings);
- linked_xfb->NumVarying = active_varyings.num_varyings;
+ xfb_info->varying_count);
+ linked_xfb->NumVarying = xfb_info->varying_count;
+
+ int buffer_index = 0; /* Corresponds to GL_TRANSFORM_FEEDBACK_BUFFER_INDEX */
+ int xfb_buffer =
+ (xfb_info->varying_count > 0) ?
+ xfb_info->outputs[0].buffer : 0;

- struct gl_transform_feedback_output *output = linked_xfb->Outputs;
- for (unsigned i = 0; i < active_varyings.num_varyings; i++) {
- struct nir_variable *var = active_varyings.varyings[i];
+ for (unsigned i = 0; i < xfb_info->varying_count; i++) {
+ nir_xfb_varying_info *xfb_varying = &xfb_info->varyings[i];

/* From ARB_gl_spirv spec:
*
@@ -277,23 +158,34 @@ gl_nir_link_assign_xfb_resources(struct gl_context *ctx,
*/
prog->TransformFeedback.VaryingNames[i] = NULL;

- unsigned varying_outputs = add_varying_outputs(var,
- var->type,
- 0, /* location_offset */
- 0, /* dest_offset */
- output);
- assert(varying_outputs == get_num_outputs(var));
- output = output + varying_outputs;
+ if (xfb_buffer != xfb_varying->buffer) {
+ buffer_index++;
+ xfb_buffer = xfb_varying->buffer;
+ }

struct gl_transform_feedback_varying_info *varying =
linked_xfb->Varyings + i;

/* ARB_gl_spirv: see above. */
varying->Name = NULL;
- varying->Type = glsl_get_gl_type(var->type);
- varying->BufferIndex = var->data.xfb_buffer;
- varying->Size = glsl_get_length(var->type);
- varying->Offset = var->data.offset;
+ varying->Type = glsl_get_gl_type(xfb_varying->type);
+ varying->BufferIndex = buffer_index;
+ varying->Size = glsl_get_length(xfb_varying->type);
+ varying->Offset = xfb_varying->offset;
+ }
+
+ for (unsigned i = 0; i < xfb_info->output_count; i++) {
+ nir_xfb_output_info *xfb_output = &xfb_info->outputs[i];
+
+ struct gl_transform_feedback_output *output =
+ linked_xfb->Outputs + i;
+
+ output->OutputRegister = xfb_output->location;
+ output->OutputBuffer = xfb_output->buffer;
+ output->NumComponents = count_bits(xfb_output->component_mask);
+ output->StreamId = xfb_info->buffer_to_stream[xfb_output->buffer];
+ output->DstOffset = xfb_output->offset / 4;
+ output->ComponentOffset = xfb_output->component_offset;
}

/* Make sure MaxTransformFeedbackBuffers is <= 32 so the bitmask for
@@ -303,14 +195,14 @@ gl_nir_link_assign_xfb_resources(struct gl_context *ctx,
assert(ctx->Const.MaxTransformFeedbackBuffers <= sizeof(buffers) * 8);

for (unsigned buf = 0; buf < MAX_FEEDBACK_BUFFERS; buf++) {
- if (active_varyings.buffers[buf].stride > 0) {
- linked_xfb->Buffers[buf].Stride = active_varyings.buffers[buf].stride / 4;
- linked_xfb->Buffers[buf].NumVaryings = active_varyings.buffers[buf].num_varyings;
+ if (xfb_info->buffers[buf].stride > 0) {
+ linked_xfb->Buffers[buf].Stride = xfb_info->buffers[buf].stride / 4;
+ linked_xfb->Buffers[buf].NumVaryings = xfb_info->buffers[buf].varying_count;
buffers |= 1 << buf;
}
}

linked_xfb->ActiveBuffers = buffers;

- free(active_varyings.varyings);
+ ralloc_free(xfb_info);
}

--
2.19.1