Discussion:
[Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation
Gert Wollny
2017-06-09 23:15:05 UTC
Dear all,

as I wrote before, I was looking into the temporary register renaming.

This series of patches implements a new approach that achieves a tighter
estimation of the lifetime of the temporaries, and as a result the Piano
and Volplosion benchmarks implemented in gputest [1] now work. Before,
they failed with "r600_pipe_shader_create - translation from TGSI failed!".
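
To make the idea of a lifetime estimate concrete, here is a minimal sketch. It
is not the code from this series. It assumes a flat instruction list in which
each instruction writes at most one temporary and reads at most one, and it
handles only loops: a temporary that is first written before a loop and read
inside it is kept alive until the end of that loop. The names Instr, Op and
estimate_lifetimes are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified instruction set for illustration only.
enum Op { OP_MOV, OP_BGNLOOP, OP_ENDLOOP };

struct Instr {
   Op op;
   int dst;  // index of the temporary written, or -1
   int src;  // index of the temporary read, or -1
};

// Returns (first write, last required read) per temporary; -1 means "never".
std::vector<std::pair<int, int>>
estimate_lifetimes(const std::vector<Instr>& prog, int ntemps)
{
   const int n = static_cast<int>(prog.size());

   // Pass 1: match every BGNLOOP with its ENDLOOP.
   std::vector<int> loop_end(n, -1);
   std::vector<int> open;
   for (int i = 0; i < n; ++i) {
      if (prog[i].op == OP_BGNLOOP) {
         open.push_back(i);
      } else if (prog[i].op == OP_ENDLOOP) {
         assert(!open.empty());
         loop_end[open.back()] = i;
         open.pop_back();
      }
   }

   // Pass 2: record first write and last read, widening reads to loop ends.
   std::vector<std::pair<int, int>> life(ntemps, std::make_pair(-1, -1));
   open.clear();
   for (int i = 0; i < n; ++i) {
      const Instr& in = prog[i];
      if (in.op == OP_BGNLOOP) { open.push_back(i); continue; }
      if (in.op == OP_ENDLOOP) { open.pop_back(); continue; }

      // Sources are read before the destination is written.
      if (in.src >= 0) {
         std::pair<int, int>& l = life[in.src];
         int last = i;
         // Outermost open loop that began after the first write: the value
         // must survive every iteration, so keep it until that loop ends.
         for (int begin : open) {
            if (begin > l.first) { last = loop_end[begin]; break; }
         }
         l.second = std::max(l.second, last);
      }
      if (in.dst >= 0 && life[in.dst].first < 0)
         life[in.dst].first = i;
   }
   return life;
}
```

For a program that writes t0 at instruction 0, opens a loop at 1, reads t0 at 2
and closes the loop at 4, this sketch reports the lifetime of t0 as (0, 4),
while a temporary written and read only inside the loop body keeps the short
interval between its write and its read.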

Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they don't
seem to be related to shaders. I've also tested other programs like the unigine-*
benchmarks and they didn't show regressions.

I think that the patch will need a few more iterations to remove code duplication
and to adhere to the Mesa style, but it is at the point where I could use some
feedback to get it into acceptable shape. I'd also like to mention that, since
I'm new to Mesa, I have no commit rights.

many thanks,
Gert

[1] http://www.geeks3d.com/gputest/

Gert Wollny (3):
mesa/st: glsl_to_tgsi move some helper classes to extra files
mesa/st: glsl_to_tgsi Implement a new lifetime tracker for temporaries
mesa/st: glsl_to_tgsi: tie in the new register renaming approach

configure.ac | 1 +
src/mesa/Makefile.am | 4 +-
src/mesa/Makefile.sources | 4 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 302 +-------
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 241 +++++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 135 ++++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 551 ++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 114 +++
src/mesa/state_tracker/tests/Makefile.am | 40 ++
src/mesa/state_tracker/tests/st-renumerate-test | 210 ++++++
.../tests/test_glsl_to_tgsi_lifetime.cpp | 789 +++++++++++++++++++++
11 files changed, 2104 insertions(+), 287 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100755 src/mesa/state_tracker/tests/st-renumerate-test
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
--
2.13.0
Gert Wollny
2017-06-09 23:15:06 UTC
To prepare the implementation of a temp register lifetime tracker,
some of the classes are moved into separate header/implementation
files to make them accessible from other files.
---
src/mesa/Makefile.sources | 2 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 287 +--------------------
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 241 +++++++++++++++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 135 ++++++++++
4 files changed, 381 insertions(+), 284 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 8a65fbe663..4450d80090 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -505,6 +505,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_nir.cpp \
state_tracker/st_glsl_to_tgsi.cpp \
state_tracker/st_glsl_to_tgsi.h \
+ state_tracker/st_glsl_to_tgsi_private.cpp \
+ state_tracker/st_glsl_to_tgsi_private.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index c5d2e0fcd2..0e7f4b646a 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,6 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
+#include "st_glsl_to_tgsi_private.h"

#include "util/hash_table.h"
#include <algorithm>
@@ -65,251 +66,8 @@

#define MAX_GLSL_TEXTURE_OFFSET 4

-class st_src_reg;
-class st_dst_reg;
+extern int swizzle_for_size(int size);

-static int swizzle_for_size(int size);
-
-static int swizzle_for_type(const glsl_type *type, int component = 0)
-{
- unsigned num_elements = 4;
-
- if (type) {
- type = type->without_array();
- if (type->is_scalar() || type->is_vector() || type->is_matrix())
- num_elements = type->vector_elements;
- }
-
- int swizzle = swizzle_for_size(num_elements);
- assert(num_elements + component <= 4);
-
- swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
- return swizzle;
-}
-
-/**
- * This struct is a corresponding struct to TGSI ureg_src.
- */
-class st_src_reg {
-public:
- st_src_reg(gl_register_file file, int index, const glsl_type *type,
- int component = 0, unsigned array_id = 0)
- {
- assert(file != PROGRAM_ARRAY || array_id != 0);
- this->file = file;
- this->index = index;
- this->swizzle = swizzle_for_type(type, component);
- this->negate = 0;
- this->abs = 0;
- this->index2D = 0;
- this->type = type ? type->base_type : GLSL_TYPE_ERROR;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = array_id;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = index2D;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->swizzle = 0;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- explicit st_src_reg(st_dst_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
- int negate:4; /**< NEGATE_XYZW mask from mesa */
- unsigned abs:1;
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- /*
- * Is this the second half of a double register pair?
- * currently used for input mapping only.
- */
- unsigned double_reg2:1;
- unsigned is_double_vertex_input:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-
- st_src_reg get_abs()
- {
- st_src_reg reg = *this;
- reg.negate = 0;
- reg.abs = 1;
- return reg;
- }
-};
-
-class st_dst_reg {
-public:
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = 0;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->writemask = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->array_id = 0;
- }
-
- explicit st_dst_reg(st_src_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-};
-
-st_src_reg::st_src_reg(st_dst_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->double_reg2 = false;
- this->array_id = reg.array_id;
- this->is_double_vertex_input = false;
-}
-
-st_dst_reg::st_dst_reg(st_src_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->writemask = WRITEMASK_XYZW;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->array_id = reg.array_id;
-}
-
-class glsl_to_tgsi_instruction : public exec_node {
-public:
- DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
-
- st_dst_reg dst[2];
- st_src_reg src[4];
- st_src_reg resource; /**< sampler or buffer register */
- st_src_reg *tex_offsets;
-
- /** Pointer to the ir source this tree came from for debugging */
- ir_instruction *ir;
-
- unsigned op:8; /**< TGSI opcode */
- unsigned saturate:1;
- unsigned is_64bit_expanded:1;
- unsigned sampler_base:5;
- unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
- unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
- glsl_base_type tex_type:5;
- unsigned tex_shadow:1;
- unsigned image_format:9;
- unsigned tex_offset_num_offset:3;
- unsigned dead_mask:4; /**< Used in dead code elimination */
- unsigned buffer_access:3; /**< buffer access type */
-
- const struct tgsi_opcode_info *info;
-};

class variable_storage {
DECLARE_RZALLOC_CXX_OPERATORS(variable_storage)
@@ -390,11 +148,6 @@ find_array_type(struct inout_decl *decls, unsigned count, unsigned array_id)
return GLSL_TYPE_ERROR;
}

-struct rename_reg_pair {
- bool valid;
- int new_reg;
-};
-
struct glsl_to_tgsi_visitor : public ir_visitor {
public:
glsl_to_tgsi_visitor();
@@ -597,7 +350,7 @@ fail_link(struct gl_shader_program *prog, const char *fmt, ...)
prog->data->LinkStatus = linking_failure;
}

-static int
+int
swizzle_for_size(int size)
{
static const int size_swizzles[4] = {
@@ -611,40 +364,6 @@ swizzle_for_size(int size)
return size_swizzles[size - 1];
}

-static bool
-is_resource_instruction(unsigned opcode)
-{
- switch (opcode) {
- case TGSI_OPCODE_RESQ:
- case TGSI_OPCODE_LOAD:
- case TGSI_OPCODE_ATOMUADD:
- case TGSI_OPCODE_ATOMXCHG:
- case TGSI_OPCODE_ATOMCAS:
- case TGSI_OPCODE_ATOMAND:
- case TGSI_OPCODE_ATOMOR:
- case TGSI_OPCODE_ATOMXOR:
- case TGSI_OPCODE_ATOMUMIN:
- case TGSI_OPCODE_ATOMUMAX:
- case TGSI_OPCODE_ATOMIMIN:
- case TGSI_OPCODE_ATOMIMAX:
- return true;
- default:
- return false;
- }
-}
-
-static unsigned
-num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->num_dst;
-}
-
-static unsigned
-num_inst_src_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->is_tex || is_resource_instruction(op->op) ?
- op->info->num_src - 1 : op->info->num_src;
-}

glsl_to_tgsi_instruction *
glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
new file mode 100644
index 0000000000..337f21cf79
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
@@ -0,0 +1,241 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+
+using std::vector;
+
+extern int swizzle_for_size(int size);
+
+static int swizzle_for_type(const glsl_type *type, int component = 0)
+{
+ unsigned num_elements = 4;
+
+ if (type) {
+ type = type->without_array();
+ if (type->is_scalar() || type->is_vector() || type->is_matrix())
+ num_elements = type->vector_elements;
+ }
+
+ int swizzle = swizzle_for_size(num_elements);
+ assert(num_elements + component <= 4);
+
+ swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
+ return swizzle;
+}
+
+
+
+st_src_reg::st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component, unsigned array_id)
+{
+ assert(file != PROGRAM_ARRAY || array_id != 0);
+ this->file = file;
+ this->index = index;
+ this->swizzle = swizzle_for_type(type, component);
+ this->negate = 0;
+ this->abs = 0;
+ this->index2D = 0;
+ this->type = type ? type->base_type : GLSL_TYPE_ERROR;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = index2D;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->swizzle = 0;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+
+st_src_reg st_src_reg::get_abs()
+{
+ st_src_reg reg = *this;
+ reg.negate = 0;
+ reg.abs = 1;
+ return reg;
+}
+
+st_src_reg::st_src_reg(st_dst_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->double_reg2 = false;
+ this->array_id = reg.array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_dst_reg::st_dst_reg(st_src_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->writemask = WRITEMASK_XYZW;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->array_id = reg.array_id;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+st_dst_reg::st_dst_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->array_id = 0;
+}
+
+bool
+is_resource_instruction(unsigned opcode)
+{
+ switch (opcode) {
+ case TGSI_OPCODE_RESQ:
+ case TGSI_OPCODE_LOAD:
+ case TGSI_OPCODE_ATOMUADD:
+ case TGSI_OPCODE_ATOMXCHG:
+ case TGSI_OPCODE_ATOMCAS:
+ case TGSI_OPCODE_ATOMAND:
+ case TGSI_OPCODE_ATOMOR:
+ case TGSI_OPCODE_ATOMXOR:
+ case TGSI_OPCODE_ATOMUMIN:
+ case TGSI_OPCODE_ATOMUMAX:
+ case TGSI_OPCODE_ATOMIMIN:
+ case TGSI_OPCODE_ATOMIMAX:
+ return true;
+ default:
+ return false;
+ }
+}
+
+unsigned
+num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->num_dst;
+}
+
+unsigned
+num_inst_src_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->is_tex || is_resource_instruction(op->op) ?
+ op->info->num_src - 1 : op->info->num_src;
+}
+
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.h b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
new file mode 100644
index 0000000000..59697badff
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
@@ -0,0 +1,135 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <mesa/main/mtypes.h>
+#include <compiler/glsl_types.h>
+#include <compiler/glsl/ir.h>
+#include <tgsi/tgsi_info.h>
+#include <stack>
+#include <vector>
+
+class st_dst_reg;
+
+/**
+ * This struct is a corresponding struct to TGSI ureg_src.
+ */
+class st_src_reg {
+public:
+ st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component = 0, unsigned array_id = 0);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D);
+
+ st_src_reg();
+
+ explicit st_src_reg(st_dst_reg reg);
+
+ st_src_reg get_abs();
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+
+ uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
+ int negate:4; /**< NEGATE_XYZW mask from mesa */
+ unsigned abs:1;
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ /*
+ * Is this the second half of a double register pair?
+ * currently used for input mapping only.
+ */
+ unsigned double_reg2:1;
+ unsigned is_double_vertex_input:1;
+ unsigned array_id:10;
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+
+};
+
+class st_dst_reg {
+public:
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index);
+
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type);
+
+ st_dst_reg();
+
+ explicit st_dst_reg(st_src_reg reg);
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ unsigned array_id:10;
+
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+};
+
+class glsl_to_tgsi_instruction : public exec_node {
+public:
+ DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
+
+ st_dst_reg dst[2];
+ st_src_reg src[4];
+ st_src_reg resource; /**< sampler or buffer register */
+ st_src_reg *tex_offsets;
+
+ /** Pointer to the ir source this tree came from for debugging */
+ ir_instruction *ir;
+
+ unsigned op:8; /**< TGSI opcode */
+ unsigned saturate:1;
+ unsigned is_64bit_expanded:1;
+ unsigned sampler_base:5;
+ unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
+ unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
+ glsl_base_type tex_type:5;
+ unsigned tex_shadow:1;
+ unsigned image_format:9;
+ unsigned tex_offset_num_offset:3;
+ unsigned dead_mask:4; /**< Used in dead code elimination */
+ unsigned buffer_access:3; /**< buffer access type */
+
+ const struct tgsi_opcode_info *info;
+};
+
+struct rename_reg_pair {
+ bool valid;
+ int new_reg;
+};
+
+extern bool is_resource_instruction(unsigned opcode);
+extern unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
+extern unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
+
+
--
2.13.0
Gert Wollny
2017-06-09 23:15:07 UTC
This patch adds new classes and tests to implement a tracker for the
lifetime of temporary registers for the register renaming stage of
glsl_to_tgsi. The tracker aims at estimating the shortest possible
lifetime for each register. The code requires C++11; the flag is
propagated from LLVM_CXXFLAGS.
---
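
As a rough illustration of why shorter lifetimes help the renaming stage, the
following sketch merges temporaries whose lifetimes do not overlap. It is not
taken from this series: merge_by_lifetime is a hypothetical helper, and the
greedy strategy and the strict "ended before it begins" test are assumptions
made for the example. Only rename_reg_pair mirrors the struct from patch 1.

```cpp
#include <algorithm>
#include <numeric>
#include <utility>
#include <vector>

// Same layout as the struct moved to st_glsl_to_tgsi_private.h in patch 1.
struct rename_reg_pair {
   bool valid;
   int new_reg;
};

// Hypothetical helper: lifetimes[i] is (begin, end) of temporary i,
// with begin < 0 meaning the temporary is unused.
std::vector<rename_reg_pair>
merge_by_lifetime(const std::vector<std::pair<int, int>>& lifetimes)
{
   std::vector<rename_reg_pair> result(lifetimes.size(),
                                       rename_reg_pair{false, 0});

   // Visit the temporaries in order of increasing lifetime begin.
   std::vector<int> order(lifetimes.size());
   std::iota(order.begin(), order.end(), 0);
   std::stable_sort(order.begin(), order.end(), [&](int a, int b) {
      return lifetimes[a].first < lifetimes[b].first;
   });

   // (end of current occupant, index of the register that is kept)
   std::vector<std::pair<int, int>> kept;

   for (int idx : order) {
      const int begin = lifetimes[idx].first;
      const int end = lifetimes[idx].second;
      if (begin < 0)
         continue;  // unused temporary, nothing to rename

      // Reuse a kept register whose occupant ended strictly before 'begin'.
      auto it = std::find_if(kept.begin(), kept.end(),
                             [begin](const std::pair<int, int>& k) {
                                return k.first < begin;
                             });
      if (it != kept.end()) {
         result[idx] = rename_reg_pair{true, it->second};
         it->first = end;
      } else {
         kept.push_back(std::make_pair(end, idx));
      }
   }
   return result;
}
```

With lifetimes (0,4), (2,3), (5,7) and (4,6), the sketch keeps temporaries 0 and
1, renames temporary 2 to 0 and temporary 3 to 1, so four temporaries fit into
two registers. A looser estimate, for example one that extends every lifetime
to the end of the shader, would leave nothing to merge.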
configure.ac | 1 +
src/mesa/Makefile.am | 4 +-
src/mesa/Makefile.sources | 2 +
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 551 ++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 114 +++
src/mesa/state_tracker/tests/Makefile.am | 40 ++
src/mesa/state_tracker/tests/st-renumerate-test | 210 ++++++
.../tests/test_glsl_to_tgsi_lifetime.cpp | 789 +++++++++++++++++++++
8 files changed, 1709 insertions(+), 2 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100755 src/mesa/state_tracker/tests/st-renumerate-test
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp

diff --git a/configure.ac b/configure.ac
index f379ba8573..579e159420 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2827,6 +2827,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/Makefile])
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..72ffd61212 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.

-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests

if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
@@ -101,7 +101,7 @@ AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
$(MSVC2013_COMPAT_CFLAGS)
AM_CXXFLAGS = \
- $(LLVM_CFLAGS) \
+ $(LLVM_CXXFLAGS) \
$(VISIBILITY_CXXFLAGS) \
$(MSVC2013_COMPAT_CXXFLAGS)

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 4450d80090..908d1acff6 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -507,6 +507,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+ state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
new file mode 100644
index 0000000000..389a4b6b5f
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -0,0 +1,551 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+
+#include "st_glsl_to_tgsi_temprename.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+#include <stack>
+#include <limits>
+#include <iostream>
+
+
+using std::vector;
+using std::stack;
+using std::shared_ptr;
+using std::weak_ptr;
+using std::pair;
+using std::make_pair;
+using std::make_shared;
+using std::numeric_limits;
+
+tgsi_temp_lifetime::tgsi_temp_lifetime(exec_list *instructions, int ntemps):
+ lifetimes(ntemps)
+{
+ evaluate(instructions);
+}
+
+const std::vector<std::pair<int, int> >& tgsi_temp_lifetime::get_lifetimes() const
+{
+ return lifetimes;
+}
+
+void tgsi_temp_lifetime::evaluate(exec_list *instructions)
+{
+ int i = 0;
+ int loop_id = 0;
+ int if_id = 0;
+ int switch_id = 0;
+ int scope_level = 0;
+ bool is_at_end = false;
+ shared_ptr<prog_scope> current = make_shared<prog_scope>(sct_outer, 0, scope_level++, i);
+ stack<shared_ptr<prog_scope>> scope_stack;
+
+ vector<temp_access> acc(lifetimes.size());
+
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (is_at_end) {
+ std::cerr << "GLSL_TO_TGSI: shader has instructions past end marker\n";
+ break;
+ }
+
+ switch (inst->op) {
+ case TGSI_OPCODE_BGNLOOP: {
+ shared_ptr<prog_scope> scope = make_shared<prog_scope>(current, sct_loop, loop_id, scope_level, i);
+ ++loop_id;
+ ++scope_level;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ENDLOOP: {
+ --scope_level;
+ current->set_end(i);
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+ case TGSI_OPCODE_IF:
+ case TGSI_OPCODE_UIF:{
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].append(i, acc_read, current);
+ }
+ }
+ shared_ptr<prog_scope> scope = make_shared<prog_scope>(current, sct_if, if_id, scope_level, i);
+ ++if_id;
+ ++scope_level;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ELSE: {
+ current->set_end(i-1);
+ current = make_shared<prog_scope>(current->parent(), sct_else, current->id(),
+ current->level(), i);
+ break;
+ }
+ case TGSI_OPCODE_END:{
+ current->set_end(i);
+ is_at_end = true;
+ break;
+ }
+ case TGSI_OPCODE_ENDIF:{
+ --scope_level;
+ current->set_end(i-1);
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+ case TGSI_OPCODE_SWITCH: {
+ shared_ptr<prog_scope> scope = make_shared<prog_scope>(current, sct_switch, switch_id, scope_level, i);
+ ++scope_level;
+ ++switch_id;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ENDSWITCH: {
+ --scope_level;
+ current->set_end(i-1);
+
+ // remove the case level
+ if (current->type() != sct_switch ) {
+ current = scope_stack.top();
+ scope_stack.pop();
+ }
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+
+ case TGSI_OPCODE_CASE:
+ case TGSI_OPCODE_DEFAULT: {
+ if ( current->type() == sct_switch ) {
+ scope_stack.push(current);
+ current = make_shared<prog_scope>(current, sct_case, current->id(), scope_level, i);
+ }else{
+ auto p = current->parent();
+ auto scope = make_shared<prog_scope>(p, sct_case, p->id(), p->level(), i);
+ if (current->end() == -1)
+ scope->set_previous(current);
+ current = scope;
+ }
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].append(i, acc_read, current);
+ }
+ }
+ }
+ case TGSI_OPCODE_BRK: {
+ if ( current->type() == sct_case) {
+ current->set_end(i-1);
+ }
+ }
+ case TGSI_OPCODE_CONT: {
+ current->set_continue(current, i);
+ break;
+ }
+
+ default: {
+
+ for (unsigned j = 0; j < num_inst_dst_regs(inst); j++) {
+ if (inst->dst[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->dst[j].index].append(i, acc_write, current);
+ }
+ }
+
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].append(i, acc_read, current);
+ }
+ }
+
+ for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->tex_offsets[j].index].append(i, acc_read, current);
+ }
+ }
+
+ } // end default
+ } // end switch (op)
+
+ ++i;
+ }
+
+ // make sure last scope is closed, even though no
+ // TGSI_OPCODE_END was given
+ if (current->end() < 0) {
+ current->set_end(i-1);
+ }
+
+ for(unsigned i = 1; i < lifetimes.size(); ++i) {
+ lifetimes[i] = acc[i].get_required_lifetime();
+ }
+}
+
+
+tgsi_temp_lifetime::prog_scope::prog_scope(e_scope_type type, int id, int lvl, int s_begin):
+ prog_scope(std::shared_ptr<prog_scope>(), type, id, lvl, s_begin)
+{
+}
+
+tgsi_temp_lifetime::prog_scope::prog_scope(std::shared_ptr<prog_scope> p, e_scope_type type, int id, int lvl, int s_begin):
+ scope_type(type),
+ scope_id(id),
+ nested_level(lvl),
+ scope_begin(s_begin),
+ scope_end(-1),
+ loop_continue(numeric_limits<int>::max()),
+ parent_scope(p)
+{
+}
+
+tgsi_temp_lifetime::e_scope_type tgsi_temp_lifetime::prog_scope::type() const
+{
+ return scope_type;
+}
+
+
+std::shared_ptr<tgsi_temp_lifetime::prog_scope>
+tgsi_temp_lifetime::prog_scope::parent() const
+{
+ return parent_scope;
+}
+
+int tgsi_temp_lifetime::prog_scope::level() const
+{
+ return nested_level;
+}
+
+bool tgsi_temp_lifetime::prog_scope::in_loop() const
+{
+ if (scope_type == sct_loop)
+ return true;
+ if (parent_scope)
+ return parent_scope->in_loop();
+ return false;
+}
+
+const tgsi_temp_lifetime::prog_scope *
+tgsi_temp_lifetime::prog_scope::get_parent_loop() const
+{
+ if (scope_type == sct_loop)
+ return this;
+ if (parent_scope)
+ return parent_scope->get_parent_loop();
+ else
+ return nullptr;
+}
+
+bool tgsi_temp_lifetime::prog_scope::contains(const prog_scope& other) const
+{
+ return (begin() <= other.begin()) && (end() >= other.end());
+}
+
+bool tgsi_temp_lifetime::prog_scope::is_conditional() const
+{
+ return scope_type == sct_if || scope_type == sct_else ||
+ scope_type == sct_case;
+}
+
+const tgsi_temp_lifetime::prog_scope *
+tgsi_temp_lifetime::prog_scope::in_ifelse() const
+{
+ if (scope_type == sct_if ||
+ scope_type == sct_else)
+ return this;
+ else if (parent_scope)
+ return parent_scope->in_ifelse();
+ else
+ return nullptr;
+}
+
+const tgsi_temp_lifetime::prog_scope *
+tgsi_temp_lifetime::prog_scope::in_switchcase() const
+{
+ if (scope_type == sct_case)
+ return this;
+ else if (parent_scope)
+ return parent_scope->in_switchcase();
+ else
+ return nullptr;
+}
+
+int tgsi_temp_lifetime::prog_scope::id() const
+{
+ return scope_id;
+}
+
+int tgsi_temp_lifetime::prog_scope::begin() const
+{
+ return scope_begin;
+}
+
+int tgsi_temp_lifetime::prog_scope::end() const
+{
+ return scope_end;
+}
+
+void tgsi_temp_lifetime::prog_scope::set_previous(std::shared_ptr<prog_scope> prev)
+{
+ previous = prev;
+}
+
+void tgsi_temp_lifetime::prog_scope::set_end(int end)
+{
+ if (scope_end == -1) {
+ scope_end = end;
+ if (previous)
+ previous->set_end(end);
+ }
+}
+
+void tgsi_temp_lifetime::prog_scope::set_continue(weak_ptr<prog_scope> scope, int i)
+{
+ if (scope_type == sct_loop) {
+ loop_to_continue_scope = scope;
+ loop_continue = i;
+ } else if (parent_scope)
+ parent()->set_continue(scope, i);
+}
+
+int tgsi_temp_lifetime::prog_scope::loop_continue_idx() const
+{
+ return loop_continue;
+}
+
+void tgsi_temp_lifetime::temp_access::append(int index, e_acc_type rw, std::shared_ptr<prog_scope> pstate)
+{
+ temp_access_record r = {index, rw, pstate};
+ timeline.push_back(r);
+}
+
+pair<int, int> tgsi_temp_lifetime::temp_access::get_required_lifetime() const
+{
+ bool keep_for_full_loop = false;
+
+ std::shared_ptr<prog_scope> lr_scope;
+ std::shared_ptr<prog_scope> fr_scope;
+ std::shared_ptr<prog_scope> fw_scope;
+ const prog_scope *fw_ifthen_scope = nullptr;
+ const prog_scope *fw_switch_scope = nullptr;
+
+ int first_write = -1;
+ int last_read = -1;
+ int first_read = numeric_limits<int>::max();
+
+ for (temp_access_record i: timeline) {
+ if (i.acc == acc_read) {
+ last_read = i.index;
+ lr_scope = i.pstate;
+ if (first_read > i.index) {
+ first_read = i.index;
+ fr_scope = i.pstate;
+ }
+ }else{
+ if (first_write == -1) {
+ first_write = i.index;
+ fw_scope = i.pstate;
+
+ // we write in an if-branch
+ fw_ifthen_scope = i.pstate->in_ifelse();
+ if (fw_ifthen_scope && fw_ifthen_scope->in_loop()) {
+ // value not always written, in loops we must keep it
+ keep_for_full_loop = true;
+ } else {
+ // test for switch-case
+ fw_switch_scope = i.pstate->in_switchcase();
+
+ if (fw_switch_scope && fw_switch_scope->in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
+ } else if (keep_for_full_loop) {
+
+ // Written in an if branch? This check is disabled because
+ // a read that comes first in the else branch would make it
+ // invalid, and that case is not (yet) tracked.
+ if (0 && fw_ifthen_scope) {
+ auto s = i.pstate->in_ifelse();
+ // also written in the else branch?
+ if (s && (s->id() == fw_ifthen_scope->id())) {
+ keep_for_full_loop = false;
+ }
+ }
+ }
+ }
+ }
+
+ // This temp is only read, which is undefined behaviour,
+ // so the register is free to be used for something else.
+ if (!fw_scope) {
+ return make_pair(-1, -1);
+ }
+
+ // Only written to, just make sure it doesn't overlap
+ if (!lr_scope) {
+ return make_pair(first_write, first_write);
+ }
+
+ int target_level = -1;
+ // evaluate the shared scope
+ while (target_level < 0) {
+ if (lr_scope->contains(*fw_scope)) {
+ target_level = lr_scope->level();
+ } else if (fw_scope->contains(*lr_scope)) {
+ target_level = fw_scope->level();
+ } else {
+ // scopes are (partially) disjoint, move up
+ if (lr_scope->type() == sct_loop) {
+ last_read = lr_scope->end();
+ }
+ lr_scope = lr_scope->parent();
+ }
+ }
+
+ // propagate the read scope to the target_level
+ while (lr_scope->level() > target_level) {
+ // if the read is in a loop we need to extend the
+ // variable's lifetime to the end of that loop
+ if (lr_scope->type() == sct_loop) {
+ last_read = lr_scope->end();
+ }
+ lr_scope = lr_scope->parent();
+ }
+
+
+ // propagate the first_write scope to the target_level
+ bool has_continue = false;
+ while (target_level < fw_scope->level()) {
+
+ // propagate lifetime also if there was a continue
+ // in a loop and the write was after the continue
+ if (has_continue || (fw_scope->loop_continue_idx() < first_write)) {
+ has_continue = true;
+ first_write = fw_scope->begin();
+ int lr = fw_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ if ((fw_scope->type() == sct_loop) && (first_read < first_write)) {
+ first_write = fw_scope->begin();
+ int lr = fw_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ fw_scope = fw_scope->parent();
+
+ // if the value is conditionally written in a loop
+ // then propagate its lifetime to the full loop
+ if (fw_scope->type() == sct_loop) {
+ if (keep_for_full_loop || (first_read < first_write)) {
+ first_write = fw_scope->begin();
+ int lr = fw_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+ }
+
+ // If we currently don't propagate the lifetime, but the
+ // enclosing scope is a conditional within a loop, we need to
+ // propagate up to the last-read level.
+ // TODO: to tighten the lifetime, check whether the value is
+ // written in all conditional code paths below the loop.
+ if (!keep_for_full_loop &&
+ fw_scope->is_conditional() &&
+ fw_scope->in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
+
+
+
+
+ // same level and same range means it is first
+ // written and last read in the same scope
+ // ignore the case when first read is before
+ // first write, because it is undefined behaviour
+ if ((lr_scope->begin() == fw_scope->begin()) &&
+ (lr_scope->end() == fw_scope->end())) {
+ return make_pair(first_write, last_read);
+ }
+
+ // different scopes,
+ if (!keep_for_full_loop && first_read > first_write) {
+ return make_pair(first_write, last_read);
+ }else{
+ // 1. if the value is not always written in a loop
+ // it must be kept for the whole loop scope.
+ //
+ // 2. if the value is read (maybe conditionally)
+ // before it is written first it also must be
+ // kept valid for the whole loop
+ auto enclosing_loop = lr_scope->get_parent_loop();
+ assert(enclosing_loop);
+ return make_pair(enclosing_loop->begin(), enclosing_loop->end());
+ }
+}
+
+
+void evaluate_remapping(const vector<std::pair<int, int>>& lt, std::vector<rename_reg_pair>& result)
+{
+
+ struct out_access_record {
+ int end;
+ unsigned reg;
+ bool tested;
+ };
+
+ std::multimap<int, out_access_record> m;
+ for (unsigned i = 1; i < lt.size(); ++i) {
+ out_access_record oar = {lt[i].second, i, false};
+ m.insert(make_pair(lt[i].first, oar));
+ }
+
+ while (true) {
+ auto trgt = m.begin();
+
+ while ( trgt != m.end() && trgt->second.tested)
+ ++trgt;
+
+ if (trgt == m.end())
+ break;
+
+ while (trgt != m.end()) {
+ auto src = m.upper_bound(trgt->second.end - 1);
+ if (src == m.end()) {
+ trgt->second.tested = true;
+ } else {
+ result[src->second.reg].new_reg = trgt->second.reg;
+ result[src->second.reg].valid = true;
+ trgt->second.end = src->second.end;
+ m.erase(src);
+ break;
+ }
+ ++trgt;
+ }
+ }
+}
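For readers following the thread, the greedy merge that evaluate_remapping()
performs can be sketched in isolation. This is a minimal stand-alone
illustration and not the code from the patch: merge_lifetimes() and its plain
vector<int> result are hypothetical names, it scans a sorted index list instead
of a std::multimap, and it assumes that a lifetime of (-1, -1) marks an unused
temporary that should be left alone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: returns rename[i], the register that temporary i
// should use after merging. rename[i] == i means "keep register i".
std::vector<int>
merge_lifetimes(const std::vector<std::pair<int, int>>& lt)
{
   std::vector<int> rename(lt.size());
   std::iota(rename.begin(), rename.end(), 0);

   // Collect the used temporaries (index 0 is a placeholder, as in the
   // patch) and order them by the start of their lifetime.
   std::vector<unsigned> order;
   for (unsigned i = 1; i < lt.size(); ++i)
      if (lt[i].first >= 0)   // assumption: (-1, -1) means "unused"
         order.push_back(i);
   std::stable_sort(order.begin(), order.end(),
                    [&lt](unsigned a, unsigned b) {
                       return lt[a].first < lt[b].first;
                    });

   std::vector<bool> merged(lt.size(), false);
   for (std::size_t t = 0; t < order.size(); ++t) {
      const unsigned trgt = order[t];
      if (merged[trgt])
         continue;
      int trgt_end = lt[trgt].second;

      // Any later register whose lifetime starts at or after the end of
      // the target's lifetime can reuse the target; the target then
      // inherits that register's end.
      for (std::size_t s = t + 1; s < order.size(); ++s) {
         const unsigned src = order[s];
         if (merged[src] || lt[src].first < trgt_end)
            continue;
         rename[src] = static_cast<int>(trgt);
         merged[src] = true;
         trgt_end = lt[src].second;
      }
   }
   return rename;
}
```

With the lifetimes expected by the SimpleMoveInLoop test, (0,5), (2,3) and
(3,6), this sketch maps temporary 3 onto temporary 2 and leaves temporary 1
untouched.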
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
new file mode 100644
index 0000000000..e764f0f0c2
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -0,0 +1,114 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+#include <memory>
+#include <map>
+
+class tgsi_temp_lifetime {
+public:
+
+ enum e_scope_type {
+ sct_outer,
+ sct_loop,
+ sct_if,
+ sct_else,
+ sct_switch,
+ sct_case,
+ sct_unknown
+ };
+
+ enum e_acc_type {
+ acc_read,
+ acc_write
+ };
+
+ class prog_scope {
+
+ public:
+ prog_scope(e_scope_type type, int id, int lvl, int s_begin);
+ prog_scope(std::shared_ptr<prog_scope> p, e_scope_type type, int id, int lvl, int s_begin);
+
+ e_scope_type type() const;
+ std::shared_ptr<prog_scope> parent() const;
+ int level() const;
+ int id() const;
+ int end() const;
+ int begin() const;
+ const prog_scope *in_ifelse() const;
+ const prog_scope *in_switchcase() const;
+ int loop_continue_idx() const;
+ bool in_loop() const;
+ const prog_scope *get_parent_loop() const;
+ bool is_conditional() const;
+ bool contains(const prog_scope& other) const;
+ void set_end(int end);
+ void set_previous(std::shared_ptr<prog_scope> prev);
+ void set_continue(std::weak_ptr<prog_scope> scope, int i);
+ private:
+ e_scope_type scope_type;
+ int scope_id;
+ int nested_level;
+ int scope_begin;
+ int scope_end;
+ int loop_continue;
+ std::weak_ptr<prog_scope> loop_to_continue_scope;
+ std::shared_ptr<prog_scope> previous;
+ std::shared_ptr<prog_scope> parent_scope;
+
+ };
+
+ class temp_access {
+
+ public:
+
+ void append(int index, e_acc_type rw, std::shared_ptr<prog_scope> pstate);
+ std::pair<int, int> get_required_lifetime() const;
+
+ private:
+
+ struct temp_access_record {
+ int index;
+ e_acc_type acc;
+ std::shared_ptr<prog_scope> pstate;
+ };
+
+ std::vector< temp_access_record > timeline;
+
+ };
+
+ tgsi_temp_lifetime(exec_list *instructions, int ntemps);
+
+ const std::vector<std::pair<int, int> >& get_lifetimes() const;
+
+
+private:
+
+ void evaluate(exec_list *instructions);
+
+ std::vector<std::pair<int, int> > lifetimes;
+};
+
+void evaluate_remapping(const std::vector<std::pair<int, int>>& lt, std::vector<rename_reg_pair>& result);
+
+
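The scope queries declared in this header (in_loop(), get_parent_loop(),
is_conditional()) mainly serve one rule: a temporary that is written only
conditionally inside a loop may carry its value over from an earlier
iteration, so it has to stay alive for the whole loop. The following is a
heavily simplified sketch of that single rule, assuming one write and one read
per temporary. The scope struct, enclosing_loop() and required_lifetime() are
hypothetical names for illustration and leave out continue handling, switch
scopes, and read-before-write detection.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

enum scope_type { sct_outer, sct_loop, sct_if };

// Hypothetical, stripped-down stand-in for prog_scope.
struct scope {
   scope_type type;
   int begin;
   int end;
   const scope *parent;

   // Innermost enclosing loop (possibly this scope itself), or nullptr.
   const scope *enclosing_loop() const {
      for (const scope *s = this; s; s = s->parent)
         if (s->type == sct_loop)
            return s;
      return nullptr;
   }

   bool is_conditional() const { return type == sct_if; }
};

// Lifetime of a temporary written once at instruction `write` in scope `w`
// and read for the last time at instruction `read` in scope `r`.
std::pair<int, int>
required_lifetime(int write, const scope& w, int read, const scope& r)
{
   // Written and read in the very same scope: the tight range is enough.
   if (&w == &r)
      return {write, read};

   const scope *loop = w.enclosing_loop();
   if (w.is_conditional() && loop) {
      // Conditionally written inside a loop: keep it for the full loop,
      // and longer if the last read comes after the loop.
      return {loop->begin, std::max(read, loop->end)};
   }
   return {write, read};
}
```

Applied to temporary 2 of the MoveInIfInLoop test (written at instruction 3
inside the IF, read at instruction 5 in the loop body spanning 1 to 7), the
sketch gives the range (1, 7), which is what that test expects.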
diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 0000000000..a2bcad8dde
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,40 @@
+AM_CFLAGS = \
+ $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+ $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+ -I$(top_srcdir)/src/gtest/include \
+ -I$(top_srcdir)/src \
+ -I$(top_srcdir)/src/mapi \
+ -I$(top_builddir)/src/mesa \
+ -I$(top_srcdir)/src/mesa \
+ -I$(top_srcdir)/include \
+ -I$(top_srcdir)/src/gallium/include \
+ -I$(top_srcdir)/src/gallium/auxiliary \
+ $(DEFINES) $(INCLUDE_DIRS)
+
+TESTS = st-renumerate-test
+check_PROGRAMS = st-renumerate-test
+
+st_renumerate_test_SOURCES = \
+ test_glsl_to_tgsi_lifetime.cpp
+
+st_renumerate_test_LDFLAGS = \
+ $(LLVM_LDFLAGS)
+
+st_renumerate_test_LDADD = \
+ $(top_builddir)/src/mesa/libmesagallium.la \
+ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \
+ $(top_builddir)/src/gallium/auxiliary/libgallium.la \
+ $(top_builddir)/src/util/libmesautil.la \
+ $(top_builddir)/src/gallium/drivers/trace/libtrace.la \
+ $(top_builddir)/src/gallium/winsys/sw/null/libws_null.la \
+ $(top_builddir)/src/gallium/drivers/softpipe/libsoftpipe.la \
+ $(top_builddir)/src/gtest/libgtest.la \
+ $(GALLIUM_COMMON_LIB_DEPS) \
+ $(LLVM_LIBS) \
+ $(PTHREAD_LIBS) \
+ $(DLOPEN_LIBS)
+
diff --git a/src/mesa/state_tracker/tests/st-renumerate-test b/src/mesa/state_tracker/tests/st-renumerate-test
new file mode 100755
index 0000000000..bc2f325899
--- /dev/null
+++ b/src/mesa/state_tracker/tests/st-renumerate-test
@@ -0,0 +1,210 @@
+#! /bin/sh
+
+# st-renumerate-test - temporary wrapper script for .libs/st-renumerate-test
+# Generated by libtool (GNU libtool) 2.4.6
+#
+# The st-renumerate-test program cannot be directly executed until all the libtool
+# libraries that it depends on are installed.
+#
+# This wrapper script should never be moved out of the build directory.
+# If it is, it will not operate correctly.
+
+# Sed substitution that helps us do robust quoting. It backslashifies
+# metacharacters that are still active within double-quoted strings.
+sed_quote_subst='s|\([`"$\\]\)|\\\1|g'
+
+# Be Bourne compatible
+if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
+ emulate sh
+ NULLCMD=:
+ # Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which
+ # is contrary to our usage. Disable this feature.
+ alias -g '${1+"$@"}'='"$@"'
+ setopt NO_GLOB_SUBST
+else
+ case `(set -o) 2>/dev/null` in *posix*) set -o posix;; esac
+fi
+BIN_SH=xpg4; export BIN_SH # for Tru64
+DUALCASE=1; export DUALCASE # for MKS sh
+
+# The HP-UX ksh and POSIX shell print the target directory to stdout
+# if CDPATH is set.
+(unset CDPATH) >/dev/null 2>&1 && unset CDPATH
+
+relink_command=""
+
+# This environment variable determines our operation mode.
+if test "$libtool_install_magic" = "%%%MAGIC variable%%%"; then
+ # install mode needs the following variables:
+ generated_by_libtool_version='2.4.6'
+ notinst_deplibs=' ../../../../src/mapi/shared-glapi/libglapi.la'
+else
+ # When we are sourced in execute mode, $file and $ECHO are already set.
+ if test "$libtool_execute_magic" != "%%%MAGIC variable%%%"; then
+ file="$0"
+
+# A function that is used when there is no print builtin or printf.
+func_fallback_echo ()
+{
+ eval 'cat <<_LTECHO_EOF
+$1
+_LTECHO_EOF'
+}
+ ECHO="printf %s\\n"
+ fi
+
+# Very basic option parsing. These options are (a) specific to
+# the libtool wrapper, (b) are identical between the wrapper
+# /script/ and the wrapper /executable/ that is used only on
+# windows platforms, and (c) all begin with the string --lt-
+# (application programs are unlikely to have options that match
+# this pattern).
+#
+# There are only two supported options: --lt-debug and
+# --lt-dump-script. There is, deliberately, no --lt-help.
+#
+# The first argument to this parsing function should be the
+# script's ../../../../libtool value, followed by no.
+lt_option_debug=
+func_parse_lt_options ()
+{
+ lt_script_arg0=$0
+ shift
+ for lt_opt
+ do
+ case "$lt_opt" in
+ --lt-debug) lt_option_debug=1 ;;
+ --lt-dump-script)
+ lt_dump_D=`$ECHO "X$lt_script_arg0" | /bin/sed -e 's/^X//' -e 's%/[^/]*$%%'`
+ test "X$lt_dump_D" = "X$lt_script_arg0" && lt_dump_D=.
+ lt_dump_F=`$ECHO "X$lt_script_arg0" | /bin/sed -e 's/^X//' -e 's%^.*/%%'`
+ cat "$lt_dump_D/$lt_dump_F"
+ exit 0
+ ;;
+ --lt-*)
+ $ECHO "Unrecognized --lt- option: '$lt_opt'" 1>&2
+ exit 1
+ ;;
+ esac
+ done
+
+ # Print the debug banner immediately:
+ if test -n "$lt_option_debug"; then
+ echo "st-renumerate-test:st-renumerate-test:$LINENO: libtool wrapper (GNU libtool) 2.4.6" 1>&2
+ fi
+}
+
+# Used when --lt-debug. Prints its arguments to stdout
+# (redirection is the responsibility of the caller)
+func_lt_dump_args ()
+{
+ lt_dump_args_N=1;
+ for lt_arg
+ do
+ $ECHO "st-renumerate-test:st-renumerate-test:$LINENO: newargv[$lt_dump_args_N]: $lt_arg"
+ lt_dump_args_N=`expr $lt_dump_args_N + 1`
+ done
+}
+
+# Core function for launching the target application
+func_exec_program_core ()
+{
+
+ if test -n "$lt_option_debug"; then
+ $ECHO "st-renumerate-test:st-renumerate-test:$LINENO: newargv[0]: $progdir/$program" 1>&2
+ func_lt_dump_args ${1+"$@"} 1>&2
+ fi
+ exec "$progdir/$program" ${1+"$@"}
+
+ $ECHO "$0: cannot exec $program $*" 1>&2
+ exit 1
+}
+
+# A function to encapsulate launching the target application
+# Strips options in the --lt-* namespace from $@ and
+# launches target application with the remaining arguments.
+func_exec_program ()
+{
+ case " $* " in
+ *\ --lt-*)
+ for lt_wr_arg
+ do
+ case $lt_wr_arg in
+ --lt-*) ;;
+ *) set x "$@" "$lt_wr_arg"; shift;;
+ esac
+ shift
+ done ;;
+ esac
+ func_exec_program_core ${1+"$@"}
+}
+
+ # Parse options
+ func_parse_lt_options "$0" ${1+"$@"}
+
+ # Find the directory that this script lives in.
+ thisdir=`$ECHO "$file" | /bin/sed 's%/[^/]*$%%'`
+ test "x$thisdir" = "x$file" && thisdir=.
+
+ # Follow symbolic links until we get to the real thisdir.
+ file=`ls -ld "$file" | /bin/sed -n 's/.*-> //p'`
+ while test -n "$file"; do
+ destdir=`$ECHO "$file" | /bin/sed 's%/[^/]*$%%'`
+
+ # If there was a directory component, then change thisdir.
+ if test "x$destdir" != "x$file"; then
+ case "$destdir" in
+ [\\/]* | [A-Za-z]:[\\/]*) thisdir="$destdir" ;;
+ *) thisdir="$thisdir/$destdir" ;;
+ esac
+ fi
+
+ file=`$ECHO "$file" | /bin/sed 's%^.*/%%'`
+ file=`ls -ld "$thisdir/$file" | /bin/sed -n 's/.*-> //p'`
+ done
+
+ # Usually 'no', except on cygwin/mingw when embedded into
+ # the cwrapper.
+ WRAPPER_SCRIPT_BELONGS_IN_OBJDIR=no
+ if test "$WRAPPER_SCRIPT_BELONGS_IN_OBJDIR" = "yes"; then
+ # special case for '.'
+ if test "$thisdir" = "."; then
+ thisdir=`pwd`
+ fi
+ # remove .libs from thisdir
+ case "$thisdir" in
+ *[\\/].libs ) thisdir=`$ECHO "$thisdir" | /bin/sed 's%[\\/][^\\/]*$%%'` ;;
+ .libs ) thisdir=. ;;
+ esac
+ fi
+
+ # Try to get the absolute directory name.
+ absdir=`cd "$thisdir" && pwd`
+ test -n "$absdir" && thisdir="$absdir"
+
+ program='st-renumerate-test'
+ progdir="$thisdir/.libs"
+
+
+ if test -f "$progdir/$program"; then
+ # Add our own library path to LD_LIBRARY_PATH
+ LD_LIBRARY_PATH="/home/gerddie/src/Freedesktop/mesa-orig/src/mapi/shared-glapi/.libs:$LD_LIBRARY_PATH"
+
+ # Some systems cannot cope with colon-terminated LD_LIBRARY_PATH
+ # The second colon is a workaround for a bug in BeOS R4 sed
+ LD_LIBRARY_PATH=`$ECHO "$LD_LIBRARY_PATH" | /bin/sed 's/::*$//'`
+
+ export LD_LIBRARY_PATH
+
+ if test "$libtool_execute_magic" != "%%%MAGIC variable%%%"; then
+ # Run the actual program with our arguments.
+ func_exec_program ${1+"$@"}
+ fi
+ else
+ # The program doesn't exist.
+ $ECHO "$0: error: '$progdir/$program' does not exist" 1>&2
+ $ECHO "This script is just a wrapper for $program." 1>&2
+ $ECHO "See the libtool documentation for more information." 1>&2
+ exit 1
+ fi
+fi
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
new file mode 100644
index 0000000000..3e094f0dda
--- /dev/null
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -0,0 +1,789 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <state_tracker/st_glsl_to_tgsi_temprename.h>
+#include <tgsi/tgsi_ureg.h>
+#include <tgsi/tgsi_info.h>
+#include <compiler/glsl/list.h>
+#include <gtest/gtest.h>
+
+using std::vector;
+using std::pair;
+
+struct MockCodeline {
+ MockCodeline(unsigned _op): op(_op) {}
+ MockCodeline(unsigned _op, const vector<int>& _dst, const vector<int>& _src, const vector<int>&_to):
+ op(_op), dst(_dst), src(_src), tex_offsets(_to){}
+ unsigned op;
+ vector<int> dst;
+ vector<int> src;
+ vector<int> tex_offsets;
+};
+
+const int in0 = 0;
+const int in1 = -1;
+const int in2 = -2;
+
+const int out0 = 0;
+const int out1 = -1;
+
+
+
+/*
+ * supported opcodes for mock-shaders:
+ *
+ * Arithmetic
+ *
+ * TGSI_OPCODE_MOV, TGSI_OPCODE_UADD, TGSI_OPCODE_UMAD
+ *
+ * Flow control
+ * loops:
+ * TGSI_OPCODE_BGNLOOP, TGSI_OPCODE_ENDLOOP,
+ * TGSI_OPCODE_BRK, TGSI_OPCODE_CONT
+ *
+ * if/switch:
+ * TGSI_OPCODE_IF, TGSI_OPCODE_UIF, TGSI_OPCODE_ELSE, TGSI_OPCODE_ENDIF
+ * TGSI_OPCODE_SWITCH, TGSI_OPCODE_CASE, TGSI_OPCODE_DEFAULT
+ *
+ * TGSI_OPCODE_END
+ */
+
+
+class MockShader {
+public:
+ MockShader(const vector<MockCodeline>& source);
+ ~MockShader();
+
+ void free();
+
+ exec_list* get_program();
+ int get_num_temps();
+private:
+ st_src_reg create_src_register(int src_idx);
+ st_dst_reg create_dst_register(int dst_idx);
+ exec_list* program;
+ int num_temps;
+ void *mem_ctx;
+};
+
+using expectation = vector<vector<int>>;
+
+MockShader::~MockShader()
+{
+ free();
+ ralloc_free(mem_ctx);
+}
+
+int MockShader::get_num_temps()
+{
+ return num_temps;
+}
+
+
+exec_list* MockShader::get_program()
+{
+ return program;
+}
+
+MockShader::MockShader(const vector<MockCodeline>& source):
+ num_temps(0)
+{
+ mem_ctx = ralloc_context(NULL);
+
+ program = new(mem_ctx) exec_list();
+
+ for (MockCodeline i: source) {
+ glsl_to_tgsi_instruction *next_instr = new(mem_ctx) glsl_to_tgsi_instruction();
+ next_instr->op = i.op;
+ next_instr->info = tgsi_get_opcode_info(i.op);
+
+ assert(i.src.size() < 4);
+ assert(i.dst.size() < 3);
+ assert(i.tex_offsets.size() < 3);
+
+ for (unsigned k = 0; k < i.src.size(); ++k) {
+ next_instr->src[k] = create_src_register(i.src[k]);
+ }
+ for (unsigned k = 0; k < i.dst.size(); ++k) {
+ next_instr->dst[k] = create_dst_register(i.dst[k]);
+ }
+
+ // set texture registers
+ next_instr->tex_offset_num_offset = i.tex_offsets.size();
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+ next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+ }
+
+ program->push_tail(next_instr);
+ }
+ ++num_temps;
+}
+
+void MockShader::free()
+{
+ // the list is not fully initialized, so
+ // tearing it down also must be done manually.
+ exec_node *p;
+ while ((p = program->pop_head())) {
+ glsl_to_tgsi_instruction * instr = static_cast<glsl_to_tgsi_instruction *>(p);
+ if (instr->tex_offset_num_offset > 0)
+ delete[] instr->tex_offsets;
+ delete p;
+ }
+ program = 0;
+ num_temps = 0;
+}
+
+st_src_reg MockShader::create_src_register(int src_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (src_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = src_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_INPUT;
+ idx = -src_idx;
+ }
+ return st_src_reg(file, idx, GLSL_TYPE_INT);
+
+}
+
+st_dst_reg MockShader::create_dst_register(int dst_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (dst_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = dst_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_OUTPUT;
+ idx = - dst_idx;
+ }
+ return st_dst_reg(file, 0xF, GLSL_TYPE_INT, idx);
+}
+
+/**
+ This is a test class to check the exact lifetimes
+*/
+class LifetimeEvaluatorExactTest : public testing::Test {
+protected:
+ void run(const vector<MockCodeline>& code, const expectation& e);
+};
+
+/**
+ This is a test class to check that the lifetime is at least
+ in the expected range
+*/
+class LifetimeEvaluatorAtLeastTest : public testing::Test {
+protected:
+ void run(const vector<MockCodeline>& code, const expectation& e);
+};
+
+
+void LifetimeEvaluatorExactTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+
+ tgsi_temp_lifetime ana(shader.get_program(), shader.get_num_temps());
+ auto lifetimes = ana.get_lifetimes();
+
+ // lifetimes[0] not used, but created for simpler processing
+ ASSERT_EQ(lifetimes.size(), e.size());
+
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_EQ(lifetimes[i].first, e[i][0]);
+ EXPECT_EQ(lifetimes[i].second, e[i][1]);
+ }
+}
+
+void LifetimeEvaluatorAtLeastTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+
+ tgsi_temp_lifetime ana(shader.get_program(), shader.get_num_temps());
+ auto lifetimes = ana.get_lifetimes();
+
+ // lifetimes[0] not used, but created for simpler processing
+ ASSERT_EQ(lifetimes.size(), e.size());
+
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_LE(lifetimes[i].first, e[i][0]);
+ EXPECT_GE(lifetimes[i].second, e[i][1]);
+ }
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {1, in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMove)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}, {1,2}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {}, {1,2}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,2}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 0
+ { TGSI_OPCODE_BGNLOOP }, // 1
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}}, // 2
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}}, // 3
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}}, // 4
+ { TGSI_OPCODE_ENDLOOP }, // 5
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}}, // 6
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 5}, {2,3}, {3, 6}}));
+}
+
+
+// in loop if/else value written only in one path, and read later
+// - value must survive the whole loop
+TEST_F(LifetimeEvaluatorExactTest, MoveInIfInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 0
+ { TGSI_OPCODE_BGNLOOP }, // 1
+ { TGSI_OPCODE_IF}, // 2
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}}, // 3
+ { TGSI_OPCODE_ENDIF}, // 4
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}}, // 5
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}}, // 6
+ { TGSI_OPCODE_ENDLOOP }, // 7
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}}, // 8
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {5, 8}}));
+}
+
+
+// in loop if/else value written in both paths, and read later
+// - value must survive from first write to last read in loop
+// for now we only check that the minimum lifetime is correct
+TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 0
+ { TGSI_OPCODE_BGNLOOP }, // 1
+ { TGSI_OPCODE_IF}, // 2
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}}, // 3
+ { TGSI_OPCODE_ELSE }, // 4
+ { TGSI_OPCODE_MOV, {2}, {1}, {}}, // 5
+ { TGSI_OPCODE_ENDIF}, // 6
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}}, // 7
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}}, // 8
+ { TGSI_OPCODE_ENDLOOP }, // 9
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}}, // 10
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {3,7}, {7, 10}}));
+}
+
+// in loop if/else value written in both paths, read in the else path
+// before it is written there, and also read later
+// - value must survive from first write to last read in loop
+TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseReadInElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 0
+ { TGSI_OPCODE_BGNLOOP }, // 1
+ { TGSI_OPCODE_IF}, // 2
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}}, // 3
+ { TGSI_OPCODE_ELSE }, // 4
+ { TGSI_OPCODE_UADD, {2}, {1, 2}, {}}, // 5
+ { TGSI_OPCODE_ENDIF}, // 6
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}}, // 7
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}}, // 8
+ { TGSI_OPCODE_ENDLOOP }, // 9
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}}, // 10
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {1,9}, {7, 10}}));
+}
+
+// in loop if/else read in one path before written in the same loop
+// - value must survive the whole loop
+TEST_F(LifetimeEvaluatorExactTest, ReadInIfInLoopBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 0
+ { TGSI_OPCODE_BGNLOOP }, // 1
+ { TGSI_OPCODE_IF, {}, {in0}, {}}, // 2
+ { TGSI_OPCODE_UADD, {2}, {1, 3}, {}}, // 3
+ { TGSI_OPCODE_ENDIF}, // 4
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}}, // 5
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}}, // 6
+ { TGSI_OPCODE_ENDLOOP }, // 7
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}}, // 8
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {1, 8}}));
+}
+
+/* Write in nested ifs in a loop. For now we only test whether the
+ * lifetime is at least what is required, because we know that the
+ * implementation doesn't do a full check and sets larger boundaries.
+ */
+TEST_F(LifetimeEvaluatorAtLeastTest, NestedIfInLoopAlwaysWriteButNotPropagated)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 5
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE}, // 10
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 15
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3, 14}}));
+}
+
+
+
+TEST_F(LifetimeEvaluatorExactTest, NestedIfInLoopWriteNotAlways)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 5
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF}, // 10
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 13
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 13}}));
+}
+
+
+// if a continue is in the loop, all variables written after the
+// continue and used outside the loop must be maintained for the
+// whole loop
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 5
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}}, // 6
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+// if a continue is in the loop, all variables written after the
+// continue and used outside the loop must be maintained for the
+// whole loop, but not further
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_BGNLOOP }, // 1
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 6
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}}, // 7
+ { TGSI_OPCODE_ENDLOOP }, // 6
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1, 7}}));
+}
+
+// if a continue is in the loop, all variables written after the
+// continue and used outside the loop must be maintained for all
+// loops up to the read scope, but not further
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_BGNLOOP }, // 1
+ { TGSI_OPCODE_BGNLOOP }, // 2
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}}, // 9
+ { TGSI_OPCODE_ENDLOOP }, // 6
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1, 9}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+// value written in one case, and read in other, in loop
+// - must survive the loop
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} }, // 0
+ { TGSI_OPCODE_CASE, {}, {in0}, {} }, // 0
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK }, // 0
+ { TGSI_OPCODE_DEFAULT }, // 0
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK }, // 0
+ { TGSI_OPCODE_ENDSWITCH }, // 0
+ { TGSI_OPCODE_ENDLOOP }, // 6
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} }, // 0
+ { TGSI_OPCODE_CASE, {}, {in0}, {} }, // 0
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK }, // 0
+ { TGSI_OPCODE_DEFAULT }, // 0
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH }, // 0
+ { TGSI_OPCODE_ENDLOOP }, // 6
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+
+// value read/write in same case, stays there
+
+
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} }, // 0
+ { TGSI_OPCODE_CASE, {}, {in0}, {} }, // 0
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK }, // 0
+ { TGSI_OPCODE_DEFAULT }, // 0
+ { TGSI_OPCODE_BRK }, // 0
+ { TGSI_OPCODE_ENDSWITCH }, // 0
+ { TGSI_OPCODE_ENDLOOP }, // 6
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,4}}));
+}
+
+// value read/write in all cases, should only live from first
+// write to last read, but currently the whole loop is used.
+TEST_F(LifetimeEvaluatorAtLeastTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}}, // 0
+ { TGSI_OPCODE_CASE, {}, {in0}, {} }, // 0
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK }, // 0
+ { TGSI_OPCODE_DEFAULT }, // 0
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK }, // 0
+ { TGSI_OPCODE_ENDSWITCH }, // 0
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 6
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,9}}));
+}
+
+// value written in one case, and read in other, in loop, may fall through
+// - must survive the loop
+
+// value read/write in different loops
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopes)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 1
+ { TGSI_OPCODE_ENDLOOP }, // 2
+ { TGSI_OPCODE_BGNLOOP }, // 3
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}}, // 4
+ { TGSI_OPCODE_ENDLOOP }, // 5
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1,5}}));
+}
+
+// value read/write in different loops, with a conditional write
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 3
+ { TGSI_OPCODE_ENDIF}, // 1
+ { TGSI_OPCODE_ENDLOOP }, // 5
+ { TGSI_OPCODE_BGNLOOP }, // 6
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}}, // 7
+ { TGSI_OPCODE_ENDLOOP }, // 5
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+// first read before first write weirdness with nested loops
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP }, // 0
+ { TGSI_OPCODE_BGNLOOP }, // 1
+ { TGSI_OPCODE_IF, {}, {in0}, {}}, // 2
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}}, // 3
+ { TGSI_OPCODE_ENDIF}, // 4
+ { TGSI_OPCODE_ENDLOOP }, // 5
+ { TGSI_OPCODE_BGNLOOP }, // 6
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 7
+ { TGSI_OPCODE_ENDLOOP }, // 8
+ { TGSI_OPCODE_ENDLOOP }, // 9
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+// register is only written. This should not happen,
+// but to handle the case we want the register to live
+// for at least one instruction
+TEST_F(LifetimeEvaluatorExactTest, WriteOnly)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 3
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,0}}));
+}
+
+// register read in if
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForIf)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 3
+ { TGSI_OPCODE_ADD, {out0}, {in0, in1}, {}}, // 3
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_ENDIF}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+// register read in switch
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForSwitchAndCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 3
+ { TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_END, {}, {1}, {}},
+ };
+ run (code, expectation({{-1,-1},{0,3}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, DistinceScopesAndNoEndProgramId)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}}, // 3
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_ENDIF},
+
+ };
+ run (code, expectation({{-1,-1},{0,4}, {2,5}}));
+}
+
+/* Check that two destination registers are used
+*/
+TEST_F(LifetimeEvaluatorExactTest, TwoDestRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}}, // 3
+ { TGSI_OPCODE_ADD, {out0}, {1,2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}}));
+}
+
+/* Check that three source registers are used
+*/
+TEST_F(LifetimeEvaluatorExactTest, ThreeSourceRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}}, // 3
+ { TGSI_OPCODE_ADD , {3}, {in0, in1}, {}}, // 3
+ { TGSI_OPCODE_MAD, {out0}, {1,2, 3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {0,2}, {1,2}}));
+}
+
+/* Check that temporaries that are only written live for just one
+ * instruction, so that they can be overwritten afterwards
+*/
+TEST_F(LifetimeEvaluatorExactTest, OverwriteWrittenOnlyTemps)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV , {1}, {in0}, {}}, // 3
+ { TGSI_OPCODE_MOV , {2}, {in1}, {}}, // 3
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,0}, {1,1}}));
+}
+
+
+TEST(RegisterRemapping, RegisterRemapping)
+{
+ rename_reg_pair proto{false, 0};
+ vector<rename_reg_pair> result(7, proto);
+
+ vector<pair<int, int>> lt({{-1,-1},
+ {0, 1}, // 1
+ {0, 2}, // 2
+ {1, 2}, // 3
+ {2, 10}, // 4
+ {3, 5}, // 5
+ {5, 10} // 6
+ });
+
+
+
+ evaluate_remapping(lt, result);
+
+ vector<int> remap({0,1, 2, 3, 4, 5, 6});
+
+ std::transform(remap.begin(), remap.end(), result.begin(), remap.begin(),
+ [](int x, const rename_reg_pair& rn) {return rn.valid ? rn.new_reg : x;});
+
+ vector<int> expect({0, 1, 2, 1, 1, 2, 2});
+
+ for(unsigned i = 1; i < remap.size(); ++i) {
+ EXPECT_EQ(remap[i], expect[i]);
+ }
+
+}
+
+
+TEST(RegisterRemapping, RegisterRemapping2)
+{
+ rename_reg_pair proto{false, 0};
+ vector<rename_reg_pair> result(7, proto);
+
+ vector<pair<int, int>> lt({{-1,-1},
+ {0, 1}, // 1
+ {0, 2}, // 2
+ {3, 3}, // 3
+ {4, 4}, // 4
+ });
+
+
+
+ evaluate_remapping(lt, result);
+
+ vector<int> remap({0, 1, 2, 3, 4});
+
+ std::transform(remap.begin(), remap.end(), result.begin(), remap.begin(),
+ [](int x, const rename_reg_pair& rn) {return rn.valid ? rn.new_reg : x;});
+
+ vector<int> expect({0, 1, 2, 1, 1});
+
+ for(unsigned i = 1; i < remap.size(); ++i) {
+ EXPECT_EQ(remap[i], expect[i]);
+ }
+
+}
--
2.13.0
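To illustrate what the two RegisterRemapping tests above expect, here is a minimal sketch of a greedy interval merge. It is not the patch's evaluate_remapping(); the name greedy_remap and the rule that a register may be rewritten at the same instruction where its previous value is last read are assumptions, chosen only because they reproduce the expected vectors in both tests.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the patch): greedily merges temporaries
// whose {first write, last read} intervals do not overlap.
// lifetimes[i] is the interval of temporary i; index 0 is unused.
// Returns remap, where remap[i] is the register temporary i ends up in.
std::vector<int> greedy_remap(const std::vector<std::pair<int, int>>& lifetimes)
{
   const int n = static_cast<int>(lifetimes.size());
   std::vector<int> remap(n);
   for (int i = 0; i < n; ++i)
      remap[i] = i;

   // Visit the temporaries in the order of their first write.
   std::vector<int> order;
   for (int i = 1; i < n; ++i)
      order.push_back(i);
   std::stable_sort(order.begin(), order.end(), [&](int a, int b) {
      return lifetimes[a].first < lifetimes[b].first;
   });

   std::vector<int> end_of(n, -1); // last instruction using a kept register
   std::vector<int> kept;          // registers that survive, in visiting order
   for (int t : order) {
      assert(lifetimes[t].first <= lifetimes[t].second);
      bool merged = false;
      for (int r : kept) {
         // Assumption: a register last read at instruction k can be
         // written again at instruction k.
         if (end_of[r] <= lifetimes[t].first) {
            remap[t] = r;
            end_of[r] = lifetimes[t].second;
            merged = true;
            break;
         }
      }
      if (!merged) {
         kept.push_back(t);
         end_of[t] = lifetimes[t].second;
      }
   }
   return remap;
}
```

With the lifetimes from the first test this returns {0, 1, 2, 1, 1, 2, 2}, and with those from the second test {0, 1, 2, 1, 1}, so six temporaries (respectively four) fit into two registers.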
Gert Wollny
2017-06-09 23:15:08 UTC
Permalink
This patch replaces the old register lifetime estimation with the
new approach.
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0e7f4b646a..b76ad42536 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,10 +55,11 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
-#include "st_glsl_to_tgsi_private.h"
+#include "st_glsl_to_tgsi_temprename.h"

#include "util/hash_table.h"
#include <algorithm>
+#include <iostream>

#define PROGRAM_ANY_CONST ((1 << PROGRAM_STATE_VAR) | \
(1 << PROGRAM_CONSTANT) | \
@@ -323,6 +324,7 @@ public:

void merge_two_dsts(void);
void merge_registers(void);
+ void merge_registers_alternative(void);
void renumber_registers(void);

void emit_block_mov(ir_assignment *ir, const struct glsl_type *type,
@@ -5042,6 +5044,17 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
}
}

+void
+glsl_to_tgsi_visitor::merge_registers_alternative(void)
+{
+ rename_reg_pair proto ={false, 0};
+ std::vector<rename_reg_pair> renames(this->next_temp, proto);
+ tgsi_temp_lifetime analysis(&this->instructions, this->next_temp);
+ auto lt = analysis.get_lifetimes();
+ evaluate_remapping(lt, renames);
+ rename_temp_registers(&renames[0]);
+}
+
/* Merges temporary registers together where possible to reduce the number of
* registers needed to run a program.
*
@@ -6492,7 +6505,7 @@ get_mesa_program_tgsi(struct gl_context *ctx,

v->merge_two_dsts();
if (!skip_merge_registers)
- v->merge_registers();
+ v->merge_registers_alternative();
v->renumber_registers();

/* Write the END instruction. */
--
2.13.0
Marek Olšák
2017-06-11 14:12:05 UTC
Permalink
Hi Gert,

Have you measured the CPU overhead of the new code?

Marek
Marek Olšák
2017-06-11 14:15:00 UTC
Permalink
Also, I don't know if people will like that it uses STL. I personally
have no issue with that as long as it doesn't break apps (e.g. the STL
shipped with apps should be the same as the STL shipped with the
distribution).

Marek
Gert Wollny
2017-06-11 17:21:04 UTC
Permalink
Hello Marek,

thanks for chiming in.
Post by Marek Olšák
Also, I don't know if people will like that it uses STL. I personally
have no issue with that as long as it doesn't break apps (e.g. the
STL shipped with apps should be the same as the STL shipped with the
distribution).
Well, on Linux I would take it for granted that the STL used to run the
code is the same as the one the code was compiled with, and there are
already quite a few places in the mesa code where STL constructs are
used (if that hadn't been the case, I would have tried to avoid
the STL). I am actually more concerned that propagating the C++11
requirement to the whole of src/mesa might not be welcome (although
everything compiles and runs fine).
Post by Marek Olšák
Post by Marek Olšák
Hi Gert,
Have you measured the CPU overhead of the new code?
So far no. I guess one would do that with shader-db to get
reasonably complex shaders, but I only have an r600-based card, so I'm
not sure whether I can run this. In any case, tomorrow I will take a
look into this.

Best,
Gert
Gert Wollny
2017-06-12 08:09:03 UTC
Permalink
Post by Gert Wollny
Post by Marek Olšák
Have you measured the CPU overhead of the new code?
So far no, I guess one would do that with the shader-db to get
reasonable complex shaders, but I only have a r600 based card so I'm
not sure whether I can run this. In any case, tomorrow I will take a
look into this. 
I did runs of the shader-db/run program with valgrind/callgrind.

Here the original merge_registers reports 0.21% and my code reports
0.50%.

If it is important to cut down on this added 0.3%, I think I can
eliminate 0.1% by changing the implementation to not use so many
dynamically allocated objects, but beyond that it will be difficult.

Best,
Gert
Michel Dänzer
2017-06-12 06:44:42 UTC
Permalink
Post by Gert Wollny
Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they don't
seem to be related to shaders.
Which tests regressed (maybe you can put up a piglit HTML summary
somewhere generated from a run with and without your patches)? Do they
consistently pass without your patches and fail with them?
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Gert Wollny
2017-06-12 09:32:56 UTC
Permalink
Post by Michel Dänzer
Post by Gert Wollny
Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they don't
seem to be related to shaders.
Which tests regressed (maybe you can put up a piglit HTML summary
somewhere generated from a run with and without your patches)? Do
they consistently pass without your patches and fail with them?
I had to redo the results, because I realized that I had compared the
system mesa version (with EGL support) against the test version (without
EGL support).

Now both tested versions were configured with the same options, and I ran
both versions twice. The result diff of the quick test is:

piglit summary console -d results/o1 results/o2 results/n1 results/n2 

glx/glx-multithread-texture: pass pass fail fail
glx/glx-visuals-stencil: fail fail pass pass
glx/glx_arb_sync_control/timing -fullscreen -divisor 1: pass fail pass fail
glx/glx_arb_sync_control/timing -fullscreen -divisor 2: pass fail fail warn
glx/glx_arb_sync_control/timing -fullscreen -msc-delta 1: fail fail warn fail
glx/glx_arb_sync_control/timing -fullscreen -msc-delta 2: fail fail fail pass
glx/glx_arb_sync_control/timing -msc-delta 2: warn fail pass fail
glx/glx_arb_sync_control/timing -waitformsc -msc-delta 2: fail pass pass fail
spec/arb_shader_bit_encoding/execution/and-clamp: fail fail pass fail
spec/arb_shading_language_420pack/active sampler conflict: fail fail pass fail
spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec2-index-rd: fail fail pass pass
spec/nv_conditional_render/drawpixels: fail pass fail pass
spec/nv_conditional_render/vertex_array: fail pass pass pass


summary:
       name:     o1     o2     n1     n2
       ----  ------ ------ ------ ------
       pass:  31583  31584  31588  31585
       fail:   1454   1454   1449   1452
      crash:      5      5      5      5
       skip:  17356  17356  17356  17356
    timeout:      0      0      0      0
       warn:     14     13     14     14
 incomplete:      0      0      0      0
 dmesg-warn:      0      0      0      0
 dmesg-fail:      0      0      0      0
    changes:      0      6      9      9
      fixes:      0      3      7      3
regressions:      0      3      2      6
      total:  50412  50412  50412  50412


Best,
Gert
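Reading the four columns above (o1, o2 without the patches, n1, n2 with them) can be made mechanical. The following is only a sketch of one possible rule, not piglit's own definition of a regression: a change counts only when both old runs agree with each other and both new runs agree with each other. The names classify and Verdict are made up for this illustration.

```cpp
#include <cassert>
#include <string>

enum class Verdict { Unchanged, Regression, Fix, Changed, Unstable };

// Hypothetical classifier for one test: o1/o2 are the results of the two
// runs without the patches, n1/n2 those of the two runs with the patches.
Verdict classify(const std::string& o1, const std::string& o2,
                 const std::string& n1, const std::string& n2)
{
   // If repeated runs of the same build disagree, the test tells us nothing.
   if (o1 != o2 || n1 != n2)
      return Verdict::Unstable;
   if (o1 == n1)
      return Verdict::Unchanged;
   if (o1 == "pass")
      return Verdict::Regression;
   if (n1 == "pass")
      return Verdict::Fix;
   // Consistent change between two non-pass states, e.g. fail -> warn.
   assert(o1 != "pass" && n1 != "pass");
   return Verdict::Changed;
}
```

Under this rule, only glx/glx-multithread-texture is a consistent regression candidate in the list above, glx/glx-visuals-stencil and gs-input-array-vec2-index-rd are consistent fixes, and all the other listed tests are unstable.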
Michel Dänzer
2017-06-12 09:58:56 UTC
Permalink
Post by Gert Wollny
Post by Michel Dänzer
Post by Gert Wollny
Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they don't
seem to be related to shaders.
Which tests regressed (maybe you can put up a piglit HTML summary
somewhere generated from a run with and without your patches)? Do
they consistently pass without your patches and fail with them?
I had to redo the results, because I realized that I had compared the
system mesa version (with EGL support) versus the test version (without
EGL support).
Now both tested versions were configure with the same options and I run
piglit summary console -d results/o1 results/o2 results/n1 results/n2
glx/glx-multithread-texture: pass pass fail fail
Might want to make sure this isn't a regression caused by your patches,
but FWIW this test seems to fail for me with radeonsi with "random" values.
Post by Gert Wollny
glx/glx_arb_sync_control/timing -fullscreen -divisor 1: pass fail pass
fail
glx/glx_arb_sync_control/timing -fullscreen -divisor 2: pass fail fail
warn
glx/glx_arb_sync_control/timing -fullscreen -msc-delta 1: fail fail
warn fail
glx/glx_arb_sync_control/timing -fullscreen -msc-delta 2: fail fail
fail pass
glx/glx_arb_sync_control/timing -msc-delta 2: warn fail pass fail
glx/glx_arb_sync_control/timing -waitformsc -msc-delta 2: fail pass
pass fail
You can ignore these, their results are somewhat random. The piglit
patches below help a little, but the results are still not 100% reliable
when piglit runs multiple tests concurrently.

https://patchwork.freedesktop.org/patch/150486/
https://patchwork.freedesktop.org/patch/150484/
https://patchwork.freedesktop.org/patch/150485/


In summary, it looks like your patches don't cause any piglit regressions.
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Nicolai Hähnle
2017-06-12 10:17:23 UTC
Permalink
Post by Gert Wollny
Post by Michel Dänzer
Post by Gert Wollny
Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they don't
seem to be related to shaders.
Which tests regressed (maybe you can put up a piglit HTML summary
somewhere generated from a run with and without your patches)? Do
they consistently pass without your patches and fail with them?
I had to redo the results, because I realized that I had compared the
system mesa version (with EGL support) versus the test version (without
EGL support).
Now both tested versions were configure with the same options and I run
piglit summary console -d results/o1 results/o2 results/n1 results/n2
glx/glx-multithread-texture: pass pass fail fail
glx/glx-visuals-stencil: fail fail pass pass
glx/glx_arb_sync_control/timing -fullscreen -divisor 1: pass fail pass
fail
glx/glx_arb_sync_control/timing -fullscreen -divisor 2: pass fail fail
warn
glx/glx_arb_sync_control/timing -fullscreen -msc-delta 1: fail fail
warn fail
glx/glx_arb_sync_control/timing -fullscreen -msc-delta 2: fail fail
fail pass
glx/glx_arb_sync_control/timing -msc-delta 2: warn fail pass fail
glx/glx_arb_sync_control/timing -waitformsc -msc-delta 2: fail pass
pass fail
The above are probably noise.
Post by Gert Wollny
spec/arb_shader_bit_encoding/execution/and-clamp: fail fail pass fail
spec/arb_shading_language_420pack/active sampler conflict: fail fail
pass fail
spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec2-index-
rd: fail fail pass pass
spec/nv_conditional_render/drawpixels: fail pass fail pass
spec/nv_conditional_render/vertex_array: fail pass pass pass
It's disconcerting that you have tests here whose pass status is
unstable. Those tests really should be deterministic.

You should definitely investigate
spec/arb_shader_bit_encoding/execution/and-clamp to see if it's related
to your patches.

Cheers,
Nicolai
--
Learn how the world really is,
but never forget how it ought to be.
Gert Wollny
2017-06-12 13:35:42 UTC
Permalink
Hello Nicolai,
Post by Gert Wollny
spec/arb_shader_bit_encoding/execution/and-clamp: fail fail pass fail
It's disconcerting that you have tests here whose pass status is 
unstable. Those tests really should be deterministic.
When I run only these tests ("all" and specified with -t ) then they
always pass (= three times in a row).

Post by Gert Wollny
spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec2-index-rd: fail fail pass pass
This one actually passes now because of the patch (before, it failed
because it needed 125 registers, and only 124 are free to be used).
Post by Gert Wollny
spec/arb_shading_language_420pack/active sampler conflict: fail fail pass fail
spec/nv_conditional_render/drawpixels: fail pass fail pass
spec/nv_conditional_render/vertex_array: fail pass pass pass
These, however, are unstable regardless of whether my patches are
applied or not.

Best,
Gert
Nicolai Hähnle
2017-06-12 10:28:07 UTC
Permalink
Post by Gert Wollny
Dear all,
as I wrote before, I was looking into the temporary register renaming.
This series of patches implements a new approach that achieves a tigher
estimation of the life time of the temporaries, and as a result the Piano
and Voloplosion benchmarks implemented in gputest [1] now work. Before
they failed with "r600_pipe_shader_create - translation from TGSI failed!"
Piglit shows 7 fixes and 6 regressions compared to git 8fac894f, but they don't
seem to be related to shaders. I've also tested other programs like the unignie-*
benchmarks and they didn't show regressions.
I think that the patch will need a few more iterations to remove code duplication
and generally adhere to the mesa style, but I think it is atthe point where I could
need a bit of feedback to get it into shape to be acceptable, and I'd also like to
mention that since I'm new to mesa this I have no commit rights.
Plenty of style issues aside, can you explain where and why you get
tighter lifetimes?

Cheers,
Nicolai
--
Learn how the world really is,
but never forget how it should be.
Gert Wollny
2017-06-12 12:34:21 UTC
Permalink
Plenty of style issues aside, can you explain where and why you get 
tighter lifetimes?
In the original code if a temporary is used within a loop it gets the
whole life time of the loop assigned. 

With this patch I track in more detail where a temporary is accessed
and base the lifetime on this: For instance, if a variable is first
unconditionally written and later read for the last time in the same
scope (loop, if, or switch branch), then the lifetime can be restricted
to that first-written - last-read range.

The code gets more complex because it tries to resolve this also for
nested scopes, and one also has to take care about whether a variable
is written only conditionally within a loop, or conditionally read
before it is written (in the source code sense, but not in the program
flow sense).

Shaders that profit from this better lifetime estimation are the ones
that have many short-lived values within long loops, think

for () {
   float4 t[1] = f(in2);
   float4 t[2] = g(in1);
   float4 t[3] = op(t[1], t[2]);
   ...
   sum += t[200];
}

Here the old code would keep all of the t[i] alive for the whole loop.
In fact, with the GpuTest Piano benchmark I have seen a shader with
2000 temporaries where many were used in a loop but only required for
two or three lines, so that my code could merge them to fewer than 100
temporaries, and this made it possible for the TGSI-to-bytecode layer
in r600g to actually translate the shader.
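
The difference between the two estimates can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the code from the patch: `Access`, `Lifetime`, `loop_lifetime` and `tight_lifetime` are hypothetical names, there is a single loop, and the temporary is written unconditionally before its first read.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record of one access to a temporary (not a Mesa type).
struct Access {
   int line;      // instruction index
   bool is_write; // true for a write, false for a read
};

struct Lifetime {
   int begin;
   int end;
};

// Old behaviour as described above: a temporary that is used inside a
// loop is kept alive for the whole loop.
Lifetime loop_lifetime(int loop_begin, int loop_end)
{
   return {loop_begin, loop_end};
}

// Tighter estimate. Assumes the accesses are sorted by line, all lie in
// the same loop, and the first access is an unconditional write.
Lifetime tight_lifetime(const std::vector<Access>& accesses)
{
   assert(!accesses.empty() && accesses.front().is_write);
   int last_read = accesses.front().line;
   for (const Access& a : accesses)
      if (!a.is_write)
         last_read = std::max(last_read, a.line);
   return {accesses.front().line, last_read};
}
```

For a temporary written at line 5 and read at lines 6 and 7 of a loop spanning lines 2 to 200, the first function yields the range 2..200 and the second 5..7, which is what allows many short-lived temporaries to share one register.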

Best,
Gert

PS: Regarding style, I am fully aware that I have to iterate over this
code a few more times. I tried to adhere to the way the existing code
represents itself to me, but I'm happy to listen to any advice I can
get.

In any case, I thought it might be best to send this patch out now for
discussion. Now, with the unit tests in place I will rework it and
focus also more on style questions. One thing that comes up immediately
is that I will try to reduce the use of dynamically allocated
memory, since 60% of the run time of my code is spent on memory allocation
and de-allocation.
Nicolai Hähnle
2017-06-12 18:52:51 UTC
Permalink
Post by Gert Wollny
Post by Nicolai Hähnle
Plenty of style issues aside, can you explain where and why you get
tighter lifetimes?
In the original code if a temporary is used within a loop it gets the
whole life time of the loop assigned.
With this patch I track in more detail where a temporary is accessed
and base the lifetime on this: For instance, if a variable is first
unconditionally written and later read for the last time in the same
scope (loop, if, or switch branch), then the lifetime can be restricted
to that first-written - last-read range.
The code gets more complex because it tries to resolve this also for
nested scopes, and one also has to take care about whether a variable
is written only conditionally within a loop, or conditionally read
before it is written (in the source code sense, but not in the program
flow sense).
Shaders that profit from this better lifetime estimation are the ones
that have many short-lived values within long loops, think
for () {
float4 t[1] = f(in2);
float4 t[2] = g(in1);
float4 t[3] = op(t[1], t[2]);
...
sum += t[200];
}
Here the old code would keep all of the t[i] alive for the whole loop,
and in fact with the GpuTest Piano benchmark I have seen a shader with
2000 temporaries where many were used in a loop but only required for
two or three lines so that my code could merge them to fewer than 100
temporaries, and this made it possible for the tgsi to bytecode layer
in r600g to actually translate the shader.
Okay. I think you should seriously re-think your algorithm in a way that
makes it a more natural evolution from the algorithm that's already there.

Basically, the current algorithm tracks (first_write, last_read), so
think about what you need to track in order to obtain a single-pass
algorithm that computes lifetime (first, last) for every temporary. I
think the following should do it for a first cut:

struct st_calculate_lifetime_state {
   /* First and last instruction that has this temporary */
   unsigned first;
   unsigned last;

   /* First instruction of the outermost inactive loop that
    * contains this temporary. (A loop is active if we're
    * currently processing instructions in it.)
    */
   unsigned loop_first;

   /* Position of a read without preceding dominating write. */
   unsigned undef_read[4];

   /* First write in the program that is dominating our
    * current position, per channel.
    */
   unsigned first_dominating_write[4];
};

In addition, you need to keep a stack of active scopes (loops and ifs),
but you really only need to remember the start of the scope (and for
loops, probably the position of the first BREAK).

Here's a sketch of the "state machine" that you need to run while
traversing the program, assuming no BRK and CONT:

Init: last = 0, everything else = ~0

These are updates on individual variables on use:

On any use (source or dest):
- if first > cur_pos, set first = loop_first = cur_pos
- if loop_first < first, set first = loop_first
- update last

On use as source:
- if first_dominating_write > cur_pos and undef_read > cur_pos, set
undef_read = cur_pos

On use as dest:
- if first_dominating_write > cur_pos, set first_dominating_write = cur_pos
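
The per-use updates listed above could look roughly like this in C++. It is a sketch under assumptions rather than a tested implementation: it tracks a single channel instead of four, reads "update last" as taking the maximum with the current position, and the names `TempState`, `UNSET`, `on_any_use`, `on_read` and `on_write` are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Stands for the "~0" initial value used in the sketch above.
constexpr unsigned UNSET = ~0u;

// Single-channel version of the per-temporary state (assumption: the
// real structure would keep undef_read and first_dominating_write per
// channel).
struct TempState {
   unsigned first = UNSET;
   unsigned last = 0;
   unsigned loop_first = UNSET;
   unsigned undef_read = UNSET;
   unsigned first_dominating_write = UNSET;
};

// "On any use (source or dest)"
void on_any_use(TempState& t, unsigned cur_pos)
{
   if (t.first > cur_pos)
      t.first = t.loop_first = cur_pos;
   if (t.loop_first < t.first)
      t.first = t.loop_first;
   t.last = std::max(t.last, cur_pos);
}

// "On use as source"
void on_read(TempState& t, unsigned cur_pos)
{
   on_any_use(t, cur_pos);
   if (t.first_dominating_write > cur_pos && t.undef_read > cur_pos)
      t.undef_read = cur_pos;
}

// "On use as dest"
void on_write(TempState& t, unsigned cur_pos)
{
   on_any_use(t, cur_pos);
   if (t.first_dominating_write > cur_pos)
      t.first_dominating_write = cur_pos;
}
```

A temporary that is written at position 3 and read at position 5 ends up with first = 3, last = 5 and no undefined read, while a temporary whose first access is a read at position 4 gets undef_read = 4, which the end-of-loop rules then use to extend its lifetime.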

These are updates of all temporaries on scope change:

On ENDLOOP, for all temps:
- if loop_first > start of loop, set loop_first = start of loop
- first < start of loop, update last to end of loop
- if undef_read between start and end of loop: set first = MIN(first,
start of loop) and last = end of loop
- if first_dominating_write < end of loop, set undef_read = ~0

On ELSE and ENDIF, for all temps:
- if first_dominating_write > start of scope, set first_dominating_write
= ~0

I'm not sure right now whether BREAK / CONT need special treatment at
all. I think what you need is:

On ENDLOOP:
- if first_dominating_write is between the first BREAK in the loop and
the end of the loop, set first_dominating_write = ~0

And for CONT, you probably don't really need anything, because CONTs
cannot make you skip code forever.

What this state machine doesn't yet cover is

IF ..
MOV TEMP[0], ...
ELSE
MOV TEMP[0], ...
ENDIF

Still, I'd start with it and see whether you need to cover that case.

And even that case can probably be dealt with in a fairly efficient and
pragmatic way. The idea is to keep track of the nesting level of IFs,
plus an

uint32 dominating_write_in_true_block[4];

per temp. Then:

On ELSE:
- if first_dominating_write < cur_pos, set the bit corresponding to the
current nesting level in dominating_write_in_true_block

On ENDIF:
- don't reset first_dominating_write if the bit corresponding to the
current nesting level in dominating_write_in_true_block is set
- unconditionally clear that bit

It won't cover cases with nesting level > 32, but do we really care?
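
One possible reading of the ELSE/ENDIF rules, combined with the nesting-level bit, is sketched below. The details are assumptions: a single channel is tracked, `scope_start` is taken to be the position of the IF, and `IfTracker`, `on_else` and `on_endif` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned UNSET = ~0u;

// Single-channel state for one temporary (assumption: the real
// structure would hold four channels).
struct IfTracker {
   unsigned first_dominating_write = UNSET;
   uint32_t dominating_write_in_true_block = 0;
};

// level: 0-based nesting depth of the IF that this ELSE belongs to.
void on_else(IfTracker& t, unsigned level, unsigned scope_start,
             unsigned cur_pos)
{
   assert(level < 32);
   /* Remember that a dominating write exists at the end of the true
    * block. */
   if (t.first_dominating_write < cur_pos)
      t.dominating_write_in_true_block |= UINT32_C(1) << level;
   /* A write inside the true block does not dominate the else block. */
   if (t.first_dominating_write != UNSET &&
       t.first_dominating_write > scope_start)
      t.first_dominating_write = UNSET;
}

void on_endif(IfTracker& t, unsigned level, unsigned scope_start)
{
   assert(level < 32);
   const uint32_t bit = UINT32_C(1) << level;
   const bool in_true_block = (t.dominating_write_in_true_block & bit) != 0;
   /* Keep the write only if the true block also had one. */
   if (!in_true_block && t.first_dominating_write > scope_start)
      t.first_dominating_write = UNSET;
   t.dominating_write_in_true_block &= ~bit;
}
```

With this reading, a temporary written in both branches keeps a dominating write after ENDIF, whereas one written only in the else branch is reset to "no dominating write".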

I hope I didn't miss anything, because after all this is admittedly
subtle stuff. Still, I think this kind of state-machine approach should
work and allow you to avoid *lots* of allocations and pointer-chasing.

Cheers,
Nicolai
Nicolai Hähnle
2017-06-12 19:00:38 UTC
Permalink
Post by Nicolai Hähnle
- if loop_first > start of loop, set loop_first = start of loop
- first < start of loop, update last to end of loop
This should of course read:

- if first < start of loop and last is inside the loop, then update last
to the end of the loop

Cheers,
Nicolai
Gert Wollny
2017-06-13 08:01:28 UTC
Permalink
On Monday, 12.06.2017, at 21:00 +0200, Nicolai Hähnle wrote:

Thanks for your comments, although I do not agree with most of them.
Post by Nicolai Hähnle
Okay. I think you should seriously re-think your algorithm in a way
that makes it a more natural evolution from the algorithm that's
already there.
Well, when I look at the current algorithm, I don't really see much
to evolve from: the tracking of loops is minimal and it actually only
uses the outer loop to assign life times.
In any case, I already re-used 80% of this code (i.e. the loop over the
instructions and iterating over the registers).
Post by Nicolai Hähnle
Basically, the current algorithm tracks (first_write, last_read),
so think about what you need to track in order to obtain a
single-pass algorithm that computes lifetime (first, last) for every
temporary. I
IMHO a single pass algorithm is not better than what I do now.
Especially with nested loops a single pass algorithm will be more
complicated.

Think e.g.

...
BEGIN_LOOP
   ...
   BEGIN_LOOP
      a = ...
      b = ...
      c = a OP b;
   END_LOOP
   ... /* more processing that doesn't access a, b, or c */
END_LOOP

out = f(a, c, ...)
END


Here, a and c must be kept alive for both loops, but b is only
needed for a few instructions. However, in a single pass state tracker
I have to keep all the information for a, b, and c until the end of the
program, because only then can I discard the loop information for b,
and resolve everything else for a and c.

With the approach I am using, i.e. only collecting all the access
information in the pass over the program, and resolving the life-times
at the end by re-playing the temp-access timeline, the above can be
achieved with less hassle, because I don't need to track per temporary
information for all the temporaries all the time, instead, I only need
to resolve loops and scopes when it is really needed.

[snip]

To me the algorithm you outlined looks quite complicated, but not too
different from what I'm doing when I replay the access time-line of a
register. However, with your approach one has to track the state of
each variable all the time, information that could be shared otherwise
and might not even be needed.

[snip]
Post by Nicolai Hähnle
I hope I didn't miss anything, because after all this is
admittedly subtle stuff.
It is, and this is why I think it is better to separate the steps into
manageable chunks that can be put under test. (I admit, I'm a fan of
test driven development, and for that I also think it is more important
to write test cases instead of sketching out algorithms).
Post by Nicolai Hähnle
Still, I think this kind of state-machine approach should 
work and allow you to avoid *lots* of allocations and pointer-
chasing.
The allocations I can (and will) get rid of, but I don't see some
pointer-chasing as a problem, since it is encapsulated within class
methods.

I thank you for your comments, but given that my code is working I
don't see that re-doing it from scratch is such a good idea. I think
refactoring it to eliminate the allocations and covering additional
test cases is a better approach. If this makes it possible to move the
implementation to be single pass, then I might consider it, but I think
tracking all the information for all temporaries all the time is not
such a good idea, especially for large shaders that might have 2000+
temporaries before register merging.

Best,
Gert
Nicolai Hähnle
2017-06-13 09:07:47 UTC
Permalink
Post by Gert Wollny
Thanks for your comments, although I do not agree with most of them.
Post by Nicolai Hähnle
Okay. I think you should seriously re-think your algorithm in a way
that makes it a more natural evolution from the algorithm that's
already there.
Well, when I look at the current algorithm, I don't really see much
to evolve from: The tracking of loops is minimal and it actually only
uses the outer loop to assign life times.
In any case, 80% of this code I already re-used (i.e. the loop over the
instructions and iterating over the registers).
Post by Nicolai Hähnle
Basically, the current algorithm tracks (first_write, last_read),
so think about what you need to track in order to obtain a single-
pass algorithm that computes lifetime (first, last) for every
temporary. I
IMHO a single pass algorithm is not better than what I do now.
Especially with nested loops a single pass algorithm will be more
complicated.
Think e.g.
...
BEGIN_LOOP
...
BEGIN_LOOP
a = ...
b = ...
c = a OP b;
END_LOOP
... /* more processing that doesn't access a, b, or c */
END_LOOP
out = f(a, c, ...)
END
Here, a and c must be kept alive for both loops, but b only is
needed for a few instructions. However, in a single pass state tracker
I have to keep all the information for a, b, and c until the end of the
program, because only then I can discard the loop information for b,
and resolve everything else for a and c.
In terms of memory use, your approach also keeps the information for all
variables all the time, because of the multi-pass.

In terms of running time, it's true that what I sketched will loop over
all temporaries for every end-of-scope.

However, that's fairly simple to fix by keeping track of which
temporaries occurred per-scope, and then only looping over those
temporaries.

You could even do the tracking with a single stack-like array, so you
end up with only 3 memory allocations:

- stack of currently active scopes
- array of temporaries
- stack of temporary-occurrences per scope
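
A rough sketch of how a single stack-like array could hold the per-scope occurrences is shown below. It is one possible interpretation, not code from either author: the class `ScopeOccurrences` and its methods are hypothetical, duplicates within a scope are not filtered, and the array of temporaries itself is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class ScopeOccurrences {
public:
   // Open a scope (loop, if, ...): remember where its entries start.
   void begin_scope() { scope_start_.push_back(occurrences_.size()); }

   // Record that temporary `temp` was accessed in the innermost scope.
   void record_use(unsigned temp) { occurrences_.push_back(temp); }

   // Call visit(temp) for every temporary recorded since the matching
   // begin_scope(), then close the scope. The entries stay in the
   // array, so they also count as occurrences in the enclosing scope.
   template <typename Visitor>
   void end_scope(Visitor visit)
   {
      assert(!scope_start_.empty());
      for (std::size_t i = scope_start_.back(); i < occurrences_.size(); ++i)
         visit(occurrences_[i]);
      scope_start_.pop_back();
   }

private:
   std::vector<std::size_t> scope_start_; // stack of currently active scopes
   std::vector<unsigned> occurrences_;    // temporary occurrences, all scopes
};
```

Closing an inner scope only pops its start offset, so the end-of-scope update touches just the temporaries that occurred in that scope, and the outer scope later sees its own entries plus those of the nested scopes.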
Post by Gert Wollny
With the approach I am using, i.e. only collecting all the access
information in the pass over the program, and resolving the life-times
at the end by re-playing the temp-access timeline, the above can be
achieved with less hassle, because I don't need to track per temporary
information for all the temporaries all the time, instead, I only need
to resolve loops and scopes when it is really needed.
[snip]
To me the algorithm you outlined looks quite complicated, but not too
different from what I'm doing when I replay the access time-line of a
register. However, with your approach one has to track the state of
each variable all the time, information that could be shared otherwise
and might not even be needed.
[snip]
Post by Nicolai Hähnle
I hope I didn't miss anything, because after all this is
admittedly subtle stuff.
It is, and this is why I think it is better to separate the steps into
manageable chunks that can be put under test. (I admit, I'm a fan of
test driven development, and for that I also think it is more important
to write test cases instead of sketching out algorithms).
Post by Nicolai Hähnle
Still, I think this kind of state-machine approach should
work and allow you to avoid *lots* of allocations and pointer-
chasing.
The allocations I can (and will) get rid of, but I don't see some
pointer-chasing as a problem, since it is encapsulated within class
methods.
How will you get rid of those allocations? I find that it's useful to be
able to sketch the data structures first.

At the very least, you need a vector per temporary. You're also not
handling the case where a temporary is set in both branches of an
if/else, and I'd argue that the approach is more annoying to adapt to
handling it than what I sketched.

By the way, that's related to another conceptual advantage of having a
state machine that acts on end-of-scope, which is that this is basically
like having an action for phi-nodes.
Post by Gert Wollny
I thank you for your comments, but given that my code is working I
don't see that re-doing it from scratch is such a good idea. I think
refactoring it to eliminate the allocations and covering additional
test cases is a better approach. If this makes it possible to move the
implementation to be single pass, then I might consider it, but I think
tracking all the information for all temporaries all the time is not
such a good idea, especially for large shaders that might have 2000+
temporaries before register merging.
Well, as I said, you're tracking all the information anyway, and as for
when you *update* the information, there's an easy way to make that lean.

I'm curious what you'd suggest for getting rid of allocations anyway.

Cheers,
Nicolai
Gert Wollny
2017-06-14 06:30:19 UTC
Permalink
Post by Nicolai Hähnle
I'm curious what you'd suggest for getting rid of allocations anyway.
As the refactoring goes I think I will end up with a hybrid approach:
In the temporaries I will not keep the full time line, but the
important read/write information - just like you suggested. However, I
will not resolve the scopes at the end of each loop, but only after the
program is fully scanned. For that I need to keep the information of
all scopes available and links to the key scopes for each temporary.

Similar to what you pointed out above, I'll need three allocations for
this: 

- the vector for the scopes,
- the vector for the temporaries
- a stack to handle the scope changes.

To not limit the number of scopes and the scope nesting level the scope
vector and the stack might do re-allocations though.

I think right now I will not go for tracking whether a temporary is
written in both if/else branches or all switch cases. What I want to
achieve is that the drivers don't get into trouble because too many
temporaries need to be allocated when the TGSI is translated into byte
code (test case R600 with 124 free usable registers), and so far this
seems to work without tackling this detail.

Thank you again for your insights,
Gert
Nicolai Hähnle
2017-06-14 07:55:19 UTC
Permalink
Post by Gert Wollny
Post by Nicolai Hähnle
I'm curious what you'd suggest for getting rid of allocations anyway.
In the temporaries I will not keep the full time line, but the
important read/write information - just like you suggested. However, I
will not resolve the scopes at the end of each loop, but only after the
program is fully scanned. For that I need to keep the information of
all scopes available and links to the key scopes for each temporary.
Okay. Hmm, I think in a sense what you might end up doing is resolving
scopes lazily. That does sound good, and I'm starting to see how it
might be possible. Try to keep the data structures clean so that you can
easily explain all the variables in comments. That should help a lot.
Post by Gert Wollny
Equal to what you pointed out above, I'll need three allocations for
- the vector for the scopes,
- the vector for the temporaries
- a stack to handle the scope changes.
To not limit the number of scopes and the scope nesting level the scope
vector and the stack might do re-allocations though.
I think right now I will not go for tracking whether a temporary is
written in both if/else branches or all switch cases. What I want to
achieve is that the drivers don't get into trouble because too many
temporaries need to be allocated when the TGSI is translated into byte
code (test case R600 with 124 free usable registers), and so far this
seems to work without tackling this detail.
Sounds good. If I'm understanding the "lazy scope resolution" correctly,
it might even be not too difficult to add if/else and switch case
resolution after the fact. Anyway, it's not needed for a first version.

Cheers,
Nicolai
Gert Wollny
2017-06-16 09:31:59 UTC
Permalink
Dear all,

with the help of Nicolai's comments I rewrote the proposed patch set to improve
the register renaming.

The patch is related to bugs where shader compilation fails with
"- translation from TGSI failed!"

Among these is https://bugs.freedesktop.org/show_bug.cgi?id=65448, which I can
confirm is fixed when R600_DEBUG=nosb is set (with sb enabled it will fail
with a failing assertion in the sb code).

Changes to the first patch set are:
- significantly cutting down on the memory allocations
- exposing only a minimal interface to register lifetime estimation and
calculating the rename table.

The algorithm works as follows:
- first the program is scanned: the loop, switch and if/else scopes are
collected, for each temporary the first and last reads and writes and the
corresponding scopes are collected, and it is recorded whether a variable is
written conditionally, and whether loops have continue or break statements.
- then, after the whole program has been scanned, the life times are estimated by
merging the read and write scopes for each temporary.
- the register mapping is evaluated
- applying the mapping is done with the rename_temp_registers method already
in place.

The algorithm tracks optimal life times for temporaries that are written
unconditionally. For temporaries written in if/else branches or switch cases
it is not (yet) tracked whether they are written in all branches, and hence
the estimated life time is not necessarily optimal.

Running piglit on the shaders shows no regressions, and marks one more test as
passing:

***@glsl-***@execution@variable-***@gs-input-array-vec2-index-rd

However, I don't think that my patch actually tackles the true problem of this
shader - i.e. the shader copies a large input block to temp arrays, and accesses
these indirectly via a variable not controlled by the shader, thereby making
register renaming impossible for these temporaries.

Checking the performance by running the shader-db

perf record --call-graph ./run -j1 shaders

I get the following performance compared to the original implementation

                          current            patches applied
self                      0.25               0.22
- life-time estimation    0.03               0.12
- evaluate mapping        (in self=0.17)     0.05
- rename-registers        0.05               0.05

All numbers are in %, normalized for the corresponding number reported for main.

The reduction when evaluating the mappings arises because the original
implementation uses a brute force O(n^2) algorithm, whereas I use an O(n log n)
algorithm to find renaming candidates.
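
To illustrate how renaming candidates can be found in O(n log n), here is a generic interval-partitioning sketch. It is not necessarily the algorithm used in the patch: it assumes that a register may be reused once the previous occupant's last read lies strictly before the new temporary's first write, and `Lifetime` and `evaluate_mapping` are hypothetical names.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

struct Lifetime {
   int begin; // first write
   int end;   // last read
};

// Returns rename[i], the register that temporary i is mapped to.
std::vector<int> evaluate_mapping(const std::vector<Lifetime>& lt)
{
   const int n = static_cast<int>(lt.size());

   // Visit the temporaries in order of increasing lifetime start.
   std::vector<int> order(n);
   for (int i = 0; i < n; ++i)
      order[i] = i;
   std::sort(order.begin(), order.end(),
             [&](int a, int b) { return lt[a].begin < lt[b].begin; });

   // Min-heap of (end of current occupant, target register): the top
   // is the register that becomes free first.
   typedef std::pair<int, int> Slot;
   std::priority_queue<Slot, std::vector<Slot>, std::greater<Slot>> busy;

   std::vector<int> rename(n);
   for (int i : order) {
      int target = i; // default: keep the temporary's own register
      if (!busy.empty() && busy.top().first < lt[i].begin) {
         target = busy.top().second; // reuse the register freed earliest
         busy.pop();
      }
      rename[i] = target;
      busy.push(Slot(lt[i].end, target));
   }
   return rename;
}
```

The sort and the heap operations each cost O(n log n) in total, compared with checking every pair of temporaries in the brute-force approach. For the lifetimes (0,2), (1,5), (3,4), (6,7) the sketch maps temporaries 2 and 3 onto register 0 and leaves temporary 1 in its own register.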

Many thanks for any comments,
Gert

Gert Wollny (3):
mesa/st: glsl_to_tgsi move some helper classes to extra files
mesa/st: glsl_to_tgsi Implement a new lifetime tracker for temporaries
mesa/st: glsl_to_tgsi: tie in the new register renaming approach

configure.ac | 1 +
src/mesa/Makefile.am | 4 +-
src/mesa/Makefile.sources | 4 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 299 +------
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 202 +++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 164 ++++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 674 +++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 30 +
src/mesa/state_tracker/tests/Makefile.am | 40 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 959 +++++++++++++++++++++
10 files changed, 2092 insertions(+), 285 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
--
2.13.0
Gert Wollny
2017-06-16 09:32:01 UTC
Permalink
This patch adds new classes and tests to implement a tracker for the
life time of temporary registers for the register renaming stage of
glsl_to_tgsi. The tracker aims at estimating the shortest possible
life time for each register. The code base requires c++11, the flag is
propagated from the LLVM_CXXFLAGS.
---
configure.ac | 1 +
src/mesa/Makefile.am | 4 +-
src/mesa/Makefile.sources | 2 +
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 202 +++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 164 ++++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 674 +++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 30 +
src/mesa/state_tracker/tests/Makefile.am | 40 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 959 +++++++++++++++++++++
9 files changed, 2074 insertions(+), 2 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp

diff --git a/configure.ac b/configure.ac
index 6c67d27084..855d06779c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2827,6 +2827,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/Makefile])
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..72ffd61212 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.

-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests

if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
@@ -101,7 +101,7 @@ AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
$(MSVC2013_COMPAT_CFLAGS)
AM_CXXFLAGS = \
- $(LLVM_CFLAGS) \
+ $(LLVM_CXXFLAGS) \
$(VISIBILITY_CXXFLAGS) \
$(MSVC2013_COMPAT_CXXFLAGS)

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..a68e9d2afe 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+ state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
new file mode 100644
index 0000000000..d3115a4bfc
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
@@ -0,0 +1,202 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+
+using std::vector;
+
+extern int swizzle_for_size(int size);
+
+static int swizzle_for_type(const glsl_type *type, int component = 0)
+{
+ unsigned num_elements = 4;
+
+ if (type) {
+ type = type->without_array();
+ if (type->is_scalar() || type->is_vector() || type->is_matrix())
+ num_elements = type->vector_elements;
+ }
+
+ int swizzle = swizzle_for_size(num_elements);
+ assert(num_elements + component <= 4);
+
+ swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
+ return swizzle;
+}
+
+
+
+st_src_reg::st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component, unsigned array_id)
+{
+ assert(file != PROGRAM_ARRAY || array_id != 0);
+ this->file = file;
+ this->index = index;
+ this->swizzle = swizzle_for_type(type, component);
+ this->negate = 0;
+ this->abs = 0;
+ this->index2D = 0;
+ this->type = type ? type->base_type : GLSL_TYPE_ERROR;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = index2D;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->swizzle = 0;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg st_src_reg::get_abs()
+{
+ st_src_reg reg = *this;
+ reg.negate = 0;
+ reg.abs = 1;
+ return reg;
+}
+
+st_src_reg::st_src_reg(st_dst_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->double_reg2 = false;
+ this->array_id = reg.array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_dst_reg::st_dst_reg(st_src_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->writemask = WRITEMASK_XYZW;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->array_id = reg.array_id;
+}
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+st_dst_reg::st_dst_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->array_id = 0;
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.h b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
new file mode 100644
index 0000000000..d729bc008d
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
@@ -0,0 +1,164 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <mesa/main/mtypes.h>
+#include <compiler/glsl_types.h>
+#include <compiler/glsl/ir.h>
+#include <tgsi/tgsi_info.h>
+#include <stack>
+#include <vector>
+
+class st_dst_reg;
+
+/**
+ * This class corresponds to the TGSI ureg_src struct.
+ */
+class st_src_reg {
+public:
+ st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component = 0, unsigned array_id = 0);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D);
+
+ st_src_reg();
+
+ explicit st_src_reg(st_dst_reg reg);
+
+ st_src_reg get_abs();
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+
+ uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
+ int negate:4; /**< NEGATE_XYZW mask from mesa */
+ unsigned abs:1;
+ enum glsl_base_type type:5; /**< GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ /*
+ * Is this the second half of a double register pair?
+ * Currently used for input mapping only.
+ */
+ unsigned double_reg2:1;
+ unsigned is_double_vertex_input:1;
+ unsigned array_id:10;
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+
+};
+
+class st_dst_reg {
+public:
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index);
+
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type);
+
+ st_dst_reg();
+
+ explicit st_dst_reg(st_src_reg reg);
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
+ enum glsl_base_type type:5; /**< GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ unsigned array_id:10;
+
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+};
+
+class glsl_to_tgsi_instruction : public exec_node {
+public:
+ DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
+
+ st_dst_reg dst[2];
+ st_src_reg src[4];
+ st_src_reg resource; /**< sampler or buffer register */
+ st_src_reg *tex_offsets;
+
+ /** Pointer to the ir source this tree came from for debugging */
+ ir_instruction *ir;
+
+ unsigned op:8; /**< TGSI opcode */
+ unsigned saturate:1;
+ unsigned is_64bit_expanded:1;
+ unsigned sampler_base:5;
+ unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
+ unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
+ glsl_base_type tex_type:5;
+ unsigned tex_shadow:1;
+ unsigned image_format:9;
+ unsigned tex_offset_num_offset:3;
+ unsigned dead_mask:4; /**< Used in dead code elimination */
+ unsigned buffer_access:3; /**< buffer access type */
+
+ const struct tgsi_opcode_info *info;
+};
+
+struct rename_reg_pair {
+ bool valid;
+ int new_reg;
+};
+
+inline bool
+is_resource_instruction(unsigned opcode)
+{
+ switch (opcode) {
+ case TGSI_OPCODE_RESQ:
+ case TGSI_OPCODE_LOAD:
+ case TGSI_OPCODE_ATOMUADD:
+ case TGSI_OPCODE_ATOMXCHG:
+ case TGSI_OPCODE_ATOMCAS:
+ case TGSI_OPCODE_ATOMAND:
+ case TGSI_OPCODE_ATOMOR:
+ case TGSI_OPCODE_ATOMXOR:
+ case TGSI_OPCODE_ATOMUMIN:
+ case TGSI_OPCODE_ATOMUMAX:
+ case TGSI_OPCODE_ATOMIMIN:
+ case TGSI_OPCODE_ATOMIMAX:
+ return true;
+ default:
+ return false;
+ }
+}
+
+inline unsigned
+num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->num_dst;
+}
+
+inline unsigned
+num_inst_src_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->is_tex || is_resource_instruction(op->op) ?
+ op->info->num_src - 1 : op->info->num_src;
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
new file mode 100644
index 0000000000..a2e8e3778c
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -0,0 +1,674 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_temprename.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+#include <stack>
+#include <algorithm>
+#include <limits>
+
+using std::vector;
+using std::stack;
+using std::pair;
+using std::make_pair;
+using std::numeric_limits;
+
+
+typedef int scope_idx;
+
+enum e_scope_type {
+ sct_outer,
+ sct_loop,
+ sct_if,
+ sct_else,
+ sct_switch,
+ sct_switch_case,
+ sct_switch_default,
+ sct_unknown
+};
+
+enum e_acc_type {
+ acc_read,
+ acc_write
+};
+
+class prog_scope {
+
+public:
+ prog_scope(e_scope_type type, int my_idx, int id, int lvl, int s_begin,
+ vector<prog_scope>& scopes);
+ prog_scope(scope_idx p, e_scope_type type, int my_idx, int id,
+ int lvl, int s_begin, vector<prog_scope>& scopes);
+
+ e_scope_type type() const { return scope_type; }
+ scope_idx parent() const { return parent_scope; }
+ int level() const {return nested_level; }
+ int id() const { return scope_id; }
+ int end() const {return scope_end; }
+ int begin() const {return scope_begin; }
+ int loop_continue_line() const {return loop_continue;}
+
+ scope_idx in_ifelse() const;
+ scope_idx in_switchcase() const;
+
+ bool in_loop() const;
+ scope_idx get_parent_loop() const;
+ bool is_conditional() const;
+ bool contains(scope_idx idx) const;
+ void set_end(int end);
+ void set_previous(scope_idx prev);
+ void set_continue(scope_idx scope, int i);
+ bool enclosed_by_loop_prior_to_switch();
+private:
+
+ e_scope_type scope_type;
+ int scope_id;
+ int nested_level;
+ int scope_begin;
+ int scope_end;
+ int loop_continue;
+
+ scope_idx my_idx;
+ scope_idx scope_of_loop_to_continue;
+ scope_idx previous_switchcase;
+ scope_idx parent_scope;
+
+ vector<prog_scope>& scopes;
+};
+
+class temp_access {
+
+public:
+ temp_access(vector<prog_scope>& scopes);
+ void append(int index, e_acc_type rw, scope_idx pstate);
+ pair<int, int> get_required_lifetime();
+
+private:
+
+ struct temp_access_record {
+ int index;
+ e_acc_type acc;
+ scope_idx prog_scope;
+ };
+
+ std::vector<prog_scope>& scopes;
+
+ bool keep_for_full_loop;
+
+ scope_idx last_read_scope;
+ scope_idx undefined_read_scope;
+ scope_idx first_write_scope;
+
+ int first_write;
+ int last_read;
+ int last_write;
+ int undefined_read;
+};
+
+
+class tgsi_temp_lifetime {
+
+public:
+
+ tgsi_temp_lifetime();
+
+ vector<std::pair<int, int> > get_lifetimes(exec_list *instructions,
+ int ntemps) const;
+private:
+
+ scope_idx make_scope(e_scope_type type, int id, int lvl, int s_begin) const;
+ scope_idx make_scope(scope_idx p, e_scope_type type, int id,
+ int lvl, int s_begin) const;
+
+ void evaluate();
+
+ mutable vector<prog_scope> scopes;
+
+};
+
+tgsi_temp_lifetime::tgsi_temp_lifetime()
+{
+ scopes.reserve(20);
+}
+
+scope_idx
+tgsi_temp_lifetime::make_scope(e_scope_type type, int id, int lvl,
+ int s_begin)const
+{
+ int idx = scopes.size();
+ scopes.push_back(prog_scope(type, idx, id, lvl, s_begin, scopes));
+ return idx;
+}
+
+scope_idx
+tgsi_temp_lifetime::make_scope(scope_idx p, e_scope_type type, int id,
+ int lvl, int s_begin) const
+{
+ int idx = scopes.size();
+ scopes.push_back(prog_scope(p, type, idx, id, lvl, s_begin, scopes));
+ return idx;
+}
+
+vector<pair<int, int> >
+tgsi_temp_lifetime::get_lifetimes(exec_list *instructions, int ntemps) const
+{
+ int line = 0;
+ int loop_id = 0;
+ int if_id = 0;
+ int switch_id = 0;
+ int nesting_lvl = 0;
+ bool is_at_end = false;
+ stack<scope_idx> scope_stack;
+
+ std::vector<std::pair<int, int> > lifetimes(ntemps);
+ vector<temp_access> acc(ntemps, temp_access(scopes));
+
+ scope_idx current = make_scope(sct_outer, 0, nesting_lvl++, line);
+
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (is_at_end) {
+ // shader has instructions past end marker; we ignore this
+ break;
+ }
+
+ switch (inst->op) {
+ case TGSI_OPCODE_BGNLOOP: {
+ scope_idx scope = make_scope(current, sct_loop, loop_id,
+ nesting_lvl, line);
+ ++loop_id;
+ ++nesting_lvl;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ENDLOOP: {
+ --nesting_lvl;
+ scopes[current].set_end(line);
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+ case TGSI_OPCODE_IF:
+ case TGSI_OPCODE_UIF:{
+ if (inst->src[0].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[0].index].append(line, acc_read, current);
+ }
+ scope_idx scope = make_scope(current, sct_if, if_id, nesting_lvl, line);
+ ++if_id;
+ ++nesting_lvl;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ELSE: {
+ scopes[current].set_end(line-1);
+ current = make_scope(scopes[current].parent(), sct_else,
+ scopes[current].id(), scopes[current].level(), line);
+ break;
+ }
+ case TGSI_OPCODE_END:{
+ scopes[current].set_end(line);
+ is_at_end = true;
+ break;
+ }
+ case TGSI_OPCODE_ENDIF:{
+ --nesting_lvl;
+ scopes[current].set_end(line-1);
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+ case TGSI_OPCODE_SWITCH: {
+ scope_idx scope = make_scope(current, sct_switch, switch_id,
+ nesting_lvl, line);
+ ++nesting_lvl;
+ ++switch_id;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ENDSWITCH: {
+ --nesting_lvl;
+ scopes[current].set_end(line-1);
+
+ // remove the case level
+ if (scopes[current].type() != sct_switch ) {
+ current = scope_stack.top();
+ scope_stack.pop();
+ }
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+
+ case TGSI_OPCODE_CASE:
+ if (inst->src[0].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[0].index].append(line, acc_read, current);
+ } // fall through
+ case TGSI_OPCODE_DEFAULT: {
+ auto scope_type = (inst->op == TGSI_OPCODE_CASE) ?
+ sct_switch_case : sct_switch_default;
+ if ( scopes[current].type() == sct_switch ) {
+ scope_stack.push(current);
+ current = make_scope(current, scope_type, scopes[current].id(),
+ nesting_lvl, line);
+ }else{
+ auto p = scopes[current].parent();
+ auto scope = make_scope(p, scope_type, scopes[p].id(),
+ scopes[p].level(), line);
+ if (scopes[current].end() == -1)
+ scopes[scope].set_previous(current);
+ current = scope;
+ }
+ break;
+ }
+ case TGSI_OPCODE_BRK: {
+ if ( (scopes[current].type() == sct_switch_case) ||
+ (scopes[current].type() == sct_switch_default)) {
+ scopes[current].set_end(line-1);
+ }
+ /* Make sure that the nearest enclosing scope is a loop
+ * and not a switch case.
+ * Apart from that this is like a continue, just
+ * a bit more final */
+ else if (scopes[current].enclosed_by_loop_prior_to_switch()) {
+ scopes[current].set_continue(current, line);
+ }
+ break;
+ }
+ case TGSI_OPCODE_CONT: {
+ scopes[current].set_continue(current, line);
+ break;
+ }
+
+ default: {
+
+ for (unsigned j = 0; j < num_inst_dst_regs(inst); j++) {
+ if (inst->dst[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->dst[j].index].append(line, acc_write, current);
+ }
+ }
+
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].append(line, acc_read, current);
+ }
+ }
+
+ for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->tex_offsets[j].index].append(line, acc_read, current);
+ }
+ }
+
+ } // end default
+ } // end switch (op)
+
+ ++line;
+ }
+
+ // make sure the last scope is closed, even if no
+ // TGSI_OPCODE_END was given
+ if (scopes[current].end() < 0) {
+ scopes[current].set_end(line-1);
+ }
+
+ for(unsigned i = 1; i < lifetimes.size(); ++i) {
+ lifetimes[i] = acc[i].get_required_lifetime();
+ }
+ scopes.clear();
+ return lifetimes;
+}
+
+
+prog_scope::prog_scope(e_scope_type type, int idx, int id,
+ int lvl, int s_begin,
+ std::vector<prog_scope>& s):
+ prog_scope(-1, type, idx, id, lvl, s_begin, s)
+{
+}
+
+prog_scope::prog_scope(scope_idx p, e_scope_type type,
+ int idx, int id, int lvl, int s_begin,
+ std::vector<prog_scope>& s):
+ scope_type(type),
+ scope_id(id),
+ nested_level(lvl),
+ scope_begin(s_begin),
+ scope_end(-1),
+ loop_continue(numeric_limits<int>::max()),
+ my_idx(idx),
+ scope_of_loop_to_continue(0),
+ previous_switchcase(0),
+ parent_scope(p),
+ scopes(s)
+{
+}
+
+bool prog_scope::in_loop() const
+{
+ if (scope_type == sct_loop)
+ return true;
+ if (parent_scope >= 0)
+ return scopes[parent_scope].in_loop();
+ return false;
+}
+
+scope_idx
+prog_scope::get_parent_loop() const
+{
+ if (scope_type == sct_loop)
+ return my_idx;
+ if (parent_scope >= 0)
+ return scopes[parent_scope].get_parent_loop();
+ else
+ return -1;
+}
+
+bool prog_scope::contains(scope_idx other) const
+{
+ return (begin() <= scopes[other].begin()) && (end() >= scopes[other].end());
+}
+
+bool prog_scope::is_conditional() const
+{
+ return scope_type == sct_if || scope_type == sct_else ||
+ scope_type == sct_switch_case || scope_type == sct_switch_default;
+}
+
+bool prog_scope::enclosed_by_loop_prior_to_switch()
+{
+ if (scope_type == sct_loop)
+ return true;
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default ||
+ scope_type == sct_switch)
+ return false;
+ if (parent_scope >= 0)
+ return scopes[parent_scope].enclosed_by_loop_prior_to_switch();
+ else
+ return false;
+}
+
+scope_idx prog_scope::in_ifelse() const
+{
+ if ((scope_type == sct_if) ||
+ (scope_type == sct_else))
+ return my_idx;
+ else if (parent_scope >= 0)
+ return scopes[parent_scope].in_ifelse();
+ else
+ return -1;
+}
+
+scope_idx prog_scope::in_switchcase() const
+{
+ if ((scope_type == sct_switch_case) ||
+ (scope_type == sct_switch_default))
+ return my_idx;
+ else if (parent_scope >= 0)
+ return scopes[parent_scope].in_switchcase();
+ else
+ return -1;
+}
+
+void prog_scope::set_previous(scope_idx prev)
+{
+ previous_switchcase = prev;
+}
+
+void prog_scope::set_end(int end)
+{
+ if (scope_end == -1) {
+ scope_end = end;
+ if (previous_switchcase)
+ scopes[parent_scope].set_end(end);
+ }
+}
+
+void prog_scope::set_continue(scope_idx scope, int line)
+{
+ if (scope_type == sct_loop) {
+ scope_of_loop_to_continue = scope;
+ loop_continue = line;
+ } else if (parent_scope >= 0)
+ scopes[parent_scope].set_continue(scope, line);
+}
+
+temp_access::temp_access(std::vector<prog_scope>& s):
+ scopes(s),
+ keep_for_full_loop(false),
+ last_read_scope(-1),
+ undefined_read_scope(-1),
+ first_write_scope(-1),
+ first_write(-1),
+ last_read(-1),
+ last_write(-1),
+ undefined_read(numeric_limits<int>::max())
+{
+}
+
+void temp_access::append(int line, e_acc_type acc, scope_idx idx)
+{
+ if (acc == acc_read) {
+ last_read = line;
+ last_read_scope = idx;
+ if (undefined_read > line) {
+ undefined_read = line;
+ undefined_read_scope = idx;
+ }
+ } else {
+ last_write = line;
+ if (first_write == -1) {
+ first_write = line;
+ first_write_scope = idx;
+
+ // we write in an if-branch
+ auto fw_ifthen_scope = scopes[idx].in_ifelse();
+ if ((fw_ifthen_scope >= 0) && scopes[fw_ifthen_scope].in_loop()) {
+ // value not always written, in loops we must keep it
+ keep_for_full_loop = true;
+ } else {
+ // same thing for switch-case
+ auto fw_switch_scope = scopes[idx].in_switchcase();
+ if (fw_switch_scope >= 0 && scopes[fw_switch_scope].in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
+ }
+ }
+}
+
+pair<int, int> temp_access::get_required_lifetime()
+{
+ /* This temporary is only read, which is undefined
+ behaviour, so the register can be reused for something else */
+ if (first_write_scope < 0) {
+ return make_pair(-1, -1);
+ }
+
+ /* Only written to, just make sure that renaming
+ * doesn't reuse this register too early (corner
+ * case is the one opcode with two destinations) */
+ if (last_read_scope < 0) {
+ return make_pair(first_write, first_write + 1);
+ }
+
+ // evaluate the shared scope
+ int target_level = -1;
+
+ while (target_level < 0) {
+ if (scopes[last_read_scope].contains(first_write_scope)) {
+ target_level = scopes[last_read_scope].level();
+ } else if (scopes[first_write_scope].contains(last_read_scope)) {
+ target_level = scopes[first_write_scope].level();
+ } else {
+ // scopes (partially) disjoint, move up
+ if (scopes[last_read_scope].type() == sct_loop) {
+ last_read = scopes[last_read_scope].end();
+ }
+ last_read_scope = scopes[last_read_scope].parent();
+ }
+ }
+
+ // propagate the read scope to the target_level
+ while (scopes[last_read_scope].level() > target_level) {
+
+ /* if the read is in a loop we need to extend the
+ * variable's lifetime to the end of that loop */
+ if (scopes[last_read_scope].type() == sct_loop) {
+ last_read = scopes[last_read_scope].end();
+ }
+ last_read_scope = scopes[last_read_scope].parent();
+ }
+
+ /* also propagate the lifetime if there was a continue/break
+ * in a loop and the write was after it (so it constitutes
+ * a conditional write) */
+ if (scopes[first_write_scope].loop_continue_line() < first_write) {
+ keep_for_full_loop = true;
+ }
+
+ /* propagate lifetimes before moving to upper scopes */
+ if ((scopes[first_write_scope].type() == sct_loop) &&
+ (keep_for_full_loop || (undefined_read < first_write))) {
+ first_write = scopes[first_write_scope].begin();
+ int lr = scopes[first_write_scope].end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ // propagate the first_write scope to the target_level
+ while (target_level < scopes[first_write_scope].level()) {
+
+ first_write_scope = scopes[first_write_scope].parent();
+
+ if (scopes[first_write_scope].loop_continue_line() < first_write) {
+ keep_for_full_loop = true;
+ }
+
+ // if the value is conditionally written in a loop
+ // then propagate its lifetime to the full loop
+ if (scopes[first_write_scope].type() == sct_loop) {
+ if (keep_for_full_loop || (undefined_read < first_write)) {
+ first_write = scopes[first_write_scope].begin();
+ int lr = scopes[first_write_scope].end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+ }
+
+ // if we currently don't propagate the lifetime but
+ // the enclosing scope is a conditional within a loop
+ // up to the last-read level, we need to propagate.
+ // todo: to tighten the lifetime, check whether the value
+ // is written in all conditional code paths below the loop
+ if (!keep_for_full_loop &&
+ scopes[first_write_scope].is_conditional() &&
+ scopes[first_write_scope].in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
+
+
+ /* We do not correct the last_write for scope, but
+ * if it is past the last_read we have to keep the
+ * temporary alive past this instruction */
+ if (last_write > last_read) {
+ last_read = last_write + 1;
+ }
+
+ return make_pair(first_write, last_read);
+}
+
+vector<pair<int, int>>
+estimate_temporary_lifetimes(exec_list *instructions, int ntemps)
+{
+ return tgsi_temp_lifetime().get_lifetimes(instructions, ntemps);
+}
+
+void evaluate_remapping(const std::vector<std::pair<int, int>>& lifetimes,
+ struct rename_reg_pair *result)
+{
+ struct access_record {
+ int begin;
+ int end;
+ unsigned reg;
+ bool erase;
+ };
+
+ auto compare_begin = [](const access_record& a, const access_record& b) {
+ return a.begin < b.begin;
+ };
+ auto compare_end_begin = [](const access_record& a, const access_record& b) {
+ return a.end <= b.begin;
+ };
+
+ vector<access_record> m(lifetimes.size() - 1);
+
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ m[i-1] = {lifetimes[i].first, lifetimes[i].second, i, false};
+ }
+
+ std::sort(m.begin(), m.end(), compare_begin);
+
+ auto trgt = m.begin();
+ auto mend = m.end();
+ auto first_erase = mend;
+ auto search_start = trgt + 1;
+
+ while (trgt != mend) {
+
+ auto src = std::upper_bound(search_start, mend, *trgt, compare_end_begin);
+ if (src != mend) {
+ result[src->reg].new_reg = trgt->reg;
+ result[src->reg].valid = true;
+ trgt->end = src->end;
+
+ /* Since we only search forward, don't erase the renamed
+ * register just now, just mark it for removal. The alternative,
+ * calling m.erase(src) here, would be quite expensive. */
+ src->erase = true;
+ if (first_erase == mend)
+ first_erase = src;
+ search_start = src + 1;
+ } else {
+ /* When moving to the next target register, it is time to
+ * erase the already merged registers */
+ if (first_erase != mend) {
+ auto out = first_erase;
+ auto in_start = first_erase + 1;
+ while (in_start != mend) {
+ if (!in_start->erase)
+ *out++ = *in_start;
+ ++in_start;
+ }
+ mend = out;
+ first_erase = mend;
+ }
+ ++trgt;
+ search_start = trgt + 1;
+ }
+ }
+}
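The merge step in evaluate_remapping above can be hard to follow because of the deferred erase. The following is only a minimal standalone sketch of the same greedy idea, assuming lifetimes are given as (begin, end) pairs and index 0 is skipped as in the patch. The names rename_entry and sketch_remapping are hypothetical and not part of the patch, and the search is a plain quadratic scan rather than the std::upper_bound search used above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for rename_reg_pair from the patch.
struct rename_entry {
   bool valid = false;
   int new_reg = 0;
};

// Simplified quadratic variant of the merge: a register whose lifetime
// begins at or after the end of the target's (growing) lifetime is
// folded into the target register.
std::vector<rename_entry>
sketch_remapping(const std::vector<std::pair<int, int>>& lifetimes)
{
   struct record { int begin; int end; int reg; bool merged; };

   std::vector<rename_entry> result(lifetimes.size());
   std::vector<record> m;

   // Index 0 is skipped, mirroring the loop in the patch.
   for (std::size_t i = 1; i < lifetimes.size(); ++i)
      m.push_back({lifetimes[i].first, lifetimes[i].second,
                   static_cast<int>(i), false});

   // Sort by start of lifetime so that candidates are visited in order.
   std::stable_sort(m.begin(), m.end(),
                    [](const record& a, const record& b) {
                       return a.begin < b.begin;
                    });

   for (std::size_t t = 0; t < m.size(); ++t) {
      if (m[t].merged)
         continue;
      for (std::size_t s = t + 1; s < m.size(); ++s) {
         // Skip registers already renamed or still overlapping the target.
         if (m[s].merged || m[s].begin < m[t].end)
            continue;
         result[m[s].reg].valid = true;
         result[m[s].reg].new_reg = m[t].reg;
         m[t].end = m[s].end;   // the target now lives until src ends
         m[s].merged = true;
      }
   }
   return result;
}
```

For example, with lifetimes (0,2), (1,3), (2,5), (3,4) for registers 1 to 4, register 3 would be renamed to 1 and register 4 to 2, while registers 1 and 2 keep their indices.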
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
new file mode 100644
index 0000000000..04d5321682
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -0,0 +1,30 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+
+std::vector<std::pair<int, int>>
+estimate_temporary_lifetimes(exec_list *instructions, int ntemps);
+
+void evaluate_remapping(const std::vector<std::pair<int, int>>& lt,
+ struct rename_reg_pair *result);
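To illustrate one rule behind estimate_temporary_lifetimes: if a temporary is read inside a loop but first written outside of it, the value must stay alive until the end of that loop. The following is a rough standalone sketch of only this rule, not of the scope tracking in the patch. The type loop_range and the function extend_for_loops are hypothetical names, and loops are assumed to be given as inclusive (begin, end) line ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical description of a loop by its first and last line.
struct loop_range {
   int begin;
   int end;
};

// Extend the last read of a temporary to the end of every loop that
// contains the read but not the first write. Loops may be nested and
// may be passed in any order.
std::pair<int, int>
extend_for_loops(int first_write, int last_read,
                 const std::vector<loop_range>& loops)
{
   for (const auto& l : loops) {
      bool read_inside = l.begin <= last_read && last_read <= l.end;
      bool write_inside = l.begin <= first_write && first_write <= l.end;
      // A later iteration may read the value again, so keep it alive
      // until the loop is left.
      if (read_inside && !write_inside)
         last_read = std::max(last_read, l.end);
   }
   return {first_write, last_read};
}
```

With an outer loop over lines 0 to 10 and an inner loop over lines 3 to 6, a temporary written at line 1 and read at line 5 would get the lifetime (1, 6). The patch additionally handles conditional writes, continue/break and switch scopes, which this sketch leaves out.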
diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 0000000000..ac6def682a
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,40 @@
+AM_CFLAGS = \
+ $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+ $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+ -I$(top_srcdir)/src/gtest/include \
+ -I$(top_srcdir)/src \
+ -I$(top_srcdir)/src/mapi \
+ -I$(top_builddir)/src/mesa \
+ -I$(top_srcdir)/src/mesa \
+ -I$(top_srcdir)/include \
+ -I$(top_srcdir)/src/gallium/include \
+ -I$(top_srcdir)/src/gallium/auxiliary \
+ $(DEFINES) $(INCLUDE_DIRS)
+
+TESTS = st-renumerate-test
+check_PROGRAMS = st-renumerate-test
+
+st_renumerate_test_SOURCES = \
+ test_glsl_to_tgsi_lifetime.cpp
+
+st_renumerate_test_LDFLAGS = \
+ $(LLVM_LDFLAGS)
+
+st_renumerate_test_LDADD = \
+ $(top_builddir)/src/mesa/libmesagallium.la \
+ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \
+ $(top_builddir)/src/gallium/auxiliary/libgallium.la \
+ $(top_builddir)/src/util/libmesautil.la \
+ $(top_builddir)/src/gallium/drivers/trace/libtrace.la \
+ $(top_builddir)/src/gallium/winsys/sw/null/libws_null.la \
+ $(top_builddir)/src/gallium/drivers/softpipe/libsoftpipe.la \
+ $(top_builddir)/src/gtest/libgtest.la \
+ $(GALLIUM_COMMON_LIB_DEPS) \
+ $(LLVM_LIBS) \
+ $(PTHREAD_LIBS) \
+ $(DLOPEN_LIBS) \
+ -ldl
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
new file mode 100644
index 0000000000..a2c59fb28f
--- /dev/null
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -0,0 +1,959 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <state_tracker/st_glsl_to_tgsi_temprename.h>
+#include <tgsi/tgsi_ureg.h>
+#include <tgsi/tgsi_info.h>
+#include <compiler/glsl/list.h>
+#include <gtest/gtest.h>
+
+using std::vector;
+using std::pair;
+
+
+/* A line to describe a TGSI instruction for building mock shaders */
+struct MockCodeline {
+ MockCodeline(unsigned _op): op(_op) {}
+ MockCodeline(unsigned _op, const vector<int>& _dst,
+ const vector<int>& _src, const vector<int>&_to):
+ op(_op), dst(_dst), src(_src), tex_offsets(_to){}
+ unsigned op;
+ vector<int> dst;
+ vector<int> src;
+ vector<int> tex_offsets;
+};
+
+/* A few constants to use in the mock shaders */
+const int in0 = 0;
+const int in1 = -1;
+const int in2 = -2;
+
+const int out0 = 0;
+const int out1 = -1;
+
+/* A class to create a shader program to check the register allocation
+ * and renaming. The created exec_list is not completely set up and can
+ * only be used for the register life-time analysis. */
+class MockShader {
+public:
+ MockShader(const vector<MockCodeline>& source);
+ ~MockShader();
+
+ void free();
+
+ exec_list* get_program();
+ int get_num_temps();
+private:
+ st_src_reg create_src_register(int src_idx);
+ st_dst_reg create_dst_register(int dst_idx);
+ exec_list* program;
+ int num_temps;
+ void *mem_ctx;
+};
+
+/* type for register lifetime expectation */
+using expectation = vector<vector<int>>;
+
+
+/* This is a test class to check the exact life times of
+ * registers. */
+class LifetimeEvaluatorExactTest : public testing::Test {
+protected:
+ void run(const vector<MockCodeline>& code, const expectation& e);
+};
+
+/* This test class checks that the life time covers at least
+ * the expected range. It is used for cases where we know that
+ * the implementation could be improved on estimating the minimal
+ * life time.
+ */
+class LifetimeEvaluatorAtLeastTest : public testing::Test {
+protected:
+ void run(const vector<MockCodeline>& code, const expectation& e);
+};
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {1, in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMove)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1,in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}, {1,2}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {}, {1,2}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,2}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 5}, {2,3}, {3, 6}}));
+}
+
+
+/* in loop if/else value written only in one path, and read later
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, MoveInIfInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {5, 8}}));
+}
+
+
+// in loop if/else value written in both paths, and read later
+// - value must survive from first write to last read in loop
+// for now we only check that the minimum life time is correct
+TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {3,7}, {7, 10}}));
+}
+
+/* in loop if/else value written in both paths, read in else path
+ * before it is written there and also read later - value must
+ * survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseReadInElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_ADD, {2}, {1, 2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {1,9}, {7, 10}}));
+}
+
+/* in loop if/else read in one path before written in the same loop
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, ReadInIfInLoopBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, 3}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {1, 8}}));
+}
+
+/* Write in nested ifs in loop, for now we only test whether the
+ * life time is at least what is required, but we know that the
+ * implementation doesn't do a full check and sets larger boundaries
+ */
+TEST_F(LifetimeEvaluatorAtLeastTest, NestedIfInLoopAlwaysWriteButNotPropagated)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 15
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3, 14}}));
+}
+
+
+
+TEST_F(LifetimeEvaluatorExactTest, NestedIfInLoopWriteNotAlways)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 13
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 13}}));
+}
+
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole outer loop */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* Test whether variable is kept also if the continue is in a
+ * higher scope than the variable write */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteInLoopAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for all
+ * loops including the read loop */
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+/* if a break is in the loop, but inside a switch case, so it
+ * refers to the case and not to the loop, the variable doesn't
+ * need to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_DEFAULT},
+ { TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{8, 10}}));
+}
+
+/* if a break is in a loop that is itself inside a switch case, it
+ * refers to that inner loop. The variable has to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitchInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_DEFAULT, {}, {}, {}},
+ { TGSI_OPCODE_ENDSWITCH, {}, {}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{2, 10}}));
+}
+
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for the
+ * whole loop that includes the read */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for all
+ * loops up to the read scope */
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1, 11}}));
+}
+
+/* Temporary used in a switch must live through all case statements */
+TEST_F(LifetimeEvaluatorExactTest, UseSwitchCase)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ {TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_DEFAULT},
+ {TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 3}}));
+}
+
+/* variable written in a switch within a loop must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* value written in one case, and read in other, in loop
+ * - must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* Make sure SWITCH is closed correctly in the scope stack */
+TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+
+/* value read/write in same case, stays there */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,4}}));
+}
+
+/* value read/write in all cases, should only live from first
+ * write to last read, but currently the whole loop is used. */
+TEST_F(LifetimeEvaluatorAtLeastTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,9}}));
+}
+
+
+/* value read/write in different loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopes)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1,5}}));
+}
+
+/* value read/write in different loops, conditional */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferentScopesCondReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+/* register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least past the write instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnly)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* register read in if */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForIf)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {in0, in1}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_ENDIF}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+/* register read in switch */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForSwitchAndCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH},
+ };
+ run (code, expectation({{-1,-1},{0,3}}));
+}
+
+/* Check that a missing END is handled (Unigine Heaven creates such a
+ * shader) */
+TEST_F(LifetimeEvaluatorExactTest, DistinceScopesAndNoEndProgramId)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_ENDIF},
+
+ };
+ run (code, expectation({{-1,-1},{0,4}, {2,5}}));
+}
+
+/* Dead code elimination should catch and remove the case
+ * when a variable is written after its last read, but
+ * we want the code to be aware of this case.
+ * The life time of this uselessly written variable is set
+ * to the instruction after the write, because
+ * otherwise it could be re-used too early.
+*/
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {1}, {2}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,3}, {1,2}}));
+}
+
+/*
+ * Somewhat a duplicate of the above test, specifically using the
+ * problematic corner case in question. DFRACEXP has two
+ * destinations, and if one value is thrown away, we must ensure
+ * that the two output registers don't merge.
+ * In this test case the last access for 2 and 3 is in line 4,
+ * but only 3 can be merged with 4 because it is read, 2 on the
+ * other hand is written to, and merging it with 4 would result in
+ * undefined behaviour.
+*/
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead2)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {3}, {1,2}, {}},
+ { TGSI_OPCODE_DFRACEXP , {2,4}, {3}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {4}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,2}, {1,4}, {2,3}, {3,4}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, OnlyWriteOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+
+/* Check that two destination registers are actually used */
+TEST_F(LifetimeEvaluatorExactTest, TwoDestRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {1,2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}}));
+}
+
+/* Check that two destination registers and three source registers
+ * are used */
+TEST_F(LifetimeEvaluatorExactTest, ThreeSourceRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {in0, in1}, {}},
+ { TGSI_OPCODE_MAD, {out0}, {1,2, 3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {0,2}, {1,2}}));
+}
+
+
+class RegisterRemapping : public testing::Test {
+protected:
+ void run(const vector<pair<int, int>>& lt, const vector<int>& expect);
+};
+
+void RegisterRemapping::run(const vector<pair<int, int>>& lt,
+ const vector<int>& expect)
+{
+ rename_reg_pair proto{false, 0};
+ vector<rename_reg_pair> result(lt.size(), proto);
+
+ evaluate_remapping(lt, &result[0]);
+
+ vector<int> remap(lt.size());
+ for (unsigned i = 0; i < lt.size(); ++i) {
+ remap[i] = result[i].valid ? result[i].new_reg : i;
+ }
+
+ std::transform(remap.begin(), remap.end(), result.begin(), remap.begin(),
+ [](int x, const rename_reg_pair& rn) {
+ return rn.valid ? rn.new_reg : x;
+ });
+
+ for(unsigned i = 1; i < remap.size(); ++i)
+ EXPECT_EQ(remap[i], expect[i]);
+}
+
+TEST_F(RegisterRemapping, RegisterRemapping1)
+{
+ vector<pair<int, int>> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {1, 2},
+ {2, 10},
+ {3, 5},
+ {5, 10}
+ });
+
+ vector<int> expect({0, 1, 2, 1, 1, 2, 2});
+ run(lt, expect);
+}
+
+
+TEST_F(RegisterRemapping, RegisterRemapping2)
+{
+ vector<pair<int, int>> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {3, 3},
+ {4, 4},
+ });
+ vector<int> expect({0, 1, 2, 1, 1});
+ run(lt, expect);
+}
+
+
+
+MockShader::~MockShader()
+{
+ free();
+ ralloc_free(mem_ctx);
+}
+
+int MockShader::get_num_temps()
+{
+ return num_temps;
+}
+
+
+exec_list* MockShader::get_program()
+{
+ return program;
+}
+
+MockShader::MockShader(const vector<MockCodeline>& source):
+ num_temps(0)
+{
+ mem_ctx = ralloc_context(NULL);
+
+ program = new(mem_ctx) exec_list();
+
+ for (MockCodeline i: source) {
+ glsl_to_tgsi_instruction *next_instr = new(mem_ctx) glsl_to_tgsi_instruction();
+ next_instr->op = i.op;
+ next_instr->info = tgsi_get_opcode_info(i.op);
+
+ assert(i.src.size() < 4);
+ assert(i.dst.size() < 3);
+ assert(i.tex_offsets.size() < 3);
+
+ for (unsigned k = 0; k < i.src.size(); ++k) {
+ next_instr->src[k] = create_src_register(i.src[k]);
+ }
+ for (unsigned k = 0; k < i.dst.size(); ++k) {
+ next_instr->dst[k] = create_dst_register(i.dst[k]);
+ }
+
+ // set texture registers
+ next_instr->tex_offset_num_offset = i.tex_offsets.size();
+ if (i.tex_offsets.size() > 0)
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ else
+ next_instr->tex_offsets = 0;
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+ next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+ }
+
+ program->push_tail(next_instr);
+ }
+ ++num_temps;
+}
+
+void MockShader::free()
+{
+ // the list is not fully initialized, so
+ // tearing it down also must be done manually.
+ exec_node *p;
+ while ((p = program->pop_head())) {
+ glsl_to_tgsi_instruction * instr = static_cast<glsl_to_tgsi_instruction *>(p);
+ if (instr->tex_offset_num_offset > 0)
+ delete[] instr->tex_offsets;
+ delete p;
+ }
+ program = 0;
+ num_temps = 0;
+}
+
+st_src_reg MockShader::create_src_register(int src_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (src_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = src_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_INPUT;
+ idx = -src_idx;
+ }
+ return st_src_reg(file, idx, GLSL_TYPE_INT);
+
+}
+
+st_dst_reg MockShader::create_dst_register(int dst_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (dst_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = dst_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_OUTPUT;
+ idx = - dst_idx;
+ }
+ return st_dst_reg(file, 0xF, GLSL_TYPE_INT, idx);
+}
+
+void LifetimeEvaluatorExactTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+
+ auto lifetimes = estimate_temporary_lifetimes(shader.get_program(), shader.get_num_temps());
+
+ // lifetimes[0] not used, but created for simpler processing
+ ASSERT_EQ(lifetimes.size(), e.size());
+
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_EQ(lifetimes[i].first, e[i][0]);
+ EXPECT_EQ(lifetimes[i].second, e[i][1]);
+ }
+}
+
+void LifetimeEvaluatorAtLeastTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+
+ auto lifetimes = estimate_temporary_lifetimes(shader.get_program(), shader.get_num_temps());
+
+ // lifetimes[0] not used, but created for simpler processing
+ ASSERT_EQ(lifetimes.size(), e.size());
+
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_LE(lifetimes[i].first, e[i][0]);
+ EXPECT_GE(lifetimes[i].second, e[i][1]);
+ }
+}
--
2.13.0
Nicolai Hähnle
2017-06-18 10:05:11 UTC
Post by Gert Wollny
This patch adds new classes and tests to implement a tracker for the
life time of temporary registers for the register renaming stage of
glsl_to_tgsi. The tracker aims at estimating the shortest possible
life time for each register. The code base requires c++11, the flag is
propagated from the LLVM_CXXFLAGS.
---
configure.ac | 1 +
src/mesa/Makefile.am | 4 +-
src/mesa/Makefile.sources | 2 +
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 202 +++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 164 ++++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 674 +++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 30 +
src/mesa/state_tracker/tests/Makefile.am | 40 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 959 +++++++++++++++++++++
9 files changed, 2074 insertions(+), 2 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
diff --git a/configure.ac b/configure.ac
index 6c67d27084..855d06779c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2827,6 +2827,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/Makefile])
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..72ffd61212 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests
if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
@@ -101,7 +101,7 @@ AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
$(MSVC2013_COMPAT_CFLAGS)
AM_CXXFLAGS = \
- $(LLVM_CFLAGS) \
+ $(LLVM_CXXFLAGS) \
I kind of suspect that this might be a no-no. On the one hand it makes
sense, because it makes the use of CXXFLAGS consistent, but it's a
pretty significant build system change.

At least separate it out into its own patch.
Post by Gert Wollny
$(VISIBILITY_CXXFLAGS) \
$(MSVC2013_COMPAT_CXXFLAGS)
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..a68e9d2afe 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+ state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
Inconsistent use of whitespace.

Then the whole patch needs to be re-arranged, i.e. the
st_glsl_to_tgsi_private stuff, and I agree with Emil that the tests
should be separated out into their own patch.

BTW, you can and should use

git rebase -x "cd $builddir && make" $basecommit

to verify that you don't break the build in an intermediate patch.


[snip]
Post by Gert Wollny
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
new file mode 100644
index 0000000000..a2e8e3778c
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -0,0 +1,674 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_temprename.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+#include <stack>
+#include <algorithm>
+#include <limits>
+
+using std::vector;
+using std::stack;
+using std::pair;
+using std::make_pair;
+using std::numeric_limits;
+
+
+typedef int scope_idx;
Please remove this, it's an unnecessary and distracting abstraction that
doesn't gain you anything.
Post by Gert Wollny
+
+enum e_scope_type {
+ sct_outer,
+ sct_loop,
+ sct_if,
+ sct_else,
+ sct_switch,
+ sct_switch_case,
+ sct_switch_default,
+ sct_unknown
+};
+
+enum e_acc_type {
+ acc_read,
+ acc_write
+};
+
+class prog_scope {
+
+ prog_scope(e_scope_type type, int my_idx, int id, int lvl, int s_begin,
+ vector<prog_scope>& scopes);
+ prog_scope(scope_idx p, e_scope_type type, int my_idx, int id,
+ int lvl, int s_begin, vector<prog_scope>& scopes);
+
+ e_scope_type type() const { return scope_type; }
+ scope_idx parent() const { return parent_scope; }
+ int level() const {return nested_level; }
+ int id() const { return scope_id; }
+ int end() const {return scope_end; }
+ int begin() const {return scope_begin; }
+ int loop_continue_line() const {return loop_continue;}
+
+ scope_idx in_ifelse() const;
+ scope_idx in_switchcase() const;
+
+ bool in_loop() const;
+ scope_idx get_parent_loop() const;
+ bool is_conditional() const;
+ bool contains(scope_idx idx) const;
+ void set_end(int end);
+ void set_previous(scope_idx prev);
+ void set_continue(scope_idx scope, int i);
+ bool enclosed_by_loop_prior_to_switch();
+
+ e_scope_type scope_type;
+ int scope_id;
+ int nested_level;
+ int scope_begin;
+ int scope_end;
+ int loop_continue;
+
+ scope_idx my_idx;
+ scope_idx scope_of_loop_to_continue;
+ scope_idx previous_switchcase;
+ scope_idx parent_scope;
+
+ vector<prog_scope>& scopes;
I consider the scopes back-reference here and in temp_access to be bad
style. Think less object- and more data-oriented, then you won't need
them. Most of those helper functions should be members of
tgsi_temp_lifetime -- and the getters/setters can be removed entirely.
YAGNI.
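
As a rough illustration of the data-oriented layout, here is a minimal sketch (not code from the patch). It assumes a simplified prog_scope that is plain data and a hypothetical lifetime_tracker that owns the scope array; the names add_scope and innermost_loop are made up for this example.

```cpp
#include <cassert>
#include <vector>

enum scope_type { sct_outer, sct_loop, sct_if, sct_else };

/* Plain data: no back-reference to the container that owns it. */
struct prog_scope {
   scope_type type;
   int parent;   /* index into the scope array, -1 for the outermost scope */
   int begin;
   int end;      /* -1 while the scope is still open */
};

/* Hypothetical owner of all scopes. The helpers live here and index the
 * array directly, so individual scopes need no reference to it. */
struct lifetime_tracker {
   std::vector<prog_scope> scopes;

   /* Not const: it modifies the tracker, so no 'mutable' is needed. */
   int add_scope(scope_type type, int parent, int begin)
   {
      scopes.push_back(prog_scope{type, parent, begin, -1});
      return static_cast<int>(scopes.size()) - 1;
   }

   /* Walk up the parent chain; return the innermost enclosing loop
    * (possibly idx itself) or -1 if there is none. */
   int innermost_loop(int idx) const
   {
      while (idx >= 0 && scopes[idx].type != sct_loop)
         idx = scopes[idx].parent;
      return idx;
   }

   bool in_loop(int idx) const { return innermost_loop(idx) >= 0; }
};
```

With this shape the recursion through scopes[parent_scope] becomes a simple loop over indices, and the getters and setters disappear because the fields are accessed directly.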
Post by Gert Wollny
+};
+
+class temp_access {
+
+ temp_access(vector<prog_scope>& scopes);
+ void append(int index, e_acc_type rw, scope_idx pstate);
+ pair<int, int> get_required_lifetime();
+
+
+ struct temp_access_record {
+ int index;
+ e_acc_type acc;
+ scope_idx prog_scope;
+ };
+
+ std::vector<prog_scope>& scopes;
+
+ bool keep_for_full_loop;
+
+ scope_idx last_read_scope;
+ scope_idx undefined_read_scope;
+ scope_idx first_write_scope;
+
+ int first_write;
+ int last_read;
+ int last_write;
+ int undefined_read;
+};
+
+
+class tgsi_temp_lifetime {
+
+
+ tgsi_temp_lifetime();
+
+ vector<std::pair<int, int> > get_lifetimes(exec_list *instructions,
+ int ntemps) const;
+
+ scope_idx make_scope(e_scope_type type, int id, int lvl, int s_begin) const;
+ scope_idx make_scope(scope_idx p, e_scope_type type, int id,
+ int lvl, int s_begin) const;
+
+ void evaluate();
+
+ mutable vector<prog_scope> scopes;
To first approximation, you should never use mutable. Those methods
should not be const to begin with.
Post by Gert Wollny
+
+};
+
+tgsi_temp_lifetime::tgsi_temp_lifetime()
+{
+ scopes.reserve(20);
+}
+
+scope_idx
+tgsi_temp_lifetime::make_scope(e_scope_type type, int id, int lvl,
+ int s_begin)const
+{
+ int idx = scopes.size();
+ scopes.push_back(prog_scope(type, idx, id, lvl, s_begin, scopes));
+ return idx;
+}
AFAICS this overload is only used once. Please remove it.
Post by Gert Wollny
+
+scope_idx
+tgsi_temp_lifetime::make_scope(scope_idx p, e_scope_type type, int id,
+ int lvl, int s_begin) const
+{
+ int idx = scopes.size();
+ scopes.push_back(prog_scope(p, type, idx, id, lvl, s_begin, scopes));
+ return idx;
+}
+
+vector<pair<int, int> >
+tgsi_temp_lifetime::get_lifetimes(exec_list *instructions, int ntemps) const
+{
+ int line = 0;
+ int loop_id = 0;
+ int if_id = 0;
+ int switch_id = 0;
+ int nesting_lvl = 0;
+ bool is_at_end = false;
+ stack<scope_idx> scope_stack;
+
+ std::vector<std::pair<int, int> > lifetimes(ntemps);
+ vector<temp_access> acc(ntemps, temp_access(scopes));
+
+ scope_idx current = make_scope(sct_outer, 0, nesting_lvl++, line);
+
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (is_at_end) {
+ // shader has instructions past end marker; we ignore this
Use C-style comments. Also, this should probably have an assert, or can
this really happen?
Post by Gert Wollny
+ break;
+ }
+
+ switch (inst->op) {
+ case TGSI_OPCODE_BGNLOOP: {
+ scope_idx scope = make_scope(current, sct_loop, loop_id,
+ nesting_lvl, line);
+ ++loop_id;
+ ++nesting_lvl;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ENDLOOP: {
+ --nesting_lvl;
+ scopes[current].set_end(line);
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+ case TGSI_OPCODE_UIF:{
+ if (inst->src[0].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[0].index].append(line, acc_read, current);
+ }
+ scope_idx scope = make_scope(current, sct_if, if_id, nesting_lvl, line);
It's probably a good idea to have scopes start at line + 1. Otherwise,
it may be tempting to think that the read access of the IF condition
happens inside the IF scope, at least based on the line numbers.
Post by Gert Wollny
+ ++if_id;
+ ++nesting_lvl;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ELSE: {
+ scopes[current].set_end(line-1);
+ current = make_scope(scopes[current].parent(), sct_else,
+ scopes[current].id(), scopes[current].level(), line);
+ break;
+ }
+ case TGSI_OPCODE_END:{
+ scopes[current].set_end(line);
+ is_at_end = true;
+ break;
+ }
+ case TGSI_OPCODE_ENDIF:{
+ --nesting_lvl;
+ scopes[current].set_end(line-1);
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+ case TGSI_OPCODE_SWITCH: {
+ scope_idx scope = make_scope(current, sct_switch, switch_id,
+ nesting_lvl, line);
+ ++nesting_lvl;
+ ++switch_id;
+ scope_stack.push(current);
+ current = scope;
+ break;
+ }
+ case TGSI_OPCODE_ENDSWITCH: {
+ --nesting_lvl;
+ scopes[current].set_end(line-1);
+
+ // remove the case level
+ if (scopes[current].type() != sct_switch ) {
+ current = scope_stack.top();
+ scope_stack.pop();
+ }
+ current = scope_stack.top();
+ scope_stack.pop();
+ break;
+ }
+
+ case TGSI_OPCODE_CASE:
+ if (inst->src[0].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[0].index].append(line, acc_read, current);
+ } // fall through
C-style comments, and /* fall-through */ should be on its own line.
Post by Gert Wollny
+ case TGSI_OPCODE_DEFAULT: {
+ auto scope_type = (inst->op == TGSI_OPCODE_CASE) ?
+ sct_switch_case : sct_switch_default;
+ if ( scopes[current].type() == sct_switch ) {
No spaces inside parentheses (also elsewhere).
Post by Gert Wollny
+ scope_stack.push(current);
+ current = make_scope(current, scope_type, scopes[current].id(),
+ nesting_lvl, line);
+ }else{
Spaces around else (also elsewhere).
Post by Gert Wollny
+ auto p = scopes[current].parent();
+ auto scope = make_scope(p, scope_type, scopes[p].id(),
+ scopes[p].level(), line);
+ if (scopes[current].end() == -1)
+ scopes[scope].set_previous(current);
+ current = scope;
+ }
+ break;
+ }
+ case TGSI_OPCODE_BRK: {
+ if ( (scopes[current].type() == sct_switch_case) ||
+ (scopes[current].type() == sct_switch_default)) {
+ scopes[current].set_end(line-1);
+ }
+ /* Make sure that the nearest enclosing scope is a loop
+ * and not a switch case.
+ * Apart from that this is like a continue, just
+ * a bit more final */
+ else if (scopes[current].enclosed_by_loop_prior_to_switch()) {
No comment between if- and else-block.
Post by Gert Wollny
+ scopes[current].set_continue(current, line);
You do need to distinguish between break and continue (at least for
accuracy), because break allows you to skip over a piece of code
indefinitely while continue doesn't. I.e.:

BGNLOOP
...
BRK/CONT
...
MOV TEMP[i], ...
...
ENDLOOP

If it's a CONT, then i is written unconditionally inside the loop. If
it's a BRK, then it isn't.
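
A minimal sketch of tracking the two opcodes separately, under the assumption that each loop scope keeps its own record; loop_exits and its member names are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <limits>

/* Hypothetical per-loop record. The patch stores a single loop_continue
 * line for both opcodes; here BRK and CONT are kept apart. */
struct loop_exits {
   int first_break = std::numeric_limits<int>::max();
   int first_continue = std::numeric_limits<int>::max();

   void record_break(int line)
   {
      if (line < first_break)
         first_break = line;
   }

   void record_continue(int line)
   {
      if (line < first_continue)
         first_continue = line;
   }

   /* Only a preceding BRK makes a later write in the loop conditional:
    * the loop can be left before the write is ever executed. After a
    * CONT the loop is still running, so an earlier CONT alone is not
    * treated as making the write conditional. */
   bool write_is_conditional(int write_line) const
   {
      return first_break < write_line;
   }
};
```

In the example above, a CONT at line 2 and a write at line 4 would leave the write unconditional, while a BRK at line 2 would mark it as conditional.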
Post by Gert Wollny
+ }
+ break;
+ }
+ case TGSI_OPCODE_CONT: {
+ scopes[current].set_continue(current, line);
+ break;
+ }
+
+ default: {
+
+ for (unsigned j = 0; j < num_inst_dst_regs(inst); j++) {
+ if (inst->dst[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->dst[j].index].append(line, acc_write, current);
+ }
+ }
+
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].append(line, acc_read, current);
+ }
+ }
+
+ for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->tex_offsets[j].index].append(line, acc_read, current);
+ }
+ }
You need to handle the reads before the writes. After all, you might have a

ADD TEMP[i], TEMP[i], ...

which may have to register as an undefined read.
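
The ordering issue can be sketched with a simplified access tracker. This is an illustration only: temp_state, mock_inst and record_accesses are hypothetical names, and the real code works on glsl_to_tgsi_instruction rather than on plain index lists.

```cpp
#include <cassert>
#include <vector>

/* Hypothetical per-temporary state. */
struct temp_state {
   bool written = false;
   bool read_before_write = false;
};

/* Hypothetical stand-in for an instruction: temporary indices only. */
struct mock_inst {
   std::vector<int> dst;
   std::vector<int> src;
};

void record_accesses(const mock_inst &inst, std::vector<temp_state> &temps)
{
   /* Sources first: the instruction consumes the old values before it
    * produces new ones, so ADD TEMP[i], TEMP[i], ... reads TEMP[i] in
    * its not-yet-written state. */
   for (int s : inst.src) {
      if (!temps[s].written)
         temps[s].read_before_write = true;
   }

   /* Destinations afterwards. */
   for (int d : inst.dst)
      temps[d].written = true;
}
```

If the two loops were swapped, the write would be recorded first and the read of TEMP[i] would wrongly look like a read of a defined value.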
Post by Gert Wollny
+
+ } // end default
+ } // end switch (op)
No end-of-scope comments.
Post by Gert Wollny
+
+ ++line;
+ }
+
+ // make sure last scope is closed, even though no
+ // TGSI_OPCODE_END was given
+ if (scopes[current].end() < 0) {
+ scopes[current].set_end(line-1);
+ }
+
+ for(unsigned i = 1; i < lifetimes.size(); ++i) {
+ lifetimes[i] = acc[i].get_required_lifetime();
+ }
+ scopes.clear();
+ return lifetimes;
+}
+
+
+prog_scope::prog_scope(e_scope_type type, int idx, int id,
+ int lvl, int s_begin,
+ vector<prog_scope>& s):
+ prog_scope(-1, type, idx, id, lvl, s_begin, s)
+{
+}
+
+prog_scope::prog_scope(scope_idx p, e_scope_type type,
+ int idx, int id, int lvl, int s_begin,
+ vector<prog_scope>& s):
+ scope_type(type),
+ scope_id(id),
+ nested_level(lvl),
+ scope_begin(s_begin),
+ scope_end(-1),
+ loop_continue(numeric_limits<int>::max()),
+ my_idx(idx),
+ scope_of_loop_to_continue(0),
+ previous_switchcase(0),
+ parent_scope(p),
+ scopes(s)
+{
+}
+
+bool prog_scope::in_loop() const
+{
+ if (scope_type == sct_loop)
+ return true;
+ if (parent_scope >= 0)
+ return scopes[parent_scope].in_loop();
+ return false;
+}
+
+scope_idx
+prog_scope::get_parent_loop() const
Based on what it does, this should be renamed to something like
get_innermost_loop.
Post by Gert Wollny
+{
+ if (scope_type == sct_loop)
+ return my_idx;
+ if (parent_scope >= 0)
+ return scopes[parent_scope].get_parent_loop();
+ else
+ return -1;
+}
+
+bool prog_scope::contains(scope_idx other) const
+{
+ return (begin() <= scopes[other].begin()) && (end() >= scopes[other].end());
+}
+
+bool prog_scope::is_conditional() const
+{
+ return scope_type == sct_if || scope_type == sct_else ||
+ scope_type == sct_switch_case || scope_type == sct_switch_default;
+}
+
+bool prog_scope::enclosed_by_loop_prior_to_switch()
+{
+ if (scope_type == sct_loop)
+ return true;
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default ||
+ scope_type == sct_switch)
+ return false;
+ if (parent_scope >= 0)
+ return scopes[parent_scope].enclosed_by_loop_prior_to_switch();
+ else
+ return false;
+}
+
+scope_idx prog_scope::in_ifelse() const
+{
+ if ((scope_type == sct_if) ||
+ (scope_type == sct_else))
+ return my_idx;
+ else if (parent_scope >= 0)
+ return scopes[parent_scope].in_ifelse();
+ else
+ return -1;
+}
+
+scope_idx prog_scope::in_switchcase() const
+{
+ if ((scope_type == sct_switch_case) ||
+ (scope_type == sct_switch_default))
+ return my_idx;
+ else if (parent_scope >= 0)
+ return scopes[parent_scope].in_switchcase();
+ else
+ return -1;
+}
+
+void prog_scope::set_previous(scope_idx prev)
+{
+ previous_switchcase = prev;
+}
+
+void prog_scope::set_end(int end)
+{
+ if (scope_end == -1) {
+ scope_end = end;
+ if (previous_switchcase)
+ scopes[parent_scope].set_end(end);
+ }
+}
+
+void prog_scope::set_continue(scope_idx scope, int line)
+{
+ if (scope_type == sct_loop) {
+ scope_of_loop_to_continue = scope;
+ loop_continue = line;
+ } else if (parent_scope >= 0)
+ scopes[parent_scope].set_continue(scope, line);
+}
+
+temp_access::temp_access(vector<prog_scope>& s):
+ scopes(s),
+ keep_for_full_loop(false),
+ last_read_scope(-1),
+ undefined_read_scope(-1),
+ first_write_scope(-1),
+ first_write(-1),
+ last_read(-1),
+ last_write(-1),
+ undefined_read(numeric_limits<int>::max())
+{
+}
+
+void temp_access::append(int line, e_acc_type acc, scope_idx idx)
+{
+ last_write = line;
Looks like last_write should be called last_access.
Post by Gert Wollny
+ if (acc == acc_read) {
+ last_read = line;
+ last_read_scope = idx;
+ if (undefined_read > line) {
+ undefined_read = line;
+ undefined_read_scope = idx;
+ }
The way you're using it, this is effectively just first_read, not
undefined_read.
Post by Gert Wollny
+ } else {
+ if (first_write == -1) {
+ first_write = line;
+ first_write_scope = idx;
+
+ // we write in an if-branch
+ auto fw_ifthen_scope = scopes[idx].in_ifelse();
+ if ((fw_ifthen_scope >= 0) && scopes[fw_ifthen_scope].in_loop()) {
+ // value not always written, in loops we must keep it
+ keep_for_full_loop = true;
+ } else {
+ // same thing for switch-case
+ auto fw_switch_scope = scopes[idx].in_switchcase();
+ if (fw_switch_scope >= 0 && scopes[fw_switch_scope].in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
Simplify this by using an in_conditional() instead of in_ifelse +
in_switchcase().
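
One possible shape for such a helper, as a sketch only: it assumes scopes are stored as plain records with a parent index, and in_conditional is written here as a free function over that array rather than as the member the patch would need.

```cpp
#include <cassert>
#include <vector>

enum e_scope_type {
   sct_outer, sct_loop, sct_if, sct_else,
   sct_switch, sct_switch_case, sct_switch_default
};

/* Simplified scope record for this example. */
struct scope {
   e_scope_type type;
   int parent;   /* -1 for the outermost scope */
};

static bool is_conditional(e_scope_type t)
{
   return t == sct_if || t == sct_else ||
          t == sct_switch_case || t == sct_switch_default;
}

/* Index of the innermost conditional scope enclosing idx (idx itself
 * counts), or -1 if the access is not inside any conditional. */
int in_conditional(const std::vector<scope> &scopes, int idx)
{
   while (idx >= 0 && !is_conditional(scopes[idx].type))
      idx = scopes[idx].parent;
   return idx;
}
```

The two nested checks in append() would then collapse into a single test of whether the scope returned by in_conditional() lies inside a loop.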
Post by Gert Wollny
+ }
+ }
+}
+
+pair<int, int> temp_access::get_required_lifetime()
+{
+ /* this temp is only read, this is undefined
+ behaviour, so we can use the register otherwise */
+ if (first_write_scope < 0) {
+ return make_pair(-1, -1);
+ }
+
+ /* Only written to, just make sure that renaming
+ * doesn't reuse this register too early (corner
+ * case is the one opcode with two destinations) */
+ if (last_read_scope < 0) {
+ return make_pair(first_write, first_write + 1);
What if there are multiple writes to the temporary?
Post by Gert Wollny
+ }
+
+ // evaluate the shared scope
+ int target_level = -1;
+
+ while (target_level < 0) {
+ if (scopes[last_read_scope].contains(first_write_scope)) {
+ target_level = scopes[last_read_scope].level();
+ } else if (scopes[first_write_scope].contains(last_read_scope)) {
+ target_level = scopes[first_write_scope].level();
+ } else {
+ // scopes (partially) disjunct, move up
+ if (scopes[last_read_scope].type() == sct_loop) {
+ last_read = scopes[last_read_scope].end();
+ }
+ last_read_scope = scopes[last_read_scope].parent();
+ }
+ }
+
+ // propagate the read scope to the target_level
+ while (scopes[last_read_scope].level() > target_level) {
+
+ /* if the read is in a loop we need to extend the
+ * variables life time to the end of that loop */
+ if (scopes[last_read_scope].type() == sct_loop) {
+ last_read = scopes[last_read_scope].end();
+ }
+ last_read_scope = scopes[last_read_scope].parent();
+ }
+
+ /* propagate lifetime also if there was a continue/break
+ * in a loop and the write was after it (so it constitutes
+ * a conditional write) */
It's only conditional if there was a break. Continue doesn't make it
conditional.
Post by Gert Wollny
+ if (scopes[first_write_scope].loop_continue_line() < first_write) {
+ keep_for_full_loop = true;
+ }
+
+ /* propagate lifetimes before moving to upper scopes */
+ if ((scopes[first_write_scope].type() == sct_loop) &&
+ (keep_for_full_loop || (undefined_read < first_write))) {
+ first_write = scopes[first_write_scope].begin();
+ int lr = scopes[first_write_scope].end();
+ if (last_read < lr)
+ last_read = lr;
+ }
What about:

BGNLOOP
...
BGNLOOP
IF ...
read TEMP[i]
ENDIF
write TEMP[i]
ENDLOOP
...
ENDLOOP

In this case, you don't set keep_for_full_loop, yet the lifetime must
extend for the entire _outer_ loop.

I see two related problems in the code:

1. In this case, keep_for_full_loop needs to be set to true (because
overwriting first_write in the code above means that the related check
below won't fire).

2. target_level is set to the inner loop even though it really needs to
be set to the outer loop.
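
For the second point, a sketch of how the target scope could be found. It rests on an assumption not checked here: that no unconditional write in an intermediate loop precedes the read, so the value really is carried across iterations of every enclosing loop. The scope record and outermost_loop are hypothetical names.

```cpp
#include <cassert>
#include <vector>

enum e_scope_type { sct_outer, sct_loop, sct_if };

/* Simplified scope record for this example. */
struct scope {
   e_scope_type type;
   int parent;   /* -1 for the outermost scope */
   int begin;
   int end;
};

/* Outermost loop that encloses scope idx (idx itself counts),
 * or -1 if idx is not inside any loop. */
int outermost_loop(const std::vector<scope> &scopes, int idx)
{
   int found = -1;
   for (; idx >= 0; idx = scopes[idx].parent) {
      if (scopes[idx].type == sct_loop)
         found = idx;
   }
   return found;
}
```

For the shader above, the conditional read sits in the IF inside the inner loop; the lifetime would then be stretched to the begin and end of the scope returned by outermost_loop() rather than to those of the inner loop.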

That's it for now.

Cheers,
Nicolai
Post by Gert Wollny
+
+ // propagate the first_write scope to the target_level
+ while (target_level < scopes[first_write_scope].level()) {
+
+ first_write_scope = scopes[first_write_scope].parent();
+
+ if (scopes[first_write_scope].loop_continue_line() < first_write) {
+ keep_for_full_loop = true;
+ }
+
+ // if the value is conditionally written in a loop
+ // then propagate its lifetime to the full loop
+ if (scopes[first_write_scope].type() == sct_loop) {
+ if (keep_for_full_loop || (undefined_read < first_write)) {
+ first_write = scopes[first_write_scope].begin();
+ int lr = scopes[first_write_scope].end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+ }
+
+ // if we currently don't propagate the lifetime but
+ // the enclosing scope is a conditional within a loop
+ // up to the last-read level we need to propagate,
+ // todo: to tighten the life time check whether the value
+ // is written in all conditional code paths below the loop
+ if (!keep_for_full_loop &&
+ scopes[first_write_scope].is_conditional() &&
+ scopes[first_write_scope].in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
+
+
+ /* We do not correct the last_write for scope, but
+ * if it is past the last_read we have to keep the
+ * temporary alive past this instruction */
+ if (last_write > last_read) {
+ last_read = last_write + 1;
+ }
+
+ return make_pair(first_write, last_read);
+}
+
+vector<pair<int, int>>
+estimate_temporary_lifetimes(exec_list *instructions, int ntemps)
+{
+ return tgsi_temp_lifetime().get_lifetimes(instructions, ntemps);
+}
+
+void evaluate_remapping(const std::vector<std::pair<int, int>>& lifetimes,
+ struct rename_reg_pair *result)
+{
+ struct access_record {
+ int begin;
+ int end;
+ unsigned reg;
+ bool erase;
+ };
+
+ auto compare_begin = [](const access_record& a, const access_record& b) {
+ return a.begin < b.begin;
+ };
+ auto compare_end_begin = [](const access_record& a, const access_record& b) {
+ return a.end <= b.begin;
+ };
+
+ vector<access_record> m(lifetimes.size() - 1);
+
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ m[i-1] = {lifetimes[i].first, lifetimes[i].second, i, false};
+ }
+
+ std::sort(m.begin(), m.end(), compare_begin);
+
+ auto trgt = m.begin();
+ auto mend = m.end();
+ auto first_erase = mend;
+ auto search_start = trgt + 1;
+
+ while (trgt != mend) {
+
+ auto src = std::upper_bound(search_start, mend, *trgt, compare_end_begin);
+ if (src != mend) {
+ result[src->reg].new_reg = trgt->reg;
+ result[src->reg].valid = true;
+ trgt->end = src->end;
+
+ /* Since we only search forward, don't erase the renamed
+ * register just now, just mark it for removal. The alternative
+ * to call m.erase(src) here would be quite expensive. */
+ src->erase = true;
+ if (first_erase == mend)
+ first_erase = src;
+ search_start = src + 1;
+ } else {
+ /* Moving to the next target register it is time to
+ * erase the already merged registers */
+ if (first_erase != mend) {
+ auto out = first_erase;
+ auto in_start = first_erase + 1;
+ while (in_start != mend) {
+ if (!in_start->erase)
+ *out++ = *in_start;
+ ++in_start;
+ }
+ mend = out;
+ first_erase = mend;
+ }
+ ++trgt;
+ search_start = trgt + 1;
+ }
+ }
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
new file mode 100644
index 0000000000..04d5321682
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -0,0 +1,30 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+
+std::vector<std::pair<int, int>>
+estimate_temporary_lifetimes(exec_list *instructions, int ntemps);
+
+void evaluate_remapping(const std::vector<std::pair<int, int>>& lt,
+ struct rename_reg_pair *result);
diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 0000000000..ac6def682a
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,40 @@
+AM_CFLAGS = \
+ $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+ $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+ -I$(top_srcdir)/src/gtest/include \
+ -I$(top_srcdir)/src \
+ -I$(top_srcdir)/src/mapi \
+ -I$(top_builddir)/src/mesa \
+ -I$(top_srcdir)/src/mesa \
+ -I$(top_srcdir)/include \
+ -I$(top_srcdir)/src/gallium/include \
+ -I$(top_srcdir)/src/gallium/auxiliary \
+ $(DEFINES) $(INCLUDE_DIRS)
+
+TESTS = st-renumerate-test
+check_PROGRAMS = st-renumerate-test
+
+st_renumerate_test_SOURCES = \
+ test_glsl_to_tgsi_lifetime.cpp
+
+st_renumerate_test_LDFLAGS = \
+ $(LLVM_LDFLAGS)
+
+st_renumerate_test_LDADD = \
+ $(top_builddir)/src/mesa/libmesagallium.la \
+ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \
+ $(top_builddir)/src/gallium/auxiliary/libgallium.la \
+ $(top_builddir)/src/util/libmesautil.la \
+ $(top_builddir)/src/gallium/drivers/trace/libtrace.la \
+ $(top_builddir)/src/gallium/winsys/sw/null/libws_null.la \
+ $(top_builddir)/src/gallium/drivers/softpipe/libsoftpipe.la \
+ $(top_builddir)/src/gtest/libgtest.la \
+ $(GALLIUM_COMMON_LIB_DEPS) \
+ $(LLVM_LIBS) \
+ $(PTHREAD_LIBS) \
+ $(DLOPEN_LIBS) \
+ -ldl
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
new file mode 100644
index 0000000000..a2c59fb28f
--- /dev/null
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -0,0 +1,959 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <state_tracker/st_glsl_to_tgsi_temprename.h>
+#include <tgsi/tgsi_ureg.h>
+#include <tgsi/tgsi_info.h>
+#include <compiler/glsl/list.h>
+#include <gtest/gtest.h>
+
+using std::vector;
+using std::pair;
+
+
+/* A line to describe a TGSI instruction for building mock shaders */
+struct MockCodeline {
+ MockCodeline(unsigned _op): op(_op) {}
+ MockCodeline(unsigned _op, const vector<int>& _dst,
+ const vector<int>& _src, const vector<int>& _to):
+ op(_op), dst(_dst), src(_src), tex_offsets(_to){}
+ unsigned op;
+ vector<int> dst;
+ vector<int> src;
+ vector<int> tex_offsets;
+};
+
+/* A few constants to use in the mock shaders */
+const int in0 = 0;
+const int in1 = -1;
+const int in2 = -2;
+
+const int out0 = 0;
+const int out1 = -1;
+
+/* A class to create a shader program to check the register allocation
+ * and renaming. The created exec_list is not completely set up and can
+ * only be used for the register life-time analysis. */
+class MockShader {
+ MockShader(const vector<MockCodeline>& source);
+ ~MockShader();
+
+ void free();
+
+ exec_list* get_program();
+ int get_num_temps();
+ st_src_reg create_src_register(int src_idx);
+ st_dst_reg create_dst_register(int dst_idx);
+ exec_list* program;
+ int num_temps;
+ void *mem_ctx;
+};
+
+/* type for register lifetime expectation */
+using expectation = vector<vector<int>>;
+
+
+/* This is a test class to check the exact life times of
+ * registers. */
+class LifetimeEvaluatorExactTest : public testing::Test {
+ void run(const vector<MockCodeline>& code, const expectation& e);
+};
+
+/* This test class checks that the life time covers at least
+ * the expected range. It is used for cases where we know that
+ * the implementation could be improved on estimating the minimal
+ * life time.
+ */
+class LifetimeEvaluatorAtLeastTest : public testing::Test {
+ void run(const vector<MockCodeline>& code, const expectation& e);
+};
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {1, in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMove)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1,in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}, {1,2}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {}, {1,2}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,2}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 5}, {2,3}, {3, 6}}));
+}
+
+
+/* in loop if/else value written only in one path, and read later
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, MoveInIfInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {5, 8}}));
+}
+
+
+// in loop if/else value written in both path, and read later
+// - value must survive from first write to last read in loop
+// for now we only check that the minimum life time is correct
+TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {3,7}, {7, 10}}));
+}
+
+/* in loop if/else value written in both paths, read in else path
+ * before written and also read later - value must survive from first
+ * write to last read in loop */
+TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseReadInElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_ADD, {2}, {1, 2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {1,9}, {7, 10}}));
+}
+
+/* in loop if/else read in one path before written in the same loop
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, ReadInIfInLoopBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, 3}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {1, 8}}));
+}
+
+/* Write in nested ifs in loop, for now we do test whether the
+ * life time is at least what is required, but we know that the
+ * implementation doesn't do a full check and sets larger boundaries
+ */
+TEST_F(LifetimeEvaluatorAtLeastTest, NestedIfInLoopAlwaysWriteButNotPropagated)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 15
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3, 14}}));
+}
+
+
+
+TEST_F(LifetimeEvaluatorExactTest, NestedIfInLoopWriteNotAlways)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 13
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 13}}));
+}
+
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole outer loop */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* Test whether variable is kept also if the continue is in a
+ * higher scope than the variable write */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteInLoopAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for all
+ * loops including the read loop */
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+/* if a break is in the loop, but inside a switch case, so it
+ * refers to the case and not to the loop, the variable doesn't
+ * need to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_DEFAULT},
+ { TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{8, 10}}));
+}
+
+/* if a break is in a loop that is itself inside a switch case, it
+ * refers to that inner loop. The variable has to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitchInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_DEFAULT, {}, {}, {}},
+ { TGSI_OPCODE_ENDSWITCH, {}, {}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{2, 10}}));
+}
+
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for the
+ * whole loop that includes the read */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for all
+ * loops up to the read scope */
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1, 11}}));
+}
+
+/* Temporary used to switch must live through all case statements */
+TEST_F(LifetimeEvaluatorExactTest, UseSwitchCase)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ {TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_DEFAULT},
+ {TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 3}}));
+}
+
+/* variable written in a switch within a loop must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* value written in one case, and read in other, in loop
+ * - must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* Make sure SWITCH is closed correctly in the scope stack */
+TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+
+/* value read/write in same case, stays there */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,4}}));
+}
+
+/* value read/write in all cases, should only live from first
+ * write to last read, but currently the whole loop is used. */
+TEST_F(LifetimeEvaluatorAtLeastTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,9}}));
+}
+
+
+/* value read/write in different loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopes)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1,5}}));
+}
+
+/* value read/write in different loops, conditional */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferentScopesCondReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+/* register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least past the write instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnly)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* register read in if */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForIf)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {in0, in1}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_ENDIF}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+/* register read in switch */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForSwitchAndCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH},
+ };
+ run (code, expectation({{-1,-1},{0,3}}));
+}
+
+/* Check that a missing END is handled (Unigine Heaven creates such a
+ * shader) */
+TEST_F(LifetimeEvaluatorExactTest, DistinceScopesAndNoEndProgramId)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_ENDIF},
+
+ };
+ run (code, expectation({{-1,-1},{0,4}, {2,5}}));
+}
+
+/* Dead code elimination should catch and remove the case
+ * when a variable is written after its last read, but
+ * we want the code to be aware of this case.
+ * The life time of this uselessly written variable is set
+ * to the instruction after the write, because
+ * otherwise it could be re-used too early.
+*/
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {1}, {2}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,3}, {1,2}}));
+}
+
+/*
+ * Somewhat a duplicate of the above test, but specifically using the
+ * problematic corner case in question. DFRACEXP has two
+ * destinations, and if one value is thrown away, we must ensure
+ * that the two output registers don't merge.
+ * In this test case the last access for 2 and 3 is in line 4,
+ * but only 3 can be merged with 4 because it is read, 2 on the
+ * other hand is written to, and merging it with 4 would result in
+ * undefined behaviour.
+*/
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead2)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {3}, {1,2}, {}},
+ { TGSI_OPCODE_DFRACEXP , {2,4}, {3}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {4}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,2}, {1,4}, {2,3}, {3,4}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, OnlyWriteOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+
+/* Check that two destination registers are actually used */
+TEST_F(LifetimeEvaluatorExactTest, TwoDestRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {1,2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}}));
+}
+
+/* Check that two destination registers and three source registers
+ * are used */
+TEST_F(LifetimeEvaluatorExactTest, ThreeSourceRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {in0, in1}, {}},
+ { TGSI_OPCODE_MAD, {out0}, {1,2, 3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {0,2}, {1,2}}));
+}
+
+
+class RegisterRemapping : public testing::Test {
+protected:
+ void run(const vector<pair<int, int>>& lt, const vector<int>& expect);
+};
+
+void RegisterRemapping::run(const vector<pair<int, int>>& lt,
+ const vector<int>& expect)
+{
+ rename_reg_pair proto{false, 0};
+ vector<rename_reg_pair> result(lt.size(), proto);
+
+ evaluate_remapping(lt, &result[0]);
+
+ vector<int> remap(lt.size());
+ for (unsigned i = 0; i < lt.size(); ++i) {
+ remap[i] = result[i].valid ? result[i].new_reg : i;
+ }
+
+ std::transform(remap.begin(), remap.end(), result.begin(), remap.begin(),
+ [](int x, const rename_reg_pair& rn) {
+ return rn.valid ? rn.new_reg : x;
+ });
+
+ for(unsigned i = 1; i < remap.size(); ++i)
+ EXPECT_EQ(remap[i], expect[i]);
+}
+
+TEST_F(RegisterRemapping, RegisterRemapping1)
+{
+ vector<pair<int, int>> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {1, 2},
+ {2, 10},
+ {3, 5},
+ {5, 10}
+ });
+
+ vector<int> expect({0, 1, 2, 1, 1, 2, 2});
+ run(lt, expect);
+}
+
+
+TEST_F(RegisterRemapping, RegisterRemapping2)
+{
+ vector<pair<int, int>> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {3, 3},
+ {4, 4},
+ });
+ vector<int> expect({0, 1, 2, 1, 1});
+ run(lt, expect);
+}
+
+
+
+MockShader::~MockShader()
+{
+ free();
+ ralloc_free(mem_ctx);
+}
+
+int MockShader::get_num_temps()
+{
+ return num_temps;
+}
+
+
+exec_list* MockShader::get_program()
+{
+ return program;
+}
+
+MockShader::MockShader(const vector<MockCodeline>& source):
+ num_temps(0)
+{
+ mem_ctx = ralloc_context(NULL);
+
+ program = new(mem_ctx) exec_list();
+
+ for (MockCodeline i: source) {
+ glsl_to_tgsi_instruction *next_instr = new(mem_ctx) glsl_to_tgsi_instruction();
+ next_instr->op = i.op;
+ next_instr->info = tgsi_get_opcode_info(i.op);
+
+ assert(i.src.size() < 4);
+ assert(i.dst.size() < 3);
+ assert(i.tex_offsets.size() < 3);
+
+ for (unsigned k = 0; k < i.src.size(); ++k) {
+ next_instr->src[k] = create_src_register(i.src[k]);
+ }
+ for (unsigned k = 0; k < i.dst.size(); ++k) {
+ next_instr->dst[k] = create_dst_register(i.dst[k]);
+ }
+
+ // set texture registers
+ next_instr->tex_offset_num_offset = i.tex_offsets.size();
+ if (i.tex_offsets.size() > 0)
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ else
+ next_instr->tex_offsets = 0;
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+ next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+ }
+
+ program->push_tail(next_instr);
+ }
+ ++num_temps;
+}
+
+void MockShader::free()
+{
+ // the list is not fully initialized, so
+ // tearing it down also must be done manually.
+ exec_node *p;
+ while ((p = program->pop_head())) {
+ glsl_to_tgsi_instruction * instr = static_cast<glsl_to_tgsi_instruction *>(p);
+ if (instr->tex_offset_num_offset > 0)
+ delete[] instr->tex_offsets;
+ delete p;
+ }
+ program = 0;
+ num_temps = 0;
+}
+
+st_src_reg MockShader::create_src_register(int src_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (src_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = src_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_INPUT;
+ idx = -src_idx;
+ }
+ return st_src_reg(file, idx, GLSL_TYPE_INT);
+
+}
+
+st_dst_reg MockShader::create_dst_register(int dst_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (dst_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = dst_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_OUTPUT;
+ idx = - dst_idx;
+ }
+ return st_dst_reg(file, 0xF, GLSL_TYPE_INT, idx);
+}
+
+void LifetimeEvaluatorExactTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+
+ auto lifetimes = estimate_temporary_lifetimes(shader.get_program(), shader.get_num_temps());
+
+ // lifetimes[0] not used, but created for simpler processing
+ ASSERT_EQ(lifetimes.size(), e.size());
+
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_EQ(lifetimes[i].first, e[i][0]);
+ EXPECT_EQ(lifetimes[i].second, e[i][1]);
+ }
+}
+
+void LifetimeEvaluatorAtLeastTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+
+ auto lifetimes = estimate_temporary_lifetimes(shader.get_program(), shader.get_num_temps());
+
+ // lifetimes[0] not used, but created for simpler processing
+ ASSERT_EQ(lifetimes.size(), e.size());
+
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_LE(lifetimes[i].first, e[i][0]);
+ EXPECT_GE(lifetimes[i].second, e[i][1]);
+ }
+}
--
Learn how the world really is,
but never forget how it should be.
Gert Wollny
2017-06-18 12:41:11 UTC
Hello Nicolai,
Post by Nicolai Hähnle
  
  if HAVE_XLIB_GLX
  SUBDIRS += drivers/x11
@@ -101,7 +101,7 @@ AM_CFLAGS = \
   $(VISIBILITY_CFLAGS) \
   $(MSVC2013_COMPAT_CFLAGS)
  AM_CXXFLAGS = \
- $(LLVM_CFLAGS) \
+        $(LLVM_CXXFLAGS) \
I kind of suspect that this might be a no-no. On the one hand it
makes sense, because it makes the use of CXXFLAGS consistent, but
it's a pretty significant build system change.
I understand that, but C++11 is just such an improvement.
Post by Nicolai Hähnle
At least separate it out into its own patch.
Okay,
Post by Nicolai Hähnle
   $(VISIBILITY_CXXFLAGS) \
   $(MSVC2013_COMPAT_CXXFLAGS)
  
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..a68e9d2afe 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
   state_tracker/st_glsl_to_tgsi.h \
   state_tracker/st_glsl_to_tgsi_private.cpp \
   state_tracker/st_glsl_to_tgsi_private.h \
+        state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
Inconsistent use of whitespace.
Then the whole patch needs to be re-arranged, i.e. the 
st_glsl_to_tgsi_private stuff,
This was a mistake when I was re-basing the patch.
Post by Nicolai Hähnle
and I agree with Emil that the tests 
should be separated out into their own patch.
I'm currently working on this.
Post by Nicolai Hähnle
BTW, you can and should use
git rebase -x "cd $builddir && make" $basecommit
to verify that you don't break the build in an intermediate patch.
Nice trick, thanks.
Post by Nicolai Hähnle
+typedef int scope_idx;
Please remove this, it's an unnecessary and distracting abstraction
that  doesn't gain you anything.
Actually, with the refactoring I did over the last two days it helped a
lot.
Post by Nicolai Hähnle
I consider the scopes back-reference here and in temp_access to be
bad  style.
I already got rid of them.
Post by Nicolai Hähnle
Think less object- and more data-oriented, then you won't need them.
Most of those helper functions should be members of 
tgsi_temp_lifetime -- and the getters/setters can be removed
entirely.  YAGNI.
I think with the new design that uses pointers into a pre-allocated
array the methods must stay where they are.
Post by Nicolai Hähnle
To first approximation, you should never use mutable. Those methods 
should not be const to begin with.
This is something else I actually didn't like and corrected already.
Post by Nicolai Hähnle
+scope_idx
+tgsi_temp_lifetime::make_scope(e_scope_type type, int id, int lvl,
+                               int s_begin)const
+{
+   int idx = scopes.size();
+   scopes.push_back(prog_scope(type, idx, id, lvl, s_begin,
scopes));
+   return idx;
+}
AFAICS this overload is only used once. Please remove it.
Okay.
Post by Nicolai Hähnle
Use C-style comments.
Okay,
Post by Nicolai Hähnle
Also, this should probably have an assert, or can this really happen?
I don't think it can happen, an assert it is.
Post by Nicolai Hähnle
+      case TGSI_OPCODE_UIF:{
+         if (inst->src[0].file == PROGRAM_TEMPORARY) {
+               acc[inst->src[0].index].append(line, acc_read,
current);
+         }
+         scope_idx scope = make_scope(current, sct_if, if_id,
nesting_lvl, line);
It's probably a good idea to have scopes start at line + 1.
Otherwise, 
it may be tempting to think that the read access of the IF condition 
happens inside the IF scope, at least based on the line numbers.
I don't think that's a problem. At least I can't think of a test case
that would trigger a problem with that.
Post by Nicolai Hähnle
+      case TGSI_OPCODE_DEFAULT: {
+         auto scope_type = (inst->op == TGSI_OPCODE_CASE) ?
sct_switch_default;
+         if ( scopes[current].type() == sct_switch ) {
No spaces inside parenthesis (also elsewhere).
Okay.
Post by Nicolai Hähnle
+            scope_stack.push(current);
+            current = make_scope(current, scope_type,
scopes[current].id(),
+                                 nesting_lvl, line);
+         }else{
Spaces around else (also elsewhere).
Okay.
Post by Nicolai Hähnle
No comment between if- and else-block.
Okay.
Post by Nicolai Hähnle
+            scopes[current].set_continue(current, line);
You do need to distinguish between break and continue (at least for 
accuracy), because break allows you to skip over a piece of code 
I beg to differ: in the worst case a CONT can act as if it were a
BRK, and because I have to handle the worst case it doesn't make sense
to distinguish between the two.
Post by Nicolai Hähnle
    BGNLOOP
       ...
          BRK/CONT
       ...
       MOV TEMP[i], ...
       ...
    ENDLOOP
If it's a CONT, then i is written unconditionally inside the loop.
While one would normally expect this, it is not guaranteed. Think of a
shader like this:

...
varying int a;
varying float b;

int f(int a, float b) {...}
for (int i = 0; i < n; ++i) {
   if (a && f(i, b))
      continue;
   ...
   x =
}

It may not be the best style, but it is possible.
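The argument can be illustrated outside of TGSI with a plain C++ analogy. This is only a sketch and not code from the patch; `write_reached` and its parameters are hypothetical names. It records, per iteration, whether the statement following a conditional `continue` was executed.

```cpp
#include <vector>

// Hypothetical stand-in for the shader loop above (not code from the patch).
// Returns, for each iteration, whether the statement that follows the
// conditional `continue` (the "x = ..." write) was actually reached.
std::vector<bool> write_reached(int n, bool a, bool (*f)(int))
{
   std::vector<bool> reached(n, false);
   for (int i = 0; i < n; ++i) {
      if (a && f(i))
         continue;         // skips the write for this iteration
      reached[i] = true;   // stands in for the write to x
   }
   return reached;
}
```

If `f` returns true for every `i`, the write is never reached in any iteration, which is the worst case the lifetime estimation has to allow for.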
Post by Nicolai Hähnle
+
+         for (unsigned j = 0; j < inst->tex_offset_num_offset;
j++) {
+            if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+               acc[inst->tex_offsets[j].index].append(line,
acc_read, current);
+            }
+         }
You need to handle the reads before the writes. After all, you might have a
    ADD TEMP[i], TEMP[i], ...
which may have to register as an undefined read.
I will add a test case, and correct the code accordingly.
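A minimal sketch of the ordering Nicolai asks for, assuming a simplified instruction with plain integer register indices. `Instr`, `Event`, `record` and `first_access_is_read` are hypothetical names and not part of the patch.

```cpp
#include <vector>

enum class Access { Read, Write };

struct Event {
   int line;
   int reg;
   Access kind;
};

// Hypothetical instruction: indices of destination and source temporaries.
struct Instr {
   std::vector<int> dst;
   std::vector<int> src;
};

// Record the accesses of one instruction, sources first, so that
// `ADD TEMP[i], TEMP[i], ...` logs the read of i before the write to i.
void record(const Instr& in, int line, std::vector<Event>& log)
{
   for (int r : in.src)
      log.push_back({line, r, Access::Read});
   for (int r : in.dst)
      log.push_back({line, r, Access::Write});
}

// True if the first logged access to reg is a read, i.e. the value read
// may be undefined at that point.
bool first_access_is_read(const std::vector<Event>& log, int reg)
{
   for (const Event& e : log)
      if (e.reg == reg)
         return e.kind == Access::Read;
   return false;
}
```

With the writes recorded first, the same instruction would look like a write followed by a read, and the potentially undefined read would go unnoticed.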
Post by Nicolai Hähnle
+
+      } // end default
+      } // end switch (op)
No end-of-scope comments.
okay,
Post by Nicolai Hähnle
+scope_idx
+prog_scope::get_parent_loop() const
Based on what it does, this should be renamed to something like 
get_innermost_loop.
Okay.
Post by Nicolai Hähnle
+void temp_access::append(int line, e_acc_type acc, scope_idx idx)
+{
+   last_write = line;
Looks like last_write should be called last_access.
Initially I thought the same, but a last read is also (most of the
time) the last access, and it is handled differently than a write past
the last read, so I think that last_write captures it better.
Post by Nicolai Hähnle
+   if (acc == acc_read) {
+      last_read = line;
+      last_read_scope = idx;
+      if (undefined_read > line) {
+         undefined_read = line;
+         undefined_read_scope = idx;
+      }
The way you're using it this is effectively just first_read, not 
undefined_read.
Indeed.
Post by Nicolai Hähnle
+   } else {
+      if (first_write == -1) {
+         first_write = line;
+         first_write_scope = idx;
+
+         // we write in an if-branch
+         auto fw_ifthen_scope = scopes[idx].in_ifelse();
+         if ((fw_ifthen_scope >= 0) &&
scopes[fw_ifthen_scope].in_loop()) {
+            // value not always written, in loops we must keep it
+            keep_for_full_loop = true;
+         } else {
+            // same thing for switch-case
+            auto fw_switch_scope = scopes[idx].in_switchcase();
+            if (fw_switch_scope >= 0 &&
scopes[fw_switch_scope].in_loop()) {
+               keep_for_full_loop = true;
+            }
+         }
Simplify this by using an in_conditional() instead of in_ifelse + 
in_switchcase().
I was thinking about this, but I also thought that when I want to track
all code paths later I will have to distinguish the two again. On the
other hand, adding a method that combines the two is a no-brainer.
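A combined check could look roughly like the following sketch. It assumes scopes form a simple parent chain; `Scope`, `ScopeType` and the function names are hypothetical and do not mirror the `prog_scope` class of the patch.

```cpp
enum class ScopeType { Program, Loop, IfBranch, ElseBranch, SwitchCase, SwitchDefault };

struct Scope {
   ScopeType type;
   const Scope *parent;   // nullptr for the outermost scope
};

bool is_conditional(ScopeType t)
{
   return t == ScopeType::IfBranch || t == ScopeType::ElseBranch ||
          t == ScopeType::SwitchCase || t == ScopeType::SwitchDefault;
}

// Nearest enclosing if/else or switch-case scope (including s itself).
const Scope *in_conditional(const Scope *s)
{
   for (; s != nullptr; s = s->parent)
      if (is_conditional(s->type))
         return s;
   return nullptr;
}

// Nearest enclosing loop scope (including s itself).
const Scope *innermost_loop(const Scope *s)
{
   for (; s != nullptr; s = s->parent)
      if (s->type == ScopeType::Loop)
         return s;
   return nullptr;
}

// A first write inside a conditional that itself sits in a loop is not
// executed in every iteration, so the value has to be kept for the loop.
bool keep_for_full_loop(const Scope *write_scope)
{
   const Scope *cond = in_conditional(write_scope);
   return cond != nullptr && innermost_loop(cond) != nullptr;
}
```

Keeping `is_conditional` separate leaves room to distinguish if/else from switch-case again later, should per-path tracking be added.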
Post by Nicolai Hähnle
+
+   /* Only written to, just make sure that renaming
+    * doesn't reuse this register too early (corner
+    * case is the one opcode with two destinations) */
+   if (last_read_scope < 0) {
+      return make_pair(first_write, first_write + 1);
What if there are multiple writes to the temporary?
Good catch! Next test case :)
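One way to cover several writes is to base the end of the lifetime on the last write instead of the first. The sketch below assumes straight-line code only (no loops or conditionals) and a temporary that is written at least once; `AccessRecord` is a hypothetical name, not the `temp_access` class from the patch.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical access record for one temporary in straight-line code.
struct AccessRecord {
   int first_write = -1;
   int last_write = -1;
   int last_read = -1;

   void write(int line)
   {
      if (first_write < 0)
         first_write = line;
      last_write = line;
   }

   void read(int line) { last_read = line; }

   // The register stays reserved until one line past its last write, so a
   // write-only temporary with several writes is covered up to the final
   // write, and a write after the last read also extends the lifetime.
   std::pair<int, int> lifetime() const
   {
      return std::make_pair(first_write, std::max(last_read, last_write + 1));
   }
};
```

For a single write at line 0 this yields (0, 1), the value the WriteOnly test expects; for writes at lines 0 and 3 without any read it yields (0, 4).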
Post by Nicolai Hähnle
+   /* propagate lifetime also if there was a continue/break
+    * in a loop and the write was after it (so it constitutes
+    * a conditional write */
It's only conditional if there was a break. Continue doesn't make it 
conditional.
I think I made my argument.
Post by Nicolai Hähnle
+   /* propagate lifetimes before moving to upper scopes */
+   if ((scopes[first_write_scope].type() == sct_loop) &&
+       (keep_for_full_loop || (undefined_read < first_write))) {
+      first_write = scopes[first_write_scope].begin();
+      int lr = scopes[first_write_scope].end();
+      if (last_read < lr)
+         last_read = lr;
+   }
    BGNLOOP
       ...
       BGNLOOP
          IF ...
             read TEMP[i]
          ENDIF
          write TEMP[i]
       ENDLOOP
       ...
    ENDLOOP
In this case, you don't set keep_for_full_loop, yet the lifetime
must extend for the entire _outer_ loop.
Another test case that needs to be added :)
Post by Nicolai Hähnle
1. In this case, keep_for_full_loop needs to be set to true (because 
overwriting first_write in the code above means that the related
check below won't fire).
2. target_level is set to the inner loop even though it really needs
to be set to the outer loop.
Indeed.
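The nested-loop case can be sketched as follows, under the assumption that the temporary is neither written nor read outside the loops in question, so that the outermost enclosing loop is the right target. `Scope`, `outermost_loop` and `extend_for_read_before_write` are hypothetical names and not taken from the patch.

```cpp
#include <algorithm>
#include <utility>

struct Scope {
   bool is_loop;
   int begin;             // line of the opening instruction
   int end;               // line of the closing instruction
   const Scope *parent;   // nullptr for the outermost scope
};

// Outermost loop that encloses s (or is s itself), nullptr if there is none.
const Scope *outermost_loop(const Scope *s)
{
   const Scope *result = nullptr;
   for (; s != nullptr; s = s->parent)
      if (s->is_loop)
         result = s;
   return result;
}

// If a temporary is read before it is written inside nested loops, the
// value written in one iteration of the outer loop is consumed in the
// next one, so the lifetime has to span the whole outer loop.
std::pair<int, int> extend_for_read_before_write(std::pair<int, int> lt,
                                                 const Scope *write_scope)
{
   const Scope *loop = outermost_loop(write_scope);
   if (loop == nullptr)
      return lt;
   return {std::min(lt.first, loop->begin), std::max(lt.second, loop->end)};
}
```

For the example above, numbered from 0 with the read in line 3 and the write in line 5, the raw access range (3, 5) becomes (0, 7), the range of the outer loop.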

Thanks a lot,
Gert
Dieter Nützel
2017-06-18 15:35:47 UTC
Hello Gert, (hello Nicolai, ;-))

do you have some 'work in progress' ready?
Then I'll put my Turks XT back in and test LS2015 (Farming Simulator
2015) under Wine. It shows missing details (the driver (the player) and
all the other people walking around) with current Mesa git code. I get
the same 'Failed to build shader' reports as you did in your
https://bugs.freedesktop.org/show_bug.cgi?id=99349

I'll start with your attachment
https://bugs.freedesktop.org/attachment.cgi?id=131683

Greetings,
Dieter
Post by Gert Wollny
Hello Nicolai,
Post by Nicolai Hähnle
  
  if HAVE_XLIB_GLX
  SUBDIRS += drivers/x11
@@ -101,7 +101,7 @@ AM_CFLAGS = \
   $(VISIBILITY_CFLAGS) \
   $(MSVC2013_COMPAT_CFLAGS)
  AM_CXXFLAGS = \
- $(LLVM_CFLAGS) \
+        $(LLVM_CXXFLAGS) \
I kind of suspect that this might be a no-no. On the one hand it
makes sense, because it makes the use of CXXFLAGS consistent, but
it's a pretty significant build system change.
I understand that, but c++11 is just such an improvement.
Post by Nicolai Hähnle
At least separate it out into its own patch.
Okay,
Post by Nicolai Hähnle
   $(VISIBILITY_CXXFLAGS) \
   $(MSVC2013_COMPAT_CXXFLAGS)
  
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..a68e9d2afe 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
   state_tracker/st_glsl_to_tgsi.h \
   state_tracker/st_glsl_to_tgsi_private.cpp \
   state_tracker/st_glsl_to_tgsi_private.h \
+        state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
Inconsistent use of whitespace.
Then the whole patch needs to be re-arranged, i.e. the 
st_glsl_to_tgsi_private stuff,
This was a mistake when I was re-basing the patch.
Post by Nicolai Hähnle
and I agree with Emil that the tests 
should be separated out into their own patch.
I'm currently working on this.
Post by Nicolai Hähnle
BTW, you can and should use
git rebase -x "cd $builddir && make" $basecommit
to verify that you don't break the build in an intermediate patch.
Nice trick, thanks.
Post by Nicolai Hähnle
+typedef int scope_idx;
Please remove this, it's an unnecessary and distracting abstraction
that  doesn't gain you anything.
Actually, with the refactoring it did the last two days it helped a
lot.
Post by Nicolai Hähnle
I consider the scopes back-reference here and in temp_access to be
bad  style.
I already got rid of them.
Post by Nicolai Hähnle
Think less object- and more data-oriented, then you won't need them.
Most of those helper functions should be members of 
tgsi_temp_lifetime -- and the getters/setters can be removed
entirely.  YAGNI.
I think with the new design that uses pointers into a pre-allocated
array the methods must stay where they are.
Post by Nicolai Hähnle
To first approximation, you should never use mutable. Those methods 
should not be const to begin with.
This is something else I actually didn't like and corrected already.
Post by Nicolai Hähnle
+scope_idx
+tgsi_temp_lifetime::make_scope(e_scope_type type, int id, int lvl,
+                               int s_begin)const
+{
+   int idx = scopes.size();
+   scopes.push_back(prog_scope(type, idx, id, lvl, s_begin,
scopes));
+   return idx;
+}
AFAICS this overload is only used once. Please remove it.
Okay.
Post by Nicolai Hähnle
Use C-style comments.
Okay,
Post by Nicolai Hähnle
Also, this should probably have an assert, or can this really happen?
I don't think it can happen, an assert it is.
Post by Nicolai Hähnle
+      case TGSI_OPCODE_UIF:{
+         if (inst->src[0].file == PROGRAM_TEMPORARY) {
+               acc[inst->src[0].index].append(line, acc_read,
current);
+         }
+         scope_idx scope = make_scope(current, sct_if, if_id,
nesting_lvl, line);
It's probably a good idea to have scopes start at line + 1.
Otherwise, 
it may be tempting to think that the read access of the IF condition 
happens inside the IF scope, at least based on the line numbers.
I don't think that's a problem. At least I can't think of a test case
that would trigger a problem with that.
Post by Nicolai Hähnle
+      case TGSI_OPCODE_DEFAULT: {
+         auto scope_type = (inst->op == TGSI_OPCODE_CASE) ?
sct_switch_default;
+         if ( scopes[current].type() == sct_switch ) {
No spaces inside parenthesis (also elsewhere).
Okay.
Post by Nicolai Hähnle
+            scope_stack.push(current);
+            current = make_scope(current, scope_type,
scopes[current].id(),
+                                 nesting_lvl, line);
+         }else{
Spaces around else (also elsewhere).
Okay.
Post by Nicolai Hähnle
No comment between if- and else-block.
Okay.
Post by Nicolai Hähnle
+            scopes[current].set_continue(current, line);
You do need to distinguish between break and continue (at least for 
accuracy), because break allows you to skip over a piece of code 
I beg to differ: in the worst case a CONT can act like a
BRK, and because I have to handle the worst case it doesn't make sense
to distinguish between the two.
Post by Nicolai Hähnle
    BGNLOOP
       ...
          BRK/CONT
       ...
       MOV TEMP[i], ...
       ...
    ENDLOOP
If it's a CONT, then i is written unconditionally inside the loop.
While one would normally expect this, it is not guaranteed. Think of a
...
varying int a;
varying float b;
int f(int a, float b) {...}
for (int i = 0; i< n; ++i ) {
if (a && f(i, b))
continue;
...
x =
}
It may not be the best style, but it is possible.
Post by Nicolai Hähnle
+
+         for (unsigned j = 0; j < inst->tex_offset_num_offset;
j++) {
+            if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+               acc[inst->tex_offsets[j].index].append(line,
acc_read, current);
+            }
+         }
You need to handle the reads before the writes. After all, you might have a
    ADD TEMP[i], TEMP[i], ...
which may have to register as an undefined read.
I will add a test case, and correct the code accordingly.
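The ordering issue can be illustrated with a minimal sketch. The types and function names below are hypothetical stand-ins, not the ones from the patch, and std::vector is used only for brevity. An undefined read is flagged when a temporary is read before any write has been recorded, so the sources of an instruction have to be recorded before its destination:

```cpp
#include <cassert>
#include <vector>

/* Hypothetical, simplified instruction: one destination, some sources. */
struct instr {
   int dst;               /* index of the temporary written, -1 if none */
   std::vector<int> src;  /* indices of the temporaries read */
};

/* Hypothetical per-temporary state. */
struct temp_state {
   bool written = false;
   bool undefined_read = false;
};

static void record_read(temp_state &t)
{
   /* A read with no earlier write sees an undefined value. */
   if (!t.written)
      t.undefined_read = true;
}

static void record_write(temp_state &t)
{
   t.written = true;
}

static std::vector<temp_state>
scan(const std::vector<instr> &prog, int ntemps)
{
   std::vector<temp_state> temps(ntemps);
   for (const instr &i : prog) {
      /* Sources first: in "ADD TEMP[i], TEMP[i], ..." the read happens
       * before the write of the same instruction. */
      for (int s : i.src)
         record_read(temps[s]);
      if (i.dst >= 0)
         record_write(temps[i.dst]);
   }
   return temps;
}
```

If the two steps inside the loop were swapped, the read in ADD TEMP[0], TEMP[0], ... would wrongly appear to be preceded by a write.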
Post by Nicolai Hähnle
+
+      } // end default
+      } // end switch (op)
No end-of-scope comments.
okay,
Post by Nicolai Hähnle
+scope_idx
+prog_scope::get_parent_loop() const
Based on what it does, this should be renamed to something like 
get_innermost_loop.
Okay.
Post by Nicolai Hähnle
+void temp_access::append(int line, e_acc_type acc, scope_idx idx)
+{
+   last_write = line;
Looks like last_write should be called last_access.
Initially I thought the same, but a last read is also (most of the
time) the last access, and it is handled differently than a write past
the last read, so I think that last_write captures it better.
Post by Nicolai Hähnle
+   if (acc == acc_read) {
+      last_read = line;
+      last_read_scope = idx;
+      if (undefined_read > line) {
+         undefined_read = line;
+         undefined_read_scope = idx;
+      }
The way you're using it this is effectively just first_read, not 
undefined_read.
Indeed.
Post by Nicolai Hähnle
+   } else {
+      if (first_write == -1) {
+         first_write = line;
+         first_write_scope = idx;
+
+         // we write in an if-branch
+         auto fw_ifthen_scope = scopes[idx].in_ifelse();
+         if ((fw_ifthen_scope >= 0) &&
scopes[fw_ifthen_scope].in_loop()) {
+            // value not always written, in loops we must keep it
+            keep_for_full_loop = true;
+         } else {
+            // same thing for switch-case
+            auto fw_switch_scope = scopes[idx].in_switchcase();
+            if (fw_switch_scope >= 0 &&
scopes[fw_switch_scope].in_loop()) {
+               keep_for_full_loop = true;
+            }
+         }
Simplify this by using an in_conditional() instead of in_ifelse + 
in_switchcase().
I was thinking about this, but I also thought that if I want to track
all code paths later I will have to distinguish the two again. On the
other hand, adding a method that combines the two is a no-brainer.
Post by Nicolai Hähnle
+
+   /* Only written to, just make sure that renaming
+    * doesn't reuse this register too early (corner
+    * case is the one opcode with two destinations) */
+   if (last_read_scope < 0) {
+      return make_pair(first_write, first_write + 1);
What if there are multiple writes to the temporary?
Good catch! Next test case :)
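One possible shape of the fix, as a sketch only: for a temporary that is never read, the lifetime could end one line after the last write instead of one line after the first. The function name and its parameters are assumptions for illustration, not code from the patch:

```cpp
#include <cassert>
#include <utility>

/* Hypothetical helper: lifetime of a temporary that is written but never
 * read. The extra line past the last write keeps the register from being
 * reused too early, as in the original single-write case. */
static std::pair<int, int>
write_only_lifetime(int first_write, int last_write)
{
   return std::make_pair(first_write, last_write + 1);
}
```

With a single write this gives the same result as the quoted code, and with several writes the range covers all of them.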
Post by Nicolai Hähnle
+   /* propagate lifetime also if there was a continue/break
+    * in a loop and the write was after it (so it constitutes
+    * a conditional write */
It's only conditional if there was a break. Continue doesn't make it 
conditional.
I think I made my argument.
Post by Nicolai Hähnle
+   /* propagate lifetimes before moving to upper scopes */
+   if ((scopes[first_write_scope].type() == sct_loop) &&
+       (keep_for_full_loop || (undefined_read < first_write))) {
+      first_write = scopes[first_write_scope].begin();
+      int lr = scopes[first_write_scope].end();
+      if (last_read < lr)
+         last_read = lr;
+   }
    BGNLOOP
       ...
       BGNLOOP
          IF ...
             read TEMP[i]
          ENDIF
          write TEMP[i]
       ENDLOOP
       ...
    ENDLOOP
In this case, you don't set keep_for_full_loop, yet the lifetime
must extend for the entire _outer_ loop.
Another test case that needs to be added :)
Post by Nicolai Hähnle
1. In this case, keep_for_full_loop needs to be set to true (because 
overwriting first_write in the code above means that the related
check below won't fire).
2. target_level is set to the inner loop even though it really needs
to be set to the outer loop.
Indeed.
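A minimal sketch of how the target scope for the nested-loop case could be found, assuming each scope stores the index of its parent. The struct layout and the name outermost_loop are hypothetical, and whether the outermost enclosing loop is always the right target is an assumption drawn from the example above:

```cpp
#include <cassert>
#include <vector>

enum scope_type { sct_outer, sct_loop, sct_if };

/* Hypothetical scope record: type plus index of the enclosing scope. */
struct scope {
   scope_type type;
   int parent;   /* -1 for the top-level scope */
};

/* Walk up the parent chain and remember the last loop seen, which is
 * the outermost loop enclosing scope idx. Returns -1 if idx is not
 * inside any loop. */
static int
outermost_loop(const std::vector<scope> &scopes, int idx)
{
   int result = -1;
   for (int i = idx; i >= 0; i = scopes[i].parent) {
      if (scopes[i].type == sct_loop)
         result = i;
   }
   return result;
}
```

For the example with the IF inside the inner loop, starting from either the IF scope or the inner loop scope yields the outer loop.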
Thanks a lot,
Gert
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Gert Wollny
2017-06-18 17:15:15 UTC
Hello Dieter,
Post by Dieter Nützel
W'll start with your attachment
https://bugs.freedesktop.org/attachment.cgi?id=131683
This one is a dirty hack that works around the use of too many
registers, mostly by allowing allocation of more than the allowed limit
and letting the bytecode optimizer sort it out (there is also a bit of
trickery to let the bytecode generator combine constants that are
otherwise allocated in the wrong register space).

It works only when the number of excess registers is very small, and it
will not work if mesa is compiled with --enable-debug, because there
are some assertions in the sb code that will be enabled and fire.

I'm about to send an updated set of patches that focus on the real
solution, i.e. reducing the number of temporary registers before they
are translated to R600 byte code.

You might be better off trying these.

Best,
Gert
Nicolai Hähnle
2017-06-19 06:19:22 UTC
Post by Gert Wollny
Hello Nicolai,
Post by Nicolai Hähnle
Post by Gert Wollny
if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
@@ -101,7 +101,7 @@ AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
$(MSVC2013_COMPAT_CFLAGS)
AM_CXXFLAGS = \
- $(LLVM_CFLAGS) \
+ $(LLVM_CXXFLAGS) \
I kind of suspect that this might be a no-no. On the one hand it
makes sense, because it makes the use of CXXFLAGS consistent, but
it's a pretty significant build system change.
I understand that, but c++11 is just such an improvement.
Post by Nicolai Hähnle
At least separate it out into its own patch.
Okay,
Post by Nicolai Hähnle
Post by Gert Wollny
$(VISIBILITY_CXXFLAGS) \
$(MSVC2013_COMPAT_CXXFLAGS)
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..a68e9d2afe 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+ state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
Inconsistent use of whitespace.
Then the whole patch needs to be re-arranged, i.e. the
st_glsl_to_tgsi_private stuff,
This was a mistake when I was re-basing the patch.
Post by Nicolai Hähnle
and I agree with Emil that the tests
should be separated out into their own patch.
I'm currently working on this.
Post by Nicolai Hähnle
BTW, you can and should use
git rebase -x "cd $builddir && make" $basecommit
to verify that you don't break the build in an intermediate patch.
Nice trick, thanks.
Post by Nicolai Hähnle
Post by Gert Wollny
+typedef int scope_idx;
Please remove this, it's an unnecessary and distracting abstraction
that doesn't gain you anything.
Actually, with the refactoring I did the last two days it helped a
lot.
How? Perhaps your variable names stand to be improved. We try to avoid
ineffective abstractions in Mesa.
Post by Gert Wollny
Post by Nicolai Hähnle
I consider the scopes back-reference here and in temp_access to be
bad style.
I already got rid of them.
Post by Nicolai Hähnle
Think less object- and more data-oriented, then you won't need them.
Most of those helper functions should be members of
tgsi_temp_lifetime -- and the getters/setters can be removed
entirely. YAGNI.
I think with the new design that uses pointers into a pre-allocated
array the methods must stay where they are.
No, they really don't. You're thinking of scopes etc. as objects. If you
just thought of them as data, as IMO you should, it would feel natural
to move the methods into the main class and eliminate those back-pointers.
Post by Gert Wollny
Post by Nicolai Hähnle
To first approximation, you should never use mutable. Those methods
should not be const to begin with.
This is something else I actually didn't like and corrected already.
Post by Nicolai Hähnle
Post by Gert Wollny
+scope_idx
+tgsi_temp_lifetime::make_scope(e_scope_type type, int id, int lvl,
+ int s_begin)const
+{
+ int idx = scopes.size();
+ scopes.push_back(prog_scope(type, idx, id, lvl, s_begin,
scopes));
+ return idx;
+}
AFAICS this overload is only used once. Please remove it.
Okay.
Post by Nicolai Hähnle
Use C-style comments.
Okay,
Post by Nicolai Hähnle
Also, this should probably have an assert, or can this really happen?
I don't think it can happen, an assert it is.
Post by Nicolai Hähnle
Post by Gert Wollny
+ case TGSI_OPCODE_UIF:{
+ if (inst->src[0].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[0].index].append(line, acc_read, current);
+ }
+ scope_idx scope = make_scope(current, sct_if, if_id, nesting_lvl, line);
It's probably a good idea to have scopes start at line + 1.
Otherwise,
it may be tempting to think that the read access of the IF condition
happens inside the IF scope, at least based on the line numbers.
I don't think that's a problem. At least I can't think of a test case
that would trigger a problem with that.
Post by Nicolai Hähnle
Post by Gert Wollny
+ case TGSI_OPCODE_DEFAULT: {
+ auto scope_type = (inst->op == TGSI_OPCODE_CASE) ?
sct_switch_default;
+ if ( scopes[current].type() == sct_switch ) {
No spaces inside parenthesis (also elsewhere).
Okay.
Post by Nicolai Hähnle
Post by Gert Wollny
+ scope_stack.push(current);
+ current = make_scope(current, scope_type,
scopes[current].id(),
+ nesting_lvl, line);
+ }else{
Spaces around else (also elsewhere).
Okay.
Post by Nicolai Hähnle
No comment between if- and else-block.
Okay.
Post by Nicolai Hähnle
Post by Gert Wollny
+ scopes[current].set_continue(current, line);
You do need to distinguish between break and continue (at least for
accuracy), because break allows you to skip over a piece of code
I beg to differ: in the worst case a CONT can act like a
BRK, and because I have to handle the worst case it doesn't make sense
to distinguish between the two.
Do you have an example?
Post by Gert Wollny
Post by Nicolai Hähnle
BGNLOOP
...
BRK/CONT
...
MOV TEMP[i], ...
...
ENDLOOP
If it's a CONT, then i is written unconditionally inside the loop.
While one would normally expect this, it is not guaranteed. Think of a
...
varying int a;
varying float b;
int f(int a, float b) {...}
for (int i = 0; i< n; ++i ) {
if (a && f(i, b))
continue;
...
x =
}
It may not be the best style, but it is possible.
TGSI loops are always infinite-loops. So actually, every loop should
have a BRK somewhere. But the point remains that CONT itself cannot skip
code indefinitely, because there's no breaking out of the loop in BGNLOOP.
Post by Gert Wollny
Post by Nicolai Hähnle
+
Post by Gert Wollny
+ for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->tex_offsets[j].index].append(line,
acc_read, current);
+ }
+ }
You need to handle the reads before the writes. After all, you might have a
ADD TEMP[i], TEMP[i], ...
which may have to register as an undefined read.
I will add a test case, and correct the code accordingly.
Post by Nicolai Hähnle
Post by Gert Wollny
+
+ } // end default
+ } // end switch (op)
No end-of-scope comments.
okay,
Post by Nicolai Hähnle
Post by Gert Wollny
+scope_idx
+prog_scope::get_parent_loop() const
Based on what it does, this should be renamed to something like
get_innermost_loop.
Okay.
Post by Nicolai Hähnle
Post by Gert Wollny
+void temp_access::append(int line, e_acc_type acc, scope_idx idx)
+{
+ last_write = line;
Looks like last_write should be called last_access.
Initially I thought the same, but a last read is also (most of the
time) the last access, and it is handled differently than a write past
the last read, so I think that last_write captures it better.
So I read the register merging part only afterwards, and it seems to me
you're allowing to merge two ranges with lifetimes [a,b] and [b,c].

This makes sense if you think of instructions as having two "phases",
the read-phase which comes before the write-phase, and you think of the
start of the lifetime as pointing to the write-phase, while the end
points to the read-phase. With that interpretation, [a,b] and [b,c] are
genuinely disjoint, and it allows merging TEMP[1] and TEMP[2] in

ADD TEMP[1], ..., ....
ADD TEMP[2], TEMP[1], ...

which is good.

I think this all makes sense, but you should be more explicit about it
and make sure all the variable names are consistent with that convention.
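The convention can be made explicit with a small sketch. The struct and function names are hypothetical. Here begin refers to the write phase of a line and end to the read phase of a line, so a lifetime that ends on the line where another begins does not overlap with it:

```cpp
#include <cassert>

/* Hypothetical lifetime record.
 * begin: line whose write phase first defines the temporary.
 * end:   line whose read phase last uses the temporary. */
struct lifetime {
   int begin;
   int end;
};

/* Two temporaries can share a register if one lifetime ends no later
 * than the other begins. Equality is allowed because the read phase
 * of a line comes before its write phase. */
static bool
can_merge(const lifetime &a, const lifetime &b)
{
   return a.end <= b.begin || b.end <= a.begin;
}
```

For the two ADD instructions above, TEMP[1] would get [0,1] and TEMP[2] a range starting at 1, so the check allows merging them.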

Cheers,
Nicolai
Post by Gert Wollny
Post by Nicolai Hähnle
Post by Gert Wollny
+ if (acc == acc_read) {
+ last_read = line;
+ last_read_scope = idx;
+ if (undefined_read > line) {
+ undefined_read = line;
+ undefined_read_scope = idx;
+ }
The way you're using it this is effectively just first_read, not
undefined_read.
Indeed.
Post by Nicolai Hähnle
Post by Gert Wollny
+ } else {
+ if (first_write == -1) {
+ first_write = line;
+ first_write_scope = idx;
+
+ // we write in an if-branch
+ auto fw_ifthen_scope = scopes[idx].in_ifelse();
+ if ((fw_ifthen_scope >= 0) &&
scopes[fw_ifthen_scope].in_loop()) {
+ // value not always written, in loops we must keep it
+ keep_for_full_loop = true;
+ } else {
+ // same thing for switch-case
+ auto fw_switch_scope = scopes[idx].in_switchcase();
+ if (fw_switch_scope >= 0 &&
scopes[fw_switch_scope].in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
Simplify this by using an in_conditional() instead of in_ifelse +
in_switchcase().
I was thinking about this, but I also thought that if I want to track
all code paths later I will have to distinguish the two again. On the
other hand, adding a method that combines the two is a no-brainer.
Post by Nicolai Hähnle
Post by Gert Wollny
+
+ /* Only written to, just make sure that renaming
+ * doesn't reuse this register too early (corner
+ * case is the one opcode with two destinations) */
+ if (last_read_scope < 0) {
+ return make_pair(first_write, first_write + 1);
What if there are multiple writes to the temporary?
Good catch! Next test case :)
Post by Nicolai Hähnle
Post by Gert Wollny
+ /* propagate lifetime also if there was a continue/break
+ * in a loop and the write was after it (so it constitutes
+ * a conditional write */
It's only conditional if there was a break. Continue doesn't make it
conditional.
I think I made my argument.
Post by Nicolai Hähnle
+ /* propagate lifetimes before moving to upper scopes */
Post by Gert Wollny
+ if ((scopes[first_write_scope].type() == sct_loop) &&
+ (keep_for_full_loop || (undefined_read < first_write))) {
+ first_write = scopes[first_write_scope].begin();
+ int lr = scopes[first_write_scope].end();
+ if (last_read < lr)
+ last_read = lr;
+ }
BGNLOOP
...
BGNLOOP
IF ...
read TEMP[i]
ENDIF
write TEMP[i]
ENDLOOP
...
ENDLOOP
In this case, you don't set keep_for_full_loop, yet the lifetime
must extend for the entire _outer_ loop.
Another test case that needs to be added :)
Post by Nicolai Hähnle
1. In this case, keep_for_full_loop needs to be set to true (because
overwriting first_write in the code above means that the related
check below won't fire).
2. target_level is set to the inner loop even though it really needs
to be set to the outer loop.
Indeed.
Thanks a lot,
Gert
--
Learn how the world really is,
but never forget how it should be.
Gert Wollny
2017-06-19 10:18:16 UTC
Hi,
Post by Nicolai Hähnle
Post by Gert Wollny
Post by Gert Wollny
+typedef int scope_idx;
Please remove this, it's an unnecessary and distracting
abstraction that  doesn't gain you anything.
Actually, with the refactoring I did the last two days it helped a
lot.
How? Perhaps your variable names stand to be improved. We try to
avoid ineffective abstractions in Mesa.

Since all of my references to the scopes were using this type, I was
able to easily switch to pointers, which also enabled me to eliminate
the use of an explicit stack.
Post by Nicolai Hähnle
Post by Gert Wollny
I think with the new design that uses pointers into a pre-allocated
array the methods must stay where they are.
No, they really don't. You're thinking of scopes etc. as objects. If
you  just thought of them as data, as IMO you should, it would feel
natural to move the methods into the main class and eliminate those
back-pointers.
Maybe I have done C++ and OOP for too long. I kind of like the
object-based approach, because it helps me focus better when working
with the code.
Post by Nicolai Hähnle
Post by Gert Wollny
Post by Gert Wollny
Post by Nicolai Hähnle
TGSI loops are always infinite-loops. So actually, every loop
should 
have a BRK somewhere. But the point remains that CONT itself cannot
skip code indefinitely, because there's no breaking out of the loop
in BGNLOOP.
I see your point. I will think about how this requires different
handling in the lifetime analysis. I also think that handling this
incorrectly might have clobbered my algorithm: piglit and compiling
shader-db didn't show any problems, and that's what I tested
exhaustively for the last two submissions, but the GpuTest piano
benchmark, which worked with my first version, stopped working again.
Now I'm redoing everything from version one, incorporating what I have
learned step by step.

With that interpretation, [a,b] and [b,c] are  genuinely disjoint,
and it allows merging TEMP[1] and TEMP[2] in
Post by Nicolai Hähnle
   ADD TEMP[1], ..., ....
   ADD TEMP[2], TEMP[1], ...
which is good.
Somehow I already understood that this is how it works.
Post by Nicolai Hähnle
I think this all makes sense, but you should be more explicit about
it  and make sure all the variable names are consistent with that
convention.
Okay, I will strive for it. Thanks again for your very helpful
comments,

Gert
Gert Wollny
2017-06-16 09:32:00 UTC
To prepare the implementation of a temp register lifetime tracker
some of the classes and functions are moved into separate header/
implementation files to make them accessible from other files.

Specifically these are:

class st_src_reg;
class st_dst_reg;
class glsl_to_tgsi_instruction;
struct rename_reg_pair;

int swizzle_for_type(const glsl_type *type, int component);

as inline:

bool is_resource_instruction(unsigned opcode);
unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
---
src/mesa/Makefile.sources | 2 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 285 +----------------------------
2 files changed, 5 insertions(+), 282 deletions(-)

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index b80882fb8d..21f9167bda 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -507,6 +507,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_nir.cpp \
state_tracker/st_glsl_to_tgsi.cpp \
state_tracker/st_glsl_to_tgsi.h \
+ state_tracker/st_glsl_to_tgsi_private.cpp \
+ state_tracker/st_glsl_to_tgsi_private.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 24d417d670..f64aedb876 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,6 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
+#include "st_glsl_to_tgsi_private.h"

#include "util/hash_table.h"
#include <algorithm>
@@ -68,248 +69,7 @@
class st_src_reg;
class st_dst_reg;

-static int swizzle_for_size(int size);
-
-static int swizzle_for_type(const glsl_type *type, int component = 0)
-{
- unsigned num_elements = 4;
-
- if (type) {
- type = type->without_array();
- if (type->is_scalar() || type->is_vector() || type->is_matrix())
- num_elements = type->vector_elements;
- }
-
- int swizzle = swizzle_for_size(num_elements);
- assert(num_elements + component <= 4);
-
- swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
- return swizzle;
-}
-
-/**
- * This struct is a corresponding struct to TGSI ureg_src.
- */
-class st_src_reg {
-public:
- st_src_reg(gl_register_file file, int index, const glsl_type *type,
- int component = 0, unsigned array_id = 0)
- {
- assert(file != PROGRAM_ARRAY || array_id != 0);
- this->file = file;
- this->index = index;
- this->swizzle = swizzle_for_type(type, component);
- this->negate = 0;
- this->abs = 0;
- this->index2D = 0;
- this->type = type ? type->base_type : GLSL_TYPE_ERROR;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = array_id;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = index2D;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->swizzle = 0;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- explicit st_src_reg(st_dst_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
- int negate:4; /**< NEGATE_XYZW mask from mesa */
- unsigned abs:1;
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- /*
- * Is this the second half of a double register pair?
- * currently used for input mapping only.
- */
- unsigned double_reg2:1;
- unsigned is_double_vertex_input:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-
- st_src_reg get_abs()
- {
- st_src_reg reg = *this;
- reg.negate = 0;
- reg.abs = 1;
- return reg;
- }
-};
-
-class st_dst_reg {
-public:
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = 0;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->writemask = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->array_id = 0;
- }
-
- explicit st_dst_reg(st_src_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-};
-
-st_src_reg::st_src_reg(st_dst_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->double_reg2 = false;
- this->array_id = reg.array_id;
- this->is_double_vertex_input = false;
-}
-
-st_dst_reg::st_dst_reg(st_src_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->writemask = WRITEMASK_XYZW;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->array_id = reg.array_id;
-}
-
-class glsl_to_tgsi_instruction : public exec_node {
-public:
- DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
-
- st_dst_reg dst[2];
- st_src_reg src[4];
- st_src_reg resource; /**< sampler, image or buffer register */
- st_src_reg *tex_offsets;
-
- /** Pointer to the ir source this tree came from for debugging */
- ir_instruction *ir;
-
- unsigned op:8; /**< TGSI opcode */
- unsigned saturate:1;
- unsigned is_64bit_expanded:1;
- unsigned sampler_base:5;
- unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
- unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
- glsl_base_type tex_type:5;
- unsigned tex_shadow:1;
- unsigned image_format:9;
- unsigned tex_offset_num_offset:3;
- unsigned dead_mask:4; /**< Used in dead code elimination */
- unsigned buffer_access:3; /**< buffer access type */
-
- const struct tgsi_opcode_info *info;
-};
+extern int swizzle_for_size(int size);

class variable_storage {
DECLARE_RZALLOC_CXX_OPERATORS(variable_storage)
@@ -390,11 +150,6 @@ find_array_type(struct inout_decl *decls, unsigned count, unsigned array_id)
return GLSL_TYPE_ERROR;
}

-struct rename_reg_pair {
- bool valid;
- int new_reg;
-};
-
struct glsl_to_tgsi_visitor : public ir_visitor {
public:
glsl_to_tgsi_visitor();
@@ -597,7 +352,7 @@ fail_link(struct gl_shader_program *prog, const char *fmt, ...)
prog->data->LinkStatus = linking_failure;
}

-static int
+int
swizzle_for_size(int size)
{
static const int size_swizzles[4] = {
@@ -611,40 +366,6 @@ swizzle_for_size(int size)
return size_swizzles[size - 1];
}

-static bool
-is_resource_instruction(unsigned opcode)
-{
- switch (opcode) {
- case TGSI_OPCODE_RESQ:
- case TGSI_OPCODE_LOAD:
- case TGSI_OPCODE_ATOMUADD:
- case TGSI_OPCODE_ATOMXCHG:
- case TGSI_OPCODE_ATOMCAS:
- case TGSI_OPCODE_ATOMAND:
- case TGSI_OPCODE_ATOMOR:
- case TGSI_OPCODE_ATOMXOR:
- case TGSI_OPCODE_ATOMUMIN:
- case TGSI_OPCODE_ATOMUMAX:
- case TGSI_OPCODE_ATOMIMIN:
- case TGSI_OPCODE_ATOMIMAX:
- return true;
- default:
- return false;
- }
-}
-
-static unsigned
-num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->num_dst;
-}
-
-static unsigned
-num_inst_src_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->is_tex || is_resource_instruction(op->op) ?
- op->info->num_src - 1 : op->info->num_src;
-}

glsl_to_tgsi_instruction *
glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
--
2.13.0
Gert Wollny
2017-06-16 09:32:02 UTC
This patch replaces the old register lifetime estimation with the
new approach.
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index f64aedb876..d57004c269 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,10 +55,11 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
-#include "st_glsl_to_tgsi_private.h"
+#include "st_glsl_to_tgsi_temprename.h"

#include "util/hash_table.h"
#include <algorithm>
+#include <iostream>

#define PROGRAM_ANY_CONST ((1 << PROGRAM_STATE_VAR) | \
(1 << PROGRAM_CONSTANT) | \
@@ -325,6 +326,7 @@ public:

void merge_two_dsts(void);
void merge_registers(void);
+ void merge_registers_alternative(void);
void renumber_registers(void);

void emit_block_mov(ir_assignment *ir, const struct glsl_type *type,
@@ -5140,6 +5142,16 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
}
}

+void
+glsl_to_tgsi_visitor::merge_registers_alternative(void)
+{
+ struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct rename_reg_pair, this->next_temp);
+ auto lt = estimate_temporary_lifetimes(&this->instructions, this->next_temp);
+ evaluate_remapping(lt, renames);
+ rename_temp_registers(&renames[0]);
+ ralloc_free(renames);
+}
+
/* Merges temporary registers together where possible to reduce the number of
* registers needed to run a program.
*
@@ -6605,7 +6617,7 @@ get_mesa_program_tgsi(struct gl_context *ctx,

v->merge_two_dsts();
if (!skip_merge_registers)
- v->merge_registers();
+ v->merge_registers_alternative();
v->renumber_registers();

/* Write the END instruction. */
--
2.13.0
Emil Velikov
2017-06-16 14:21:25 UTC
Hi Gert,

Welcome to Mesa, and apologies for chiming in so late.

Please don't use STL within core mesa code. While some places do use
it, those are quite isolated and have a specific role.
For example:
- st/clover - heavily templated, pure C++
- drivers/swr - as above
- drivers/nouveau/codegen - mix of STL and local alternatives of STL
- drivers/r600/sb - somewhat lightweight on STL usage.

A couple of additional ideas:
- where possible try to split patches even further.
IIRC 2/3 adds some 2kloc, which may be hard to review properly.
- do keep performance numbers within the commit summary.
This way the details are preserved in git log for future references.

Thanks
Emil
Gert Wollny
2017-06-16 16:16:52 UTC
Hello Emil,
Post by Emil Velikov
Please don't use STL within core mesa code.
May I ask why? I always try to not re-implement already available
functionality and since mesa already uses C++ it seems kind of natural
to use the STL because it provides a well tested implementation for
containers and algorithms.

Anyway, to avoid code duplication, are there already alternative
implementations available within mesa for an array that can dynamically
grow like std::vector and std::stack, and for the algorithm
std::upper_bound working on an array?
Post by Emil Velikov
 - where possible try to split patches even further.
IIRC 2/3 adds some 2kloc, which may be hard to review properly.
On one hand it seems something went wrong when I was rebasing the
patches, part of 1 ended up in 2 :/

When I correct this patch 2 will add ~1500 lines of which ~700 is real
functionality and ~800 are test code that doesn't go into the library.
I can probably split out 100 lines though, separating the two parts of
the algorithm.
Post by Emil Velikov
 - do keep performance numbers within the commit summary.
This way the details are preserved in git log for future references.
Okay.

many thanks for your comments,
Gert
Emil Velikov
2017-06-17 17:28:21 UTC
Post by Gert Wollny
Hello Emil,
Post by Emil Velikov
Please don't use STL within core mesa code.
May I ask why? I always try to not re-implement already available
functionality and since mesa already uses C++ it seems kind of natural
to use the STL because it provides a well tested implementation for
containers and algorithms.
A while ago, when the first C++ code was introduced to Mesa, there was a
consensus amongst developers about the dos and don'ts. In the end
people decided against STL, although I'm afraid I don't recall the
details.

On the other hand, devs. can use whatever they prefer in their drivers.
Post by Gert Wollny
Anyway, to avoid code duplication, are there already alternative
implementations available within mesa for an array that can dynamically
grow like std::vector and std::stack, and for the algorithm
std::upper_bound working on an array?
There's vector, list and hash in src/util. Not sure about stack or
upper_bound. From vague memory the former can be easily done via a
list?
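
For illustration only, here is a minimal sketch of how std::upper_bound
could be replaced for a plain sorted int array without touching the STL.
The name upper_bound_sketch is made up for this example and is not an
existing helper in src/util.

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical helper, not part of Mesa: returns a pointer to the first
 * element in the sorted range [first, last) that is strictly greater
 * than value, or last if there is no such element. */
const int *
upper_bound_sketch(const int *first, const int *last, int value)
{
   assert(first <= last);
   std::size_t count = static_cast<std::size_t>(last - first);
   while (count > 0) {
      std::size_t half = count / 2;
      const int *mid = first + half;
      if (value < *mid) {
         /* The answer is at mid or to its left. */
         count = half;
      } else {
         /* Everything up to and including mid is <= value. */
         first = mid + 1;
         count -= half + 1;
      }
   }
   return first;
}
```

A stack on top of a preallocated array would need even less code, so the
amount of duplicated functionality should stay small either way.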
Post by Gert Wollny
Post by Emil Velikov
- where possible try to split patches even further.
IIRC 2/3 adds some 2kloc, which may be hard to review properly.
On one hand it seems something went wrong when I was rebasing the
patches, part of 1 ended up in 2 :/
When I correct this patch 2 will add ~1500 lines of which ~700 is real
functionality and ~800 are test code that doesn't go into the library.
I can probably split out 100 lines though, separating the two parts of
the algorithm.
Hmm perhaps it's worth adding the test as a follow-up commit?
Admittedly I haven't looked at the patches so cannot offer better
advice.
Post by Gert Wollny
Post by Emil Velikov
- do keep performance numbers within the commit summary.
This way the details are preserved in git log for future references.
Okay.
many thanks for your comments,
Yw.

Thanks
Emil
Gert Wollny
2017-06-18 17:42:52 UTC
Dear all,

following the comments of Emil and Nicolai I've updated the patch set.

Changes with respect to the old version are:

- split the changes into more patches
- correct formatting errors
- remove the use of the STL, with one exception: std::sort is still used in
the register rename mapping evaluation, because it is already used in
st_glsl_to_tgsi.cpp and its run-time performance is significantly better
than that of qsort. It can be disabled by commenting out the define
USE_STL_SORT in st_glsl_to_tgsi_temprename.cpp.
- add more tests and improve the life-time evaluation accordingly
- further reduce memory allocations

The algorithm is the same as described before, with the small exception that
a dry run over the instructions is now done first to count the number of
scopes. The run-time overhead of this operation is negligible.
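
To illustrate the idea of the dry run, here is a minimal sketch that
counts scopes in one cheap pass so that the scope storage can be allocated
once. The opcode enum and the choice of which opcodes open a scope are
assumptions made for this example; the actual patch works on the TGSI
opcodes and may count differently.

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical opcode set for this sketch only. */
enum sketch_opcode {
   SK_MOV, SK_BGNLOOP, SK_ENDLOOP, SK_IF, SK_ELSE, SK_ENDIF,
   SK_SWITCH, SK_CASE, SK_DEFAULT, SK_ENDSWITCH, SK_END
};

/* Count how many scope records a later pass would need. The outermost
 * program scope always exists, so counting starts at one. */
std::size_t
count_scopes(const sketch_opcode *ops, std::size_t n)
{
   assert(ops != nullptr || n == 0);
   std::size_t scopes = 1;
   for (std::size_t i = 0; i < n; ++i) {
      switch (ops[i]) {
      case SK_BGNLOOP:
      case SK_IF:
      case SK_ELSE:
      case SK_SWITCH:
      case SK_CASE:
      case SK_DEFAULT:
         /* Assumption: each of these opens a new scope. */
         ++scopes;
         break;
      default:
         break;
      }
   }
   return scopes;
}
```

With the count known up front, a single array allocation can hold all
scopes and no reallocation is needed while the lifetimes are evaluated.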

In order to make it easier to transition to the new code and test it I tied it
in parallel to the old code. It can be enabled by setting the environment
variable MESA_GLSL_TO_TGSI_NEW_MERGE.
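
As a rough sketch of such a switch (the helper name env_flag_is_set is
hypothetical, and whether the actual patch looks at the value of the
variable or only at its presence is an assumption here):

```cpp
#include <cstdlib>

/* Hypothetical helper: returns true if the named environment variable
 * is present at all, regardless of its value. */
bool
env_flag_is_set(const char *name)
{
   return std::getenv(name) != nullptr;
}
```

A caller could then pick the code path with
env_flag_is_set("MESA_GLSL_TO_TGSI_NEW_MERGE") and fall back to the old
merge otherwise.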

A piglit run on the "shader" test set doesn't show any changes. The additional
passing test that I reported for v2 no longer passes, probably because of the
more conservative life-time estimation required to make the new (valid) tests
pass, but as I wrote before, the problem with this shader
***@glsl-***@execution@variable-***@gs-input-array-vec2-index-rd
(and its sister *vec3*) is, IMHO, not solvable by better register-renaming.

The performance numbers obtained by running shader-db are given in the
commit message of the last patch; the trend is the same as reported before.

Many thanks for any comments,
Gert


Gert Wollny (7):
mesa/st: glsl_to_tgsi move some helper classes to extra files
mesa: Propagate c++11 CXXFLAGS from LLVM_CXXFLAGS to mesa/
mesa/st: glsl_to_tgsi: implement new temporary register lifetime
tracker
mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime
tracker
mesa/st: glsl_to_tgsi: add register rename mapping evaluator
mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
mesa/st: glsl_to_tgsi: tie in new temporary register merge approach

configure.ac | 1 +
src/mesa/Makefile.am | 4 +-
src/mesa/Makefile.sources | 4 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 319 +-----
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 207 ++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 165 ++++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 752 ++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 36 +
src/mesa/state_tracker/tests/Makefile.am | 41 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 1042 ++++++++++++++++++++
10 files changed, 2277 insertions(+), 294 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
--
2.13.0
Gert Wollny
2017-06-18 17:42:57 UTC
The remapping evaluator first sorts the temporary registers in ascending
order of the first instruction of their life time, and then uses a binary
search to find merge candidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can use the alternative code path that uses qsort.
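
The approach described above can be sketched roughly as follows. This is a
simplified illustration and not the code of the patch: it uses std::vector
and std::lower_bound for brevity, marks merged entries with a flag array
instead of compacting the array, and the names Interval and sketch_remap
are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical record: life time [begin, end] of temporary 'reg'. */
struct Interval { int begin; int end; int reg; };

/* Returns for each register index the register it is merged into,
 * or -1 if the register keeps its own name. */
std::vector<int>
sketch_remap(std::vector<Interval> regs, int nregs)
{
   std::vector<int> new_reg(nregs, -1);
   for (const Interval& r : regs)
      assert(r.reg >= 0 && r.reg < nregs);

   /* Step 1: sort ascending by the first instruction of the life time. */
   std::sort(regs.begin(), regs.end(),
             [](const Interval& a, const Interval& b) {
                return a.begin < b.begin;
             });

   std::vector<bool> merged(regs.size(), false);
   for (std::size_t t = 0; t < regs.size(); ++t) {
      if (merged[t])
         continue;
      std::size_t from = t + 1;
      while (from < regs.size()) {
         /* Step 2: binary search for the first register whose life time
          * starts at or after the end of the target's life time. */
         auto it = std::lower_bound(regs.begin() + from, regs.end(),
                                    regs[t].end,
                                    [](const Interval& r, int bound) {
                                       return r.begin < bound;
                                    });
         while (it != regs.end() && merged[it - regs.begin()])
            ++it;
         if (it == regs.end())
            break;
         /* Merge: the candidate takes the target's name and the target's
          * life time is extended to cover the candidate. */
         new_reg[it->reg] = regs[t].reg;
         merged[it - regs.begin()] = true;
         regs[t].end = it->end;
         from = static_cast<std::size_t>(it - regs.begin()) + 1;
      }
   }
   return new_reg;
}
```

For example, with life times [0,2], [1,5], [2,4] and [5,6] for registers
1 to 4, registers 3 and 4 both end up renamed to register 1, while
register 2 overlaps with everything that is left and keeps its name.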
---
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 104 +++++++++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 3 +
2 files changed, 107 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index aa3bad78c0..09660ff0d3 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -26,6 +26,13 @@
#include <mesa/program/prog_instruction.h>
#include <limits>

+#define USE_STL_SORT
+#ifdef USE_STL_SORT
+#include <algorithm>
+#else
+#include <cstdlib>
+#endif
+
using std::numeric_limits;

enum e_scope_type {
@@ -646,3 +653,100 @@ estimate_temporary_lifetimes(void *mem_ctx, exec_list *instructions,
{
return tgsi_temp_lifetime(mem_ctx).get_lifetimes(instructions, ntemps, lifetimes);
}
+
+struct access_record {
+ int begin;
+ int end;
+ int reg;
+ bool erase;
+};
+
+/* Find the next register between [start, end) that has a life time starting
+ * at or after bound by using a binary search.
+ * start points at the beginning of the search range,
+ * end points at the element past the end of the search range, and
+ * the array comprising [start, end) must be sorted in ascending order.
+ */
+access_record*
+find_next_rename(access_record* start, access_record* end, int bound)
+{
+ int delta = (end - start);
+ while (delta > 0) {
+ int half = delta >> 1;
+ access_record* middle = start + half;
+ if (bound <= middle->begin)
+ delta = half;
+ else {
+ start = middle;
+ ++start;
+ delta -= half + 1;
+ }
+ }
+ return start;
+}
+
+/* This function evaluates the register merges by using an O(n log n)
+ * algorithm to find suitable merge candidates. */
+void evaluate_remapping(void *mem_ctx, int ntemps, const struct lifetime* lifetimes,
+ struct rename_reg_pair *result)
+{
+
+ access_record *m = ralloc_array(mem_ctx, access_record, ntemps - 1);
+
+ for (int i = 1; i < ntemps; ++i) {
+ m[i-1] = {lifetimes[i].begin, lifetimes[i].end, i, false};
+ }
+
+#ifdef USE_STL_SORT
+ std::sort(m, m + ntemps - 1,
+ [](const access_record& a, const access_record& b) {
+ return a.begin < b.begin;
+ });
+#else
+ std::qsort(m, ntemps - 1, sizeof(access_record),
+ [](const void *a, const void *b) {
+ const access_record *aa = static_cast<const access_record*>(a);
+ const access_record *bb = static_cast<const access_record*>(b);
+ return aa->begin < bb->begin ? -1 : (aa->begin > bb->begin ? 1 : 0);
+ });
+#endif
+
+ auto trgt = m;
+ auto mend = m + ntemps - 1;
+ auto first_erase = mend;
+ auto search_start = trgt + 1;
+
+ while (trgt != mend) {
+
+ auto src = find_next_rename(search_start, mend, trgt->end);
+ if (src != mend) {
+ result[src->reg].new_reg = trgt->reg;
+ result[src->reg].valid = true;
+ trgt->end = src->end;
+
+ /* Since we only search forward, don't erase the renamed
+ * register just now, just mark it for removal. */
+ src->erase = true;
+ if (first_erase == mend)
+ first_erase = src;
+ search_start = src + 1;
+ } else {
+ /* Moving to the next target register it is time to
+ * erase the already merged registers */
+ if (first_erase != mend) {
+ auto out = first_erase;
+ auto in_start = first_erase + 1;
+ while (in_start != mend) {
+ if (!in_start->erase)
+ *out++ = *in_start;
+ ++in_start;
+ }
+ mend = out;
+ first_erase = mend;
+ }
+ ++trgt;
+ search_start = trgt + 1;
+ }
+ }
+ ralloc_free(m);
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
index 0637ffab08..4ebead1b45 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -31,3 +31,6 @@ struct lifetime {
void
estimate_temporary_lifetimes(void *mem_ctx, exec_list *instructions,
int ntemps, struct lifetime *lt);
+
+void evaluate_remapping(void *mem_ctx, int ntemps, const struct lifetime *lt,
+ struct rename_reg_pair *result);
--
2.13.0
Gert Wollny
2017-06-18 17:42:54 UTC
C++11 is required for the new register renaming approach.
---
src/mesa/Makefile.am | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..3339926d93 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -101,7 +101,7 @@ AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
$(MSVC2013_COMPAT_CFLAGS)
AM_CXXFLAGS = \
- $(LLVM_CFLAGS) \
+ $(LLVM_CXXFLAGS) \
$(VISIBILITY_CXXFLAGS) \
$(MSVC2013_COMPAT_CXXFLAGS)
--
2.13.0
Gert Wollny
2017-06-18 17:42:56 UTC
This patch adds a set of unit tests for the new lifetime tracker.
---
configure.ac | 1 +
src/mesa/Makefile.am | 2 +-
src/mesa/state_tracker/tests/Makefile.am | 41 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 969 +++++++++++++++++++++
4 files changed, 1012 insertions(+), 1 deletion(-)
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp

diff --git a/configure.ac b/configure.ac
index da7b2f8f81..5279b231ed 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2839,6 +2839,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/Makefile])
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 3339926d93..0075e91f77 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.

-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests

if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 0000000000..a931cb6498
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,41 @@
+AM_CFLAGS = \
+ $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+ $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+ -I$(top_srcdir)/src/gtest/include \
+ -I$(top_srcdir)/src \
+ -I$(top_srcdir)/src/mapi \
+ -I$(top_builddir)/src/mesa \
+ -I$(top_srcdir)/src/mesa \
+ -I$(top_srcdir)/include \
+ -I$(top_srcdir)/src/gallium/include \
+ -I$(top_srcdir)/src/gallium/auxiliary \
+ $(DEFINES) $(INCLUDE_DIRS)
+
+TESTS = st-renumerate-test
+check_PROGRAMS = st-renumerate-test
+
+st_renumerate_test_SOURCES = \
+ test_glsl_to_tgsi_lifetime.cpp
+
+st_renumerate_test_LDFLAGS = \
+ $(LLVM_LDFLAGS)
+
+st_renumerate_test_LDADD = \
+ $(top_builddir)/src/mesa/libmesagallium.la \
+ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \
+ $(top_builddir)/src/gallium/auxiliary/libgallium.la \
+ $(top_builddir)/src/util/libmesautil.la \
+ $(top_builddir)/src/gallium/drivers/trace/libtrace.la \
+ $(top_builddir)/src/gallium/winsys/sw/null/libws_null.la \
+ $(top_builddir)/src/gallium/drivers/softpipe/libsoftpipe.la \
+ $(top_builddir)/src/gtest/libgtest.la \
+ $(GALLIUM_COMMON_LIB_DEPS) \
+ $(LLVM_LIBS) \
+ $(PTHREAD_LIBS) \
+ $(DLOPEN_LIBS) \
+ -ldl
+
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
new file mode 100644
index 0000000000..7e07f8868f
--- /dev/null
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -0,0 +1,969 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <state_tracker/st_glsl_to_tgsi_temprename.h>
+#include <tgsi/tgsi_ureg.h>
+#include <tgsi/tgsi_info.h>
+#include <compiler/glsl/list.h>
+#include <gtest/gtest.h>
+
+using std::vector;
+
+
+/* A line to describe a TGSI instruction for building mock shaders */
+struct MockCodeline {
+ MockCodeline(unsigned _op): op(_op) {}
+ MockCodeline(unsigned _op, const vector<int>& _dst,
+ const vector<int>& _src, const vector<int>&_to):
+ op(_op), dst(_dst), src(_src), tex_offsets(_to){}
+ unsigned op;
+ vector<int> dst;
+ vector<int> src;
+ vector<int> tex_offsets;
+};
+
+/* A few constants to use in the mock shaders */
+const int in0 = 0;
+const int in1 = -1;
+const int in2 = -2;
+
+const int out0 = 0;
+const int out1 = -1;
+
+/* A class to create a shader program to check the register allocation
+ * and renaming. The created exec_list is not completely set up and can
+ * only be used for the register life-time analysis. */
+class MockShader {
+public:
+ MockShader(const vector<MockCodeline>& source);
+ ~MockShader();
+
+ void free();
+
+ exec_list* get_program();
+ int get_num_temps();
+private:
+ st_src_reg create_src_register(int src_idx);
+ st_dst_reg create_dst_register(int dst_idx);
+ exec_list* program;
+ int num_temps;
+ void *mem_ctx;
+};
+
+/* type for register lifetime expectation */
+using expectation = vector<vector<int>>;
+
+
+class LifetimeEvaluatorTest : public testing::Test {
+
+ void SetUp();
+ void TearDown();
+protected:
+ void run(const vector<MockCodeline>& code, const expectation& e);
+private:
+ virtual void check(const vector<lifetime>& result, const expectation& e) = 0;
+ void *mem_ctx;
+};
+
+/* This is a test class to check the exact life times of
+ * registers. */
+class LifetimeEvaluatorExactTest : public LifetimeEvaluatorTest {
+protected:
+ void check(const vector<lifetime>& result, const expectation& e);
+};
+
+/* This test class checks that the life time covers at least
+ * the expected range. It is used for cases where we know that
+ * the implementation could be improved on estimating the minimal
+ * life time.
+ */
+class LifetimeEvaluatorAtLeastTest : public LifetimeEvaluatorTest {
+protected:
+ void check(const vector<lifetime>& result, const expectation& e);
+};
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {1, in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMove)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1,in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}, {1,2}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {}, {1,2}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,2}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 5}, {2,3}, {3, 6}}));
+}
+
+
+/* in loop if/else value written only in one path, and read later
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, MoveInIfInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {5, 8}}));
+}
+
+
+// in loop if/else value written in both paths, and read later
+// - value must survive from first write to last read in loop
+// for now we only check that the minimum life time is correct
+TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {3,7}, {7, 10}}));
+}
+
+/* in loop if/else value written in both paths, read in else path
+ * before write and also read later - value must survive from first
+ * write to last read in loop */
+TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseReadInElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_ADD, {2}, {1, 2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {1,9}, {7, 10}}));
+}
+
+/* in loop if/else read in one path before written in the same loop
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, ReadInIfInLoopBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, 3}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {1, 8}}));
+}
+
+/* Write in nested ifs in loop, for now we do test whether the
+ * life time is at least what is required, but we know that the
+ * implementation doesn't do a full check and sets larger boundaries
+ */
+TEST_F(LifetimeEvaluatorAtLeastTest, NestedIfInLoopAlwaysWriteButNotPropagated)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 15
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3, 14}}));
+}
+
+
+
+TEST_F(LifetimeEvaluatorExactTest, NestedIfInLoopWriteNotAlways)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP }, // 13
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 13}}));
+}
+
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole outer loop */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* Test whether variable is kept also if the continue is in a
+ * higher scope than the variable write */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteInLoopAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for all
+ * loops including the read loop */
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+/* if a break is in the loop, but inside a switch case, so it
+ * refers to the case and not to the loop, the variable doesn't
+ * need to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_DEFAULT},
+ { TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{8, 10}}));
+}
+
+/* if a break is in a loop that is itself inside a switch case, it
+ * refers to that inner loop. The variable has to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitchInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_DEFAULT, {}, {}, {}},
+ { TGSI_OPCODE_ENDSWITCH, {}, {}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{2, 10}}));
+}
+
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for the
+ * whole loop that includes the read */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for all
+ * loops up onto the read scope */
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1, 11}}));
+}
+
+/* Temporary used to switch must live through all case statements */
+TEST_F(LifetimeEvaluatorExactTest, UseSwitchCase)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ {TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_DEFAULT},
+ {TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 3}}));
+}
+
+/* variable written in a switch within a loop must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* value written in one case, and read in other, in loop
+ * - must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* Make sure SWITCH is closed correctly in the scope stack */
+TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+
+/* value read/write in same case, stays there */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,4}}));
+}
+
+/* value read/write in all cases, should only live from first
+ * write to last read, but currently the whole loop is used. */
+TEST_F(LifetimeEvaluatorAtLeastTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,9}}));
+}
+
+
+/* value read/write in different loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopes)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1,5}}));
+}
+
+/* value read/write in different loops, conditional */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferentScopesCondReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+/* register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least past the write instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnly)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least past the last write instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnlyTwice)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+/* register read in if */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForIf)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {in0, in1}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_ENDIF}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+/* register read in switch */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForSwitchAndCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH},
+ };
+ run (code, expectation({{-1,-1},{0,3}}));
+}
+
+/* Check that a missing END is handled (Unigine-Heaven creates such a
+ * shader) */
+TEST_F(LifetimeEvaluatorExactTest, DistinceScopesAndNoEndProgramId)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_ENDIF},
+
+ };
+ run (code, expectation({{-1,-1},{0,4}, {2,5}}));
+}
+
+/* Dead code elimination should catch and remove the case
+ * when a variable is written after its last read, but
+ * we want the code to be aware of this case.
+ * The life time of this uselessly written variable is set
+ * to the instruction after the write, because
+ * otherwise it could be re-used too early.
+*/
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {1}, {2}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,3}, {1,2}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SerialReadWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {3}, {2}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {1,2}, {2,3}}));
+}
+
+/*
+ * Somewhat a duplicate of the above tests, specifically using the
+ * problematic corner case in question. DFRACEXP has two
+ * destinations, and if one value is thrown away, we must ensure
+ * that the two output registers don't merge.
+ * In this test case the last access for 2 and 3 is in line 4,
+ * but only 3 can be merged with 4 because it is read, 2 on the
+ * other hand is written to, and merging it with 4 would result in
+ * undefined behaviour.
+*/
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead2)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {3}, {1,2}, {}},
+ { TGSI_OPCODE_DFRACEXP , {2,4}, {3}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {4}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,2}, {1,4}, {2,3}, {3,4}}));
+}
+
+/* The variable is conditionally read before first written, so
+ * it has to survive all the loops. */
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstructionInLoopAndCondition)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF },
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+
+
+TEST_F(LifetimeEvaluatorExactTest, OnlyWriteOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+
+/* Check that two destination registers are actually used */
+TEST_F(LifetimeEvaluatorExactTest, TwoDestRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {1,2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}}));
+}
+
+/* Check that two destination registers and three source registers
+ * are used */
+TEST_F(LifetimeEvaluatorExactTest, ThreeSourceRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {in0, in1}, {}},
+ { TGSI_OPCODE_MAD, {out0}, {1,2, 3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {0,2}, {1,2}}));
+}
+
+MockShader::~MockShader()
+{
+ free();
+ ralloc_free(mem_ctx);
+}
+
+int MockShader::get_num_temps()
+{
+ return num_temps;
+}
+
+
+exec_list* MockShader::get_program()
+{
+ return program;
+}
+
+MockShader::MockShader(const vector<MockCodeline>& source):
+ num_temps(0)
+{
+ mem_ctx = ralloc_context(NULL);
+
+ program = new(mem_ctx) exec_list();
+
+ for (MockCodeline i: source) {
+ glsl_to_tgsi_instruction *next_instr = new(mem_ctx) glsl_to_tgsi_instruction();
+ next_instr->op = i.op;
+ next_instr->info = tgsi_get_opcode_info(i.op);
+
+ assert(i.src.size() < 4);
+ assert(i.dst.size() < 3);
+ assert(i.tex_offsets.size() < 3);
+
+ for (unsigned k = 0; k < i.src.size(); ++k) {
+ next_instr->src[k] = create_src_register(i.src[k]);
+ }
+ for (unsigned k = 0; k < i.dst.size(); ++k) {
+ next_instr->dst[k] = create_dst_register(i.dst[k]);
+ }
+
+ next_instr->tex_offset_num_offset = i.tex_offsets.size();
+ if (i.tex_offsets.size() > 0)
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ else
+ next_instr->tex_offsets = 0;
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+ next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+ }
+
+ program->push_tail(next_instr);
+ }
+ ++num_temps;
+}
+
+void MockShader::free()
+{
+ /* the list is not fully initialized, so
+ * tearing it down also must be done manually. */
+ exec_node *p;
+ while ((p = program->pop_head())) {
+ glsl_to_tgsi_instruction * instr = static_cast<glsl_to_tgsi_instruction *>(p);
+ if (instr->tex_offset_num_offset > 0)
+ delete[] instr->tex_offsets;
+ delete p;
+ }
+ program = 0;
+ num_temps = 0;
+}
+
+st_src_reg MockShader::create_src_register(int src_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (src_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = src_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_INPUT;
+ idx = -src_idx;
+ }
+ return st_src_reg(file, idx, GLSL_TYPE_INT);
+
+}
+
+st_dst_reg MockShader::create_dst_register(int dst_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (dst_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = dst_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_OUTPUT;
+ idx = - dst_idx;
+ }
+ return st_dst_reg(file, 0xF, GLSL_TYPE_INT, idx);
+}
+
+void LifetimeEvaluatorTest::SetUp()
+{
+ mem_ctx = ralloc_context(nullptr);
+}
+
+void LifetimeEvaluatorTest::TearDown()
+{
+ ralloc_free(mem_ctx);
+ mem_ctx = nullptr;
+}
+
+void LifetimeEvaluatorTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+ std::vector<lifetime> result(shader.get_num_temps());
+
+ estimate_temporary_lifetimes(mem_ctx, shader.get_program(),
+ shader.get_num_temps(), &result[0]);
+
+ /* lifetimes[0] not used, but created for simpler processing */
+ ASSERT_EQ(result.size(), e.size());
+ check(result, e);
+}
+
+
+void LifetimeEvaluatorExactTest::check( const vector<lifetime>& lifetimes,
+ const expectation& e)
+{
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_EQ(lifetimes[i].begin, e[i][0]);
+ EXPECT_EQ(lifetimes[i].end, e[i][1]);
+ }
+}
+
+void LifetimeEvaluatorAtLeastTest::check( const vector<lifetime>& lifetimes,
+ const expectation& e)
+{
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_LE(lifetimes[i].begin, e[i][0]);
+ EXPECT_GE(lifetimes[i].end, e[i][1]);
+ }
+}
--
2.13.0
Gert Wollny
2017-06-18 17:42:58 UTC
The patch adds tests for the register rename mapping evaluation.
---
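For readers of the archive, the kind of mapping these tests check can be sketched as follows. This is only a simplified greedy sketch and not the evaluate_remapping() implementation from this series: `lifetime_sketch` and `greedy_remap` are hypothetical names, and the sketch assumes that a register may take over another one whose (merged) lifetime ends at or before its own begin, and that the lifetimes are ordered by their begin.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the `lifetime` struct used by the tests.
struct lifetime_sketch {
   int begin;
   int end;
};

// Greedy sketch: walk the registers in index order and fold each one into
// the lowest-numbered earlier register whose merged lifetime has already
// ended. Index 0 is unused, as in the tests.
std::vector<int> greedy_remap(const std::vector<lifetime_sketch>& lt)
{
   std::vector<int> remap(lt.size());
   /* end of the merged lifetime for every register that keeps its slot */
   std::vector<int> slot_end(lt.size(), -1);

   for (std::size_t i = 0; i < lt.size(); ++i) {
      assert(lt[i].begin <= lt[i].end);
      remap[i] = static_cast<int>(i);
      slot_end[i] = lt[i].end;

      for (std::size_t j = 1; j < i; ++j) {
         /* only registers that still map to themselves own a slot */
         if (remap[j] == static_cast<int>(j) && slot_end[j] <= lt[i].begin) {
            remap[i] = static_cast<int>(j);
            slot_end[j] = lt[i].end;
            break;
         }
      }
   }
   return remap;
}
```

Under these assumptions, the lifetimes of RegisterRemapping1 give {0, 1, 2, 1, 1, 2, 2} and those of RegisterRemapping2 give {0, 1, 2, 1, 1}, matching the expectations in the tests. The real code may choose differently in cases the two tests do not cover.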
.../tests/test_glsl_to_tgsi_lifetime.cpp | 71 ++++++++++++++++++++--
1 file changed, 66 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 7e07f8868f..8fd62d1db3 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -74,15 +74,18 @@ private:
using expectation = vector<vector<int>>;


-class LifetimeEvaluatorTest : public testing::Test {
-
+class MesaTestWithMemCtx : public testing::Test {
void SetUp();
void TearDown();
protected:
+ void *mem_ctx;
+};
+
+class LifetimeEvaluatorTest : public MesaTestWithMemCtx {
+protected:
void run(const vector<MockCodeline>& code, const expectation& e);
private:
virtual void check(const vector<lifetime>& result, const expectation& e) = 0;
- void *mem_ctx;
};

/* This is a teat class to check the exact life times of
@@ -925,12 +928,12 @@ st_dst_reg MockShader::create_dst_register(int dst_idx)
return st_dst_reg(file, 0xF, GLSL_TYPE_INT, idx);
}

-void LifetimeEvaluatorTest::SetUp()
+void MesaTestWithMemCtx::SetUp()
{
mem_ctx = ralloc_context(nullptr);
}

-void LifetimeEvaluatorTest::TearDown()
+void MesaTestWithMemCtx::TearDown()
{
ralloc_free(mem_ctx);
mem_ctx = nullptr;
@@ -967,3 +970,61 @@ void LifetimeEvaluatorAtLeastTest::check( const vector<lifetime>& lifetimes,
EXPECT_GE(lifetimes[i].end, e[i][1]);
}
}
+
+class RegisterRemapping : public MesaTestWithMemCtx {
+protected:
+ void run(const vector<lifetime>& lt, const vector<int>& expect);
+};
+
+void RegisterRemapping::run(const vector<lifetime>& lt,
+ const vector<int>& expect)
+{
+ rename_reg_pair proto{false, 0};
+ vector<rename_reg_pair> result(lt.size(), proto);
+
+ evaluate_remapping(mem_ctx, lt.size(), &lt[0], &result[0]);
+
+ vector<int> remap(lt.size());
+ for (unsigned i = 0; i < lt.size(); ++i) {
+ remap[i] = result[i].valid ? result[i].new_reg : i;
+ }
+
+ std::transform(remap.begin(), remap.end(), result.begin(), remap.begin(),
+ [](int x, const rename_reg_pair& rn) {
+ return rn.valid ? rn.new_reg : x;
+ });
+
+ for(unsigned i = 1; i < remap.size(); ++i) {
+ EXPECT_EQ(remap[i], expect[i]);
+ }
+}
+
+TEST_F(RegisterRemapping, RegisterRemapping1)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {1, 2},
+ {2, 10},
+ {3, 5},
+ {5, 10}
+ });
+
+ vector<int> expect({0, 1, 2, 1, 1, 2, 2});
+ run(lt, expect);
+}
+
+
+TEST_F(RegisterRemapping, RegisterRemapping2)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {3, 3},
+ {4, 4},
+ });
+ vector<int> expect({0, 1, 2, 1, 1});
+ run(lt, expect);
+}
+
+
--
2.13.0
Gert Wollny
2017-06-18 17:42:53 UTC
To prepare the implementation of a temp register lifetime tracker,
some of the classes are moved into separate header/implementation
files to make them accessible from other files.

Specifically these are:

class st_src_reg;
class st_dst_reg;
class glsl_to_tgsi_instruction;
struct rename_reg_pair;

int swizzle_for_type(const glsl_type *type, int component);

as inline:

bool is_resource_instruction(unsigned opcode);
unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
---
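As a note for readers: the source register count of the moved helper num_inst_src_regs() can be illustrated with a reduced sketch. `opcode_info_sketch` and `num_src_regs_sketch` are hypothetical names that only mimic the three tgsi_opcode_info fields used here. As I read the code, texture and resource instructions keep their sampler/resource operand in the separate `resource` member, so one TGSI source is not stored in src[].

```cpp
#include <cassert>

// Hypothetical, reduced stand-in for tgsi_opcode_info.
struct opcode_info_sketch {
   unsigned num_dst;
   unsigned num_src;
   bool is_tex;
};

// Mirrors the logic of num_inst_src_regs(): texture and resource
// instructions have one source less in src[], because the
// sampler/resource operand is stored separately.
unsigned num_src_regs_sketch(const opcode_info_sketch& info, bool is_resource)
{
   const bool has_extra_operand = info.is_tex || is_resource;
   /* such an instruction must have at least the extra operand */
   assert(!has_extra_operand || info.num_src > 0);
   return has_extra_operand ? info.num_src - 1 : info.num_src;
}
```

With this sketch an instruction that has two sources and is_tex set reports one regular source register, while an instruction with three sources and neither flag set reports all three.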
src/mesa/Makefile.sources | 2 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 288 +--------------------
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 207 +++++++++++++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 165 ++++++++++++
4 files changed, 377 insertions(+), 285 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index b80882fb8d..21f9167bda 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -507,6 +507,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_nir.cpp \
state_tracker/st_glsl_to_tgsi.cpp \
state_tracker/st_glsl_to_tgsi.h \
+ state_tracker/st_glsl_to_tgsi_private.cpp \
+ state_tracker/st_glsl_to_tgsi_private.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 24d417d670..ebe87a7821 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,6 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
+#include "st_glsl_to_tgsi_private.h"

#include "util/hash_table.h"
#include <algorithm>
@@ -65,251 +66,7 @@

#define MAX_GLSL_TEXTURE_OFFSET 4

-class st_src_reg;
-class st_dst_reg;
-
-static int swizzle_for_size(int size);
-
-static int swizzle_for_type(const glsl_type *type, int component = 0)
-{
- unsigned num_elements = 4;
-
- if (type) {
- type = type->without_array();
- if (type->is_scalar() || type->is_vector() || type->is_matrix())
- num_elements = type->vector_elements;
- }
-
- int swizzle = swizzle_for_size(num_elements);
- assert(num_elements + component <= 4);
-
- swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
- return swizzle;
-}
-
-/**
- * This struct is a corresponding struct to TGSI ureg_src.
- */
-class st_src_reg {
-public:
- st_src_reg(gl_register_file file, int index, const glsl_type *type,
- int component = 0, unsigned array_id = 0)
- {
- assert(file != PROGRAM_ARRAY || array_id != 0);
- this->file = file;
- this->index = index;
- this->swizzle = swizzle_for_type(type, component);
- this->negate = 0;
- this->abs = 0;
- this->index2D = 0;
- this->type = type ? type->base_type : GLSL_TYPE_ERROR;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = array_id;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = index2D;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->swizzle = 0;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- explicit st_src_reg(st_dst_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
- int negate:4; /**< NEGATE_XYZW mask from mesa */
- unsigned abs:1;
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- /*
- * Is this the second half of a double register pair?
- * currently used for input mapping only.
- */
- unsigned double_reg2:1;
- unsigned is_double_vertex_input:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-
- st_src_reg get_abs()
- {
- st_src_reg reg = *this;
- reg.negate = 0;
- reg.abs = 1;
- return reg;
- }
-};
-
-class st_dst_reg {
-public:
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = 0;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->writemask = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->array_id = 0;
- }
-
- explicit st_dst_reg(st_src_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-};
-
-st_src_reg::st_src_reg(st_dst_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->double_reg2 = false;
- this->array_id = reg.array_id;
- this->is_double_vertex_input = false;
-}
-
-st_dst_reg::st_dst_reg(st_src_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->writemask = WRITEMASK_XYZW;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->array_id = reg.array_id;
-}
-
-class glsl_to_tgsi_instruction : public exec_node {
-public:
- DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
-
- st_dst_reg dst[2];
- st_src_reg src[4];
- st_src_reg resource; /**< sampler, image or buffer register */
- st_src_reg *tex_offsets;
-
- /** Pointer to the ir source this tree came from for debugging */
- ir_instruction *ir;
-
- unsigned op:8; /**< TGSI opcode */
- unsigned saturate:1;
- unsigned is_64bit_expanded:1;
- unsigned sampler_base:5;
- unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
- unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
- glsl_base_type tex_type:5;
- unsigned tex_shadow:1;
- unsigned image_format:9;
- unsigned tex_offset_num_offset:3;
- unsigned dead_mask:4; /**< Used in dead code elimination */
- unsigned buffer_access:3; /**< buffer access type */
-
- const struct tgsi_opcode_info *info;
-};
+extern int swizzle_for_size(int size);

class variable_storage {
DECLARE_RZALLOC_CXX_OPERATORS(variable_storage)
@@ -390,11 +147,6 @@ find_array_type(struct inout_decl *decls, unsigned count, unsigned array_id)
return GLSL_TYPE_ERROR;
}

-struct rename_reg_pair {
- bool valid;
- int new_reg;
-};
-
struct glsl_to_tgsi_visitor : public ir_visitor {
public:
glsl_to_tgsi_visitor();
@@ -597,7 +349,7 @@ fail_link(struct gl_shader_program *prog, const char *fmt, ...)
prog->data->LinkStatus = linking_failure;
}

-static int
+int
swizzle_for_size(int size)
{
static const int size_swizzles[4] = {
@@ -611,40 +363,6 @@ swizzle_for_size(int size)
return size_swizzles[size - 1];
}

-static bool
-is_resource_instruction(unsigned opcode)
-{
- switch (opcode) {
- case TGSI_OPCODE_RESQ:
- case TGSI_OPCODE_LOAD:
- case TGSI_OPCODE_ATOMUADD:
- case TGSI_OPCODE_ATOMXCHG:
- case TGSI_OPCODE_ATOMCAS:
- case TGSI_OPCODE_ATOMAND:
- case TGSI_OPCODE_ATOMOR:
- case TGSI_OPCODE_ATOMXOR:
- case TGSI_OPCODE_ATOMUMIN:
- case TGSI_OPCODE_ATOMUMAX:
- case TGSI_OPCODE_ATOMIMIN:
- case TGSI_OPCODE_ATOMIMAX:
- return true;
- default:
- return false;
- }
-}
-
-static unsigned
-num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->num_dst;
-}
-
-static unsigned
-num_inst_src_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->is_tex || is_resource_instruction(op->op) ?
- op->info->num_src - 1 : op->info->num_src;
-}

glsl_to_tgsi_instruction *
glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
new file mode 100644
index 0000000000..b77313da10
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
@@ -0,0 +1,207 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+
+using std::vector;
+
+extern int swizzle_for_size(int size);
+
+static int swizzle_for_type(const glsl_type *type, int component = 0)
+{
+ unsigned num_elements = 4;
+
+ if (type) {
+ type = type->without_array();
+ if (type->is_scalar() || type->is_vector() || type->is_matrix())
+ num_elements = type->vector_elements;
+ }
+
+ int swizzle = swizzle_for_size(num_elements);
+ assert(num_elements + component <= 4);
+
+ swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
+ return swizzle;
+}
+
+
+
+st_src_reg::st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component, unsigned array_id)
+{
+ assert(file != PROGRAM_ARRAY || array_id != 0);
+ this->file = file;
+ this->index = index;
+ this->swizzle = swizzle_for_type(type, component);
+ this->negate = 0;
+ this->abs = 0;
+ this->index2D = 0;
+ this->type = type ? type->base_type : GLSL_TYPE_ERROR;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = index2D;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->swizzle = 0;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+
+st_src_reg st_src_reg::get_abs()
+{
+ st_src_reg reg = *this;
+ reg.negate = 0;
+ reg.abs = 1;
+ return reg;
+}
+
+st_src_reg::st_src_reg(st_dst_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->double_reg2 = false;
+ this->array_id = reg.array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_dst_reg::st_dst_reg(st_src_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->writemask = WRITEMASK_XYZW;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->array_id = reg.array_id;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+st_dst_reg::st_dst_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->array_id = 0;
+}
+
+
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.h b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
new file mode 100644
index 0000000000..9a2a3efa7e
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
@@ -0,0 +1,165 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <mesa/main/mtypes.h>
+#include <compiler/glsl_types.h>
+#include <compiler/glsl/ir.h>
+#include <tgsi/tgsi_info.h>
+#include <stack>
+#include <vector>
+
+class st_dst_reg;
+
+/**
+ * This struct is a corresponding struct to TGSI ureg_src.
+ */
+class st_src_reg {
+public:
+ st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component = 0, unsigned array_id = 0);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D);
+
+ st_src_reg();
+
+ explicit st_src_reg(st_dst_reg reg);
+
+ st_src_reg get_abs();
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+
+ uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
+ int negate:4; /**< NEGATE_XYZW mask from mesa */
+ unsigned abs:1;
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ /*
+ * Is this the second half of a double register pair?
+ * currently used for input mapping only.
+ */
+ unsigned double_reg2:1;
+ unsigned is_double_vertex_input:1;
+ unsigned array_id:10;
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+
+};
+
+class st_dst_reg {
+public:
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index);
+
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type);
+
+ st_dst_reg();
+
+ explicit st_dst_reg(st_src_reg reg);
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ unsigned array_id:10;
+
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+};
+
+class glsl_to_tgsi_instruction : public exec_node {
+public:
+ DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
+
+ st_dst_reg dst[2];
+ st_src_reg src[4];
+ st_src_reg resource; /**< sampler or buffer register */
+ st_src_reg *tex_offsets;
+
+ /** Pointer to the ir source this tree came from for debugging */
+ ir_instruction *ir;
+
+ unsigned op:8; /**< TGSI opcode */
+ unsigned saturate:1;
+ unsigned is_64bit_expanded:1;
+ unsigned sampler_base:5;
+ unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
+ unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
+ glsl_base_type tex_type:5;
+ unsigned tex_shadow:1;
+ unsigned image_format:9;
+ unsigned tex_offset_num_offset:3;
+ unsigned dead_mask:4; /**< Used in dead code elimination */
+ unsigned buffer_access:3; /**< buffer access type */
+
+ const struct tgsi_opcode_info *info;
+};
+
+struct rename_reg_pair {
+ bool valid;
+ int new_reg;
+};
+
+inline bool
+is_resource_instruction(unsigned opcode)
+{
+ switch (opcode) {
+ case TGSI_OPCODE_RESQ:
+ case TGSI_OPCODE_LOAD:
+ case TGSI_OPCODE_ATOMUADD:
+ case TGSI_OPCODE_ATOMXCHG:
+ case TGSI_OPCODE_ATOMCAS:
+ case TGSI_OPCODE_ATOMAND:
+ case TGSI_OPCODE_ATOMOR:
+ case TGSI_OPCODE_ATOMXOR:
+ case TGSI_OPCODE_ATOMUMIN:
+ case TGSI_OPCODE_ATOMUMAX:
+ case TGSI_OPCODE_ATOMIMIN:
+ case TGSI_OPCODE_ATOMIMAX:
+ return true;
+ default:
+ return false;
+ }
+}
+
+inline unsigned
+num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->num_dst;
+}
+
+inline unsigned
+num_inst_src_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->is_tex || is_resource_instruction(op->op) ?
+ op->info->num_src - 1 : op->info->num_src;
+}
+
--
2.13.0
Gert Wollny
2017-06-18 17:42:59 UTC
Permalink
This patch ties in the new temporary register lifetime estimation and
rename mapping evaluation. In order to enable it, the environment
variable MESA_GLSL_TO_TGSI_NEW_MERGE must be set.

To compare the performance of the current and the new implementation,
the shader-db was run in one thread; numbers are in % of the total
run.

-----------------------------------------------------------
                     old          new(qsort)   new(std::sort)

------------------------ valgrind -------------------------
merge                0.21         0.20         0.13
estimate lifetime    0.03         0.05         0.05
evaluate mapping     (incl=0.16)  0.12         0.06
apply mapping        0.02         0.02         0.02

--- perf (approximate because of statistical sampling) ----
merge                0.24         0.20         0.14
estimate lifetime    0.03         0.05         0.05
evaluate mapping     (incl=0.16)  0.10         0.04
apply mapping        0.05         0.05         0.05
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 33 ++++++++++++++++------
.../tests/test_glsl_to_tgsi_lifetime.cpp | 12 ++++++++
2 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index ebe87a7821..f475b448c9 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,7 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
-#include "st_glsl_to_tgsi_private.h"
+#include "st_glsl_to_tgsi_temprename.h"

#include "util/hash_table.h"
#include <algorithm>
@@ -322,6 +322,7 @@ public:

void merge_two_dsts(void);
void merge_registers(void);
+ void merge_registers_alternative(void);
void renumber_registers(void);

void emit_block_mov(ir_assignment *ir, const struct glsl_type *type,
@@ -567,7 +568,7 @@ glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
if (swz > 1) {
dinst->src[j].double_reg2 = true;
dinst->src[j].index++;
- }
+ }

if (swz & 1)
dinst->src[j].swizzle = MAKE_SWIZZLE4(SWIZZLE_Z, SWIZZLE_W, SWIZZLE_Z, SWIZZLE_W);
@@ -2093,7 +2094,7 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
st_src_reg temp = get_temp(glsl_type::uvec4_type);
st_dst_reg temp_dst = st_dst_reg(temp);
unsigned orig_swz = op[0].swizzle;
- /*
+ /*
* To convert unsigned to 64-bit:
* zero Y channel, copy X channel.
*/
@@ -2571,8 +2572,8 @@ glsl_to_tgsi_visitor::visit(ir_dereference_array *ir)
if (index) {

if (this->prog->Target == GL_VERTEX_PROGRAM_ARB &&
- src.file == PROGRAM_INPUT)
- element_size = attrib_type_size(ir->type, true);
+ src.file == PROGRAM_INPUT)
+ element_size = attrib_type_size(ir->type, true);
if (is_2D) {
src.index2D = index->value.i[0];
src.has_index2 = true;
@@ -2854,7 +2855,7 @@ glsl_to_tgsi_visitor::emit_block_mov(ir_assignment *ir, const struct glsl_type *
if (type->is_dual_slot()) {
l->index++;
if (r->is_double_vertex_input == false)
- r->index++;
+ r->index++;
}
}

@@ -5137,6 +5138,18 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
}
}

+void
+glsl_to_tgsi_visitor::merge_registers_alternative(void)
+{
+ struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct rename_reg_pair, this->next_temp);
+ struct lifetime *lt = ralloc_array(mem_ctx, struct lifetime, this->next_temp);
+ estimate_temporary_lifetimes(mem_ctx, &this->instructions, this->next_temp, lt);
+ evaluate_remapping(mem_ctx, this->next_temp, lt, renames);
+ rename_temp_registers(&renames[0]);
+ ralloc_free(lt);
+ ralloc_free(renames);
+}
+
/* Merges temporary registers together where possible to reduce the number of
* registers needed to run a program.
*
@@ -6601,8 +6614,12 @@ get_mesa_program_tgsi(struct gl_context *ctx,
while (v->eliminate_dead_code());

v->merge_two_dsts();
- if (!skip_merge_registers)
- v->merge_registers();
+ if (!skip_merge_registers) {
+ if (getenv("MESA_GLSL_TO_TGSI_NEW_MERGE") != NULL)
+ v->merge_registers_alternative();
+ else
+ v->merge_registers();
+ }
v->renumber_registers();

/* Write the END instruction. */
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 8fd62d1db3..d63daea80e 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -1027,4 +1027,16 @@ TEST_F(RegisterRemapping, RegisterRemapping2)
run(lt, expect);
}

+TEST_F(RegisterRemapping, RegisterRemappingMergeAll)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {1, 2},
+ {2, 3},
+ {3, 4},
+ });
+ vector<int> expect({0, 1, 1, 1, 1});
+ run(lt, expect);
+}
+
--
2.13.0
Gert Wollny
2017-06-18 17:42:55 UTC
Permalink
This patch adds a class for tracking the life times of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, in order to minimize the number of memory allocations needed, the
program is scanned to count the number of scopes.
Then, the program is scanned a second time to record the important register
access time points: first and last reads and writes, and their link to the
execution scope (loop, if/else branch, switch case).
In the third step, the actual minimal life time is evaluated for each
register.
---
src/mesa/Makefile.sources | 2 +
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 648 +++++++++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 33 ++
3 files changed, 683 insertions(+)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..2359ec3c7d 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+ state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
new file mode 100644
index 0000000000..aa3bad78c0
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -0,0 +1,648 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_temprename.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+#include <limits>
+
+using std::numeric_limits;
+
+enum e_scope_type {
+ sct_outer,
+ sct_loop,
+ sct_if,
+ sct_else,
+ sct_switch,
+ sct_switch_case,
+ sct_switch_default,
+ sct_unknown
+};
+
+enum e_acc_type {
+ acc_read,
+ acc_write
+};
+
+class prog_scope {
+
+public:
+ prog_scope();
+ prog_scope(e_scope_type type, int id, int lvl, int s_begin);
+ prog_scope(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin);
+
+ e_scope_type type() const { return scope_type; }
+ prog_scope *parent() { return parent_scope; }
+ int level() const {return nested_level; }
+ int id() const { return scope_id; }
+ int end() const {return scope_end; }
+ int begin() const {return scope_begin; }
+ int loop_continue_line() const {return loop_continue;}
+
+ prog_scope *in_ifelse();
+ prog_scope *in_switchcase();
+ prog_scope *in_conditional();
+
+ bool in_loop() const;
+ prog_scope *get_innermost_loop();
+ bool is_conditional() const;
+ bool contains(prog_scope *scope) const;
+ void set_end(int end);
+ void set_previous(prog_scope *prev);
+ void set_continue(prog_scope *scope, int i);
+ bool enclosed_by_loop_prior_to_switch();
+ prog_scope *get_outermost_loop();
+
+private:
+ e_scope_type scope_type;
+ int scope_id;
+ int nested_level;
+ int scope_begin;
+ int scope_end;
+ int loop_continue;
+
+ prog_scope *scope_of_loop_to_continue;
+ prog_scope *previous_switchcase;
+ prog_scope *parent_scope;
+
+};
+
+class temp_access {
+
+public:
+ temp_access();
+ void append(int index, e_acc_type rw, prog_scope *pstate);
+ lifetime get_required_lifetime();
+
+private:
+
+ prog_scope *last_read_scope;
+ prog_scope *first_read_scope;
+ prog_scope *first_write_scope;
+
+ int first_write;
+ int last_read;
+ int last_write;
+ int first_read;
+};
+
+
+class tgsi_temp_lifetime {
+public:
+ tgsi_temp_lifetime(void *mem_ctx);
+ void get_lifetimes(exec_list *instructions,
+ int ntemps,struct lifetime *lifetimes);
+private:
+ prog_scope *make_scope(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin);
+ void evaluate();
+
+ prog_scope *scopes;
+ int n_scopes;
+ int cur_scope;
+ void *mem_ctx;
+};
+
+tgsi_temp_lifetime::tgsi_temp_lifetime(void *mc):
+ scopes(0),
+ n_scopes(0),
+ cur_scope(0),
+ mem_ctx(mc)
+{
+}
+
+prog_scope *
+tgsi_temp_lifetime::make_scope(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin)
+{
+ scopes[cur_scope] = prog_scope(p, type, id, lvl, s_begin);
+ return &scopes[cur_scope++];
+}
+
+void
+tgsi_temp_lifetime::get_lifetimes(exec_list *instructions, int ntemps,
+ struct lifetime *lifetimes)
+{
+ int line = 0;
+ int loop_id = 0;
+ int if_id = 0;
+ int switch_id = 0;
+ int nesting_lvl = 0;
+ bool is_at_end = false;
+
+ int n_scopes = 1;
+
+ /* count scopes to allocate the needed space without the need for
+ * re-allocation */
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (inst->op == TGSI_OPCODE_BGNLOOP ||
+ inst->op == TGSI_OPCODE_SWITCH ||
+ inst->op == TGSI_OPCODE_CASE ||
+ inst->op == TGSI_OPCODE_IF ||
+ inst->op == TGSI_OPCODE_UIF ||
+ inst->op == TGSI_OPCODE_ELSE ||
+ inst->op == TGSI_OPCODE_DEFAULT)
+ ++n_scopes;
+ }
+
+ scopes = ralloc_array(mem_ctx, prog_scope, n_scopes);
+
+ /* using this new with mem_ctx segfaults ..., but we must call
+ * the standard constructor. The alternative option would be
+ * to use ralloc_array and the placement new but I doubt that
+ * this would make any difference in performance */
+ temp_access *acc = new temp_access[ntemps];
+
+ prog_scope *current = make_scope(nullptr, sct_outer, 0, nesting_lvl++, line);
+
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+
+ assert(!is_at_end && "Found instructions past TGSI_OPCODE_END");
+
+ switch (inst->op) {
+ case TGSI_OPCODE_BGNLOOP: {
+ current = make_scope(current, sct_loop, loop_id,
+ nesting_lvl, line);
+ ++loop_id;
+ ++nesting_lvl;
+ break;
+ }
+ case TGSI_OPCODE_ENDLOOP: {
+ --nesting_lvl;
+ current->set_end(line);
+ current = current->parent();
+ break;
+ }
+ case TGSI_OPCODE_IF:
+ case TGSI_OPCODE_UIF:{
+ if (inst->src[0].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[0].index].append(line, acc_read, current);
+ }
+ current = make_scope(current, sct_if, if_id, nesting_lvl, line);
+ ++if_id;
+ ++nesting_lvl;
+ break;
+ }
+ case TGSI_OPCODE_ELSE: {
+ current->set_end(line-1);
+ current = make_scope(current->parent(), sct_else,
+ current->id(), current->level(), line);
+ break;
+ }
+ case TGSI_OPCODE_END:{
+ current->set_end(line);
+ is_at_end = true;
+ break;
+ }
+ case TGSI_OPCODE_ENDIF:{
+ --nesting_lvl;
+ current->set_end(line-1);
+ current = current->parent();
+ break;
+ }
+ case TGSI_OPCODE_SWITCH: {
+ current = make_scope(current, sct_switch, switch_id,
+ nesting_lvl, line);
+ ++nesting_lvl;
+ ++switch_id;
+ break;
+ }
+ case TGSI_OPCODE_ENDSWITCH: {
+ --nesting_lvl;
+ current->set_end(line-1);
+
+ if (current->type() != sct_switch) {
+ current = current->parent();
+ }
+ current = current->parent();
+ break;
+ }
+ case TGSI_OPCODE_CASE:
+ if (inst->src[0].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[0].index].append(line, acc_read, current);
+ }
+ /* fall through */
+ case TGSI_OPCODE_DEFAULT: {
+ auto scope_type = (inst->op == TGSI_OPCODE_CASE) ?
+ sct_switch_case : sct_switch_default;
+ if (current->type() == sct_switch) {
+ current = make_scope(current, scope_type, current->id(),
+ nesting_lvl, line);
+ } else {
+ auto p = current->parent();
+ auto scope = make_scope(p, scope_type, p->id(),
+ p->level(), line);
+ if (current->end() == -1)
+ scope->set_previous(current);
+ current = scope;
+ }
+ break;
+ }
+ case TGSI_OPCODE_BRK: {
+ if ((current->type() == sct_switch_case) ||
+ (current->type() == sct_switch_default)) {
+ current->set_end(line-1);
+ }
+ /* Make sure that the nearest enclosing scope is a loop
+ * and not a switch case.
+ * Apart from that this is like a continue, just
+ * a bit more final */
+ else if (current->enclosed_by_loop_prior_to_switch()) {
+ current->set_continue(current, line);
+ }
+ break;
+ }
+ case TGSI_OPCODE_CONT: {
+ current->set_continue(current, line);
+ break;
+ }
+ default: {
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].append(line, acc_read, current);
+ }
+ }
+ for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->tex_offsets[j].index].append(line, acc_read, current);
+ }
+ }
+ for (unsigned j = 0; j < num_inst_dst_regs(inst); j++) {
+ if (inst->dst[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->dst[j].index].append(line, acc_write, current);
+ }
+ }
+ }
+ }
+
+ ++line;
+ }
+
+ /* make sure last scope is closed, even though no
+ * TGSI_OPCODE_END was given */
+ if (current->end() < 0) {
+ current->set_end(line-1);
+ }
+
+ for(int i = 1; i < ntemps; ++i) {
+ lifetimes[i] = acc[i].get_required_lifetime();
+ }
+ delete[] acc;
+ ralloc_free(scopes);
+}
+
+prog_scope::prog_scope():
+ prog_scope(nullptr, sct_unknown, -1, -1, -1)
+{
+}
+
+prog_scope::prog_scope(e_scope_type type, int id, int lvl, int s_begin):
+ prog_scope(nullptr, type, id, lvl, s_begin)
+{
+}
+
+prog_scope::prog_scope(prog_scope *p, e_scope_type type, int id, int lvl,
+ int s_begin):
+ scope_type(type),
+ scope_id(id),
+ nested_level(lvl),
+ scope_begin(s_begin),
+ scope_end(-1),
+ loop_continue(numeric_limits<int>::max()),
+ scope_of_loop_to_continue(nullptr),
+ previous_switchcase(nullptr),
+ parent_scope(p)
+{
+
+}
+
+bool prog_scope::in_loop() const
+{
+ if (scope_type == sct_loop)
+ return true;
+ if (parent_scope)
+ return parent_scope->in_loop();
+ return false;
+}
+
+prog_scope *prog_scope::get_innermost_loop()
+{
+ if (scope_type == sct_loop)
+ return this;
+ if (parent_scope)
+ return parent_scope->get_innermost_loop();
+ else
+ return nullptr;
+}
+
+prog_scope *
+prog_scope::get_outermost_loop()
+{
+ prog_scope *loop = nullptr;
+ if (scope_type == sct_loop)
+ loop = this;
+ if (parent_scope) {
+ prog_scope *l = parent_scope->get_outermost_loop();
+ if (l)
+ loop = l;
+ }
+ return loop;
+}
+
+bool prog_scope::contains(prog_scope *other) const
+{
+ return (begin() <= other->begin()) && (end() >= other->end());
+}
+
+bool prog_scope::is_conditional() const
+{
+ return scope_type == sct_if || scope_type == sct_else ||
+ scope_type == sct_switch_case || scope_type == sct_switch_default;
+}
+
+prog_scope *prog_scope::in_conditional()
+{
+ if (is_conditional())
+ return this;
+ if (parent_scope)
+ return parent_scope->in_conditional();
+ return nullptr;
+}
+
+bool prog_scope::enclosed_by_loop_prior_to_switch()
+{
+ if (scope_type == sct_loop)
+ return true;
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default ||
+ scope_type == sct_switch)
+ return false;
+ if (parent_scope)
+ return parent_scope->enclosed_by_loop_prior_to_switch();
+ else
+ return false;
+}
+
+prog_scope *prog_scope::in_ifelse()
+{
+ if ((scope_type == sct_if) ||
+ (scope_type == sct_else))
+ return this;
+ else if (parent_scope)
+ return parent_scope->in_ifelse();
+ else
+ return nullptr;
+}
+
+prog_scope *prog_scope::in_switchcase()
+{
+ if ((scope_type == sct_switch_case) ||
+ (scope_type == sct_switch_default))
+ return this;
+ else if (parent_scope)
+ return parent_scope->in_switchcase();
+ else
+ return nullptr;
+}
+
+void prog_scope::set_previous(prog_scope *prev)
+{
+ previous_switchcase = prev;
+}
+
+void prog_scope::set_end(int end)
+{
+ if (scope_end == -1) {
+ scope_end = end;
+ if (previous_switchcase)
+ parent_scope->set_end(end);
+ }
+}
+
+void prog_scope::set_continue(prog_scope *scope, int line)
+{
+ if (scope_type == sct_loop) {
+ scope_of_loop_to_continue = scope;
+ loop_continue = line;
+ } else if (parent_scope)
+ parent_scope->set_continue(scope, line);
+}
+
+temp_access::temp_access():
+ last_read_scope(nullptr),
+ first_read_scope(nullptr),
+ first_write_scope(nullptr),
+ first_write(-1),
+ last_read(-1),
+ last_write(-1),
+ first_read(numeric_limits<int>::max())
+{
+}
+
+void temp_access::append(int line, e_acc_type acc, prog_scope *scope)
+{
+ last_write = line;
+ if (acc == acc_read) {
+ last_read = line;
+ last_read_scope = scope;
+ if (first_read > line) {
+ first_read = line;
+ first_read_scope = scope;
+ }
+ } else {
+ if (first_write == -1) {
+ first_write = line;
+ first_write_scope = scope;
+ }
+ }
+}
+
+inline lifetime make_lifetime(int b, int e)
+{
+ return lifetime{b,e};
+}
+
+lifetime temp_access::get_required_lifetime()
+{
+ bool keep_for_full_loop = false;
+ prog_scope *target_write_scope = first_write_scope;
+ prog_scope *target_read_scope = nullptr;
+
+ /* this temp is only read, this is undefined
+ * behaviour, so we can use the register otherwise */
+ if (!first_write_scope) {
+ return make_lifetime(-1, -1);
+ }
+
+ /* Only written to, just make sure that renaming
+ * doesn't reuse this register too early (corner
+ * case is the one opcode with two destinations) */
+ if (!last_read_scope) {
+ return make_lifetime(first_write, last_write + 1);
+ }
+
+ if (first_read <= first_write) {
+ /* If we conditionally read first before write and we are in a
+ * loop that also contains the first write, then this read is unlikely
+ * to be undefined, and we will have to keep the variable for
+ * all containing loops, otherwise we don't care */
+ prog_scope *fr_conditional_scope = first_read_scope->in_conditional();
+ if (fr_conditional_scope) {
+ prog_scope *fr_loop = fr_conditional_scope->get_outermost_loop();
+ if (fr_loop && fr_loop->contains(target_write_scope)) {
+ keep_for_full_loop = true;
+ target_write_scope = fr_loop;
+ }
+ }
+ }
+
+ /* If the first write is conditional within a loop, and the
+ * last read is not within the same condition scope, then we
+ * have to keep the temporary for all containing loops */
+ if (!keep_for_full_loop) {
+ auto fw_conditional_scope = target_write_scope->in_conditional();
+ if (fw_conditional_scope && fw_conditional_scope->in_loop()) {
+ if (!fw_conditional_scope->contains(last_read_scope))
+ keep_for_full_loop = true;
+ }
+ }
+
+ /* evaluate the shared scope */
+ int target_level = -1;
+
+ /* If the variable must be kept for all the loops, then
+ * find the outermost loop that contains both the last read
+ * and the first write */
+ target_read_scope = last_read_scope;
+ if (keep_for_full_loop) {
+ prog_scope *target_scope = target_read_scope->get_outermost_loop();
+
+ /* If the read scope is within a loop, then just go up until
+ * the scope also contains the first write, otherwise do the
+ * same for the write scope */
+ if (target_scope) {
+ while (!target_scope->contains(target_write_scope))
+ target_scope = target_scope->parent();
+ } else {
+ target_scope = target_write_scope->get_outermost_loop();
+ assert(target_scope && "at this point read or write must be in a loop");
+ while (!target_scope->contains(target_read_scope))
+ target_scope = target_scope->parent();
+ }
+ target_level = target_scope->level();
+ }
+
+ /* The shared scope is not yet defined, so find the scope that
+ * contains first write and last read */
+ while (target_level < 0) {
+ if (target_read_scope->contains(target_write_scope)) {
+ target_level = target_read_scope->level();
+ } else if (target_write_scope->contains(target_read_scope)) {
+ target_level = target_write_scope->level();
+ } else {
+ target_read_scope = target_read_scope->parent();
+ }
+ }
+
+
+ /* propagate the read scope to the target_level */
+ while (last_read_scope->level() > target_level) {
+
+ /* if the read is in a loop we need to extend the
+ * variables life time to the end of that loop */
+ if (last_read_scope->type() == sct_loop) {
+ last_read = last_read_scope->end();
+ }
+ last_read_scope = last_read_scope->parent();
+ }
+
+ /* Prepare the write scope before propagating it. */
+ /* propagate lifetime also if there was a continue/break
+ * in a loop and the write was after it (so it constitutes
+ * a conditional write) */
+ if (first_write_scope->loop_continue_line() < first_write) {
+ keep_for_full_loop = true;
+ }
+
+ /* propagate lifetimes before moving to upper scopes */
+ if ((first_write_scope->type() == sct_loop) &&
+ (keep_for_full_loop || (first_read < first_write))) {
+ first_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ /* propagate the first_write scope to the target_level */
+ while (target_level < first_write_scope->level()) {
+
+ first_write_scope = first_write_scope->parent();
+
+ if (first_write_scope->loop_continue_line() < first_write) {
+ keep_for_full_loop = true;
+ }
+
+ /* if the value is conditionally written in a loop
+ * then propagate its lifetime to the full loop */
+ if (first_write_scope->type() == sct_loop) {
+ if (keep_for_full_loop) {
+ first_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+ }
+
+ /* if we currently don't propagate the lifetime but
+ * the enclosing scope is a conditional within a loop
+ * up to the last-read level we need to propagate,
+ * todo: to tighten the life time check whether the value
+ * is written in all conditional code paths below the loop */
+ if (!keep_for_full_loop &&
+ first_write_scope->is_conditional() &&
+ first_write_scope->in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
+
+
+ /* We do not correct the last_write for scope, but
+ * if it is past the last_read we have to keep the
+ * temporary alive past this instruction */
+ if (last_write > last_read) {
+ last_read = last_write + 1;
+ }
+
+ return make_lifetime(first_write, last_read);
+}
+
+/* Wrapper function for the temporary register life time estimation.
+*/
+
+void
+estimate_temporary_lifetimes(void *mem_ctx, exec_list *instructions,
+ int ntemps, struct lifetime *lifetimes)
+{
+ return tgsi_temp_lifetime(mem_ctx).get_lifetimes(instructions, ntemps, lifetimes);
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
new file mode 100644
index 0000000000..0637ffab08
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+
+struct lifetime {
+ int begin;
+ int end;
+};
+
+void
+estimate_temporary_lifetimes(void *mem_ctx, exec_list *instructions,
+ int ntemps, struct lifetime *lt);
--
2.13.0
Dieter Nützel
2017-06-18 19:32:47 UTC
Permalink
Hello Gert,

Starting with the short log from 'git am ...':

/opt/mesa> git am ~/Dokumente/Software/DRI-Mesa/Radeon/Mesa/mesa-st-glsl_to_tgsi-improved-temp-reg-lifetime-estimation.mbox
Applying: mesa/st: glsl_to_tgsi move some helper classes to extra files
.git/rebase-apply/patch:561: new blank line at EOF.
+
.git/rebase-apply/patch:733: new blank line at EOF.
+
warning: 2 lines add whitespace errors.
Applying: mesa: Propagate c++11 CXXFLAGS from LLVM_CXXFLAGS to mesa/
Applying: mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker
Applying: mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker
.git/rebase-apply/patch:81: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
Applying: mesa/st: glsl_to_tgsi: add register renamame mapping evaluator
Applying: mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
.git/rebase-apply/patch:106: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
Applying: mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
.git/rebase-apply/patch:118: new blank line at EOF.
+
warning: 1 line adds whitespace errors.


More after switching RX580 with Turks XT (6670) 2 GB.

Dieter
Post by Gert Wollny
Dear all,
following the comments of Emil and Nicolai I've updated the patch set.
Changes with respect to the old version are:
- split the changes into more patches
- correct formatting errors
- remove the use of the STL, with one exception: std::sort is already
  used in st_glsl_to_tgsi.cpp and its run-time performance is
  significantly better than that of qsort, so it is used in the register
  rename mapping evaluation. It can be disabled by commenting out the
  define USE_STL_SORT in st_glsl_to_tgsi_temprename.cpp.
- add more tests and improve the life-time evaluation accordingly
- further reduce memory allocations
The algorithm is the same as described before, with the small exception that
now an initial dry run over the instructions is used to count the number of
scopes. The run-time overhead of this operation is negligible.
In order to make it easier to transition to the new code and to test it, I
tied it in parallel to the old code. It can be enabled by setting the
environment variable MESA_GLSL_TO_TGSI_NEW_MERGE.
A piglit run on the "shader" test set doesn't show any changes. The additional
passing test I reported for v2 no longer passes, probably because of the
more conservative life-time estimation required to make the new (valid) tests
pass, but as I wrote before, the problem with this shader
(and its sister *vec3*) is, IMHO, not solvable by better
register renaming.
The performance numbers estimated by running the shader-db are given in the
commit message of the last patch; the trend is the same as reported before.
Many thanks for any comments,
Gert
mesa/st: glsl_to_tgsi move some helper classes to extra files
mesa: Propagate c++11 CXXFLAGS from LLVM_CXXFLAGS to mesa/
mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker
mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker
mesa/st: glsl_to_tgsi: add register renamame mapping evaluator
mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
configure.ac | 1 +
src/mesa/Makefile.am | 4 +-
src/mesa/Makefile.sources | 4 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 319 +-----
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 207 ++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 165 ++++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 752 ++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 36 +
src/mesa/state_tracker/tests/Makefile.am | 41 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 1042 ++++++++++++++++++++
10 files changed, 2277 insertions(+), 294 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
Gert Wollny
2017-06-18 21:54:28 UTC
Permalink
It seems some of the tests I added require too long a life-time for
the registers, so that the actual aim of the patch is no longer
achieved.

Comments on the code and algorithm structure are welcome, but I have to
rethink the tests, because they seem to be too focused on corner-cases
that may actually border on undefined behavior, and I will have to
review that.

@Dieter: if you want to test the algorithm, try v2. It was far from
ready to be applied to mesa, but you could see whether the algorithm
gives you useful results.

Sorry for the noise,
Gert
Post by Gert Wollny
Dear all,
following the comments of Emil and Nicolai I've updated the patch set.
Changes with respect to the old version are:
- split the changes into more patches
- correct formatting errors
- remove the use of the STL, with one exception: std::sort is already
  used in st_glsl_to_tgsi.cpp and its run-time performance is
  significantly better than that of qsort, so it is used in the register
  rename mapping evaluation. It can be disabled by commenting out the
  define USE_STL_SORT in st_glsl_to_tgsi_temprename.cpp.
- add more tests and improve the life-time evaluation accordingly
- further reduce memory allocations
The algorithm is the same as described before, with the small exception that
now an initial dry run over the instructions is used to count the number of
scopes. The run-time overhead of this operation is negligible.
In order to make it easier to transition to the new code and to test it, I
tied it in parallel to the old code. It can be enabled by setting the
environment variable MESA_GLSL_TO_TGSI_NEW_MERGE.
A piglit run on the "shader" test set doesn't show any changes. The additional
passing test I reported for v2 no longer passes, probably because of the
more conservative life-time estimation required to make the new (valid) tests
pass, but as I wrote before, the problem with this shader
rd
(and its sister *vec3*) is, IMHO, not solvable by better register renaming.
The performance numbers estimated by running the shader-db are given in the
commit message of the last patch; the trend is the same as reported before.
Many thanks for any comments,
Gert 
  mesa/st: glsl_to_tgsi move some helper classes to extra files
  mesa: Propagate c++11 CXXFLAGS from LLVM_CXXFLAGS to mesa/
  mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker
  mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker
  mesa/st: glsl_to_tgsi: add register renamame mapping evaluator
  mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
  mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
 configure.ac                                       |    1 +
 src/mesa/Makefile.am                               |    4 +-
 src/mesa/Makefile.sources                          |    4 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp         |  319 +-----
 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp |  207 ++++
 src/mesa/state_tracker/st_glsl_to_tgsi_private.h   |  165 ++++
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   |  752 ++++++++++++++
 .../state_tracker/st_glsl_to_tgsi_temprename.h     |   36 +
 src/mesa/state_tracker/tests/Makefile.am           |   41 +
 .../tests/test_glsl_to_tgsi_lifetime.cpp           | 1042 ++++++++++++++++++++
 10 files changed, 2277 insertions(+), 294 deletions(-)
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
 create mode 100644 src/mesa/state_tracker/tests/Makefile.am
 create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
Gert Wollny
2017-06-21 12:59:03 UTC
Permalink
following the comments of Emil and Nicolai I've updated the patch set.
Because v3 was borked I restarted from v2.

Changes with respect to the version 2 are:

- split the changes into more patches
- correct formatting errors
- rename functions and methods to better clarify what they are used for
- remove unused methods and variables in prog_scope
- eliminate the class tgsi_temp_lifetime
- remove the use of the STL in the core library, with one exception:
std::sort is still used in the register rename mapping evaluation, since it
is already used in st_glsl_to_tgsi.cpp and its run-time performance is
significantly better than that of qsort. It can be disabled by commenting
out the define USE_STL_SORT in st_glsl_to_tgsi_temprename.cpp.
- add more tests and improve the life-time evaluation accordingly
- further reduce memory allocations
- no longer require C++11 for the core library code
- the tests, however, make use of C++11 and the STL

The algorithm is the same as described before, with the small exception that
now a dry run over the instructions is done first to count the number of
scopes. The run-time overhead of this operation is negligible.
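
As a rough illustration of the dry run, the scope count could be obtained as
sketched below. The Op enum and count_scopes() are hypothetical stand-ins for
the TGSI opcodes and for the loop in the patch; only the set of scope-opening
instructions is taken from the patch (OP_IF stands for both IF and UIF).

```cpp
#include <cassert>
#include <vector>

/* Hypothetical, reduced opcode set; not the real TGSI enum. */
enum Op {
   OP_MOV, OP_BGNLOOP, OP_ENDLOOP, OP_IF, OP_ELSE, OP_ENDIF,
   OP_SWITCH, OP_CASE, OP_DEFAULT, OP_ENDSWITCH, OP_END
};

/* Count how many scopes the second pass will create, so that the
 * storage for them can be allocated once, up front. */
int count_scopes(const std::vector<Op>& program)
{
   int n_scopes = 1; /* the outer scope always exists */
   for (Op op : program) {
      switch (op) {
      case OP_BGNLOOP:
      case OP_IF:
      case OP_ELSE:
      case OP_SWITCH:
      case OP_CASE:
      case OP_DEFAULT:
         ++n_scopes; /* each of these opens a new scope */
         break;
      default:
         break;
      }
   }
   assert(n_scopes >= 1);
   return n_scopes;
}
```

For a loop that contains a single if/else this gives four scopes: the outer
scope, the loop, the if branch and the else branch.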

In order to make it easier to transition to the new code and test it, I tied it
in parallel to the old code. It can be enabled by setting the environment
variable MESA_GLSL_TO_TGSI_NEW_MERGE.

piglit run on the "shader" test set shows no regressions and fixes
***@glsl-***@execution@variable-***@gs-input-array-vec2-index-rd

The performance numbers estimated by running the shader-db are given in the
commit message of the last patch. The trend is the same as reported before,
i.e. the overall run-times are a bit lower, mostly because the new evaluation
for the mapping uses a binary search. However, because of the stochastic
sampling, measuring these numbers with perf borders on statistical
insignificance; the influence on the overall run-time is just too low.

I've also run a few programs (GPUtest benchmarks, Unigine-valley, Unigine-heaven)
and couldn't see any visual indications that registers would be merged
wrongly.

Many thanks for any comments,
Gert

Gert Wollny (6):
mesa/st: glsl_to_tgsi move some helper classes to extra files
mesa/st: glsl_to_tgsi: implement new temporary register lifetime
tracker
mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime
tracker
mesa/st: glsl_to_tgsi: add register renamame mapping evaluator
mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
mesa/st: glsl_to_tgsi: tie in new temporary register merge approach

configure.ac | 1 +
src/mesa/Makefile.am | 2 +-
src/mesa/Makefile.sources | 4 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 315 +-----
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 207 ++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 165 +++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 761 ++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 36 +
src/mesa/state_tracker/tests/Makefile.am | 40 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 1070 ++++++++++++++++++++
10 files changed, 2313 insertions(+), 288 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
--
2.13.0
Gert Wollny
2017-06-21 12:59:08 UTC
Permalink
The patch adds tests for the register rename mapping evaluation.
---
.../tests/test_glsl_to_tgsi_lifetime.cpp | 94 ++++++++++++++++++++++
1 file changed, 94 insertions(+)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 5f3378637a..f53b5c23a1 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -89,6 +89,13 @@ protected:
void check(const vector<lifetime>& result, const expectation& e);
};

+/* With this test class the renaming mapping estimation is tested */
+class RegisterRemapping : public MesaTestWithMemCtx {
+protected:
+ void run(const vector<lifetime>& lt, const vector<int>& expect);
+};
+
+
/* This test class checks that the life time covers at least
* the expected range. It is used for cases where we know that
* the implementation could be improved on estimating the minimal
@@ -466,6 +473,29 @@ TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
run (code, expectation({{-1,-1},{0, 9}}));
}

+/* Here we read and write to the same temp, but it is conditional,
+ * so the lifetime must start with the first read */
+TEST_F(LifetimeEvaluatorExactTest, WriteConditionallyFromSelf)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_USEQ, {0}, {in0, in1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_FSLT, {2}, {1, in1}, {}},
+ {TGSI_OPCODE_UIF, {2}, {}, {}},
+ {TGSI_OPCODE_MOV, {3}, {in1}, {}},
+ {TGSI_OPCODE_ELSE},
+ {TGSI_OPCODE_MOV, {4}, {in1}, {}},
+ {TGSI_OPCODE_MOV, {4}, {4}, {}},
+ {TGSI_OPCODE_MOV, {3}, {4}, {}},
+ {TGSI_OPCODE_ENDIF},
+ {TGSI_OPCODE_MOV,{out1}, {3}, {}},
+ {TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1, 5}, {5, 6}, {7, 13}, {9, 11}}));
+}

TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
{
@@ -831,6 +861,47 @@ TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
run (code, expectation({{-1,-1},{0, 8}}));
}

+TEST_F(RegisterRemapping, RegisterRemapping1)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {1, 2},
+ {2, 10},
+ {3, 5},
+ {5, 10}
+ });
+
+ vector<int> expect({0, 1, 2, 1, 1, 2, 2});
+ run(lt, expect);
+}
+
+
+TEST_F(RegisterRemapping, RegisterRemapping2)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {3, 3},
+ {4, 4},
+ });
+ vector<int> expect({0, 1, 2, 1, 1});
+ run(lt, expect);
+}
+
+TEST_F(RegisterRemapping, RegisterRemappingMergeAll)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {1, 2},
+ {2, 3},
+ {3, 4},
+ });
+ vector<int> expect({0, 1, 1, 1, 1});
+ run(lt, expect);
+}
+
+
/* Implementation of helper and test classes */

MockShader::~MockShader()
@@ -974,3 +1045,26 @@ void LifetimeEvaluatorAtLeastTest::check( const vector<lifetime>& lifetimes,
EXPECT_GE(lifetimes[i].end, e[i][1]);
}
}
+
+void RegisterRemapping::run(const vector<lifetime>& lt,
+ const vector<int>& expect)
+{
+ rename_reg_pair proto{false, 0};
+ vector<rename_reg_pair> result(lt.size(), proto);
+
+ get_temp_registers_remapping(mem_ctx, lt.size(), &lt[0], &result[0]);
+
+ vector<int> remap(lt.size());
+ for (unsigned i = 0; i < lt.size(); ++i) {
+ remap[i] = result[i].valid ? result[i].new_reg : i;
+ }
+
+ std::transform(remap.begin(), remap.end(), result.begin(), remap.begin(),
+ [](int x, const rename_reg_pair& rn) {
+ return rn.valid ? rn.new_reg : x;
+ });
+
+ for(unsigned i = 1; i < remap.size(); ++i) {
+ EXPECT_EQ(remap[i], expect[i]);
+ }
+}
--
2.13.0
Gert Wollny
2017-06-21 12:59:05 UTC
Permalink
This patch adds a class for tracking the life times of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, in order to minimize the number of needed memory allocations, the
program is scanned to evaluate the number of scopes.
Then, the program is scanned a second time to record the important register
access time points: first and last reads and writes and their link to the
execution scope (loop, if/else branch, switch case).
In the third step, the actual minimal life time is evaluated for each
register.
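
To make the record-then-evaluate idea more concrete, here is a minimal sketch
for the scope-free case only (straight-line code, no loops or conditionals).
Instr, Lifetime and estimate_lifetimes() are names made up for this example
and are not part of the patch, and the special handling of a temporary that is
read and written in the same instruction is left out. The end-of-life rule
mirrors the patch in that a temporary written after its last read is kept
until the line after that write.

```cpp
#include <cassert>
#include <vector>

struct Lifetime { int begin; int end; };

/* Hypothetical, simplified instruction: one destination temporary and
 * any number of source temporaries. Index 0 means "not a temporary". */
struct Instr {
   int dst;
   std::vector<int> srcs;
};

std::vector<Lifetime>
estimate_lifetimes(const std::vector<Instr>& code, int ntemps)
{
   std::vector<Lifetime> lt(ntemps, Lifetime{-1, -1});
   std::vector<int> last_write(ntemps, -1);
   std::vector<int> last_read(ntemps, -1);

   /* Record the access points of every temporary. */
   for (int line = 0; line < static_cast<int>(code.size()); ++line) {
      for (int s : code[line].srcs) {
         assert(s >= 0 && s < ntemps);
         if (s > 0)
            last_read[s] = line;
      }
      int d = code[line].dst;
      assert(d >= 0 && d < ntemps);
      if (d > 0) {
         if (lt[d].begin < 0)
            lt[d].begin = line; /* first write starts the life time */
         last_write[d] = line;
      }
   }

   /* Evaluate the life time from the recorded access points. */
   for (int i = 1; i < ntemps; ++i) {
      if (lt[i].begin < 0)
         continue; /* never written: register is free, stays {-1, -1} */
      lt[i].end = last_write[i] > last_read[i] ? last_write[i] + 1
                                               : last_read[i];
   }
   return lt;
}
```

The scopes recorded in the second step are what allow the real implementation
to extend these ranges when the accesses sit inside loops or conditionals.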
---
src/mesa/Makefile.sources | 2 +
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 648 +++++++++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 33 ++
3 files changed, 683 insertions(+)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..2359ec3c7d 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+ state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
new file mode 100644
index 0000000000..1e02c4d710
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -0,0 +1,648 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+
+#include "st_glsl_to_tgsi_temprename.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+#include <limits>
+
+/* Without c++11 define the nullptr for forward-compatibility
+ * and better readability */
+#if __cplusplus < 201103L
+#define nullptr 0
+#endif
+
+using std::numeric_limits;
+
+enum e_scope_type {
+ sct_outer,
+ sct_loop,
+ sct_if,
+ sct_else,
+ sct_switch,
+ sct_switch_case,
+ sct_switch_default,
+ sct_unknown
+};
+
+enum e_acc_type {
+ acc_read,
+ acc_write,
+ acc_write_cond_from_self
+};
+
+class prog_scope {
+
+public:
+ prog_scope(prog_scope *parent, e_scope_type type, int id, int depth,
+ int begin);
+
+ e_scope_type type() const;
+ prog_scope *parent() const;
+ int nesting_depth() const;
+ int id() const;
+ int end() const;
+ int begin() const;
+ int loop_continue_line() const;
+
+ const prog_scope *in_ifelse_scope() const;
+ const prog_scope *in_switchcase_scope() const;
+ const prog_scope *innermost_loop() const;
+ const prog_scope *outermost_loop() const;
+
+ bool in_loop() const;
+ bool is_conditional() const;
+ bool break_is_for_switchcase() const;
+ bool contains(const prog_scope& other) const;
+
+ void set_end(int end);
+ void set_previous_case_scope(prog_scope *prev);
+ void set_continue_line(int line);
+
+private:
+ e_scope_type scope_type;
+ int scope_id;
+ int scope_nesting_depth;
+ int scope_begin;
+ int scope_end;
+ int loop_cont_line;
+ prog_scope *previous_case_scope;
+ prog_scope *parent_scope;
+};
+
+class temp_access {
+public:
+ temp_access();
+ void record(int line, e_acc_type rw, prog_scope *scope);
+ lifetime get_required_lifetime();
+private:
+ prog_scope *last_read_scope;
+ prog_scope *first_read_scope;
+ prog_scope *first_write_scope;
+ int first_dominant_write;
+ int last_read;
+ int last_write;
+ int first_read;
+ bool keep_for_full_loop;
+};
+
+/* Some storage class to encapsulate the prog_scope (de-)allocations */
+class prog_scope_storage {
+public:
+ prog_scope_storage(void *mem_ctx, int n);
+ ~prog_scope_storage();
+ prog_scope * create(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin);
+private:
+ void *mem_ctx;
+ int current_slot;
+ prog_scope *storage;
+};
+
+/* Scan the program and estimate the required register life times.
+ * The array lifetimes must be pre-allocated */
+void
+get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
+ int ntemps, struct lifetime *lifetimes)
+{
+
+ int line = 0;
+ int loop_id = 0;
+ int if_id = 0;
+ int switch_id = 0;
+ int scope_level = 0;
+ bool is_at_end = false;
+
+ int n_scopes = 1;
+
+
+
+ /* count scopes to allocate the needed space without the need for
+ * re-allocation */
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (inst->op == TGSI_OPCODE_BGNLOOP ||
+ inst->op == TGSI_OPCODE_SWITCH ||
+ inst->op == TGSI_OPCODE_CASE ||
+ inst->op == TGSI_OPCODE_IF ||
+ inst->op == TGSI_OPCODE_UIF ||
+ inst->op == TGSI_OPCODE_ELSE ||
+ inst->op == TGSI_OPCODE_DEFAULT)
+ ++n_scopes;
+ }
+
+ prog_scope_storage scopes(mem_ctx, n_scopes);
+ temp_access *acc = new temp_access[ntemps];
+ prog_scope *cur_scope = scopes.create(nullptr, sct_outer, 0,
+ scope_level++, line);
+
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (is_at_end) {
+ assert(!"GLSL_TO_TGSI: shader has instructions past end marker");
+ break;
+ }
+
+ switch (inst->op) {
+ case TGSI_OPCODE_BGNLOOP: {
+ cur_scope = scopes.create(cur_scope, sct_loop, loop_id++,
+ scope_level++, line);
+ break;
+ }
+ case TGSI_OPCODE_ENDLOOP: {
+ --scope_level;
+ cur_scope->set_end(line);
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_IF:
+ case TGSI_OPCODE_UIF:{
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].record(line, acc_read, cur_scope);
+ }
+ }
+ cur_scope = scopes.create(cur_scope, sct_if, if_id++,
+ scope_level++, line+1);
+ break;
+ }
+ case TGSI_OPCODE_ELSE: {
+ cur_scope->set_end(line-1);
+ cur_scope = scopes.create(cur_scope->parent(), sct_else,
+ cur_scope->id(),cur_scope->nesting_depth(),
+ line+1);
+ break;
+ }
+ case TGSI_OPCODE_END:{
+ cur_scope->set_end(line);
+ is_at_end = true;
+ break;
+ }
+ case TGSI_OPCODE_ENDIF:{
+ --scope_level;
+ cur_scope->set_end(line-1);
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_SWITCH: {
+ cur_scope = scopes.create(cur_scope, sct_switch, switch_id++,
+ scope_level++, line);
+ break;
+ }
+ case TGSI_OPCODE_ENDSWITCH: {
+ --scope_level;
+ cur_scope->set_end(line-1);
+ /* remove the case level, it might not have been
+ * closed with a break */
+ if (cur_scope->type() != sct_switch ) {
+ cur_scope = cur_scope->parent();
+ }
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_CASE:
+ case TGSI_OPCODE_DEFAULT: {
+ /* Switch cases and default are handled at the same nesting level
+ * as their enclosing switch */
+ e_scope_type t = inst->op == TGSI_OPCODE_CASE ? sct_switch_case
+ : sct_switch_default;
+ prog_scope *switch_scope = cur_scope;
+ if ( cur_scope->type() == sct_switch ) {
+ cur_scope = scopes.create(cur_scope, t, cur_scope->id(),
+ scope_level, line+1);
+ }else{
+ switch_scope = cur_scope->parent();
+ assert(switch_scope->type() == sct_switch);
+ prog_scope *scope = scopes.create(switch_scope, t,
+ switch_scope->id(),
+ switch_scope->nesting_depth(),
+ line);
+
+ /* Previous case falls through */
+ if (cur_scope->end() == -1)
+ scope->set_previous_case_scope(cur_scope);
+ cur_scope = scope;
+ }
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].record(line, acc_read, switch_scope);
+ }
+ }
+ break;
+ }
+ case TGSI_OPCODE_BRK: {
+ if (cur_scope->break_is_for_switchcase()) {
+ cur_scope->set_end(line-1);
+ break;
+ }
+ }
+ case TGSI_OPCODE_CONT: {
+ cur_scope->set_continue_line(line);
+ break;
+ }
+ default: {
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->src[j].index].record(line, acc_read, cur_scope);
+ }
+ }
+ for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->tex_offsets[j].index].record(line, acc_read, cur_scope);
+ }
+ }
+
+ e_acc_type write_type = inst->op == TGSI_OPCODE_UCMP ?
+ acc_write_cond_from_self :
+ acc_write;
+ for (unsigned j = 0; j < num_inst_dst_regs(inst); j++) {
+ if (inst->dst[j].file == PROGRAM_TEMPORARY) {
+ acc[inst->dst[j].index].record(line, write_type, cur_scope);
+ }
+ }
+ }
+ }
+ ++line;
+ }
+
+ /* make sure last scope is closed, even though no
+ * TGSI_OPCODE_END was given */
+ if (cur_scope->end() < 0) {
+ cur_scope->set_end(line-1);
+ }
+ for(int i = 1; i < ntemps; ++i) {
+ lifetimes[i] = acc[i].get_required_lifetime();
+ }
+
+ delete[] acc;
+}
+
+prog_scope::prog_scope(prog_scope *parent, e_scope_type type, int id,
+ int depth, int scope_begin):
+ scope_type(type),
+ scope_id(id),
+ scope_nesting_depth(depth),
+ scope_begin(scope_begin),
+ scope_end(-1),
+ loop_cont_line(numeric_limits<int>::max()),
+ previous_case_scope(nullptr),
+ parent_scope(parent)
+{
+}
+
+e_scope_type prog_scope::type() const
+{
+ return scope_type;
+}
+
+
+prog_scope *prog_scope::parent() const
+{
+ return parent_scope;
+}
+
+int prog_scope::nesting_depth() const
+{
+ return scope_nesting_depth;
+}
+
+bool prog_scope::in_loop() const
+{
+ if (scope_type == sct_loop)
+ return true;
+ if (parent_scope)
+ return parent_scope->in_loop();
+ return false;
+}
+
+const prog_scope *prog_scope::innermost_loop() const
+{
+ if (scope_type == sct_loop)
+ return this;
+ if (parent_scope)
+ return parent_scope->innermost_loop();
+ else
+ return nullptr;
+}
+
+const prog_scope *prog_scope::outermost_loop() const
+{
+ const prog_scope *loop = nullptr;
+ const prog_scope *p = this;
+ do {
+ if (p->type() == sct_loop)
+ loop = p;
+ p = p->parent();
+ } while (p);
+ return loop;
+}
+
+bool prog_scope::contains(const prog_scope& other) const
+{
+ return (begin() <= other.begin()) && (end() >= other.end());
+}
+
+bool prog_scope::is_conditional() const
+{
+ return scope_type == sct_if ||
+ scope_type == sct_else ||
+ scope_type == sct_switch_case ||
+ scope_type == sct_switch_default;
+}
+
+const prog_scope *prog_scope::in_ifelse_scope() const
+{
+ if (scope_type == sct_if ||
+ scope_type == sct_else)
+ return this;
+ else if (parent_scope)
+ return parent_scope->in_ifelse_scope();
+ else
+ return nullptr;
+}
+
+const prog_scope *prog_scope::in_switchcase_scope() const
+{
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default)
+ return this;
+ else if (parent_scope)
+ return parent_scope->in_switchcase_scope();
+ else
+ return nullptr;
+}
+
+bool prog_scope::break_is_for_switchcase() const
+{
+ if (scope_type == sct_loop)
+ return false;
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default ||
+ scope_type == sct_switch)
+ return true;
+ if (parent_scope)
+ return parent_scope->break_is_for_switchcase();
+ else
+ return false;
+}
+
+int prog_scope::id() const
+{
+ return scope_id;
+}
+
+int prog_scope::begin() const
+{
+ return scope_begin;
+}
+
+int prog_scope::end() const
+{
+ return scope_end;
+}
+
+void prog_scope::set_previous_case_scope(prog_scope * prev)
+{
+ previous_case_scope = prev;
+}
+
+void prog_scope::set_end(int end)
+{
+ if (scope_end == -1) {
+ scope_end = end;
+ if (previous_case_scope)
+ previous_case_scope->set_end(end);
+ }
+}
+
+void prog_scope::set_continue_line(int line)
+{
+ if (scope_type == sct_loop) {
+ loop_cont_line = line;
+ } else if (parent_scope)
+ parent()->set_continue_line(line);
+}
+
+int prog_scope::loop_continue_line() const
+{
+ return loop_cont_line;
+}
+
+temp_access::temp_access():
+ last_read_scope(nullptr),
+ first_read_scope(nullptr),
+ first_write_scope(nullptr),
+ first_dominant_write(-1),
+ last_read(-1),
+ last_write(-1),
+ first_read(numeric_limits<int>::max()),
+ keep_for_full_loop(false)
+{
+}
+
+void temp_access::record(int line, e_acc_type rw, prog_scope * scope)
+{
+ if (rw == acc_read) {
+ last_read_scope = scope;
+ last_read = line;
+ if (first_read > line) {
+ first_read = line;
+ first_read_scope = scope;
+ }
+ } else {
+ last_write = line;
+
+ /* If no first write is assigned check whether we deal with a case where
+ * the temp is read and written in the same instruction, because then
+ * it is not a dominant write, it may even be undefined. Hence postpone
+ * the assignment of the first write, only mark that the register was
+ * written at all by remembering a scope */
+ if (first_dominant_write < 0) {
+ if (line != last_read || (rw == acc_write_cond_from_self)) {
+ first_dominant_write = line;
+ }
+ first_write_scope = scope;
+ }
+
+ if (scope->is_conditional() && scope->in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
+
+}
+
+inline lifetime make_lifetime(int b, int e)
+{
+ lifetime lt;
+ lt.begin = b;
+ lt.end = e;
+ return lt;
+}
+
+#include <iostream>
+using std::cerr;
+
+lifetime temp_access::get_required_lifetime()
+{
+
+ /* this temp is only read, this is undefined
+ * behaviour, so we can use the register otherwise */
+ if (!first_write_scope) {
+ return make_lifetime(-1, -1);
+ }
+
+ /* Only written to, just make sure it doesn't overlap */
+ if (!last_read_scope) {
+ return make_lifetime(first_dominant_write, last_write + 1);
+ }
+
+ /* Undefined behaviour: read and write in the same instruction
+ * but never written elsewhere. Since it is written, we need to
+ * keep it nevertheless.
+ * In this case the first dominant write is not recorded and we use the
+ * first read to estimate the life time. This is not minimal, since another
+ * undefined first read could have happened before the first undefined
+ * write, but we don't care, because adding yet another tracking variable
+ * to handle this rare case of undefined behaviour doesn't make sense */
+ if (first_write_scope && first_dominant_write < 0) {
+ return make_lifetime(first_read, last_write + 1);
+ }
+
+ const prog_scope *target_scope = last_read_scope;
+ int enclosing_scope_depth = -1;
+
+ /* we read before writing, conditional, and in a loop
+ * hence the value must survive the loops */
+ if ((first_read <= first_dominant_write) &&
+ first_read_scope->is_conditional() &&
+ first_read_scope->in_loop()) {
+
+ keep_for_full_loop = true;
+ target_scope = first_read_scope->outermost_loop();
+ }
+
+ /* Evaluate the scope that is shared by all three, first write, and
+ * first (conditional) read before write and last read. */
+ while (enclosing_scope_depth < 0) {
+ if (target_scope->contains(*first_write_scope)) {
+ enclosing_scope_depth = target_scope->nesting_depth();
+ } else if (first_write_scope->contains(*target_scope)) {
+ target_scope = first_write_scope;
+ enclosing_scope_depth = first_write_scope->nesting_depth();
+ } else {
+ target_scope = target_scope->parent();
+ assert(target_scope);
+ }
+ }
+
+ /* propagate the read scope to the target scope */
+ while (last_read_scope->nesting_depth() > enclosing_scope_depth) {
+ /* if the read is in a loop we need to extend the
+ * variables life time to the end of that loop
+ * because at this point it is not written in the same loop */
+ if (last_read_scope->type() == sct_loop) {
+ last_read = last_read_scope->end();
+ }
+ last_read_scope = last_read_scope->parent();
+ }
+
+ /* If the first read is (conditionally) before the first write we
+ * have to keep the variable for the loop */
+ if ((first_write_scope->type() == sct_loop) &&
+ (first_read <= first_dominant_write)) {
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ /* propagate the first_write scope to the target scope */
+ while (enclosing_scope_depth < first_write_scope->nesting_depth()) {
+
+ /* propagate lifetime also if there was a continue/break
+ * in a loop and the write was after the continue/break inside
+ * that loop. Note that this is only needed if we move up in the
+ * scopes. */
+ if (first_write_scope->loop_continue_line() < first_dominant_write &&
+ first_write_scope->end() > first_dominant_write) {
+ keep_for_full_loop = true;
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ first_write_scope = first_write_scope->parent();
+
+ /* Do the propagation again for the parent loop */
+ if (first_write_scope->type() == sct_loop) {
+ if (keep_for_full_loop || (first_read <= first_dominant_write)) {
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+ }
+
+ /* if we currently don't propagate the lifetime but
+ * the enclosing scope is a conditional within a loop
+ * up to the last-read level we need to propagate,
+ * todo: to tighten the life time check whether the value
+ * is written in all conditional code paths below the loop */
+ if (!keep_for_full_loop &&
+ first_write_scope->is_conditional() &&
+ first_write_scope->in_loop()) {
+ keep_for_full_loop = true;
+ }
+ }
+
+ /* Moving up from a loop into a conditional we might not yet have marked
+ * life-time to scope propagation. Hence, if the conditional we just moved
+ * to is within a loop, propagate in the next round. */
+ if (last_write > last_read) {
+ last_read = last_write + 1;
+ }
+
+ /* Here we are at the same scope, all is resolved */
+ return make_lifetime(first_dominant_write, last_read);
+}
+
+prog_scope_storage::prog_scope_storage(void *mc, int n):
+ mem_ctx(mc),
+ current_slot(0)
+{
+ storage = ralloc_array(mem_ctx, prog_scope, n);
+}
+
+prog_scope_storage::~prog_scope_storage()
+{
+ ralloc_free(storage);
+}
+
+prog_scope*
+prog_scope_storage::create(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin)
+{
+ storage[current_slot] = prog_scope(p, type, id, lvl, s_begin);
+ return &storage[current_slot++];
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
new file mode 100644
index 0000000000..a4124b4659
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+
+struct lifetime {
+ int begin;
+ int end;
+};
+
+void
+get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
+ int ntemps, struct lifetime *lifetimes);
--
2.13.0
Emil Velikov
2017-06-22 14:35:37 UTC
Permalink
Hi Gert,

A couple of trivial bits I've noticed in patches 2 and 3, but
applicable overall.

- Unify if/return chains
I've seen the following three examples, sometimes even right next to each other.

if foo
return f1;
if bar
return b1;
return c1;

if foo
return f1;
if bar
return b1;
else
return c1;

if foo
return f1;
else if bar
return b1;
return c1;

Personal fav. is the first one (with blank lines in between), although
regardless of the option do make sure code is consistent.

- Braces for single-line statements in if conditionals
Some instances have them, while others don't. I think most people prefer
omitting them, although as long as you're consistent with surrounding
code it's fine.

- Tests still use STL?
Am I looking at the right patches - shouldn't be a big deal either way.

Regards,
Emil
Gert Wollny
2017-06-22 16:34:09 UTC
Permalink
Thanks for the comments,

I've fixed these little issues locally, but I think in order not to
spam the list, I'll send the changes later. I kind of suspect that
Nicolai might have one or the other additional comment :)

best,
Gert
Gert Wollny
2017-06-21 12:59:07 UTC
Permalink
The remapping evaluator first sorts the temporary registers in ascending
order based on the first instruction of their life time, and then uses a
binary search to find merge candidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can enable the alternative code path that uses qsort.
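
The merge step described above can be sketched roughly as follows. This is a
simplified stand-in and not the code of the patch: it uses std::vector,
std::sort and std::lower_bound, erases merged entries immediately, and breaks
ties on the register index so that the result is deterministic. Record and
merge_registers() are hypothetical names. A register is merged into an earlier
one when its life time begins at or after the end of the earlier one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Lifetime { int begin; int end; };

/* Hypothetical helper record: a life time plus the register it belongs to. */
struct Record { int begin; int end; int reg; };

/* Returns remap[i], the register that temporary i ends up in. */
std::vector<int> merge_registers(const std::vector<Lifetime>& lt)
{
   std::vector<int> remap(lt.size());
   std::vector<Record> recs;
   for (int i = 0; i < static_cast<int>(lt.size()); ++i) {
      remap[i] = i;
      if (i > 0 && lt[i].begin >= 0) /* skip index 0 and unused temps */
         recs.push_back(Record{lt[i].begin, lt[i].end, i});
   }

   /* Sort ascending by the start of the life time. */
   std::sort(recs.begin(), recs.end(),
             [](const Record& a, const Record& b) {
                return a.begin != b.begin ? a.begin < b.begin : a.reg < b.reg;
             });

   for (std::size_t t = 0; t < recs.size(); ++t) {
      for (;;) {
         /* Binary search for the first later record that starts at or
          * after the end of the current target register. */
         std::vector<Record>::iterator src =
            std::lower_bound(recs.begin() + t + 1, recs.end(), recs[t].end,
                             [](const Record& r, int bound) {
                                return r.begin < bound;
                             });
         if (src == recs.end())
            break;
         assert(src->begin >= recs[t].end);
         remap[src->reg] = recs[t].reg; /* rename src to the target */
         recs[t].end = src->end;        /* the target now lives longer */
         recs.erase(src);               /* src is no longer a candidate */
      }
   }
   return remap;
}
```

With the life times {0,1}, {0,2}, {1,2}, {2,10}, {3,5}, {5,10} for the
temporaries 1 to 6, this sketch maps them to 1, 2, 1, 1, 2, 2, which is the
mapping the RegisterRemapping1 test in this series expects.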
---
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 113 +++++++++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 3 +
2 files changed, 116 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 1e02c4d710..e0e67308b4 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -27,6 +27,14 @@
#include <mesa/program/prog_instruction.h>
#include <limits>

+/* std::sort is significantly faster than qsort */
+#define USE_STL_SORT
+#ifdef USE_STL_SORT
+#include <algorithm>
+#else
+#include <cstdlib>
+#endif
+
/* Without c++11 define the nullptr for forward-compatibility
* and better readibility */
#if __cplusplus < 201103L
@@ -646,3 +654,108 @@ prog_scope_storage::create(prog_scope *p, e_scope_type type, int id,
storage[current_slot] = prog_scope(p, type, id, lvl, s_begin);
return &storage[current_slot++];
}
+
+/* helper class for sorting and searching the registers based
+ * on life times. */
+struct access_record {
+ int begin;
+ int end;
+ int reg;
+ bool erase;
+
+ bool operator < (const access_record& rhs) {
+ return begin < rhs.begin;
+ }
+};
+
+/* Find the next register between [start, end) that has a life time starting
+ * at or after bound by using a binary search.
+ * start points at the beginning of the search range,
+ * end points at the element past the end of the search range, and
+ * the array comprising [start, end) must be sorted in ascending order.
+ */
+access_record*
+find_next_rename(access_record* start, access_record* end, int bound)
+{
+ int delta = (end - start);
+ while (delta > 0) {
+ int half = delta >> 1;
+ access_record* middle = start + half;
+ if (bound <= middle->begin)
+ delta = half;
+ else {
+ start = middle;
+ ++start;
+ delta -= half + 1;
+ }
+ }
+ return start;
+}
+
+#ifndef USE_STL_SORT
+int access_record_compare (const void *a, const void *b) {
+ const access_record *aa = static_cast<const access_record*>(a);
+ const access_record *bb = static_cast<const access_record*>(b);
+ return aa->begin < bb->begin ? -1 : (aa->begin > bb->begin ? 1 : 0);
+}
+#endif
+
+/* This function evaluates the register merges by using an O(n log n)
+ * algorithm to find suitable merge candidates. */
+void get_temp_registers_remapping(void *mem_ctx, int ntemps,
+ const struct lifetime* lifetimes,
+ struct rename_reg_pair *result)
+{
+ access_record *m = ralloc_array(mem_ctx, access_record, ntemps - 1);
+ for (int i = 1; i < ntemps; ++i) {
+ m[i-1].begin = lifetimes[i].begin;
+ m[i-1].end = lifetimes[i].end;
+ m[i-1].reg = i;
+ m[i-1].erase = false;
+ }
+
+#ifdef USE_STL_SORT
+ std::sort(m, m + ntemps - 1);
+#else
+ std::qsort(m, ntemps - 1, sizeof(access_record), access_record_compare);
+#endif
+
+ access_record *trgt = m;
+ access_record *mend = m + ntemps - 1;
+ access_record *first_erase = mend;
+ access_record *search_start = trgt + 1;
+
+ while (trgt != mend) {
+
+ access_record *src = find_next_rename(search_start, mend, trgt->end);
+ if (src != mend) {
+ result[src->reg].new_reg = trgt->reg;
+ result[src->reg].valid = true;
+ trgt->end = src->end;
+
+ /* Since we only search forward, don't remove the renamed
+ * register just now, only mark it. */
+ src->erase = true;
+ if (first_erase == mend)
+ first_erase = src;
+ search_start = src + 1;
+ } else {
+ /* Moving to the next target register it is time to remove
+ * the already merged registers from the search range */
+ if (first_erase != mend) {
+ access_record *out = first_erase;
+ access_record *in_start = first_erase + 1;
+ while (in_start != mend) {
+ if (!in_start->erase)
+ *out++ = *in_start;
+ ++in_start;
+ }
+ mend = out;
+ first_erase = mend;
+ }
+ ++trgt;
+ search_start = trgt + 1;
+ }
+ }
+ ralloc_free(m);
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
index a4124b4659..f6a89ed0d3 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -31,3 +31,6 @@ struct lifetime {
void
get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
int ntemps, struct lifetime *lifetimes);
+void get_temp_registers_remapping(void *mem_ctx, int ntemps,
+ const struct lifetime* lifetimes,
+ struct rename_reg_pair *result);
--
2.13.0
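
As an illustration of the approach described in the commit message above,
here is a minimal standalone sketch of the sort-then-binary-search merge. It
is not the code from the patch: `record` and `remap_sketch` are hypothetical
names, std::lower_bound stands in for find_next_rename, and already merged
entries are skipped linearly instead of being compacted out of the array (so
the sketch does not keep the O(n log n) bound). It assumes C++11.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

/* Hypothetical stand-in for the patch's access_record. */
struct record {
   int begin;   /* first instruction of the life time */
   int end;     /* last instruction of the life time */
   int reg;     /* register number */
};

/* Returns a vector indexed by register number; entry r holds the register
 * that r is renamed to, or -1 if r keeps its own name. */
std::vector<int> remap_sketch(std::vector<record> recs, int nregs)
{
   std::vector<int> new_reg(nregs, -1);
   std::vector<bool> merged(recs.size(), false);

   /* Sort ascending by the start of the life time. */
   std::sort(recs.begin(), recs.end(),
             [](const record& a, const record& b) { return a.begin < b.begin; });

   for (std::size_t t = 0; t < recs.size(); ++t) {
      if (merged[t])
         continue;   /* already renamed, cannot act as a target */

      std::size_t search_start = t + 1;
      while (search_start < recs.size()) {
         /* Binary search: first record starting at or after the target's end. */
         auto it = std::lower_bound(recs.begin() + search_start, recs.end(),
                                    recs[t].end,
                                    [](const record& r, int bound) {
                                       return r.begin < bound;
                                    });
         /* Simplification: step over entries that were merged earlier. */
         while (it != recs.end() && merged[it - recs.begin()])
            ++it;
         if (it == recs.end())
            break;

         /* Rename the found register and extend the target's life time. */
         new_reg[it->reg] = recs[t].reg;
         recs[t].end = it->end;
         merged[it - recs.begin()] = true;
         search_start = (it - recs.begin()) + 1;
      }
   }
   return new_reg;
}
```

As in the patch, a register is considered a merge candidate when its life
time begins at or after the end of the target's life time, and the target's
life time is extended after every merge so that the search only moves forward.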
Gert Wollny
2017-06-21 12:59:04 UTC
Permalink
To prepare the implementation of a temp register lifetime tracker,
some of the classes are moved into separate header/implementation
files to make them accessible from other files.

Specifically these are:

class st_src_reg;
class st_dst_reg;
class glsl_to_tgsi_instruction;
struct rename_reg_pair;

int swizzle_for_type(const glsl_type *type, int component);

as inline:

bool is_resource_instruction(unsigned opcode);
unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
---
src/mesa/Makefile.sources | 2 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 288 +--------------------
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 207 +++++++++++++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 165 ++++++++++++
4 files changed, 377 insertions(+), 285 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index b80882fb8d..21f9167bda 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -507,6 +507,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_nir.cpp \
state_tracker/st_glsl_to_tgsi.cpp \
state_tracker/st_glsl_to_tgsi.h \
+ state_tracker/st_glsl_to_tgsi_private.cpp \
+ state_tracker/st_glsl_to_tgsi_private.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 7852941acd..528fc4cc64 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,6 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
+#include "st_glsl_to_tgsi_private.h"

#include "util/hash_table.h"
#include <algorithm>
@@ -65,251 +66,7 @@

#define MAX_GLSL_TEXTURE_OFFSET 4

-class st_src_reg;
-class st_dst_reg;
-
-static int swizzle_for_size(int size);
-
-static int swizzle_for_type(const glsl_type *type, int component = 0)
-{
- unsigned num_elements = 4;
-
- if (type) {
- type = type->without_array();
- if (type->is_scalar() || type->is_vector() || type->is_matrix())
- num_elements = type->vector_elements;
- }
-
- int swizzle = swizzle_for_size(num_elements);
- assert(num_elements + component <= 4);
-
- swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
- return swizzle;
-}
-
-/**
- * This struct is a corresponding struct to TGSI ureg_src.
- */
-class st_src_reg {
-public:
- st_src_reg(gl_register_file file, int index, const glsl_type *type,
- int component = 0, unsigned array_id = 0)
- {
- assert(file != PROGRAM_ARRAY || array_id != 0);
- this->file = file;
- this->index = index;
- this->swizzle = swizzle_for_type(type, component);
- this->negate = 0;
- this->abs = 0;
- this->index2D = 0;
- this->type = type ? type->base_type : GLSL_TYPE_ERROR;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = array_id;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = index2D;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->swizzle = 0;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- explicit st_src_reg(st_dst_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
- int negate:4; /**< NEGATE_XYZW mask from mesa */
- unsigned abs:1;
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- /*
- * Is this the second half of a double register pair?
- * currently used for input mapping only.
- */
- unsigned double_reg2:1;
- unsigned is_double_vertex_input:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-
- st_src_reg get_abs()
- {
- st_src_reg reg = *this;
- reg.negate = 0;
- reg.abs = 1;
- return reg;
- }
-};
-
-class st_dst_reg {
-public:
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = 0;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->writemask = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->array_id = 0;
- }
-
- explicit st_dst_reg(st_src_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-};
-
-st_src_reg::st_src_reg(st_dst_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->double_reg2 = false;
- this->array_id = reg.array_id;
- this->is_double_vertex_input = false;
-}
-
-st_dst_reg::st_dst_reg(st_src_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->writemask = WRITEMASK_XYZW;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->array_id = reg.array_id;
-}
-
-class glsl_to_tgsi_instruction : public exec_node {
-public:
- DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
-
- st_dst_reg dst[2];
- st_src_reg src[4];
- st_src_reg resource; /**< sampler, image or buffer register */
- st_src_reg *tex_offsets;
-
- /** Pointer to the ir source this tree came from for debugging */
- ir_instruction *ir;
-
- unsigned op:8; /**< TGSI opcode */
- unsigned saturate:1;
- unsigned is_64bit_expanded:1;
- unsigned sampler_base:5;
- unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
- unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
- glsl_base_type tex_type:5;
- unsigned tex_shadow:1;
- unsigned image_format:9;
- unsigned tex_offset_num_offset:3;
- unsigned dead_mask:4; /**< Used in dead code elimination */
- unsigned buffer_access:3; /**< buffer access type */
-
- const struct tgsi_opcode_info *info;
-};
+extern int swizzle_for_size(int size);

class variable_storage {
DECLARE_RZALLOC_CXX_OPERATORS(variable_storage)
@@ -390,11 +147,6 @@ find_array_type(struct inout_decl *decls, unsigned count, unsigned array_id)
return GLSL_TYPE_ERROR;
}

-struct rename_reg_pair {
- bool valid;
- int new_reg;
-};
-
struct glsl_to_tgsi_visitor : public ir_visitor {
public:
glsl_to_tgsi_visitor();
@@ -597,7 +349,7 @@ fail_link(struct gl_shader_program *prog, const char *fmt, ...)
prog->data->LinkStatus = linking_failure;
}

-static int
+int
swizzle_for_size(int size)
{
static const int size_swizzles[4] = {
@@ -611,40 +363,6 @@ swizzle_for_size(int size)
return size_swizzles[size - 1];
}

-static bool
-is_resource_instruction(unsigned opcode)
-{
- switch (opcode) {
- case TGSI_OPCODE_RESQ:
- case TGSI_OPCODE_LOAD:
- case TGSI_OPCODE_ATOMUADD:
- case TGSI_OPCODE_ATOMXCHG:
- case TGSI_OPCODE_ATOMCAS:
- case TGSI_OPCODE_ATOMAND:
- case TGSI_OPCODE_ATOMOR:
- case TGSI_OPCODE_ATOMXOR:
- case TGSI_OPCODE_ATOMUMIN:
- case TGSI_OPCODE_ATOMUMAX:
- case TGSI_OPCODE_ATOMIMIN:
- case TGSI_OPCODE_ATOMIMAX:
- return true;
- default:
- return false;
- }
-}
-
-static unsigned
-num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->num_dst;
-}
-
-static unsigned
-num_inst_src_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->is_tex || is_resource_instruction(op->op) ?
- op->info->num_src - 1 : op->info->num_src;
-}

glsl_to_tgsi_instruction *
glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
new file mode 100644
index 0000000000..b77313da10
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
@@ -0,0 +1,207 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+
+using std::vector;
+
+extern int swizzle_for_size(int size);
+
+static int swizzle_for_type(const glsl_type *type, int component = 0)
+{
+ unsigned num_elements = 4;
+
+ if (type) {
+ type = type->without_array();
+ if (type->is_scalar() || type->is_vector() || type->is_matrix())
+ num_elements = type->vector_elements;
+ }
+
+ int swizzle = swizzle_for_size(num_elements);
+ assert(num_elements + component <= 4);
+
+ swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
+ return swizzle;
+}
+
+
+
+st_src_reg::st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component, unsigned array_id)
+{
+ assert(file != PROGRAM_ARRAY || array_id != 0);
+ this->file = file;
+ this->index = index;
+ this->swizzle = swizzle_for_type(type, component);
+ this->negate = 0;
+ this->abs = 0;
+ this->index2D = 0;
+ this->type = type ? type->base_type : GLSL_TYPE_ERROR;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = index2D;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->swizzle = 0;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+
+st_src_reg st_src_reg::get_abs()
+{
+ st_src_reg reg = *this;
+ reg.negate = 0;
+ reg.abs = 1;
+ return reg;
+}
+
+st_src_reg::st_src_reg(st_dst_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->double_reg2 = false;
+ this->array_id = reg.array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_dst_reg::st_dst_reg(st_src_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->writemask = WRITEMASK_XYZW;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->array_id = reg.array_id;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+st_dst_reg::st_dst_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->array_id = 0;
+}
+
+
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.h b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
new file mode 100644
index 0000000000..9a2a3efa7e
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
@@ -0,0 +1,165 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <mesa/main/mtypes.h>
+#include <compiler/glsl_types.h>
+#include <compiler/glsl/ir.h>
+#include <tgsi/tgsi_info.h>
+#include <stack>
+#include <vector>
+
+class st_dst_reg;
+
+/**
+ * This struct is a corresponding struct to TGSI ureg_src.
+ */
+class st_src_reg {
+public:
+ st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component = 0, unsigned array_id = 0);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D);
+
+ st_src_reg();
+
+ explicit st_src_reg(st_dst_reg reg);
+
+ st_src_reg get_abs();
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+
+ uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
+ int negate:4; /**< NEGATE_XYZW mask from mesa */
+ unsigned abs:1;
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ /*
+ * Is this the second half of a double register pair?
+ * currently used for input mapping only.
+ */
+ unsigned double_reg2:1;
+ unsigned is_double_vertex_input:1;
+ unsigned array_id:10;
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+
+};
+
+class st_dst_reg {
+public:
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index);
+
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type);
+
+ st_dst_reg();
+
+ explicit st_dst_reg(st_src_reg reg);
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ unsigned array_id:10;
+
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+};
+
+class glsl_to_tgsi_instruction : public exec_node {
+public:
+ DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
+
+ st_dst_reg dst[2];
+ st_src_reg src[4];
+ st_src_reg resource; /**< sampler or buffer register */
+ st_src_reg *tex_offsets;
+
+ /** Pointer to the ir source this tree came from for debugging */
+ ir_instruction *ir;
+
+ unsigned op:8; /**< TGSI opcode */
+ unsigned saturate:1;
+ unsigned is_64bit_expanded:1;
+ unsigned sampler_base:5;
+ unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
+ unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
+ glsl_base_type tex_type:5;
+ unsigned tex_shadow:1;
+ unsigned image_format:9;
+ unsigned tex_offset_num_offset:3;
+ unsigned dead_mask:4; /**< Used in dead code elimination */
+ unsigned buffer_access:3; /**< buffer access type */
+
+ const struct tgsi_opcode_info *info;
+};
+
+struct rename_reg_pair {
+ bool valid;
+ int new_reg;
+};
+
+inline bool
+is_resource_instruction(unsigned opcode)
+{
+ switch (opcode) {
+ case TGSI_OPCODE_RESQ:
+ case TGSI_OPCODE_LOAD:
+ case TGSI_OPCODE_ATOMUADD:
+ case TGSI_OPCODE_ATOMXCHG:
+ case TGSI_OPCODE_ATOMCAS:
+ case TGSI_OPCODE_ATOMAND:
+ case TGSI_OPCODE_ATOMOR:
+ case TGSI_OPCODE_ATOMXOR:
+ case TGSI_OPCODE_ATOMUMIN:
+ case TGSI_OPCODE_ATOMUMAX:
+ case TGSI_OPCODE_ATOMIMIN:
+ case TGSI_OPCODE_ATOMIMAX:
+ return true;
+ default:
+ return false;
+ }
+}
+
+inline unsigned
+num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->num_dst;
+}
+
+inline unsigned
+num_inst_src_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->is_tex || is_resource_instruction(op->op) ?
+ op->info->num_src - 1 : op->info->num_src;
+}
+
--
2.13.0
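
For readers wondering about the `swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1)`
line in swizzle_for_type, here is a small worked sketch. It assumes that a
swizzle packs four 3-bit channel selectors (X=0, Y=1, Z=2, W=3), with the
first channel in the lowest bits, which is how I understand Mesa's
MAKE_SWIZZLE4 and GET_SWZ macros; the lower-case helpers below are
hypothetical stand-ins, not Mesa code.

```cpp
/* Assumed packing: four 3-bit selectors, first channel in the lowest bits. */
constexpr int make_swizzle4(int a, int b, int c, int d)
{
   return a | (b << 3) | (c << 6) | (d << 9);
}

/* Extract the selector of channel idx (0..3) from a packed swizzle. */
constexpr int get_swz(int swizzle, int idx)
{
   return (swizzle >> (idx * 3)) & 0x7;
}

/* Adding component * (1,1,1,1) bumps every selector by component. This is
 * only safe while no selector exceeds 3, which is what the assert on
 * num_elements + component <= 4 in swizzle_for_type guards. */
constexpr int shift_swizzle(int swizzle, int component)
{
   return swizzle + component * make_swizzle4(1, 1, 1, 1);
}
```

For example, a two-component value with swizzle XYYY placed at component 2
ends up as ZWWW: every selector is increased by two and no carry spills into
the neighbouring channel.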
Gert Wollny
2017-06-21 12:59:09 UTC
Permalink
This patch ties in the new temporary register lifetime estimation and
rename mapping evaluation. In order to enable it, the environment
variable MESA_GLSL_TO_TGSI_NEW_MERGE must be set.

To compare the performance of the current and the new implementation,
both were measured by running the shader-db in one thread; numbers are
in % of the total run.

-----------------------------------------------------------
                     old           new(qsort)   new(std::sort)

------------------------ valgrind -------------------------
merge                0.21          0.20         0.14
estimate lifetime    0.03          0.05         0.05
evaluate mapping     (incl=0.16)   0.12         0.06
apply mapping        0.02          0.02         0.02

--- perf (approximate because of statistical sampling) ----
merge                0.23          0.17         0.15
estimate lifetime    0.03          0.05         0.07
evaluate mapping     (incl=0.17)   0.09         0.06
apply mapping        0.03          0.03         0.03
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 528fc4cc64..d4abee9d02 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,7 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
-#include "st_glsl_to_tgsi_private.h"
+#include "st_glsl_to_tgsi_temprename.h"

#include "util/hash_table.h"
#include <algorithm>
@@ -322,6 +322,7 @@ public:

void merge_two_dsts(void);
void merge_registers(void);
+ void merge_registers_alternative(void);
void renumber_registers(void);

void emit_block_mov(ir_assignment *ir, const struct glsl_type *type,
@@ -5139,6 +5140,23 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
}
}

+void
+glsl_to_tgsi_visitor::merge_registers_alternative(void)
+{
+ struct rename_reg_pair *renames =
+ rzalloc_array(mem_ctx, struct rename_reg_pair, this->next_temp);
+ struct lifetime *lifetimes =
+ rzalloc_array(mem_ctx, struct lifetime, this->next_temp);
+
+ get_temp_registers_required_lifetimes(mem_ctx, &this->instructions,
+ this->next_temp, lifetimes);
+ get_temp_registers_remapping(mem_ctx, this->next_temp, lifetimes, renames);
+ rename_temp_registers(renames);
+
+ ralloc_free(lifetimes);
+ ralloc_free(renames);
+}
+
/* Merges temporary registers together where possible to reduce the number of
* registers needed to run a program.
*
@@ -6603,8 +6621,13 @@ get_mesa_program_tgsi(struct gl_context *ctx,
while (v->eliminate_dead_code());

v->merge_two_dsts();
- if (!skip_merge_registers)
- v->merge_registers();
+ if (!skip_merge_registers) {
+ if (getenv("MESA_GLSL_TO_TGSI_NEW_MERGE") != NULL)
+ v->merge_registers_alternative();
+ else
+ v->merge_registers();
+ }
+
v->renumber_registers();

/* Write the END instruction. */
--
2.13.0
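
A note on the switch used above: the check only tests whether the variable
exists, so any value (even an empty one) selects the new code path. A
minimal sketch of that check, with `use_new_merge` being a hypothetical
helper name that is not part of the patch:

```cpp
#include <cstdlib>

/* Hypothetical helper: true if MESA_GLSL_TO_TGSI_NEW_MERGE is set at all,
 * regardless of its value. */
bool use_new_merge()
{
   return std::getenv("MESA_GLSL_TO_TGSI_NEW_MERGE") != nullptr;
}
```

So, assuming a POSIX shell, running an application with
MESA_GLSL_TO_TGSI_NEW_MERGE=1 in its environment would pick
merge_registers_alternative(), while leaving the variable unset keeps the
existing merge_registers() path.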
Gert Wollny
2017-06-21 12:59:06 UTC
Permalink
This patch adds a set of unit tests for the new lifetime tracker.
---
configure.ac | 1 +
src/mesa/Makefile.am | 2 +-
src/mesa/state_tracker/tests/Makefile.am | 40 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 976 +++++++++++++++++++++
4 files changed, 1018 insertions(+), 1 deletion(-)
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp

diff --git a/configure.ac b/configure.ac
index da7b2f8f81..5279b231ed 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2839,6 +2839,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/Makefile])
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..a88a94165d 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.

-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests

if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 0000000000..361ef5cfb7
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,40 @@
+AM_CFLAGS = \
+ $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+ $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+ -I$(top_srcdir)/src/gtest/include \
+ -I$(top_srcdir)/src \
+ -I$(top_srcdir)/src/mapi \
+ -I$(top_builddir)/src/mesa \
+ -I$(top_srcdir)/src/mesa \
+ -I$(top_srcdir)/include \
+ -I$(top_srcdir)/src/gallium/include \
+ -I$(top_srcdir)/src/gallium/auxiliary \
+ $(DEFINES) $(INCLUDE_DIRS)
+
+TESTS = st-renumerate-test
+check_PROGRAMS = st-renumerate-test
+
+st_renumerate_test_SOURCES = \
+ test_glsl_to_tgsi_lifetime.cpp
+
+st_renumerate_test_LDFLAGS = \
+ $(LLVM_LDFLAGS)
+
+st_renumerate_test_LDADD = \
+ $(top_builddir)/src/mesa/libmesagallium.la \
+ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \
+ $(top_builddir)/src/gallium/auxiliary/libgallium.la \
+ $(top_builddir)/src/util/libmesautil.la \
+ $(top_builddir)/src/gallium/drivers/trace/libtrace.la \
+ $(top_builddir)/src/gallium/winsys/sw/null/libws_null.la \
+ $(top_builddir)/src/gallium/drivers/softpipe/libsoftpipe.la \
+ $(top_builddir)/src/gtest/libgtest.la \
+ $(GALLIUM_COMMON_LIB_DEPS) \
+ $(LLVM_LIBS) \
+ $(PTHREAD_LIBS) \
+ $(DLOPEN_LIBS)
+
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
new file mode 100644
index 0000000000..5f3378637a
--- /dev/null
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -0,0 +1,976 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <state_tracker/st_glsl_to_tgsi_temprename.h>
+#include <tgsi/tgsi_ureg.h>
+#include <tgsi/tgsi_info.h>
+#include <compiler/glsl/list.h>
+#include <gtest/gtest.h>
+
+using std::vector;
+using std::pair;
+
+/* A line to describe a TGSI instruction for building mock shaders */
+struct MockCodeline {
+ MockCodeline(unsigned _op): op(_op) {}
+ MockCodeline(unsigned _op, const vector<int>& _dst, const vector<int>& _src, const vector<int>&_to):
+ op(_op), dst(_dst), src(_src), tex_offsets(_to){}
+ unsigned op;
+ vector<int> dst;
+ vector<int> src;
+ vector<int> tex_offsets;
+};
+
+const int in0 = 0;
+const int in1 = -1;
+const int in2 = -2;
+
+const int out0 = 0;
+const int out1 = -1;
+
+class MockShader {
+public:
+ MockShader(const vector<MockCodeline>& source);
+ ~MockShader();
+
+ void free();
+
+ exec_list* get_program();
+ int get_num_temps();
+private:
+ st_src_reg create_src_register(int src_idx);
+ st_dst_reg create_dst_register(int dst_idx);
+ exec_list* program;
+ int num_temps;
+ void *mem_ctx;
+};
+
+using expectation = vector<vector<int>>;
+
+
+class MesaTestWithMemCtx : public testing::Test {
+ void SetUp();
+ void TearDown();
+protected:
+ void *mem_ctx;
+};
+
+class LifetimeEvaluatorTest : public MesaTestWithMemCtx {
+protected:
+ void run(const vector<MockCodeline>& code, const expectation& e);
+private:
+ virtual void check(const vector<lifetime>& result, const expectation& e) = 0;
+};
+
+/* This is a test class to check the exact life times of
+ * registers. */
+class LifetimeEvaluatorExactTest : public LifetimeEvaluatorTest {
+protected:
+ void check(const vector<lifetime>& result, const expectation& e);
+};
+
+/* This test class checks that the life time covers at least
+ * the expected range. It is used for cases where we know that
+ * the implementation could be improved on estimating the minimal
+ * life time.
+ */
+class LifetimeEvaluatorAtLeastTest : public LifetimeEvaluatorTest {
+protected:
+ void check(const vector<lifetime>& result, const expectation& e);
+};
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {1, in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMove)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}, {1,2}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {}, {1,2}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,2}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 5}, {2,3}, {3, 6}}));
+}
+
+
+/* in loop if/else value written only in one path, and read later
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, MoveInIfInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {5, 8}}));
+}
+
+
+/* in loop if/else value written in both paths, and read later
+ * - value must survive from first write to last read in loop
+ * for now we only check that the minimum life time is correct */
+TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {3,7}, {7, 10}}));
+}
+
+/* in loop if/else value written in both paths, read in else path
+ * before written and also read later
+ * - value must survive from first write to last read in loop */
+TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseReadInElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_ADD, {2}, {1, 2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {1,9}, {7, 10}}));
+}
+
+/* in loop if/else read in one path before written in the same loop
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, ReadInIfInLoopBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, 3}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {1, 8}}));
+}
+
+/* Write in nested ifs in loop, for now we only test whether the
+ * life time is at least what is required, but we know that the
+ * implementation doesn't do a full check and sets larger boundaries */
+TEST_F(LifetimeEvaluatorAtLeastTest, NestedIfInLoopAlwaysWriteButNotPropagated)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3, 14}}));
+}
+
+
+
+TEST_F(LifetimeEvaluatorExactTest, NestedIfInLoopWriteNotAlways)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 13}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop, but not further */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for all
+ * loops up to the read scope, but not further */
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
+}
+
+/* Temporary used in a switch must live through all case statements */
+TEST_F(LifetimeEvaluatorExactTest, UseSwitchCase)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ {TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_DEFAULT},
+ {TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 3}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, WriteTwoOnlyUseOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+
+}
+
+/* if a break is in the loop, but inside a switch case, so it
+ * refers to that inner loop. The variable has to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitchInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_DEFAULT, {}, {}, {}},
+ { TGSI_OPCODE_ENDSWITCH, {}, {}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{2, 10}}));
+}
+
+/* value read/write in different loops, conditional */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* value written in one case, and read in other, in loop
+ * - must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* value read/write in same case, stays there */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,4}}));
+}
+
+/* value read/write in all cases, should only live from first
+ * write to last read, but currently the whole loop is used. */
+TEST_F(LifetimeEvaluatorAtLeastTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,9}}));
+}
+
+/* value read/write in different loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopes)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1,5}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+/* The variable is conditionally read before first written, so
+ * it has to survive all the loops. */
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstructionInLoopAndCondition)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF },
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+/* If unconditionally first written and read in the same
+ * instruction, then the register must be kept for the
+ * one write, but not more (undefined behaviour) */
+
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstruction)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* If unconditionally written and read in the same
+ * instruction, various times then the register must be
+ * kept for the one write, but not more (undefined behaviour) */
+
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstructionMoreThenOnce)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+
+/* register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least one instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnly)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* register read in if */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForIf)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {in0, in1}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_ENDIF}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+/* register read in switch and cases */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForSwitchAndCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_END, {}, {1}, {}},
+ };
+ run (code, expectation({{-1,-1},{0,3}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferentScopesCondReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, WriteTwoReadOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SomeScopesAndNoEndProgramId)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ };
+ run (code, expectation({{-1,-1},{0,4}, {2,5}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SerialReadWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {3}, {2}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {1,2}, {2,3}}));
+}
+
+
+/* Check that two destination registers are used */
+TEST_F(LifetimeEvaluatorExactTest, TwoDestRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {1,2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}}));
+}
+
+/* Write in a loop inside a conditional, read outside the conditional */
+TEST_F(LifetimeEvaluatorExactTest, WriteInLoopInConditionalReadOutside)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP},
+ { TGSI_OPCODE_MOV, {1}, {in1}, {}},
+ { TGSI_OPCODE_ENDLOOP},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ADD, {2}, {1,in1}, {}},
+ { TGSI_OPCODE_ENDLOOP},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}, {6,8}}));
+}
+
+
+/*
+ * With two destinations if one value is thrown away, we must
+ * ensure that the two output registers don't merge.
+ * In this test case the last access for 2 and 3 is in line 4,
+ * but 4 can only be merged with 3 because it is read, 2 on the
+ * other hand is written to, and merging it with 4 would result in
+ * a bug. */
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead2)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {3}, {1,2}, {}},
+ { TGSI_OPCODE_DFRACEXP , {2,4}, {3}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {4}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {1,4}, {2,3}, {3,4}}));
+}
+
+/* Check that three source registers are used */
+TEST_F(LifetimeEvaluatorExactTest, ThreeSourceRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {in0, in1}, {}},
+ { TGSI_OPCODE_MAD, {out0}, {1,2, 3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {0,2}, {1,2}}));
+}
+
+/* Check minimal lifetime for registers only written to */
+TEST_F(LifetimeEvaluatorExactTest, OverwriteWrittenOnlyTemps)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV , {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV , {2}, {in1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {1,2}}));
+}
+
+/* same register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least past the last write instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnlyTwiceSame)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+
+/* Dead code elimination should catch and remove the case
+ * when a variable is written after its last read, but
+ * we want the code to be aware of this case.
+ * The life time of this uselessly written variable is set
+ * to the instruction after the write, because
+ * otherwise it could be re-used too early. */
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {1}, {2}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,3}, {1,2}}));
+}
+
+/* if a break is in the inner loop, all variables written after the
+ * break and used outside that loop must survive the
+ * outer loop
+ */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* Implementation of helper and test classes */
+
+MockShader::~MockShader()
+{
+ free();
+ ralloc_free(mem_ctx);
+}
+
+int MockShader::get_num_temps()
+{
+ return num_temps;
+}
+
+
+exec_list* MockShader::get_program()
+{
+ return program;
+}
+
+MockShader::MockShader(const vector<MockCodeline>& source):
+ num_temps(0)
+{
+ mem_ctx = ralloc_context(NULL);
+
+ program = new(mem_ctx) exec_list();
+
+ for (MockCodeline i: source) {
+ glsl_to_tgsi_instruction *next_instr = new(mem_ctx) glsl_to_tgsi_instruction();
+ next_instr->op = i.op;
+ next_instr->info = tgsi_get_opcode_info(i.op);
+
+ assert(i.src.size() < 4);
+ assert(i.dst.size() < 3);
+ assert(i.tex_offsets.size() < 3);
+
+ for (unsigned k = 0; k < i.src.size(); ++k) {
+ next_instr->src[k] = create_src_register(i.src[k]);
+ }
+ for (unsigned k = 0; k < i.dst.size(); ++k) {
+ next_instr->dst[k] = create_dst_register(i.dst[k]);
+ }
+ next_instr->tex_offset_num_offset = i.tex_offsets.size();
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+ next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+ }
+
+ program->push_tail(next_instr);
+ }
+ ++num_temps;
+}
+
+void MockShader::free()
+{
+ /* the list is not fully initialized, so
+ * tearing it down also must be done manually. */
+ exec_node *p;
+ while ((p = program->pop_head())) {
+ glsl_to_tgsi_instruction * instr = static_cast<glsl_to_tgsi_instruction *>(p);
+ if (instr->tex_offset_num_offset > 0)
+ delete[] instr->tex_offsets;
+ delete p;
+ }
+ program = 0;
+ num_temps = 0;
+}
+
+st_src_reg MockShader::create_src_register(int src_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (src_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = src_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_INPUT;
+ idx = -src_idx;
+ }
+ return st_src_reg(file, idx, GLSL_TYPE_INT);
+
+}
+
+st_dst_reg MockShader::create_dst_register(int dst_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (dst_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = dst_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_OUTPUT;
+ idx = - dst_idx;
+ }
+ return st_dst_reg(file, 0xF, GLSL_TYPE_INT, idx);
+}
+
+
+void MesaTestWithMemCtx::SetUp()
+{
+ mem_ctx = ralloc_context(nullptr);
+}
+
+void MesaTestWithMemCtx::TearDown()
+{
+ ralloc_free(mem_ctx);
+ mem_ctx = nullptr;
+}
+
+void LifetimeEvaluatorTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+ std::vector<lifetime> result(shader.get_num_temps());
+
+ get_temp_registers_required_lifetimes(mem_ctx, shader.get_program(),
+ shader.get_num_temps(), &result[0]);
+
+ /* lifetimes[0] not used, but created for simpler processing */
+ ASSERT_EQ(result.size(), e.size());
+ check(result, e);
+}
+
+
+void LifetimeEvaluatorExactTest::check( const vector<lifetime>& lifetimes,
+ const expectation& e)
+{
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_EQ(lifetimes[i].begin, e[i][0]);
+ EXPECT_EQ(lifetimes[i].end, e[i][1]);
+ }
+}
+
+void LifetimeEvaluatorAtLeastTest::check( const vector<lifetime>& lifetimes,
+ const expectation& e)
+{
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_LE(lifetimes[i].begin, e[i][0]);
+ EXPECT_GE(lifetimes[i].end, e[i][1]);
+ }
+}
--
2.13.0
Emil Velikov
2017-06-22 14:52:06 UTC
Hi Gert,
Post by Gert Wollny
This patch adds a set of unit tests for the new lifetime tracker.
---
configure.ac | 1 +
src/mesa/Makefile.am | 2 +-
src/mesa/state_tracker/tests/Makefile.am | 40 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 976 +++++++++++++++++++++
4 files changed, 1018 insertions(+), 1 deletion(-)
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
diff --git a/configure.ac b/configure.ac
index da7b2f8f81..5279b231ed 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2839,6 +2839,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/Makefile])
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..a88a94165d 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests
if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 0000000000..361ef5cfb7
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,40 @@
+AM_CFLAGS = \
+ $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+ $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+ -I$(top_srcdir)/src/gtest/include \
+ -I$(top_srcdir)/src \
+ -I$(top_srcdir)/src/mapi \
+ -I$(top_builddir)/src/mesa \
+ -I$(top_srcdir)/src/mesa \
+ -I$(top_srcdir)/include \
+ -I$(top_srcdir)/src/gallium/include \
+ -I$(top_srcdir)/src/gallium/auxiliary \
+ $(DEFINES) $(INCLUDE_DIRS)
INCLUDE_DIRS is/should be undefined in here - please drop.
Post by Gert Wollny
+
+TESTS = st-renumerate-test
+check_PROGRAMS = st-renumerate-test
+
+st_renumerate_test_SOURCES = \
+ test_glsl_to_tgsi_lifetime.cpp
+
+st_renumerate_test_LDFLAGS = \
+ $(LLVM_LDFLAGS)
+
+st_renumerate_test_LDADD = \
+ $(top_builddir)/src/mesa/libmesagallium.la \
+ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \
+ $(top_builddir)/src/gallium/auxiliary/libgallium.la \
+ $(top_builddir)/src/util/libmesautil.la \
+ $(top_builddir)/src/gallium/drivers/trace/libtrace.la \
+ $(top_builddir)/src/gallium/winsys/sw/null/libws_null.la \
+ $(top_builddir)/src/gallium/drivers/softpipe/libsoftpipe.la \
+ $(top_builddir)/src/gtest/libgtest.la \
+ $(GALLIUM_COMMON_LIB_DEPS) \
+ $(LLVM_LIBS) \
+ $(PTHREAD_LIBS) \
+ $(DLOPEN_LIBS)
+
I'm a bit suspicious whether we need everything in the above list.

In particular - glapi, trace, winsys/sw and drivers/softpipe.
Please check those and drop if not applicable.

With that, from the build POV the patch is
Reviewed-by: Emil Velikov <***@collabora.com>

Thanks
Emil
Gert Wollny
2017-06-25 07:22:09 UTC
Dear all,

this is a minor update to the patch set. Changes are:

- correct formatting following Emil's suggestions
- remove un-needed libraries for the tests
- rebase to master (e25950808f4eee)

I didn't change anything in the code logic, and I have been using mesa with the
patch applied for a few days now without noticing any regressions.

As noted before, I don't have write access to mesa-git, so I'll need someone
who sponsors this patch.

Many thanks for any additional comments,
Gert


Gert Wollny (6):
mesa/st: glsl_to_tgsi move some helper classes to extra files
mesa/st: glsl_to_tgsi: implement new temporary register lifetime
tracker
mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime
tracker
mesa/st: glsl_to_tgsi: add register rename mapping evaluator
mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
mesa/st: glsl_to_tgsi: tie in new temporary register merge approach

configure.ac | 1 +
src/mesa/Makefile.am | 2 +-
src/mesa/Makefile.sources | 4 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 315 +-----
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 207 ++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 165 +++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 786 ++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 36 +
src/mesa/state_tracker/tests/Makefile.am | 37 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 1070 ++++++++++++++++++++
10 files changed, 2335 insertions(+), 288 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
--
2.13.0
Gert Wollny
2017-06-25 07:22:11 UTC
This patch adds a class for tracking the life times of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, in order to minimize the number of needed memory allocations the
program is scanned to evaluate the number of scopes.
Then, the program is scanned second time to recorc the important register
access time points: first and last reads and writes and their link to the
execution scope (loop, if/else branch, switch case).
In the third step for each register the actuall minimal life time is
evaluated.
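
A minimal sketch of the three steps described above, under simplifying assumptions: `Instr`, `Op`, `count_scopes` and `estimate_lifetimes` are hypothetical stand-ins and not the Mesa types or functions, scopes are only counted and not used to extend life times, and the life time rule shown is the straight-line case only (first write up to the last read, or up to one line past the last write if nothing reads the value afterwards).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical, simplified instruction model (not glsl_to_tgsi_instruction). */
enum class Op { BeginLoop, EndLoop, If, EndIf, Other };

struct Instr {
   Op op;
   int dst; /* index of the temporary written, or -1 */
   int src; /* index of the temporary read, or -1 */
};

struct Lifetime {
   int begin;
   int end;
};

/* Step 1: count the scopes so that their storage can be allocated once. */
int count_scopes(const std::vector<Instr> &prog)
{
   int n_scopes = 1; /* the outer scope */
   for (const Instr &inst : prog) {
      if (inst.op == Op::BeginLoop || inst.op == Op::If)
         ++n_scopes;
   }
   return n_scopes;
}

/* Steps 2 and 3, restricted to code without control flow. */
std::vector<Lifetime>
estimate_lifetimes(const std::vector<Instr> &prog, int ntemps)
{
   std::vector<int> first_write(ntemps, -1);
   std::vector<int> last_write(ntemps, -1);
   std::vector<int> last_read(ntemps, -1);

   /* Step 2: record the access points of every temporary. */
   for (std::size_t line = 0; line < prog.size(); ++line) {
      const Instr &inst = prog[line];
      if (inst.src >= 0)
         last_read[inst.src] = static_cast<int>(line);
      if (inst.dst >= 0) {
         if (first_write[inst.dst] < 0)
            first_write[inst.dst] = static_cast<int>(line);
         last_write[inst.dst] = static_cast<int>(line);
      }
   }

   /* Step 3: derive the life time of every temporary. */
   std::vector<Lifetime> result(ntemps);
   for (int i = 0; i < ntemps; ++i) {
      if (first_write[i] < 0) {
         /* never written: the register is free */
         result[i] = Lifetime{-1, -1};
         continue;
      }
      int end = last_read[i];
      if (last_write[i] > end)
         end = last_write[i] + 1;
      result[i] = Lifetime{first_write[i], end};
   }
   return result;
}
```

For a chain such as "write T1; T2 = T1; T3 = T2; read T3" this gives each temporary a life time that ends where the next one begins, which is what makes merging registers possible. The scope handling for loops and conditionals is the part this sketch leaves out.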
---
src/mesa/Makefile.sources | 2 +
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 662 +++++++++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 33 +
3 files changed, 697 insertions(+)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..2359ec3c7d 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+ state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
new file mode 100644
index 0000000000..729d77130e
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -0,0 +1,662 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+
+#include "st_glsl_to_tgsi_temprename.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+#include <limits>
+
+/* Without C++11, define nullptr for forward compatibility
+ * and better readability */
+#if __cplusplus < 201103L
+#define nullptr 0
+#endif
+
+using std::numeric_limits;
+
+enum e_scope_type {
+ sct_outer,
+ sct_loop,
+ sct_if,
+ sct_else,
+ sct_switch,
+ sct_switch_case,
+ sct_switch_default,
+ sct_unknown
+};
+
+enum e_acc_type {
+ acc_read,
+ acc_write,
+ acc_write_cond_from_self
+};
+
+class prog_scope {
+
+public:
+ prog_scope(prog_scope *parent, e_scope_type type, int id, int depth,
+ int begin);
+
+ e_scope_type type() const;
+ prog_scope *parent() const;
+ int nesting_depth() const;
+ int id() const;
+ int end() const;
+ int begin() const;
+ int loop_continue_line() const;
+
+ const prog_scope *in_ifelse_scope() const;
+ const prog_scope *in_switchcase_scope() const;
+ const prog_scope *innermost_loop() const;
+ const prog_scope *outermost_loop() const;
+
+ bool in_loop() const;
+ bool is_conditional() const;
+ bool break_is_for_switchcase() const;
+ bool contains(const prog_scope& other) const;
+
+ void set_end(int end);
+ void set_previous_case_scope(prog_scope *prev);
+ void set_continue_line(int line);
+
+private:
+ e_scope_type scope_type;
+ int scope_id;
+ int scope_nesting_depth;
+ int scope_begin;
+ int scope_end;
+ int loop_cont_line;
+ prog_scope *previous_case_scope;
+ prog_scope *parent_scope;
+};
+
+class temp_access {
+public:
+ temp_access();
+ void record(int line, e_acc_type rw, prog_scope *scope);
+ lifetime get_required_lifetime();
+private:
+ prog_scope *last_read_scope;
+ prog_scope *first_read_scope;
+ prog_scope *first_write_scope;
+ int first_dominant_write;
+ int last_read;
+ int last_write;
+ int first_read;
+ bool keep_for_full_loop;
+};
+
+/* Some storage class to encapsulate the prog_scope (de-)allocations */
+class prog_scope_storage {
+public:
+ prog_scope_storage(void *mem_ctx, int n);
+ ~prog_scope_storage();
+ prog_scope * create(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin);
+private:
+ void *mem_ctx;
+ int current_slot;
+ prog_scope *storage;
+};
+
+/* Scan the program and estimate the required register life times.
+ * The array lifetimes must be pre-allocated */
+void
+get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
+ int ntemps, struct lifetime *lifetimes)
+{
+
+ int line = 0;
+ int loop_id = 0;
+ int if_id = 0;
+ int switch_id = 0;
+ int scope_level = 0;
+ bool is_at_end = false;
+
+ int n_scopes = 1;
+
+
+
+ /* count scopes to allocate the needed space without the need for
+ * re-allocation */
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (inst->op == TGSI_OPCODE_BGNLOOP ||
+ inst->op == TGSI_OPCODE_SWITCH ||
+ inst->op == TGSI_OPCODE_CASE ||
+ inst->op == TGSI_OPCODE_IF ||
+ inst->op == TGSI_OPCODE_UIF ||
+ inst->op == TGSI_OPCODE_ELSE ||
+ inst->op == TGSI_OPCODE_DEFAULT)
+ ++n_scopes;
+ }
+
+ prog_scope_storage scopes(mem_ctx, n_scopes);
+ temp_access *acc = new temp_access[ntemps];
+ prog_scope *cur_scope = scopes.create(nullptr, sct_outer, 0,
+ scope_level++, line);
+
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (is_at_end) {
+ assert(!"GLSL_TO_TGSI: shader has instructions past end marker");
+ break;
+ }
+
+ switch (inst->op) {
+ case TGSI_OPCODE_BGNLOOP: {
+ cur_scope = scopes.create(cur_scope, sct_loop, loop_id++,
+ scope_level++, line);
+ break;
+ }
+ case TGSI_OPCODE_ENDLOOP: {
+ --scope_level;
+ cur_scope->set_end(line);
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_IF:
+ case TGSI_OPCODE_UIF:{
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY)
+ acc[inst->src[j].index].record(line, acc_read, cur_scope);
+ }
+ cur_scope = scopes.create(cur_scope, sct_if, if_id++,
+ scope_level++, line+1);
+ break;
+ }
+ case TGSI_OPCODE_ELSE: {
+ cur_scope->set_end(line-1);
+ cur_scope = scopes.create(cur_scope->parent(), sct_else,
+ cur_scope->id(),cur_scope->nesting_depth(),
+ line+1);
+ break;
+ }
+ case TGSI_OPCODE_END:{
+ cur_scope->set_end(line);
+ is_at_end = true;
+ break;
+ }
+ case TGSI_OPCODE_ENDIF:{
+ --scope_level;
+ cur_scope->set_end(line-1);
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_SWITCH: {
+ cur_scope = scopes.create(cur_scope, sct_switch, switch_id++,
+ scope_level++, line);
+ break;
+ }
+ case TGSI_OPCODE_ENDSWITCH: {
+ --scope_level;
+ cur_scope->set_end(line-1);
+ /* remove the case level, it might not have been
+ * closed with a break */
+ if (cur_scope->type() != sct_switch )
+ cur_scope = cur_scope->parent();
+
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_CASE:
+ case TGSI_OPCODE_DEFAULT: {
+ /* Switch cases and default are handled at the same nesting level
+ * like their enclosing switch */
+ e_scope_type t = inst->op == TGSI_OPCODE_CASE ? sct_switch_case
+ : sct_switch_default;
+ prog_scope *switch_scope = cur_scope;
+ if ( cur_scope->type() == sct_switch ) {
+ cur_scope = scopes.create(cur_scope, t, cur_scope->id(),
+ scope_level, line+1);
+ } else {
+ switch_scope = cur_scope->parent();
+ assert(switch_scope->type() == sct_switch);
+ prog_scope *scope = scopes.create(switch_scope, t,
+ switch_scope->id(),
+ switch_scope->nesting_depth(),
+ line);
+
+ /* Previous case falls through */
+ if (cur_scope->end() == -1)
+ scope->set_previous_case_scope(cur_scope);
+
+ cur_scope = scope;
+ }
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY)
+ acc[inst->src[j].index].record(line, acc_read, switch_scope);
+ }
+ }
+ case TGSI_OPCODE_BRK: {
+ if (cur_scope->break_is_for_switchcase()) {
+ cur_scope->set_end(line-1);
+ break;
+ }
+ }
+ case TGSI_OPCODE_CONT: {
+ cur_scope->set_continue_line(line);
+ break;
+ }
+ default: {
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY)
+ acc[inst->src[j].index].record(line, acc_read, cur_scope);
+ }
+ for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY)
+ acc[inst->tex_offsets[j].index].record(line, acc_read, cur_scope);
+ }
+
+ e_acc_type write_type = inst->op == TGSI_OPCODE_UCMP ?
+ acc_write_cond_from_self :
+ acc_write;
+ for (unsigned j = 0; j < num_inst_dst_regs(inst); j++) {
+ if (inst->dst[j].file == PROGRAM_TEMPORARY)
+ acc[inst->dst[j].index].record(line, write_type, cur_scope);
+ }
+ }
+ }
+ ++line;
+ }
+
+ /* make sure last scope is closed, even though no
+ * TGSI_OPCODE_END was given */
+ if (cur_scope->end() < 0)
+ cur_scope->set_end(line-1);
+
+ for(int i = 1; i < ntemps; ++i)
+ lifetimes[i] = acc[i].get_required_lifetime();
+
+ delete[] acc;
+}
+
+prog_scope::prog_scope(prog_scope *parent, e_scope_type type, int id,
+ int depth, int scope_begin):
+ scope_type(type),
+ scope_id(id),
+ scope_nesting_depth(depth),
+ scope_begin(scope_begin),
+ scope_end(-1),
+ loop_cont_line(numeric_limits<int>::max()),
+ parent_scope(parent)
+{
+}
+
+e_scope_type prog_scope::type() const
+{
+ return scope_type;
+}
+
+
+prog_scope *prog_scope::parent() const
+{
+ return parent_scope;
+}
+
+int prog_scope::nesting_depth() const
+{
+ return scope_nesting_depth;
+}
+
+bool prog_scope::in_loop() const
+{
+ if (scope_type == sct_loop)
+ return true;
+
+ if (parent_scope)
+ return parent_scope->in_loop();
+
+ return false;
+}
+
+const prog_scope *prog_scope::innermost_loop() const
+{
+ if (scope_type == sct_loop)
+ return this;
+
+ if (parent_scope)
+ return parent_scope->innermost_loop();
+
+ return nullptr;
+}
+
+const prog_scope *prog_scope::outermost_loop() const
+{
+ const prog_scope *loop = nullptr;
+ const prog_scope *p = this;
+
+ do {
+
+ if (p->type() == sct_loop)
+ loop = p;
+
+ p = p->parent();
+
+ } while (p);
+
+ return loop;
+}
+
+bool prog_scope::contains(const prog_scope& other) const
+{
+ return (begin() <= other.begin()) && (end() >= other.end());
+}
+
+bool prog_scope::is_conditional() const
+{
+ return scope_type == sct_if ||
+ scope_type == sct_else ||
+ scope_type == sct_switch_case ||
+ scope_type == sct_switch_default;
+}
+
+const prog_scope *prog_scope::in_ifelse_scope() const
+{
+ if (scope_type == sct_if ||
+ scope_type == sct_else)
+ return this;
+
+ if (parent_scope)
+ return parent_scope->in_ifelse_scope();
+
+ return nullptr;
+}
+
+const prog_scope *prog_scope::in_switchcase_scope() const
+{
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default)
+ return this;
+
+ if (parent_scope)
+ return parent_scope->in_switchcase_scope();
+
+ return nullptr;
+}
+
+bool prog_scope::break_is_for_switchcase() const
+{
+ if (scope_type == sct_loop)
+ return false;
+
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default ||
+ scope_type == sct_switch)
+ return true;
+
+ if (parent_scope)
+ return parent_scope->break_is_for_switchcase();
+
+ return false;
+}
+
+int prog_scope::id() const
+{
+ return scope_id;
+}
+
+int prog_scope::begin() const
+{
+ return scope_begin;
+}
+
+int prog_scope::end() const
+{
+ return scope_end;
+}
+
+void prog_scope::set_previous_case_scope(prog_scope * prev)
+{
+ previous_case_scope = prev;
+}
+
+void prog_scope::set_end(int end)
+{
+ if (scope_end == -1) {
+ scope_end = end;
+ if (previous_case_scope)
+ previous_case_scope->set_end(end);
+ }
+}
+
+void prog_scope::set_continue_line(int line)
+{
+ if (scope_type == sct_loop) {
+ loop_cont_line = line;
+ } else {
+ if (parent_scope)
+ parent()->set_continue_line(line);
+ }
+}
+
+int prog_scope::loop_continue_line() const
+{
+ return loop_cont_line;
+}
+
+temp_access::temp_access():
+ last_read_scope(nullptr),
+ first_read_scope(nullptr),
+ first_write_scope(nullptr),
+ first_dominant_write(-1),
+ last_read(-1),
+ last_write(-1),
+ first_read(numeric_limits<int>::max()),
+ keep_for_full_loop(false)
+{
+}
+
+void temp_access::record(int line, e_acc_type rw, prog_scope * scope)
+{
+ if (rw == acc_read) {
+
+ last_read_scope = scope;
+ last_read = line;
+
+ if (first_read > line) {
+ first_read = line;
+ first_read_scope = scope;
+ }
+
+ } else {
+
+ last_write = line;
+
+ /* If no first write is assigned check whether we deal with a case where
+ * the temp is read and written in the same instruction, because then
+ * it is not a dominant write, it may even be undefined. Hence postpone
+ * the assignment of the first write, only mark that the register was
+ * written at all by remembering a scope */
+
+ if (first_dominant_write < 0) {
+
+ if (line != last_read || (rw == acc_write_cond_from_self))
+ first_dominant_write = line;
+
+ first_write_scope = scope;
+ }
+
+ if (scope->is_conditional() && scope->in_loop())
+ keep_for_full_loop = true;
+
+ }
+
+}
+
+inline lifetime make_lifetime(int b, int e)
+{
+ lifetime lt;
+ lt.begin = b;
+ lt.end = e;
+ return lt;
+}
+
+#include <iostream>
+using std::cerr;
+
+lifetime temp_access::get_required_lifetime()
+{
+
+ /* this temp is only read, this is undefined
+ * behaviour, so we can use the register otherwise */
+ if (!first_write_scope)
+ return make_lifetime(-1, -1);
+
+
+ /* Only written to, just make sure it doesn't overlap */
+ if (!last_read_scope)
+ return make_lifetime(first_dominant_write, last_write + 1);
+
+
+ /* Undefined behaviour: read and write in the same instruction
+ * but never written elsewhere. Since it is written, we need to
+ * keep it nevertheless.
+ * In this case the first dominant write is not recorded and we use the
+ * first read to estimate the life time. This is not minimal, since another
+ * undefined first read could have happened before the first undefined
+ * write, but we don't care, because adding yet another tracking variable
+ * to handle this rare case of undefined behaviour doesn't make sense */
+ if (first_write_scope && first_dominant_write < 0) {
+ return make_lifetime(first_read, last_write + 1);
+ }
+
+ const prog_scope *target_scope = last_read_scope;
+ int enclosing_scope_depth = -1;
+
+ /* we read before writing, conditional, and in a loop
+ * hence the value must survive the loops */
+ if ((first_read <= first_dominant_write) &&
+ first_read_scope->is_conditional() &&
+ first_read_scope->in_loop()) {
+ keep_for_full_loop = true;
+ target_scope = first_read_scope->outermost_loop();
+ }
+
+ /* Evaluate the scope that is shared by all three, first write, and
+ * first (conditional) read before write and last read. */
+ while (enclosing_scope_depth < 0) {
+ if (target_scope->contains(*first_write_scope)) {
+ enclosing_scope_depth = target_scope->nesting_depth();
+ } else if (first_write_scope->contains(*target_scope)) {
+ target_scope = first_write_scope;
+ enclosing_scope_depth = first_write_scope->nesting_depth();
+ } else {
+ target_scope = target_scope->parent();
+ assert(target_scope);
+ }
+ }
+
+ /* propagate the read scope to the target scope */
+ while (last_read_scope->nesting_depth() > enclosing_scope_depth) {
+ /* if the read is in a loop we need to extend the
+ * variables life time to the end of that loop
+ * because at this point it is not written in the same loop */
+ if (last_read_scope->type() == sct_loop)
+ last_read = last_read_scope->end();
+
+ last_read_scope = last_read_scope->parent();
+ }
+
+ /* If the first read is (conditionally) before the first write we
+ * have to keep the variable for the loop */
+ if ((first_write_scope->type() == sct_loop) &&
+ (first_read <= first_dominant_write)) {
+
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ /* propagate the first_write scope to the target scope */
+ while (enclosing_scope_depth < first_write_scope->nesting_depth()) {
+
+ /* propagate lifetime also if there was a continue/break
+ * in a loop and the write was after the continue/break inside
+ * that loop. Note that this is only needed if we move up in the
+ * scopes. */
+ if (first_write_scope->loop_continue_line() < first_dominant_write &&
+ first_write_scope->end() > first_dominant_write) {
+ keep_for_full_loop = true;
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ first_write_scope = first_write_scope->parent();
+
+ /* Do the propagation again for the parent loop */
+ if (first_write_scope->type() == sct_loop) {
+ if (keep_for_full_loop || (first_read <= first_dominant_write)) {
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+ }
+
+ /* if we currently don't propagate the lifetime but
+ * the enclosing scope is a conditional within a loop
+ * up to the last-read level we need to propagate,
+ * todo: to tighten the life time check whether the value
+ * is written in all conditional code paths below the loop */
+ if (!keep_for_full_loop &&
+ first_write_scope->is_conditional() &&
+ first_write_scope->in_loop())
+ keep_for_full_loop = true;
+
+ }
+
+ /* Moving up from a loop into a conditional we might not yet have marked
+ * life-time to scope propagation. Hence, if the conditional we just moved
+ * to is within a loop, propagate in the next round. */
+ if (last_write > last_read)
+ last_read = last_write + 1;
+
+ /* Here we are at the same scope, all is resolved */
+ return make_lifetime(first_dominant_write, last_read);
+}
+
+prog_scope_storage::prog_scope_storage(void *mc, int n):
+ mem_ctx(mc),
+ current_slot(0)
+{
+ storage = ralloc_array(mem_ctx, prog_scope, n);
+}
+
+prog_scope_storage::~prog_scope_storage()
+{
+ ralloc_free(storage);
+}
+
+prog_scope*
+prog_scope_storage::create(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin)
+{
+ storage[current_slot] = prog_scope(p, type, id, lvl, s_begin);
+ return &storage[current_slot++];
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
new file mode 100644
index 0000000000..a4124b4659
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+
+struct lifetime {
+ int begin;
+ int end;
+};
+
+void
+get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
+ int ntemps, struct lifetime *lifetimes);
--
2.13.0
Nicolai Hähnle
2017-06-26 12:52:48 UTC
Permalink
Thanks for the update. First off, you're still not tracking individual
components, but that's absolutely necessary. Think:

BGNLOOP
MOV TEMP[1].x, ...

UIF ...
MOV TEMP[1].y, ...
ENDIF

use TEMP[1].y
ENDLOOP
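
A minimal sketch of why the example above needs per-component records. The names (`ComponentAccess`, `TempAccess`, `needs_full_loop`) are hypothetical and not part of the patch, and the criterion is deliberately simplified: it only asks whether a component that is read was first written inside a conditional. A tracker that keeps one record per register can take the unconditional write to TEMP[1].x as the defining write of the whole register and miss that TEMP[1].y is only written conditionally.

```cpp
#include <array>
#include <cassert>

/* Hypothetical per-component access record. */
struct ComponentAccess {
   int first_write = -1;
   bool first_write_conditional = false;
   int last_read = -1;
};

/* Hypothetical per-register record holding one entry per component. */
struct TempAccess {
   std::array<ComponentAccess, 4> comp; /* x, y, z, w */

   void record_write(int line, unsigned writemask, bool in_conditional)
   {
      for (int c = 0; c < 4; ++c) {
         if (!(writemask & (1u << c)))
            continue;
         if (comp[c].first_write < 0) {
            comp[c].first_write = line;
            comp[c].first_write_conditional = in_conditional;
         }
      }
   }

   void record_read(int line, unsigned readmask)
   {
      for (int c = 0; c < 4; ++c) {
         if (readmask & (1u << c))
            comp[c].last_read = line;
      }
   }

   /* Simplified rule: if a component that is read was first written
    * conditionally, a value from an earlier loop iteration may be
    * read, so the register has to be kept for the whole loop. */
   bool needs_full_loop() const
   {
      for (const ComponentAccess &a : comp) {
         if (a.last_read >= 0 && a.first_write_conditional)
            return true;
      }
      return false;
   }
};
```

Feeding in the example (write .x at line 1, conditional write .y at line 3, read .y at line 5) makes `needs_full_loop()` return true, while a register whose .x is written and read unconditionally does not trigger it.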
Post by Gert Wollny
This patch adds a class for tracking the life times of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, in order to minimize the number of needed memory allocations the
program is scanned to evaluate the number of scopes.
Then, the program is scanned second time to recorc the important register
Typo: record
Post by Gert Wollny
access time points: first and last reads and writes and their link to the
execution scope (loop, if/else branch, switch case).
In the third step for each register the actuall minimal life time is
Typo: actual (though you could just drop the word entirely)
Post by Gert Wollny
evaluated.
---
src/mesa/Makefile.sources | 2 +
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 662 +++++++++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 33 +
3 files changed, 697 insertions(+)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 21f9167bda..2359ec3c7d 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -509,6 +509,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+ state_tracker/st_glsl_to_tgsi_temprename.cpp \
+ state_tracker/st_glsl_to_tgsi_temprename.h \
Looks like inconsistent whitespace.
Post by Gert Wollny
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
new file mode 100644
index 0000000000..729d77130e
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -0,0 +1,662 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+
+#include "st_glsl_to_tgsi_temprename.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+#include <limits>
+
+/* Without C++11, define nullptr for forward compatibility
+ * and better readability */
+#if __cplusplus < 201103L
+#define nullptr 0
+#endif
+
+using std::numeric_limits;
+
+enum e_scope_type {
Please drop the "e_" prefix here and below, we don't usually do that.
Post by Gert Wollny
+ sct_outer,
+ sct_loop,
+ sct_if,
+ sct_else,
+ sct_switch,
+ sct_switch_case,
+ sct_switch_default,
+ sct_unknown
+};
+
+enum e_acc_type {
+ acc_read,
+ acc_write,
+ acc_write_cond_from_self
Document what this means.
Post by Gert Wollny
+};
+
+class prog_scope {
+
+public:
+ prog_scope(prog_scope *parent, e_scope_type type, int id, int depth,
+ int begin);
+
+ e_scope_type type() const;
+ prog_scope *parent() const;
+ int nesting_depth() const;
+ int id() const;
+ int end() const;
+ int begin() const;
+ int loop_continue_line() const;
+
+ const prog_scope *in_ifelse_scope() const;
+ const prog_scope *in_switchcase_scope() const;
+ const prog_scope *innermost_loop() const;
+ const prog_scope *outermost_loop() const;
+
+ bool in_loop() const;
+ bool is_conditional() const;
+ bool break_is_for_switchcase() const;
+ bool contains(const prog_scope& other) const;
Please stick to pointers here. (The rule of thumb I use is that if you
care about the identity of a C++ object rather than its contents, you
should not use references.)
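
A small sketch of what the pointer-based signature could look like. The `scope` class here is a hypothetical, trimmed-down stand-in for `prog_scope` that only keeps the begin and end lines.

```cpp
#include <cassert>

/* Hypothetical, reduced scope type; only begin and end are modelled. */
class scope {
public:
   scope(int begin, int end) : scope_begin(begin), scope_end(end) {}

   int begin() const { return scope_begin; }
   int end() const { return scope_end; }

   /* The other scope is an object with identity, so it is passed by
    * pointer like everywhere else in the scope tree. */
   bool contains(const scope *other) const
   {
      return begin() <= other->begin() && end() >= other->end();
   }

private:
   int scope_begin;
   int scope_end;
};
```

A call site then reads `target_scope->contains(first_write_scope)` instead of dereferencing the pointer first, which matches how the scopes are handled in the rest of the code.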
Post by Gert Wollny
+
+ void set_end(int end);
+ void set_previous_case_scope(prog_scope *prev);
+ void set_continue_line(int line);
+
+private:
+ e_scope_type scope_type;
+ int scope_id;
+ int scope_nesting_depth;
+ int scope_begin;
+ int scope_end;
+ int loop_cont_line;
+ prog_scope *previous_case_scope;
+ prog_scope *parent_scope;
+};
+
+class temp_access {
+public:
+ temp_access();
+ void record(int line, e_acc_type rw, prog_scope *scope);
+ lifetime get_required_lifetime();
+private:
+ prog_scope *last_read_scope;
+ prog_scope *first_read_scope;
+ prog_scope *first_write_scope;
+ int first_dominant_write;
+ int last_read;
+ int last_write;
+ int first_read;
+ bool keep_for_full_loop;
+};
+
+/* Some storage class to encapsulate the prog_scope (de-)allocations */
+class prog_scope_storage {
+public:
+ prog_scope_storage(void *mem_ctx, int n);
+ ~prog_scope_storage();
+ prog_scope * create(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin);
+private:
+ void *mem_ctx;
+ int current_slot;
+ prog_scope *storage;
+};
+
+/* Scan the program and estimate the required register life times.
+ * The array lifetimes must be pre-allocated */
Closing */ on its own line.
Post by Gert Wollny
+void
+get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
+ int ntemps, struct lifetime *lifetimes)
+{
+
Remove whitespace line.
Post by Gert Wollny
+ int line = 0;
+ int loop_id = 0;
+ int if_id = 0;
+ int switch_id = 0;
+ int scope_level = 0;
+ bool is_at_end = false;
+
Remove whitespace line.
Post by Gert Wollny
+ int n_scopes = 1;
+
+
+
Collapse whitespace lines. We never have consecutive empty lines inside
function, struct, etc. declarations. The only place where two
consecutive empty lines are occasionally used is at the top level (file)
scope.
Post by Gert Wollny
+ /* count scopes to allocate the needed space without the need for
+ * re-allocation */
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (inst->op == TGSI_OPCODE_BGNLOOP ||
+ inst->op == TGSI_OPCODE_SWITCH ||
+ inst->op == TGSI_OPCODE_CASE ||
+ inst->op == TGSI_OPCODE_IF ||
+ inst->op == TGSI_OPCODE_UIF ||
+ inst->op == TGSI_OPCODE_ELSE ||
+ inst->op == TGSI_OPCODE_DEFAULT)
+ ++n_scopes;
+ }
+
+ prog_scope_storage scopes(mem_ctx, n_scopes);
+ temp_access *acc = new temp_access[ntemps];
+ prog_scope *cur_scope = scopes.create(nullptr, sct_outer, 0,
+ scope_level++, line);
+
+ foreach_in_list(glsl_to_tgsi_instruction, inst, instructions) {
+ if (is_at_end) {
+ assert(!"GLSL_TO_TGSI: shader has instructions past end marker");
+ break;
+ }
+
+ switch (inst->op) {
+ case TGSI_OPCODE_BGNLOOP: {
+ cur_scope = scopes.create(cur_scope, sct_loop, loop_id++,
+ scope_level++, line);
+ break;
+ }
+ case TGSI_OPCODE_ENDLOOP: {
+ --scope_level;
+ cur_scope->set_end(line);
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_IF:
+ case TGSI_OPCODE_UIF:{
Space before {
Post by Gert Wollny
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY)
+ acc[inst->src[j].index].record(line, acc_read, cur_scope);
+ }
+ cur_scope = scopes.create(cur_scope, sct_if, if_id++,
+ scope_level++, line+1);
+ break;
+ }
+ case TGSI_OPCODE_ELSE: {
+ cur_scope->set_end(line-1);
+ cur_scope = scopes.create(cur_scope->parent(), sct_else,
+ cur_scope->id(),cur_scope->nesting_depth(),
+ line+1);
+ break;
+ }
+ case TGSI_OPCODE_END:{
+ cur_scope->set_end(line);
+ is_at_end = true;
+ break;
+ }
+ case TGSI_OPCODE_ENDIF:{
+ --scope_level;
+ cur_scope->set_end(line-1);
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_SWITCH: {
+ cur_scope = scopes.create(cur_scope, sct_switch, switch_id++,
+ scope_level++, line);
+ break;
+ }
+ case TGSI_OPCODE_ENDSWITCH: {
+ --scope_level;
+ cur_scope->set_end(line-1);
+ /* remove the case level, it might not have been
+ * closed with a break */
Closing */ on its own line.
Post by Gert Wollny
+ if (cur_scope->type() != sct_switch )
+ cur_scope = cur_scope->parent();
+
+ cur_scope = cur_scope->parent();
+ assert(cur_scope);
+ break;
+ }
+ case TGSI_OPCODE_CASE:
+ case TGSI_OPCODE_DEFAULT: {
+ /* Switch cases and default are handled at the same nesting level
+ * like their enclosing switch */
Why? It seems surprising to mess with the invariant that
parent->nesting_depth() == nesting_depth() - 1.
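
To make the invariant concrete, here is a sketch of a check that walks the parent chain. `scope_node` and `depth_invariant_holds` are hypothetical names used only for illustration.

```cpp
#include <cassert>

/* Hypothetical minimal scope node: a parent link and a nesting depth. */
struct scope_node {
   const scope_node *parent;
   int depth;
};

/* Returns true if every scope on the path to the outermost scope is
 * exactly one level deeper than its parent. */
bool depth_invariant_holds(const scope_node *s)
{
   for (; s && s->parent; s = s->parent) {
      if (s->parent->depth != s->depth - 1)
         return false;
   }
   return true;
}
```

A case scope that is created with the nesting depth of its enclosing switch fails this check, whereas one created at the switch depth plus one passes it.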
Post by Gert Wollny
+ e_scope_type t = inst->op == TGSI_OPCODE_CASE ? sct_switch_case
+ : sct_switch_default;
+ prog_scope *switch_scope = cur_scope;
+ if ( cur_scope->type() == sct_switch ) {
+ cur_scope = scopes.create(cur_scope, t, cur_scope->id(),
+ scope_level, line+1);
+ } else {
+ switch_scope = cur_scope->parent();
+ assert(switch_scope->type() == sct_switch);
+ prog_scope *scope = scopes.create(switch_scope, t,
+ switch_scope->id(),
+ switch_scope->nesting_depth(),
+ line);
+
+ /* Previous case falls through */
+ if (cur_scope->end() == -1)
+ scope->set_previous_case_scope(cur_scope);
+
+ cur_scope = scope;
+ }
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY)
+ acc[inst->src[j].index].record(line, acc_read, switch_scope);
+ }
+ }
+ case TGSI_OPCODE_BRK: {
+ if (cur_scope->break_is_for_switchcase()) {
+ cur_scope->set_end(line-1);
+ break;
+ }
+ }
+ case TGSI_OPCODE_CONT: {
+ cur_scope->set_continue_line(line);
I'm still frankly confused about the way you choose to handle BRK/CONT
in loops, and suspect you're doing it wrong. At the very least, having a
function called "set_continue_line" be called for a BRK is bad naming.
Post by Gert Wollny
+ break;
+ }
+ default: {
+ for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
+ if (inst->src[j].file == PROGRAM_TEMPORARY)
+ acc[inst->src[j].index].record(line, acc_read, cur_scope);
+ }
+ for (unsigned j = 0; j < inst->tex_offset_num_offset; j++) {
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY)
+ acc[inst->tex_offsets[j].index].record(line, acc_read, cur_scope);
+ }
+
+ e_acc_type write_type = inst->op == TGSI_OPCODE_UCMP ?
Despite the opcode being called "Integer Conditional Move", it does
write to dst unconditionally. It should probably have been called
"select" or something like that.
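
As a sketch of the semantics meant here: per component, UCMP picks src1 where src0 is non-zero and src2 otherwise, so every destination component gets a new value either way. The `vec4u` type and the `ucmp` function are hypothetical helpers for illustration, and the write mask is ignored.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

/* Hypothetical model of a four-component unsigned register. */
typedef std::array<std::uint32_t, 4> vec4u;

/* Per-component select: the destination is written unconditionally,
 * only the source of the value depends on src0. */
vec4u ucmp(const vec4u &src0, const vec4u &src1, const vec4u &src2)
{
   vec4u dst;
   for (int c = 0; c < 4; ++c)
      dst[c] = src0[c] ? src1[c] : src2[c];
   return dst;
}
```

Because no component of the destination keeps its old value, a UCMP write does not have to be treated differently from a plain write when the life time is evaluated.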
Post by Gert Wollny
+ acc_write;
+ for (unsigned j = 0; j < num_inst_dst_regs(inst); j++) {
+ if (inst->dst[j].file == PROGRAM_TEMPORARY)
+ acc[inst->dst[j].index].record(line, write_type, cur_scope);
+ }
+ }
+ }
+ ++line;
+ }
+
+ /* make sure last scope is closed, even though no
+ * TGSI_OPCODE_END was given */
+ if (cur_scope->end() < 0)
+ cur_scope->set_end(line-1);
+
+ for(int i = 1; i < ntemps; ++i)
+ lifetimes[i] = acc[i].get_required_lifetime();
+
+ delete[] acc;
+}
+
+prog_scope::prog_scope(prog_scope *parent, e_scope_type type, int id,
+ int depth, int scope_begin):
+ scope_type(type),
+ scope_id(id),
+ scope_nesting_depth(depth),
+ scope_begin(scope_begin),
+ scope_end(-1),
+ loop_cont_line(numeric_limits<int>::max()),
+ parent_scope(parent)
+{
+}
+
+e_scope_type prog_scope::type() const
+{
+ return scope_type;
+}
+
+
Be consistent about how you use whitespace.
Post by Gert Wollny
+prog_scope *prog_scope::parent() const
+{
+ return parent_scope;
+}
+
+int prog_scope::nesting_depth() const
+{
+ return scope_nesting_depth;
+}
+
+bool prog_scope::in_loop() const
+{
+ if (scope_type == sct_loop)
+ return true;
+
+ if (parent_scope)
+ return parent_scope->in_loop();
+
+ return false;
+}
+
+const prog_scope *prog_scope::innermost_loop() const
+{
+ if (scope_type == sct_loop)
+ return this;
+
+ if (parent_scope)
+ return parent_scope->innermost_loop();
+
+ return nullptr;
+}
+
+const prog_scope *prog_scope::outermost_loop() const
+{
+ const prog_scope *loop = nullptr;
+ const prog_scope *p = this;
+
+ do {
+
Remove empty line.
Post by Gert Wollny
+ if (p->type() == sct_loop)
+ loop = p;
+
+ p = p->parent();
+
Remove empty line.
Post by Gert Wollny
+ } while (p);
+
+ return loop;
+}
+
+bool prog_scope::contains(const prog_scope& other) const
+{
+ return (begin() <= other.begin()) && (end() >= other.end());
+}
+
+bool prog_scope::is_conditional() const
+{
+ return scope_type == sct_if ||
+ scope_type == sct_else ||
+ scope_type == sct_switch_case ||
+ scope_type == sct_switch_default;
+}
+
+const prog_scope *prog_scope::in_ifelse_scope() const
+{
+ if (scope_type == sct_if ||
+ scope_type == sct_else)
+ return this;
+
+ if (parent_scope)
+ return parent_scope->in_ifelse_scope();
+
+ return nullptr;
+}
+
+const prog_scope *prog_scope::in_switchcase_scope() const
+{
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default)
+ return this;
+
+ if (parent_scope)
+ return parent_scope->in_switchcase_scope();
+
+ return nullptr;
+}
+
+bool prog_scope::break_is_for_switchcase() const
+{
+ if (scope_type == sct_loop)
+ return false;
+
+ if (scope_type == sct_switch_case ||
+ scope_type == sct_switch_default ||
+ scope_type == sct_switch)
+ return true;
+
+ if (parent_scope)
+ return parent_scope->break_is_for_switchcase();
+
+ return false;
+}
+
+int prog_scope::id() const
+{
+ return scope_id;
+}
+
+int prog_scope::begin() const
+{
+ return scope_begin;
+}
+
+int prog_scope::end() const
+{
+ return scope_end;
+}
+
+void prog_scope::set_previous_case_scope(prog_scope * prev)
+{
+ previous_case_scope = prev;
+}
+
+void prog_scope::set_end(int end)
+{
+ if (scope_end == -1) {
+ scope_end = end;
+ if (previous_case_scope)
+ previous_case_scope->set_end(end);
+ }
+}
+
+void prog_scope::set_continue_line(int line)
+{
+ if (scope_type == sct_loop) {
+ loop_cont_line = line;
+ } else {
+ if (parent_scope)
+ parent()->set_continue_line(line);
+ }
+}
+
+int prog_scope::loop_continue_line() const
+{
+ return loop_cont_line;
+}
+
+ last_read_scope(nullptr),
+ first_read_scope(nullptr),
+ first_write_scope(nullptr),
+ first_dominant_write(-1),
+ last_read(-1),
+ last_write(-1),
+ first_read(numeric_limits<int>::max()),
+ keep_for_full_loop(false)
+{
+}
+
+void temp_access::record(int line, e_acc_type rw, prog_scope * scope)
+{
+ if (rw == acc_read) {
+
Again, remove empty line. Further instances of this below.
Post by Gert Wollny
+ last_read_scope = scope;
+ last_read = line;
+
+ if (first_read > line) {
+ first_read = line;
+ first_read_scope = scope;
+ }
+
+ } else {
+
+ last_write = line;
+
+ /* If no first write is assigned check whether we deal with a case where
+ * the temp is read and written in the same instructions, because then
+ * it is not a dominant write, it may even be undefined. Hence postpone
+ * the assignment if the first write, only mark that the register was
+ * written at all by remembering a scope */
+
Closing */ on its own line, and remove empty line.

Also, I think the comment is wrong. It should count as a dominating
write even if there's a read on the same line. So the special handling
here is wrong.

What you need to do for loop handling is to use first_read <=
first_dominating_write as a check for whether the first read occurs
before the first dominating write in program order.

In general, you need to think of every instruction / line of the program
as occurring in two phases, a "read" phase and a "write" phase. Then you
don't need special cases like this.
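
To illustrate the two-phase view, here is a minimal sketch. It is not code from the patch: access_position, access_record and their members are made-up names, and it assumes straight-line code so that every write dominates what follows. Each line gets a read position and a later write position, so a read and a write on the same line are ordered without any special case.

```cpp
#include <cassert>

// Hypothetical encoding: each program line has a read phase followed by a
// write phase, so every access gets a unique, totally ordered position.
enum access_phase { phase_read = 0, phase_write = 1 };

inline int access_position(int line, access_phase phase)
{
   return 2 * line + phase;
}

// Minimal per-temporary record. The names are illustrative only and do not
// match the temp_access class in the patch.
struct access_record {
   int first_read = -1;             // position of the earliest read, or -1
   int first_dominating_write = -1; // position of the earliest write, or -1

   void record_read(int line)
   {
      int pos = access_position(line, phase_read);
      if (first_read < 0 || pos < first_read)
         first_read = pos;
   }

   void record_write(int line)
   {
      int pos = access_position(line, phase_write);
      if (first_dominating_write < 0 || pos < first_dominating_write)
         first_dominating_write = pos;
   }

   // True if some read happens before the first write in program order,
   // or if the temporary is read but never written.
   bool read_before_write() const
   {
      return first_read >= 0 &&
             (first_dominating_write < 0 ||
              first_read < first_dominating_write);
   }
};
```

With this encoding, an instruction such as "MOV TEMP[5], TEMP[5]" on line 3 records a read at position 6 and a write at position 7, so the read is seen as preceding the write.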
Post by Gert Wollny
+ if (first_dominant_write < 0) {
+
+ if (line != last_read || (rw == acc_write_cond_from_self))
+ first_dominant_write = line;
+
+ first_write_scope = scope;
Should this be renamed to first_dominant_write_scope?
Post by Gert Wollny
+ }
+
+ if (scope->is_conditional() && scope->in_loop())
+ keep_for_full_loop = true;
This is only necessary as long as we don't have a dominant write yet, right?
Post by Gert Wollny
+
+ }
+
+}
+
+inline lifetime make_lifetime(int b, int e)
+{
+ lifetime lt;
+ lt.begin = b;
+ lt.end = e;
+ return lt;
+}
+
+#include <iostream>
+using std::cerr;
Always put includes and usings at the top of a source file.
Post by Gert Wollny
+
+lifetime temp_access::get_required_lifetime()
+{
+
+ /* this temp is only read, this is undefined
+ * behaviour, so we can use the register otherwise */
+ if (!first_write_scope)
+ return make_lifetime(-1, -1);
+
+
+ /* Only written to, just make sure it doesn't overlap */
+ if (!last_read_scope)
+ return make_lifetime(first_dominant_write, last_write + 1);
Should this be first_write or first_dominant_write?

(Also, the kind of whitespace problems here that I don't want to repeat
everywhere)
Post by Gert Wollny
+
+
+ /* Undefined behaviour: read and write in the same instruction
+ * but never written elsewhere. Since it is written, we need to
+ * keep it nevertheless.
This doesn't actually need to be undefined behavior, depending on the
instruction. It's likely to be dead code though.

Also, the actual code below doesn't reflect the comment.
Post by Gert Wollny
+ * In this case the first dominanat write is not recorded and we use the
+ * first read to estimate the life time. This is not minimal, since another
+ * undefined first read could have happend before the first undefined
+ * write, but we don't care, because adding yet another tracking variable
+ * to handle this rare case of undefined behaviour doesn't make sense */
+ if (first_write_scope && first_dominant_write < 0) {
+ return make_lifetime(first_read, last_write + 1);
Don't you have to expand this to the extent of the outermost loop?
Post by Gert Wollny
+ }
+
+ const prog_scope *target_scope = last_read_scope;
+ int enclosing_scope_depth = -1;
+
+ /* we read before writing, conditional, and in a loop
+ * hence the value must survive the loops */
+ if ((first_read <= first_dominant_write) &&
+ first_read_scope->is_conditional() &&
+ first_read_scope->in_loop()) {
+ keep_for_full_loop = true;
+ target_scope = first_read_scope->outermost_loop();
+ }
+
+ /* Evaluate the scope that is shared by all three, first write, and
+ * first (conditional) read before write and last read. */
What's a conditional read, and why does it matter?
Post by Gert Wollny
+ while (enclosing_scope_depth < 0) {
Too many spaces on the left of <
Post by Gert Wollny
+ if (target_scope->contains(*first_write_scope)) {
+ enclosing_scope_depth = target_scope->nesting_depth();
+ } else if (first_write_scope->contains(*target_scope)) {
+ target_scope = first_write_scope;
+ enclosing_scope_depth = first_write_scope->nesting_depth();
+ } else {
+ target_scope = target_scope->parent();
+ assert(target_scope);
+ }
+ }
+
+ /* propagate the read scope to the target scope */
+ while (last_read_scope->nesting_depth() > enclosing_scope_depth) {
+ /* if the read is in a loop we need to extend the
+ * variables life time to the end of that loop
+ * because at this point it is not written in the same loop */
+ if (last_read_scope->type() == sct_loop)
+ last_read = last_read_scope->end();
+
+ last_read_scope = last_read_scope->parent();
+ }
+
+ /* If the first read is (conditionally) before the first write we
+ * have to keep the variable for the loop */
+ if ((first_write_scope->type() == sct_loop) &&
+ (first_read <= first_dominant_write)) {
+
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ /* propagate the first_write scope to the target scope */
+ while (enclosing_scope_depth < first_write_scope->nesting_depth()) {
+
+ /* propagate lifetime also if there was a continue/break
+ * in a loop and the write was after the continue/break inside
+ * that loop. Note that this is only needed if we move up in the
+ * scopes. */
+ if (first_write_scope->loop_continue_line() < first_dominant_write &&
+ first_write_scope->end() > first_dominant_write) {
+ keep_for_full_loop = true;
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+
+ first_write_scope = first_write_scope->parent();
+
+ /* Do the propagation again for the parent loop */
+ if (first_write_scope->type() == sct_loop) {
+ if (keep_for_full_loop || (first_read <= first_dominant_write)) {
+ first_dominant_write = first_write_scope->begin();
+ int lr = first_write_scope->end();
+ if (last_read < lr)
+ last_read = lr;
+ }
+ }
+
+ /* if we currently don't propagate the lifetime but
+ * the enclosing scope is a conditional within a loop
+ * up to the last-read level we need to propagate,
+ * todo: to tighten the life time check whether the value
+ * is written in all consitional code path below the loop */
+ if (!keep_for_full_loop &&
+ first_write_scope->is_conditional() &&
+ first_write_scope->in_loop())
+ keep_for_full_loop = true;
+
+ }
+
+ /* MOvin up from a loop into a conditional we might not yet have marked
+ * life-time to scope propagation. Hence, if the conditional we just moved
+ * to is within a loop, propagate in the next round. */
+ if (last_write > last_read)
+ last_read = last_write + 1;
+
+ /* Here we are at the same scope, all is resolved */
+ return make_lifetime(first_dominant_write, last_read);
I suspect that there are a lot of logical cleanups and simplifications
that you can achieve in this function by sticking to a straight story
of what every variable really means.

But please, first address the issue of multiple components and all the
style issues, then we can see what to do about this.
Post by Gert Wollny
+}
+
+ mem_ctx(mc),
+ current_slot(0)
+{
+ storage = ralloc_array(mem_ctx, prog_scope, n);
+}
+
+prog_scope_storage::~prog_scope_storage()
+{
+ ralloc_free(storage);
+}
+
+prog_scope*
+prog_scope_storage::create(prog_scope *p, e_scope_type type, int id,
+ int lvl, int s_begin)
+{
+ storage[current_slot] = prog_scope(p, type, id, lvl, s_begin);
+ return &storage[current_slot++];
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
new file mode 100644
index 0000000000..a4124b4659
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+
+struct lifetime {
+ int begin;
+ int end;
Document this: which "phases" of the corresponding instruction does the
lifetime include?

Cheers,
Nicolai
Post by Gert Wollny
+};
+
+void
+get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
+ int ntemps, struct lifetime *lifetimes);
--
Learn how the world really is,
but never forget how it should be.
Gert Wollny
2017-06-27 09:32:34 UTC
Thanks for your comments
Post by Nicolai Hähnle
Thanks for the update. First off, you're still not tracking
individual 
BGNLOOP
   MOV TEMP[1].x, ...
   UIF ...
     MOV TEMP[1].y, ...
   ENDIF
   use TEMP[1].y
ENDLOOP
Added test case and implemented.
Post by Nicolai Hähnle
+
Post by Gert Wollny
+enum e_scope_type {
Please drop the "e_" prefix here and below, we don't usually do that.
done. Do you also mean below? I usually do this to avoid name clashes
...
Post by Nicolai Hähnle
Post by Gert Wollny
+         /* Switch cases and default are handled at the same
nesting level
+          * like their enclosing switch */
Why? It seems surprising to mess with the invariant that 
parent->nesting_depth() == nesting_depth() - 1.
I'll change this.
Post by Nicolai Hähnle
+      case TGSI_OPCODE_CONT: {
Post by Gert Wollny
+         cur_scope->set_continue_line(line);
I'm still frankly confused about the way you choose to handle
BRK/CONT in loops, and suspect you're doing it wrong. At the very
least, having a function called "set_continue_line" be called for a
BRK is bad naming.
Well, I've also changed this. In any case, handling continue like break
only means that the required lifetime of some temporaries would be
overestimated, which is not a big problem (as compared to
underestimating it).
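
To make the over- versus underestimation point concrete, here is a minimal sketch of how lifetimes might be compared when deciding whether two temporaries can share a register. It assumes a lifetime covers the closed line range [begin, end] and that a negative begin marks an unused temporary. Both are assumptions of this sketch, and lifetimes_overlap and can_merge are hypothetical helpers, not functions from the patch.

```cpp
#include <cassert>

struct lifetime {
   int begin;
   int end;
};

// Assumption: a lifetime covers the closed line range [begin, end] and a
// negative begin marks a temporary that is never used.
inline bool lifetimes_overlap(const lifetime &a, const lifetime &b)
{
   if (a.begin < 0 || b.begin < 0)
      return false;
   return a.begin <= b.end && b.begin <= a.end;
}

// Two temporaries may share one register only if they are never live at
// the same time.
inline bool can_merge(const lifetime &a, const lifetime &b)
{
   return !lifetimes_overlap(a, b);
}
```

If a temporary that is really live over lines 2 to 5 is reported as 2 to 7, a merge with a temporary live over 6 to 9 is merely missed. If it is reported as 2 to 3, a merge with a temporary live over 4 to 9 is allowed and the value still needed at line 5 gets overwritten.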
Post by Nicolai Hähnle
Post by Gert Wollny
+         e_acc_type write_type = inst->op == TGSI_OPCODE_UCMP ?
Despite the opcode being called "Integer Conditional Move", it does 
write to dst unconditionally. It should probably have been called 
"select" or something like that.
I'm aware of that. The reason I'm tracking this explicitly is the
following: consider this line, with TEMP[5] never written before:

UCMP TEMP[5], IN[0], TEMP[5], IN[1]

TEMP[5] can be well defined after the write, so I have to take the
write into account as a dominant write. On the other hand

MOV TEMP[5], TEMP[5]

means that since the read from TEMP[5] is always undefined, the write
to TEMP[5] also is and I only have to make sure that TEMP[5] is not
merged with another register that would then be overwritten. Actually,
I would hope that the dead code elimination removes such statements.


[...]

+
Post by Nicolai Hähnle
Post by Gert Wollny
+      last_write = line;
+
+ /* If no first write is assigned check whether we deal with a
case where the temp is read and written in the same instructions,
because then it is not a dominant write, it may even be undefined.
Hence postpone the assignment if the first write, only mark that
the register was written at all by remembering a scope */
+
Also, I think the comment is wrong. It should count as a dominating 
write even if there's a read on the same line. So the special
handling here is wrong.
This is exactly the case that I commented on above: when the behavior
is undefined, should I keep the register alive? I opted for no.
Post by Nicolai Hähnle
Post by Gert Wollny
+      if (first_dominant_write < 0) {
+
+          if (line != last_read || (rw ==
acc_write_cond_from_self))
+             first_dominant_write = line;
+
+          first_write_scope = scope;
Should this be renamed to first_dominant_write_scope?
okay.
Post by Nicolai Hähnle
Post by Gert Wollny
+      }
+
+      if (scope->is_conditional() && scope->in_loop())
+         keep_for_full_loop = true;
This is only necessary as long as we don't have a dominant write yet, right?
I have a test case for this: if the dominant write is in a loop within
a conditional within a loop, then the propagation must be done when
moving up the scopes.
Post by Nicolai Hähnle
Post by Gert Wollny
+#include <iostream>
+using std::cerr;
Always put includes and usings at the top of a source file.
Sorry, that was supposed to be just a quick check that I wanted to
remove later, but forgot to do so.
Post by Nicolai Hähnle
Post by Gert Wollny
+
+
+   /* Undefined behaviour: read and write in the same instruction
+    * but never written elsewhere. Since it is written, we need to
+    * keep it nevertheless.
This doesn't actually need to be undefined behavior, depending on
the  instruction. It's likely to be dead code though.
CMIIW, but with the exception of UCMP I don't see a case where this is
not undefined behavior, and UCMP is handled differently.
Post by Nicolai Hähnle
Also, the actual code below doesn't reflect the comment.
I'll try to improve the comment.
Post by Nicolai Hähnle
Don't you have to expand this to the extent of the outermost loop?
Why? If it is undefined I don't need to keep the register around for
more than the instructions where it is written (hence the last_write
+1).
Post by Nicolai Hähnle
Post by Gert Wollny
+   /* Evaluate the scope that is shared by all three, first write,
and
+    * first (conditional) read before write and last read. */
What's a conditional read, and why does it matter?
An unconditional read before a first dominant write is undefined. A
conditional read before the dominant write (in a loop) is very likely
well defined and for that reason we have to take it into account. e.g.

Y=0
I=0
BGNLOOP
IF I > 1
ADD Y, X, Y
ENDIF
X = 1
I = I + 1
IF I > 5
BRK
ENDIF
ENDLOOP
OUT = Y

Here access to X is always well defined. With

Y=0
I=0
BGNLOOP
ADD Y, X, Y
X = 1
I = I +1
IF I > 5
BRK
ENDIF
ENDLOOP
OUT = Y

The read access to X in line 4 is undefined in the first round,
and hence Y and OUT are undefined. Hence, there is no point in keeping
X alive for the full loop like in the above case.

If, on the other hand we have

Y=0
I=0
BGNLOOP
UCMP Y, I, IN0, X
X = 1
I = I +1
IF I > 5
BRK
ENDIF
ENDLOOP
OUT = Y

then X must be kept alive for the whole loop.
Post by Nicolai Hähnle
Post by Gert Wollny
+
+   /* Here we are at the same scope, all is resolved */
+   return make_lifetime(first_dominant_write, last_read);
I suspect that there are a lot of logical cleanups and
simplifications that you can achieve in this function by sticking to
a straight story of what every variable really means.
But please, first address the issue of multiple components and all
the style issues, then we can see what to do about this.
Okay. Although I think that there are not many simplifications possible
that don't sacrifice a close-to-minimal estimated life time.

Best,
Gert
Nicolai Hähnle
2017-06-28 08:37:28 UTC
On 27.06.2017 11:32, Gert Wollny wrote:
[snip]
Post by Gert Wollny
Post by Nicolai Hähnle
Post by Gert Wollny
+enum e_scope_type {
Please drop the "e_" prefix here and below, we don't usually do that.
done. Do you also mean below? I usually do this to avoid name clashes
...
Yes, I also mean below.


[snip]
Post by Gert Wollny
Post by Nicolai Hähnle
+ case TGSI_OPCODE_CONT: {
Post by Gert Wollny
+ cur_scope->set_continue_line(line);
I'm still frankly confused about the way you choose to handle
BRK/CONT in loops, and suspect you're doing it wrong. At the very
least, having a function called "set_continue_line" be called for a
BRK is bad naming.
Well, I've also changed this. In any case, handling continue like break
only means that the required lifetime of some temporaries would be
overestimated, which is not a big problem (as compared to
underestimating it).
True, but it can also be confusing, so I think it's better to change.
Post by Gert Wollny
Post by Nicolai Hähnle
Post by Gert Wollny
+ e_acc_type write_type = inst->op == TGSI_OPCODE_UCMP ?
Despite the opcode being called "Integer Conditional Move", it does
write to dst unconditionally. It should probably have been called
"select" or something like that.
I'm aware of that, the reason why I'm tracking this explicitly is
UCMP TEMP[5], IN[0], TEMP[5], IN[1]
TEMP[5] can be well defined after the write, so I have to take the
write into account as a dominant write. On the other hand
MOV TEMP[5], TEMP[5]
means that since the read from TEMP[5] is always undefined, the write
to TEMP[5] also is and I only have to make sure that TEMP[5] is not
merged with another register that would then be overwritten. Actually,
I would hope that the dead code elimination removes such statements.
Hmm. I guess I'd have to look more carefully at how you do the lifetime
calculation at the end. I wouldn't be surprised if what you're doing is
correct, but I still think this is one of those instances where you're
making your own life too difficult by not having your concepts clear.

In both examples, you have two accesses: First, a read from the
temporary. Then, an unconditional write to the same temporary.

That is all that should matter.

For a different angle of attack:

AND TEMP[5], IN[0], TEMP[5]

Conceptually, this is *exactly* the same as your UCMP example! Even if
TEMP[5] is undefined before the instruction, it may or may not be
undefined afterwards (because IN[0] can be 0). A similar thing happens
with texturing instructions btw (because textures can be solid colors).

Really, asking whether the values in the temp registers are defined or
not is simply asking the wrong question. You can't really answer that
anyway, because even the inputs may be undefined. The right question to
ask is: Is the temp register guaranteed to have been written previously
in program order (i.e. above the read in the program text). And for that
question, there is simply no distinction between UCMP and everything else.
Post by Gert Wollny
Post by Nicolai Hähnle
Post by Gert Wollny
+ last_write = line;
+
+ /* If no first write is assigned check whether we deal with a
case where the temp is read and written in the same instructions,
because then it is not a dominant write, it may even be undefined.
Hence postpone the assignment if the first write, only mark that
the register was written at all by remembering a scope */
+
Also, I think the comment is wrong. It should count as a dominating
write even if there's a read on the same line. So the special
handling here is wrong.
This is exactly the case that I commented above, i.e. in the cases when
it is undefined behavior should I keep the register alive? I opted for
no.
So my comment above applies as well in turn :)

BTW, as an additional thought: When I started using "dominating" here, I
meant it in a sense that is derived from dominators in control flow
graphs. Write A dominates read B if every path through the program that
reaches B must go through A. Whether the value written by A is defined
or not is irrelevant.
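
For reference, the control-flow-graph notion of dominance can be sketched with the classic iterative set-intersection algorithm. This is only an illustration of the definition above, not what the patch does (the patch works on structured scopes rather than an explicit CFG). compute_dominators and its predecessor-list input are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes dominator sets for a CFG given as predecessor lists.
// dom[b][a] == true means block a dominates block b. Block 0 is the entry.
std::vector<std::vector<bool>>
compute_dominators(const std::vector<std::vector<int>> &preds)
{
   const std::size_t n = preds.size();
   std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));

   // The entry block is dominated only by itself.
   dom[0].assign(n, false);
   dom[0][0] = true;

   bool changed = true;
   while (changed) {
      changed = false;
      for (std::size_t b = 1; b < n; ++b) {
         // Intersect the dominator sets of all predecessors ...
         std::vector<bool> next(n, true);
         for (int p : preds[b])
            for (std::size_t a = 0; a < n; ++a)
               next[a] = next[a] && dom[p][a];
         // ... and add the block itself.
         next[b] = true;
         if (next != dom[b]) {
            dom[b] = next;
            changed = true;
         }
      }
   }
   return dom;
}
```

For an IF without ELSE, with block 1 holding the condition, block 2 the body and block 3 the code after ENDIF, a write in block 2 does not dominate a read in block 3, while a write in block 1 does.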
Post by Gert Wollny
Post by Nicolai Hähnle
Post by Gert Wollny
+ if (first_dominant_write < 0) {
+
+ if (line != last_read || (rw ==
acc_write_cond_from_self))
+ first_dominant_write = line;
+
+ first_write_scope = scope;
Should this be renamed to first_dominant_write_scope?
okay.
Post by Nicolai Hähnle
Post by Gert Wollny
+ }
+
+ if (scope->is_conditional() && scope->in_loop())
+ keep_for_full_loop = true;
This is only necessary as long as we don't have a dominant write yet, right?
I have a test case for this, if you have the dominant write in a loop
within conditional within loop then the propagation must be done when
moving up the scopes.
Maybe you're referring to this example:

BGNLOOP
IF ...
BGNLOOP
MOV TEMP[0], ...
ENDLOOP
ENDIF
ENDLOOP

In this case, the lifetime must span the entire outer loop.

However, consider this example, where you have an earlier dominating write:

BGNLOOP
MOV TEMP[0], ...
IF ...
BGNLOOP
MOV TEMP[0], ...
ENDLOOP
ENDIF
ENDLOOP

Here, the lifetime must span only from the first MOV to the last use
inside the outer loop.

I know, you can always argue that a conservative approximation of
lifetimes is okay, and I mostly agree.

But what *isn't* okay IMNSHO is getting a conservative approximation
because of code that looks like there was confusion about *why* the code
is doing what it does.

The above is a perfect example of that. There is a check for whether
there has been a dominating write *right there*, and setting the
keep_for_full_loop is only necessary if there hasn't been a dominating
write before, so why is that not inside the dominating write check? It
just reeks of coding by trial-and-error, and that's frankly not good enough.
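
A minimal sketch of the restructuring suggested here, with the loop flag set only while no dominating write has been seen. It is heavily simplified and not the patch's code: scope_info collapses the scope chain into a single hypothetical flag meaning "this access sits under a conditional that is itself nested in a loop", and an unconditional earlier write is assumed to dominate everything that follows it.

```cpp
#include <cassert>

// Hypothetical, flattened view of a scope: true if the access sits under a
// conditional that is itself nested inside a loop.
struct scope_info {
   bool conditional_in_loop;
};

// Illustrative tracker; the names do not match the patch's temp_access.
struct write_tracker {
   int first_dominating_write = -1;
   int last_write = -1;
   bool keep_for_full_loop = false;

   void record_write(int line, const scope_info &scope)
   {
      last_write = line;

      if (first_dominating_write >= 0)
         return; // an earlier write already dominates this one

      if (scope.conditional_in_loop)
         keep_for_full_loop = true;     // may be skipped in some iteration
      else
         first_dominating_write = line; // always executed before what follows
   }
};
```

In the first example above the only write is conditional, so the flag is set. In the second example the unconditional MOV is recorded first and the later conditional write leaves the flag untouched.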
Post by Gert Wollny
Post by Nicolai Hähnle
Post by Gert Wollny
+#include <iostream>
+using std::cerr;
Always put includes and usings at the top of a source file.
Sorry, that was supposed to be just a quick check that I wanted to
remove later, but forgot to do so.
Post by Nicolai Hähnle
Post by Gert Wollny
+
+
+ /* Undefined behaviour: read and write in the same instruction
+ * but never written elsewhere. Since it is written, we need to
+ * keep it nevertheless.
This doesn't actually need to be undefined behavior, depending on
the instruction. It's likely to be dead code though.
CMIIW, but with the exception of UCMP I don't see a case where this is
not undefined behavior, and UCMP is handled differently.
Well, there *are* other examples, like the AND I mentioned above, or
dually an OR, or UMUL/IMUL, or USLT and friends. Basically, like I've
argued before, UCMP *shouldn't* be handled differently.
Post by Gert Wollny
Post by Nicolai Hähnle
Also, the actual code below doesn't reflect the comment.
I'll try to improve the comment.
Post by Nicolai Hähnle
Don't you have to expand this to the extent of the outermost loop?
Why? If it is undefined I don't need to keep the register around for
more than the instructions where it is written (hence the last_write
+1).
Post by Nicolai Hähnle
Post by Gert Wollny
+ /* Evaluate the scope that is shared by all three, first write, and
+ * first (conditional) read before write and last read. */
What's a conditional read, and why does it matter?
An unconditional read before a first dominant write is undefined. A
conditional read before the dominant write (in a loop) is very likely
well defined and for that reason we have to take it into account. e.g.
Y=0
I=0
BGNLOOP
IF I > 1
ADD Y, X, Y
ENDIF
X = 1
I = I + 1
IF I > 5
BRK
ENDIF
ENDLOOP
OUT = Y
Here access to X is always well defined. With
Y=0
I=0
BGNLOOP
ADD Y, X, Y
X = 1
I = I +1
IF I > 5
BRK
ENDIF
ENDLOOP
OUT = Y
The read access to X in line 4 is undefined in the first round,
and hence Y and OUT are undefined. Hence, there is no point in keeping
X alive for the full loop like in the above case.
If, on the other hand we have
Y=0
I=0
BGNLOOP
UCMP Y, I, IN0, X
X = 1
I = I +1
IF I > 5
BRK
ENDIF
ENDLOOP
OUT = Y
then X must be kept alive for the whole loop.
Yeah, since there are more examples like this than UCMP, I think this
again falls under the category of making your own life unnecessarily
difficult.

BTW, in some sense perhaps you could think of what you're trying to do
here as taking a very aggressive stance on undefs like modern C/C++
compilers are doing. An argument can be made in favor of that, but as I
hope I've shown you, it's a minefield. Should (x & 0) be undefined if x
is undefined? Perhaps there are some languages that would like that and
would benefit from it, but I'd rather play it safe here and assume that
examples like what you've shown can be meaningful for any kind of
instruction (whether UCMP or otherwise).

If, at some later point, you could show that there are big benefits to
be had from making such undef-based optimizations, then we could
reconsider. But I doubt it, and in the meantime trying to do this just
muddles the concepts that this lifetime estimation is dealing with.

Also, just a reminder of the issue of writemasks and tracking components
separately.

Cheers,
Nicolai
Post by Gert Wollny
Post by Nicolai Hähnle
Post by Gert Wollny
+
+ /* Here we are at the same scope, all is resolved */
+ return make_lifetime(first_dominant_write, last_read);
I suspect that there are a lot of logical cleanups and
simplifications that you can achieve in this function by sticking to
a straight story of what every variable really means.
But please, first address the issue of multiple components and all
the style issues, then we can see what to do about this.
Okay. Although I think that there are not many simplifications possible
that don't sacrifice a close-to-minimal estimated life time.
Best,
Gert
--
Learn how the world really is,
but never forget how it should be.
Gert Wollny
2017-06-28 15:07:26 UTC
Hello Nicolai, 

thanks again for your insightful comments, I am really on a steep
learning curve here :) 

I've pushed all my work on this to github, but since I mostly went on
refactoring in baby steps you might not want to look at the commits
separately. 
Post by Nicolai Hähnle
Yes, I also mean below.
Okay, done.
Post by Nicolai Hähnle
[snip]
Post by Gert Wollny
Post by Nicolai Hähnle
+      case TGSI_OPCODE_CONT: {
Post by Gert Wollny
+         cur_scope->set_continue_line(line);
I'm still frankly confused about the way you choose to handle
BRK/CONT in loops, and suspect you're doing it wrong. At the very
least, having a function called "set_continue_line" be called for a
BRK is bad naming.
Well, I've also changed this. In any case, handling continue like break
only means that the required lifetime of some temporaries would be
overestimated, which is not a big problem (as compared to
underestimating it).
True, but it can also be confusing, so I think it's better to change.
Also done. I've also added some more tests to check that break is
properly handled. These tests were missing when I prepared v3, and for
that reason I had a green board on the tests and a failure with real
world applications.
Post by Nicolai Hähnle
Hmm. I guess I'd have to look more carefully at how you do the
lifetime calculation at the end. I wouldn't be surprised if what
you're doing is correct, but I still think this is one of those
instances where you're making your own life too difficult by not
having your concepts clear.
No, I was not correct, not even by accident, and I now understand that
you are absolutely right about this. Namely

BGNLOOP 
MUL TEMP[1], IN[0], TEMP[1]
...
ENDLOOP

failed to properly extend to the loop.
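
The rule behind this case can be sketched as follows, assuming all accesses to the temporary lie inside a single, non-nested loop. lifetime_in_loop and loop_range are hypothetical names for this illustration, and the closed-range meaning of lifetime is an assumption as well.

```cpp
#include <algorithm>
#include <cassert>

struct lifetime {
   int begin;
   int end;
};

struct loop_range {
   int begin; // line of BGNLOOP
   int end;   // line of ENDLOOP
};

// Assumes every access to the temporary lies inside the given loop and the
// loop is not nested. If the first read is not preceded by a write in
// program order, the value may come from the previous iteration, so the
// register has to stay reserved for the whole loop.
inline lifetime
lifetime_in_loop(int first_read, int first_dominating_write,
                 int last_read, int last_write, const loop_range &loop)
{
   if (first_read <= first_dominating_write)
      return lifetime{loop.begin, loop.end};

   return lifetime{first_dominating_write, std::max(last_read, last_write)};
}
```

For the MUL example, with BGNLOOP on line 0, the MUL on line 1 and ENDLOOP on line 3, the first read and the first write both happen on line 1, so the result spans lines 0 to 3.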
Post by Nicolai Hähnle
Post by Gert Wollny
I have a test case for this, if you have the dominant write in a
loop within conditional within loop then the propagation must be
done when moving up the scopes.
BGNLOOP
   IF ...
     BGNLOOP
       MOV TEMP[0], ...
     ENDLOOP
   ENDIF
ENDLOOP
In this case, the lifetime must span the entire outer loop.
Exactly.
Post by Nicolai Hähnle
BGNLOOP
   MOV TEMP[0], ...
   IF ...
     BGNLOOP
       MOV TEMP[0], ...
     ENDLOOP
   ENDIF
ENDLOOP
Here, the lifetime must span only from the first MOV to the last use 
inside the outer loop.
Indeed, and in my algorithm, the move within the IF statement is not
recorded as dominant. I added a specific test for this, but there was
no need to change the algorithm to make it pass.
Post by Nicolai Hähnle
The above is a perfect example of that. There is a check for whether 
there has been a dominating write *right there*, and setting the 
keep_for_full_loop is only necessary if there hasn't been a
dominating 
write before, so why is that not inside the dominating write check?
At this point I was about to argue that it has to be done the way I do
it, and then I checked my latest code again, and saw that I had moved
that part out of the while loop :)

I still have the break handling inside that while loop where I move the
dominant write scope up, and I do not yet see whether this could or
should also be resolved earlier.
Post by Nicolai Hähnle
It just reeks of coding by trial-and-error, and that's frankly not
good enough.
Well, I wouldn't call it "coding by trial-and-error". As a fan of TDD
I look at it differently: Defining tests and making them pass is
initially the aim and every short cut is allowed, but then one must
refactor the code to improve it, and with the test cases in place this
refactoring can be done without fear of breaking the code.
Post by Nicolai Hähnle
Also, just a reminder of the issue of writemasks and tracking
components separately.
This is already done.

My next step will be to (re-)check formatting, white spaces etc. I
already did a lot of cleaning up, but I want to be sure nothing slips
through - and of course some benchmarking.

many thanks again for your time,
Gert

Gert Wollny
2017-06-25 07:22:10 UTC
To prepare the implementation of a temp register lifetime tracker,
some of the classes are moved into separate header/implementation
files to make them accessible from other files.

Specifically these are:

class st_src_reg;
class st_dst_reg;
class glsl_to_tgsi_instruction;
struct rename_reg_pair;

int swizzle_for_type(const glsl_type *type, int component);

as inline:

bool is_resource_instruction(unsigned opcode);
unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
---
src/mesa/Makefile.sources | 2 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 288 +--------------------
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 205 +++++++++++++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 164 ++++++++++++
4 files changed, 374 insertions(+), 285 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index b80882fb8d..21f9167bda 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -507,6 +507,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_nir.cpp \
state_tracker/st_glsl_to_tgsi.cpp \
state_tracker/st_glsl_to_tgsi.h \
+ state_tracker/st_glsl_to_tgsi_private.cpp \
+ state_tracker/st_glsl_to_tgsi_private.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 7852941acd..528fc4cc64 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,6 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
+#include "st_glsl_to_tgsi_private.h"

#include "util/hash_table.h"
#include <algorithm>
@@ -65,251 +66,7 @@

#define MAX_GLSL_TEXTURE_OFFSET 4

-class st_src_reg;
-class st_dst_reg;
-
-static int swizzle_for_size(int size);
-
-static int swizzle_for_type(const glsl_type *type, int component = 0)
-{
- unsigned num_elements = 4;
-
- if (type) {
- type = type->without_array();
- if (type->is_scalar() || type->is_vector() || type->is_matrix())
- num_elements = type->vector_elements;
- }
-
- int swizzle = swizzle_for_size(num_elements);
- assert(num_elements + component <= 4);
-
- swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
- return swizzle;
-}
-
-/**
- * This struct is a corresponding struct to TGSI ureg_src.
- */
-class st_src_reg {
-public:
- st_src_reg(gl_register_file file, int index, const glsl_type *type,
- int component = 0, unsigned array_id = 0)
- {
- assert(file != PROGRAM_ARRAY || array_id != 0);
- this->file = file;
- this->index = index;
- this->swizzle = swizzle_for_type(type, component);
- this->negate = 0;
- this->abs = 0;
- this->index2D = 0;
- this->type = type ? type->base_type : GLSL_TYPE_ERROR;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = array_id;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = index2D;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->swizzle = 0;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- explicit st_src_reg(st_dst_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
- int negate:4; /**< NEGATE_XYZW mask from mesa */
- unsigned abs:1;
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- /*
- * Is this the second half of a double register pair?
- * currently used for input mapping only.
- */
- unsigned double_reg2:1;
- unsigned is_double_vertex_input:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-
- st_src_reg get_abs()
- {
- st_src_reg reg = *this;
- reg.negate = 0;
- reg.abs = 1;
- return reg;
- }
-};
-
-class st_dst_reg {
-public:
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = 0;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->writemask = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->array_id = 0;
- }
-
- explicit st_dst_reg(st_src_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-};
-
-st_src_reg::st_src_reg(st_dst_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->double_reg2 = false;
- this->array_id = reg.array_id;
- this->is_double_vertex_input = false;
-}
-
-st_dst_reg::st_dst_reg(st_src_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->writemask = WRITEMASK_XYZW;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->array_id = reg.array_id;
-}
-
-class glsl_to_tgsi_instruction : public exec_node {
-public:
- DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
-
- st_dst_reg dst[2];
- st_src_reg src[4];
- st_src_reg resource; /**< sampler, image or buffer register */
- st_src_reg *tex_offsets;
-
- /** Pointer to the ir source this tree came from for debugging */
- ir_instruction *ir;
-
- unsigned op:8; /**< TGSI opcode */
- unsigned saturate:1;
- unsigned is_64bit_expanded:1;
- unsigned sampler_base:5;
- unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
- unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
- glsl_base_type tex_type:5;
- unsigned tex_shadow:1;
- unsigned image_format:9;
- unsigned tex_offset_num_offset:3;
- unsigned dead_mask:4; /**< Used in dead code elimination */
- unsigned buffer_access:3; /**< buffer access type */
-
- const struct tgsi_opcode_info *info;
-};
+extern int swizzle_for_size(int size);

class variable_storage {
DECLARE_RZALLOC_CXX_OPERATORS(variable_storage)
@@ -390,11 +147,6 @@ find_array_type(struct inout_decl *decls, unsigned count, unsigned array_id)
return GLSL_TYPE_ERROR;
}

-struct rename_reg_pair {
- bool valid;
- int new_reg;
-};
-
struct glsl_to_tgsi_visitor : public ir_visitor {
public:
glsl_to_tgsi_visitor();
@@ -597,7 +349,7 @@ fail_link(struct gl_shader_program *prog, const char *fmt, ...)
prog->data->LinkStatus = linking_failure;
}

-static int
+int
swizzle_for_size(int size)
{
static const int size_swizzles[4] = {
@@ -611,40 +363,6 @@ swizzle_for_size(int size)
return size_swizzles[size - 1];
}

-static bool
-is_resource_instruction(unsigned opcode)
-{
- switch (opcode) {
- case TGSI_OPCODE_RESQ:
- case TGSI_OPCODE_LOAD:
- case TGSI_OPCODE_ATOMUADD:
- case TGSI_OPCODE_ATOMXCHG:
- case TGSI_OPCODE_ATOMCAS:
- case TGSI_OPCODE_ATOMAND:
- case TGSI_OPCODE_ATOMOR:
- case TGSI_OPCODE_ATOMXOR:
- case TGSI_OPCODE_ATOMUMIN:
- case TGSI_OPCODE_ATOMUMAX:
- case TGSI_OPCODE_ATOMIMIN:
- case TGSI_OPCODE_ATOMIMAX:
- return true;
- default:
- return false;
- }
-}
-
-static unsigned
-num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->num_dst;
-}
-
-static unsigned
-num_inst_src_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->is_tex || is_resource_instruction(op->op) ?
- op->info->num_src - 1 : op->info->num_src;
-}

glsl_to_tgsi_instruction *
glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
new file mode 100644
index 0000000000..705160552a
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
@@ -0,0 +1,205 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+
+using std::vector;
+
+extern int swizzle_for_size(int size);
+
+static int swizzle_for_type(const glsl_type *type, int component = 0)
+{
+ unsigned num_elements = 4;
+
+ if (type) {
+ type = type->without_array();
+ if (type->is_scalar() || type->is_vector() || type->is_matrix())
+ num_elements = type->vector_elements;
+ }
+
+ int swizzle = swizzle_for_size(num_elements);
+ assert(num_elements + component <= 4);
+
+ swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
+ return swizzle;
+}
+
+
+
+st_src_reg::st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component, unsigned array_id)
+{
+ assert(file != PROGRAM_ARRAY || array_id != 0);
+ this->file = file;
+ this->index = index;
+ this->swizzle = swizzle_for_type(type, component);
+ this->negate = 0;
+ this->abs = 0;
+ this->index2D = 0;
+ this->type = type ? type->base_type : GLSL_TYPE_ERROR;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = index2D;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->swizzle = 0;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+
+st_src_reg st_src_reg::get_abs()
+{
+ st_src_reg reg = *this;
+ reg.negate = 0;
+ reg.abs = 1;
+ return reg;
+}
+
+st_src_reg::st_src_reg(st_dst_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->double_reg2 = false;
+ this->array_id = reg.array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_dst_reg::st_dst_reg(st_src_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->writemask = WRITEMASK_XYZW;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->array_id = reg.array_id;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+st_dst_reg::st_dst_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->array_id = 0;
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.h b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
new file mode 100644
index 0000000000..d729bc008d
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
@@ -0,0 +1,164 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <mesa/main/mtypes.h>
+#include <compiler/glsl_types.h>
+#include <compiler/glsl/ir.h>
+#include <tgsi/tgsi_info.h>
+#include <stack>
+#include <vector>
+
+class st_dst_reg;
+
+/**
+ * This struct is a corresponding struct to TGSI ureg_src.
+ */
+class st_src_reg {
+public:
+ st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component = 0, unsigned array_id = 0);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D);
+
+ st_src_reg();
+
+ explicit st_src_reg(st_dst_reg reg);
+
+ st_src_reg get_abs();
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+
+ uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
+ int negate:4; /**< NEGATE_XYZW mask from mesa */
+ unsigned abs:1;
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ /*
+ * Is this the second half of a double register pair?
+ * currently used for input mapping only.
+ */
+ unsigned double_reg2:1;
+ unsigned is_double_vertex_input:1;
+ unsigned array_id:10;
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+
+};
+
+class st_dst_reg {
+public:
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index);
+
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type);
+
+ st_dst_reg();
+
+ explicit st_dst_reg(st_src_reg reg);
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ unsigned array_id:10;
+
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+};
+
+class glsl_to_tgsi_instruction : public exec_node {
+public:
+ DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
+
+ st_dst_reg dst[2];
+ st_src_reg src[4];
+ st_src_reg resource; /**< sampler or buffer register */
+ st_src_reg *tex_offsets;
+
+ /** Pointer to the ir source this tree came from for debugging */
+ ir_instruction *ir;
+
+ unsigned op:8; /**< TGSI opcode */
+ unsigned saturate:1;
+ unsigned is_64bit_expanded:1;
+ unsigned sampler_base:5;
+ unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
+ unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
+ glsl_base_type tex_type:5;
+ unsigned tex_shadow:1;
+ unsigned image_format:9;
+ unsigned tex_offset_num_offset:3;
+ unsigned dead_mask:4; /**< Used in dead code elimination */
+ unsigned buffer_access:3; /**< buffer access type */
+
+ const struct tgsi_opcode_info *info;
+};
+
+struct rename_reg_pair {
+ bool valid;
+ int new_reg;
+};
+
+inline bool
+is_resource_instruction(unsigned opcode)
+{
+ switch (opcode) {
+ case TGSI_OPCODE_RESQ:
+ case TGSI_OPCODE_LOAD:
+ case TGSI_OPCODE_ATOMUADD:
+ case TGSI_OPCODE_ATOMXCHG:
+ case TGSI_OPCODE_ATOMCAS:
+ case TGSI_OPCODE_ATOMAND:
+ case TGSI_OPCODE_ATOMOR:
+ case TGSI_OPCODE_ATOMXOR:
+ case TGSI_OPCODE_ATOMUMIN:
+ case TGSI_OPCODE_ATOMUMAX:
+ case TGSI_OPCODE_ATOMIMIN:
+ case TGSI_OPCODE_ATOMIMAX:
+ return true;
+ default:
+ return false;
+ }
+}
+
+inline unsigned
+num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->num_dst;
+}
+
+inline unsigned
+num_inst_src_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->is_tex || is_resource_instruction(op->op) ?
+ op->info->num_src - 1 : op->info->num_src;
+}
--
2.13.0
Nicolai Hähnle
2017-06-26 12:13:34 UTC
Post by Gert Wollny
To prepare the implementation of a temp register lifetime tracker,
some of the classes are moved into separate header/implementation
files to make them accessible from other files.
Specifically these are:
class st_src_reg;
class st_dst_reg;
class glsl_to_tgsi_instruction;
struct rename_reg_pair;
int swizzle_for_type(const glsl_type *type, int component);
as inline:
bool is_resource_instruction(unsigned opcode);
unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
---
src/mesa/Makefile.sources | 2 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 288 +--------------------
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 205 +++++++++++++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 164 ++++++++++++
4 files changed, 374 insertions(+), 285 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index b80882fb8d..21f9167bda 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -507,6 +507,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_nir.cpp \
state_tracker/st_glsl_to_tgsi.cpp \
state_tracker/st_glsl_to_tgsi.h \
+ state_tracker/st_glsl_to_tgsi_private.cpp \
+ state_tracker/st_glsl_to_tgsi_private.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 7852941acd..528fc4cc64 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,6 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
+#include "st_glsl_to_tgsi_private.h"
#include "util/hash_table.h"
#include <algorithm>
@@ -65,251 +66,7 @@
#define MAX_GLSL_TEXTURE_OFFSET 4
-class st_src_reg;
-class st_dst_reg;
-
-static int swizzle_for_size(int size);
-
-static int swizzle_for_type(const glsl_type *type, int component = 0)
-{
- unsigned num_elements = 4;
-
- if (type) {
- type = type->without_array();
- if (type->is_scalar() || type->is_vector() || type->is_matrix())
- num_elements = type->vector_elements;
- }
-
- int swizzle = swizzle_for_size(num_elements);
- assert(num_elements + component <= 4);
-
- swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
- return swizzle;
-}
-
-/**
- * This struct is a corresponding struct to TGSI ureg_src.
- */
-class st_src_reg {
-public:
- st_src_reg(gl_register_file file, int index, const glsl_type *type,
- int component = 0, unsigned array_id = 0)
- {
- assert(file != PROGRAM_ARRAY || array_id != 0);
- this->file = file;
- this->index = index;
- this->swizzle = swizzle_for_type(type, component);
- this->negate = 0;
- this->abs = 0;
- this->index2D = 0;
- this->type = type ? type->base_type : GLSL_TYPE_ERROR;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = array_id;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->type = type;
- this->file = file;
- this->index = index;
- this->index2D = index2D;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- st_src_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->swizzle = 0;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->double_reg2 = false;
- this->array_id = 0;
- this->is_double_vertex_input = false;
- }
-
- explicit st_src_reg(st_dst_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
- int negate:4; /**< NEGATE_XYZW mask from mesa */
- unsigned abs:1;
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- /*
- * Is this the second half of a double register pair?
- * currently used for input mapping only.
- */
- unsigned double_reg2:1;
- unsigned is_double_vertex_input:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-
- st_src_reg get_abs()
- {
- st_src_reg reg = *this;
- reg.negate = 0;
- reg.abs = 1;
- return reg;
- }
-};
-
-class st_dst_reg {
-public:
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = index;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
- {
- assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
- this->file = file;
- this->index = 0;
- this->index2D = 0;
- this->writemask = writemask;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->type = type;
- this->array_id = 0;
- }
-
- st_dst_reg()
- {
- this->type = GLSL_TYPE_ERROR;
- this->file = PROGRAM_UNDEFINED;
- this->index = 0;
- this->index2D = 0;
- this->writemask = 0;
- this->reladdr = NULL;
- this->reladdr2 = NULL;
- this->has_index2 = false;
- this->array_id = 0;
- }
-
- explicit st_dst_reg(st_src_reg reg);
-
- int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
- int16_t index2D;
- gl_register_file file:5; /**< PROGRAM_* from Mesa */
- unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
- enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
- unsigned has_index2:1;
- unsigned array_id:10;
-
- /** Register index should be offset by the integer in this reg. */
- st_src_reg *reladdr;
- st_src_reg *reladdr2;
-};
-
-st_src_reg::st_src_reg(st_dst_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->swizzle = SWIZZLE_XYZW;
- this->negate = 0;
- this->abs = 0;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->double_reg2 = false;
- this->array_id = reg.array_id;
- this->is_double_vertex_input = false;
-}
-
-st_dst_reg::st_dst_reg(st_src_reg reg)
-{
- this->type = reg.type;
- this->file = reg.file;
- this->index = reg.index;
- this->writemask = WRITEMASK_XYZW;
- this->reladdr = reg.reladdr;
- this->index2D = reg.index2D;
- this->reladdr2 = reg.reladdr2;
- this->has_index2 = reg.has_index2;
- this->array_id = reg.array_id;
-}
-
-class glsl_to_tgsi_instruction : public exec_node {
-public:
- DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
-
- st_dst_reg dst[2];
- st_src_reg src[4];
- st_src_reg resource; /**< sampler, image or buffer register */
- st_src_reg *tex_offsets;
-
- /** Pointer to the ir source this tree came from for debugging */
- ir_instruction *ir;
-
- unsigned op:8; /**< TGSI opcode */
- unsigned saturate:1;
- unsigned is_64bit_expanded:1;
- unsigned sampler_base:5;
- unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
- unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
- glsl_base_type tex_type:5;
- unsigned tex_shadow:1;
- unsigned image_format:9;
- unsigned tex_offset_num_offset:3;
- unsigned dead_mask:4; /**< Used in dead code elimination */
- unsigned buffer_access:3; /**< buffer access type */
-
- const struct tgsi_opcode_info *info;
-};
+extern int swizzle_for_size(int size);
extern is unnecessary, and this should be in a header file.
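A sketch of what this review comment asks for, shown as one listing with the two files marked by comments. The include-guard macro name and the function body are placeholders: the real lookup table of `swizzle_for_size` lies outside the quoted hunk and is not reproduced here. The point is only that a function declaration has external linkage by default, so `extern` adds nothing, and that a single declaration in the shared header replaces the copies in both .cpp files.

```cpp
// ---- st_glsl_to_tgsi_private.h (sketch) ----
#ifndef ST_GLSL_TO_TGSI_PRIVATE_H   /* guard name is an assumption */
#define ST_GLSL_TO_TGSI_PRIVATE_H

/* A plain declaration is enough; functions are extern by default. */
int swizzle_for_size(int size);

#endif /* ST_GLSL_TO_TGSI_PRIVATE_H */

// ---- st_glsl_to_tgsi.cpp (sketch) ----
// #include "st_glsl_to_tgsi_private.h" would go here in the real tree.
#include <cassert>

int
swizzle_for_size(int size)
{
   assert(size >= 1 && size <= 4);
   /* Placeholder body for illustration only: the real function looks
    * the swizzle up in a table that is not part of the quoted hunk. */
   return size - 1;
}
```

With the declaration in the header, a change to the signature is caught by the compiler in every file that includes it, instead of silently diverging between hand-written `extern` lines.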
Post by Gert Wollny
class variable_storage {
DECLARE_RZALLOC_CXX_OPERATORS(variable_storage)
@@ -390,11 +147,6 @@ find_array_type(struct inout_decl *decls, unsigned count, unsigned array_id)
return GLSL_TYPE_ERROR;
}
-struct rename_reg_pair {
- bool valid;
- int new_reg;
-};
-
struct glsl_to_tgsi_visitor : public ir_visitor {
public:
glsl_to_tgsi_visitor();
@@ -597,7 +349,7 @@ fail_link(struct gl_shader_program *prog, const char *fmt, ...)
prog->data->LinkStatus = linking_failure;
}
-static int
+int
swizzle_for_size(int size)
{
static const int size_swizzles[4] = {
@@ -611,40 +363,6 @@ swizzle_for_size(int size)
return size_swizzles[size - 1];
}
-static bool
-is_resource_instruction(unsigned opcode)
-{
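- switch (opcode) {
- case TGSI_OPCODE_RESQ:
- case TGSI_OPCODE_LOAD:
- case TGSI_OPCODE_ATOMUADD:
- case TGSI_OPCODE_ATOMXCHG:
- case TGSI_OPCODE_ATOMCAS:
- case TGSI_OPCODE_ATOMAND:
- case TGSI_OPCODE_ATOMOR:
- case TGSI_OPCODE_ATOMXOR:
- case TGSI_OPCODE_ATOMUMIN:
- case TGSI_OPCODE_ATOMUMAX:
- case TGSI_OPCODE_ATOMIMIN:
- case TGSI_OPCODE_ATOMIMAX:
- return true;
- default:
- return false;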
- }
-}
-
-static unsigned
-num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->num_dst;
-}
-
-static unsigned
-num_inst_src_regs(const glsl_to_tgsi_instruction *op)
-{
- return op->info->is_tex || is_resource_instruction(op->op) ?
- op->info->num_src - 1 : op->info->num_src;
-}
glsl_to_tgsi_instruction *
glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
new file mode 100644
index 0000000000..705160552a
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
@@ -0,0 +1,205 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_private.h"
+#include <tgsi/tgsi_info.h>
+#include <mesa/program/prog_instruction.h>
+
+using std::vector;
+
+extern int swizzle_for_size(int size);
Again, this needs to be in a header file.
Post by Gert Wollny
+
+static int swizzle_for_type(const glsl_type *type, int component = 0)
+{
+ unsigned num_elements = 4;
+
+ if (type) {
+ type = type->without_array();
+ if (type->is_scalar() || type->is_vector() || type->is_matrix())
+ num_elements = type->vector_elements;
+ }
+
+ int swizzle = swizzle_for_size(num_elements);
+ assert(num_elements + component <= 4);
+
+ swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
+ return swizzle;
+}
+
+
+
+st_src_reg::st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component, unsigned array_id)
+{
+ assert(file != PROGRAM_ARRAY || array_id != 0);
+ this->file = file;
+ this->index = index;
+ this->swizzle = swizzle_for_type(type, component);
+ this->negate = 0;
+ this->abs = 0;
+ this->index2D = 0;
+ this->type = type ? type->base_type : GLSL_TYPE_ERROR;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->type = type;
+ this->file = file;
+ this->index = index;
+ this->index2D = index2D;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+st_src_reg::st_src_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->swizzle = 0;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->double_reg2 = false;
+ this->array_id = 0;
+ this->is_double_vertex_input = false;
+}
+
+
+st_src_reg st_src_reg::get_abs()
+{
+ st_src_reg reg = *this;
+ reg.negate = 0;
+ reg.abs = 1;
+ return reg;
+}
Please move get_abs below the other constructor.
Post by Gert Wollny
+st_src_reg::st_src_reg(st_dst_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->swizzle = SWIZZLE_XYZW;
+ this->negate = 0;
+ this->abs = 0;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->double_reg2 = false;
+ this->array_id = reg.array_id;
+ this->is_double_vertex_input = false;
+}
+
+st_dst_reg::st_dst_reg(st_src_reg reg)
+{
+ this->type = reg.type;
+ this->file = reg.file;
+ this->index = reg.index;
+ this->writemask = WRITEMASK_XYZW;
+ this->reladdr = reg.reladdr;
+ this->index2D = reg.index2D;
+ this->reladdr2 = reg.reladdr2;
+ this->has_index2 = reg.has_index2;
+ this->array_id = reg.array_id;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = index;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+
+st_dst_reg::st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type)
+{
+ assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
+ this->file = file;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = writemask;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->type = type;
+ this->array_id = 0;
+}
+
+st_dst_reg::st_dst_reg()
+{
+ this->type = GLSL_TYPE_ERROR;
+ this->file = PROGRAM_UNDEFINED;
+ this->index = 0;
+ this->index2D = 0;
+ this->writemask = 0;
+ this->reladdr = NULL;
+ this->reladdr2 = NULL;
+ this->has_index2 = false;
+ this->array_id = 0;
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_private.h b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
new file mode 100644
index 0000000000..d729bc008d
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_private.h
@@ -0,0 +1,164 @@
+/*
+ * Copyright © 2010 Intel Corporation
+ * Copyright © 2011 Bryan Cain
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
Missing include guards.
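A minimal sketch of what such include guards could look like. The macro name is an assumption derived from the file name and is not taken from the patch; rename_reg_pair is used only as a stand-in for the header's declarations.

```cpp
/* Hypothetical guard name, chosen to match the file name. */
#ifndef ST_GLSL_TO_TGSI_PRIVATE_H
#define ST_GLSL_TO_TGSI_PRIVATE_H

/* The declarations of the header go between the guards, for example: */
struct rename_reg_pair {
   bool valid;
   int new_reg;
};

#endif /* ST_GLSL_TO_TGSI_PRIVATE_H */
```

With the guards in place the header can be included more than once in a translation unit without redefinition errors.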
Post by Gert Wollny
+
+#include <mesa/main/mtypes.h>
+#include <compiler/glsl_types.h>
+#include <compiler/glsl/ir.h>
+#include <tgsi/tgsi_info.h>
+#include <stack>
+#include <vector>
+
+class st_dst_reg;
+
+/**
+ * This struct is a corresponding struct to TGSI ureg_src.
+ */
+class st_src_reg {
+ st_src_reg(gl_register_file file, int index, const glsl_type *type,
+ int component = 0, unsigned array_id = 0);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type);
+
+ st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int index2D);
+
+ st_src_reg();
+
+ explicit st_src_reg(st_dst_reg reg);
+
+ st_src_reg get_abs();
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+
+ uint16_t swizzle; /**< SWIZZLE_XYZWONEZERO swizzles from Mesa. */
+ int negate:4; /**< NEGATE_XYZW mask from mesa */
+ unsigned abs:1;
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ /*
+ * Is this the second half of a double register pair?
+ * currently used for input mapping only.
+ */
+ unsigned double_reg2:1;
+ unsigned is_double_vertex_input:1;
+ unsigned array_id:10;
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+
+};
+
+class st_dst_reg {
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type, int index);
+
+ st_dst_reg(gl_register_file file, int writemask, enum glsl_base_type type);
+
+ st_dst_reg();
+
+ explicit st_dst_reg(st_src_reg reg);
+
+ int32_t index; /**< temporary index, VERT_ATTRIB_*, VARYING_SLOT_*, etc. */
+ int16_t index2D;
+ gl_register_file file:5; /**< PROGRAM_* from Mesa */
+ unsigned writemask:4; /**< Bitfield of WRITEMASK_[XYZW] */
+ enum glsl_base_type type:5; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
+ unsigned has_index2:1;
+ unsigned array_id:10;
+
+ /** Register index should be offset by the integer in this reg. */
+ st_src_reg *reladdr;
+ st_src_reg *reladdr2;
+};
+
+class glsl_to_tgsi_instruction : public exec_node {
+ DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
+
+ st_dst_reg dst[2];
+ st_src_reg src[4];
+ st_src_reg resource; /**< sampler or buffer register */
+ st_src_reg *tex_offsets;
+
+ /** Pointer to the ir source this tree came from for debugging */
+ ir_instruction *ir;
+
+ unsigned op:8; /**< TGSI opcode */
+ unsigned saturate:1;
+ unsigned is_64bit_expanded:1;
+ unsigned sampler_base:5;
+ unsigned sampler_array_size:6; /**< 1-based size of sampler array, 1 if not array */
+ unsigned tex_target:4; /**< One of TEXTURE_*_INDEX */
+ glsl_base_type tex_type:5;
+ unsigned tex_shadow:1;
+ unsigned image_format:9;
+ unsigned tex_offset_num_offset:3;
+ unsigned dead_mask:4; /**< Used in dead code elimination */
+ unsigned buffer_access:3; /**< buffer access type */
+
+ const struct tgsi_opcode_info *info;
+};
+
+struct rename_reg_pair {
+ bool valid;
+ int new_reg;
+};
+
+inline bool
+is_resource_instruction(unsigned opcode)
+{
+ switch (opcode) {
+ return true;
+ return false;
+ }
+}
+
+inline unsigned
+num_inst_dst_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->num_dst;
+}
+
+inline unsigned
+num_inst_src_regs(const glsl_to_tgsi_instruction *op)
+{
+ return op->info->is_tex || is_resource_instruction(op->op) ?
+ op->info->num_src - 1 : op->info->num_src;
+}
These three functions need a "static".

Cheers,
Nicolai
--
Learn how the world really is,
but never forget how it ought to be.
Gert Wollny
2017-06-25 07:22:14 UTC
The patch adds tests for the register rename mapping evaluation.
---
.../tests/test_glsl_to_tgsi_lifetime.cpp | 94 ++++++++++++++++++++++
1 file changed, 94 insertions(+)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 5f3378637a..f53b5c23a1 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -89,6 +89,13 @@ protected:
void check(const vector<lifetime>& result, const expectation& e);
};

+/* With this test class the rename mapping evaluation is tested */
+class RegisterRemapping : public MesaTestWithMemCtx {
+protected:
+ void run(const vector<lifetime>& lt, const vector<int>& expect);
+};
+
+
/* This test class checks that the life time covers at least
* the expected range. It is used for cases where we know that
* the implementation could be improved on estimating the minimal
@@ -466,6 +473,29 @@ TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
run (code, expectation({{-1,-1},{0, 9}}));
}

+/* Here we read and write to the same temp, but it is conditional,
+ * so the lifetime must start with the first read */
+TEST_F(LifetimeEvaluatorExactTest, WriteConditionallyFromSelf)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_USEQ, {0}, {in0, in1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_FSLT, {2}, {1, in1}, {}},
+ {TGSI_OPCODE_UIF, {2}, {}, {}},
+ {TGSI_OPCODE_MOV, {3}, {in1}, {}},
+ {TGSI_OPCODE_ELSE},
+ {TGSI_OPCODE_MOV, {4}, {in1}, {}},
+ {TGSI_OPCODE_MOV, {4}, {4}, {}},
+ {TGSI_OPCODE_MOV, {3}, {4}, {}},
+ {TGSI_OPCODE_ENDIF},
+ {TGSI_OPCODE_MOV,{out1}, {3}, {}},
+ {TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1, 5}, {5, 6}, {7, 13}, {9, 11}}));
+}

TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
{
@@ -831,6 +861,47 @@ TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
run (code, expectation({{-1,-1},{0, 8}}));
}

+TEST_F(RegisterRemapping, RegisterRemapping1)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {1, 2},
+ {2, 10},
+ {3, 5},
+ {5, 10}
+ });
+
+ vector<int> expect({0, 1, 2, 1, 1, 2, 2});
+ run(lt, expect);
+}
+
+
+TEST_F(RegisterRemapping, RegisterRemapping2)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {3, 3},
+ {4, 4},
+ });
+ vector<int> expect({0, 1, 2, 1, 1});
+ run(lt, expect);
+}
+
+TEST_F(RegisterRemapping, RegisterRemappingMergeAll)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {1, 2},
+ {2, 3},
+ {3, 4},
+ });
+ vector<int> expect({0, 1, 1, 1, 1});
+ run(lt, expect);
+}
+
+
/* Implementation of helper and test classes */

MockShader::~MockShader()
@@ -974,3 +1045,26 @@ void LifetimeEvaluatorAtLeastTest::check( const vector<lifetime>& lifetimes,
EXPECT_GE(lifetimes[i].end, e[i][1]);
}
}
+
+void RegisterRemapping::run(const vector<lifetime>& lt,
+ const vector<int>& expect)
+{
+ rename_reg_pair proto{false, 0};
+ vector<rename_reg_pair> result(lt.size(), proto);
+
+ get_temp_registers_remapping(mem_ctx, lt.size(), &lt[0], &result[0]);
+
+ vector<int> remap(lt.size());
+ for (unsigned i = 0; i < lt.size(); ++i) {
+ remap[i] = result[i].valid ? result[i].new_reg : i;
+ }
+
+ std::transform(remap.begin(), remap.end(), result.begin(), remap.begin(),
+ [](int x, const rename_reg_pair& rn) {
+ return rn.valid ? rn.new_reg : x;
+ });
+
+ for(unsigned i = 1; i < remap.size(); ++i) {
+ EXPECT_EQ(remap[i], expect[i]);
+ }
+}
--
2.13.0
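The expectations in the RegisterRemapping tests above can be reproduced with a small stand-alone sketch. This is not the patch's get_temp_registers_remapping(); it is a simplified greedy scan that assumes, as the RegisterRemappingMergeAll test suggests, that a register whose lifetime begins at the instruction where another one ends may reuse that register. The names lifetime_sketch and remap_by_lifetime are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

/* Mirrors the {begin, end} pairs of the tests; begin < 0 marks an unused temp. */
struct lifetime_sketch {
   int begin;
   int end;
};

/* Hypothetical helper: returns for each register the register it is merged
 * into, or its own index if it is kept. */
std::vector<int> remap_by_lifetime(const std::vector<lifetime_sketch>& lt)
{
   struct record { int begin, end, reg; bool merged; };
   std::vector<record> recs;
   for (std::size_t i = 0; i < lt.size(); ++i) {
      if (lt[i].begin >= 0) {
         assert(lt[i].begin <= lt[i].end);
         recs.push_back({lt[i].begin, lt[i].end, static_cast<int>(i), false});
      }
   }

   /* Sort by first use; equal begins keep the lower register index first. */
   std::stable_sort(recs.begin(), recs.end(),
                    [](const record& a, const record& b) {
                       return a.begin < b.begin;
                    });

   std::vector<int> remap(lt.size());
   for (std::size_t i = 0; i < remap.size(); ++i)
      remap[i] = static_cast<int>(i);

   /* For each surviving target, absorb every later register that starts
    * no earlier than the (growing) end of the target's lifetime. */
   for (std::size_t t = 0; t < recs.size(); ++t) {
      if (recs[t].merged)
         continue;
      for (std::size_t s = t + 1; s < recs.size(); ++s) {
         if (!recs[s].merged && recs[s].begin >= recs[t].end) {
            remap[recs[s].reg] = recs[t].reg;
            recs[t].end = recs[s].end;
            recs[s].merged = true;
         }
      }
   }
   return remap;
}
```

The scan is quadratic in the number of used temporaries, which keeps the sketch short; the timings quoted in this thread (qsort versus std::sort) indicate that the actual implementation sorts the lifetimes as well, but its search strategy is not shown here.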
Nicolai Hähnle
2017-06-26 13:27:03 UTC
Post by Gert Wollny
The patch adds tests for the register rename mapping evaluation.
---
.../tests/test_glsl_to_tgsi_lifetime.cpp | 94 ++++++++++++++++++++++
1 file changed, 94 insertions(+)
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 5f3378637a..f53b5c23a1 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
void check(const vector<lifetime>& result, const expectation& e);
};
+/* With this test class the rename mapping evaluation is tested */
+class RegisterRemapping : public MesaTestWithMemCtx {
+ void run(const vector<lifetime>& lt, const vector<int>& expect);
+};
+
+
/* This test class checks that the life time covers at least
* the expected range. It is used for cases where we know that
* the implementation could be improved on estimating the minimal
@@ -466,6 +473,29 @@ TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
run (code, expectation({{-1,-1},{0, 9}}));
}
+/* Here we read and write to the same temp, but it is conditional,
+ * so the lifetime must start with the first read */
+TEST_F(LifetimeEvaluatorExactTest, WriteConditionallyFromSelf)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_USEQ, {0}, {in0, in1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_UCMP, {1}, {0, in1, 1}, {}},
+ {TGSI_OPCODE_FSLT, {2}, {1, in1}, {}},
+ {TGSI_OPCODE_UIF, {2}, {}, {}},
Shouldn't 2 be a src here?
Post by Gert Wollny
+ {TGSI_OPCODE_MOV, {3}, {in1}, {}},
+ {TGSI_OPCODE_ELSE},
+ {TGSI_OPCODE_MOV, {4}, {in1}, {}},
+ {TGSI_OPCODE_MOV, {4}, {4}, {}},
+ {TGSI_OPCODE_MOV, {3}, {4}, {}},
+ {TGSI_OPCODE_ENDIF},
+ {TGSI_OPCODE_MOV,{out1}, {3}, {}},
+ {TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1, 5}, {5, 6}, {7, 13}, {9, 11}}));
How can this work? The lifetime of 0 should be {0, 4}.
Post by Gert Wollny
+}
TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
{
@@ -831,6 +861,47 @@ TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
run (code, expectation({{-1,-1},{0, 8}}));
}
+TEST_F(RegisterRemapping, RegisterRemapping1)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {1, 2},
+ {2, 10},
+ {3, 5},
+ {5, 10}
+ });
+
+ vector<int> expect({0, 1, 2, 1, 1, 2, 2});
+ run(lt, expect);
+}
+
+
+TEST_F(RegisterRemapping, RegisterRemapping2)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {0, 2},
+ {3, 3},
+ {4, 4},
Is {3, 3} ever a legitimate lifetime?

Cheers,
Nicolai
Post by Gert Wollny
+ });
+ vector<int> expect({0, 1, 2, 1, 1});
+ run(lt, expect);
+}
+
+TEST_F(RegisterRemapping, RegisterRemappingMergeAll)
+{
+ vector<lifetime> lt({{-1,-1},
+ {0, 1},
+ {1, 2},
+ {2, 3},
+ {3, 4},
+ });
+ vector<int> expect({0, 1, 1, 1, 1});
+ run(lt, expect);
+}
+
+
/* Implementation of helper and test classes */
MockShader::~MockShader()
@@ -974,3 +1045,26 @@ void LifetimeEvaluatorAtLeastTest::check( const vector<lifetime>& lifetimes,
EXPECT_GE(lifetimes[i].end, e[i][1]);
}
}
+
+void RegisterRemapping::run(const vector<lifetime>& lt,
+ const vector<int>& expect)
+{
+ rename_reg_pair proto{false, 0};
+ vector<rename_reg_pair> result(lt.size(), proto);
+
+ get_temp_registers_remapping(mem_ctx, lt.size(), &lt[0], &result[0]);
+
+ vector<int> remap(lt.size());
+ for (unsigned i = 0; i < lt.size(); ++i) {
+ remap[i] = result[i].valid ? result[i].new_reg : i;
+ }
+
+ std::transform(remap.begin(), remap.end(), result.begin(), remap.begin(),
+ [](int x, const rename_reg_pair& rn) {
+ return rn.valid ? rn.new_reg : x;
+ });
+
+ for(unsigned i = 1; i < remap.size(); ++i) {
+ EXPECT_EQ(remap[i], expect[i]);
+ }
+}
--
Learn how the world really is,
but never forget how it ought to be.
Gert Wollny
2017-06-25 07:22:15 UTC
This patch ties in the new temporary register lifetime estiamtion and
rename mapping evaluation. In order to enable it, the environment
variable MESA_GLSL_TO_TGSI_NEW_MERGE must be set.

The performance of the current and the new implementation was
compared by running shader-db in one thread; numbers are in
% of the total run.

-------------------------------------------------------------
                    old          new(qsort)   new(std::sort)

------------------------- valgrind --------------------------
merge               0.21         0.20         0.13
estimate lifetime   0.03         0.05         0.05
evaluate mapping    (incl=0.16)  0.12         0.06
apply mapping       0.02         0.02         0.02

---- perf (approximate because of statistic sampling) -------
merge               0.24         0.20         0.14
estimate lifetime   0.03         0.05         0.05
evaluate mapping    (incl=0.16)  0.10         0.04
apply mapping       0.05         0.05         0.05
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 528fc4cc64..d4abee9d02 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,7 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
-#include "st_glsl_to_tgsi_private.h"
+#include "st_glsl_to_tgsi_temprename.h"

#include "util/hash_table.h"
#include <algorithm>
@@ -322,6 +322,7 @@ public:

void merge_two_dsts(void);
void merge_registers(void);
+ void merge_registers_alternative(void);
void renumber_registers(void);

void emit_block_mov(ir_assignment *ir, const struct glsl_type *type,
@@ -5139,6 +5140,23 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
}
}

+void
+glsl_to_tgsi_visitor::merge_registers_alternative(void)
+{
+ struct rename_reg_pair *renames =
+ rzalloc_array(mem_ctx, struct rename_reg_pair, this->next_temp);
+ struct lifetime *lifetimes =
+ rzalloc_array(mem_ctx, struct lifetime, this->next_temp);
+
+ get_temp_registers_required_lifetimes(mem_ctx, &this->instructions,
+ this->next_temp, lifetimes);
+ get_temp_registers_remapping(mem_ctx, this->next_temp, lifetimes, renames);
+ rename_temp_registers(renames);
+
+ ralloc_free(lifetimes);
+ ralloc_free(renames);
+}
+
/* Merges temporary registers together where possible to reduce the number of
* registers needed to run a program.
*
@@ -6603,8 +6621,13 @@ get_mesa_program_tgsi(struct gl_context *ctx,
while (v->eliminate_dead_code());

v->merge_two_dsts();
- if (!skip_merge_registers)
- v->merge_registers();
+ if (!skip_merge_registers) {
+ if (getenv("MESA_GLSL_TO_TGSI_NEW_MERGE") != NULL)
+ v->merge_registers_alternative();
+ else
+ v->merge_registers();
+ }
+
v->renumber_registers();

/* Write the END instruction. */
--
2.13.0
Nicolai Hähnle
2017-06-26 13:29:33 UTC
Post by Gert Wollny
This patch ties in the new temporary register lifetime estiamtion and
Tpyo: estimation
Post by Gert Wollny
rename mapping evaluation. In order to enable it, the environment
variable MESA_GLSL_TO_TGSI_NEW_MERGE must be set.
This makes sense during development, but I'd say the goal here is to
either merge this series unconditionally or not at all. Too many
configuration options are poison.
Post by Gert Wollny
The performance of the current and the new implementation was
compared by running shader-db in one thread; numbers are in
% of the total run.
-------------------------------------------------------------
                    old          new(qsort)   new(std::sort)
------------------------- valgrind --------------------------
merge               0.21         0.20         0.13
estimate lifetime   0.03         0.05         0.05
evaluate mapping    (incl=0.16)  0.12         0.06
apply mapping       0.02         0.02         0.02
---- perf (approximate because of statistic sampling) -------
merge               0.24         0.20         0.14
estimate lifetime   0.03         0.05         0.05
evaluate mapping    (incl=0.16)  0.10         0.04
apply mapping       0.05         0.05         0.05
Please provide total running times as well.

Apart from that, the patch looks good.

Cheers,
Nicolai
Post by Gert Wollny
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 528fc4cc64..d4abee9d02 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,7 +55,7 @@
#include "st_glsl_types.h"
#include "st_nir.h"
#include "st_shader_cache.h"
-#include "st_glsl_to_tgsi_private.h"
+#include "st_glsl_to_tgsi_temprename.h"
#include "util/hash_table.h"
#include <algorithm>
void merge_two_dsts(void);
void merge_registers(void);
+ void merge_registers_alternative(void);
void renumber_registers(void);
void emit_block_mov(ir_assignment *ir, const struct glsl_type *type,
@@ -5139,6 +5140,23 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
}
}
+void
+glsl_to_tgsi_visitor::merge_registers_alternative(void)
+{
+ struct rename_reg_pair *renames =
+ rzalloc_array(mem_ctx, struct rename_reg_pair, this->next_temp);
+ struct lifetime *lifetimes =
+ rzalloc_array(mem_ctx, struct lifetime, this->next_temp);
+
+ get_temp_registers_required_lifetimes(mem_ctx, &this->instructions,
+ this->next_temp, lifetimes);
+ get_temp_registers_remapping(mem_ctx, this->next_temp, lifetimes, renames);
+ rename_temp_registers(renames);
+
+ ralloc_free(lifetimes);
+ ralloc_free(renames);
+}
+
/* Merges temporary registers together where possible to reduce the number of
* registers needed to run a program.
*
@@ -6603,8 +6621,13 @@ get_mesa_program_tgsi(struct gl_context *ctx,
while (v->eliminate_dead_code());
v->merge_two_dsts();
- if (!skip_merge_registers)
- v->merge_registers();
+ if (!skip_merge_registers) {
+ if (getenv("MESA_GLSL_TO_TGSI_NEW_MERGE") != NULL)
+ v->merge_registers_alternative();
+ else
+ v->merge_registers();
+ }
+
v->renumber_registers();
/* Write the END instruction. */
--
Learn how the world really is,
but never forget how it ought to be.
Gert Wollny
2017-06-25 07:22:12 UTC
This patch adds a set of unit tests for the new lifetime tracker.
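As a rough illustration of what the simplest of these tests check, the following sketch computes lifetimes for straight-line code only: a temporary lives from its first write to its last read. It is not the patch's lifetime tracker and ignores loops, conditionals and texture offsets. It assumes, as the test constants suggest, that positive indices name temporaries while zero and negative indices name inputs or outputs. The names mock_line, temp_range and straight_line_lifetimes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* One instruction, reduced to its destination and source register indices. */
struct mock_line {
   std::vector<int> dst;
   std::vector<int> src;
};

/* {begin, end} in instruction numbers; {-1, -1} means "never used". */
struct temp_range {
   int begin;
   int end;
};

/* Hypothetical helper for straight-line code only. */
std::vector<temp_range>
straight_line_lifetimes(const std::vector<mock_line>& code, int ntemps)
{
   assert(ntemps >= 0);
   std::vector<temp_range> lt(static_cast<std::size_t>(ntemps),
                              temp_range{-1, -1});

   for (std::size_t i = 0; i < code.size(); ++i) {
      const int line = static_cast<int>(i);

      /* Every read of a temporary extends its lifetime to this line. */
      for (int r : code[i].src) {
         if (r > 0 && r < ntemps)
            lt[r].end = line;
      }

      /* The first write of a temporary starts its lifetime. */
      for (int r : code[i].dst) {
         if (r > 0 && r < ntemps && lt[r].begin < 0) {
            lt[r].begin = line;
            if (lt[r].end < line)
               lt[r].end = line;
         }
      }
   }
   return lt;
}
```

Fed with the three instructions of the SimpleMoveAddMove test (MOV, UADD, MOV), the sketch yields {-1,-1}, {0,1}, {1,2}, which matches the expectation given in that test.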
---
configure.ac | 1 +
src/mesa/Makefile.am | 2 +-
src/mesa/state_tracker/tests/Makefile.am | 36 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 976 +++++++++++++++++++++
4 files changed, 1014 insertions(+), 1 deletion(-)
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp

diff --git a/configure.ac b/configure.ac
index da7b2f8f81..5279b231ed 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2839,6 +2839,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/Makefile])
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..a88a94165d 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.

-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests

if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 0000000000..fb64cf9dc2
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,36 @@
+AM_CFLAGS = \
+ $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+ $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+ -I$(top_srcdir)/src/gtest/include \
+ -I$(top_srcdir)/src \
+ -I$(top_srcdir)/src/mapi \
+ -I$(top_builddir)/src/mesa \
+ -I$(top_srcdir)/src/mesa \
+ -I$(top_srcdir)/include \
+ -I$(top_srcdir)/src/gallium/include \
+ -I$(top_srcdir)/src/gallium/auxiliary \
+ $(DEFINES)
+
+TESTS = st-renumerate-test
+check_PROGRAMS = st-renumerate-test
+
+st_renumerate_test_SOURCES = \
+ test_glsl_to_tgsi_lifetime.cpp
+
+st_renumerate_test_LDFLAGS = \
+ $(LLVM_LDFLAGS)
+
+st_renumerate_test_LDADD = \
+ $(top_builddir)/src/mesa/libmesagallium.la \
+ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \
+ $(top_builddir)/src/gallium/auxiliary/libgallium.la \
+ $(top_builddir)/src/util/libmesautil.la \
+ $(top_builddir)/src/gtest/libgtest.la \
+ $(GALLIUM_COMMON_LIB_DEPS) \
+ $(LLVM_LIBS) \
+ $(PTHREAD_LIBS) \
+ $(DLOPEN_LIBS)
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
new file mode 100644
index 0000000000..5f3378637a
--- /dev/null
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -0,0 +1,976 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <state_tracker/st_glsl_to_tgsi_temprename.h>
+#include <tgsi/tgsi_ureg.h>
+#include <tgsi/tgsi_info.h>
+#include <compiler/glsl/list.h>
+#include <gtest/gtest.h>
+
+using std::vector;
+using std::pair;
+
+/* A line to describe a TGSI instruction for building mock shaders */
+struct MockCodeline {
+ MockCodeline(unsigned _op): op(_op) {}
+ MockCodeline(unsigned _op, const vector<int>& _dst, const vector<int>& _src, const vector<int>&_to):
+ op(_op), dst(_dst), src(_src), tex_offsets(_to){}
+ unsigned op;
+ vector<int> dst;
+ vector<int> src;
+ vector<int> tex_offsets;
+};
+
+const int in0 = 0;
+const int in1 = -1;
+const int in2 = -2;
+
+const int out0 = 0;
+const int out1 = -1;
+
+class MockShader {
+public:
+ MockShader(const vector<MockCodeline>& source);
+ ~MockShader();
+
+ void free();
+
+ exec_list* get_program();
+ int get_num_temps();
+private:
+ st_src_reg create_src_register(int src_idx);
+ st_dst_reg create_dst_register(int dst_idx);
+ exec_list* program;
+ int num_temps;
+ void *mem_ctx;
+};
+
+using expectation = vector<vector<int>>;
+
+
+class MesaTestWithMemCtx : public testing::Test {
+ void SetUp();
+ void TearDown();
+protected:
+ void *mem_ctx;
+};
+
+class LifetimeEvaluatorTest : public MesaTestWithMemCtx {
+protected:
+ void run(const vector<MockCodeline>& code, const expectation& e);
+private:
+ virtual void check(const vector<lifetime>& result, const expectation& e) = 0;
+};
+
+/* This is a test class to check the exact life times of
+ * registers. */
+class LifetimeEvaluatorExactTest : public LifetimeEvaluatorTest {
+protected:
+ void check(const vector<lifetime>& result, const expectation& e);
+};
+
+/* This test class checks that the life time covers at least
+ * the expected range. It is used for cases where we know that
+ * the implementation could be improved on estimating the minimal
+ * life time.
+ */
+class LifetimeEvaluatorAtLeastTest : public LifetimeEvaluatorTest {
+protected:
+ void check(const vector<lifetime>& result, const expectation& e);
+};
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {1, in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMove)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}, {1,2}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {}, {1,2}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,2}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 5}, {2,3}, {3, 6}}));
+}
+
+
+/* in loop if/else value written only in one path, and read later
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, MoveInIfInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {5, 8}}));
+}
+
+
+/* in loop if/else value written in both paths, and read later
+ * - value must survive from first write to last read in loop
+ * for now we only check that the minimum life time is correct */
+TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {3,7}, {7, 10}}));
+}
+
+/* in loop if/else value written in both paths, read in else path
+ * before written and also read later
+ * - value must survive from first write to last read in loop */
+TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseReadInElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_ADD, {2}, {1, 2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {1,9}, {7, 10}}));
+}
+
+/* in loop if/else read in one path before written in the same loop
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, ReadInIfInLoopBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, 3}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {1, 8}}));
+}
+
+/* Write in nested ifs in loop, for now we do test whether the
+ * life time is at least what is required, but we know that the
+ * implementation doesn't do a full check and sets larger boundaries */
+TEST_F(LifetimeEvaluatorAtLeastTest, NestedIfInLoopAlwaysWriteButNotPropagated)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3, 14}}));
+}
+
+
+
+TEST_F(LifetimeEvaluatorExactTest, NestedIfInLoopWriteNotAlways)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 13}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop, but not further */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for all
+ * loops up to the read scope, but not further */
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
+}
+
+/* Temporary used in switch must live through all case statements */
+TEST_F(LifetimeEvaluatorExactTest, UseSwitchCase)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ {TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_DEFAULT},
+ {TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 3}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, WriteTwoOnlyUseOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+
+}
+
+/* if a break is in the loop, but inside a switch case, so it
+ * refers to that inner loop. The variable has to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitchInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_DEFAULT, {}, {}, {}},
+ { TGSI_OPCODE_ENDSWITCH, {}, {}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{2, 10}}));
+}
+
+/* value read/write in different loops, conditional */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* value written in one case, and read in other, in loop
+ * - must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* value read/write in same case, stays there */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,4}}));
+}
+
+/* value read/write in all cases, should only live from first
+ * write to last read, but currently the whole loop is used. */
+TEST_F(LifetimeEvaluatorAtLeastTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,9}}));
+}
+
+/* value read/write in different loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopes)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1,5}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+/* The variable is conditionally read before first written, so
+ * it has to survive all the loops. */
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstructionInLoopAndCondition)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF },
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+/* If unconditionally first written and read in the same
+ * instruction, then the register must be kept for the
+ * one write, but not more (undefined behaviour) */
+
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstruction)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* If unconditionally written and read in the same
+ * instruction several times, then the register must be
+ * kept until after the last write, but not longer (undefined behaviour) */
+
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstructionMoreThenOnce)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+
+/* register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least one instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnly)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* register read in if */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForIf)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {in0, in1}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_ENDIF}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+/* register read in switch and cases */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForSwitchAndCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_END, {}, {1}, {}},
+ };
+ run (code, expectation({{-1,-1},{0,3}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferentScopesCondReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, WriteTwoReadOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SomeScopesAndNoEndProgramId)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ };
+ run (code, expectation({{-1,-1},{0,4}, {2,5}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SerialReadWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {3}, {2}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {1,2}, {2,3}}));
+}
+
+
+/* Check that two destination registers are used */
+TEST_F(LifetimeEvaluatorExactTest, TwoDestRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {1,2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}}));
+}
+
+/* Value written in an inner loop inside a conditional and read
+ * outside of it: it must survive the whole outer loop */
+TEST_F(LifetimeEvaluatorExactTest, WriteInLoopInConditionalReadOutside)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP},
+ { TGSI_OPCODE_MOV, {1}, {in1}, {}},
+ { TGSI_OPCODE_ENDLOOP},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ADD, {2}, {1,in1}, {}},
+ { TGSI_OPCODE_ENDLOOP},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}, {6,8}}));
+}
+
+
+/*
+ * With two destinations if one value is thrown away, we must
+ * ensure that the two output registers don't merge.
+ * In this test case the last access for 2 and 3 is in line 4,
+ * but 4 can only be merged with 3 because it is read, 2 on the
+ * other hand is written to, and merging it with 4 would result in
+ * a bug. */
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead2)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {3}, {1,2}, {}},
+ { TGSI_OPCODE_DFRACEXP , {2,4}, {3}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {4}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {1,4}, {2,3}, {3,4}}));
+}
+
+/* Check that three source registers are used */
+TEST_F(LifetimeEvaluatorExactTest, ThreeSourceRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {in0, in1}, {}},
+ { TGSI_OPCODE_MAD, {out0}, {1,2, 3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {0,2}, {1,2}}));
+}
+
+/* Check minimal lifetime for registers only written to */
+TEST_F(LifetimeEvaluatorExactTest, OverwriteWrittenOnlyTemps)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV , {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV , {2}, {in1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {1,2}}));
+}
+
+/* same register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least past the last write instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnlyTwiceSame)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+
+/* Dead code elimination should catch and remove the case
+ * when a variable is written after its last read, but
+ * we want the code to be aware of this case.
+ * The life time of this uselessly written variable is set
+ * to the instruction after the write, because
+ * otherwise it could be re-used too early. */
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {1}, {2}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,3}, {1,2}}));
+}
+
+/* if a break is in the inner loop, all variables written after
+ * the break and used outside that loop must survive the
+ * outer loop
+ */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* Implementation of helper and test classes */
+
+MockShader::~MockShader()
+{
+ free();
+ ralloc_free(mem_ctx);
+}
+
+int MockShader::get_num_temps()
+{
+ return num_temps;
+}
+
+
+exec_list* MockShader::get_program()
+{
+ return program;
+}
+
+MockShader::MockShader(const vector<MockCodeline>& source):
+ num_temps(0)
+{
+ mem_ctx = ralloc_context(NULL);
+
+ program = new(mem_ctx) exec_list();
+
+ for (MockCodeline i: source) {
+ glsl_to_tgsi_instruction *next_instr = new(mem_ctx) glsl_to_tgsi_instruction();
+ next_instr->op = i.op;
+ next_instr->info = tgsi_get_opcode_info(i.op);
+
+ assert(i.src.size() < 4);
+ assert(i.dst.size() < 3);
+ assert(i.tex_offsets.size() < 3);
+
+ for (unsigned k = 0; k < i.src.size(); ++k) {
+ next_instr->src[k] = create_src_register(i.src[k]);
+ }
+ for (unsigned k = 0; k < i.dst.size(); ++k) {
+ next_instr->dst[k] = create_dst_register(i.dst[k]);
+ }
+ next_instr->tex_offset_num_offset = i.tex_offsets.size();
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+ next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+ }
+
+ program->push_tail(next_instr);
+ }
+ ++num_temps;
+}
+
+void MockShader::free()
+{
+ /* the list is not fully initialized, so
+ * tearing it down also must be done manually. */
+ exec_node *p;
+ while ((p = program->pop_head())) {
+ glsl_to_tgsi_instruction * instr = static_cast<glsl_to_tgsi_instruction *>(p);
+ if (instr->tex_offset_num_offset > 0)
+ delete[] instr->tex_offsets;
+ delete p;
+ }
+ program = 0;
+ num_temps = 0;
+}
+
+st_src_reg MockShader::create_src_register(int src_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (src_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = src_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_INPUT;
+ idx = -src_idx;
+ }
+ return st_src_reg(file, idx, GLSL_TYPE_INT);
+
+}
+
+st_dst_reg MockShader::create_dst_register(int dst_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (dst_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = dst_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_OUTPUT;
+ idx = - dst_idx;
+ }
+ return st_dst_reg(file, 0xF, GLSL_TYPE_INT, idx);
+}
+
+
+void MesaTestWithMemCtx::SetUp()
+{
+ mem_ctx = ralloc_context(nullptr);
+}
+
+void MesaTestWithMemCtx::TearDown()
+{
+ ralloc_free(mem_ctx);
+ mem_ctx = nullptr;
+}
+
+void LifetimeEvaluatorTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+ std::vector<lifetime> result(shader.get_num_temps());
+
+ get_temp_registers_required_lifetimes(mem_ctx, shader.get_program(),
+ shader.get_num_temps(), &result[0]);
+
+ /* lifetimes[0] not used, but created for simpler processing */
+ ASSERT_EQ(result.size(), e.size());
+ check(result, e);
+}
+
+
+void LifetimeEvaluatorExactTest::check( const vector<lifetime>& lifetimes,
+ const expectation& e)
+{
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_EQ(lifetimes[i].begin, e[i][0]);
+ EXPECT_EQ(lifetimes[i].end, e[i][1]);
+ }
+}
+
+void LifetimeEvaluatorAtLeastTest::check( const vector<lifetime>& lifetimes,
+ const expectation& e)
+{
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_LE(lifetimes[i].begin, e[i][0]);
+ EXPECT_GE(lifetimes[i].end, e[i][1]);
+ }
+}
--
2.13.0
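As a reading aid for the expectations in the straight-line tests above (SerialReadWrite, WriteOnly, WritePastLastRead and similar), here is a minimal sketch of the rule they appear to encode. It is not the implementation from the patch: `SketchInstr`, `SketchLifetime` and `estimate_lifetimes` are hypothetical names, and loops, conditionals and switch scopes are deliberately ignored.

```cpp
#include <vector>

/* Hypothetical, simplified instruction: only the temporaries it touches. */
struct SketchInstr {
   std::vector<int> dst;   /* temporaries written */
   std::vector<int> src;   /* temporaries read */
};

struct SketchLifetime {
   int begin;
   int end;
};

/* Straight-line code only: no loops, no conditionals, no switch. */
static std::vector<SketchLifetime>
estimate_lifetimes(const std::vector<SketchInstr> &code, int num_temps)
{
   std::vector<int> first_write(num_temps, -1);
   std::vector<int> last_write(num_temps, -1);
   std::vector<int> last_read(num_temps, -1);

   for (int line = 0; line < (int)code.size(); ++line) {
      for (int t : code[line].src)
         last_read[t] = line;
      for (int t : code[line].dst) {
         if (first_write[t] < 0)
            first_write[t] = line;
         last_write[t] = line;
      }
   }

   std::vector<SketchLifetime> result(num_temps, SketchLifetime{-1, -1});
   for (int t = 0; t < num_temps; ++t) {
      if (first_write[t] < 0)
         continue;   /* never written: stays {-1, -1} */

      /* A write at or after the last read keeps the register
       * alive until the instruction after that write.
       */
      int end = last_read[t] > last_write[t] ? last_read[t]
                                              : last_write[t] + 1;
      result[t] = SketchLifetime{first_write[t], end};
   }
   return result;
}
```

Feeding it the three MOVs of WritePastLastRead (write 1, copy 1 to 2, copy 2 to 1) gives {0, 3} for temporary 1 and {1, 2} for temporary 2, which matches the expectation in that test.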
Nicolai Hähnle
2017-06-26 13:12:30 UTC
Post by Gert Wollny
This patch adds a set of unit tests for the new lifetime tracker.
---
configure.ac | 1 +
src/mesa/Makefile.am | 2 +-
src/mesa/state_tracker/tests/Makefile.am | 36 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 976 +++++++++++++++++++++
4 files changed, 1014 insertions(+), 1 deletion(-)
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
diff --git a/configure.ac b/configure.ac
index da7b2f8f81..5279b231ed 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2839,6 +2839,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/osmesa/osmesa.pc
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
src/vulkan/Makefile])
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..a88a94165d 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests
if HAVE_XLIB_GLX
SUBDIRS += drivers/x11
diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 0000000000..fb64cf9dc2
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,36 @@
+AM_CFLAGS = \
+ $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+ $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+ -I$(top_srcdir)/src/gtest/include \
+ -I$(top_srcdir)/src \
+ -I$(top_srcdir)/src/mapi \
+ -I$(top_builddir)/src/mesa \
+ -I$(top_srcdir)/src/mesa \
+ -I$(top_srcdir)/include \
+ -I$(top_srcdir)/src/gallium/include \
+ -I$(top_srcdir)/src/gallium/auxiliary \
+ $(DEFINES)
+
+TESTS = st-renumerate-test
+check_PROGRAMS = st-renumerate-test
+
+st_renumerate_test_SOURCES = \
+ test_glsl_to_tgsi_lifetime.cpp
+
+st_renumerate_test_LDFLAGS = \
+ $(LLVM_LDFLAGS)
+
+st_renumerate_test_LDADD = \
+ $(top_builddir)/src/mesa/libmesagallium.la \
+ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \
+ $(top_builddir)/src/gallium/auxiliary/libgallium.la \
+ $(top_builddir)/src/util/libmesautil.la \
+ $(top_builddir)/src/gtest/libgtest.la \
+ $(GALLIUM_COMMON_LIB_DEPS) \
+ $(LLVM_LIBS) \
+ $(PTHREAD_LIBS) \
+ $(DLOPEN_LIBS)
diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
new file mode 100644
index 0000000000..5f3378637a
--- /dev/null
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -0,0 +1,976 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <state_tracker/st_glsl_to_tgsi_temprename.h>
+#include <tgsi/tgsi_ureg.h>
+#include <tgsi/tgsi_info.h>
+#include <compiler/glsl/list.h>
+#include <gtest/gtest.h>
+
+using std::vector;
+using std::pair;
+
+/* A line to describe a TGSI instruction for building mock shaders */
+struct MockCodeline {
+ MockCodeline(unsigned _op): op(_op) {}
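+ MockCodeline(unsigned _op, const vector<int>& _dst, const vector<int>& _src, const vector<int>& _to):
+ op(_op), dst(_dst), src(_src), tex_offsets(_to){}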
+ unsigned op;
+ vector<int> dst;
+ vector<int> src;
+ vector<int> tex_offsets;
+};
+
+const int in0 = 0;
+const int in1 = -1;
+const int in2 = -2;
+
+const int out0 = 0;
+const int out1 = -1;
+
+class MockShader {
+ MockShader(const vector<MockCodeline>& source);
+ ~MockShader();
+
+ void free();
+
+ exec_list* get_program();
+ int get_num_temps();
+ st_src_reg create_src_register(int src_idx);
+ st_dst_reg create_dst_register(int dst_idx);
+ exec_list* program;
+ int num_temps;
+ void *mem_ctx;
+};
+
+using expectation = vector<vector<int>>;
+
+
+class MesaTestWithMemCtx : public testing::Test {
+ void SetUp();
+ void TearDown();
+ void *mem_ctx;
+};
+
+class LifetimeEvaluatorTest : public MesaTestWithMemCtx {
+ void run(const vector<MockCodeline>& code, const expectation& e);
+ virtual void check(const vector<lifetime>& result, const expectation& e) = 0;
+};
+
+/* This is a test class to check the exact life times of
+ * registers. */
+class LifetimeEvaluatorExactTest : public LifetimeEvaluatorTest {
+ void check(const vector<lifetime>& result, const expectation& e);
+};
+
+/* This test class checks that the life time covers at least
+ * the expected range. It is used for cases where we know that
+ * the implementation could be improved on estimating the minimal
+ * life time.
+ */
+class LifetimeEvaluatorAtLeastTest : public LifetimeEvaluatorTest {
+ void check(const vector<lifetime>& result, const expectation& e);
+};
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {1, in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMove)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,1}, {1,2}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {}, {1,2}},
UADD doesn't have texoffsets.
Post by Gert Wollny
+ { TGSI_OPCODE_END}
+ };
+ run(code, expectation({{-1, -1}, {0,2}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 5}, {2,3}, {3, 6}}));
+}
+
+
+/* in loop if/else value written only in one path, and read later
+ * - value must survive the whole loop */
Closing */ on its own line. Also, please capitalize the start of sentences.
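A sketch of how such a comment could look with the closing */ on its own line and a capitalized sentence. The struct and helper are hypothetical stand-ins so that the snippet compiles on its own; they are not code from the patch.

```cpp
/* Hypothetical stand-in for the lifetime struct checked by the tests. */
struct lifetime_sketch {
   int begin;
   int end;
};

/* In a loop, a value written in only one if/else path and read later
 * must survive the whole loop.
 */
static bool
survives_loop(const lifetime_sketch &lt, int loop_begin, int loop_end)
{
   return lt.begin <= loop_begin && lt.end >= loop_end;
}
```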
Post by Gert Wollny
+TEST_F(LifetimeEvaluatorExactTest, MoveInIfInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {5, 8}}));
Space between , and {
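For illustration, the same values with a space after every comma. The alias mirrors the `expectation` type from the patch; the variable name is hypothetical.

```cpp
#include <vector>

/* Same shape as the `expectation` alias in the quoted test file. */
using expectation = std::vector<std::vector<int>>;

/* Values from MoveInIfInLoop, with a space after each comma. */
static const expectation move_in_if_in_loop = {{-1, -1}, {0, 7}, {1, 7}, {5, 8}};
```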
Post by Gert Wollny
+}
+
+
+/* in loop if/else value written in both path, and read later
Same comments as above, plus typo: *paths
Post by Gert Wollny
+ * - value must survive from first write to last read in loop
+ * for now we only check that the minimum life time is correct */
+TEST_F(LifetimeEvaluatorAtLeastTest, WriteInIfAndElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {3,7}, {7, 10}}));
Space (recurring problem also below)
Post by Gert Wollny
+}
+
+/* in loop if/else value written in both path, red in else path
As above, plus typo: *read
Post by Gert Wollny
+ * before read and also read later
+ * - value must survive from first write to last read in loop */
+TEST_F(LifetimeEvaluatorExactTest, WriteInIfAndElseReadInElseInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF},
+ { TGSI_OPCODE_UADD, {2}, {1, in0}, {}},
+ { TGSI_OPCODE_ELSE },
+ { TGSI_OPCODE_ADD, {2}, {1, 2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}, {1,9}, {7, 10}}));
+}
+
+/* in loop if/else read in one path before written in the same loop
+ * - value must survive the whole loop */
+TEST_F(LifetimeEvaluatorExactTest, ReadInIfInLoopBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_UADD, {2}, {1, 3}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_UADD, {3}, {1, 2}, {}},
+ { TGSI_OPCODE_UADD, {3}, {3, in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 7}, {1,7}, {1, 8}}));
+}
+
+/* Write in nested ifs in loop, for now we do test whether the
+ * life time is at least what is required, but we know that the
+ * implementation doesn't do a full check and sets larger boundaries */
+TEST_F(LifetimeEvaluatorAtLeastTest, NestedIfInLoopAlwaysWriteButNotPropagated)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3, 14}}));
+}
+
+
+
At most two consecutive empty lines at global scope.
Post by Gert Wollny
+TEST_F(LifetimeEvaluatorExactTest, NestedIfInLoopWriteNotAlways)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ELSE},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 13}}));
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
Sorry to keep harping on this, but this is still incorrect.

TGSI loops don't have an implied loop condition, so the only way to exit
a loop is via BRK. The CONT here doesn't matter, the lifetime should be
{4, 6}.
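
To make the expected numbers concrete, here is a minimal sketch (not the patch's evaluator) that computes only the plain first-write-to-last-read interval of one temporary. That is the result expected when, as in this test, the write is unconditional and nothing in the loop reads the register before it is written. The Line struct and first_write_to_last_read() are hypothetical names used for illustration only, and control flow is deliberately ignored.

```cpp
#include <cassert>
#include <utility>
#include <vector>

/* Hypothetical, simplified stand-in for one mock code line: only the
 * temporaries written (dst) and read (src) are recorded. */
struct Line {
   std::vector<int> dst;
   std::vector<int> src;
};

/* Returns {first write, last read} of temporary reg. Either value is -1
 * if the register is never written or never read. Control flow is ignored
 * on purpose, so this is only valid for the simple case described above. */
static std::pair<int, int>
first_write_to_last_read(const std::vector<Line>& code, int reg)
{
   assert(reg > 0);   /* index 0 is not a temporary in the mock code */

   int begin = -1;
   int end = -1;
   for (int i = 0; i < static_cast<int>(code.size()); ++i) {
      for (int d : code[i].dst)
         if (d == reg && begin < 0)
            begin = i;
      for (int s : code[i].src)
         if (s == reg)
            end = i;
   }
   return std::make_pair(begin, end);
}
```

Fed with the eight lines of the test above (write on line 4, read on line 6), this yields the interval {4, 6}; the CONT on line 2 has no influence on it.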
Post by Gert Wollny
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for the
+ * whole loop, but not further */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
Should be {5, 7} for the same reason.
Post by Gert Wollny
+}
+
+/* if a continue is in the loop, all variables written after the
+ * continue and used outside the loop must be maintained for all
+ * loops up untto the read scope, but not further */
untto? :)
Post by Gert Wollny
+TEST_F(LifetimeEvaluatorExactTest, Nested2LoopWithWriteAfterContinue)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_CONT},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 10}}));
{6, 9}
Post by Gert Wollny
+}
+
+/* Temporary used to switch must live through all case statements */
+TEST_F(LifetimeEvaluatorExactTest, UseSwitchCase)
+{
+ const vector<MockCodeline> code = {
+ {TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ {TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_DEFAULT},
+ {TGSI_OPCODE_ENDSWITCH},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 3}}));
So, SWITCH/CASE is a bit of an odd-ball, and I don't think we really use
it, precisely because of how weird it is.

I think the correct interpretation would be that all the sources on both
the SWITCH and the corresponding CASE lines have a read access on the
line of the switch statement.

Please adjust the test accordingly (also, use different sources for the
SWITCH and CASE statements!).
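
A minimal sketch of that interpretation, assuming a simplified instruction model: every source of a CASE line is recorded as read on the line of the innermost open SWITCH. The Op enum, the Line struct and last_reads() are hypothetical and are not the TGSI or mock types from the patch.

```cpp
#include <cassert>
#include <map>
#include <vector>

/* Hypothetical opcodes and code line, for illustration only. */
enum class Op { MOV, SWITCH, CASE, DEFAULT, BRK, ENDSWITCH, END };

struct Line {
   Op op;
   std::vector<int> src;   /* temporaries read by this line */
};

/* Returns, per temporary, the last line on which it counts as read.
 * Sources of a CASE are attributed to the line of the innermost open
 * SWITCH instead of the line of the CASE itself. */
static std::map<int, int>
last_reads(const std::vector<Line>& code)
{
   std::map<int, int> last;
   std::vector<int> switch_lines;   /* stack of open SWITCH lines */

   for (int i = 0; i < static_cast<int>(code.size()); ++i) {
      const Line& l = code[i];
      if (l.op == Op::SWITCH)
         switch_lines.push_back(i);

      int read_line = i;
      if (l.op == Op::CASE) {
         assert(!switch_lines.empty());
         read_line = switch_lines.back();
      }

      for (int s : l.src) {
         std::map<int, int>::iterator it = last.find(s);
         if (it == last.end() || it->second < read_line)
            last[s] = read_line;
      }

      if (l.op == Op::ENDSWITCH) {
         assert(!switch_lines.empty());
         switch_lines.pop_back();
      }
   }
   return last;
}
```

With different temporaries on SWITCH and CASE, both end up with their last read on the SWITCH line, so neither lifetime extends into the case bodies because of the CASE operands alone.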
Post by Gert Wollny
+}
+
+TEST_F(LifetimeEvaluatorExactTest, WriteTwoOnlyUseOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
Remove space between ADD and ,
Post by Gert Wollny
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+/* if a break is in the loop, all variables written after the
Remove extra space between break and is, and break and and below.
Post by Gert Wollny
+ * break and used outside the loop must be maintained for the
+ * whole loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 6}}));
+
+}
+
+/* if a break is in the loop, but inside a switch case, so it
+ * refers to that inner loop. The variable has to survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteAfterBreakInSwitchInLoop)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_SWITCH, {}, {in1}, {}},
+ { TGSI_OPCODE_CASE, {}, {in1}, {}},
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_DEFAULT, {}, {}, {}},
+ { TGSI_OPCODE_ENDSWITCH, {}, {}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{2, 10}}));
+}
+
+/* value read/write in different loops, conditional */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, LoopWithWriteInSwitch)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+/* value written in one case, and read in other, in loop
+ * - must survive the loop */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchDifferentCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 9}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, LoopRWInSwitchCaseLastCaseWithoutBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* value read/write in same case, stays there */
+TEST_F(LifetimeEvaluatorExactTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {} },
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,4}}));
+}
+
+/* value read/write in all cases, should only live from first
+ * write to last read, but currently the whole loop is used. */
+TEST_F(LifetimeEvaluatorAtLeastTest, LoopWithReadWriteInSwitchSameCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_SWITCH, {}, {in0}, {}},
+ { TGSI_OPCODE_CASE, {}, {in0}, {} },
+ { TGSI_OPCODE_MOV, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_DEFAULT },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_BRK },
+ { TGSI_OPCODE_ENDSWITCH },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{3,9}}));
+}
+
+/* value read/write in different loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopes)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{1,5}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferntScopesConditionalReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+/* The variable is conditionally read before first written, so
+ * it has to survive all the loops. */
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstructionInLoopAndCondition)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF },
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,7}}));
+}
+
+/* If unconditionally first written and read in the same
+ * instruction, then the register must be kept for the
+ * one write, but not more (undefined behaviour) */
+
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstruction)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* If unconditionally written and read in the same
+ * instruction, various times then the register must be
+ * kept for the one write, but not more (undefined behaviour) */
+
+TEST_F(LifetimeEvaluatorExactTest, FRaWSameInstructionMoreThenOnce)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_ADD, {1}, {1,in0}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+
+/* register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least one instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnly)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}}));
+}
+
+/* register read in if */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForIf)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {in0, in1}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_ENDIF}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+/* register read in switch and cases */
+TEST_F(LifetimeEvaluatorExactTest, SimpleReadForSwitchAndCase)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_SWITCH, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_CASE, {}, {1}, {}},
+ { TGSI_OPCODE_END, {}, {1}, {}},
+ };
+ run (code, expectation({{-1,-1},{0,3}}));
+}
+
+/* first read before first write weirdness with nested loops */
+TEST_F(LifetimeEvaluatorExactTest, LoopsWithDifferentScopesCondReadBeforeWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,9}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, WriteTwoReadOne)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1, 2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {2, in0}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}, {1,2}}));
+}
+
+
+TEST_F(LifetimeEvaluatorExactTest, SomeScopesAndNoEndProgramId)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_IF, {}, {1}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_ENDIF},
+ };
+ run (code, expectation({{-1,-1},{0,4}, {2,5}}));
+}
+
+TEST_F(LifetimeEvaluatorExactTest, SerialReadWrite)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {3}, {2}, {}},
+ { TGSI_OPCODE_MOV, {out0}, {3}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,1}, {1,2}, {2,3}}));
+}
+
+
+/* Check that two destination registers are used */
+TEST_F(LifetimeEvaluatorExactTest, TwoDestRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {out0}, {1,2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {0,1}}));
+}
+
+/* Check that two destination registers are used */
+TEST_F(LifetimeEvaluatorExactTest, WriteInLoopInConditionalReadOutside)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP},
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BGNLOOP},
+ { TGSI_OPCODE_MOV, {1}, {in1}, {}},
+ { TGSI_OPCODE_ENDLOOP},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_ADD, {2}, {1,in1}, {}},
+ { TGSI_OPCODE_ENDLOOP},
+ { TGSI_OPCODE_MOV, {out0}, {2}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,7}, {6,8}}));
+}
+
+
+/*
+ * With two destinations if one value is thrown away, we must
+ * ensure that the two output registers don't merge.
+ * In this test case the last access for 2 and 3 is in line 4,
+ * but 4 can only be merged with 3 because it is read, 2 on the
+ * other hand is written to, and merging it with 4 would result in
+ * a bug. */
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead2)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in0}, {}},
+ { TGSI_OPCODE_ADD, {3}, {1,2}, {}},
+ { TGSI_OPCODE_DFRACEXP , {2,4}, {3}, {}},
+ { TGSI_OPCODE_MOV, {out1}, {4}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {1,4}, {2,3}, {3,4}}));
+}
+
+/* Check that three source registers are used */
+TEST_F(LifetimeEvaluatorExactTest, ThreeSourceRegisters)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_DFRACEXP , {1,2}, {in0}, {}},
+ { TGSI_OPCODE_ADD , {3}, {in0, in1}, {}},
+ { TGSI_OPCODE_MAD, {out0}, {1,2, 3}, {}},
Be consistent about whitespace, please.

There are more issues like this throughout, please fix them.

Cheers,
Nicolai
Post by Gert Wollny
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}, {0,2}, {1,2}}));
+}
+
+/* Check minimal lifetime for registers only written to */
+TEST_F(LifetimeEvaluatorExactTest, OverwriteWrittenOnlyTemps)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV , {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV , {2}, {in1}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,1}, {1,2}}));
+}
+
+/* same register is only written. This should not happen,
+ * but to handle the case we want the register to live
+ * at least past the last write instruction */
+TEST_F(LifetimeEvaluatorExactTest, WriteOnlyTwiceSame)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0,2}}));
+}
+
+
+/* Dead code elimination should catch and remove the case
+ * when a variable is written after its last read, but
+ * we want the code to be aware of this case.
+ * The life time of this uselessly written variable is set
+ * to the instruction after the write, because
+ * otherwise it could be re-used too early. */
+TEST_F(LifetimeEvaluatorExactTest, WritePastLastRead)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {1}, {}},
+ { TGSI_OPCODE_MOV, {1}, {2}, {}},
+ { TGSI_OPCODE_END},
+
+ };
+ run (code, expectation({{-1,-1},{0,3}, {1,2}}));
+}
+
+/* if a break is in the loop, all variables written after the
+ * break and used outside the loop must survive the
+ * outer loop
+ */
+TEST_F(LifetimeEvaluatorExactTest, NestedLoopWithWriteAfterBreak)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_BGNLOOP },
+ { TGSI_OPCODE_IF, {}, {in0}, {}},
+ { TGSI_OPCODE_BRK},
+ { TGSI_OPCODE_ENDIF},
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_MOV, {out0}, {1}, {}},
+ { TGSI_OPCODE_ENDLOOP },
+ { TGSI_OPCODE_END}
+ };
+ run (code, expectation({{-1,-1},{0, 8}}));
+}
+
+/* Implementation of helper and test classes */
+
+MockShader::~MockShader()
+{
+ free();
+ ralloc_free(mem_ctx);
+}
+
+int MockShader::get_num_temps()
+{
+ return num_temps;
+}
+
+
+exec_list* MockShader::get_program()
+{
+ return program;
+}
+
+MockShader::MockShader(const vector<MockCodeline>& source):
+ num_temps(0)
+{
+ mem_ctx = ralloc_context(NULL);
+
+ program = new(mem_ctx) exec_list();
+
+ for (MockCodeline i: source) {
+ glsl_to_tgsi_instruction *next_instr = new(mem_ctx) glsl_to_tgsi_instruction();
+ next_instr->op = i.op;
+ next_instr->info = tgsi_get_opcode_info(i.op);
+
+ assert(i.src.size() < 4);
+ assert(i.dst.size() < 3);
+ assert(i.tex_offsets.size() < 3);
+
+ for (unsigned k = 0; k < i.src.size(); ++k) {
+ next_instr->src[k] = create_src_register(i.src[k]);
+ }
+ for (unsigned k = 0; k < i.dst.size(); ++k) {
+ next_instr->dst[k] = create_dst_register(i.dst[k]);
+ }
+ next_instr->tex_offset_num_offset = i.tex_offsets.size();
+ next_instr->tex_offsets = new st_src_reg[i.tex_offsets.size()];
+ for (unsigned k = 0; k < i.tex_offsets.size(); ++k) {
+ next_instr->tex_offsets[k] = create_src_register(i.tex_offsets[k]);
+ }
+
+ program->push_tail(next_instr);
+ }
+ ++num_temps;
+}
+
+void MockShader::free()
+{
+ /* the list is not fully initialized, so
+ * tearing it down also must be done manually. */
+ exec_node *p;
+ while ((p = program->pop_head())) {
+ glsl_to_tgsi_instruction * instr = static_cast<glsl_to_tgsi_instruction *>(p);
+ if (instr->tex_offset_num_offset > 0)
+ delete[] instr->tex_offsets;
+ delete p;
+ }
+ program = 0;
+ num_temps = 0;
+}
+
+st_src_reg MockShader::create_src_register(int src_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (src_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = src_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_INPUT;
+ idx = -src_idx;
+ }
+ return st_src_reg(file, idx, GLSL_TYPE_INT);
+
+}
+
+st_dst_reg MockShader::create_dst_register(int dst_idx)
+{
+ gl_register_file file;
+ int idx = 0;
+ if (dst_idx > 0) {
+ file = PROGRAM_TEMPORARY;
+ idx = dst_idx;
+ if (num_temps < idx)
+ num_temps = idx;
+ } else {
+ file = PROGRAM_OUTPUT;
+ idx = - dst_idx;
+ }
+ return st_dst_reg(file, 0xF, GLSL_TYPE_INT, idx);
+}
+
+
+void MesaTestWithMemCtx::SetUp()
+{
+ mem_ctx = ralloc_context(nullptr);
+}
+
+void MesaTestWithMemCtx::TearDown()
+{
+ ralloc_free(mem_ctx);
+ mem_ctx = nullptr;
+}
+
+void LifetimeEvaluatorTest::run(const vector<MockCodeline>& code, const expectation& e)
+{
+ MockShader shader(code);
+ std::vector<lifetime> result(shader.get_num_temps());
+
+ get_temp_registers_required_lifetimes(mem_ctx, shader.get_program(),
+ shader.get_num_temps(), &result[0]);
+
+ /* lifetimes[0] not used, but created for simpler processing */
+ ASSERT_EQ(result.size(), e.size());
+ check(result, e);
+}
+
+
+void LifetimeEvaluatorExactTest::check( const vector<lifetime>& lifetimes,
+ const expectation& e)
+{
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_EQ(lifetimes[i].begin, e[i][0]);
+ EXPECT_EQ(lifetimes[i].end, e[i][1]);
+ }
+}
+
+void LifetimeEvaluatorAtLeastTest::check( const vector<lifetime>& lifetimes,
+ const expectation& e)
+{
+ for (unsigned i = 1; i < lifetimes.size(); ++i) {
+ EXPECT_LE(lifetimes[i].begin, e[i][0]);
+ EXPECT_GE(lifetimes[i].end, e[i][1]);
+ }
+}
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
Gert Wollny
2017-06-27 12:10:53 UTC
Permalink
Post by Nicolai Hähnle
Post by Gert Wollny
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+   const vector<MockCodeline> code = {
+      { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+      { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+      { TGSI_OPCODE_UADD, {out0}, {},  {1,2}},
UADD doesn't have texoffsets.
The test just checks that sources from texoffsets are picked up, but I
would appreciate it if you could give me a well-formed TGSI instruction
line that takes a texoffset (I guess that TGSI_OPCODE_TEX would be the
opcode, but the TGSI documentation doesn't give a real example).
Post by Nicolai Hähnle
Sorry to keep harping on this, but this is still incorrect.
TGSI loops don't have an implied loop condition, so the only way to
exit  a loop is via BRK. The CONT here doesn't matter, the lifetime
should be {4, 6}.
I'll change it, but at least I was not underestimating the lifetime.
Post by Nicolai Hähnle
Post by Gert Wollny
+}
+
+/* Temporary used to switch must live through all case statements */
+TEST_F(LifetimeEvaluatorExactTest, UseSwitchCase)
+{
+   const vector<MockCodeline> code = {
+      {TGSI_OPCODE_MOV, {1}, {in0}, {}},
+      {TGSI_OPCODE_SWITCH, {}, {1}, {}},
+      { TGSI_OPCODE_CASE, {}, {1}, {}},
+      { TGSI_OPCODE_CASE, {}, {1}, {}},
+      { TGSI_OPCODE_BRK},
+      { TGSI_OPCODE_DEFAULT},
+      {TGSI_OPCODE_ENDSWITCH},
+      { TGSI_OPCODE_END}
+   };
+   run (code, expectation({{-1,-1},{0, 3}}));
So, SWITCH/CASE is a bit of an odd-ball, and I don't think we really
use it, precisely because of how weird it is.
I think the correct interpretation would be that all the sources on
both  the SWITCH and the corresponding CASE lines have a read access
on the line of the switch statement.
Please adjust the test accordingly (also, use different sources for
the SWITCH and CASE statements!).
I've corrected this so that the src of SWITCH lives through all case
statements (CASE and SWITCH both take one argument). But you seem to be
right that the corresponding switch code is actually emulated by IF chains
in the TGSI.

Best,
Gert
Ilia Mirkin
2017-06-27 12:12:30 UTC
Permalink
Post by Gert Wollny
Post by Nicolai Hähnle
Post by Gert Wollny
+TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAddMoveTexoffset)
+{
+ const vector<MockCodeline> code = {
+ { TGSI_OPCODE_MOV, {1}, {in0}, {}},
+ { TGSI_OPCODE_MOV, {2}, {in1}, {}},
+ { TGSI_OPCODE_UADD, {out0}, {}, {1,2}},
UADD doesn't have texoffsets.
The test just checks that sources from texoffsets are picked up, but I
would appreciate it if you could give me a well-formed TGSI instruction
line that takes a texoffset (I guess that TGSI_OPCODE_TEX would be the
opcode, but the TGSI documentation doesn't give a real example).
TEX, TXF, TXD - basically all the texturing operations.
Gert Wollny
2017-06-25 07:22:13 UTC
Permalink
The remapping evaluator first sorts the temporary registers in ascending
order of the first instruction of their life time, and then uses a binary
search to find merge candidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can enable the alternative code path that uses qsort.
---
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 124 +++++++++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 3 +
2 files changed, 127 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 729d77130e..d52d912951 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -27,6 +27,12 @@
#include <mesa/program/prog_instruction.h>
#include <limits>

+/* std::sort is significanter than qsort */
+#define USE_STL_SORT
+#ifdef USE_STL_SORT
+#include <algorithm>
+#endif
+
/* Without c++11 define the nullptr for forward-compatibility
* and better readibility */
#if __cplusplus < 201103L
@@ -660,3 +666,121 @@ prog_scope_storage::create(prog_scope *p, e_scope_type type, int id,
storage[current_slot] = prog_scope(p, type, id, lvl, s_begin);
return &storage[current_slot++];
}
+
+/* helper class for sorting and searching the registers based
+ * on life times. */
+struct access_record {
+ int begin;
+ int end;
+ int reg;
+ bool erase;
+
+ bool operator < (const access_record& rhs) {
+ return begin < rhs.begin;
+ }
+};
+
+/* Find the next register between [start, end) that has a life time starting
+ * at or after bound by using a binary search.
+ * start points at the beginning of the search range,
+ * end points at the element past the end of the search range, and
+ * the array comprising [start, end) must be sorted in ascending order.
+ */
+access_record*
+find_next_rename(access_record* start, access_record* end, int bound)
+{
+ int delta = (end - start);
+
+ while (delta > 0) {
+
+ int half = delta >> 1;
+ access_record* middle = start + half;
+
+ if (bound <= middle->begin) {
+ delta = half;
+ } else {
+ start = middle;
+ ++start;
+ delta -= half + 1;
+ }
+ }
+
+ return start;
+}
+
+#ifndef USE_STL_SORT
+int access_record_compare (const void *a, const void *b) {
+ const access_record *aa = static_cast<const access_record*>(a);
+ const access_record *bb = static_cast<const access_record*>(b);
+ return aa->begin < bb->begin ? -1 : (aa->begin > bb->begin ? 1 : 0);
+}
+#endif
+
+/* This function evaluates the register merges by using an O(n log n)
+ * algorithm to find suitable merge candidates. */
+void get_temp_registers_remapping(void *mem_ctx, int ntemps,
+ const struct lifetime* lifetimes,
+ struct rename_reg_pair *result)
+{
+ access_record *m = ralloc_array(mem_ctx, access_record, ntemps - 1);
+
+ for (int i = 1; i < ntemps; ++i) {
+ m[i-1].begin = lifetimes[i].begin;
+ m[i-1].end = lifetimes[i].end;
+ m[i-1].reg = i;
+ m[i-1].erase = false;
+ }
+
+#ifdef USE_STL_SORT
+ std::sort(m, m + ntemps - 1);
+#else
+ std::qsort(m, ntemps - 1, sizeof(access_record), access_record_compare);
+#endif
+
+ access_record *trgt = m;
+ access_record *mend = m + ntemps - 1;
+ access_record *first_erase = mend;
+ access_record *search_start = trgt + 1;
+
+ while (trgt != mend) {
+
+ access_record *src = find_next_rename(search_start, mend, trgt->end);
+
+ if (src != mend) {
+ result[src->reg].new_reg = trgt->reg;
+ result[src->reg].valid = true;
+ trgt->end = src->end;
+
+ /* Since we only search forward, don't remove the renamed
+ * register just now, only mark it. */
+ src->erase = true;
+
+ if (first_erase == mend)
+ first_erase = src;
+
+ search_start = src + 1;
+ } else {
+ /* Moving to the next target register it is time to remove
+ * the already merged registers from the search range */
+ if (first_erase != mend) {
+
+ access_record *out = first_erase;
+ access_record *in_start = first_erase + 1;
+
+ while (in_start != mend) {
+
+ if (!in_start->erase)
+ *out++ = *in_start;
+
+ ++in_start;
+ }
+ mend = out;
+ first_erase = mend;
+ }
+
+ ++trgt;
+ search_start = trgt + 1;
+ }
+ }
+ ralloc_free(m);
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
index a4124b4659..f6a89ed0d3 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -31,3 +31,6 @@ struct lifetime {
void
get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
int ntemps, struct lifetime *lifetimes);
+void get_temp_registers_remapping(void *mem_ctx, int ntemps,
+ const struct lifetime* lifetimes,
+ struct rename_reg_pair *result);
--
2.13.0
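
For readers who want the idea without the diff, the approach from the commit message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patch itself: it uses std::vector and std::lower_bound instead of the hand-written search, it erases a merged record immediately instead of marking it and compacting later (which costs O(n) per merge), and the types Lifetime, Rename and Record as well as the function remap() are hypothetical stand-ins for the structures in the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

/* Hypothetical stand-ins for struct lifetime and struct rename_reg_pair. */
struct Lifetime { int begin; int end; };
struct Rename { int new_reg = 0; bool valid = false; };

struct Record { int begin; int end; int reg; };

/* Greedy remapping: sort by the start of the life time, then merge each
 * target with the first later register whose life time begins at or after
 * the (growing) end of the target. Entry 0 of lifetimes is skipped. */
static std::vector<Rename>
remap(const std::vector<Lifetime>& lifetimes)
{
   std::vector<Rename> result(lifetimes.size());
   std::vector<Record> recs;
   for (int i = 1; i < static_cast<int>(lifetimes.size()); ++i)
      recs.push_back({lifetimes[i].begin, lifetimes[i].end, i});

   std::sort(recs.begin(), recs.end(),
             [](const Record& a, const Record& b) {
                return a.begin < b.begin;
             });

   for (std::size_t t = 0; t < recs.size(); ++t) {
      std::vector<Record>::iterator from =
         recs.begin() + static_cast<std::ptrdiff_t>(t) + 1;

      for (;;) {
         const int bound = recs[t].end;
         std::vector<Record>::iterator src =
            std::lower_bound(from, recs.end(), bound,
                             [](const Record& r, int b) {
                                return r.begin < b;
                             });
         if (src == recs.end())
            break;

         result[src->reg].new_reg = recs[t].reg;
         result[src->reg].valid = true;
         recs[t].end = src->end;

         /* erase() returns the element following the removed one, so the
          * search continues forward only. */
         from = recs.erase(src);
      }
   }
   return result;
}
```

With the life times {0,1}, {1,2}, {2,3} for temporaries 1 to 3, both 2 and 3 are renamed to 1, because each begins exactly where the merged life time of the target ends.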
Nicolai Hähnle
2017-06-26 13:24:26 UTC
Permalink
Post by Gert Wollny
The remapping evaluator first sorts the temporary registers in ascending
order of the first instruction of their life time, and then uses a binary
search to find merge candidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can enable the alternative code path that uses qsort.
---
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 124 +++++++++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 3 +
2 files changed, 127 insertions(+)
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 729d77130e..d52d912951 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -27,6 +27,12 @@
#include <mesa/program/prog_instruction.h>
#include <limits>
+/* std::sort is significanter than qsort */
Something is missing in this comment :)
Post by Gert Wollny
+#define USE_STL_SORT
+#ifdef USE_STL_SORT
+#include <algorithm>
+#endif
+
/* Without c++11 define the nullptr for forward-compatibility
* and better readibility */
#if __cplusplus < 201103L
@@ -660,3 +666,121 @@ prog_scope_storage::create(prog_scope *p, e_scope_type type, int id,
storage[current_slot] = prog_scope(p, type, id, lvl, s_begin);
return &storage[current_slot++];
}
+
+/* helper class for sorting and searching the registers based
+ * on life times. */
Closing */ on its own line, and proper capitalization, please.
Post by Gert Wollny
+struct access_record {
+ int begin;
+ int end;
+ int reg;
+ bool erase;
+
+ bool operator < (const access_record& rhs) {
const?
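
That is, something along these lines (a sketch that repeats the struct from the patch with only the qualifier added; starts_before() and sort_by_begin() are hypothetical helpers that just show where the qualifier matters):

```cpp
#include <algorithm>
#include <vector>

struct access_record {
   int begin;
   int end;
   int reg;
   bool erase;

   /* const-qualified, so it can be called on const objects and through
    * const references. */
   bool operator < (const access_record& rhs) const {
      return begin < rhs.begin;
   }
};

/* Both operands are const here; this only compiles because operator<
 * is a const member function. */
static bool
starts_before(const access_record& a, const access_record& b)
{
   return a < b;
}

/* std::sort uses operator< by default and orders the records by begin. */
static void
sort_by_begin(std::vector<access_record>& records)
{
   std::sort(records.begin(), records.end());
}
```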
Post by Gert Wollny
+ return begin < rhs.begin;
+ }
+};
+
+/* Find the next register between [start, end) that has a life time starting
+ * at or after bound by using a binary search.
+ * start points at the beginning of the search range,
+ * end points at the element past the end of the search range, and
+ * the array comprising [start, end) must be sorted in ascending order.
+ */
+access_record*
+find_next_rename(access_record* start, access_record* end, int bound)
Function should be static.
Post by Gert Wollny
+{
+ int delta = (end - start);
+
+ while (delta > 0) {
+
Again with the whitespace issue. Please also fix other occurrences.
Post by Gert Wollny
+ int half = delta >> 1;
+ access_record* middle = start + half;
+
+ if (bound <= middle->begin) {
+ delta = half;
+ } else {
+ start = middle;
+ ++start;
+ delta -= half + 1;
+ }
+ }
+
+ return start;
+}
+
+#ifndef USE_STL_SORT
Function should be static.
Post by Gert Wollny
+int access_record_compare (const void *a, const void *b) {
+ const access_record *aa = static_cast<const access_record*>(a);
+ const access_record *bb = static_cast<const access_record*>(b);
+ return aa->begin < bb->begin ? -1 : (aa->begin > bb->begin ? 1 : 0);
+}
+#endif
+
+/* This function evaluates the register merges by using an O(n log n)
+ * algorithm to find suitable merge candidates. */
+void get_temp_registers_remapping(void *mem_ctx, int ntemps,
+ const struct lifetime* lifetimes,
+ struct rename_reg_pair *result)
+{
+ access_record *m = ralloc_array(mem_ctx, access_record, ntemps - 1);
m is not a very descriptive name.

Why ntemps - 1?
Post by Gert Wollny
+
+ for (int i = 1; i < ntemps; ++i) {
+ m[i-1].begin = lifetimes[i].begin;
+ m[i-1].end = lifetimes[i].end;
+ m[i-1].reg = i;
+ m[i-1].erase = false;
+ }
+
+#ifdef USE_STL_SORT
+ std::sort(m, m + ntemps - 1);
+#else
+ std::qsort(m, ntemps - 1, sizeof(access_record), access_record_compare);
+#endif
+
+ access_record *trgt = m;
+ access_record *mend = m + ntemps - 1;
+ access_record *first_erase = mend;
+ access_record *search_start = trgt + 1;
+
+ while (trgt != mend) {
+
+ access_record *src = find_next_rename(search_start, mend, trgt->end);
+
+ if (src != mend) {
+ result[src->reg].new_reg = trgt->reg;
+ result[src->reg].valid = true;
+ trgt->end = src->end;
+
+ /* Since we only search forward, don't remove the renamed
+ * register just now, only mark it. */
+ src->erase = true;
+
+ if (first_erase == mend)
+ first_erase = src;
+
+ search_start = src + 1;
+ } else {
+ /* Moving to the next target register it is time to remove
+ * the already merged registers from the search range */
+ if (first_erase != mend) {
+
+ access_record *out = first_erase;
+ access_record *in_start = first_erase + 1;
Why in_start? Better just in. Or maybe even dst and src, that's more
idiomatic.

On a more high-level note, this algorithm isn't actually O(n log n) as
you claimed somewhere. It's true that you improved the search part, but
now the compacting is the asymptotic bottleneck, and it's likely still
O(n^2), unless I'm missing something.
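
To make the concern concrete, here is a rough sketch: a port of the merge loop to indices into a std::vector, with a counter for the records visited while compacting. It is not the patch itself. `rename_pair` is a hypothetical stand-in for `rename_reg_pair`, the search step uses std::lower_bound instead of the hand-written binary search, and `make_adversarial_input` is an input family constructed only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct access_record {
   int begin;
   int end;
   int reg;
   bool erase;
};

struct rename_pair {   // hypothetical stand-in for rename_reg_pair
   int new_reg = 0;
   bool valid = false;
};

// Port of the merge loop; returns how many records the compaction
// passes visited in total.
static long
remap_counting_compaction(std::vector<access_record> recs,
                          std::vector<rename_pair> &result)
{
   std::sort(recs.begin(), recs.end(),
             [](const access_record &a, const access_record &b) {
                return a.begin < b.begin;
             });

   long visited = 0;
   std::size_t mend = recs.size();
   std::size_t first_erase = mend;
   std::size_t trgt = 0;
   std::size_t search_start = 1;

   while (trgt < mend) {
      assert(search_start <= mend);
      // First record in [search_start, mend) with begin >= target's end.
      auto it = std::lower_bound(
         recs.begin() + search_start, recs.begin() + mend, recs[trgt].end,
         [](const access_record &r, int bound) { return r.begin < bound; });
      std::size_t src = static_cast<std::size_t>(it - recs.begin());

      if (src != mend) {
         result[recs[src].reg].new_reg = recs[trgt].reg;
         result[recs[src].reg].valid = true;
         recs[trgt].end = recs[src].end;
         recs[src].erase = true;
         if (first_erase == mend)
            first_erase = src;
         search_start = src + 1;
      } else {
         if (first_erase != mend) {
            // Compaction: linear pass over the tail of the array.
            std::size_t out = first_erase;
            for (std::size_t in = first_erase + 1; in != mend; ++in) {
               ++visited;
               if (!recs[in].erase)
                  recs[out++] = recs[in];
            }
            mend = out;
            first_erase = mend;
         }
         ++trgt;
         search_start = trgt + 1;
      }
   }
   return visited;
}

// 2k records: A_i = [i, k+i) merges exactly with B_i = [k+i, 4k),
// so every target triggers one compaction over the remaining B records.
static std::vector<access_record>
make_adversarial_input(int k)
{
   std::vector<access_record> recs;
   for (int i = 0; i < k; ++i)
      recs.push_back({i, k + i, i + 1, false});
   for (int i = 0; i < k; ++i)
      recs.push_back({k + i, 4 * k, k + i + 1, false});
   return recs;
}
```

For this particular input family the counter works out to k(k-1)/2 for 2k records, i.e. it grows quadratically even though each individual search is logarithmic. Other inputs may behave much better.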

Cheers,
Nicolai
Post by Gert Wollny
+
+ while (in_start != mend) {
+
+ if (!in_start->erase)
+ *out++ = *in_start;
+
+ ++in_start;
+ }
+ mend = out;
+ first_erase = mend;
+ }
+
+ ++trgt;
+ search_start = trgt + 1;
+ }
+ }
+ ralloc_free(m);
+}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
index a4124b4659..f6a89ed0d3 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -31,3 +31,6 @@ struct lifetime {
void
get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
int ntemps, struct lifetime *lifetimes);
+void get_temp_registers_remapping(void *mem_ctx, int ntemps,
+ const struct lifetime* lifetimes,
+ struct rename_reg_pair *result);
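
Two of the smaller review points above (a const-qualified `operator<` and a file-local search helper) can be sketched as follows. This is only an illustration under the assumption that the record layout stays as in the patch. `find_next_rename_stl` is a hypothetical name; it shows that the hand-written binary search returns the same position as std::lower_bound with a comparison on `begin`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct access_record {
   int begin;
   int end;
   int reg;
   bool erase;

   // const-qualified: comparing does not modify either record.
   bool operator<(const access_record &rhs) const {
      return begin < rhs.begin;
   }
};

// Hand-written binary search as in the patch, with internal linkage.
// Returns the first record in [start, end) with begin >= bound, or end.
static access_record *
find_next_rename(access_record *start, access_record *end, int bound)
{
   int delta = static_cast<int>(end - start);

   while (delta > 0) {
      int half = delta >> 1;
      access_record *middle = start + half;

      if (bound <= middle->begin) {
         delta = half;
      } else {
         start = middle + 1;
         delta -= half + 1;
      }
   }
   return start;
}

// The same search expressed with the standard library (hypothetical name).
static access_record *
find_next_rename_stl(access_record *start, access_record *end, int bound)
{
   return std::lower_bound(start, end, bound,
                           [](const access_record &r, int b) {
                              return r.begin < b;
                           });
}
```

Both helpers require that [start, end) is sorted by `begin` in ascending order, as stated in the comment above `find_next_rename` in the patch.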
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
Nicolai Hähnle
2017-06-26 13:13:53 UTC
Thanks for the update. Do you have the series on an accessible git
repository somewhere? E.g. on GitHub or Gitlab or wherever? That would
be helpful.

Also, please don't continue to chain this thread for subsequent
versions. It's okay to do this for quick follow-up fixes to patches you
sent, but it's not a good idea when there are many revisions.

Thanks,
Nicolai
Post by Gert Wollny
Dear all,
- correct formatting following Emil's suggestions
- remove un-needed libraries for the tests
- rebase to master (e25950808f4eee)
I didn't change anything in the code logic, and I have been using mesa with the
patch applied for a few days now without noticing any regressions.
As noted before, I don't have write access to mesa-git, so I'll need someone
who sponsors this patch.
Many thanks for any additional comments,
Gert
mesa/st: glsl_to_tgsi move some helper classes to extra files
mesa/st: glsl_to_tgsi: implement new temporary register lifetime
tracker
mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime
tracker
mesa/st: glsl_to_tgsi: add register rename mapping evaluator
mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
configure.ac | 1 +
src/mesa/Makefile.am | 2 +-
src/mesa/Makefile.sources | 4 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 315 +-----
src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 207 ++++
src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 165 +++
.../state_tracker/st_glsl_to_tgsi_temprename.cpp | 786 ++++++++++++++
.../state_tracker/st_glsl_to_tgsi_temprename.h | 36 +
src/mesa/state_tracker/tests/Makefile.am | 37 +
.../tests/test_glsl_to_tgsi_lifetime.cpp | 1070 ++++++++++++++++++++
10 files changed, 2335 insertions(+), 288 deletions(-)
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
create mode 100644 src/mesa/state_tracker/tests/Makefile.am
create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
Gert Wollny
2017-06-27 10:19:55 UTC
Thanks for the update. Do you have the series on an accessible git 
repository somewhere? E.g. on GitHub or Gitlab or wherever? That
would be helpful.
I've put the code on

https://github.com/gerddie/mesa

and I am already pushing my changes there. I will not try to squash the
commits into the original patch set though.
Also, please don't continue to chain this thread for subsequent 
versions. It's okay to do this for quick follow-up fixes to patches
you sent, but it's not a good idea when there are many revisions.
Sorry, I was not sure what is the best way.

Best,
Gert
Nicolai Hähnle
2017-06-28 08:11:05 UTC
Post by Gert Wollny
Post by Nicolai Hähnle
Thanks for the update. Do you have the series on an accessible git
repository somewhere? E.g. on GitHub or Gitlab or wherever? That
would be helpful.
I've put the code on
https://github.com/gerddie/mesa
and I am already pushing my changes there. I will not try to squash the
commits into the original patch set though.
Thanks!
Post by Gert Wollny
Post by Nicolai Hähnle
Also, please don't continue to chain this thread for subsequent
versions. It's okay to do this for quick follow-up fixes to patches
you sent, but it's not a good idea when there are many revisions.
Sorry, I was not sure what is the best way.
No worries :)

Cheers,
Nicolai
Post by Gert Wollny
Best,
Gert
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.