Discussion:
[PATCH 01/18] intel/genxml: Normalize GS_STATE.
Rafael Antognolli
2017-06-16 23:31:14 UTC
Rename "Rendering Enable" to "Rendering Enabled", so it matches gen6+.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/intel/genxml/gen5.xml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
index 65479d2..4651192 100644
--- a/src/intel/genxml/gen5.xml
+++ b/src/intel/genxml/gen5.xml
@@ -485,7 +485,7 @@
<field name="Number of URB Entries" start="139" end="146" type="uint"/>
<field name="GS Statistics Enable" start="138" end="138" type="bool"/>
<field name="SO Statistics Enable" start="137" end="137" type="bool"/>
- <field name="Rendering Enable" start="136" end="136" type="bool"/>
+ <field name="Rendering Enabled" start="136" end="136" type="bool"/>
<field name="Sampler State Pointer" start="165" end="191" type="address"/>
<field name="Sampler Count" start="160" end="162" type="uint"/>
<field name="Reorder Enable" start="222" end="222" type="bool"/>
--
2.9.4
Rafael Antognolli
2017-06-16 23:31:16 UTC
This is a bitmask, so it can't be a boolean. Also rename it so it matches
gen6+.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/intel/genxml/gen4.xml | 2 +-
src/intel/genxml/gen45.xml | 2 +-
src/intel/genxml/gen5.xml | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/intel/genxml/gen4.xml b/src/intel/genxml/gen4.xml
index d873422..e327bf4 100644
--- a/src/intel/genxml/gen4.xml
+++ b/src/intel/genxml/gen4.xml
@@ -379,7 +379,7 @@
<field name="Viewport Z ClipTest Enable" start="187" end="187" type="bool"/>
<field name="Guardband ClipTest Enable" start="186" end="186" type="bool"/>
<field name="UserClipFlags MustClip Enable" start="184" end="184" type="bool"/>
- <field name="UserClipFlags ClipTest Enable Bitmask" start="176" end="183" type="bool"/>
+ <field name="UserClipDistance ClipTest Enable Bitmask" start="176" end="183" type="uint"/>
<field name="Clip Mode" start="173" end="175" type="uint" prefix="CLIPMODE">
<value name="NORMAL" value="0"/>
<value name="ALL" value="1"/>
diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
index 4064278..864946a 100644
--- a/src/intel/genxml/gen45.xml
+++ b/src/intel/genxml/gen45.xml
@@ -381,7 +381,7 @@
<field name="Guardband ClipTest Enable" start="186" end="186" type="bool"/>
<field name="Negative W ClipTest Enable" start="185" end="185" type="bool"/>
<field name="UserClipFlags MustClip Enable" start="184" end="184" type="bool"/>
- <field name="UserClipFlags ClipTest Enable Bitmask" start="176" end="183" type="bool"/>
+ <field name="UserClipDistance ClipTest Enable Bitmask" start="176" end="183" type="uint"/>
<field name="Clip Mode" start="173" end="175" type="uint" prefix="CLIPMODE">
<value name="NORMAL" value="0"/>
<value name="ALL" value="1"/>
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
index d759ad1..1dd7fed 100644
--- a/src/intel/genxml/gen5.xml
+++ b/src/intel/genxml/gen5.xml
@@ -379,7 +379,7 @@
<field name="Guardband ClipTest Enable" start="186" end="186" type="bool"/>
<field name="Negative W ClipTest Enable" start="185" end="185" type="bool"/>
<field name="UserClipFlags MustClip Enable" start="184" end="184" type="bool"/>
- <field name="UserClipFlags ClipTest Enable Bitmask" start="176" end="183" type="bool"/>
+ <field name="UserClipDistance ClipTest Enable Bitmask" start="176" end="183" type="uint"/>
<field name="Clip Mode" start="173" end="175" type="uint" prefix="CLIPMODE">
<value name="NORMAL" value="0"/>
<value name="ALL" value="1"/>
--
2.9.4
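
To illustrate why an 8-bit mask cannot be typed as a boolean, here is a minimal standalone sketch. It is not Mesa's generated pack code: `pack_uint_field` and `pack_bool_field` are hypothetical helpers, and the sketch assumes that a `type="bool"` field ends up as a C `bool` before it is packed. The bit range 176..183 is the one from the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not the genxml-generated code): OR `value` into the
 * bit range [start, end] of a state structure stored as 32-bit dwords.
 * The range must sit inside a single dword. */
static void
pack_uint_field(uint32_t *dw, unsigned start, unsigned end, uint32_t value)
{
   assert(start <= end && start / 32 == end / 32);
   const unsigned width = end - start + 1;
   const uint32_t mask = width == 32 ? ~0u : (1u << width) - 1;
   assert((value & ~mask) == 0);   /* value must fit in the field */
   dw[start / 32] |= value << (start % 32);
}

/* Same range, but the value passes through a C bool first (assumption:
 * this is what a type="bool" field amounts to). Any nonzero input
 * collapses to 1, so only the lowest bit of the range can ever be set. */
static void
pack_bool_field(uint32_t *dw, unsigned start, unsigned end, bool value)
{
   pack_uint_field(dw, start, end, value);
}
```

With start=176 and end=183 the field lands in dword 5, bits 16..23. Packing the mask 0x2a through the `uint` path keeps all eight bits, while the `bool` path stores only a single 1 in bit 16.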
Rafael Antognolli
2017-06-16 23:31:15 UTC
These fields are set by brw_clip_unit, so we need them when converting to
genxml.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/intel/genxml/gen45.xml | 1 +
src/intel/genxml/gen5.xml | 1 +
2 files changed, 2 insertions(+)

diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
index 59460fd..4064278 100644
--- a/src/intel/genxml/gen45.xml
+++ b/src/intel/genxml/gen45.xml
@@ -379,6 +379,7 @@
<field name="Viewport XY ClipTest Enable" start="188" end="188" type="bool"/>
<field name="Viewport Z ClipTest Enable" start="187" end="187" type="bool"/>
<field name="Guardband ClipTest Enable" start="186" end="186" type="bool"/>
+ <field name="Negative W ClipTest Enable" start="185" end="185" type="bool"/>
<field name="UserClipFlags MustClip Enable" start="184" end="184" type="bool"/>
<field name="UserClipFlags ClipTest Enable Bitmask" start="176" end="183" type="bool"/>
<field name="Clip Mode" start="173" end="175" type="uint" prefix="CLIPMODE">
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
index 4651192..d759ad1 100644
--- a/src/intel/genxml/gen5.xml
+++ b/src/intel/genxml/gen5.xml
@@ -377,6 +377,7 @@
<field name="Viewport XY ClipTest Enable" start="188" end="188" type="bool"/>
<field name="Viewport Z ClipTest Enable" start="187" end="187" type="bool"/>
<field name="Guardband ClipTest Enable" start="186" end="186" type="bool"/>
+ <field name="Negative W ClipTest Enable" start="185" end="185" type="bool"/>
<field name="UserClipFlags MustClip Enable" start="184" end="184" type="bool"/>
<field name="UserClipFlags ClipTest Enable Bitmask" start="176" end="183" type="bool"/>
<field name="Clip Mode" start="173" end="175" type="uint" prefix="CLIPMODE">
--
2.9.4
Rafael Antognolli
2017-06-16 23:31:17 UTC
Just because it's not set doesn't mean that it doesn't exist. And since the
field is there on newer gens, having it on gen5 simplifies the code when
porting gen5 and lower.

Also add missing value to API Mode on CLIP_STATE on gen4.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/intel/genxml/gen4.xml | 1 +
src/intel/genxml/gen5.xml | 4 ++++
2 files changed, 5 insertions(+)

diff --git a/src/intel/genxml/gen4.xml b/src/intel/genxml/gen4.xml
index e327bf4..5fcd6c9 100644
--- a/src/intel/genxml/gen4.xml
+++ b/src/intel/genxml/gen4.xml
@@ -370,6 +370,7 @@
<field name="GS Output Object Statistics Enable" start="138" end="138" type="bool"/>
<field name="API Mode" start="190" end="190" type="uint" prefix="APIMODE">
<value name="OGL" value="0"/>
+ <value name="D3D" value="1"/>
</field>
<field name="Vertex Position Space" start="189" end="189" type="uint" prefix="VPOS">
<value name="NDCSPACE" value="0"/>
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
index 1dd7fed..d6b2662 100644
--- a/src/intel/genxml/gen5.xml
+++ b/src/intel/genxml/gen5.xml
@@ -370,6 +370,10 @@
<field name="Maximum Number of Threads" start="153" end="158" type="uint"/>
<field name="URB Entry Allocation Size" start="147" end="151" type="uint"/>
<field name="Number of URB Entries" start="139" end="146" type="uint"/>
+ <field name="API Mode" start="190" end="190" type="uint" prefix="APIMODE">
+ <value name="OGL" value="0"/>
+ <value name="D3D" value="1"/>
+ </field>
<field name="Vertex Position Space" start="189" end="189" type="uint" prefix="VPOS">
<value name="NDCSPACE" value="0"/>
<value name="SCREENSPACE" value="1"/>
--
2.9.4
Rafael Antognolli
2017-06-16 23:31:25 UTC
Use set_blend_entry_bits and set_depth_stencil_bits to fill most of the
color calc struct, and then manually update the rest.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/brw_cc.c | 174 --------------------------
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 92 --------------
src/mesa/drivers/dri/i965/brw_util.h | 1 -
src/mesa/drivers/dri/i965/genX_state_upload.c | 99 ++++++++++++---
5 files changed, 81 insertions(+), 286 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index cdaa696..503ec83 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -39,180 +39,6 @@
#include "main/stencil.h"
#include "intel_batchbuffer.h"

-/**
- * Modify blend function to force destination alpha to 1.0
- *
- * If \c function specifies a blend function that uses destination alpha,
- * replace it with a function that hard-wires destination alpha to 1.0. This
- * is used when rendering to xRGB targets.
- */
-GLenum
-brw_fix_xRGB_alpha(GLenum function)
-{
- switch (function) {
- case GL_DST_ALPHA:
- return GL_ONE;
-
- case GL_ONE_MINUS_DST_ALPHA:
- case GL_SRC_ALPHA_SATURATE:
- return GL_ZERO;
- }
-
- return function;
-}
-
-/**
- * Creates a CC unit packet from the current blend state.
- */
-static void upload_cc_unit(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
- struct brw_cc_unit_state *cc;
-
- cc = brw_state_batch(brw, sizeof(*cc), 64, &brw->cc.state_offset);
- memset(cc, 0, sizeof(*cc));
-
- /* _NEW_STENCIL | _NEW_BUFFERS */
- if (ctx->Stencil._Enabled) {
- const unsigned back = ctx->Stencil._BackFace;
-
- cc->cc0.stencil_enable = 1;
- cc->cc0.stencil_func =
- intel_translate_compare_func(ctx->Stencil.Function[0]);
- cc->cc0.stencil_fail_op =
- intel_translate_stencil_op(ctx->Stencil.FailFunc[0]);
- cc->cc0.stencil_pass_depth_fail_op =
- intel_translate_stencil_op(ctx->Stencil.ZFailFunc[0]);
- cc->cc0.stencil_pass_depth_pass_op =
- intel_translate_stencil_op(ctx->Stencil.ZPassFunc[0]);
- cc->cc1.stencil_ref = _mesa_get_stencil_ref(ctx, 0);
- cc->cc1.stencil_write_mask = ctx->Stencil.WriteMask[0];
- cc->cc1.stencil_test_mask = ctx->Stencil.ValueMask[0];
-
- if (ctx->Stencil._TestTwoSide) {
- cc->cc0.bf_stencil_enable = 1;
- cc->cc0.bf_stencil_func =
- intel_translate_compare_func(ctx->Stencil.Function[back]);
- cc->cc0.bf_stencil_fail_op =
- intel_translate_stencil_op(ctx->Stencil.FailFunc[back]);
- cc->cc0.bf_stencil_pass_depth_fail_op =
- intel_translate_stencil_op(ctx->Stencil.ZFailFunc[back]);
- cc->cc0.bf_stencil_pass_depth_pass_op =
- intel_translate_stencil_op(ctx->Stencil.ZPassFunc[back]);
- cc->cc1.bf_stencil_ref = _mesa_get_stencil_ref(ctx, back);
- cc->cc2.bf_stencil_write_mask = ctx->Stencil.WriteMask[back];
- cc->cc2.bf_stencil_test_mask = ctx->Stencil.ValueMask[back];
- }
-
- /* Not really sure about this:
- */
- if (ctx->Stencil.WriteMask[0] ||
- (ctx->Stencil._TestTwoSide && ctx->Stencil.WriteMask[back]))
- cc->cc0.stencil_write_enable = 1;
- }
-
- /* _NEW_COLOR */
- if (ctx->Color.ColorLogicOpEnabled && ctx->Color.LogicOp != GL_COPY) {
- cc->cc2.logicop_enable = 1;
- cc->cc5.logicop_func = intel_translate_logic_op(ctx->Color.LogicOp);
- } else if (ctx->Color.BlendEnabled && !ctx->Color._AdvancedBlendMode) {
- GLenum eqRGB = ctx->Color.Blend[0].EquationRGB;
- GLenum eqA = ctx->Color.Blend[0].EquationA;
- GLenum srcRGB = ctx->Color.Blend[0].SrcRGB;
- GLenum dstRGB = ctx->Color.Blend[0].DstRGB;
- GLenum srcA = ctx->Color.Blend[0].SrcA;
- GLenum dstA = ctx->Color.Blend[0].DstA;
-
- if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
- srcRGB = dstRGB = GL_ONE;
- }
-
- if (eqA == GL_MIN || eqA == GL_MAX) {
- srcA = dstA = GL_ONE;
- }
-
- /* If the renderbuffer is XRGB, we have to frob the blend function to
- * force the destination alpha to 1.0. This means replacing GL_DST_ALPHA
- * with GL_ONE and GL_ONE_MINUS_DST_ALPHA with GL_ZERO.
- */
- const struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0];
- if (rb && !_mesa_base_format_has_channel(rb->_BaseFormat,
- GL_TEXTURE_ALPHA_TYPE)) {
- srcRGB = brw_fix_xRGB_alpha(srcRGB);
- srcA = brw_fix_xRGB_alpha(srcA);
- dstRGB = brw_fix_xRGB_alpha(dstRGB);
- dstA = brw_fix_xRGB_alpha(dstA);
- }
-
- cc->cc6.dest_blend_factor = brw_translate_blend_factor(dstRGB);
- cc->cc6.src_blend_factor = brw_translate_blend_factor(srcRGB);
- cc->cc6.blend_function = brw_translate_blend_equation(eqRGB);
-
- cc->cc5.ia_dest_blend_factor = brw_translate_blend_factor(dstA);
- cc->cc5.ia_src_blend_factor = brw_translate_blend_factor(srcA);
- cc->cc5.ia_blend_function = brw_translate_blend_equation(eqA);
-
- cc->cc3.blend_enable = 1;
- cc->cc3.ia_blend_enable = (srcA != srcRGB ||
- dstA != dstRGB ||
- eqA != eqRGB);
- }
-
- /* _NEW_BUFFERS */
- if (ctx->Color.AlphaEnabled && ctx->DrawBuffer->_NumColorDrawBuffers <= 1) {
- cc->cc3.alpha_test = 1;
- cc->cc3.alpha_test_func =
- intel_translate_compare_func(ctx->Color.AlphaFunc);
- cc->cc3.alpha_test_format = BRW_ALPHATEST_FORMAT_UNORM8;
-
- UNCLAMPED_FLOAT_TO_UBYTE(cc->cc7.alpha_ref.ub[0], ctx->Color.AlphaRef);
- }
-
- if (ctx->Color.DitherFlag) {
- cc->cc5.dither_enable = 1;
- cc->cc6.y_dither_offset = 0;
- cc->cc6.x_dither_offset = 0;
- }
-
- /* _NEW_DEPTH */
- if (ctx->Depth.Test) {
- cc->cc2.depth_test = 1;
- cc->cc2.depth_test_function =
- intel_translate_compare_func(ctx->Depth.Func);
- cc->cc2.depth_write_enable = brw_depth_writes_enabled(brw);
- }
-
- if (brw->stats_wm)
- cc->cc5.statistics_enable = 1;
-
- /* BRW_NEW_CC_VP */
- cc->cc4.cc_viewport_state_offset = (brw->batch.bo->offset64 +
- brw->cc.vp_offset) >> 5; /* reloc */
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-
- /* Emit CC viewport relocation */
- brw_emit_reloc(&brw->batch,
- (brw->cc.state_offset +
- offsetof(struct brw_cc_unit_state, cc4)),
- brw->batch.bo, brw->cc.vp_offset,
- I915_GEM_DOMAIN_INSTRUCTION, 0);
-}
-
-const struct brw_tracked_state brw_cc_unit = {
- .dirty = {
- .mesa = _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_DEPTH |
- _NEW_STENCIL,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_CC_VP |
- BRW_NEW_STATS_WM,
- },
- .emit = upload_cc_unit,
-};
-
static void upload_blend_constant_color(struct brw_context *brw)
{
struct gl_context *ctx = &brw->ctx;
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 5f5ba64..ead0078 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -42,7 +42,6 @@ extern "C" {
enum intel_msaa_layout;

extern const struct brw_tracked_state brw_blend_constant_color;
-extern const struct brw_tracked_state brw_cc_unit;
extern const struct brw_tracked_state brw_clip_unit;
extern const struct brw_tracked_state brw_vs_pull_constants;
extern const struct brw_tracked_state brw_tcs_pull_constants;
diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 6d3f80d..12f3024 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -180,98 +180,6 @@ struct brw_clip_unit_state
float viewport_ymax;
};

-struct brw_cc_unit_state
-{
- struct
- {
- unsigned pad0:3;
- unsigned bf_stencil_pass_depth_pass_op:3;
- unsigned bf_stencil_pass_depth_fail_op:3;
- unsigned bf_stencil_fail_op:3;
- unsigned bf_stencil_func:3;
- unsigned bf_stencil_enable:1;
- unsigned pad1:2;
- unsigned stencil_write_enable:1;
- unsigned stencil_pass_depth_pass_op:3;
- unsigned stencil_pass_depth_fail_op:3;
- unsigned stencil_fail_op:3;
- unsigned stencil_func:3;
- unsigned stencil_enable:1;
- } cc0;
-
-
- struct
- {
- unsigned bf_stencil_ref:8;
- unsigned stencil_write_mask:8;
- unsigned stencil_test_mask:8;
- unsigned stencil_ref:8;
- } cc1;
-
-
- struct
- {
- unsigned logicop_enable:1;
- unsigned pad0:10;
- unsigned depth_write_enable:1;
- unsigned depth_test_function:3;
- unsigned depth_test:1;
- unsigned bf_stencil_write_mask:8;
- unsigned bf_stencil_test_mask:8;
- } cc2;
-
-
- struct
- {
- unsigned pad0:8;
- unsigned alpha_test_func:3;
- unsigned alpha_test:1;
- unsigned blend_enable:1;
- unsigned ia_blend_enable:1;
- unsigned pad1:1;
- unsigned alpha_test_format:1;
- unsigned pad2:16;
- } cc3;
-
- struct
- {
- unsigned pad0:5;
- unsigned cc_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
- } cc4;
-
- struct
- {
- unsigned pad0:2;
- unsigned ia_dest_blend_factor:5;
- unsigned ia_src_blend_factor:5;
- unsigned ia_blend_function:3;
- unsigned statistics_enable:1;
- unsigned logicop_func:4;
- unsigned pad1:11;
- unsigned dither_enable:1;
- } cc5;
-
- struct
- {
- unsigned clamp_post_alpha_blend:1;
- unsigned clamp_pre_alpha_blend:1;
- unsigned clamp_range:2;
- unsigned pad0:11;
- unsigned y_dither_offset:2;
- unsigned x_dither_offset:2;
- unsigned dest_blend_factor:5;
- unsigned src_blend_factor:5;
- unsigned blend_function:3;
- } cc6;
-
- struct {
- union {
- float f;
- uint8_t ub[4];
- } alpha_ref;
- } cc7;
-};
-
struct brw_gs_unit_state
{
struct thread0 thread0;
diff --git a/src/mesa/drivers/dri/i965/brw_util.h b/src/mesa/drivers/dri/i965/brw_util.h
index 8142860..095c43a 100644
--- a/src/mesa/drivers/dri/i965/brw_util.h
+++ b/src/mesa/drivers/dri/i965/brw_util.h
@@ -38,7 +38,6 @@

extern GLuint brw_translate_blend_factor( GLenum factor );
extern GLuint brw_translate_blend_equation( GLenum mode );
-extern GLenum brw_fix_xRGB_alpha(GLenum function);

static inline float
brw_get_line_width(struct brw_context *brw)
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 8e99c89..d8dcaf4 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1177,7 +1177,7 @@ set_depth_stencil_bits(struct brw_context *brw, DEPTH_STENCIL_GENXML *ds)
struct gl_stencil_attrib *stencil = &ctx->Stencil;
const int b = stencil->_BackFace;

- if (depth->Test && depth_irb) {
+ if (depth->Test && (GEN_GEN <= 5 || depth_irb)) {
ds->DepthTestEnable = true;
ds->DepthBufferWriteEnable = brw_depth_writes_enabled(brw);
ds->DepthTestFunction = intel_translate_compare_func(depth->Func);
@@ -1214,9 +1214,11 @@ set_depth_stencil_bits(struct brw_context *brw, DEPTH_STENCIL_GENXML *ds)
intel_translate_stencil_op(stencil->ZFailFunc[b]);
}

-#if GEN_GEN >= 9
+#if GEN_GEN <= 5 || GEN_GEN >= 9
ds->StencilReferenceValue = _mesa_get_stencil_ref(ctx, 0);
- ds->BackfaceStencilReferenceValue = _mesa_get_stencil_ref(ctx, b);
+ ds->BackfaceStencilReferenceValue =
+ GEN_GEN >= 9 || stencil->_TestTwoSide ?
+ _mesa_get_stencil_ref(ctx, b) : 0;
#endif
}
}
@@ -2527,6 +2529,28 @@ fix_dual_blend_alpha_to_one(GLenum function)
#define blend_factor(x) brw_translate_blend_factor(x)
#define blend_eqn(x) brw_translate_blend_equation(x)

+/**
+ * Modify blend function to force destination alpha to 1.0
+ *
+ * If \c function specifies a blend function that uses destination alpha,
+ * replace it with a function that hard-wires destination alpha to 1.0. This
+ * is used when rendering to xRGB targets.
+ */
+static GLenum
+brw_fix_xRGB_alpha(GLenum function)
+{
+ switch (function) {
+ case GL_DST_ALPHA:
+ return GL_ONE;
+
+ case GL_ONE_MINUS_DST_ALPHA:
+ case GL_SRC_ALPHA_SATURATE:
+ return GL_ZERO;
+ }
+
+ return function;
+}
+
#if GEN_GEN >= 6
typedef struct GENX(BLEND_STATE_ENTRY) BLEND_ENTRY_GENXML;
#else
@@ -2552,6 +2576,9 @@ set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
*/
const bool integer = ctx->DrawBuffer->_IntegerBuffers & (0x1 << i);

+ const unsigned blend_enabled = GEN_GEN >= 6 ?
+ ctx->Color.BlendEnabled & (1 << i) : ctx->Color.BlendEnabled;
+
/* _NEW_COLOR */
if (ctx->Color.ColorLogicOpEnabled) {
GLenum rb_type = rb ? _mesa_get_format_datatype(rb->Format)
@@ -2567,8 +2594,8 @@ set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
entry->LogicOpFunction =
intel_translate_logic_op(ctx->Color.LogicOp);
}
- } else if (ctx->Color.BlendEnabled & (1 << i) && !integer &&
- !ctx->Color._AdvancedBlendMode) {
+ } else if (blend_enabled && !ctx->Color._AdvancedBlendMode
+ && (GEN_GEN <= 5 || !integer)) {
GLenum eqRGB = ctx->Color.Blend[i].EquationRGB;
GLenum eqA = ctx->Color.Blend[i].EquationA;
GLenum srcRGB = ctx->Color.Blend[i].SrcRGB;
@@ -2994,17 +3021,40 @@ static const struct brw_tracked_state genX(multisample_state) = {

/* ---------------------------------------------------------------------- */

-#if GEN_GEN >= 6
static void
genX(upload_color_calc_state)(struct brw_context *brw)
{
struct gl_context *ctx = &brw->ctx;

brw_state_emit(brw, GENX(COLOR_CALC_STATE), 64, &brw->cc.state_offset, cc) {
+#if GEN_GEN <= 5
+ cc.IndependentAlphaBlendEnable =
+ set_blend_entry_bits(brw, &cc, 0, false);
+ set_depth_stencil_bits(brw, &cc);
+
+ if (ctx->Color.AlphaEnabled &&
+ ctx->DrawBuffer->_NumColorDrawBuffers <= 1) {
+ cc.AlphaTestEnable = true;
+ cc.AlphaTestFunction =
+ intel_translate_compare_func(ctx->Color.AlphaFunc);
+ }
+
+ if (ctx->Color.DitherFlag) {
+ cc.ColorDitherEnable = true;
+ cc.XDitherOffset = 0;
+ cc.YDitherOffset = 0;
+ }
+
+ cc.StatisticsEnable = brw->stats_wm;
+
+ cc.CCViewportStatePointer =
+ instruction_ro_bo(brw->batch.bo, brw->cc.vp_offset);
+#else
/* _NEW_COLOR */
- cc.AlphaTestFormat = ALPHATEST_UNORM8;
- UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
- ctx->Color.AlphaRef);
+ cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
+ cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
+ cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
+ cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];

#if GEN_GEN < 9
/* _NEW_STENCIL */
@@ -3013,34 +3063,47 @@ genX(upload_color_calc_state)(struct brw_context *brw)
_mesa_get_stencil_ref(ctx, ctx->Stencil._BackFace);
#endif

+#endif
+
/* _NEW_COLOR */
- cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
- cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
- cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
- cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];
+ if (GEN_GEN >= 6 ||
+ (ctx->Color.AlphaEnabled &&
+ ctx->DrawBuffer->_NumColorDrawBuffers <= 1)) {
+ cc.AlphaTestFormat = ALPHATEST_UNORM8;
+ UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
+ ctx->Color.AlphaRef);
+ }
}

+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_CC_STATE_POINTERS), ptr) {
ptr.ColorCalcStatePointer = brw->cc.state_offset;
#if GEN_GEN != 7
ptr.ColorCalcStatePointerValid = true;
#endif
}
+#endif
+
+ brw->ctx.NewDriverState |= GEN_GEN <= 5 ? BRW_NEW_GEN4_UNIT_STATE : 0;
}

static const struct brw_tracked_state genX(color_calc_state) = {
.dirty = {
.mesa = _NEW_COLOR |
- _NEW_STENCIL,
+ _NEW_STENCIL |
+ (GEN_GEN <= 5 ? _NEW_BUFFERS |
+ _NEW_DEPTH
+ : 0),
.brw = BRW_NEW_BATCH |
BRW_NEW_BLORP |
- BRW_NEW_CC_STATE |
- BRW_NEW_STATE_BASE_ADDRESS,
+ (GEN_GEN <= 5 ? BRW_NEW_CC_VP |
+ BRW_NEW_STATS_WM
+ : BRW_NEW_CC_STATE |
+ BRW_NEW_STATE_BASE_ADDRESS),
},
.emit = genX(upload_color_calc_state),
};

-#endif

/* ---------------------------------------------------------------------- */

@@ -4252,7 +4315,7 @@ genX(init_atoms)(struct brw_context *brw)
&brw_recalculate_urb_fence,

&genX(cc_vp),
- &brw_cc_unit,
+ &genX(color_calc_state),

/* Surface state setup. Must come before the VS/WM unit. The binding
* table upload must be last.
--
2.9.4
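
The xRGB fix-up that this patch moves into genX_state_upload.c can be tried outside the driver. The sketch below is a standalone approximation, not the driver code: the GL enum values are copied into a local enum so it builds without the OpenGL headers, and `fix_xRGB_alpha` and `check_xrgb_fixups` are names made up for the illustration.

```c
#include <assert.h>

typedef unsigned int GLenum;

/* Values as defined by the OpenGL headers, repeated here so the sketch
 * has no external dependencies. */
enum {
   GL_ZERO                = 0,
   GL_ONE                 = 1,
   GL_SRC_ALPHA           = 0x0302,
   GL_DST_ALPHA           = 0x0304,
   GL_ONE_MINUS_DST_ALPHA = 0x0305,
   GL_SRC_ALPHA_SATURATE  = 0x0308,
};

/* Mirrors the switch in brw_fix_xRGB_alpha: treat destination alpha (Ad)
 * as 1.0, so Ad -> 1, (1 - Ad) -> 0 and min(As, 1 - Ad) -> 0. */
static GLenum
fix_xRGB_alpha(GLenum function)
{
   switch (function) {
   case GL_DST_ALPHA:
      return GL_ONE;

   case GL_ONE_MINUS_DST_ALPHA:
   case GL_SRC_ALPHA_SATURATE:
      return GL_ZERO;
   }

   return function;
}

/* Hypothetical self-check covering each branch of the switch. */
static void
check_xrgb_fixups(void)
{
   assert(fix_xRGB_alpha(GL_DST_ALPHA) == GL_ONE);
   assert(fix_xRGB_alpha(GL_ONE_MINUS_DST_ALPHA) == GL_ZERO);
   assert(fix_xRGB_alpha(GL_SRC_ALPHA_SATURATE) == GL_ZERO);
   assert(fix_xRGB_alpha(GL_SRC_ALPHA) == GL_SRC_ALPHA);
}
```

Factors that do not read destination alpha, such as GL_SRC_ALPHA, pass through unchanged.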
Kenneth Graunke
2017-06-17 18:31:51 UTC
Post by Rafael Antognolli
Use set_blend_entry_bits and set_depth_stencil_bits to fill most of the
color calc struct, and then manually update the rest.
---
src/mesa/drivers/dri/i965/brw_cc.c | 174 --------------------------
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 92 --------------
src/mesa/drivers/dri/i965/brw_util.h | 1 -
src/mesa/drivers/dri/i965/genX_state_upload.c | 99 ++++++++++++---
5 files changed, 81 insertions(+), 286 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index cdaa696..503ec83 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -39,180 +39,6 @@
#include "main/stencil.h"
#include "intel_batchbuffer.h"
-/**
- * Modify blend function to force destination alpha to 1.0
- *
- * If \c function specifies a blend function that uses destination alpha,
- * replace it with a function that hard-wires destination alpha to 1.0. This
- * is used when rendering to xRGB targets.
- */
-GLenum
-brw_fix_xRGB_alpha(GLenum function)
-{
- switch (function) {
- case GL_DST_ALPHA:
- return GL_ONE;
-
- case GL_ONE_MINUS_DST_ALPHA:
- case GL_SRC_ALPHA_SATURATE:
- return GL_ZERO;
- }
-
- return function;
-}
-
-/**
- * Creates a CC unit packet from the current blend state.
- */
-static void upload_cc_unit(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
- struct brw_cc_unit_state *cc;
-
- cc = brw_state_batch(brw, sizeof(*cc), 64, &brw->cc.state_offset);
- memset(cc, 0, sizeof(*cc));
-
- /* _NEW_STENCIL | _NEW_BUFFERS */
- if (ctx->Stencil._Enabled) {
- const unsigned back = ctx->Stencil._BackFace;
-
- cc->cc0.stencil_enable = 1;
- cc->cc0.stencil_func =
- intel_translate_compare_func(ctx->Stencil.Function[0]);
- cc->cc0.stencil_fail_op =
- intel_translate_stencil_op(ctx->Stencil.FailFunc[0]);
- cc->cc0.stencil_pass_depth_fail_op =
- intel_translate_stencil_op(ctx->Stencil.ZFailFunc[0]);
- cc->cc0.stencil_pass_depth_pass_op =
- intel_translate_stencil_op(ctx->Stencil.ZPassFunc[0]);
- cc->cc1.stencil_ref = _mesa_get_stencil_ref(ctx, 0);
- cc->cc1.stencil_write_mask = ctx->Stencil.WriteMask[0];
- cc->cc1.stencil_test_mask = ctx->Stencil.ValueMask[0];
-
- if (ctx->Stencil._TestTwoSide) {
- cc->cc0.bf_stencil_enable = 1;
- cc->cc0.bf_stencil_func =
- intel_translate_compare_func(ctx->Stencil.Function[back]);
- cc->cc0.bf_stencil_fail_op =
- intel_translate_stencil_op(ctx->Stencil.FailFunc[back]);
- cc->cc0.bf_stencil_pass_depth_fail_op =
- intel_translate_stencil_op(ctx->Stencil.ZFailFunc[back]);
- cc->cc0.bf_stencil_pass_depth_pass_op =
- intel_translate_stencil_op(ctx->Stencil.ZPassFunc[back]);
- cc->cc1.bf_stencil_ref = _mesa_get_stencil_ref(ctx, back);
- cc->cc2.bf_stencil_write_mask = ctx->Stencil.WriteMask[back];
- cc->cc2.bf_stencil_test_mask = ctx->Stencil.ValueMask[back];
- }
-
- /* Not really sure about this:
- */
- if (ctx->Stencil.WriteMask[0] ||
- (ctx->Stencil._TestTwoSide && ctx->Stencil.WriteMask[back]))
- cc->cc0.stencil_write_enable = 1;
- }
-
- /* _NEW_COLOR */
- if (ctx->Color.ColorLogicOpEnabled && ctx->Color.LogicOp != GL_COPY) {
- cc->cc2.logicop_enable = 1;
- cc->cc5.logicop_func = intel_translate_logic_op(ctx->Color.LogicOp);
- } else if (ctx->Color.BlendEnabled && !ctx->Color._AdvancedBlendMode) {
- GLenum eqRGB = ctx->Color.Blend[0].EquationRGB;
- GLenum eqA = ctx->Color.Blend[0].EquationA;
- GLenum srcRGB = ctx->Color.Blend[0].SrcRGB;
- GLenum dstRGB = ctx->Color.Blend[0].DstRGB;
- GLenum srcA = ctx->Color.Blend[0].SrcA;
- GLenum dstA = ctx->Color.Blend[0].DstA;
-
- if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
- srcRGB = dstRGB = GL_ONE;
- }
-
- if (eqA == GL_MIN || eqA == GL_MAX) {
- srcA = dstA = GL_ONE;
- }
-
- /* If the renderbuffer is XRGB, we have to frob the blend function to
- * force the destination alpha to 1.0. This means replacing GL_DST_ALPHA
- * with GL_ONE and GL_ONE_MINUS_DST_ALPHA with GL_ZERO.
- */
- const struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0];
- if (rb && !_mesa_base_format_has_channel(rb->_BaseFormat,
- GL_TEXTURE_ALPHA_TYPE)) {
- srcRGB = brw_fix_xRGB_alpha(srcRGB);
- srcA = brw_fix_xRGB_alpha(srcA);
- dstRGB = brw_fix_xRGB_alpha(dstRGB);
- dstA = brw_fix_xRGB_alpha(dstA);
- }
-
- cc->cc6.dest_blend_factor = brw_translate_blend_factor(dstRGB);
- cc->cc6.src_blend_factor = brw_translate_blend_factor(srcRGB);
- cc->cc6.blend_function = brw_translate_blend_equation(eqRGB);
-
- cc->cc5.ia_dest_blend_factor = brw_translate_blend_factor(dstA);
- cc->cc5.ia_src_blend_factor = brw_translate_blend_factor(srcA);
- cc->cc5.ia_blend_function = brw_translate_blend_equation(eqA);
-
- cc->cc3.blend_enable = 1;
- cc->cc3.ia_blend_enable = (srcA != srcRGB ||
- dstA != dstRGB ||
- eqA != eqRGB);
- }
-
- /* _NEW_BUFFERS */
- if (ctx->Color.AlphaEnabled && ctx->DrawBuffer->_NumColorDrawBuffers <= 1) {
- cc->cc3.alpha_test = 1;
- cc->cc3.alpha_test_func =
- intel_translate_compare_func(ctx->Color.AlphaFunc);
- cc->cc3.alpha_test_format = BRW_ALPHATEST_FORMAT_UNORM8;
-
- UNCLAMPED_FLOAT_TO_UBYTE(cc->cc7.alpha_ref.ub[0], ctx->Color.AlphaRef);
- }
-
- if (ctx->Color.DitherFlag) {
- cc->cc5.dither_enable = 1;
- cc->cc6.y_dither_offset = 0;
- cc->cc6.x_dither_offset = 0;
- }
-
- /* _NEW_DEPTH */
- if (ctx->Depth.Test) {
- cc->cc2.depth_test = 1;
- cc->cc2.depth_test_function =
- intel_translate_compare_func(ctx->Depth.Func);
- cc->cc2.depth_write_enable = brw_depth_writes_enabled(brw);
- }
-
- if (brw->stats_wm)
- cc->cc5.statistics_enable = 1;
-
- /* BRW_NEW_CC_VP */
- cc->cc4.cc_viewport_state_offset = (brw->batch.bo->offset64 +
- brw->cc.vp_offset) >> 5; /* reloc */
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-
- /* Emit CC viewport relocation */
- brw_emit_reloc(&brw->batch,
- (brw->cc.state_offset +
- offsetof(struct brw_cc_unit_state, cc4)),
- brw->batch.bo, brw->cc.vp_offset,
- I915_GEM_DOMAIN_INSTRUCTION, 0);
-}
-
-const struct brw_tracked_state brw_cc_unit = {
- .dirty = {
- .mesa = _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_DEPTH |
- _NEW_STENCIL,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_CC_VP |
- BRW_NEW_STATS_WM,
- },
- .emit = upload_cc_unit,
-};
-
static void upload_blend_constant_color(struct brw_context *brw)
{
struct gl_context *ctx = &brw->ctx;
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 5f5ba64..ead0078 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -42,7 +42,6 @@ extern "C" {
enum intel_msaa_layout;
extern const struct brw_tracked_state brw_blend_constant_color;
-extern const struct brw_tracked_state brw_cc_unit;
extern const struct brw_tracked_state brw_clip_unit;
extern const struct brw_tracked_state brw_vs_pull_constants;
extern const struct brw_tracked_state brw_tcs_pull_constants;
diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 6d3f80d..12f3024 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -180,98 +180,6 @@ struct brw_clip_unit_state
float viewport_ymax;
};
-struct brw_cc_unit_state
-{
- struct
- {
- unsigned pad0:3;
- unsigned bf_stencil_pass_depth_pass_op:3;
- unsigned bf_stencil_pass_depth_fail_op:3;
- unsigned bf_stencil_fail_op:3;
- unsigned bf_stencil_func:3;
- unsigned bf_stencil_enable:1;
- unsigned pad1:2;
- unsigned stencil_write_enable:1;
- unsigned stencil_pass_depth_pass_op:3;
- unsigned stencil_pass_depth_fail_op:3;
- unsigned stencil_fail_op:3;
- unsigned stencil_func:3;
- unsigned stencil_enable:1;
- } cc0;
-
-
- struct
- {
- unsigned bf_stencil_ref:8;
- unsigned stencil_write_mask:8;
- unsigned stencil_test_mask:8;
- unsigned stencil_ref:8;
- } cc1;
-
-
- struct
- {
- unsigned logicop_enable:1;
- unsigned pad0:10;
- unsigned depth_write_enable:1;
- unsigned depth_test_function:3;
- unsigned depth_test:1;
- unsigned bf_stencil_write_mask:8;
- unsigned bf_stencil_test_mask:8;
- } cc2;
-
-
- struct
- {
- unsigned pad0:8;
- unsigned alpha_test_func:3;
- unsigned alpha_test:1;
- unsigned blend_enable:1;
- unsigned ia_blend_enable:1;
- unsigned pad1:1;
- unsigned alpha_test_format:1;
- unsigned pad2:16;
- } cc3;
-
- struct
- {
- unsigned pad0:5;
- unsigned cc_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
- } cc4;
-
- struct
- {
- unsigned pad0:2;
- unsigned ia_dest_blend_factor:5;
- unsigned ia_src_blend_factor:5;
- unsigned ia_blend_function:3;
- unsigned statistics_enable:1;
- unsigned logicop_func:4;
- unsigned pad1:11;
- unsigned dither_enable:1;
- } cc5;
-
- struct
- {
- unsigned clamp_post_alpha_blend:1;
- unsigned clamp_pre_alpha_blend:1;
- unsigned clamp_range:2;
- unsigned pad0:11;
- unsigned y_dither_offset:2;
- unsigned x_dither_offset:2;
- unsigned dest_blend_factor:5;
- unsigned src_blend_factor:5;
- unsigned blend_function:3;
- } cc6;
-
- struct {
- union {
- float f;
- uint8_t ub[4];
- } alpha_ref;
- } cc7;
-};
-
struct brw_gs_unit_state
{
struct thread0 thread0;
diff --git a/src/mesa/drivers/dri/i965/brw_util.h b/src/mesa/drivers/dri/i965/brw_util.h
index 8142860..095c43a 100644
--- a/src/mesa/drivers/dri/i965/brw_util.h
+++ b/src/mesa/drivers/dri/i965/brw_util.h
@@ -38,7 +38,6 @@
extern GLuint brw_translate_blend_factor( GLenum factor );
extern GLuint brw_translate_blend_equation( GLenum mode );
-extern GLenum brw_fix_xRGB_alpha(GLenum function);
static inline float
brw_get_line_width(struct brw_context *brw)
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 8e99c89..d8dcaf4 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1177,7 +1177,7 @@ set_depth_stencil_bits(struct brw_context *brw, DEPTH_STENCIL_GENXML *ds)
struct gl_stencil_attrib *stencil = &ctx->Stencil;
const int b = stencil->_BackFace;
- if (depth->Test && depth_irb) {
+ if (depth->Test && (GEN_GEN <= 5 || depth_irb)) {
We should just always do the depth_irb check.
Post by Rafael Antognolli
ds->DepthTestEnable = true;
ds->DepthBufferWriteEnable = brw_depth_writes_enabled(brw);
ds->DepthTestFunction = intel_translate_compare_func(depth->Func);
@@ -1214,9 +1214,11 @@ set_depth_stencil_bits(struct brw_context *brw, DEPTH_STENCIL_GENXML *ds)
intel_translate_stencil_op(stencil->ZFailFunc[b]);
}
-#if GEN_GEN >= 9
+#if GEN_GEN <= 5 || GEN_GEN >= 9
ds->StencilReferenceValue = _mesa_get_stencil_ref(ctx, 0);
- ds->BackfaceStencilReferenceValue = _mesa_get_stencil_ref(ctx, b);
+ ds->BackfaceStencilReferenceValue =
+ GEN_GEN >= 9 || stencil->_TestTwoSide ?
+ _mesa_get_stencil_ref(ctx, b) : 0;
It should be harmless to program the backface ref value - it should
be ignored/unused if _TestTwoSide isn't set. So we could leave this as is.
Post by Rafael Antognolli
#endif
}
}
@@ -2527,6 +2529,28 @@ fix_dual_blend_alpha_to_one(GLenum function)
#define blend_factor(x) brw_translate_blend_factor(x)
#define blend_eqn(x) brw_translate_blend_equation(x)
+/**
+ * Modify blend function to force destination alpha to 1.0
+ *
+ * If \c function specifies a blend function that uses destination alpha,
+ * replace it with a function that hard-wires destination alpha to 1.0. This
+ * is used when rendering to xRGB targets.
+ */
+static GLenum
+brw_fix_xRGB_alpha(GLenum function)
+{
+ switch (function) {
+ case GL_DST_ALPHA:
+ return GL_ONE;
+
+ case GL_ONE_MINUS_DST_ALPHA:
+ case GL_SRC_ALPHA_SATURATE:
+ return GL_ZERO;
+ }
+
+ return function;
+}
+
#if GEN_GEN >= 6
typedef struct GENX(BLEND_STATE_ENTRY) BLEND_ENTRY_GENXML;
#else
@@ -2552,6 +2576,9 @@ set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
*/
const bool integer = ctx->DrawBuffer->_IntegerBuffers & (0x1 << i);
+ const unsigned blend_enabled = GEN_GEN >= 6 ?
+ ctx->Color.BlendEnabled & (1 << i) : ctx->Color.BlendEnabled;
+
I think always using ctx->Color.BlendEnabled & (1 << i) should be fine.
That corresponds to the enable bit for blend entry 0, which is the
only one we're handling here. (Gen4-5 only support a single entry.)
Post by Rafael Antognolli
/* _NEW_COLOR */
if (ctx->Color.ColorLogicOpEnabled) {
GLenum rb_type = rb ? _mesa_get_format_datatype(rb->Format)
@@ -2567,8 +2594,8 @@ set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
entry->LogicOpFunction =
intel_translate_logic_op(ctx->Color.LogicOp);
}
- } else if (ctx->Color.BlendEnabled & (1 << i) && !integer &&
- !ctx->Color._AdvancedBlendMode) {
+ } else if (blend_enabled && !ctx->Color._AdvancedBlendMode
+ && (GEN_GEN <= 5 || !integer)) {
The GEN_GEN <= 5 || !integer seems bizarre, and I wonder whether it's
correct. However, you're just preserving the existing behavior, so
that's fine - we may want to revisit it in the future.
Post by Rafael Antognolli
GLenum eqRGB = ctx->Color.Blend[i].EquationRGB;
GLenum eqA = ctx->Color.Blend[i].EquationA;
GLenum srcRGB = ctx->Color.Blend[i].SrcRGB;
@@ -2994,17 +3021,40 @@ static const struct brw_tracked_state genX(multisample_state) = {
/* ---------------------------------------------------------------------- */
-#if GEN_GEN >= 6
static void
genX(upload_color_calc_state)(struct brw_context *brw)
{
struct gl_context *ctx = &brw->ctx;
brw_state_emit(brw, GENX(COLOR_CALC_STATE), 64, &brw->cc.state_offset, cc) {
+#if GEN_GEN <= 5
+ cc.IndependentAlphaBlendEnable =
+ set_blend_entry_bits(brw, &cc, 0, false);
+ set_depth_stencil_bits(brw, &cc);
+
+ if (ctx->Color.AlphaEnabled &&
+ ctx->DrawBuffer->_NumColorDrawBuffers <= 1) {
+ cc.AlphaTestEnable = true;
+ cc.AlphaTestFunction =
+ intel_translate_compare_func(ctx->Color.AlphaFunc);
+ }
We should probably make this consistent across generations:

if (ctx->Color.AlphaEnabled &&
(GEN_GEN >= 6 || ctx->DrawBuffer->_NumColorDrawBuffers <= 1)) {
cc.AlphaTestEnable = true;
cc.AlphaTestFunction =
intel_translate_compare_func(ctx->Color.AlphaFunc);
cc.AlphaTestFormat = ALPHATEST_UNORM8;
UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
ctx->Color.AlphaRef);
}

Alternatively, it should be harmless to set AlphaTestFormat,
AlphaReferenceValueAsUNORM8, and AlphaTestFunction even if alpha testing
is disabled, so we could also do:

cc.AlphaTestEnable = ctx->Color.AlphaEnabled &&
(GEN_GEN >= 6 || ctx->DrawBuffer->_NumColorDrawBuffers <= 1);

cc.AlphaTestFormat = ALPHATEST_UNORM8;
cc.AlphaTestFunction =
intel_translate_compare_func(ctx->Color.AlphaFunc);
UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
ctx->Color.AlphaRef);

I suppose that does a bit of extra work when alpha testing is disabled.
Post by Rafael Antognolli
+
+ if (ctx->Color.DitherFlag) {
+ cc.ColorDitherEnable = true;
+ cc.XDitherOffset = 0;
+ cc.YDitherOffset = 0;
+ }
I'd probably write this as:

cc.ColorDitherEnable = ctx->Color.DitherFlag;

(the offset values will be zero-initialized by default).
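
As a rough illustration of why the explicit offset assignments are redundant, here is a minimal standalone sketch. It assumes the state struct starts out zeroed before any field is set; the struct and function names are hypothetical stand-ins, not the genxml types.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the packed CC state; not the genxml struct. */
struct sketch_cc {
   bool color_dither_enable;
   unsigned x_dither_offset;
   unsigned y_dither_offset;
};

/* Start from an all-zero struct and set only the enable bit. The two
 * offsets keep their zero value without being assigned explicitly. */
struct sketch_cc
sketch_pack_dither(bool dither_flag)
{
   struct sketch_cc cc = {0};
   cc.color_dither_enable = dither_flag;
   return cc;
}
```

With the flag on or off, both offsets read back as 0, so the single assignment behaves the same as the three-line conditional.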
Post by Rafael Antognolli
+
+ cc.StatisticsEnable = brw->stats_wm;
+
+ cc.CCViewportStatePointer =
+ instruction_ro_bo(brw->batch.bo, brw->cc.vp_offset);
+#else
/* _NEW_COLOR */
- cc.AlphaTestFormat = ALPHATEST_UNORM8;
- UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
- ctx->Color.AlphaRef);
+ cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
+ cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
+ cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
+ cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];
#if GEN_GEN < 9
/* _NEW_STENCIL */
@@ -3013,34 +3063,47 @@ genX(upload_color_calc_state)(struct brw_context *brw)
_mesa_get_stencil_ref(ctx, ctx->Stencil._BackFace);
#endif
+#endif
+
/* _NEW_COLOR */
- cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
- cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
- cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
- cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];
+ if (GEN_GEN >= 6 ||
+ (ctx->Color.AlphaEnabled &&
+ ctx->DrawBuffer->_NumColorDrawBuffers <= 1)) {
+ cc.AlphaTestFormat = ALPHATEST_UNORM8;
+ UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
+ ctx->Color.AlphaRef);
+ }
}
+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_CC_STATE_POINTERS), ptr) {
ptr.ColorCalcStatePointer = brw->cc.state_offset;
#if GEN_GEN != 7
ptr.ColorCalcStatePointerValid = true;
#endif
}
+#endif
+
+ brw->ctx.NewDriverState |= GEN_GEN <= 5 ? BRW_NEW_GEN4_UNIT_STATE : 0;
Perhaps just do:

#if GEN_GEN >= 6
...
#else
ctx->NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
#endif

since we've already got generation-specific code blocks here.
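
A small standalone sketch of that shape, for illustration only. SKETCH_GEN and SKETCH_NEW_GEN4_UNIT_STATE are made-up stand-ins for GEN_GEN and BRW_NEW_GEN4_UNIT_STATE, and the bit value is arbitrary.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real values come from the driver headers. */
#ifndef SKETCH_GEN
#define SKETCH_GEN 5
#endif
#define SKETCH_NEW_GEN4_UNIT_STATE (1ull << 3)

/* The generation check is resolved by the preprocessor, so each build
 * only contains the branch that applies to it. */
uint64_t
sketch_flag_unit_state(uint64_t new_driver_state)
{
#if SKETCH_GEN >= 6
   /* Gen6+ would emit the state pointer packet here instead. */
   return new_driver_state;
#else
   return new_driver_state | SKETCH_NEW_GEN4_UNIT_STATE;
#endif
}
```

Compared with the ternary, this avoids OR-ing in a constant 0 on Gen6+ and keeps the Gen4-5 flag next to the other generation-specific code.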
Post by Rafael Antognolli
}
static const struct brw_tracked_state genX(color_calc_state) = {
.dirty = {
.mesa = _NEW_COLOR |
- _NEW_STENCIL,
+ _NEW_STENCIL |
+ (GEN_GEN <= 5 ? _NEW_BUFFERS |
+ _NEW_DEPTH
+ : 0),
.brw = BRW_NEW_BATCH |
BRW_NEW_BLORP |
- BRW_NEW_CC_STATE |
- BRW_NEW_STATE_BASE_ADDRESS,
+ (GEN_GEN <= 5 ? BRW_NEW_CC_VP |
+ BRW_NEW_STATS_WM
+ : BRW_NEW_CC_STATE |
+ BRW_NEW_STATE_BASE_ADDRESS),
},
.emit = genX(upload_color_calc_state),
};
-#endif
/* ---------------------------------------------------------------------- */
@@ -4252,7 +4315,7 @@ genX(init_atoms)(struct brw_context *brw)
&brw_recalculate_urb_fence,
&genX(cc_vp),
- &brw_cc_unit,
+ &genX(color_calc_state),
/* Surface state setup. Must come before the VS/WM unit. The binding
* table upload must be last.
Rafael Antognolli
2017-06-19 18:13:05 UTC
Post by Kenneth Graunke
Post by Rafael Antognolli
Use set_blend_entry_bits and set_depth_stencil_bits to fill most of the
color calc struct, and then manually update the rest.
---
src/mesa/drivers/dri/i965/brw_cc.c | 174 --------------------------
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 92 --------------
src/mesa/drivers/dri/i965/brw_util.h | 1 -
src/mesa/drivers/dri/i965/genX_state_upload.c | 99 ++++++++++++---
5 files changed, 81 insertions(+), 286 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index cdaa696..503ec83 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -39,180 +39,6 @@
#include "main/stencil.h"
#include "intel_batchbuffer.h"
-/**
- * Modify blend function to force destination alpha to 1.0
- *
- * If \c function specifies a blend function that uses destination alpha,
- * replace it with a function that hard-wires destination alpha to 1.0. This
- * is used when rendering to xRGB targets.
- */
-GLenum
-brw_fix_xRGB_alpha(GLenum function)
-{
- switch (function) {
- case GL_DST_ALPHA:
- return GL_ONE;
-
- case GL_ONE_MINUS_DST_ALPHA:
- case GL_SRC_ALPHA_SATURATE:
- return GL_ZERO;
- }
-
- return function;
-}
-
-/**
- * Creates a CC unit packet from the current blend state.
- */
-static void upload_cc_unit(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
- struct brw_cc_unit_state *cc;
-
- cc = brw_state_batch(brw, sizeof(*cc), 64, &brw->cc.state_offset);
- memset(cc, 0, sizeof(*cc));
-
- /* _NEW_STENCIL | _NEW_BUFFERS */
- if (ctx->Stencil._Enabled) {
- const unsigned back = ctx->Stencil._BackFace;
-
- cc->cc0.stencil_enable = 1;
- cc->cc0.stencil_func =
- intel_translate_compare_func(ctx->Stencil.Function[0]);
- cc->cc0.stencil_fail_op =
- intel_translate_stencil_op(ctx->Stencil.FailFunc[0]);
- cc->cc0.stencil_pass_depth_fail_op =
- intel_translate_stencil_op(ctx->Stencil.ZFailFunc[0]);
- cc->cc0.stencil_pass_depth_pass_op =
- intel_translate_stencil_op(ctx->Stencil.ZPassFunc[0]);
- cc->cc1.stencil_ref = _mesa_get_stencil_ref(ctx, 0);
- cc->cc1.stencil_write_mask = ctx->Stencil.WriteMask[0];
- cc->cc1.stencil_test_mask = ctx->Stencil.ValueMask[0];
-
- if (ctx->Stencil._TestTwoSide) {
- cc->cc0.bf_stencil_enable = 1;
- cc->cc0.bf_stencil_func =
- intel_translate_compare_func(ctx->Stencil.Function[back]);
- cc->cc0.bf_stencil_fail_op =
- intel_translate_stencil_op(ctx->Stencil.FailFunc[back]);
- cc->cc0.bf_stencil_pass_depth_fail_op =
- intel_translate_stencil_op(ctx->Stencil.ZFailFunc[back]);
- cc->cc0.bf_stencil_pass_depth_pass_op =
- intel_translate_stencil_op(ctx->Stencil.ZPassFunc[back]);
- cc->cc1.bf_stencil_ref = _mesa_get_stencil_ref(ctx, back);
- cc->cc2.bf_stencil_write_mask = ctx->Stencil.WriteMask[back];
- cc->cc2.bf_stencil_test_mask = ctx->Stencil.ValueMask[back];
- }
-
- /* Not really sure about this:
- */
- if (ctx->Stencil.WriteMask[0] ||
- (ctx->Stencil._TestTwoSide && ctx->Stencil.WriteMask[back]))
- cc->cc0.stencil_write_enable = 1;
- }
-
- /* _NEW_COLOR */
- if (ctx->Color.ColorLogicOpEnabled && ctx->Color.LogicOp != GL_COPY) {
- cc->cc2.logicop_enable = 1;
- cc->cc5.logicop_func = intel_translate_logic_op(ctx->Color.LogicOp);
- } else if (ctx->Color.BlendEnabled && !ctx->Color._AdvancedBlendMode) {
- GLenum eqRGB = ctx->Color.Blend[0].EquationRGB;
- GLenum eqA = ctx->Color.Blend[0].EquationA;
- GLenum srcRGB = ctx->Color.Blend[0].SrcRGB;
- GLenum dstRGB = ctx->Color.Blend[0].DstRGB;
- GLenum srcA = ctx->Color.Blend[0].SrcA;
- GLenum dstA = ctx->Color.Blend[0].DstA;
-
- if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
- srcRGB = dstRGB = GL_ONE;
- }
-
- if (eqA == GL_MIN || eqA == GL_MAX) {
- srcA = dstA = GL_ONE;
- }
-
- /* If the renderbuffer is XRGB, we have to frob the blend function to
- * force the destination alpha to 1.0. This means replacing GL_DST_ALPHA
- * with GL_ONE and GL_ONE_MINUS_DST_ALPHA with GL_ZERO.
- */
- const struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0];
- if (rb && !_mesa_base_format_has_channel(rb->_BaseFormat,
- GL_TEXTURE_ALPHA_TYPE)) {
- srcRGB = brw_fix_xRGB_alpha(srcRGB);
- srcA = brw_fix_xRGB_alpha(srcA);
- dstRGB = brw_fix_xRGB_alpha(dstRGB);
- dstA = brw_fix_xRGB_alpha(dstA);
- }
-
- cc->cc6.dest_blend_factor = brw_translate_blend_factor(dstRGB);
- cc->cc6.src_blend_factor = brw_translate_blend_factor(srcRGB);
- cc->cc6.blend_function = brw_translate_blend_equation(eqRGB);
-
- cc->cc5.ia_dest_blend_factor = brw_translate_blend_factor(dstA);
- cc->cc5.ia_src_blend_factor = brw_translate_blend_factor(srcA);
- cc->cc5.ia_blend_function = brw_translate_blend_equation(eqA);
-
- cc->cc3.blend_enable = 1;
- cc->cc3.ia_blend_enable = (srcA != srcRGB ||
- dstA != dstRGB ||
- eqA != eqRGB);
- }
-
- /* _NEW_BUFFERS */
- if (ctx->Color.AlphaEnabled && ctx->DrawBuffer->_NumColorDrawBuffers <= 1) {
- cc->cc3.alpha_test = 1;
- cc->cc3.alpha_test_func =
- intel_translate_compare_func(ctx->Color.AlphaFunc);
- cc->cc3.alpha_test_format = BRW_ALPHATEST_FORMAT_UNORM8;
-
- UNCLAMPED_FLOAT_TO_UBYTE(cc->cc7.alpha_ref.ub[0], ctx->Color.AlphaRef);
- }
-
- if (ctx->Color.DitherFlag) {
- cc->cc5.dither_enable = 1;
- cc->cc6.y_dither_offset = 0;
- cc->cc6.x_dither_offset = 0;
- }
-
- /* _NEW_DEPTH */
- if (ctx->Depth.Test) {
- cc->cc2.depth_test = 1;
- cc->cc2.depth_test_function =
- intel_translate_compare_func(ctx->Depth.Func);
- cc->cc2.depth_write_enable = brw_depth_writes_enabled(brw);
- }
-
- if (brw->stats_wm)
- cc->cc5.statistics_enable = 1;
-
- /* BRW_NEW_CC_VP */
- cc->cc4.cc_viewport_state_offset = (brw->batch.bo->offset64 +
- brw->cc.vp_offset) >> 5; /* reloc */
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-
- /* Emit CC viewport relocation */
- brw_emit_reloc(&brw->batch,
- (brw->cc.state_offset +
- offsetof(struct brw_cc_unit_state, cc4)),
- brw->batch.bo, brw->cc.vp_offset,
- I915_GEM_DOMAIN_INSTRUCTION, 0);
-}
-
-const struct brw_tracked_state brw_cc_unit = {
- .dirty = {
- .mesa = _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_DEPTH |
- _NEW_STENCIL,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_CC_VP |
- BRW_NEW_STATS_WM,
- },
- .emit = upload_cc_unit,
-};
-
static void upload_blend_constant_color(struct brw_context *brw)
{
struct gl_context *ctx = &brw->ctx;
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 5f5ba64..ead0078 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -42,7 +42,6 @@ extern "C" {
enum intel_msaa_layout;
extern const struct brw_tracked_state brw_blend_constant_color;
-extern const struct brw_tracked_state brw_cc_unit;
extern const struct brw_tracked_state brw_clip_unit;
extern const struct brw_tracked_state brw_vs_pull_constants;
extern const struct brw_tracked_state brw_tcs_pull_constants;
diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 6d3f80d..12f3024 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -180,98 +180,6 @@ struct brw_clip_unit_state
float viewport_ymax;
};
-struct brw_cc_unit_state
-{
- struct
- {
- unsigned pad0:3;
- unsigned bf_stencil_pass_depth_pass_op:3;
- unsigned bf_stencil_pass_depth_fail_op:3;
- unsigned bf_stencil_fail_op:3;
- unsigned bf_stencil_func:3;
- unsigned bf_stencil_enable:1;
- unsigned pad1:2;
- unsigned stencil_write_enable:1;
- unsigned stencil_pass_depth_pass_op:3;
- unsigned stencil_pass_depth_fail_op:3;
- unsigned stencil_fail_op:3;
- unsigned stencil_func:3;
- unsigned stencil_enable:1;
- } cc0;
-
-
- struct
- {
- unsigned bf_stencil_ref:8;
- unsigned stencil_write_mask:8;
- unsigned stencil_test_mask:8;
- unsigned stencil_ref:8;
- } cc1;
-
-
- struct
- {
- unsigned logicop_enable:1;
- unsigned pad0:10;
- unsigned depth_write_enable:1;
- unsigned depth_test_function:3;
- unsigned depth_test:1;
- unsigned bf_stencil_write_mask:8;
- unsigned bf_stencil_test_mask:8;
- } cc2;
-
-
- struct
- {
- unsigned pad0:8;
- unsigned alpha_test_func:3;
- unsigned alpha_test:1;
- unsigned blend_enable:1;
- unsigned ia_blend_enable:1;
- unsigned pad1:1;
- unsigned alpha_test_format:1;
- unsigned pad2:16;
- } cc3;
-
- struct
- {
- unsigned pad0:5;
- unsigned cc_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */
- } cc4;
-
- struct
- {
- unsigned pad0:2;
- unsigned ia_dest_blend_factor:5;
- unsigned ia_src_blend_factor:5;
- unsigned ia_blend_function:3;
- unsigned statistics_enable:1;
- unsigned logicop_func:4;
- unsigned pad1:11;
- unsigned dither_enable:1;
- } cc5;
-
- struct
- {
- unsigned clamp_post_alpha_blend:1;
- unsigned clamp_pre_alpha_blend:1;
- unsigned clamp_range:2;
- unsigned pad0:11;
- unsigned y_dither_offset:2;
- unsigned x_dither_offset:2;
- unsigned dest_blend_factor:5;
- unsigned src_blend_factor:5;
- unsigned blend_function:3;
- } cc6;
-
- struct {
- union {
- float f;
- uint8_t ub[4];
- } alpha_ref;
- } cc7;
-};
-
struct brw_gs_unit_state
{
struct thread0 thread0;
diff --git a/src/mesa/drivers/dri/i965/brw_util.h b/src/mesa/drivers/dri/i965/brw_util.h
index 8142860..095c43a 100644
--- a/src/mesa/drivers/dri/i965/brw_util.h
+++ b/src/mesa/drivers/dri/i965/brw_util.h
@@ -38,7 +38,6 @@
extern GLuint brw_translate_blend_factor( GLenum factor );
extern GLuint brw_translate_blend_equation( GLenum mode );
-extern GLenum brw_fix_xRGB_alpha(GLenum function);
static inline float
brw_get_line_width(struct brw_context *brw)
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 8e99c89..d8dcaf4 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1177,7 +1177,7 @@ set_depth_stencil_bits(struct brw_context *brw, DEPTH_STENCIL_GENXML *ds)
struct gl_stencil_attrib *stencil = &ctx->Stencil;
const int b = stencil->_BackFace;
- if (depth->Test && depth_irb) {
+ if (depth->Test && (GEN_GEN <= 5 || depth_irb)) {
We should just always do the depth_irb check.
Post by Rafael Antognolli
ds->DepthTestEnable = true;
ds->DepthBufferWriteEnable = brw_depth_writes_enabled(brw);
ds->DepthTestFunction = intel_translate_compare_func(depth->Func);
@@ -1214,9 +1214,11 @@ set_depth_stencil_bits(struct brw_context *brw, DEPTH_STENCIL_GENXML *ds)
intel_translate_stencil_op(stencil->ZFailFunc[b]);
}
-#if GEN_GEN >= 9
+#if GEN_GEN <= 5 || GEN_GEN >= 9
ds->StencilReferenceValue = _mesa_get_stencil_ref(ctx, 0);
- ds->BackfaceStencilReferenceValue = _mesa_get_stencil_ref(ctx, b);
+ ds->BackfaceStencilReferenceValue =
+ GEN_GEN >= 9 || stencil->_TestTwoSide ?
+ _mesa_get_stencil_ref(ctx, b) : 0;
It should be harmless to program the backface ref value - it should
be ignored/unused if _TestTwoSide isn't set. So we could leave this as is.
Post by Rafael Antognolli
#endif
}
}
@@ -2527,6 +2529,28 @@ fix_dual_blend_alpha_to_one(GLenum function)
#define blend_factor(x) brw_translate_blend_factor(x)
#define blend_eqn(x) brw_translate_blend_equation(x)
+/**
+ * Modify blend function to force destination alpha to 1.0
+ *
+ * If \c function specifies a blend function that uses destination alpha,
+ * replace it with a function that hard-wires destination alpha to 1.0. This
+ * is used when rendering to xRGB targets.
+ */
+static GLenum
+brw_fix_xRGB_alpha(GLenum function)
+{
+ switch (function) {
+ case GL_DST_ALPHA:
+ return GL_ONE;
+
+ case GL_ONE_MINUS_DST_ALPHA:
+ case GL_SRC_ALPHA_SATURATE:
+ return GL_ZERO;
+ }
+
+ return function;
+}
+
#if GEN_GEN >= 6
typedef struct GENX(BLEND_STATE_ENTRY) BLEND_ENTRY_GENXML;
#else
@@ -2552,6 +2576,9 @@ set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
*/
const bool integer = ctx->DrawBuffer->_IntegerBuffers & (0x1 << i);
+ const unsigned blend_enabled = GEN_GEN >= 6 ?
+ ctx->Color.BlendEnabled & (1 << i) : ctx->Color.BlendEnabled;
+
I think always using ctx->Color.BlendEnabled & (1 << i) should be fine.
That corresponds to the enable bit for blend entry 0, which is the
only one we're handling here. (Gen4-5 only support a single entry.)
I think I tried doing that, but I remember a test failing because
BlendEnabled was set to something like 0x2, which the whole-mask check
catches. But it could have been some other mistake on my side.

I'll double check this and see if it works like this.
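
To make the difference concrete, here is a minimal standalone sketch of the two checks. The helper names are made up for illustration and are not Mesa functions; the mask simply stands in for ctx->Color.BlendEnabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: is blending enabled for draw buffer i?
 * Mirrors the expression ctx->Color.BlendEnabled & (1 << i). */
bool
sketch_blend_enabled_for_buffer(uint32_t blend_enabled_mask, unsigned i)
{
   return (blend_enabled_mask & (1u << i)) != 0;
}

/* Hypothetical helper: is blending enabled for any draw buffer?
 * Mirrors testing ctx->Color.BlendEnabled as a whole. */
bool
sketch_blend_enabled_for_any_buffer(uint32_t blend_enabled_mask)
{
   return blend_enabled_mask != 0;
}
```

With a mask of 0x2 and i == 0, the per-buffer check is false while the whole-mask check is true, which is the case where the two versions of the condition disagree.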
Post by Kenneth Graunke
Post by Rafael Antognolli
/* _NEW_COLOR */
if (ctx->Color.ColorLogicOpEnabled) {
GLenum rb_type = rb ? _mesa_get_format_datatype(rb->Format)
@@ -2567,8 +2594,8 @@ set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
entry->LogicOpFunction =
intel_translate_logic_op(ctx->Color.LogicOp);
}
- } else if (ctx->Color.BlendEnabled & (1 << i) && !integer &&
- !ctx->Color._AdvancedBlendMode) {
+ } else if (blend_enabled && !ctx->Color._AdvancedBlendMode
+ && (GEN_GEN <= 5 || !integer)) {
The GEN_GEN <= 5 || !integer seems bizarre, and I wonder whether it's
correct. However, you're just preserving the existing behavior, so
that's fine - we may want to revisit it in the future.
Post by Rafael Antognolli
GLenum eqRGB = ctx->Color.Blend[i].EquationRGB;
GLenum eqA = ctx->Color.Blend[i].EquationA;
GLenum srcRGB = ctx->Color.Blend[i].SrcRGB;
@@ -2994,17 +3021,40 @@ static const struct brw_tracked_state genX(multisample_state) = {
/* ---------------------------------------------------------------------- */
-#if GEN_GEN >= 6
static void
genX(upload_color_calc_state)(struct brw_context *brw)
{
struct gl_context *ctx = &brw->ctx;
brw_state_emit(brw, GENX(COLOR_CALC_STATE), 64, &brw->cc.state_offset, cc) {
+#if GEN_GEN <= 5
+ cc.IndependentAlphaBlendEnable =
+ set_blend_entry_bits(brw, &cc, 0, false);
+ set_depth_stencil_bits(brw, &cc);
+
+ if (ctx->Color.AlphaEnabled &&
+ ctx->DrawBuffer->_NumColorDrawBuffers <= 1) {
+ cc.AlphaTestEnable = true;
+ cc.AlphaTestFunction =
+ intel_translate_compare_func(ctx->Color.AlphaFunc);
+ }
We should probably make this consistent across generations:
if (ctx->Color.AlphaEnabled &&
(GEN_GEN >= 6 || ctx->DrawBuffer->_NumColorDrawBuffers <= 1)) {
cc.AlphaTestEnable = true;
cc.AlphaTestFunction =
intel_translate_compare_func(ctx->Color.AlphaFunc);
cc.AlphaTestFormat = ALPHATEST_UNORM8;
UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
ctx->Color.AlphaRef);
}
Hmm... COLOR_CALC_STATE on gen >= 6 doesn't have the fields AlphaTestEnable
and AlphaTestFunction. Instead, they are inside the BLEND_STATE. Are you
suggesting that I use some kind of polymorphism here too, to set these fields
on both the BLEND_STATE and COLOR_CALC_STATE, depending on the generation?
Post by Kenneth Graunke
Alternatively, it should be harmless to set AlphaTestFormat,
AlphaReferenceValueAsUNORM8, and AlphaTestFunction even if alpha testing
is disabled, so we could also do:
cc.AlphaTestEnable = ctx->Color.AlphaEnabled &&
(GEN_GEN >= 6 || ctx->DrawBuffer->_NumColorDrawBuffers <= 1);
cc.AlphaTestFormat = ALPHATEST_UNORM8;
cc.AlphaTestFunction =
intel_translate_compare_func(ctx->Color.AlphaFunc);
UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
ctx->Color.AlphaRef);
I suppose that does a bit of extra work when alpha testing is disabled.
Post by Rafael Antognolli
+
+ if (ctx->Color.DitherFlag) {
+ cc.ColorDitherEnable = true;
+ cc.XDitherOffset = 0;
+ cc.YDitherOffset = 0;
+ }
I'd probably write this as:
cc.ColorDitherEnable = ctx->Color.DitherFlag;
(the offset values will be zero-initialized by default).
Post by Rafael Antognolli
+
+ cc.StatisticsEnable = brw->stats_wm;
+
+ cc.CCViewportStatePointer =
+ instruction_ro_bo(brw->batch.bo, brw->cc.vp_offset);
+#else
/* _NEW_COLOR */
- cc.AlphaTestFormat = ALPHATEST_UNORM8;
- UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
- ctx->Color.AlphaRef);
+ cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
+ cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
+ cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
+ cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];
#if GEN_GEN < 9
/* _NEW_STENCIL */
@@ -3013,34 +3063,47 @@ genX(upload_color_calc_state)(struct brw_context *brw)
_mesa_get_stencil_ref(ctx, ctx->Stencil._BackFace);
#endif
+#endif
+
/* _NEW_COLOR */
- cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
- cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
- cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
- cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];
+ if (GEN_GEN >= 6 ||
+ (ctx->Color.AlphaEnabled &&
+ ctx->DrawBuffer->_NumColorDrawBuffers <= 1)) {
+ cc.AlphaTestFormat = ALPHATEST_UNORM8;
+ UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
+ ctx->Color.AlphaRef);
+ }
}
+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_CC_STATE_POINTERS), ptr) {
ptr.ColorCalcStatePointer = brw->cc.state_offset;
#if GEN_GEN != 7
ptr.ColorCalcStatePointerValid = true;
#endif
}
+#endif
+
+ brw->ctx.NewDriverState |= GEN_GEN <= 5 ? BRW_NEW_GEN4_UNIT_STATE : 0;
Perhaps just do:
#if GEN_GEN >= 6
...
#else
ctx->NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
#endif
since we've already got generation-specific code blocks here.
Post by Rafael Antognolli
}
static const struct brw_tracked_state genX(color_calc_state) = {
.dirty = {
.mesa = _NEW_COLOR |
- _NEW_STENCIL,
+ _NEW_STENCIL |
+ (GEN_GEN <= 5 ? _NEW_BUFFERS |
+ _NEW_DEPTH
+ : 0),
.brw = BRW_NEW_BATCH |
BRW_NEW_BLORP |
- BRW_NEW_CC_STATE |
- BRW_NEW_STATE_BASE_ADDRESS,
+ (GEN_GEN <= 5 ? BRW_NEW_CC_VP |
+ BRW_NEW_STATS_WM
+ : BRW_NEW_CC_STATE |
+ BRW_NEW_STATE_BASE_ADDRESS),
},
.emit = genX(upload_color_calc_state),
};
-#endif
/* ---------------------------------------------------------------------- */
@@ -4252,7 +4315,7 @@ genX(init_atoms)(struct brw_context *brw)
&brw_recalculate_urb_fence,
&genX(cc_vp),
- &brw_cc_unit,
+ &genX(color_calc_state),
/* Surface state setup. Must come before the VS/WM unit. The binding
* table upload must be last.
Kenneth Graunke
2017-06-19 22:25:17 UTC
Post by Rafael Antognolli
Post by Kenneth Graunke
Post by Rafael Antognolli
Use set_blend_entry_bits and set_depth_stencil_bits to fill most of the
color calc struct, and then manually update the rest.
---
src/mesa/drivers/dri/i965/brw_cc.c | 174 --------------------------
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 92 --------------
src/mesa/drivers/dri/i965/brw_util.h | 1 -
src/mesa/drivers/dri/i965/genX_state_upload.c | 99 ++++++++++++---
5 files changed, 81 insertions(+), 286 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index cdaa696..503ec83 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
[snip]
Post by Rafael Antognolli
Post by Kenneth Graunke
Post by Rafael Antognolli
@@ -2552,6 +2576,9 @@ set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
*/
const bool integer = ctx->DrawBuffer->_IntegerBuffers & (0x1 << i);
+ const unsigned blend_enabled = GEN_GEN >= 6 ?
+ ctx->Color.BlendEnabled & (1 << i) : ctx->Color.BlendEnabled;
+
I think always using ctx->Color.BlendEnabled & (1 << i) should be fine.
That corresponds to the enable bit for blend entry 0, which is the
only one we're handling here. (Gen4-5 only support a single entry.)
I think I tried doing that, but I remember a test failing because
BlendEnabled was set to something like 0x2, which the whole-mask check
catches. But it could have been some other mistake on my side.
I'll double check this and see if it works like this.
Okay. We can always leave it as is and revisit it later, if you prefer.
It's simpler, and I thought it would be safe, but maybe it isn't.
Post by Rafael Antognolli
Post by Kenneth Graunke
Post by Rafael Antognolli
/* _NEW_COLOR */
if (ctx->Color.ColorLogicOpEnabled) {
GLenum rb_type = rb ? _mesa_get_format_datatype(rb->Format)
@@ -2567,8 +2594,8 @@ set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
entry->LogicOpFunction =
intel_translate_logic_op(ctx->Color.LogicOp);
}
- } else if (ctx->Color.BlendEnabled & (1 << i) && !integer &&
- !ctx->Color._AdvancedBlendMode) {
+ } else if (blend_enabled && !ctx->Color._AdvancedBlendMode
+ && (GEN_GEN <= 5 || !integer)) {
The GEN_GEN <= 5 || !integer seems bizarre, and I wonder whether it's
correct. However, you're just preserving the existing behavior, so
that's fine - we may want to revisit it in the future.
Post by Rafael Antognolli
GLenum eqRGB = ctx->Color.Blend[i].EquationRGB;
GLenum eqA = ctx->Color.Blend[i].EquationA;
GLenum srcRGB = ctx->Color.Blend[i].SrcRGB;
@@ -2994,17 +3021,40 @@ static const struct brw_tracked_state genX(multisample_state) = {
/* ---------------------------------------------------------------------- */
-#if GEN_GEN >= 6
static void
genX(upload_color_calc_state)(struct brw_context *brw)
{
struct gl_context *ctx = &brw->ctx;
brw_state_emit(brw, GENX(COLOR_CALC_STATE), 64, &brw->cc.state_offset, cc) {
+#if GEN_GEN <= 5
+ cc.IndependentAlphaBlendEnable =
+ set_blend_entry_bits(brw, &cc, 0, false);
+ set_depth_stencil_bits(brw, &cc);
+
+ if (ctx->Color.AlphaEnabled &&
+ ctx->DrawBuffer->_NumColorDrawBuffers <= 1) {
+ cc.AlphaTestEnable = true;
+ cc.AlphaTestFunction =
+ intel_translate_compare_func(ctx->Color.AlphaFunc);
+ }
We should probably make this consistent across generations:
if (ctx->Color.AlphaEnabled &&
(GEN_GEN >= 6 || ctx->DrawBuffer->_NumColorDrawBuffers <= 1)) {
cc.AlphaTestEnable = true;
cc.AlphaTestFunction =
intel_translate_compare_func(ctx->Color.AlphaFunc);
cc.AlphaTestFormat = ALPHATEST_UNORM8;
UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
ctx->Color.AlphaRef);
}
Hmm... COLOR_CALC_STATE on gen >= 6 doesn't have the fields AlphaTestEnable
and AlphaTestFunction. Instead, they are inside the BLEND_STATE. Are you
suggesting that I use some kind of polymorphism here too, to set these fields
on both the BLEND_STATE and COLOR_CALC_STATE, depending on the generation?
No, definitely not suggesting anything complicated. I must have misread
something.

I guess what I'm suggesting is...
Post by Rafael Antognolli
Post by Kenneth Graunke
Post by Rafael Antognolli
+#else
/* _NEW_COLOR */
- cc.AlphaTestFormat = ALPHATEST_UNORM8;
- UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
- ctx->Color.AlphaRef);
...we should continue setting these unconditionally.

ALPHATEST_UNORM8 is actually 0, so whether or not we execute that line has
no effect at all.

It should be fine to program the ref value regardless of whether alpha
testing is enabled or not. The GPU will ignore the ref if testing is off.
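
A minimal standalone sketch of that idea follows. It assumes, as stated above, that ALPHATEST_UNORM8 is 0. The struct, the enum, and the conversion helper are hypothetical stand-ins written for illustration; the helper only approximates what UNCLAMPED_FLOAT_TO_UBYTE does.

```c
#include <stdint.h>

/* Assumed value: the thread states that ALPHATEST_UNORM8 is 0. */
enum { SKETCH_ALPHATEST_UNORM8 = 0 };

/* Hypothetical stand-in for the packed state; not the genxml struct. */
struct sketch_cc_state {
   unsigned alpha_test_enable;
   unsigned alpha_test_format;
   uint8_t alpha_ref_unorm8;
};

/* Clamp to [0, 1] and scale to 0..255 with rounding. */
uint8_t
sketch_float_to_unorm8(float f)
{
   if (!(f > 0.0f))   /* also catches NaN */
      return 0;
   if (f >= 1.0f)
      return 255;
   return (uint8_t)(f * 255.0f + 0.5f);
}

/* Program the format and reference value unconditionally; only the
 * enable bit depends on whether alpha testing is on. */
struct sketch_cc_state
sketch_fill_alpha(int alpha_enabled, float alpha_ref)
{
   struct sketch_cc_state cc = {0};
   cc.alpha_test_enable = alpha_enabled != 0;
   cc.alpha_test_format = SKETCH_ALPHATEST_UNORM8;   /* no-op: already 0 */
   cc.alpha_ref_unorm8 = sketch_float_to_unorm8(alpha_ref);
   return cc;
}
```

In this sketch the only field that differs between the enabled and disabled cases is the enable bit, so no generation-specific condition is needed around the format and reference value.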
Post by Rafael Antognolli
Post by Kenneth Graunke
Post by Rafael Antognolli
+ cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
+ cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
+ cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
+ cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];
#if GEN_GEN < 9
/* _NEW_STENCIL */
@@ -3013,34 +3063,47 @@ genX(upload_color_calc_state)(struct brw_context *brw)
_mesa_get_stencil_ref(ctx, ctx->Stencil._BackFace);
#endif
+#endif
+
/* _NEW_COLOR */
- cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
- cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
- cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
- cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];
+ if (GEN_GEN >= 6 ||
+ (ctx->Color.AlphaEnabled &&
+ ctx->DrawBuffer->_NumColorDrawBuffers <= 1)) {
+ cc.AlphaTestFormat = ALPHATEST_UNORM8;
+ UNCLAMPED_FLOAT_TO_UBYTE(cc.AlphaReferenceValueAsUNORM8,
+ ctx->Color.AlphaRef);
+ }
...which means we don't need to duplicate these conditions.
Rafael Antognolli
2017-06-16 23:31:27 UTC
This function only emits a particular case of 3DSTATE_GS. Instead, we can do
that inside genX(upload_gs_state), and later reuse part of that code for
emitting gen4-5 state.

There's the additional benefit of allowing us to remove gen6_gs_state.c, which
was only left because of this function.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_state.h | 2 -
src/mesa/drivers/dri/i965/gen6_gs_state.c | 56 ---------------------------
src/mesa/drivers/dri/i965/genX_state_upload.c | 17 +++++++-
4 files changed, 16 insertions(+), 60 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/gen6_gs_state.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index b2edba9..8af9a7c 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -69,7 +69,6 @@ i965_FILES = \
gen6_clip_state.c \
gen6_constant_state.c \
gen6_depth_state.c \
- gen6_gs_state.c \
gen6_multisample_state.c \
gen6_queryobj.c \
gen6_sampler_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index ead0078..af70464 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -361,8 +361,6 @@ void gen8_init_atoms(struct brw_context *brw);
void gen9_init_atoms(struct brw_context *brw);
void gen10_init_atoms(struct brw_context *brw);

-void upload_gs_state_for_tf(struct brw_context *brw);
-
/* Memory Object Control State:
* Specifying zero for L3 means "uncached in L3", at least on Haswell
* and Baytrail, since there are no PTE flags for setting L3 cacheability.
diff --git a/src/mesa/drivers/dri/i965/gen6_gs_state.c b/src/mesa/drivers/dri/i965/gen6_gs_state.c
deleted file mode 100644
index 6450c76..0000000
--- a/src/mesa/drivers/dri/i965/gen6_gs_state.c
+++ /dev/null
@@ -1,56 +0,0 @@
-/*
- * Copyright © 2009 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- *
- * Authors:
- * Eric Anholt <***@anholt.net>
- *
- */
-
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "intel_batchbuffer.h"
-#include "main/shaderapi.h"
-
-void
-upload_gs_state_for_tf(struct brw_context *brw)
-{
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
-
- BEGIN_BATCH(7);
- OUT_BATCH(_3DSTATE_GS << 16 | (7 - 2));
- OUT_BATCH(brw->ff_gs.prog_offset);
- OUT_BATCH(GEN6_GS_SPF_MODE | GEN6_GS_VECTOR_MASK_ENABLE);
- OUT_BATCH(0); /* no scratch space */
- OUT_BATCH((2 << GEN6_GS_DISPATCH_START_GRF_SHIFT) |
- (brw->ff_gs.prog_data->urb_read_length << GEN6_GS_URB_READ_LENGTH_SHIFT));
- OUT_BATCH(((devinfo->max_gs_threads - 1) << GEN6_GS_MAX_THREADS_SHIFT) |
- GEN6_GS_STATISTICS_ENABLE |
- GEN6_GS_SO_STATISTICS_ENABLE |
- GEN6_GS_RENDERING_ENABLE);
- OUT_BATCH(GEN6_GS_SVBI_PAYLOAD_ENABLE |
- GEN6_GS_SVBI_POSTINCREMENT_ENABLE |
- (brw->ff_gs.prog_data->svbi_postincrement_value <<
- GEN6_GS_SVBI_POSTINCREMENT_VALUE_SHIFT) |
- GEN6_GS_ENABLE);
- ADVANCE_BATCH();
-}
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 766ceaa..1eeb24b 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2474,7 +2474,22 @@ genX(upload_gs_state)(struct brw_context *brw)
/* In gen6, transform feedback for the VS stage is done with an ad-hoc GS
* program. This function provides the needed 3DSTATE_GS for this.
*/
- upload_gs_state_for_tf(brw);
+ brw_batch_emit(brw, GENX(3DSTATE_GS), gs) {
+ gs.KernelStartPointer = KSP(brw, brw->ff_gs.prog_offset);
+ gs.SingleProgramFlow = true;
+ gs.VectorMaskEnable = true;
+ gs.DispatchGRFStartRegisterForURBData = 2;
+ gs.VertexURBEntryReadLength = brw->ff_gs.prog_data->urb_read_length;
+ gs.MaximumNumberofThreads = devinfo->max_gs_threads - 1;
+ gs.StatisticsEnable = true;
+ gs.SOStatisticsEnable = true;
+ gs.RenderingEnabled = true;
+ gs.SVBIPayloadEnable = true;
+ gs.SVBIPostIncrementEnable = true;
+ gs.SVBIPostIncrementValue =
+ brw->ff_gs.prog_data->svbi_postincrement_value;
+ gs.Enable = true;
+ }
#endif
} else {
brw_batch_emit(brw, GENX(3DSTATE_GS), gs) {
--
2.9.4
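The patch above swaps hand-shifted OUT_BATCH() dwords for named field assignments inside brw_batch_emit(). A minimal sketch of why the two forms can produce the same dword, assuming a pack step that ORs each named field into its bit position. Every name and bit position below (`EX_*`, `ex_gs_dword`) is hypothetical and chosen for illustration; this is not the real 3DSTATE_GS layout and not the generated genxml pack code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for a single dword (NOT the real hardware layout). */
#define EX_MAX_THREADS_SHIFT     25
#define EX_STATISTICS_ENABLE     (1u << 10)
#define EX_SO_STATISTICS_ENABLE  (1u << 9)
#define EX_RENDERING_ENABLE      (1u << 8)

/* Named-field view of the same dword, in the style of a genxml struct. */
struct ex_gs_dword {
   uint32_t MaximumNumberofThreads;
   bool StatisticsEnable;
   bool SOStatisticsEnable;
   bool RenderingEnabled;
};

/* Pack the named fields into one dword. */
static uint32_t
ex_gs_dword_pack(const struct ex_gs_dword *gs)
{
   return (gs->MaximumNumberofThreads << EX_MAX_THREADS_SHIFT) |
          (gs->StatisticsEnable ? EX_STATISTICS_ENABLE : 0) |
          (gs->SOStatisticsEnable ? EX_SO_STATISTICS_ENABLE : 0) |
          (gs->RenderingEnabled ? EX_RENDERING_ENABLE : 0);
}

/* The same dword built by hand, in the style of the removed OUT_BATCH() code. */
static uint32_t
ex_gs_dword_legacy(uint32_t max_gs_threads)
{
   return ((max_gs_threads - 1) << EX_MAX_THREADS_SHIFT) |
          EX_STATISTICS_ENABLE |
          EX_SO_STATISTICS_ENABLE |
          EX_RENDERING_ENABLE;
}
```

The named-field form keeps the shifts and masks in one generated place, so a conversion like this one only has to get the field values right.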
Rafael Antognolli
2017-06-16 23:31:21 UTC
In newer gens, this field has a prefix and the non-IEEE-754 mode is called
"Alternate" instead of simply "Alt".

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/intel/genxml/gen6.xml | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 66b45ca..04986be 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -1409,9 +1409,9 @@
<field name="Thread Priority" start="81" end="81" type="uint">
<value name="High" value="1"/>
</field>
- <field name="Floating Point Mode" start="80" end="80" type="uint">
+ <field name="Floating Point Mode" start="80" end="80" type="uint" prefix="FLOATING_POINT_MODE">
<value name="IEEE-745" value="0"/>
- <value name="Alt" value="1"/>
+ <value name="Alternate" value="1"/>
</field>
<field name="Illegal Opcode Exception Enable" start="77" end="77" type="bool"/>
<field name="MaskStack Exception Enable" start="75" end="75" type="bool"/>
--
2.9.4
Rafael Antognolli
2017-06-16 23:31:29 UTC
Merge the code with gen6+ 3DSTATE_GS, and delete brw_gs_state.c,
together with brw_gs_unit_state.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_gs_state.c | 101 --------------------------
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 44 -----------
src/mesa/drivers/dri/i965/genX_state_upload.c | 80 +++++++++++++-------
5 files changed, 55 insertions(+), 172 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_gs_state.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 8af9a7c..a06a8c1 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -24,7 +24,6 @@ i965_FILES = \
brw_formatquery.c \
brw_gs.c \
brw_gs.h \
- brw_gs_state.c \
brw_gs_surface_state.c \
brw_link.cpp \
brw_meta_util.c \
diff --git a/src/mesa/drivers/dri/i965/brw_gs_state.c b/src/mesa/drivers/dri/i965/brw_gs_state.c
deleted file mode 100644
index bc3d2e5..0000000
--- a/src/mesa/drivers/dri/i965/brw_gs_state.c
+++ /dev/null
@@ -1,101 +0,0 @@
-/*
- Copyright (C) Intel Corp. 2006. All Rights Reserved.
- Intel funded Tungsten Graphics to
- develop this 3D driver.
-
- Permission is hereby granted, free of charge, to any person obtaining
- a copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
-
- The above copyright notice and this permission notice (including the
- next paragraph) shall be included in all copies or substantial
- portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- **********************************************************************/
- /*
- * Authors:
- * Keith Whitwell <***@vmware.com>
- */
-
-
-
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "intel_batchbuffer.h"
-
-static void
-brw_upload_gs_unit(struct brw_context *brw)
-{
- struct brw_gs_unit_state *gs;
-
- gs = brw_state_batch(brw, sizeof(*gs), 32, &brw->ff_gs.state_offset);
-
- memset(gs, 0, sizeof(*gs));
-
- /* BRW_NEW_PROGRAM_CACHE | BRW_NEW_GS_PROG_DATA */
- if (brw->ff_gs.prog_active) {
- gs->thread0.grf_reg_count = (ALIGN(brw->ff_gs.prog_data->total_grf, 16) /
- 16 - 1);
-
- gs->thread0.kernel_start_pointer =
- brw_program_reloc(brw,
- brw->ff_gs.state_offset +
- offsetof(struct brw_gs_unit_state, thread0),
- brw->ff_gs.prog_offset +
- (gs->thread0.grf_reg_count << 1)) >> 6;
-
- gs->thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754;
- gs->thread1.single_program_flow = 1;
-
- gs->thread3.dispatch_grf_start_reg = 1;
- gs->thread3.const_urb_entry_read_offset = 0;
- gs->thread3.const_urb_entry_read_length = 0;
- gs->thread3.urb_entry_read_offset = 0;
- gs->thread3.urb_entry_read_length =
- brw->ff_gs.prog_data->urb_read_length;
-
- /* BRW_NEW_URB_FENCE */
- gs->thread4.nr_urb_entries = brw->urb.nr_gs_entries;
- gs->thread4.urb_entry_allocation_size = brw->urb.vsize - 1;
-
- if (brw->urb.nr_gs_entries >= 8)
- gs->thread4.max_threads = 1;
- else
- gs->thread4.max_threads = 0;
- }
-
- if (brw->gen == 5)
- gs->thread4.rendering_enable = 1;
-
- /* BRW_NEW_VIEWPORT_COUNT */
- gs->gs6.max_vp_index = brw->clip.viewport_count - 1;
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-}
-
-const struct brw_tracked_state brw_gs_unit = {
- .dirty = {
- .mesa = 0,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_PUSH_CONSTANT_ALLOCATION |
- BRW_NEW_FF_GS_PROG_DATA |
- BRW_NEW_PROGRAM_CACHE |
- BRW_NEW_URB_FENCE |
- BRW_NEW_VIEWPORT_COUNT,
- },
- .emit = brw_upload_gs_unit,
-};
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index af70464..8f3bd7f 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -53,7 +53,6 @@ extern const struct brw_tracked_state brw_constant_buffer;
extern const struct brw_tracked_state brw_curbe_offsets;
extern const struct brw_tracked_state brw_invariant_state;
extern const struct brw_tracked_state brw_fs_samplers;
-extern const struct brw_tracked_state brw_gs_unit;
extern const struct brw_tracked_state brw_binding_table_pointers;
extern const struct brw_tracked_state brw_depthbuffer;
extern const struct brw_tracked_state brw_recalculate_urb_fence;
diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 12f3024..6feab0d 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -180,50 +180,6 @@ struct brw_clip_unit_state
float viewport_ymax;
};

-struct brw_gs_unit_state
-{
- struct thread0 thread0;
- struct thread1 thread1;
- struct thread2 thread2;
- struct thread3 thread3;
-
- struct
- {
- unsigned pad0:8;
- unsigned rendering_enable:1; /* for Ironlake */
- unsigned pad4:1;
- unsigned stats_enable:1;
- unsigned nr_urb_entries:7;
- unsigned pad1:1;
- unsigned urb_entry_allocation_size:5;
- unsigned pad2:1;
- unsigned max_threads:5;
- unsigned pad3:2;
- } thread4;
-
- struct
- {
- unsigned sampler_count:3;
- unsigned pad0:2;
- unsigned sampler_state_pointer:27;
- } gs5;
-
-
- struct
- {
- unsigned max_vp_index:4;
- unsigned pad0:12;
- unsigned svbi_post_inc_value:10;
- unsigned pad1:1;
- unsigned svbi_post_inc_enable:1;
- unsigned svbi_payload:1;
- unsigned discard_adjaceny:1;
- unsigned reorder_enable:1;
- unsigned pad2:1;
- } gs6;
-};
-
-
struct brw_wm_unit_state
{
struct thread0 thread0;
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 3666f68..424d2f8 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2338,18 +2338,18 @@ static const struct brw_tracked_state genX(sf_clip_viewport) = {

/* ---------------------------------------------------------------------- */

-#if GEN_GEN >= 6
static void
genX(upload_gs_state)(struct brw_context *brw)
{
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
+ UNUSED struct gl_context *ctx = &brw->ctx;
+ UNUSED const struct gen_device_info *devinfo = &brw->screen->devinfo;
const struct brw_stage_state *stage_state = &brw->gs.base;
/* BRW_NEW_GEOMETRY_PROGRAM */
- bool active = brw->geometry_program;
+ bool active = GEN_GEN >= 6 && brw->geometry_program;

/* BRW_NEW_GS_PROG_DATA */
struct brw_stage_prog_data *stage_prog_data = stage_state->prog_data;
- const struct brw_vue_prog_data *vue_prog_data =
+ UNUSED const struct brw_vue_prog_data *vue_prog_data =
brw_vue_prog_data(stage_prog_data);
#if GEN_GEN >= 7
const struct brw_gs_prog_data *gs_prog_data =
@@ -2383,7 +2383,14 @@ genX(upload_gs_state)(struct brw_context *brw)
gen7_emit_cs_stall_flush(brw);
#endif

+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_GS), gs) {
+#else
+ ctx->NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
+ brw_state_emit(brw, GENX(GS_STATE), 32, &brw->ff_gs.state_offset, gs) {
+#endif
+
+#if GEN_GEN >= 6
if (active) {
INIT_THREAD_DISPATCH_FIELDS(gs, Vertex);

@@ -2434,7 +2441,6 @@ genX(upload_gs_state)(struct brw_context *brw)

#if GEN_GEN < 7
gs.SOStatisticsEnable = true;
- gs.RenderingEnabled = 1;
if (brw->geometry_program->info.has_transform_feedback_varyings)
gs.SVBIPayloadEnable = true;

@@ -2468,33 +2474,41 @@ genX(upload_gs_state)(struct brw_context *brw)
gs.VertexURBEntryOutputReadOffset = urb_entry_write_offset;
gs.VertexURBEntryOutputLength = MAX2(urb_entry_output_length, 1);
#endif
-#if GEN_GEN < 7
- } else if (brw->ff_gs.prog_active) {
+ }
+#endif
+
+#if GEN_GEN <= 6
+ if (!active && brw->ff_gs.prog_active) {
/* In gen6, transform feedback for the VS stage is done with an
* ad-hoc GS program. This function provides the needed 3DSTATE_GS
* for this.
*/
gs.KernelStartPointer = KSP(brw, brw->ff_gs.prog_offset);
gs.SingleProgramFlow = true;
- gs.VectorMaskEnable = true;
- gs.DispatchGRFStartRegisterForURBData = 2;
+ gs.DispatchGRFStartRegisterForURBData = GEN_GEN == 6 ? 2 : 1;
gs.VertexURBEntryReadLength = brw->ff_gs.prog_data->urb_read_length;
- gs.MaximumNumberofThreads = devinfo->max_gs_threads - 1;
- gs.StatisticsEnable = true;
- gs.SOStatisticsEnable = true;
- gs.RenderingEnabled = true;
+
+#if GEN_GEN <= 5
+ gs.GRFRegisterCount =
+ DIV_ROUND_UP(brw->ff_gs.prog_data->total_grf, 16) - 1;
+ /* BRW_NEW_URB_FENCE */
+ gs.NumberofURBEntries = brw->urb.nr_gs_entries;
+ gs.URBEntryAllocationSize = brw->urb.vsize - 1;
+ gs.MaximumNumberofThreads = brw->urb.nr_gs_entries >= 8 ? 1 : 0;
+ gs.FloatingPointMode = FLOATING_POINT_MODE_Alternate;
+#else
+ gs.Enable = true;
+ gs.VectorMaskEnable = true;
gs.SVBIPayloadEnable = true;
gs.SVBIPostIncrementEnable = true;
gs.SVBIPostIncrementValue =
brw->ff_gs.prog_data->svbi_postincrement_value;
- gs.Enable = true;
+ gs.SOStatisticsEnable = true;
+ gs.MaximumNumberofThreads = devinfo->max_gs_threads - 1;
#endif
- } else {
- gs.StatisticsEnable = true;
-#if GEN_GEN < 7
- gs.RenderingEnabled = true;
+ }
#endif
-
+ if (!active && !brw->ff_gs.prog_active) {
#if GEN_GEN < 8
gs.DispatchGRFStartRegisterForURBData = 1;
#if GEN_GEN >= 7
@@ -2502,6 +2516,16 @@ genX(upload_gs_state)(struct brw_context *brw)
#endif
#endif
}
+
+#if GEN_GEN >= 6
+ gs.StatisticsEnable = true;
+#endif
+#if GEN_GEN == 5 || GEN_GEN == 6
+ gs.RenderingEnabled = true;
+#endif
+#if GEN_GEN <= 5
+ gs.MaximumVPIndex = brw->clip.viewport_count - 1;
+#endif
}

#if GEN_GEN == 6
@@ -2511,17 +2535,23 @@ genX(upload_gs_state)(struct brw_context *brw)

static const struct brw_tracked_state genX(gs_state) = {
.dirty = {
- .mesa = (GEN_GEN < 7 ? _NEW_PROGRAM_CONSTANTS : 0),
+ .mesa = (GEN_GEN == 6 ? _NEW_PROGRAM_CONSTANTS : 0),
.brw = BRW_NEW_BATCH |
BRW_NEW_BLORP |
- BRW_NEW_CONTEXT |
- BRW_NEW_GEOMETRY_PROGRAM |
- BRW_NEW_GS_PROG_DATA |
+ (GEN_GEN <= 5 ? BRW_NEW_PUSH_CONSTANT_ALLOCATION |
+ BRW_NEW_FF_GS_PROG_DATA |
+ BRW_NEW_PROGRAM_CACHE |
+ BRW_NEW_URB_FENCE |
+ BRW_NEW_VIEWPORT_COUNT
+ : 0) |
+ (GEN_GEN >= 6 ? BRW_NEW_CONTEXT |
+ BRW_NEW_GEOMETRY_PROGRAM |
+ BRW_NEW_GS_PROG_DATA
+ : 0) |
(GEN_GEN < 7 ? BRW_NEW_FF_GS_PROG_DATA : 0),
},
.emit = genX(upload_gs_state),
};
-#endif

/* ---------------------------------------------------------------------- */

@@ -4376,7 +4406,7 @@ genX(init_atoms)(struct brw_context *brw)
&genX(sf_state),
&genX(vs_state), /* always required, enabled or not */
&brw_clip_unit,
- &brw_gs_unit,
+ &genX(gs_state),

/* Command packets:
*/
--
2.9.4
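One detail of the patch above worth checking is that the GRF register count is now computed with DIV_ROUND_UP() where the deleted brw_gs_state.c used ALIGN(). A minimal sketch showing that the two expressions agree for a divisor of 16. The `EX_ALIGN` and `EX_DIV_ROUND_UP` macros are local stand-ins written for this example, assumed to match the usual power-of-two align and round-up-divide definitions; they are not copied from Mesa.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins (assumptions), valid for power-of-two alignments. */
#define EX_ALIGN(v, a)         (((v) + (a) - 1) & ~((a) - 1))
#define EX_DIV_ROUND_UP(a, b)  (((a) + (b) - 1) / (b))

/* Form used by the deleted brw_upload_gs_unit(). */
static uint32_t
grf_count_old(uint32_t total_grf)
{
   return EX_ALIGN(total_grf, 16u) / 16 - 1;
}

/* Form used by the merged genX(upload_gs_state). */
static uint32_t
grf_count_new(uint32_t total_grf)
{
   return EX_DIV_ROUND_UP(total_grf, 16u) - 1;
}

/* Returns 1 if both forms agree for every total_grf in [1, max]. */
static int
grf_count_forms_agree(uint32_t max)
{
   for (uint32_t g = 1; g <= max; g++) {
      if (grf_count_old(g) != grf_count_new(g))
         return 0;
   }
   return 1;
}
```

Both forms encode "number of 16-register blocks, minus one", so 1 to 16 registers give 0 and 17 to 32 give 1.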
Kenneth Graunke
2017-07-12 21:42:49 UTC
Post by Rafael Antognolli
Merge the code with gen6+ 3DSTATE_GS, and delete brw_gs_state.c,
together with brw_gs_unit_state.
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_gs_state.c | 101 --------------------------
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 44 -----------
src/mesa/drivers/dri/i965/genX_state_upload.c | 80 +++++++++++++-------
5 files changed, 55 insertions(+), 172 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_gs_state.c
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 3666f68..424d2f8 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2511,17 +2535,23 @@ genX(upload_gs_state)(struct brw_context *brw)
static const struct brw_tracked_state genX(gs_state) = {
.dirty = {
- .mesa = (GEN_GEN < 7 ? _NEW_PROGRAM_CONSTANTS : 0),
+ .mesa = (GEN_GEN == 6 ? _NEW_PROGRAM_CONSTANTS : 0),
.brw = BRW_NEW_BATCH |
BRW_NEW_BLORP |
- BRW_NEW_CONTEXT |
- BRW_NEW_GEOMETRY_PROGRAM |
- BRW_NEW_GS_PROG_DATA |
+ (GEN_GEN <= 5 ? BRW_NEW_PUSH_CONSTANT_ALLOCATION |
+ BRW_NEW_FF_GS_PROG_DATA |
You don't need BRW_NEW_FF_GS_PROG_DATA here, because...
Post by Rafael Antognolli
+ BRW_NEW_PROGRAM_CACHE |
+ BRW_NEW_URB_FENCE |
+ BRW_NEW_VIEWPORT_COUNT
+ : 0) |
+ (GEN_GEN >= 6 ? BRW_NEW_CONTEXT |
+ BRW_NEW_GEOMETRY_PROGRAM |
+ BRW_NEW_GS_PROG_DATA
+ : 0) |
(GEN_GEN < 7 ? BRW_NEW_FF_GS_PROG_DATA : 0),
...you already have it here. Otherwise,
Post by Rafael Antognolli
},
.emit = genX(upload_gs_state),
};
-#endif
/* ---------------------------------------------------------------------- */
@@ -4376,7 +4406,7 @@ genX(init_atoms)(struct brw_context *brw)
&genX(sf_state),
&genX(vs_state), /* always required, enabled or not */
&brw_clip_unit,
- &brw_gs_unit,
+ &genX(gs_state),
*/
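The review comment above says that listing BRW_NEW_FF_GS_PROG_DATA in the GEN_GEN <= 5 group is unnecessary because the GEN_GEN < 7 term already adds it. A minimal sketch of that argument, assuming the dirty mask is a plain bitwise OR of flags. The `EX_*` flag values are hypothetical, and only three flags are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; the real BRW_NEW_* values are not shown here. */
enum {
   EX_NEW_BATCH           = 1 << 0,
   EX_NEW_URB_FENCE       = 1 << 1,
   EX_NEW_FF_GS_PROG_DATA = 1 << 2,
};

/* Mask as posted: the FF GS flag appears in both conditional groups. */
static uint64_t
dirty_as_posted(int gen)
{
   return EX_NEW_BATCH |
          (gen <= 5 ? EX_NEW_URB_FENCE | EX_NEW_FF_GS_PROG_DATA : 0) |
          (gen < 7 ? EX_NEW_FF_GS_PROG_DATA : 0);
}

/* Mask as suggested in review: the flag is left to the gen < 7 term only. */
static uint64_t
dirty_as_suggested(int gen)
{
   return EX_NEW_BATCH |
          (gen <= 5 ? EX_NEW_URB_FENCE : 0) |
          (gen < 7 ? EX_NEW_FF_GS_PROG_DATA : 0);
}

/* Returns 1 if both masks match for every gen in [lo, hi]. */
static int
dirty_masks_match(int lo, int hi)
{
   for (int gen = lo; gen <= hi; gen++) {
      if (dirty_as_posted(gen) != dirty_as_suggested(gen))
         return 0;
   }
   return 1;
}
```

Since every gen that satisfies GEN_GEN <= 5 also satisfies GEN_GEN < 7, ORing the flag a second time cannot change the mask.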
Rafael Antognolli
2017-06-16 23:31:19 UTC
Rename "Pixel Shader Kill Pixel" to "Pixel Shader Kills Pixel", which is what
the field is called on newer gens.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/intel/genxml/gen4.xml | 2 +-
src/intel/genxml/gen45.xml | 2 +-
src/intel/genxml/gen5.xml | 2 +-
src/mesa/drivers/dri/i965/gen4_blorp_exec.h | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/genxml/gen4.xml b/src/intel/genxml/gen4.xml
index 33b4871..f4ca3f0 100644
--- a/src/intel/genxml/gen4.xml
+++ b/src/intel/genxml/gen4.xml
@@ -789,7 +789,7 @@
<field name="Statistics Enable" start="128" end="128" type="bool"/>
<field name="Maximum Number of Threads" start="185" end="191" type="uint"/>
<field name="Legacy Diamond Line Rasterization" start="183" end="183" type="bool"/>
- <field name="Pixel Shader Kill Pixel" start="182" end="182" type="bool"/>
+ <field name="Pixel Shader Kills Pixel" start="182" end="182" type="bool"/>
<field name="Pixel Shader Computed Depth" start="181" end="181" type="bool"/>
<field name="Pixel Shader Uses Source Depth" start="180" end="180" type="bool"/>
<field name="Thread Dispatch Enable" start="179" end="179" type="bool"/>
diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
index b708c4f..5c221b7 100644
--- a/src/intel/genxml/gen45.xml
+++ b/src/intel/genxml/gen45.xml
@@ -804,7 +804,7 @@
<field name="Statistics Enable" start="128" end="128" type="bool"/>
<field name="Maximum Number of Threads" start="185" end="191" type="uint"/>
<field name="Legacy Diamond Line Rasterization" start="183" end="183" type="bool"/>
- <field name="Pixel Shader Kill Pixel" start="182" end="182" type="bool"/>
+ <field name="Pixel Shader Kills Pixel" start="182" end="182" type="bool"/>
<field name="Pixel Shader Computed Depth" start="181" end="181" type="bool"/>
<field name="Pixel Shader Uses Source Depth" start="180" end="180" type="bool"/>
<field name="Thread Dispatch Enable" start="179" end="179" type="bool"/>
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
index d6b2662..28d9dd5 100644
--- a/src/intel/genxml/gen5.xml
+++ b/src/intel/genxml/gen5.xml
@@ -899,7 +899,7 @@
<field name="Statistics Enable" start="128" end="128" type="bool"/>
<field name="Maximum Number of Threads" start="185" end="191" type="uint"/>
<field name="Legacy Diamond Line Rasterization" start="183" end="183" type="bool"/>
- <field name="Pixel Shader Kill Pixel" start="182" end="182" type="bool"/>
+ <field name="Pixel Shader Kills Pixel" start="182" end="182" type="bool"/>
<field name="Pixel Shader Computed Depth" start="181" end="181" type="bool"/>
<field name="Pixel Shader Uses Source Depth" start="180" end="180" type="bool"/>
<field name="Thread Dispatch Enable" start="179" end="179" type="bool"/>
diff --git a/src/mesa/drivers/dri/i965/gen4_blorp_exec.h b/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
index 86c5e54..f1d9394 100644
--- a/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
+++ b/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
@@ -131,7 +131,7 @@ blorp_emit_wm_state(struct blorp_batch *batch,
wm.SetupURBEntryReadOffset = 0;

wm.DepthCoefficientURBReadOffset = 1;
- wm.PixelShaderKillPixel = prog_data->uses_kill;
+ wm.PixelShaderKillsPixel = prog_data->uses_kill;
wm.ThreadDispatchEnable = true;
wm.EarlyDepthTestEnable = true;
--
2.9.4
Rafael Antognolli
2017-06-16 23:31:23 UTC
Add a helper function to reuse code that fills blend entry related
state, and make genX(upload_blend_state) use it. This function can later
be used by gen4-5 color calc state to set the blend related bits.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 182 ++++++++++++++------------
1 file changed, 101 insertions(+), 81 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 5e5dc48..8e99c89 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2528,6 +2528,104 @@ fix_dual_blend_alpha_to_one(GLenum function)
#define blend_eqn(x) brw_translate_blend_equation(x)

#if GEN_GEN >= 6
+typedef struct GENX(BLEND_STATE_ENTRY) BLEND_ENTRY_GENXML;
+#else
+typedef struct GENX(COLOR_CALC_STATE) BLEND_ENTRY_GENXML;
+#endif
+
+UNUSED static bool
+set_blend_entry_bits(struct brw_context *brw, BLEND_ENTRY_GENXML *entry, int i,
+ bool alpha_to_one)
+{
+ struct gl_context *ctx = &brw->ctx;
+
+ /* _NEW_BUFFERS */
+ const struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
+
+ bool independent_alpha_blend = false;
+
+ /* Used for implementing the following bit of GL_EXT_texture_integer:
+ * "Per-fragment operations that require floating-point color
+ * components, including multisample alpha operations, alpha test,
+ * blending, and dithering, have no effect when the corresponding
+ * colors are written to an integer color buffer."
+ */
+ const bool integer = ctx->DrawBuffer->_IntegerBuffers & (0x1 << i);
+
+ /* _NEW_COLOR */
+ if (ctx->Color.ColorLogicOpEnabled) {
+ GLenum rb_type = rb ? _mesa_get_format_datatype(rb->Format)
+ : GL_UNSIGNED_NORMALIZED;
+ WARN_ONCE(ctx->Color.LogicOp != GL_COPY &&
+ rb_type != GL_UNSIGNED_NORMALIZED &&
+ rb_type != GL_FLOAT, "Ignoring %s logic op on %s "
+ "renderbuffer\n",
+ _mesa_enum_to_string(ctx->Color.LogicOp),
+ _mesa_enum_to_string(rb_type));
+ if (GEN_GEN >= 8 || rb_type == GL_UNSIGNED_NORMALIZED) {
+ entry->LogicOpEnable = true;
+ entry->LogicOpFunction =
+ intel_translate_logic_op(ctx->Color.LogicOp);
+ }
+ } else if (ctx->Color.BlendEnabled & (1 << i) && !integer &&
+ !ctx->Color._AdvancedBlendMode) {
+ GLenum eqRGB = ctx->Color.Blend[i].EquationRGB;
+ GLenum eqA = ctx->Color.Blend[i].EquationA;
+ GLenum srcRGB = ctx->Color.Blend[i].SrcRGB;
+ GLenum dstRGB = ctx->Color.Blend[i].DstRGB;
+ GLenum srcA = ctx->Color.Blend[i].SrcA;
+ GLenum dstA = ctx->Color.Blend[i].DstA;
+
+ if (eqRGB == GL_MIN || eqRGB == GL_MAX)
+ srcRGB = dstRGB = GL_ONE;
+
+ if (eqA == GL_MIN || eqA == GL_MAX)
+ srcA = dstA = GL_ONE;
+
+ /* Due to hardware limitations, the destination may have information
+ * in an alpha channel even when the format specifies no alpha
+ * channel. In order to avoid getting any incorrect blending due to
+ * that alpha channel, coerce the blend factors to values that will
+ * not read the alpha channel, but will instead use the correct
+ * implicit value for alpha.
+ */
+ if (rb && !_mesa_base_format_has_channel(rb->_BaseFormat,
+ GL_TEXTURE_ALPHA_TYPE)) {
+ srcRGB = brw_fix_xRGB_alpha(srcRGB);
+ srcA = brw_fix_xRGB_alpha(srcA);
+ dstRGB = brw_fix_xRGB_alpha(dstRGB);
+ dstA = brw_fix_xRGB_alpha(dstA);
+ }
+
+ /* From the BLEND_STATE docs, DWord 0, Bit 29 (AlphaToOne Enable):
+ * "If Dual Source Blending is enabled, this bit must be disabled."
+ *
+ * We override SRC1_ALPHA to ONE and ONE_MINUS_SRC1_ALPHA to ZERO,
+ * and leave it enabled anyway.
+ */
+ if (GEN_GEN >= 6 && ctx->Color.Blend[i]._UsesDualSrc && alpha_to_one) {
+ srcRGB = fix_dual_blend_alpha_to_one(srcRGB);
+ srcA = fix_dual_blend_alpha_to_one(srcA);
+ dstRGB = fix_dual_blend_alpha_to_one(dstRGB);
+ dstA = fix_dual_blend_alpha_to_one(dstA);
+ }
+
+ entry->ColorBufferBlendEnable = true;
+ entry->DestinationBlendFactor = blend_factor(dstRGB);
+ entry->SourceBlendFactor = blend_factor(srcRGB);
+ entry->DestinationAlphaBlendFactor = blend_factor(dstA);
+ entry->SourceAlphaBlendFactor = blend_factor(srcA);
+ entry->ColorBlendFunction = blend_eqn(eqRGB);
+ entry->AlphaBlendFunction = blend_eqn(eqA);
+
+ if (srcA != srcRGB || dstA != dstRGB || eqA != eqRGB)
+ independent_alpha_blend = true;
+ }
+
+ return independent_alpha_blend;
+}
+
+#if GEN_GEN >= 6
static void
genX(upload_blend_state)(struct brw_context *brw)
{
@@ -2594,87 +2692,9 @@ genX(upload_blend_state)(struct brw_context *brw)
#else
{
#endif
-
- /* _NEW_BUFFERS */
- struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
-
- /* Used for implementing the following bit of GL_EXT_texture_integer:
- * "Per-fragment operations that require floating-point color
- * components, including multisample alpha operations, alpha test,
- * blending, and dithering, have no effect when the corresponding
- * colors are written to an integer color buffer."
- */
- bool integer = ctx->DrawBuffer->_IntegerBuffers & (0x1 << i);
-
- /* _NEW_COLOR */
- if (ctx->Color.ColorLogicOpEnabled) {
- GLenum rb_type = rb ? _mesa_get_format_datatype(rb->Format)
- : GL_UNSIGNED_NORMALIZED;
- WARN_ONCE(ctx->Color.LogicOp != GL_COPY &&
- rb_type != GL_UNSIGNED_NORMALIZED &&
- rb_type != GL_FLOAT, "Ignoring %s logic op on %s "
- "renderbuffer\n",
- _mesa_enum_to_string(ctx->Color.LogicOp),
- _mesa_enum_to_string(rb_type));
- if (GEN_GEN >= 8 || rb_type == GL_UNSIGNED_NORMALIZED) {
- entry.LogicOpEnable = true;
- entry.LogicOpFunction =
- intel_translate_logic_op(ctx->Color.LogicOp);
- }
- } else if (ctx->Color.BlendEnabled & (1 << i) && !integer &&
- !ctx->Color._AdvancedBlendMode) {
- GLenum eqRGB = ctx->Color.Blend[i].EquationRGB;
- GLenum eqA = ctx->Color.Blend[i].EquationA;
- GLenum srcRGB = ctx->Color.Blend[i].SrcRGB;
- GLenum dstRGB = ctx->Color.Blend[i].DstRGB;
- GLenum srcA = ctx->Color.Blend[i].SrcA;
- GLenum dstA = ctx->Color.Blend[i].DstA;
-
- if (eqRGB == GL_MIN || eqRGB == GL_MAX)
- srcRGB = dstRGB = GL_ONE;
-
- if (eqA == GL_MIN || eqA == GL_MAX)
- srcA = dstA = GL_ONE;
-
- /* Due to hardware limitations, the destination may have information
- * in an alpha channel even when the format specifies no alpha
- * channel. In order to avoid getting any incorrect blending due to
- * that alpha channel, coerce the blend factors to values that will
- * not read the alpha channel, but will instead use the correct
- * implicit value for alpha.
- */
- if (rb && !_mesa_base_format_has_channel(rb->_BaseFormat,
- GL_TEXTURE_ALPHA_TYPE)) {
- srcRGB = brw_fix_xRGB_alpha(srcRGB);
- srcA = brw_fix_xRGB_alpha(srcA);
- dstRGB = brw_fix_xRGB_alpha(dstRGB);
- dstA = brw_fix_xRGB_alpha(dstA);
- }
-
- /* From the BLEND_STATE docs, DWord 0, Bit 29 (AlphaToOne Enable):
- * "If Dual Source Blending is enabled, this bit must be disabled."
- *
- * We override SRC1_ALPHA to ONE and ONE_MINUS_SRC1_ALPHA to ZERO,
- * and leave it enabled anyway.
- */
- if (ctx->Color.Blend[i]._UsesDualSrc && blend.AlphaToOneEnable) {
- srcRGB = fix_dual_blend_alpha_to_one(srcRGB);
- srcA = fix_dual_blend_alpha_to_one(srcA);
- dstRGB = fix_dual_blend_alpha_to_one(dstRGB);
- dstA = fix_dual_blend_alpha_to_one(dstA);
- }
-
- entry.ColorBufferBlendEnable = true;
- entry.DestinationBlendFactor = blend_factor(dstRGB);
- entry.SourceBlendFactor = blend_factor(srcRGB);
- entry.DestinationAlphaBlendFactor = blend_factor(dstA);
- entry.SourceAlphaBlendFactor = blend_factor(srcA);
- entry.ColorBlendFunction = blend_eqn(eqRGB);
- entry.AlphaBlendFunction = blend_eqn(eqA);
-
- if (srcA != srcRGB || dstA != dstRGB || eqA != eqRGB)
- blend.IndependentAlphaBlendEnable = true;
- }
+ blend.IndependentAlphaBlendEnable =
+ set_blend_entry_bits(brw, &entry, i, blend.AlphaToOneEnable) ||
+ blend.IndependentAlphaBlendEnable;

/* See section 8.1.6 "Pre-Blend Color Clamping" of the
* SandyBridge PRM Volume 2 Part 1 for HW requirements.
--
2.9.4
Kenneth Graunke
2017-06-17 17:32:44 UTC
Post by Rafael Antognolli
Add a helper function to reuse code that fills blend entry related
state, and make genX(upload_blend_state) use it. This function can later
be used by gen4-5 color calc state to set the blend related bits.
+ blend.IndependentAlphaBlendEnable =
+ set_blend_entry_bits(brw, &entry, i, blend.AlphaToOneEnable) ||
+ blend.IndependentAlphaBlendEnable;
It looks like this is the only place blend.IndependentAlphaBlendEnable
is set, so OR'ing in the existing value (of false / 0) should be a no-op.

blend.IndependentAlphaBlendEnable =
set_blend_entry_bits(brw, &entry, i, blend.AlphaToOneEnable);
Post by Rafael Antognolli
/* See section 8.1.6 "Pre-Blend Color Clamping" of the
* SandyBridge PRM Volume 2 Part 1 for HW requirements.
Rafael Antognolli
2017-06-19 16:23:41 UTC
Post by Kenneth Graunke
Post by Rafael Antognolli
Add a helper function to reuse code that fills blend entry related
state, and make genX(upload_blend_state) use it. This function can later
be used by gen4-5 color calc state to set the blend related bits.
+ blend.IndependentAlphaBlendEnable =
+ set_blend_entry_bits(brw, &entry, i, blend.AlphaToOneEnable) ||
+ blend.IndependentAlphaBlendEnable;
It looks like this is the only place blend.IndependentAlphaBlendEnable
is set, so OR'ing in the existing value (of false / 0) should be a no-op.
blend.IndependentAlphaBlendEnable =
set_blend_entry_bits(brw, &entry, i, blend.AlphaToOneEnable);
It is the only place where it is set, but it's inside a loop. So if we set it on
one iteration, plain assignment could unset it again on a later iteration. And
that's not the original behavior, as far as I understand.
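
A minimal standalone sketch of the difference, assuming a hypothetical
per-buffer predicate in place of set_blend_entry_bits() (all names below are
made up for the example):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for set_blend_entry_bits(): reports whether
 * draw buffer i needs independent alpha blending. */
static bool
entry_needs_independent_alpha(const bool *per_buffer, int i)
{
   return per_buffer[i];
}

/* Accumulating form, as in the patch: once set, the flag stays set.
 * The helper is the left operand of ||, so it runs on every iteration. */
static bool
accumulate_with_or(const bool *per_buffer, int count)
{
   assert(per_buffer != NULL);
   bool independent_alpha = false;
   for (int i = 0; i < count; i++) {
      independent_alpha =
         entry_needs_independent_alpha(per_buffer, i) || independent_alpha;
   }
   return independent_alpha;
}

/* Plain assignment: only the last buffer's result survives. */
static bool
overwrite_each_iteration(const bool *per_buffer, int count)
{
   assert(per_buffer != NULL);
   bool independent_alpha = false;
   for (int i = 0; i < count; i++)
      independent_alpha = entry_needs_independent_alpha(per_buffer, i);
   return independent_alpha;
}
```

With buffers { true, false, false }, the first form ends up true and the
second ends up false, which is the regression described above.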
Post by Kenneth Graunke
Post by Rafael Antognolli
/* See section 8.1.6 "Pre-Blend Color Clamping" of the
* SandyBridge PRM Volume 2 Part 1 for HW requirements.
Kenneth Graunke
2017-06-19 16:34:39 UTC
Post by Rafael Antognolli
Post by Kenneth Graunke
Post by Rafael Antognolli
Add a helper function to reuse code that fills blend entry related
state, and make genX(upload_blend_state) use it. This function can later
be used by gen4-5 color calc state to set the blend related bits.
+ blend.IndependentAlphaBlendEnable =
+ set_blend_entry_bits(brw, &entry, i, blend.AlphaToOneEnable) ||
+ blend.IndependentAlphaBlendEnable;
It looks like this is the only place blend.IndependentAlphaBlendEnable
is set, so OR'ing in the existing value (of false / 0) should be a no-op.
blend.IndependentAlphaBlendEnable =
set_blend_entry_bits(brw, &entry, i, blend.AlphaToOneEnable);
It is the only place where it is set, but it's inside a loop. So if we set it on
one iteration, plain assignment could unset it again on a later iteration. And
that's not the original behavior, as far as I understand.
Yikes, I missed that this is blend.* and not entry.*. You're right, of
course; never mind that comment.
Post by Rafael Antognolli
Post by Kenneth Graunke
Post by Rafael Antognolli
/* See section 8.1.6 "Pre-Blend Color Clamping" of the
* SandyBridge PRM Volume 2 Part 1 for HW requirements.
Rafael Antognolli
2017-06-16 23:31:18 UTC
On gen4, WM_STATE only has one Kernel Start Pointer and one GRF Register
Count, but we can make the code that handles this on multiple gens simpler if
we add an index 0 to it too.
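
A small sketch of why the common name helps, under assumed and much reduced
layouts (the sketch_* names and SKETCH_GEN are hypothetical; the real structs
are generated from the XML and have many more fields):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GEN_GEN; normally set by the build. */
#ifndef SKETCH_GEN
#define SKETCH_GEN 4
#endif

#if SKETCH_GEN == 4
/* Single kernel slot, but named with index 0 like the later layout. */
struct sketch_wm_state {
   uint32_t KernelStartPointer0;
   uint32_t GRFRegisterCount0;
};
#else
/* Assumed later layout with more than one kernel slot. */
struct sketch_wm_state {
   uint32_t KernelStartPointer0;
   uint32_t GRFRegisterCount0;
   uint32_t KernelStartPointer1;
   uint32_t GRFRegisterCount1;
};
#endif

/* Shared code can name slot 0 without a per-gen #if on the field name. */
static void
sketch_fill_wm_kernel0(struct sketch_wm_state *wm,
                       uint32_t kernel_offset, uint32_t reg_blocks)
{
   wm->KernelStartPointer0 = kernel_offset;
   wm->GRFRegisterCount0 = reg_blocks;
}
```

The right-hand sides may still differ per gen, as they do in the blorp hunk
below; only the field names become uniform.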

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/intel/genxml/gen4.xml | 4 ++--
src/intel/genxml/gen45.xml | 4 ++--
src/mesa/drivers/dri/i965/gen4_blorp_exec.h | 4 ++--
3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/intel/genxml/gen4.xml b/src/intel/genxml/gen4.xml
index 5fcd6c9..33b4871 100644
--- a/src/intel/genxml/gen4.xml
+++ b/src/intel/genxml/gen4.xml
@@ -762,8 +762,8 @@
</struct>

<struct name="WM_STATE" length="8">
- <field name="Kernel Start Pointer" start="6" end="31" type="address"/>
- <field name="GRF Register Count" start="1" end="3" type="uint"/>
+ <field name="Kernel Start Pointer 0" start="6" end="31" type="address"/>
+ <field name="GRF Register Count 0" start="1" end="3" type="uint"/>
<field name="Single Program Flow" start="63" end="63" type="bool"/>
<field name="Binding Table Entry Count" start="50" end="57" type="uint"/>
<field name="Thread Priority" start="49" end="49" type="uint">
diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
index 864946a..b708c4f 100644
--- a/src/intel/genxml/gen45.xml
+++ b/src/intel/genxml/gen45.xml
@@ -776,8 +776,8 @@
</struct>

<struct name="WM_STATE" length="8">
- <field name="Kernel Start Pointer" start="6" end="31" type="address"/>
- <field name="GRF Register Count" start="1" end="3" type="uint"/>
+ <field name="Kernel Start Pointer 0" start="6" end="31" type="address"/>
+ <field name="GRF Register Count 0" start="1" end="3" type="uint"/>
<field name="Single Program Flow" start="63" end="63" type="bool"/>
<field name="Binding Table Entry Count" start="50" end="57" type="uint"/>
<field name="Thread Priority" start="49" end="49" type="uint">
diff --git a/src/mesa/drivers/dri/i965/gen4_blorp_exec.h b/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
index 183c0da..86c5e54 100644
--- a/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
+++ b/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
@@ -139,9 +139,9 @@ blorp_emit_wm_state(struct blorp_batch *batch,
wm._16PixelDispatchEnable = prog_data->dispatch_16;

#if GEN_GEN == 4
- wm.KernelStartPointer =
+ wm.KernelStartPointer0 =
instruction_state_address(batch, params->wm_prog_kernel);
- wm.GRFRegisterCount = prog_data->reg_blocks_0;
+ wm.GRFRegisterCount0 = prog_data->reg_blocks_0;
#else
wm.KernelStartPointer0 = params->wm_prog_kernel;
wm.GRFRegisterCount0 = prog_data->reg_blocks_0;
--
2.9.4
Rafael Antognolli
2017-06-16 23:31:20 UTC
On gen6+, this is called "Dispatch GRF Start Register For Constant/Setup Data
0", while on gen5 and lower it's called only "Dispatch GRF Start Register For
URB Data", but it's essentially the same thing (URB data), so rename it to
match newer gens and simplify the C code that handles it.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/intel/genxml/gen4.xml | 2 +-
src/intel/genxml/gen45.xml | 2 +-
src/intel/genxml/gen5.xml | 2 +-
src/mesa/drivers/dri/i965/gen4_blorp_exec.h | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/genxml/gen4.xml b/src/intel/genxml/gen4.xml
index f4ca3f0..6f6f1bf 100644
--- a/src/intel/genxml/gen4.xml
+++ b/src/intel/genxml/gen4.xml
@@ -783,7 +783,7 @@
<field name="Constant URB Entry Read Offset" start="114" end="119" type="uint"/>
<field name="Setup URB Entry Read Length" start="107" end="112" type="uint"/>
<field name="Setup URB Entry Read Offset" start="100" end="105" type="uint"/>
- <field name="Dispatch GRF Start Register For URB Data" start="96" end="99" type="uint"/>
+ <field name="Dispatch GRF Start Register For Constant/Setup Data 0" start="96" end="99" type="uint"/>
<field name="Sampler State Pointer" start="133" end="159" type="address"/>
<field name="Sampler Count" start="130" end="132" type="uint"/>
<field name="Statistics Enable" start="128" end="128" type="bool"/>
diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
index 5c221b7..7b2f769 100644
--- a/src/intel/genxml/gen45.xml
+++ b/src/intel/genxml/gen45.xml
@@ -798,7 +798,7 @@
<field name="Constant URB Entry Read Offset" start="114" end="119" type="uint"/>
<field name="Setup URB Entry Read Length" start="107" end="112" type="uint"/>
<field name="Setup URB Entry Read Offset" start="100" end="105" type="uint"/>
- <field name="Dispatch GRF Start Register For URB Data" start="96" end="99" type="uint"/>
+ <field name="Dispatch GRF Start Register For Constant/Setup Data 0" start="96" end="99" type="uint"/>
<field name="Sampler State Pointer" start="133" end="159" type="address"/>
<field name="Sampler Count" start="130" end="132" type="uint"/>
<field name="Statistics Enable" start="128" end="128" type="bool"/>
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
index 28d9dd5..44dd0d1 100644
--- a/src/intel/genxml/gen5.xml
+++ b/src/intel/genxml/gen5.xml
@@ -893,7 +893,7 @@
<field name="Constant URB Entry Read Offset" start="114" end="119" type="uint"/>
<field name="Setup URB Entry Read Length" start="107" end="112" type="uint"/>
<field name="Setup URB Entry Read Offset" start="100" end="105" type="uint"/>
- <field name="Dispatch GRF Start Register For URB Data" start="96" end="99" type="uint"/>
+ <field name="Dispatch GRF Start Register For Constant/Setup Data 0" start="96" end="99" type="uint"/>
<field name="Sampler State Pointer" start="133" end="159" type="address"/>
<field name="Sampler Count" start="130" end="132" type="uint"/>
<field name="Statistics Enable" start="128" end="128" type="bool"/>
diff --git a/src/mesa/drivers/dri/i965/gen4_blorp_exec.h b/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
index f1d9394..764b198 100644
--- a/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
+++ b/src/mesa/drivers/dri/i965/gen4_blorp_exec.h
@@ -125,7 +125,7 @@ blorp_emit_wm_state(struct blorp_batch *batch,
}

if (prog_data) {
- wm.DispatchGRFStartRegisterForURBData =
+ wm.DispatchGRFStartRegisterForConstantSetupData0 =
prog_data->base.dispatch_grf_start_reg;
wm.SetupURBEntryReadLength = prog_data->num_varying_inputs * 2;
wm.SetupURBEntryReadOffset = 0;
--
2.9.4
Rafael Antognolli
2017-06-16 23:31:28 UTC
Since we always call brw_batch_emit anyway, we can hopefully make things
simpler by calling it only once and branching inside its body. This
should help when bringing the gen4-5 code into this function.

Additionally, check for GEN_GEN == 6 instead of < 7 in cases that won't apply
to lower gens.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 1eeb24b..3666f68 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2356,7 +2356,7 @@ genX(upload_gs_state)(struct brw_context *brw)
brw_gs_prog_data(stage_prog_data);
#endif

-#if GEN_GEN < 7
+#if GEN_GEN == 6
brw_batch_emit(brw, GENX(3DSTATE_CONSTANT_GS), cgs) {
if (active && stage_state->push_const_size != 0) {
cgs.Buffer0Valid = true;
@@ -2383,8 +2383,8 @@ genX(upload_gs_state)(struct brw_context *brw)
gen7_emit_cs_stall_flush(brw);
#endif

- if (active) {
- brw_batch_emit(brw, GENX(3DSTATE_GS), gs) {
+ brw_batch_emit(brw, GENX(3DSTATE_GS), gs) {
+ if (active) {
INIT_THREAD_DISPATCH_FIELDS(gs, Vertex);

#if GEN_GEN >= 7
@@ -2468,13 +2468,12 @@ genX(upload_gs_state)(struct brw_context *brw)
gs.VertexURBEntryOutputReadOffset = urb_entry_write_offset;
gs.VertexURBEntryOutputLength = MAX2(urb_entry_output_length, 1);
#endif
- }
#if GEN_GEN < 7
- } else if (brw->ff_gs.prog_active) {
- /* In gen6, transform feedback for the VS stage is done with an ad-hoc GS
- * program. This function provides the needed 3DSTATE_GS for this.
- */
- brw_batch_emit(brw, GENX(3DSTATE_GS), gs) {
+ } else if (brw->ff_gs.prog_active) {
+ /* In gen6, transform feedback for the VS stage is done with an
+ * ad-hoc GS program. This function provides the needed 3DSTATE_GS
+ * for this.
+ */
gs.KernelStartPointer = KSP(brw, brw->ff_gs.prog_offset);
gs.SingleProgramFlow = true;
gs.VectorMaskEnable = true;
@@ -2489,10 +2488,8 @@ genX(upload_gs_state)(struct brw_context *brw)
gs.SVBIPostIncrementValue =
brw->ff_gs.prog_data->svbi_postincrement_value;
gs.Enable = true;
- }
#endif
- } else {
- brw_batch_emit(brw, GENX(3DSTATE_GS), gs) {
+ } else {
gs.StatisticsEnable = true;
#if GEN_GEN < 7
gs.RenderingEnabled = true;
@@ -2506,7 +2503,8 @@ genX(upload_gs_state)(struct brw_context *brw)
#endif
}
}
-#if GEN_GEN < 7
+
+#if GEN_GEN == 6
brw->gs.enabled = active;
#endif
}
--
2.9.4
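The restructuring in this patch can be illustrated with a small standalone sketch. Everything here is hypothetical: struct gs_packet, struct batch and the EMIT_GS macro are simplified stand-ins invented for the example, and EMIT_GS is not brw_batch_emit. The only point is the shape: one emit, with the active / ff_gs / disabled cases handled by branches inside its body.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a 3DSTATE_GS packet. */
struct gs_packet {
   bool enable;
   bool statistics_enable;
   unsigned kernel_start_pointer;
};

/* Hypothetical batch: counts packets and remembers the last one emitted. */
struct batch {
   unsigned packets_emitted;
   struct gs_packet last;
};

/* Runs the following block exactly once with a zeroed packet called `name`,
 * then stores the packet into the batch. Illustration only.
 */
#define EMIT_GS(b, name)                                             \
   for (struct gs_packet name = {0}, *_once = &name; _once;          \
        (b)->last = name, (b)->packets_emitted++, _once = NULL)

static void
upload_gs(struct batch *b, bool active, bool ff_gs_active, unsigned ksp)
{
   /* A single emit; the three cases only differ in which fields they set. */
   EMIT_GS(b, gs) {
      if (active) {
         gs.kernel_start_pointer = ksp;
         gs.statistics_enable = true;
         gs.enable = true;
      } else if (ff_gs_active) {
         gs.kernel_start_pointer = ksp;
         gs.enable = true;
      } else {
         gs.statistics_enable = true;
      }
   }
}
```

Whichever branch is taken, exactly one packet is written, so a later gen4-5 path would only need to add fields inside the body rather than a fourth emit call.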
Rafael Antognolli
2017-06-16 23:31:30 UTC
The code doesn't get much simpler, but at least it is in a
single place, and we delete more than we add.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_clip_state.c | 147 -----------------------
src/mesa/drivers/dri/i965/brw_structs.h | 65 ----------
src/mesa/drivers/dri/i965/genX_state_upload.c | 164 +++++++++++++++++++-------
4 files changed, 119 insertions(+), 258 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_clip_state.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index a06a8c1..89be92e 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -6,7 +6,6 @@ i965_FILES = \
brw_bufmgr.h \
brw_clear.c \
brw_clip.c \
- brw_clip_state.c \
brw_compute.c \
brw_conditional_render.c \
brw_context.c \
diff --git a/src/mesa/drivers/dri/i965/brw_clip_state.c b/src/mesa/drivers/dri/i965/brw_clip_state.c
deleted file mode 100644
index 8f22c0f..0000000
--- a/src/mesa/drivers/dri/i965/brw_clip_state.c
+++ /dev/null
@@ -1,147 +0,0 @@
-/*
- Copyright (C) Intel Corp. 2006. All Rights Reserved.
- Intel funded Tungsten Graphics to
- develop this 3D driver.
-
- Permission is hereby granted, free of charge, to any person obtaining
- a copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
-
- The above copyright notice and this permission notice (including the
- next paragraph) shall be included in all copies or substantial
- portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- **********************************************************************/
- /*
- * Authors:
- * Keith Whitwell <***@vmware.com>
- */
-
-#include "intel_batchbuffer.h"
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "main/framebuffer.h"
-
-static void
-brw_upload_clip_unit(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
- struct brw_clip_unit_state *clip;
-
- clip = brw_state_batch(brw, sizeof(*clip), 32, &brw->clip.state_offset);
- memset(clip, 0, sizeof(*clip));
-
- /* BRW_NEW_PROGRAM_CACHE | BRW_NEW_CLIP_PROG_DATA */
- clip->thread0.grf_reg_count = (ALIGN(brw->clip.prog_data->total_grf, 16) /
- 16 - 1);
- clip->thread0.kernel_start_pointer =
- brw_program_reloc(brw,
- brw->clip.state_offset +
- offsetof(struct brw_clip_unit_state, thread0),
- brw->clip.prog_offset +
- (clip->thread0.grf_reg_count << 1)) >> 6;
-
- clip->thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754;
- clip->thread1.single_program_flow = 1;
-
- clip->thread3.urb_entry_read_length = brw->clip.prog_data->urb_read_length;
- clip->thread3.const_urb_entry_read_length =
- brw->clip.prog_data->curb_read_length;
-
- /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
- clip->thread3.const_urb_entry_read_offset = brw->curbe.clip_start * 2;
- clip->thread3.dispatch_grf_start_reg = 1;
- clip->thread3.urb_entry_read_offset = 0;
-
- /* BRW_NEW_URB_FENCE */
- clip->thread4.nr_urb_entries = brw->urb.nr_clip_entries;
- clip->thread4.urb_entry_allocation_size = brw->urb.vsize - 1;
- /* If we have enough clip URB entries to run two threads, do so.
- */
- if (brw->urb.nr_clip_entries >= 10) {
- /* Half of the URB entries go to each thread, and it has to be an
- * even number.
- */
- assert(brw->urb.nr_clip_entries % 2 == 0);
-
- /* Although up to 16 concurrent Clip threads are allowed on Ironlake,
- * only 2 threads can output VUEs at a time.
- */
- if (brw->gen == 5)
- clip->thread4.max_threads = 16 - 1;
- else
- clip->thread4.max_threads = 2 - 1;
- } else {
- assert(brw->urb.nr_clip_entries >= 5);
- clip->thread4.max_threads = 1 - 1;
- }
-
- /* _NEW_TRANSFORM */
- if (brw->gen == 5 || brw->is_g4x)
- clip->clip5.userclip_enable_flags = ctx->Transform.ClipPlanesEnabled;
- else
- /* Up to 6 actual clip flags, plus the 7th for negative RHW workaround. */
- clip->clip5.userclip_enable_flags = (ctx->Transform.ClipPlanesEnabled & 0x3f) | 0x40;
-
- clip->clip5.userclip_must_clip = 1;
-
- /* enable guardband clipping if we can */
- clip->clip5.guard_band_enable = 1;
- clip->clip6.clipper_viewport_state_ptr =
- (brw->batch.bo->offset64 + brw->clip.vp_offset) >> 5;
-
- /* emit clip viewport relocation */
- brw_emit_reloc(&brw->batch,
- (brw->clip.state_offset +
- offsetof(struct brw_clip_unit_state, clip6)),
- brw->batch.bo, brw->clip.vp_offset,
- I915_GEM_DOMAIN_INSTRUCTION, 0);
-
- /* _NEW_TRANSFORM */
- if (!ctx->Transform.DepthClamp)
- clip->clip5.viewport_z_clip_enable = 1;
- clip->clip5.viewport_xy_clip_enable = 1;
- clip->clip5.vertex_position_space = BRW_CLIP_NDCSPACE;
- if (ctx->Transform.ClipDepthMode == GL_ZERO_TO_ONE)
- clip->clip5.api_mode = BRW_CLIP_API_DX;
- else
- clip->clip5.api_mode = BRW_CLIP_API_OGL;
- clip->clip5.clip_mode = brw->clip.prog_data->clip_mode;
-
- if (brw->is_g4x)
- clip->clip5.negative_w_clip_test = 1;
-
- clip->viewport_xmin = -1;
- clip->viewport_xmax = 1;
- clip->viewport_ymin = -1;
- clip->viewport_ymax = 1;
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-}
-
-const struct brw_tracked_state brw_clip_unit = {
- .dirty = {
- .mesa = _NEW_TRANSFORM |
- _NEW_VIEWPORT,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_CLIP_PROG_DATA |
- BRW_NEW_PUSH_CONSTANT_ALLOCATION |
- BRW_NEW_PROGRAM_CACHE |
- BRW_NEW_URB_FENCE,
- },
- .emit = brw_upload_clip_unit,
-};
diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 6feab0d..5a0d91d 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -115,71 +115,6 @@ struct thread3
unsigned pad3:1;
};

-
-
-struct brw_clip_unit_state
-{
- struct thread0 thread0;
- struct
- {
- unsigned pad0:7;
- unsigned sw_exception_enable:1;
- unsigned pad1:3;
- unsigned mask_stack_exception_enable:1;
- unsigned pad2:1;
- unsigned illegal_op_exception_enable:1;
- unsigned pad3:2;
- unsigned floating_point_mode:1;
- unsigned thread_priority:1;
- unsigned binding_table_entry_count:8;
- unsigned pad4:5;
- unsigned single_program_flow:1;
- } thread1;
-
- struct thread2 thread2;
- struct thread3 thread3;
-
- struct
- {
- unsigned pad0:9;
- unsigned gs_output_stats:1; /* not always */
- unsigned stats_enable:1;
- unsigned nr_urb_entries:7;
- unsigned pad1:1;
- unsigned urb_entry_allocation_size:5;
- unsigned pad2:1;
- unsigned max_threads:5; /* may be less */
- unsigned pad3:2;
- } thread4;
-
- struct
- {
- unsigned pad0:13;
- unsigned clip_mode:3;
- unsigned userclip_enable_flags:8;
- unsigned userclip_must_clip:1;
- unsigned negative_w_clip_test:1;
- unsigned guard_band_enable:1;
- unsigned viewport_z_clip_enable:1;
- unsigned viewport_xy_clip_enable:1;
- unsigned vertex_position_space:1;
- unsigned api_mode:1;
- unsigned pad2:1;
- } clip5;
-
- struct
- {
- unsigned pad0:5;
- unsigned clipper_viewport_state_ptr:27;
- } clip6;
-
-
- float viewport_xmin;
- float viewport_xmax;
- float viewport_ymin;
- float viewport_ymax;
-};
-
struct brw_wm_unit_state
{
struct thread0 thread0;
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 424d2f8..4ff5394 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1267,26 +1267,104 @@ static const struct brw_tracked_state genX(depth_stencil_state) = {

/* ---------------------------------------------------------------------- */

-#if GEN_GEN >= 6
static void
genX(upload_clip_state)(struct brw_context *brw)
{
struct gl_context *ctx = &brw->ctx;

/* _NEW_BUFFERS */
- struct gl_framebuffer *fb = ctx->DrawBuffer;
+ UNUSED struct gl_framebuffer *fb = ctx->DrawBuffer;

/* BRW_NEW_FS_PROG_DATA */
- struct brw_wm_prog_data *wm_prog_data =
+ UNUSED struct brw_wm_prog_data *wm_prog_data =
brw_wm_prog_data(brw->wm.base.prog_data);

+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_CLIP), clip) {
+#else
+ ctx->NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
+ brw_state_emit(brw, GENX(CLIP_STATE), 32, &brw->clip.state_offset, clip) {
+#endif
+
+#if GEN_GEN <= 5
+ clip.KernelStartPointer = KSP_ro(brw, brw->clip.prog_offset);
+ clip.GRFRegisterCount =
+ DIV_ROUND_UP(brw->clip.prog_data->total_grf, 16) - 1;
+ clip.FloatingPointMode = FLOATING_POINT_MODE_Alternate;
+ clip.SingleProgramFlow = true;
+ clip.VertexURBEntryReadLength = brw->clip.prog_data->urb_read_length;
+ clip.ConstantURBEntryReadLength = brw->clip.prog_data->curb_read_length;
+
+ /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
+ clip.ConstantURBEntryReadOffset = brw->curbe.clip_start * 2;
+ clip.DispatchGRFStartRegisterForURBData = 1;
+ clip.VertexURBEntryReadOffset = 0;
+
+ /* BRW_NEW_URB_FENCE */
+ clip.NumberofURBEntries = brw->urb.nr_clip_entries;
+ clip.URBEntryAllocationSize = brw->urb.vsize - 1;
+
+ if (brw->urb.nr_clip_entries >= 10) {
+ /* Half of the URB entries go to each thread, and it has to be an
+ * even number.
+ */
+ assert(brw->urb.nr_clip_entries % 2 == 0);
+
+ /* Although up to 16 concurrent Clip threads are allowed on Ironlake,
+ * only 2 threads can output VUEs at a time.
+ */
+ clip.MaximumNumberofThreads = GEN_GEN == 5 ? 16 - 1 : 2 - 1;
+ } else {
+ assert(brw->urb.nr_clip_entries >= 5);
+ clip.MaximumNumberofThreads = 1 - 1;
+ }
+
+ clip.VertexPositionSpace = VPOS_NDCSPACE;
+ clip.UserClipFlagsMustClipEnable = true;
+ clip.GuardbandClipTestEnable = true;
+
+ clip.ClipperViewportStatePointer =
+ instruction_ro_bo(brw->batch.bo, brw->clip.vp_offset);
+
+ clip.ScreenSpaceViewportXMin = -1;
+ clip.ScreenSpaceViewportXMax = 1;
+ clip.ScreenSpaceViewportYMin = -1;
+ clip.ScreenSpaceViewportYMax = 1;
+#else
+ /* _NEW_LIGHT */
+ if (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION) {
+ clip.TriangleStripListProvokingVertexSelect = 0;
+ clip.TriangleFanProvokingVertexSelect = 1;
+ clip.LineStripListProvokingVertexSelect = 0;
+ } else {
+ clip.TriangleStripListProvokingVertexSelect = 2;
+ clip.TriangleFanProvokingVertexSelect = 2;
+ clip.LineStripListProvokingVertexSelect = 1;
+ }
+
clip.StatisticsEnable = !brw->meta_in_progress;

if (wm_prog_data->barycentric_interp_modes &
BRW_BARYCENTRIC_NONPERSPECTIVE_BITS)
clip.NonPerspectiveBarycentricEnable = true;

+ clip.ClipEnable = true;
+
+ clip.MinimumPointWidth = 0.125;
+ clip.MaximumPointWidth = 255.875;
+
+ /* BRW_NEW_VIEWPORT_COUNT */
+ const unsigned viewport_count = brw->clip.viewport_count;
+ clip.MaximumVPIndex = viewport_count - 1;
+ if (_mesa_geometric_layers(fb) == 0)
+ clip.ForceZeroRTAIndexEnable = true;
+
+#if GEN_GEN <= 8
+ clip.UserClipDistanceCullTestEnableBitmask =
+ brw_vue_prog_data(brw->vs.base.prog_data)->cull_distance_mask;
+#endif
+#endif
+
#if GEN_GEN >= 7
clip.EarlyCullEnable = true;
#endif
@@ -1314,27 +1392,18 @@ genX(upload_clip_state)(struct brw_context *brw)
#endif

#if GEN_GEN < 8
- clip.UserClipDistanceCullTestEnableBitmask =
- brw_vue_prog_data(brw->vs.base.prog_data)->cull_distance_mask;
-
clip.ViewportZClipTestEnable = !ctx->Transform.DepthClamp;
#endif

- /* _NEW_LIGHT */
- if (ctx->Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION) {
- clip.TriangleStripListProvokingVertexSelect = 0;
- clip.TriangleFanProvokingVertexSelect = 1;
- clip.LineStripListProvokingVertexSelect = 0;
+ /* _NEW_TRANSFORM */
+ if (GEN_GEN >= 5 || GEN_IS_G4X) {
+ clip.UserClipDistanceClipTestEnableBitmask =
+ ctx->Transform.ClipPlanesEnabled;
} else {
- clip.TriangleStripListProvokingVertexSelect = 2;
- clip.TriangleFanProvokingVertexSelect = 2;
- clip.LineStripListProvokingVertexSelect = 1;
+ clip.UserClipDistanceClipTestEnableBitmask =
+ (ctx->Transform.ClipPlanesEnabled & 0x3f) | 0x40;
}

- /* _NEW_TRANSFORM */
- clip.UserClipDistanceClipTestEnableBitmask =
- ctx->Transform.ClipPlanesEnabled;
-
#if GEN_GEN >= 8
clip.ForceUserClipDistanceClipTestEnableBitmask = true;
#endif
@@ -1346,10 +1415,9 @@ genX(upload_clip_state)(struct brw_context *brw)

clip.GuardbandClipTestEnable = true;

- /* BRW_NEW_VIEWPORT_COUNT */
- const unsigned viewport_count = brw->clip.viewport_count;
-
- if (ctx->RasterDiscard) {
+ if (GEN_GEN <= 5) {
+ clip.ClipMode = brw->clip.prog_data->clip_mode;
+ } else if (ctx->RasterDiscard) {
clip.ClipMode = CLIPMODE_REJECT_ALL;
#if GEN_GEN == 6
perf_debug("Rasterizer discard is currently implemented via the "
@@ -1360,42 +1428,48 @@ genX(upload_clip_state)(struct brw_context *brw)
clip.ClipMode = CLIPMODE_NORMAL;
}

- clip.ClipEnable = true;
+#if GEN_IS_G4X
+ clip.NegativeWClipTestEnable = true;
+#endif

/* _NEW_POLYGON,
* BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_TES_PROG_DATA | BRW_NEW_PRIMITIVE
*/
- if (!brw_is_drawing_points(brw) && !brw_is_drawing_lines(brw))
+ if (GEN_GEN <= 5 ||
+ (!brw_is_drawing_points(brw) && !brw_is_drawing_lines(brw)))
clip.ViewportXYClipTestEnable = true;
-
- clip.MinimumPointWidth = 0.125;
- clip.MaximumPointWidth = 255.875;
- clip.MaximumVPIndex = viewport_count - 1;
- if (_mesa_geometric_layers(fb) == 0)
- clip.ForceZeroRTAIndexEnable = true;
}
}

static const struct brw_tracked_state genX(clip_state) = {
.dirty = {
- .mesa = _NEW_BUFFERS |
- _NEW_LIGHT |
- _NEW_POLYGON |
- _NEW_TRANSFORM,
+ .mesa = _NEW_TRANSFORM |
+ (GEN_GEN <= 5 ? _NEW_VIEWPORT : 0) |
+ (GEN_GEN >= 6 ? _NEW_BUFFERS |
+ _NEW_LIGHT |
+ _NEW_POLYGON
+ : 0),
.brw = BRW_NEW_BLORP |
- BRW_NEW_CONTEXT |
- BRW_NEW_FS_PROG_DATA |
- BRW_NEW_GS_PROG_DATA |
- BRW_NEW_VS_PROG_DATA |
- BRW_NEW_META_IN_PROGRESS |
- BRW_NEW_PRIMITIVE |
- BRW_NEW_RASTERIZER_DISCARD |
- BRW_NEW_TES_PROG_DATA |
- BRW_NEW_VIEWPORT_COUNT,
+ (GEN_GEN <= 5 ? BRW_NEW_BATCH |
+ BRW_NEW_BLORP |
+ BRW_NEW_CLIP_PROG_DATA |
+ BRW_NEW_PUSH_CONSTANT_ALLOCATION |
+ BRW_NEW_PROGRAM_CACHE |
+ BRW_NEW_URB_FENCE
+ : 0) |
+ (GEN_GEN >= 6 ? BRW_NEW_CONTEXT |
+ BRW_NEW_FS_PROG_DATA |
+ BRW_NEW_GS_PROG_DATA |
+ BRW_NEW_VS_PROG_DATA |
+ BRW_NEW_META_IN_PROGRESS |
+ BRW_NEW_PRIMITIVE |
+ BRW_NEW_RASTERIZER_DISCARD |
+ BRW_NEW_TES_PROG_DATA |
+ BRW_NEW_VIEWPORT_COUNT
+ : 0),
},
.emit = genX(upload_clip_state),
};
-#endif

/* ---------------------------------------------------------------------- */

@@ -4405,7 +4479,7 @@ genX(init_atoms)(struct brw_context *brw)
&genX(sf_clip_viewport),
&genX(sf_state),
&genX(vs_state), /* always required, enabled or not */
- &brw_clip_unit,
+ &genX(clip_state),
&genX(gs_state),

/* Command packets:
--
2.9.4
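Two pieces of arithmetic in this conversion are easy to check in isolation. The sketch below is standalone and assumes nothing from the driver: DIV_ROUND_UP and ALIGN_POT are defined locally (the names only mirror the usual helpers), and the function names are made up for the example. It shows that the old ALIGN(total_grf, 16) / 16 - 1 and the new DIV_ROUND_UP(total_grf, 16) - 1 agree for nonzero counts, and how the user clip flags mask is built on original gen4, where only six flags are real and bit 6 is forced on for the negative RHW workaround.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local rounding helpers, defined here so the example is self-contained. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))
#define ALIGN_POT(v, a)    (((v) + (a) - 1) & ~((a) - 1))

/* GRF count in 16-register blocks, minus one, as the deleted code did it. */
static unsigned
grf_blocks_old(unsigned total_grf)
{
   assert(total_grf > 0);
   return ALIGN_POT(total_grf, 16u) / 16 - 1;
}

/* The same quantity as written in the genxml-based code. */
static unsigned
grf_blocks_new(unsigned total_grf)
{
   assert(total_grf > 0);
   return DIV_ROUND_UP(total_grf, 16u) - 1;
}

/* User clip enable mask. With all eight flags available (gen5 and G4X in
 * the patch) the enabled-planes mask is used directly; otherwise only the
 * low six bits are kept and bit 6 is always set for the workaround.
 */
static uint8_t
user_clip_mask(uint32_t planes_enabled, bool has_eight_flags)
{
   if (has_eight_flags)
      return (uint8_t)planes_enabled;
   return (uint8_t)((planes_enabled & 0x3f) | 0x40);
}
```

For example, with planes 0 and 2 enabled (0x05) the original gen4 path yields 0x45, while the gen5/G4X path passes 0x05 through unchanged.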
Rafael Antognolli
2017-06-16 23:31:26 UTC
It's a very simple conversion, and it allows us to delete brw_cc.c.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_cc.c | 62 ---------------------------
src/mesa/drivers/dri/i965/genX_state_upload.c | 28 +++++++++++-
3 files changed, 27 insertions(+), 64 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_cc.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 8bac803..b2edba9 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -4,7 +4,6 @@ i965_FILES = \
brw_blorp.h \
brw_bufmgr.c \
brw_bufmgr.h \
- brw_cc.c \
brw_clear.c \
brw_clip.c \
brw_clip_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
deleted file mode 100644
index 503ec83..0000000
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ /dev/null
@@ -1,62 +0,0 @@
-/*
- Copyright (C) Intel Corp. 2006. All Rights Reserved.
- Intel funded Tungsten Graphics to
- develop this 3D driver.
-
- Permission is hereby granted, free of charge, to any person obtaining
- a copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
-
- The above copyright notice and this permission notice (including the
- next paragraph) shall be included in all copies or substantial
- portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- **********************************************************************/
- /*
- * Authors:
- * Keith Whitwell <***@vmware.com>
- */
-
-
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "brw_util.h"
-#include "main/glformats.h"
-#include "main/macros.h"
-#include "main/stencil.h"
-#include "intel_batchbuffer.h"
-
-static void upload_blend_constant_color(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
-
- BEGIN_BATCH(5);
- OUT_BATCH(_3DSTATE_BLEND_CONSTANT_COLOR << 16 | (5-2));
- OUT_BATCH_F(ctx->Color.BlendColorUnclamped[0]);
- OUT_BATCH_F(ctx->Color.BlendColorUnclamped[1]);
- OUT_BATCH_F(ctx->Color.BlendColorUnclamped[2]);
- OUT_BATCH_F(ctx->Color.BlendColorUnclamped[3]);
- ADVANCE_BATCH();
-}
-
-const struct brw_tracked_state brw_blend_constant_color = {
- .dirty = {
- .mesa = _NEW_COLOR,
- .brw = BRW_NEW_CONTEXT |
- BRW_NEW_BLORP,
- },
- .emit = upload_blend_constant_color
-};
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index d8dcaf4..766ceaa 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -4301,6 +4301,32 @@ static const struct brw_tracked_state genX(vf_topology) = {

/* ---------------------------------------------------------------------- */

+#if GEN_GEN <= 5
+
+static void genX(upload_blend_constant_color)(struct brw_context *brw)
+{
+ struct gl_context *ctx = &brw->ctx;
+
+ brw_batch_emit(brw, GENX(3DSTATE_CONSTANT_COLOR), blend_cc) {
+ blend_cc.BlendConstantColorRed = ctx->Color.BlendColorUnclamped[0];
+ blend_cc.BlendConstantColorGreen = ctx->Color.BlendColorUnclamped[1];
+ blend_cc.BlendConstantColorBlue = ctx->Color.BlendColorUnclamped[2];
+ blend_cc.BlendConstantColorAlpha = ctx->Color.BlendColorUnclamped[3];
+ }
+}
+
+static const struct brw_tracked_state genX(blend_constant_color) = {
+ .dirty = {
+ .mesa = _NEW_COLOR,
+ .brw = BRW_NEW_CONTEXT |
+ BRW_NEW_BLORP,
+ },
+ .emit = genX(upload_blend_constant_color)
+};
+#endif
+
+/* ---------------------------------------------------------------------- */
+
void
genX(init_atoms)(struct brw_context *brw)
{
@@ -4344,7 +4370,7 @@ genX(init_atoms)(struct brw_context *brw)
&brw_invariant_state,

&brw_binding_table_pointers,
- &brw_blend_constant_color,
+ &genX(blend_constant_color),

&brw_depthbuffer,
--
2.9.4
Kenneth Graunke
2017-06-17 18:34:46 UTC
Post by Rafael Antognolli
It's a very simple conversion, and it allows us to delete brw_cc.c.
Patches 13-15 are:
Reviewed-by: Kenneth Graunke <***@whitecape.org>
Rafael Antognolli
2017-06-16 23:31:22 UTC
From: Kenneth Graunke <***@whitecape.org>

Gen4-5 basically glue DEPTH_STENCIL_STATE, COLOR_CALC_STATE, and
BLEND_STATE together into a single COLOR_CALC_STATE structure.

By making a helper function, we'll be able to reuse it when filling
out Gen4-5 COLOR_CALC_STATE without replicating any actual logic.

We use a generation-defined typedef to handle the polymorphism.
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 113 +++++++++++++++-----------
1 file changed, 65 insertions(+), 48 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index a5a9d51..5e5dc48 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1153,9 +1153,16 @@ genX(calculate_attr_overrides)(const struct brw_context *brw,

/* ---------------------------------------------------------------------- */

-#if GEN_GEN >= 6
-static void
-genX(upload_depth_stencil_state)(struct brw_context *brw)
+#if GEN_GEN >= 8
+typedef struct GENX(3DSTATE_WM_DEPTH_STENCIL) DEPTH_STENCIL_GENXML;
+#elif GEN_GEN >= 6
+typedef struct GENX(DEPTH_STENCIL_STATE) DEPTH_STENCIL_GENXML;
+#else
+typedef struct GENX(COLOR_CALC_STATE) DEPTH_STENCIL_GENXML;
+#endif
+
+static inline void
+set_depth_stencil_bits(struct brw_context *brw, DEPTH_STENCIL_GENXML *ds)
{
struct gl_context *ctx = &brw->ctx;

@@ -1170,66 +1177,76 @@ genX(upload_depth_stencil_state)(struct brw_context *brw)
struct gl_stencil_attrib *stencil = &ctx->Stencil;
const int b = stencil->_BackFace;

-#if GEN_GEN >= 8
- brw_batch_emit(brw, GENX(3DSTATE_WM_DEPTH_STENCIL), wmds) {
-#else
- uint32_t ds_offset;
- brw_state_emit(brw, GENX(DEPTH_STENCIL_STATE), 64, &ds_offset, wmds) {
-#endif
- if (depth->Test && depth_irb) {
- wmds.DepthTestEnable = true;
- wmds.DepthBufferWriteEnable = brw_depth_writes_enabled(brw);
- wmds.DepthTestFunction = intel_translate_compare_func(depth->Func);
- }
+ if (depth->Test && depth_irb) {
+ ds->DepthTestEnable = true;
+ ds->DepthBufferWriteEnable = brw_depth_writes_enabled(brw);
+ ds->DepthTestFunction = intel_translate_compare_func(depth->Func);
+ }

- if (stencil->_Enabled) {
- wmds.StencilTestEnable = true;
- wmds.StencilWriteMask = stencil->WriteMask[0] & 0xff;
- wmds.StencilTestMask = stencil->ValueMask[0] & 0xff;
-
- wmds.StencilTestFunction =
- intel_translate_compare_func(stencil->Function[0]);
- wmds.StencilFailOp =
- intel_translate_stencil_op(stencil->FailFunc[0]);
- wmds.StencilPassDepthPassOp =
- intel_translate_stencil_op(stencil->ZPassFunc[0]);
- wmds.StencilPassDepthFailOp =
- intel_translate_stencil_op(stencil->ZFailFunc[0]);
-
- wmds.StencilBufferWriteEnable = stencil->_WriteEnabled;
-
- if (stencil->_TestTwoSide) {
- wmds.DoubleSidedStencilEnable = true;
- wmds.BackfaceStencilWriteMask = stencil->WriteMask[b] & 0xff;
- wmds.BackfaceStencilTestMask = stencil->ValueMask[b] & 0xff;
-
- wmds.BackfaceStencilTestFunction =
- intel_translate_compare_func(stencil->Function[b]);
- wmds.BackfaceStencilFailOp =
- intel_translate_stencil_op(stencil->FailFunc[b]);
- wmds.BackfaceStencilPassDepthPassOp =
- intel_translate_stencil_op(stencil->ZPassFunc[b]);
- wmds.BackfaceStencilPassDepthFailOp =
- intel_translate_stencil_op(stencil->ZFailFunc[b]);
- }
+ if (stencil->_Enabled) {
+ ds->StencilTestEnable = true;
+ ds->StencilWriteMask = stencil->WriteMask[0] & 0xff;
+ ds->StencilTestMask = stencil->ValueMask[0] & 0xff;
+
+ ds->StencilTestFunction =
+ intel_translate_compare_func(stencil->Function[0]);
+ ds->StencilFailOp =
+ intel_translate_stencil_op(stencil->FailFunc[0]);
+ ds->StencilPassDepthPassOp =
+ intel_translate_stencil_op(stencil->ZPassFunc[0]);
+ ds->StencilPassDepthFailOp =
+ intel_translate_stencil_op(stencil->ZFailFunc[0]);
+
+ ds->StencilBufferWriteEnable = stencil->_WriteEnabled;
+
+ if (stencil->_TestTwoSide) {
+ ds->DoubleSidedStencilEnable = true;
+ ds->BackfaceStencilWriteMask = stencil->WriteMask[b] & 0xff;
+ ds->BackfaceStencilTestMask = stencil->ValueMask[b] & 0xff;
+
+ ds->BackfaceStencilTestFunction =
+ intel_translate_compare_func(stencil->Function[b]);
+ ds->BackfaceStencilFailOp =
+ intel_translate_stencil_op(stencil->FailFunc[b]);
+ ds->BackfaceStencilPassDepthPassOp =
+ intel_translate_stencil_op(stencil->ZPassFunc[b]);
+ ds->BackfaceStencilPassDepthFailOp =
+ intel_translate_stencil_op(stencil->ZFailFunc[b]);
+ }

#if GEN_GEN >= 9
- wmds.StencilReferenceValue = _mesa_get_stencil_ref(ctx, 0);
- wmds.BackfaceStencilReferenceValue = _mesa_get_stencil_ref(ctx, b);
+ ds->StencilReferenceValue = _mesa_get_stencil_ref(ctx, 0);
+ ds->BackfaceStencilReferenceValue = _mesa_get_stencil_ref(ctx, b);
#endif
- }
+ }
+}
+
+#if GEN_GEN >= 6
+static void
+genX(upload_depth_stencil_state)(struct brw_context *brw)
+{
+#if GEN_GEN >= 8
+ brw_batch_emit(brw, GENX(3DSTATE_WM_DEPTH_STENCIL), wmds) {
+ set_depth_stencil_bits(brw, &wmds);
+ }
+#else
+ uint32_t ds_offset;
+ brw_state_emit(brw, GENX(DEPTH_STENCIL_STATE), 64, &ds_offset, ds) {
+ set_depth_stencil_bits(brw, &ds);
}

+ /* Now upload a pointer to the indirect state */
#if GEN_GEN == 6
brw_batch_emit(brw, GENX(3DSTATE_CC_STATE_POINTERS), ptr) {
ptr.PointertoDEPTH_STENCIL_STATE = ds_offset;
ptr.DEPTH_STENCIL_STATEChange = true;
}
-#elif GEN_GEN == 7
+#else
brw_batch_emit(brw, GENX(3DSTATE_DEPTH_STENCIL_STATE_POINTERS), ptr) {
ptr.PointertoDEPTH_STENCIL_STATE = ds_offset;
}
#endif
+#endif
}

static const struct brw_tracked_state genX(depth_stencil_state) = {
--
2.9.4
Rafael Antognolli
2017-06-16 23:31:24 UTC
gen6+ uses _mesa_base_format_has_channel() to check for the alpha
channel, while gen4-5 use ctx->DrawBuffer->Visual.alphaBits. By using
_mesa_base_format_has_channel() here we keep the same behavior across
all generations.

While initially both ways of checking the alpha channel seemed correct
to me, this change also seems to fix the fbo-blending-formats piglit test
on gen4.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/brw_cc.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index 78d3bc8..cdaa696 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -34,6 +34,7 @@
#include "brw_state.h"
#include "brw_defines.h"
#include "brw_util.h"
+#include "main/glformats.h"
#include "main/macros.h"
#include "main/stencil.h"
#include "intel_batchbuffer.h"
@@ -122,25 +123,27 @@ static void upload_cc_unit(struct brw_context *brw)
GLenum srcA = ctx->Color.Blend[0].SrcA;
GLenum dstA = ctx->Color.Blend[0].DstA;

+ if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
+ srcRGB = dstRGB = GL_ONE;
+ }
+
+ if (eqA == GL_MIN || eqA == GL_MAX) {
+ srcA = dstA = GL_ONE;
+ }
+
/* If the renderbuffer is XRGB, we have to frob the blend function to
* force the destination alpha to 1.0. This means replacing GL_DST_ALPHA
* with GL_ONE and GL_ONE_MINUS_DST_ALPHA with GL_ZERO.
*/
- if (ctx->DrawBuffer->Visual.alphaBits == 0) {
+ const struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0];
+ if (rb && !_mesa_base_format_has_channel(rb->_BaseFormat,
+ GL_TEXTURE_ALPHA_TYPE)) {
srcRGB = brw_fix_xRGB_alpha(srcRGB);
srcA = brw_fix_xRGB_alpha(srcA);
dstRGB = brw_fix_xRGB_alpha(dstRGB);
dstA = brw_fix_xRGB_alpha(dstA);
}

- if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
- srcRGB = dstRGB = GL_ONE;
- }
-
- if (eqA == GL_MIN || eqA == GL_MAX) {
- srcA = dstA = GL_ONE;
- }
-
cc->cc6.dest_blend_factor = brw_translate_blend_factor(dstRGB);
cc->cc6.src_blend_factor = brw_translate_blend_factor(srcRGB);
cc->cc6.blend_function = brw_translate_blend_equation(eqRGB);
--
2.9.4
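
To illustrate the XRGB fix-up described in the comment in upload_cc_unit(), here is a minimal sketch. It is not the driver's brw_fix_xRGB_alpha(), whose body is not shown in this patch; the enum and the function name fix_xrgb_alpha() are hypothetical stand-ins for the GL blend-factor enums, and only the two replacements named in the comment are modeled.

```c
/* Hypothetical stand-ins for the GL blend factors; the real driver works
 * on GLenum values from the GL headers. */
enum blend_factor {
   FACTOR_ZERO,
   FACTOR_ONE,
   FACTOR_SRC_ALPHA,
   FACTOR_DST_ALPHA,
   FACTOR_ONE_MINUS_DST_ALPHA,
};

/* When the render target has no alpha channel, blending should behave as
 * if destination alpha were always 1.0. */
static enum blend_factor
fix_xrgb_alpha(enum blend_factor f)
{
   switch (f) {
   case FACTOR_DST_ALPHA:
      return FACTOR_ONE;    /* A_dst is treated as 1.0 */
   case FACTOR_ONE_MINUS_DST_ALPHA:
      return FACTOR_ZERO;   /* 1.0 - A_dst is then 0.0 */
   default:
      return f;             /* factors that do not read A_dst are kept */
   }
}
```

In the patch the same kind of mapping is applied to all four factors (srcRGB, srcA, dstRGB, dstA) once the alpha check fails.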
Kenneth Graunke
2017-06-17 17:38:26 UTC
Post by Rafael Antognolli
gen6+ uses _mesa_base_format_has_channel() to check for the alpha
channel, while gen4-5 use ctx->DrawBuffer->Visual.alphaBits. By using
_mesa_base_format_has_channel() here we keep the same behavior across
all gen.
While initially both ways of checking the alpha channel seemed correct
to me, this change also seems to fix fbo-blending-formats piglit test on
gen4.
---
src/mesa/drivers/dri/i965/brw_cc.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index 78d3bc8..cdaa696 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -34,6 +34,7 @@
#include "brw_state.h"
#include "brw_defines.h"
#include "brw_util.h"
+#include "main/glformats.h"
#include "main/macros.h"
#include "main/stencil.h"
#include "intel_batchbuffer.h"
@@ -122,25 +123,27 @@ static void upload_cc_unit(struct brw_context *brw)
GLenum srcA = ctx->Color.Blend[0].SrcA;
GLenum dstA = ctx->Color.Blend[0].DstA;
+ if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
+ srcRGB = dstRGB = GL_ONE;
+ }
+
+ if (eqA == GL_MIN || eqA == GL_MAX) {
+ srcA = dstA = GL_ONE;
+ }
+
/* If the renderbuffer is XRGB, we have to frob the blend function to
* force the destination alpha to 1.0. This means replacing GL_DST_ALPHA
* with GL_ONE and GL_ONE_MINUS_DST_ALPHA with GL_ZERO.
*/
- if (ctx->DrawBuffer->Visual.alphaBits == 0) {
+ const struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0];
+ if (rb && !_mesa_base_format_has_channel(rb->_BaseFormat,
+ GL_TEXTURE_ALPHA_TYPE)) {
Mesa core in framebuffer.c does:

fb->Visual.alphaBits = _mesa_get_format_bits(fmt, GL_ALPHA_BITS);

which uses the actual Mesa format we chose for the buffer. In contrast,
_mesa_base_format_has_channel() looks at the GL format enums, i.e. the
format the application actually requested. In other words, the application
might have asked for GL_RGB, but we promoted it to MESA_FORMAT_RGBA*.

I think this is a good change.
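
The difference between the two checks can be sketched as follows. This is a toy model under stated assumptions, not Mesa code: struct renderbuffer, its fields and both helper functions are hypothetical names that only mimic the idea of a requested base format versus the storage format that was actually chosen.

```c
#include <stdbool.h>

/* Hypothetical: what the application asked for. */
enum base_format { BASE_RGB, BASE_RGBA };

/* Hypothetical renderbuffer: the requested base format plus the number of
 * alpha bits in the storage format the driver actually picked. */
struct renderbuffer {
   enum base_format base_format;
   int alpha_bits;
};

/* Old gen4-5 style check: looks at the storage format. */
static bool
has_alpha_by_bits(const struct renderbuffer *rb)
{
   return rb->alpha_bits > 0;
}

/* gen6+ style check: looks at what was requested. */
static bool
has_alpha_by_base_format(const struct renderbuffer *rb)
{
   return rb->base_format == BASE_RGBA;
}
```

For a buffer requested as RGB but stored with 8 alpha bits, the first check reports an alpha channel and the second does not, so only the second one would trigger the XRGB blend-factor fix-up.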
Post by Rafael Antognolli
srcRGB = brw_fix_xRGB_alpha(srcRGB);
srcA = brw_fix_xRGB_alpha(srcA);
dstRGB = brw_fix_xRGB_alpha(dstRGB);
dstA = brw_fix_xRGB_alpha(dstA);
}
- if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
- srcRGB = dstRGB = GL_ONE;
- }
-
- if (eqA == GL_MIN || eqA == GL_MAX) {
- srcA = dstA = GL_ONE;
- }
-
Why are these moved? It seems harmless, but unrelated to what the
patch claims to do. Is it just for consistency across the code?
Post by Rafael Antognolli
cc->cc6.dest_blend_factor = brw_translate_blend_factor(dstRGB);
cc->cc6.src_blend_factor = brw_translate_blend_factor(srcRGB);
cc->cc6.blend_function = brw_translate_blend_equation(eqRGB);
Rafael Antognolli
2017-06-19 16:09:33 UTC
Post by Kenneth Graunke
Post by Rafael Antognolli
gen6+ uses _mesa_base_format_has_channel() to check for the alpha
channel, while gen4-5 use ctx->DrawBuffer->Visual.alphaBits. By using
_mesa_base_format_has_channel() here we keep the same behavior across
all gen.
While initially both ways of checking the alpha channel seemed correct
to me, this change also seems to fix fbo-blending-formats piglit test on
gen4.
---
src/mesa/drivers/dri/i965/brw_cc.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index 78d3bc8..cdaa696 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -34,6 +34,7 @@
#include "brw_state.h"
#include "brw_defines.h"
#include "brw_util.h"
+#include "main/glformats.h"
#include "main/macros.h"
#include "main/stencil.h"
#include "intel_batchbuffer.h"
@@ -122,25 +123,27 @@ static void upload_cc_unit(struct brw_context *brw)
GLenum srcA = ctx->Color.Blend[0].SrcA;
GLenum dstA = ctx->Color.Blend[0].DstA;
+ if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
+ srcRGB = dstRGB = GL_ONE;
+ }
+
+ if (eqA == GL_MIN || eqA == GL_MAX) {
+ srcA = dstA = GL_ONE;
+ }
+
/* If the renderbuffer is XRGB, we have to frob the blend function to
* force the destination alpha to 1.0. This means replacing GL_DST_ALPHA
* with GL_ONE and GL_ONE_MINUS_DST_ALPHA with GL_ZERO.
*/
- if (ctx->DrawBuffer->Visual.alphaBits == 0) {
+ const struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[0];
+ if (rb && !_mesa_base_format_has_channel(rb->_BaseFormat,
+ GL_TEXTURE_ALPHA_TYPE)) {
fb->Visual.alphaBits = _mesa_get_format_bits(fmt, GL_ALPHA_BITS);
which uses the actual Mesa format we chose for the buffer. In contrast,
_mesa_base_format_has_channel() looks at the GL format enums - the format
they actually requested. In other words, they might have asked for GL_RGB,
but we promoted them to MESA_FORMAT_RGBA*.
I think this is a good change.
Post by Rafael Antognolli
srcRGB = brw_fix_xRGB_alpha(srcRGB);
srcA = brw_fix_xRGB_alpha(srcA);
dstRGB = brw_fix_xRGB_alpha(dstRGB);
dstA = brw_fix_xRGB_alpha(dstA);
}
- if (eqRGB == GL_MIN || eqRGB == GL_MAX) {
- srcRGB = dstRGB = GL_ONE;
- }
-
- if (eqA == GL_MIN || eqA == GL_MAX) {
- srcA = dstA = GL_ONE;
- }
-
Why are these moved? It seems harmless, but unrelated to what the
patch claims to do. Is it just for consistency across the code?
Yeah, just for consistency. OK, I'll split them into separate patches.
Post by Kenneth Graunke
Post by Rafael Antognolli
cc->cc6.dest_blend_factor = brw_translate_blend_factor(dstRGB);
cc->cc6.src_blend_factor = brw_translate_blend_factor(srcRGB);
cc->cc6.blend_function = brw_translate_blend_equation(eqRGB);
Rafael Antognolli
2017-06-16 23:31:31 UTC
The code doesn't get much simpler, but at least it is now in a single
place, and we delete more than we add.

Signed-off-by: Rafael Antognolli <***@intel.com>
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 121 ------------
src/mesa/drivers/dri/i965/brw_wm.h | 2 -
src/mesa/drivers/dri/i965/brw_wm_state.c | 274 --------------------------
src/mesa/drivers/dri/i965/genX_state_upload.c | 191 ++++++++++++++----
6 files changed, 153 insertions(+), 437 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_wm_state.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 89be92e..c15b3ef 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -61,7 +61,6 @@ i965_FILES = \
brw_vs_surface_state.c \
brw_wm.c \
brw_wm.h \
- brw_wm_state.c \
brw_wm_surface_state.c \
gen4_blorp_exec.h \
gen6_clip_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 8f3bd7f..9588a51 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -89,7 +89,6 @@ extern const struct brw_tracked_state brw_wm_image_surfaces;
extern const struct brw_tracked_state brw_cs_ubo_surfaces;
extern const struct brw_tracked_state brw_cs_abo_surfaces;
extern const struct brw_tracked_state brw_cs_image_surfaces;
-extern const struct brw_tracked_state brw_wm_unit;

extern const struct brw_tracked_state brw_psp_urb_cbs;

diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 5a0d91d..fb592be 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -65,127 +65,6 @@ struct brw_urb_fence
} bits1;
};

-/* State structs for the various fixed function units:
- */
-
-
-struct thread0
-{
- unsigned pad0:1;
- unsigned grf_reg_count:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer:26; /* Offset from GENERAL_STATE_BASE */
-};
-
-struct thread1
-{
- unsigned ext_halt_exception_enable:1;
- unsigned sw_exception_enable:1;
- unsigned mask_stack_exception_enable:1;
- unsigned timeout_exception_enable:1;
- unsigned illegal_op_exception_enable:1;
- unsigned pad0:3;
- unsigned depth_coef_urb_read_offset:6; /* WM only */
- unsigned pad1:2;
- unsigned floating_point_mode:1;
- unsigned thread_priority:1;
- unsigned binding_table_entry_count:8;
- unsigned pad3:5;
- unsigned single_program_flow:1;
-};
-
-struct thread2
-{
- unsigned per_thread_scratch_space:4;
- unsigned pad0:6;
- unsigned scratch_space_base_pointer:22;
-};
-
-
-struct thread3
-{
- unsigned dispatch_grf_start_reg:4;
- unsigned urb_entry_read_offset:6;
- unsigned pad0:1;
- unsigned urb_entry_read_length:6;
- unsigned pad1:1;
- unsigned const_urb_entry_read_offset:6;
- unsigned pad2:1;
- unsigned const_urb_entry_read_length:6;
- unsigned pad3:1;
-};
-
-struct brw_wm_unit_state
-{
- struct thread0 thread0;
- struct thread1 thread1;
- struct thread2 thread2;
- struct thread3 thread3;
-
- struct {
- unsigned stats_enable:1;
- unsigned depth_buffer_clear:1;
- unsigned sampler_count:3;
- unsigned sampler_state_pointer:27;
- } wm4;
-
- struct
- {
- unsigned enable_8_pix:1;
- unsigned enable_16_pix:1;
- unsigned enable_32_pix:1;
- unsigned enable_con_32_pix:1;
- unsigned enable_con_64_pix:1;
- unsigned pad0:1;
-
- /* These next four bits are for Ironlake+ */
- unsigned fast_span_coverage_enable:1;
- unsigned depth_buffer_clear:1;
- unsigned depth_buffer_resolve_enable:1;
- unsigned hierarchical_depth_buffer_resolve_enable:1;
-
- unsigned legacy_global_depth_bias:1;
- unsigned line_stipple:1;
- unsigned depth_offset:1;
- unsigned polygon_stipple:1;
- unsigned line_aa_region_width:2;
- unsigned line_endcap_aa_region_width:2;
- unsigned early_depth_test:1;
- unsigned thread_dispatch_enable:1;
- unsigned program_uses_depth:1;
- unsigned program_computes_depth:1;
- unsigned program_uses_killpixel:1;
- unsigned legacy_line_rast: 1;
- unsigned transposed_urb_read_enable:1;
- unsigned max_threads:7;
- } wm5;
-
- float global_depth_offset_constant;
- float global_depth_offset_scale;
-
- /* for Ironlake only */
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_1:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_1:26;
- } wm8;
-
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_2:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_2:26;
- } wm9;
-
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_3:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_3:26;
- } wm10;
-};
-
struct gen5_sampler_default_color {
uint8_t ub[4];
float f[4];
diff --git a/src/mesa/drivers/dri/i965/brw_wm.h b/src/mesa/drivers/dri/i965/brw_wm.h
index 613172a..113cdf3 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.h
+++ b/src/mesa/drivers/dri/i965/brw_wm.h
@@ -41,8 +41,6 @@
extern "C" {
#endif

-bool brw_color_buffer_write_enabled(struct brw_context *brw);
-
void
brw_upload_wm_prog(struct brw_context *brw);

diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c
deleted file mode 100644
index 69bbeb2..0000000
--- a/src/mesa/drivers/dri/i965/brw_wm_state.c
+++ /dev/null
@@ -1,274 +0,0 @@
-/*
- Copyright (C) Intel Corp. 2006. All Rights Reserved.
- Intel funded Tungsten Graphics to
- develop this 3D driver.
-
- Permission is hereby granted, free of charge, to any person obtaining
- a copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
-
- The above copyright notice and this permission notice (including the
- next paragraph) shall be included in all copies or substantial
- portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- **********************************************************************/
- /*
- * Authors:
- * Keith Whitwell <***@vmware.com>
- */
-
-
-
-#include "intel_batchbuffer.h"
-#include "intel_fbo.h"
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "brw_wm.h"
-#include "compiler/nir/nir.h"
-
-/***********************************************************************
- * WM unit - fragment programs and rasterization
- */
-
-bool
-brw_color_buffer_write_enabled(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
- /* BRW_NEW_FRAGMENT_PROGRAM */
- const struct gl_program *fp = brw->fragment_program;
- unsigned i;
-
- /* _NEW_BUFFERS */
- for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
- struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
- uint64_t outputs_written = fp->info.outputs_written;
-
- /* _NEW_COLOR */
- if (rb && (outputs_written & BITFIELD64_BIT(FRAG_RESULT_COLOR) ||
- outputs_written & BITFIELD64_BIT(FRAG_RESULT_DATA0 + i)) &&
- (ctx->Color.ColorMask[i][0] ||
- ctx->Color.ColorMask[i][1] ||
- ctx->Color.ColorMask[i][2] ||
- ctx->Color.ColorMask[i][3])) {
- return true;
- }
- }
-
- return false;
-}
-
-/**
- * Setup wm hardware state. See page 225 of Volume 2
- */
-static void
-brw_upload_wm_unit(struct brw_context *brw)
-{
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
- struct gl_context *ctx = &brw->ctx;
- /* BRW_NEW_FRAGMENT_PROGRAM */
- const struct gl_program *fp = brw->fragment_program;
- /* BRW_NEW_FS_PROG_DATA */
- const struct brw_wm_prog_data *prog_data =
- brw_wm_prog_data(brw->wm.base.prog_data);
- struct brw_wm_unit_state *wm;
-
- wm = brw_state_batch(brw, sizeof(*wm), 32, &brw->wm.base.state_offset);
- memset(wm, 0, sizeof(*wm));
-
- if (prog_data->dispatch_8 && prog_data->dispatch_16) {
- /* These two fields should be the same pre-gen6, which is why we
- * only have one hardware field to program for both dispatch
- * widths.
- */
- assert(prog_data->base.dispatch_grf_start_reg ==
- prog_data->dispatch_grf_start_reg_2);
- }
-
- /* BRW_NEW_PROGRAM_CACHE | BRW_NEW_FS_PROG_DATA */
- wm->wm5.enable_8_pix = prog_data->dispatch_8;
- wm->wm5.enable_16_pix = prog_data->dispatch_16;
-
- if (prog_data->dispatch_8 || prog_data->dispatch_16) {
- wm->thread0.grf_reg_count = prog_data->reg_blocks_0;
- wm->thread0.kernel_start_pointer =
- brw_program_reloc(brw,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, thread0),
- brw->wm.base.prog_offset +
- (wm->thread0.grf_reg_count << 1)) >> 6;
- }
-
- if (prog_data->prog_offset_2) {
- wm->wm9.grf_reg_count_2 = prog_data->reg_blocks_2;
- wm->wm9.kernel_start_pointer_2 =
- brw_program_reloc(brw,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, wm9),
- brw->wm.base.prog_offset +
- prog_data->prog_offset_2 +
- (wm->wm9.grf_reg_count_2 << 1)) >> 6;
- }
-
- wm->thread1.depth_coef_urb_read_offset = 1;
- if (prog_data->base.use_alt_mode)
- wm->thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754;
- else
- wm->thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754;
-
- wm->thread1.binding_table_entry_count =
- prog_data->base.binding_table.size_bytes / 4;
-
- if (prog_data->base.total_scratch != 0) {
- wm->thread2.scratch_space_base_pointer =
- brw->wm.base.scratch_bo->offset64 >> 10; /* reloc */
- wm->thread2.per_thread_scratch_space =
- ffs(brw->wm.base.per_thread_scratch) - 11;
- } else {
- wm->thread2.scratch_space_base_pointer = 0;
- wm->thread2.per_thread_scratch_space = 0;
- }
-
- wm->thread3.dispatch_grf_start_reg =
- prog_data->base.dispatch_grf_start_reg;
- wm->thread3.urb_entry_read_length =
- prog_data->num_varying_inputs * 2;
- wm->thread3.urb_entry_read_offset = 0;
- wm->thread3.const_urb_entry_read_length =
- prog_data->base.curb_read_length;
- /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
- wm->thread3.const_urb_entry_read_offset = brw->curbe.wm_start * 2;
-
- if (brw->gen == 5)
- wm->wm4.sampler_count = 0; /* hardware requirement */
- else {
- wm->wm4.sampler_count = (brw->wm.base.sampler_count + 1) / 4;
- }
-
- if (brw->wm.base.sampler_count) {
- /* BRW_NEW_SAMPLER_STATE_TABLE - reloc */
- wm->wm4.sampler_state_pointer = (brw->batch.bo->offset64 +
- brw->wm.base.sampler_offset) >> 5;
- } else {
- wm->wm4.sampler_state_pointer = 0;
- }
-
- /* BRW_NEW_FRAGMENT_PROGRAM */
- wm->wm5.program_uses_depth = prog_data->uses_src_depth;
- wm->wm5.program_computes_depth = (fp->info.outputs_written &
- BITFIELD64_BIT(FRAG_RESULT_DEPTH)) != 0;
- /* _NEW_BUFFERS
- * Override for NULL depthbuffer case, required by the Pixel Shader Computed
- * Depth field.
- */
- if (!intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH))
- wm->wm5.program_computes_depth = 0;
-
- /* _NEW_COLOR */
- wm->wm5.program_uses_killpixel =
- prog_data->uses_kill || ctx->Color.AlphaEnabled;
-
- wm->wm5.max_threads = devinfo->max_wm_threads - 1;
-
- /* _NEW_BUFFERS | _NEW_COLOR */
- if (brw_color_buffer_write_enabled(brw) ||
- wm->wm5.program_uses_killpixel ||
- wm->wm5.program_computes_depth) {
- wm->wm5.thread_dispatch_enable = 1;
- }
-
- wm->wm5.legacy_line_rast = 0;
- wm->wm5.legacy_global_depth_bias = 0;
- wm->wm5.early_depth_test = 1; /* never need to disable */
- wm->wm5.line_aa_region_width = 0;
- wm->wm5.line_endcap_aa_region_width = 1;
-
- /* _NEW_POLYGONSTIPPLE */
- wm->wm5.polygon_stipple = ctx->Polygon.StippleFlag;
-
- /* _NEW_POLYGON */
- if (ctx->Polygon.OffsetFill) {
- wm->wm5.depth_offset = 1;
- /* Something weird going on with legacy_global_depth_bias,
- * offset_constant, scaling and MRD. This value passes glean
- * but gives some odd results elsewere (eg. the
- * quad-offset-units test).
- */
- wm->global_depth_offset_constant = ctx->Polygon.OffsetUnits * 2;
-
- /* This is the only value that passes glean:
- */
- wm->global_depth_offset_scale = ctx->Polygon.OffsetFactor;
- }
-
- /* _NEW_LINE */
- wm->wm5.line_stipple = ctx->Line.StippleFlag;
-
- /* BRW_NEW_STATS_WM */
- if (brw->stats_wm)
- wm->wm4.stats_enable = 1;
-
- /* Emit scratch space relocation */
- if (prog_data->base.total_scratch != 0) {
- brw_emit_reloc(&brw->batch,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, thread2),
- brw->wm.base.scratch_bo,
- wm->thread2.per_thread_scratch_space,
- I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER);
- }
-
- /* Emit sampler state relocation */
- if (brw->wm.base.sampler_count != 0) {
- brw_emit_reloc(&brw->batch,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, wm4),
- brw->batch.bo,
- brw->wm.base.sampler_offset | wm->wm4.stats_enable |
- (wm->wm4.sampler_count << 2),
- I915_GEM_DOMAIN_INSTRUCTION, 0);
- }
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-
- /* _NEW_POLGYON */
- if (brw->wm.offset_clamp != ctx->Polygon.OffsetClamp) {
- BEGIN_BATCH(2);
- OUT_BATCH(_3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP << 16 | (2 - 2));
- OUT_BATCH_F(ctx->Polygon.OffsetClamp);
- ADVANCE_BATCH();
-
- brw->wm.offset_clamp = ctx->Polygon.OffsetClamp;
- }
-}
-
-const struct brw_tracked_state brw_wm_unit = {
- .dirty = {
- .mesa = _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_LINE |
- _NEW_POLYGON |
- _NEW_POLYGONSTIPPLE,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_PUSH_CONSTANT_ALLOCATION |
- BRW_NEW_FRAGMENT_PROGRAM |
- BRW_NEW_FS_PROG_DATA |
- BRW_NEW_PROGRAM_CACHE |
- BRW_NEW_SAMPLER_STATE_TABLE |
- BRW_NEW_STATS_WM,
- },
- .emit = brw_upload_wm_unit,
-};
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 4ff5394..bc64c5d 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1713,7 +1713,33 @@ static const struct brw_tracked_state genX(sf_state) = {

/* ---------------------------------------------------------------------- */

-#if GEN_GEN >= 6
+static bool
+brw_color_buffer_write_enabled(struct brw_context *brw)
+{
+ struct gl_context *ctx = &brw->ctx;
+ /* BRW_NEW_FRAGMENT_PROGRAM */
+ const struct gl_program *fp = brw->fragment_program;
+ unsigned i;
+
+ /* _NEW_BUFFERS */
+ for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
+ struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
+ uint64_t outputs_written = fp->info.outputs_written;
+
+ /* _NEW_COLOR */
+ if (rb && (outputs_written & BITFIELD64_BIT(FRAG_RESULT_COLOR) ||
+ outputs_written & BITFIELD64_BIT(FRAG_RESULT_DATA0 + i)) &&
+ (ctx->Color.ColorMask[i][0] ||
+ ctx->Color.ColorMask[i][1] ||
+ ctx->Color.ColorMask[i][2] ||
+ ctx->Color.ColorMask[i][3])) {
+ return true;
+ }
+ }
+
+ return false;
+}
+
static void
genX(upload_wm)(struct brw_context *brw)
{
@@ -1725,11 +1751,10 @@ genX(upload_wm)(struct brw_context *brw)

UNUSED bool writes_depth =
wm_prog_data->computed_depth_mode != BRW_PSCDEPTH_OFF;
+ UNUSED struct brw_stage_state *stage_state = &brw->wm.base;
+ UNUSED const struct gen_device_info *devinfo = &brw->screen->devinfo;

-#if GEN_GEN < 7
- const struct brw_stage_state *stage_state = &brw->wm.base;
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
-
+#if GEN_GEN == 6
/* We can't fold this into gen6_upload_wm_push_constants(), because
* according to the SNB PRM, vol 2 part 1 section 7.2.2
* (3DSTATE_CONSTANT_PS [DevSNB]):
@@ -1748,27 +1773,94 @@ genX(upload_wm)(struct brw_context *brw)
}
#endif

+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_WM), wm) {
- wm.StatisticsEnable = true;
wm.LineAntialiasingRegionWidth = _10pixels;
wm.LineEndCapAntialiasingRegionWidth = _05pixels;

+ wm.PointRasterizationRule = RASTRULE_UPPER_RIGHT;
+ wm.BarycentricInterpolationMode = wm_prog_data->barycentric_interp_modes;
+#else
+ ctx->NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
+ brw_state_emit(brw, GENX(WM_STATE), 64, &stage_state->state_offset, wm) {
+ if (wm_prog_data->dispatch_8 && wm_prog_data->dispatch_16) {
+ /* These two fields should be the same pre-gen6, which is why we
+ * only have one hardware field to program for both dispatch
+ * widths.
+ */
+ assert(wm_prog_data->base.dispatch_grf_start_reg ==
+ wm_prog_data->dispatch_grf_start_reg_2);
+ }
+
+ if (wm_prog_data->dispatch_8 || wm_prog_data->dispatch_16)
+ wm.GRFRegisterCount0 = wm_prog_data->reg_blocks_0;
+
+ if (stage_state->sampler_count)
+ wm.SamplerStatePointer =
+ instruction_ro_bo(brw->batch.bo, stage_state->sampler_offset);
+#if GEN_GEN == 5
+ if (wm_prog_data->prog_offset_2)
+ wm.GRFRegisterCount2 = wm_prog_data->reg_blocks_2;
+#endif
+
+ wm.SetupURBEntryReadLength = wm_prog_data->num_varying_inputs * 2;
+ wm.ConstantURBEntryReadLength = wm_prog_data->base.curb_read_length;
+ /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
+ wm.ConstantURBEntryReadOffset = brw->curbe.wm_start * 2;
+ wm.EarlyDepthTestEnable = true;
+ wm.LineAntialiasingRegionWidth = _05pixels;
+ wm.LineEndCapAntialiasingRegionWidth = _10pixels;
+
+ /* _NEW_POLYGON */
+ if (ctx->Polygon.OffsetFill) {
+ wm.GlobalDepthOffsetEnable = true;
+ /* Something weird going on with legacy_global_depth_bias,
+ * offset_constant, scaling and MRD. This value passes glean
+ * but gives some odd results elsewhere (eg. the
+ * quad-offset-units test).
+ */
+ wm.GlobalDepthOffsetConstant = ctx->Polygon.OffsetUnits * 2;
+
+ /* This is the only value that passes glean:
+ */
+ wm.GlobalDepthOffsetScale = ctx->Polygon.OffsetFactor;
+ }
+
+ wm.DepthCoefficientURBReadOffset = 1;
+#endif
+
+ /* BRW_NEW_STATS_WM */
+ wm.StatisticsEnable = GEN_GEN >= 6 || brw->stats_wm;
+
#if GEN_GEN < 7
if (wm_prog_data->base.use_alt_mode)
- wm.FloatingPointMode = Alternate;
+ wm.FloatingPointMode = FLOATING_POINT_MODE_Alternate;
+
+ wm.SamplerCount = GEN_GEN == 5 ?
+ 0 : DIV_ROUND_UP(stage_state->sampler_count, 4);

- wm.SamplerCount = DIV_ROUND_UP(stage_state->sampler_count, 4);
- wm.BindingTableEntryCount = wm_prog_data->base.binding_table.size_bytes / 4;
+ wm.BindingTableEntryCount =
+ wm_prog_data->base.binding_table.size_bytes / 4;
wm.MaximumNumberofThreads = devinfo->max_wm_threads - 1;
wm._8PixelDispatchEnable = wm_prog_data->dispatch_8;
wm._16PixelDispatchEnable = wm_prog_data->dispatch_16;
wm.DispatchGRFStartRegisterForConstantSetupData0 =
wm_prog_data->base.dispatch_grf_start_reg;
- wm.DispatchGRFStartRegisterForConstantSetupData2 =
- wm_prog_data->dispatch_grf_start_reg_2;
- wm.KernelStartPointer0 = stage_state->prog_offset;
- wm.KernelStartPointer2 = stage_state->prog_offset +
- wm_prog_data->prog_offset_2;
+ if (GEN_GEN == 6 ||
+ wm_prog_data->dispatch_8 || wm_prog_data->dispatch_16) {
+ wm.KernelStartPointer0 = KSP_ro(brw,
+ stage_state->prog_offset);
+ }
+
+#if GEN_GEN >= 5
+ if (GEN_GEN == 6 || wm_prog_data->prog_offset_2) {
+ wm.KernelStartPointer2 =
+ KSP_ro(brw, stage_state->prog_offset +
+ wm_prog_data->prog_offset_2);
+ }
+#endif
+
+#if GEN_GEN == 6
wm.DualSourceBlendEnable =
wm_prog_data->dual_src_blend && (ctx->Color.BlendEnabled & 1) &&
ctx->Color.Blend[0]._UsesDualSrc;
@@ -1792,42 +1884,34 @@ genX(upload_wm)(struct brw_context *brw)
else
wm.PositionXYOffsetSelect = POSOFFSET_NONE;

+ wm.DispatchGRFStartRegisterForConstantSetupData2 =
+ wm_prog_data->dispatch_grf_start_reg_2;
+#endif
+
if (wm_prog_data->base.total_scratch) {
wm.ScratchSpaceBasePointer =
- render_bo(stage_state->scratch_bo,
- ffs(stage_state->per_thread_scratch) - 11);
+ render_bo(stage_state->scratch_bo, 0);
+ wm.PerThreadScratchSpace =
+ ffs(stage_state->per_thread_scratch) - 11;
}

wm.PixelShaderComputedDepth = writes_depth;
#endif

- wm.PointRasterizationRule = RASTRULE_UPPER_RIGHT;
-
/* _NEW_LINE */
wm.LineStippleEnable = ctx->Line.StippleFlag;

/* _NEW_POLYGON */
wm.PolygonStippleEnable = ctx->Polygon.StippleFlag;
- wm.BarycentricInterpolationMode = wm_prog_data->barycentric_interp_modes;

#if GEN_GEN < 8
- /* _NEW_BUFFERS */
- const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;

- wm.PixelShaderUsesSourceDepth = wm_prog_data->uses_src_depth;
+#if GEN_GEN >= 6
wm.PixelShaderUsesSourceW = wm_prog_data->uses_src_w;
- if (wm_prog_data->uses_kill ||
- _mesa_is_alpha_test_enabled(ctx) ||
- _mesa_is_alpha_to_coverage_enabled(ctx) ||
- wm_prog_data->uses_omask) {
- wm.PixelShaderKillsPixel = true;
- }

- /* _NEW_BUFFERS | _NEW_COLOR */
- if (brw_color_buffer_write_enabled(brw) || writes_depth ||
- wm_prog_data->has_side_effects || wm.PixelShaderKillsPixel) {
- wm.ThreadDispatchEnable = true;
- }
+ /* _NEW_BUFFERS */
+ const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;
+
if (multisampled_fbo) {
/* _NEW_MULTISAMPLE */
if (ctx->Multisample.Enabled)
@@ -1843,6 +1927,21 @@ genX(upload_wm)(struct brw_context *brw)
wm.MultisampleRasterizationMode = MSRASTMODE_OFF_PIXEL;
wm.MultisampleDispatchMode = MSDISPMODE_PERSAMPLE;
}
+#endif
+ wm.PixelShaderUsesSourceDepth = wm_prog_data->uses_src_depth;
+ if (wm_prog_data->uses_kill ||
+ _mesa_is_alpha_test_enabled(ctx) ||
+ _mesa_is_alpha_to_coverage_enabled(ctx) ||
+ (GEN_GEN >= 6 && wm_prog_data->uses_omask)) {
+ wm.PixelShaderKillsPixel = true;
+ }
+
+ /* _NEW_BUFFERS | _NEW_COLOR */
+ if (brw_color_buffer_write_enabled(brw) || writes_depth ||
+ wm.PixelShaderKillsPixel ||
+ (GEN_GEN >= 6 && wm_prog_data->has_side_effects)) {
+ wm.ThreadDispatchEnable = true;
+ }

#if GEN_GEN >= 7
wm.PixelShaderComputedDepthMode = wm_prog_data->computed_depth_mode;
@@ -1873,6 +1972,16 @@ genX(upload_wm)(struct brw_context *brw)
wm.EarlyDepthStencilControl = EDSC_PSEXEC;
#endif
}
+
+#if GEN_GEN <= 5
+ if (brw->wm.offset_clamp != ctx->Polygon.OffsetClamp) {
+ brw_batch_emit(brw, GENX(3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP), clamp) {
+ clamp.GlobalDepthOffsetClamp = ctx->Polygon.OffsetClamp;
+ }
+
+ brw->wm.offset_clamp = ctx->Polygon.OffsetClamp;
+ }
+#endif
}

static const struct brw_tracked_state genX(wm_state) = {
@@ -1880,17 +1989,23 @@ static const struct brw_tracked_state genX(wm_state) = {
.mesa = _NEW_LINE |
_NEW_POLYGON |
(GEN_GEN < 8 ? _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_MULTISAMPLE :
+ _NEW_COLOR :
0) |
- (GEN_GEN < 7 ? _NEW_PROGRAM_CONSTANTS : 0),
+ (GEN_GEN == 6 ? _NEW_PROGRAM_CONSTANTS : 0) |
+ (GEN_GEN < 6 ? _NEW_POLYGONSTIPPLE : 0) |
+ (GEN_GEN < 8 && GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0),
.brw = BRW_NEW_BLORP |
BRW_NEW_FS_PROG_DATA |
+ (GEN_GEN < 6 ? BRW_NEW_PUSH_CONSTANT_ALLOCATION |
+ BRW_NEW_FRAGMENT_PROGRAM |
+ BRW_NEW_PROGRAM_CACHE |
+ BRW_NEW_SAMPLER_STATE_TABLE |
+ BRW_NEW_STATS_WM
+ : 0) |
(GEN_GEN < 7 ? BRW_NEW_BATCH : BRW_NEW_CONTEXT),
},
.emit = genX(upload_wm),
};
-#endif

/* ---------------------------------------------------------------------- */

@@ -4475,7 +4590,7 @@ genX(init_atoms)(struct brw_context *brw)
&brw_vs_samplers,

/* These set up state for brw_psp_urb_cbs */
- &brw_wm_unit,
+ &genX(wm_state),
&genX(sf_clip_viewport),
&genX(sf_state),
&genX(vs_state), /* always required, enabled or not */
--
2.9.4
Kristian Høgsberg
2017-06-19 16:46:30 UTC
On Fri, Jun 16, 2017 at 4:31 PM, Rafael Antognolli wrote:
Post by Rafael Antognolli
The code doesn't get exactly a lot simpler but at least it is in a single
place, and we delete more than we add.
Another good point is that you get rid of struct brw_wm_unit_state,
which was a third mechanism for encoding GEN state. We used to have
GENXML, manual packing and these bitfield structs. Now we're down to
just GENXML and some manual packing.
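
To illustrate the distinction, here is a minimal sketch (not Mesa code) of
what "manual packing" looks like next to the removed bitfield structs. It
builds the first WM unit dword with explicit shifts and masks, using the bit
positions implied by the deleted struct thread0 (grf_reg_count in bits 1-3,
kernel_start_pointer in bits 6-31), and assumes the LSB-first bitfield layout
that GCC uses on little-endian x86. The helper names pack_thread0,
unpack_grf_reg_count and unpack_kernel_offset are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: pack one dword by hand.
 * Layout assumed from the removed struct thread0:
 *   bit  0     pad
 *   bits 1-3   grf_reg_count
 *   bits 4-5   pad
 *   bits 6-31  kernel_start_pointer (64-byte aligned byte offset >> 6)
 */
static inline uint32_t
pack_thread0(uint32_t grf_reg_count, uint32_t kernel_offset_bytes)
{
   uint32_t dw = 0;

   /* A 3-bit field starting at bit 1; extra high bits are masked off. */
   dw |= (grf_reg_count & 0x7u) << 1;

   /* The pointer field stores offset >> 6 at bit 6, which is the same as
    * keeping the byte offset and clearing its low 6 bits.
    */
   dw |= kernel_offset_bytes & ~0x3fu;

   return dw;
}

/* Hypothetical inverse helpers, handy for checking the layout. */
static inline uint32_t
unpack_grf_reg_count(uint32_t dw)
{
   return (dw >> 1) & 0x7u;
}

static inline uint32_t
unpack_kernel_offset(uint32_t dw)
{
   return dw & ~0x3fu;
}
```

For example, pack_thread0(3, 0x1000) yields 0x1006. GENXML generates this
kind of shift-and-mask code from the XML field descriptions, so the bit
positions live in one place instead of being repeated in a C struct.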

Kristian
Post by Rafael Antognolli
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 121 ------------
src/mesa/drivers/dri/i965/brw_wm.h | 2 -
src/mesa/drivers/dri/i965/brw_wm_state.c | 274 --------------------------
src/mesa/drivers/dri/i965/genX_state_upload.c | 191 ++++++++++++++----
6 files changed, 153 insertions(+), 437 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_wm_state.c
diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 89be92e..c15b3ef 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -61,7 +61,6 @@ i965_FILES = \
brw_vs_surface_state.c \
brw_wm.c \
brw_wm.h \
- brw_wm_state.c \
brw_wm_surface_state.c \
gen4_blorp_exec.h \
gen6_clip_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 8f3bd7f..9588a51 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -89,7 +89,6 @@ extern const struct brw_tracked_state brw_wm_image_surfaces;
extern const struct brw_tracked_state brw_cs_ubo_surfaces;
extern const struct brw_tracked_state brw_cs_abo_surfaces;
extern const struct brw_tracked_state brw_cs_image_surfaces;
-extern const struct brw_tracked_state brw_wm_unit;
extern const struct brw_tracked_state brw_psp_urb_cbs;
diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 5a0d91d..fb592be 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -65,127 +65,6 @@ struct brw_urb_fence
} bits1;
};
- */
-
-
-struct thread0
-{
- unsigned pad0:1;
- unsigned grf_reg_count:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer:26; /* Offset from GENERAL_STATE_BASE */
-};
-
-struct thread1
-{
- unsigned ext_halt_exception_enable:1;
- unsigned sw_exception_enable:1;
- unsigned mask_stack_exception_enable:1;
- unsigned timeout_exception_enable:1;
- unsigned illegal_op_exception_enable:1;
- unsigned pad0:3;
- unsigned depth_coef_urb_read_offset:6; /* WM only */
- unsigned pad1:2;
- unsigned floating_point_mode:1;
- unsigned thread_priority:1;
- unsigned binding_table_entry_count:8;
- unsigned pad3:5;
- unsigned single_program_flow:1;
-};
-
-struct thread2
-{
- unsigned per_thread_scratch_space:4;
- unsigned pad0:6;
- unsigned scratch_space_base_pointer:22;
-};
-
-
-struct thread3
-{
- unsigned dispatch_grf_start_reg:4;
- unsigned urb_entry_read_offset:6;
- unsigned pad0:1;
- unsigned urb_entry_read_length:6;
- unsigned pad1:1;
- unsigned const_urb_entry_read_offset:6;
- unsigned pad2:1;
- unsigned const_urb_entry_read_length:6;
- unsigned pad3:1;
-};
-
-struct brw_wm_unit_state
-{
- struct thread0 thread0;
- struct thread1 thread1;
- struct thread2 thread2;
- struct thread3 thread3;
-
- struct {
- unsigned stats_enable:1;
- unsigned depth_buffer_clear:1;
- unsigned sampler_count:3;
- unsigned sampler_state_pointer:27;
- } wm4;
-
- struct
- {
- unsigned enable_8_pix:1;
- unsigned enable_16_pix:1;
- unsigned enable_32_pix:1;
- unsigned enable_con_32_pix:1;
- unsigned enable_con_64_pix:1;
- unsigned pad0:1;
-
- /* These next four bits are for Ironlake+ */
- unsigned fast_span_coverage_enable:1;
- unsigned depth_buffer_clear:1;
- unsigned depth_buffer_resolve_enable:1;
- unsigned hierarchical_depth_buffer_resolve_enable:1;
-
- unsigned legacy_global_depth_bias:1;
- unsigned line_stipple:1;
- unsigned depth_offset:1;
- unsigned polygon_stipple:1;
- unsigned line_aa_region_width:2;
- unsigned line_endcap_aa_region_width:2;
- unsigned early_depth_test:1;
- unsigned thread_dispatch_enable:1;
- unsigned program_uses_depth:1;
- unsigned program_computes_depth:1;
- unsigned program_uses_killpixel:1;
- unsigned legacy_line_rast: 1;
- unsigned transposed_urb_read_enable:1;
- unsigned max_threads:7;
- } wm5;
-
- float global_depth_offset_constant;
- float global_depth_offset_scale;
-
- /* for Ironlake only */
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_1:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_1:26;
- } wm8;
-
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_2:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_2:26;
- } wm9;
-
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_3:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_3:26;
- } wm10;
-};
-
struct gen5_sampler_default_color {
uint8_t ub[4];
float f[4];
diff --git a/src/mesa/drivers/dri/i965/brw_wm.h b/src/mesa/drivers/dri/i965/brw_wm.h
index 613172a..113cdf3 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.h
+++ b/src/mesa/drivers/dri/i965/brw_wm.h
@@ -41,8 +41,6 @@
extern "C" {
#endif
-bool brw_color_buffer_write_enabled(struct brw_context *brw);
-
void
brw_upload_wm_prog(struct brw_context *brw);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c
deleted file mode 100644
index 69bbeb2..0000000
--- a/src/mesa/drivers/dri/i965/brw_wm_state.c
+++ /dev/null
@@ -1,274 +0,0 @@
-/*
- Copyright (C) Intel Corp. 2006. All Rights Reserved.
- Intel funded Tungsten Graphics to
- develop this 3D driver.
-
- Permission is hereby granted, free of charge, to any person obtaining
- a copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
-
- The above copyright notice and this permission notice (including the
- next paragraph) shall be included in all copies or substantial
- portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- **********************************************************************/
- /*
- */
-
-
-
-#include "intel_batchbuffer.h"
-#include "intel_fbo.h"
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "brw_wm.h"
-#include "compiler/nir/nir.h"
-
-/***********************************************************************
- * WM unit - fragment programs and rasterization
- */
-
-bool
-brw_color_buffer_write_enabled(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
- /* BRW_NEW_FRAGMENT_PROGRAM */
- const struct gl_program *fp = brw->fragment_program;
- unsigned i;
-
- /* _NEW_BUFFERS */
- for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
- struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
- uint64_t outputs_written = fp->info.outputs_written;
-
- /* _NEW_COLOR */
- if (rb && (outputs_written & BITFIELD64_BIT(FRAG_RESULT_COLOR) ||
- outputs_written & BITFIELD64_BIT(FRAG_RESULT_DATA0 + i)) &&
- (ctx->Color.ColorMask[i][0] ||
- ctx->Color.ColorMask[i][1] ||
- ctx->Color.ColorMask[i][2] ||
- ctx->Color.ColorMask[i][3])) {
- return true;
- }
- }
-
- return false;
-}
-
-/**
- * Setup wm hardware state. See page 225 of Volume 2
- */
-static void
-brw_upload_wm_unit(struct brw_context *brw)
-{
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
- struct gl_context *ctx = &brw->ctx;
- /* BRW_NEW_FRAGMENT_PROGRAM */
- const struct gl_program *fp = brw->fragment_program;
- /* BRW_NEW_FS_PROG_DATA */
- const struct brw_wm_prog_data *prog_data =
- brw_wm_prog_data(brw->wm.base.prog_data);
- struct brw_wm_unit_state *wm;
-
- wm = brw_state_batch(brw, sizeof(*wm), 32, &brw->wm.base.state_offset);
- memset(wm, 0, sizeof(*wm));
-
- if (prog_data->dispatch_8 && prog_data->dispatch_16) {
- /* These two fields should be the same pre-gen6, which is why we
- * only have one hardware field to program for both dispatch
- * widths.
- */
- assert(prog_data->base.dispatch_grf_start_reg ==
- prog_data->dispatch_grf_start_reg_2);
- }
-
- /* BRW_NEW_PROGRAM_CACHE | BRW_NEW_FS_PROG_DATA */
- wm->wm5.enable_8_pix = prog_data->dispatch_8;
- wm->wm5.enable_16_pix = prog_data->dispatch_16;
-
- if (prog_data->dispatch_8 || prog_data->dispatch_16) {
- wm->thread0.grf_reg_count = prog_data->reg_blocks_0;
- wm->thread0.kernel_start_pointer =
- brw_program_reloc(brw,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, thread0),
- brw->wm.base.prog_offset +
- (wm->thread0.grf_reg_count << 1)) >> 6;
- }
-
- if (prog_data->prog_offset_2) {
- wm->wm9.grf_reg_count_2 = prog_data->reg_blocks_2;
- wm->wm9.kernel_start_pointer_2 =
- brw_program_reloc(brw,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, wm9),
- brw->wm.base.prog_offset +
- prog_data->prog_offset_2 +
- (wm->wm9.grf_reg_count_2 << 1)) >> 6;
- }
-
- wm->thread1.depth_coef_urb_read_offset = 1;
- if (prog_data->base.use_alt_mode)
- wm->thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754;
- else
- wm->thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754;
-
- wm->thread1.binding_table_entry_count =
- prog_data->base.binding_table.size_bytes / 4;
-
- if (prog_data->base.total_scratch != 0) {
- wm->thread2.scratch_space_base_pointer =
- brw->wm.base.scratch_bo->offset64 >> 10; /* reloc */
- wm->thread2.per_thread_scratch_space =
- ffs(brw->wm.base.per_thread_scratch) - 11;
- } else {
- wm->thread2.scratch_space_base_pointer = 0;
- wm->thread2.per_thread_scratch_space = 0;
- }
-
- wm->thread3.dispatch_grf_start_reg =
- prog_data->base.dispatch_grf_start_reg;
- wm->thread3.urb_entry_read_length =
- prog_data->num_varying_inputs * 2;
- wm->thread3.urb_entry_read_offset = 0;
- wm->thread3.const_urb_entry_read_length =
- prog_data->base.curb_read_length;
- /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
- wm->thread3.const_urb_entry_read_offset = brw->curbe.wm_start * 2;
-
- if (brw->gen == 5)
- wm->wm4.sampler_count = 0; /* hardware requirement */
- else {
- wm->wm4.sampler_count = (brw->wm.base.sampler_count + 1) / 4;
- }
-
- if (brw->wm.base.sampler_count) {
- /* BRW_NEW_SAMPLER_STATE_TABLE - reloc */
- wm->wm4.sampler_state_pointer = (brw->batch.bo->offset64 +
- brw->wm.base.sampler_offset) >> 5;
- } else {
- wm->wm4.sampler_state_pointer = 0;
- }
-
- /* BRW_NEW_FRAGMENT_PROGRAM */
- wm->wm5.program_uses_depth = prog_data->uses_src_depth;
- wm->wm5.program_computes_depth = (fp->info.outputs_written &
- BITFIELD64_BIT(FRAG_RESULT_DEPTH)) != 0;
- /* _NEW_BUFFERS
- * Override for NULL depthbuffer case, required by the Pixel Shader Computed
- * Depth field.
- */
- if (!intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH))
- wm->wm5.program_computes_depth = 0;
-
- /* _NEW_COLOR */
- wm->wm5.program_uses_killpixel =
- prog_data->uses_kill || ctx->Color.AlphaEnabled;
-
- wm->wm5.max_threads = devinfo->max_wm_threads - 1;
-
- /* _NEW_BUFFERS | _NEW_COLOR */
- if (brw_color_buffer_write_enabled(brw) ||
- wm->wm5.program_uses_killpixel ||
- wm->wm5.program_computes_depth) {
- wm->wm5.thread_dispatch_enable = 1;
- }
-
- wm->wm5.legacy_line_rast = 0;
- wm->wm5.legacy_global_depth_bias = 0;
- wm->wm5.early_depth_test = 1; /* never need to disable */
- wm->wm5.line_aa_region_width = 0;
- wm->wm5.line_endcap_aa_region_width = 1;
-
- /* _NEW_POLYGONSTIPPLE */
- wm->wm5.polygon_stipple = ctx->Polygon.StippleFlag;
-
- /* _NEW_POLYGON */
- if (ctx->Polygon.OffsetFill) {
- wm->wm5.depth_offset = 1;
- /* Something weird going on with legacy_global_depth_bias,
- * offset_constant, scaling and MRD. This value passes glean
- * but gives some odd results elsewere (eg. the
- * quad-offset-units test).
- */
- wm->global_depth_offset_constant = ctx->Polygon.OffsetUnits * 2;
-
- /* This is the only value that passes glean:
- */
- wm->global_depth_offset_scale = ctx->Polygon.OffsetFactor;
- }
-
- /* _NEW_LINE */
- wm->wm5.line_stipple = ctx->Line.StippleFlag;
-
- /* BRW_NEW_STATS_WM */
- if (brw->stats_wm)
- wm->wm4.stats_enable = 1;
-
- /* Emit scratch space relocation */
- if (prog_data->base.total_scratch != 0) {
- brw_emit_reloc(&brw->batch,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, thread2),
- brw->wm.base.scratch_bo,
- wm->thread2.per_thread_scratch_space,
- I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER);
- }
-
- /* Emit sampler state relocation */
- if (brw->wm.base.sampler_count != 0) {
- brw_emit_reloc(&brw->batch,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, wm4),
- brw->batch.bo,
- brw->wm.base.sampler_offset | wm->wm4.stats_enable |
- (wm->wm4.sampler_count << 2),
- I915_GEM_DOMAIN_INSTRUCTION, 0);
- }
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-
- /* _NEW_POLGYON */
- if (brw->wm.offset_clamp != ctx->Polygon.OffsetClamp) {
- BEGIN_BATCH(2);
- OUT_BATCH(_3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP << 16 | (2 - 2));
- OUT_BATCH_F(ctx->Polygon.OffsetClamp);
- ADVANCE_BATCH();
-
- brw->wm.offset_clamp = ctx->Polygon.OffsetClamp;
- }
-}
-
-const struct brw_tracked_state brw_wm_unit = {
- .dirty = {
- .mesa = _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_LINE |
- _NEW_POLYGON |
- _NEW_POLYGONSTIPPLE,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_PUSH_CONSTANT_ALLOCATION |
- BRW_NEW_FRAGMENT_PROGRAM |
- BRW_NEW_FS_PROG_DATA |
- BRW_NEW_PROGRAM_CACHE |
- BRW_NEW_SAMPLER_STATE_TABLE |
- BRW_NEW_STATS_WM,
- },
- .emit = brw_upload_wm_unit,
-};
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 4ff5394..bc64c5d 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1713,7 +1713,33 @@ static const struct brw_tracked_state genX(sf_state) = {
/* ---------------------------------------------------------------------- */
-#if GEN_GEN >= 6
+static bool
+brw_color_buffer_write_enabled(struct brw_context *brw)
+{
+ struct gl_context *ctx = &brw->ctx;
+ /* BRW_NEW_FRAGMENT_PROGRAM */
+ const struct gl_program *fp = brw->fragment_program;
+ unsigned i;
+
+ /* _NEW_BUFFERS */
+ for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
+ struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
+ uint64_t outputs_written = fp->info.outputs_written;
+
+ /* _NEW_COLOR */
+ if (rb && (outputs_written & BITFIELD64_BIT(FRAG_RESULT_COLOR) ||
+ outputs_written & BITFIELD64_BIT(FRAG_RESULT_DATA0 + i)) &&
+ (ctx->Color.ColorMask[i][0] ||
+ ctx->Color.ColorMask[i][1] ||
+ ctx->Color.ColorMask[i][2] ||
+ ctx->Color.ColorMask[i][3])) {
+ return true;
+ }
+ }
+
+ return false;
+}
+
static void
genX(upload_wm)(struct brw_context *brw)
{
@@ -1725,11 +1751,10 @@ genX(upload_wm)(struct brw_context *brw)
UNUSED bool writes_depth =
wm_prog_data->computed_depth_mode != BRW_PSCDEPTH_OFF;
+ UNUSED struct brw_stage_state *stage_state = &brw->wm.base;
+ UNUSED const struct gen_device_info *devinfo = &brw->screen->devinfo;
-#if GEN_GEN < 7
- const struct brw_stage_state *stage_state = &brw->wm.base;
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
-
+#if GEN_GEN == 6
/* We can't fold this into gen6_upload_wm_push_constants(), because
* according to the SNB PRM, vol 2 part 1 section 7.2.2
@@ -1748,27 +1773,94 @@ genX(upload_wm)(struct brw_context *brw)
}
#endif
+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_WM), wm) {
- wm.StatisticsEnable = true;
wm.LineAntialiasingRegionWidth = _10pixels;
wm.LineEndCapAntialiasingRegionWidth = _05pixels;
+ wm.PointRasterizationRule = RASTRULE_UPPER_RIGHT;
+ wm.BarycentricInterpolationMode = wm_prog_data->barycentric_interp_modes;
+#else
+ ctx->NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
+ brw_state_emit(brw, GENX(WM_STATE), 64, &stage_state->state_offset, wm) {
+ if (wm_prog_data->dispatch_8 && wm_prog_data->dispatch_16) {
+ /* These two fields should be the same pre-gen6, which is why we
+ * only have one hardware field to program for both dispatch
+ * widths.
+ */
+ assert(wm_prog_data->base.dispatch_grf_start_reg ==
+ wm_prog_data->dispatch_grf_start_reg_2);
+ }
+
+ if (wm_prog_data->dispatch_8 || wm_prog_data->dispatch_16)
+ wm.GRFRegisterCount0 = wm_prog_data->reg_blocks_0;
+
+ if (stage_state->sampler_count)
+ wm.SamplerStatePointer =
+ instruction_ro_bo(brw->batch.bo, stage_state->sampler_offset);
+#if GEN_GEN == 5
+ if (wm_prog_data->prog_offset_2)
+ wm.GRFRegisterCount2 = wm_prog_data->reg_blocks_2;
+#endif
+
+ wm.SetupURBEntryReadLength = wm_prog_data->num_varying_inputs * 2;
+ wm.ConstantURBEntryReadLength = wm_prog_data->base.curb_read_length;
+ /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
+ wm.ConstantURBEntryReadOffset = brw->curbe.wm_start * 2;
+ wm.EarlyDepthTestEnable = true;
+ wm.LineAntialiasingRegionWidth = _05pixels;
+ wm.LineEndCapAntialiasingRegionWidth = _10pixels;
+
+ /* _NEW_POLYGON */
+ if (ctx->Polygon.OffsetFill) {
+ wm.GlobalDepthOffsetEnable = true;
+ /* Something weird going on with legacy_global_depth_bias,
+ * offset_constant, scaling and MRD. This value passes glean
+ * but gives some odd results elsewere (eg. the
+ * quad-offset-units test).
+ */
+ wm.GlobalDepthOffsetConstant = ctx->Polygon.OffsetUnits * 2;
+
+ /* This is the only value that passes glean:
+ */
+ wm.GlobalDepthOffsetScale = ctx->Polygon.OffsetFactor;
+ }
+
+ wm.DepthCoefficientURBReadOffset = 1;
+#endif
+
+ /* BRW_NEW_STATS_WM */
+ wm.StatisticsEnable = GEN_GEN >= 6 || brw->stats_wm;
+
#if GEN_GEN < 7
if (wm_prog_data->base.use_alt_mode)
- wm.FloatingPointMode = Alternate;
+ wm.FloatingPointMode = FLOATING_POINT_MODE_Alternate;
+
+ wm.SamplerCount = GEN_GEN == 5 ?
+ 0 : DIV_ROUND_UP(stage_state->sampler_count, 4);
- wm.SamplerCount = DIV_ROUND_UP(stage_state->sampler_count, 4);
- wm.BindingTableEntryCount = wm_prog_data->base.binding_table.size_bytes / 4;
+ wm.BindingTableEntryCount =
+ wm_prog_data->base.binding_table.size_bytes / 4;
wm.MaximumNumberofThreads = devinfo->max_wm_threads - 1;
wm._8PixelDispatchEnable = wm_prog_data->dispatch_8;
wm._16PixelDispatchEnable = wm_prog_data->dispatch_16;
wm.DispatchGRFStartRegisterForConstantSetupData0 =
wm_prog_data->base.dispatch_grf_start_reg;
- wm.DispatchGRFStartRegisterForConstantSetupData2 =
- wm_prog_data->dispatch_grf_start_reg_2;
- wm.KernelStartPointer0 = stage_state->prog_offset;
- wm.KernelStartPointer2 = stage_state->prog_offset +
- wm_prog_data->prog_offset_2;
+ if (GEN_GEN == 6 ||
+ wm_prog_data->dispatch_8 || wm_prog_data->dispatch_16) {
+ wm.KernelStartPointer0 = KSP_ro(brw,
+ stage_state->prog_offset);
+ }
+
+#if GEN_GEN >= 5
+ if (GEN_GEN == 6 || wm_prog_data->prog_offset_2) {
+ wm.KernelStartPointer2 =
+ KSP_ro(brw, stage_state->prog_offset +
+ wm_prog_data->prog_offset_2);
+ }
+#endif
+
+#if GEN_GEN == 6
wm.DualSourceBlendEnable =
wm_prog_data->dual_src_blend && (ctx->Color.BlendEnabled & 1) &&
ctx->Color.Blend[0]._UsesDualSrc;
@@ -1792,42 +1884,34 @@ genX(upload_wm)(struct brw_context *brw)
else
wm.PositionXYOffsetSelect = POSOFFSET_NONE;
+ wm.DispatchGRFStartRegisterForConstantSetupData2 =
+ wm_prog_data->dispatch_grf_start_reg_2;
+#endif
+
if (wm_prog_data->base.total_scratch) {
wm.ScratchSpaceBasePointer =
- render_bo(stage_state->scratch_bo,
- ffs(stage_state->per_thread_scratch) - 11);
+ render_bo(stage_state->scratch_bo, 0);
+ wm.PerThreadScratchSpace =
+ ffs(stage_state->per_thread_scratch) - 11;
}
wm.PixelShaderComputedDepth = writes_depth;
#endif
- wm.PointRasterizationRule = RASTRULE_UPPER_RIGHT;
-
/* _NEW_LINE */
wm.LineStippleEnable = ctx->Line.StippleFlag;
/* _NEW_POLYGON */
wm.PolygonStippleEnable = ctx->Polygon.StippleFlag;
- wm.BarycentricInterpolationMode = wm_prog_data->barycentric_interp_modes;
#if GEN_GEN < 8
- /* _NEW_BUFFERS */
- const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;
- wm.PixelShaderUsesSourceDepth = wm_prog_data->uses_src_depth;
+#if GEN_GEN >= 6
wm.PixelShaderUsesSourceW = wm_prog_data->uses_src_w;
- if (wm_prog_data->uses_kill ||
- _mesa_is_alpha_test_enabled(ctx) ||
- _mesa_is_alpha_to_coverage_enabled(ctx) ||
- wm_prog_data->uses_omask) {
- wm.PixelShaderKillsPixel = true;
- }
- /* _NEW_BUFFERS | _NEW_COLOR */
- if (brw_color_buffer_write_enabled(brw) || writes_depth ||
- wm_prog_data->has_side_effects || wm.PixelShaderKillsPixel) {
- wm.ThreadDispatchEnable = true;
- }
+ /* _NEW_BUFFERS */
+ const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;
+
if (multisampled_fbo) {
/* _NEW_MULTISAMPLE */
if (ctx->Multisample.Enabled)
@@ -1843,6 +1927,21 @@ genX(upload_wm)(struct brw_context *brw)
wm.MultisampleRasterizationMode = MSRASTMODE_OFF_PIXEL;
wm.MultisampleDispatchMode = MSDISPMODE_PERSAMPLE;
}
+#endif
+ wm.PixelShaderUsesSourceDepth = wm_prog_data->uses_src_depth;
+ if (wm_prog_data->uses_kill ||
+ _mesa_is_alpha_test_enabled(ctx) ||
+ _mesa_is_alpha_to_coverage_enabled(ctx) ||
+ (GEN_GEN >= 6 && wm_prog_data->uses_omask)) {
+ wm.PixelShaderKillsPixel = true;
+ }
+
+ /* _NEW_BUFFERS | _NEW_COLOR */
+ if (brw_color_buffer_write_enabled(brw) || writes_depth ||
+ wm.PixelShaderKillsPixel ||
+ (GEN_GEN >= 6 && wm_prog_data->has_side_effects)) {
+ wm.ThreadDispatchEnable = true;
+ }
#if GEN_GEN >= 7
wm.PixelShaderComputedDepthMode = wm_prog_data->computed_depth_mode;
@@ -1873,6 +1972,16 @@ genX(upload_wm)(struct brw_context *brw)
wm.EarlyDepthStencilControl = EDSC_PSEXEC;
#endif
}
+
+#if GEN_GEN <= 5
+ if (brw->wm.offset_clamp != ctx->Polygon.OffsetClamp) {
+ brw_batch_emit(brw, GENX(3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP), clamp) {
+ clamp.GlobalDepthOffsetClamp = ctx->Polygon.OffsetClamp;
+ }
+
+ brw->wm.offset_clamp = ctx->Polygon.OffsetClamp;
+ }
+#endif
}
static const struct brw_tracked_state genX(wm_state) = {
@@ -1880,17 +1989,23 @@ static const struct brw_tracked_state genX(wm_state) = {
.mesa = _NEW_LINE |
_NEW_POLYGON |
(GEN_GEN < 8 ? _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_MULTISAMPLE :
+ _NEW_COLOR :
0) |
- (GEN_GEN < 7 ? _NEW_PROGRAM_CONSTANTS : 0),
+ (GEN_GEN == 6 ? _NEW_PROGRAM_CONSTANTS : 0) |
+ (GEN_GEN < 6 ? _NEW_POLYGONSTIPPLE : 0) |
+ (GEN_GEN < 8 && GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0),
.brw = BRW_NEW_BLORP |
BRW_NEW_FS_PROG_DATA |
+ (GEN_GEN < 6 ? BRW_NEW_PUSH_CONSTANT_ALLOCATION |
+ BRW_NEW_FRAGMENT_PROGRAM |
+ BRW_NEW_PROGRAM_CACHE |
+ BRW_NEW_SAMPLER_STATE_TABLE |
+ BRW_NEW_STATS_WM
+ : 0) |
(GEN_GEN < 7 ? BRW_NEW_BATCH : BRW_NEW_CONTEXT),
},
.emit = genX(upload_wm),
};
-#endif
/* ---------------------------------------------------------------------- */
@@ -4475,7 +4590,7 @@ genX(init_atoms)(struct brw_context *brw)
&brw_vs_samplers,
/* These set up state for brw_psp_urb_cbs */
- &brw_wm_unit,
+ &genX(wm_state),
&genX(sf_clip_viewport),
&genX(sf_state),
&genX(vs_state), /* always required, enabled or not */
--
2.9.4
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Rafael Antognolli
2017-06-19 18:17:24 UTC
Post by Kristian Høgsberg
On Fri, Jun 16, 2017 at 4:31 PM, Rafael Antognolli wrote:
Post by Rafael Antognolli
The code doesn't get exactly a lot simpler but at least it is in a single
place, and we delete more than we add.
Another good point is that you get rid of struct brw_wm_unit_state,
which was a third mechanism for encoding GEN state. We used to have
GENXML, manual packing and these bitfield structs. Now we're down to
just GENXML and some manual packing.
Nice, I think I can add this to the commit message if you don't mind :)
Post by Kristian Høgsberg
Kristian
Post by Rafael Antognolli
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 121 ------------
src/mesa/drivers/dri/i965/brw_wm.h | 2 -
src/mesa/drivers/dri/i965/brw_wm_state.c | 274 --------------------------
src/mesa/drivers/dri/i965/genX_state_upload.c | 191 ++++++++++++++----
6 files changed, 153 insertions(+), 437 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_wm_state.c
diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 89be92e..c15b3ef 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -61,7 +61,6 @@ i965_FILES = \
brw_vs_surface_state.c \
brw_wm.c \
brw_wm.h \
- brw_wm_state.c \
brw_wm_surface_state.c \
gen4_blorp_exec.h \
gen6_clip_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 8f3bd7f..9588a51 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -89,7 +89,6 @@ extern const struct brw_tracked_state brw_wm_image_surfaces;
extern const struct brw_tracked_state brw_cs_ubo_surfaces;
extern const struct brw_tracked_state brw_cs_abo_surfaces;
extern const struct brw_tracked_state brw_cs_image_surfaces;
-extern const struct brw_tracked_state brw_wm_unit;
extern const struct brw_tracked_state brw_psp_urb_cbs;
diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 5a0d91d..fb592be 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -65,127 +65,6 @@ struct brw_urb_fence
} bits1;
};
- */
-
-
-struct thread0
-{
- unsigned pad0:1;
- unsigned grf_reg_count:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer:26; /* Offset from GENERAL_STATE_BASE */
-};
-
-struct thread1
-{
- unsigned ext_halt_exception_enable:1;
- unsigned sw_exception_enable:1;
- unsigned mask_stack_exception_enable:1;
- unsigned timeout_exception_enable:1;
- unsigned illegal_op_exception_enable:1;
- unsigned pad0:3;
- unsigned depth_coef_urb_read_offset:6; /* WM only */
- unsigned pad1:2;
- unsigned floating_point_mode:1;
- unsigned thread_priority:1;
- unsigned binding_table_entry_count:8;
- unsigned pad3:5;
- unsigned single_program_flow:1;
-};
-
-struct thread2
-{
- unsigned per_thread_scratch_space:4;
- unsigned pad0:6;
- unsigned scratch_space_base_pointer:22;
-};
-
-
-struct thread3
-{
- unsigned dispatch_grf_start_reg:4;
- unsigned urb_entry_read_offset:6;
- unsigned pad0:1;
- unsigned urb_entry_read_length:6;
- unsigned pad1:1;
- unsigned const_urb_entry_read_offset:6;
- unsigned pad2:1;
- unsigned const_urb_entry_read_length:6;
- unsigned pad3:1;
-};
-
-struct brw_wm_unit_state
-{
- struct thread0 thread0;
- struct thread1 thread1;
- struct thread2 thread2;
- struct thread3 thread3;
-
- struct {
- unsigned stats_enable:1;
- unsigned depth_buffer_clear:1;
- unsigned sampler_count:3;
- unsigned sampler_state_pointer:27;
- } wm4;
-
- struct
- {
- unsigned enable_8_pix:1;
- unsigned enable_16_pix:1;
- unsigned enable_32_pix:1;
- unsigned enable_con_32_pix:1;
- unsigned enable_con_64_pix:1;
- unsigned pad0:1;
-
- /* These next four bits are for Ironlake+ */
- unsigned fast_span_coverage_enable:1;
- unsigned depth_buffer_clear:1;
- unsigned depth_buffer_resolve_enable:1;
- unsigned hierarchical_depth_buffer_resolve_enable:1;
-
- unsigned legacy_global_depth_bias:1;
- unsigned line_stipple:1;
- unsigned depth_offset:1;
- unsigned polygon_stipple:1;
- unsigned line_aa_region_width:2;
- unsigned line_endcap_aa_region_width:2;
- unsigned early_depth_test:1;
- unsigned thread_dispatch_enable:1;
- unsigned program_uses_depth:1;
- unsigned program_computes_depth:1;
- unsigned program_uses_killpixel:1;
- unsigned legacy_line_rast: 1;
- unsigned transposed_urb_read_enable:1;
- unsigned max_threads:7;
- } wm5;
-
- float global_depth_offset_constant;
- float global_depth_offset_scale;
-
- /* for Ironlake only */
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_1:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_1:26;
- } wm8;
-
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_2:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_2:26;
- } wm9;
-
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_3:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_3:26;
- } wm10;
-};
-
struct gen5_sampler_default_color {
uint8_t ub[4];
float f[4];
diff --git a/src/mesa/drivers/dri/i965/brw_wm.h b/src/mesa/drivers/dri/i965/brw_wm.h
index 613172a..113cdf3 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.h
+++ b/src/mesa/drivers/dri/i965/brw_wm.h
@@ -41,8 +41,6 @@
extern "C" {
#endif
-bool brw_color_buffer_write_enabled(struct brw_context *brw);
-
void
brw_upload_wm_prog(struct brw_context *brw);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c
deleted file mode 100644
index 69bbeb2..0000000
--- a/src/mesa/drivers/dri/i965/brw_wm_state.c
+++ /dev/null
@@ -1,274 +0,0 @@
-/*
- Copyright (C) Intel Corp. 2006. All Rights Reserved.
- Intel funded Tungsten Graphics to
- develop this 3D driver.
-
- Permission is hereby granted, free of charge, to any person obtaining
- a copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
-
- The above copyright notice and this permission notice (including the
- next paragraph) shall be included in all copies or substantial
- portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- **********************************************************************/
- /*
- */
-
-
-
-#include "intel_batchbuffer.h"
-#include "intel_fbo.h"
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "brw_wm.h"
-#include "compiler/nir/nir.h"
-
-/***********************************************************************
- * WM unit - fragment programs and rasterization
- */
-
-bool
-brw_color_buffer_write_enabled(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
- /* BRW_NEW_FRAGMENT_PROGRAM */
- const struct gl_program *fp = brw->fragment_program;
- unsigned i;
-
- /* _NEW_BUFFERS */
- for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
- struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
- uint64_t outputs_written = fp->info.outputs_written;
-
- /* _NEW_COLOR */
- if (rb && (outputs_written & BITFIELD64_BIT(FRAG_RESULT_COLOR) ||
- outputs_written & BITFIELD64_BIT(FRAG_RESULT_DATA0 + i)) &&
- (ctx->Color.ColorMask[i][0] ||
- ctx->Color.ColorMask[i][1] ||
- ctx->Color.ColorMask[i][2] ||
- ctx->Color.ColorMask[i][3])) {
- return true;
- }
- }
-
- return false;
-}
-
-/**
- * Setup wm hardware state. See page 225 of Volume 2
- */
-static void
-brw_upload_wm_unit(struct brw_context *brw)
-{
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
- struct gl_context *ctx = &brw->ctx;
- /* BRW_NEW_FRAGMENT_PROGRAM */
- const struct gl_program *fp = brw->fragment_program;
- /* BRW_NEW_FS_PROG_DATA */
- const struct brw_wm_prog_data *prog_data =
- brw_wm_prog_data(brw->wm.base.prog_data);
- struct brw_wm_unit_state *wm;
-
- wm = brw_state_batch(brw, sizeof(*wm), 32, &brw->wm.base.state_offset);
- memset(wm, 0, sizeof(*wm));
-
- if (prog_data->dispatch_8 && prog_data->dispatch_16) {
- /* These two fields should be the same pre-gen6, which is why we
- * only have one hardware field to program for both dispatch
- * widths.
- */
- assert(prog_data->base.dispatch_grf_start_reg ==
- prog_data->dispatch_grf_start_reg_2);
- }
-
- /* BRW_NEW_PROGRAM_CACHE | BRW_NEW_FS_PROG_DATA */
- wm->wm5.enable_8_pix = prog_data->dispatch_8;
- wm->wm5.enable_16_pix = prog_data->dispatch_16;
-
- if (prog_data->dispatch_8 || prog_data->dispatch_16) {
- wm->thread0.grf_reg_count = prog_data->reg_blocks_0;
- wm->thread0.kernel_start_pointer =
- brw_program_reloc(brw,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, thread0),
- brw->wm.base.prog_offset +
- (wm->thread0.grf_reg_count << 1)) >> 6;
- }
-
- if (prog_data->prog_offset_2) {
- wm->wm9.grf_reg_count_2 = prog_data->reg_blocks_2;
- wm->wm9.kernel_start_pointer_2 =
- brw_program_reloc(brw,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, wm9),
- brw->wm.base.prog_offset +
- prog_data->prog_offset_2 +
- (wm->wm9.grf_reg_count_2 << 1)) >> 6;
- }
-
- wm->thread1.depth_coef_urb_read_offset = 1;
- if (prog_data->base.use_alt_mode)
- wm->thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754;
- else
- wm->thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754;
-
- wm->thread1.binding_table_entry_count =
- prog_data->base.binding_table.size_bytes / 4;
-
- if (prog_data->base.total_scratch != 0) {
- wm->thread2.scratch_space_base_pointer =
- brw->wm.base.scratch_bo->offset64 >> 10; /* reloc */
- wm->thread2.per_thread_scratch_space =
- ffs(brw->wm.base.per_thread_scratch) - 11;
- } else {
- wm->thread2.scratch_space_base_pointer = 0;
- wm->thread2.per_thread_scratch_space = 0;
- }
-
- wm->thread3.dispatch_grf_start_reg =
- prog_data->base.dispatch_grf_start_reg;
- wm->thread3.urb_entry_read_length =
- prog_data->num_varying_inputs * 2;
- wm->thread3.urb_entry_read_offset = 0;
- wm->thread3.const_urb_entry_read_length =
- prog_data->base.curb_read_length;
- /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
- wm->thread3.const_urb_entry_read_offset = brw->curbe.wm_start * 2;
-
- if (brw->gen == 5)
- wm->wm4.sampler_count = 0; /* hardware requirement */
- else {
- wm->wm4.sampler_count = (brw->wm.base.sampler_count + 1) / 4;
- }
-
- if (brw->wm.base.sampler_count) {
- /* BRW_NEW_SAMPLER_STATE_TABLE - reloc */
- wm->wm4.sampler_state_pointer = (brw->batch.bo->offset64 +
- brw->wm.base.sampler_offset) >> 5;
- } else {
- wm->wm4.sampler_state_pointer = 0;
- }
-
- /* BRW_NEW_FRAGMENT_PROGRAM */
- wm->wm5.program_uses_depth = prog_data->uses_src_depth;
- wm->wm5.program_computes_depth = (fp->info.outputs_written &
- BITFIELD64_BIT(FRAG_RESULT_DEPTH)) != 0;
- /* _NEW_BUFFERS
- * Override for NULL depthbuffer case, required by the Pixel Shader Computed
- * Depth field.
- */
- if (!intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH))
- wm->wm5.program_computes_depth = 0;
-
- /* _NEW_COLOR */
- wm->wm5.program_uses_killpixel =
- prog_data->uses_kill || ctx->Color.AlphaEnabled;
-
- wm->wm5.max_threads = devinfo->max_wm_threads - 1;
-
- /* _NEW_BUFFERS | _NEW_COLOR */
- if (brw_color_buffer_write_enabled(brw) ||
- wm->wm5.program_uses_killpixel ||
- wm->wm5.program_computes_depth) {
- wm->wm5.thread_dispatch_enable = 1;
- }
-
- wm->wm5.legacy_line_rast = 0;
- wm->wm5.legacy_global_depth_bias = 0;
- wm->wm5.early_depth_test = 1; /* never need to disable */
- wm->wm5.line_aa_region_width = 0;
- wm->wm5.line_endcap_aa_region_width = 1;
-
- /* _NEW_POLYGONSTIPPLE */
- wm->wm5.polygon_stipple = ctx->Polygon.StippleFlag;
-
- /* _NEW_POLYGON */
- if (ctx->Polygon.OffsetFill) {
- wm->wm5.depth_offset = 1;
- /* Something weird going on with legacy_global_depth_bias,
- * offset_constant, scaling and MRD. This value passes glean
- * but gives some odd results elsewere (eg. the
- * quad-offset-units test).
- */
- wm->global_depth_offset_constant = ctx->Polygon.OffsetUnits * 2;
-
- /* This is the only value that passes glean:
- */
- wm->global_depth_offset_scale = ctx->Polygon.OffsetFactor;
- }
-
- /* _NEW_LINE */
- wm->wm5.line_stipple = ctx->Line.StippleFlag;
-
- /* BRW_NEW_STATS_WM */
- if (brw->stats_wm)
- wm->wm4.stats_enable = 1;
-
- /* Emit scratch space relocation */
- if (prog_data->base.total_scratch != 0) {
- brw_emit_reloc(&brw->batch,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, thread2),
- brw->wm.base.scratch_bo,
- wm->thread2.per_thread_scratch_space,
- I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER);
- }
-
- /* Emit sampler state relocation */
- if (brw->wm.base.sampler_count != 0) {
- brw_emit_reloc(&brw->batch,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, wm4),
- brw->batch.bo,
- brw->wm.base.sampler_offset | wm->wm4.stats_enable |
- (wm->wm4.sampler_count << 2),
- I915_GEM_DOMAIN_INSTRUCTION, 0);
- }
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-
- /* _NEW_POLGYON */
- if (brw->wm.offset_clamp != ctx->Polygon.OffsetClamp) {
- BEGIN_BATCH(2);
- OUT_BATCH(_3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP << 16 | (2 - 2));
- OUT_BATCH_F(ctx->Polygon.OffsetClamp);
- ADVANCE_BATCH();
-
- brw->wm.offset_clamp = ctx->Polygon.OffsetClamp;
- }
-}
-
-const struct brw_tracked_state brw_wm_unit = {
- .dirty = {
- .mesa = _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_LINE |
- _NEW_POLYGON |
- _NEW_POLYGONSTIPPLE,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_PUSH_CONSTANT_ALLOCATION |
- BRW_NEW_FRAGMENT_PROGRAM |
- BRW_NEW_FS_PROG_DATA |
- BRW_NEW_PROGRAM_CACHE |
- BRW_NEW_SAMPLER_STATE_TABLE |
- BRW_NEW_STATS_WM,
- },
- .emit = brw_upload_wm_unit,
-};
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 4ff5394..bc64c5d 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1713,7 +1713,33 @@ static const struct brw_tracked_state genX(sf_state) = {
/* ---------------------------------------------------------------------- */
-#if GEN_GEN >= 6
+static bool
+brw_color_buffer_write_enabled(struct brw_context *brw)
+{
+ struct gl_context *ctx = &brw->ctx;
+ /* BRW_NEW_FRAGMENT_PROGRAM */
+ const struct gl_program *fp = brw->fragment_program;
+ unsigned i;
+
+ /* _NEW_BUFFERS */
+ for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
+ struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
+ uint64_t outputs_written = fp->info.outputs_written;
+
+ /* _NEW_COLOR */
+ if (rb && (outputs_written & BITFIELD64_BIT(FRAG_RESULT_COLOR) ||
+ outputs_written & BITFIELD64_BIT(FRAG_RESULT_DATA0 + i)) &&
+ (ctx->Color.ColorMask[i][0] ||
+ ctx->Color.ColorMask[i][1] ||
+ ctx->Color.ColorMask[i][2] ||
+ ctx->Color.ColorMask[i][3])) {
+ return true;
+ }
+ }
+
+ return false;
+}
+
static void
genX(upload_wm)(struct brw_context *brw)
{
@@ -1725,11 +1751,10 @@ genX(upload_wm)(struct brw_context *brw)
UNUSED bool writes_depth =
wm_prog_data->computed_depth_mode != BRW_PSCDEPTH_OFF;
+ UNUSED struct brw_stage_state *stage_state = &brw->wm.base;
+ UNUSED const struct gen_device_info *devinfo = &brw->screen->devinfo;
-#if GEN_GEN < 7
- const struct brw_stage_state *stage_state = &brw->wm.base;
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
-
+#if GEN_GEN == 6
/* We can't fold this into gen6_upload_wm_push_constants(), because
* according to the SNB PRM, vol 2 part 1 section 7.2.2
@@ -1748,27 +1773,94 @@ genX(upload_wm)(struct brw_context *brw)
}
#endif
+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_WM), wm) {
- wm.StatisticsEnable = true;
wm.LineAntialiasingRegionWidth = _10pixels;
wm.LineEndCapAntialiasingRegionWidth = _05pixels;
+ wm.PointRasterizationRule = RASTRULE_UPPER_RIGHT;
+ wm.BarycentricInterpolationMode = wm_prog_data->barycentric_interp_modes;
+#else
+ ctx->NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
+ brw_state_emit(brw, GENX(WM_STATE), 64, &stage_state->state_offset, wm) {
+ if (wm_prog_data->dispatch_8 && wm_prog_data->dispatch_16) {
+ /* These two fields should be the same pre-gen6, which is why we
+ * only have one hardware field to program for both dispatch
+ * widths.
+ */
+ assert(wm_prog_data->base.dispatch_grf_start_reg ==
+ wm_prog_data->dispatch_grf_start_reg_2);
+ }
+
+ if (wm_prog_data->dispatch_8 || wm_prog_data->dispatch_16)
+ wm.GRFRegisterCount0 = wm_prog_data->reg_blocks_0;
+
+ if (stage_state->sampler_count)
+ wm.SamplerStatePointer =
+ instruction_ro_bo(brw->batch.bo, stage_state->sampler_offset);
+#if GEN_GEN == 5
+ if (wm_prog_data->prog_offset_2)
+ wm.GRFRegisterCount2 = wm_prog_data->reg_blocks_2;
+#endif
+
+ wm.SetupURBEntryReadLength = wm_prog_data->num_varying_inputs * 2;
+ wm.ConstantURBEntryReadLength = wm_prog_data->base.curb_read_length;
+ /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
+ wm.ConstantURBEntryReadOffset = brw->curbe.wm_start * 2;
+ wm.EarlyDepthTestEnable = true;
+ wm.LineAntialiasingRegionWidth = _05pixels;
+ wm.LineEndCapAntialiasingRegionWidth = _10pixels;
+
+ /* _NEW_POLYGON */
+ if (ctx->Polygon.OffsetFill) {
+ wm.GlobalDepthOffsetEnable = true;
+ /* Something weird going on with legacy_global_depth_bias,
+ * offset_constant, scaling and MRD. This value passes glean
+ * but gives some odd results elsewere (eg. the
+ * quad-offset-units test).
+ */
+ wm.GlobalDepthOffsetConstant = ctx->Polygon.OffsetUnits * 2;
+
+ /* This is the only value that passes glean:
+ */
+ wm.GlobalDepthOffsetScale = ctx->Polygon.OffsetFactor;
+ }
+
+ wm.DepthCoefficientURBReadOffset = 1;
+#endif
+
+ /* BRW_NEW_STATS_WM */
+ wm.StatisticsEnable = GEN_GEN >= 6 || brw->stats_wm;
+
#if GEN_GEN < 7
if (wm_prog_data->base.use_alt_mode)
- wm.FloatingPointMode = Alternate;
+ wm.FloatingPointMode = FLOATING_POINT_MODE_Alternate;
+
+ wm.SamplerCount = GEN_GEN == 5 ?
+ 0 : DIV_ROUND_UP(stage_state->sampler_count, 4);
- wm.SamplerCount = DIV_ROUND_UP(stage_state->sampler_count, 4);
- wm.BindingTableEntryCount = wm_prog_data->base.binding_table.size_bytes / 4;
+ wm.BindingTableEntryCount =
+ wm_prog_data->base.binding_table.size_bytes / 4;
wm.MaximumNumberofThreads = devinfo->max_wm_threads - 1;
wm._8PixelDispatchEnable = wm_prog_data->dispatch_8;
wm._16PixelDispatchEnable = wm_prog_data->dispatch_16;
wm.DispatchGRFStartRegisterForConstantSetupData0 =
wm_prog_data->base.dispatch_grf_start_reg;
- wm.DispatchGRFStartRegisterForConstantSetupData2 =
- wm_prog_data->dispatch_grf_start_reg_2;
- wm.KernelStartPointer0 = stage_state->prog_offset;
- wm.KernelStartPointer2 = stage_state->prog_offset +
- wm_prog_data->prog_offset_2;
+ if (GEN_GEN == 6 ||
+ wm_prog_data->dispatch_8 || wm_prog_data->dispatch_16) {
+ wm.KernelStartPointer0 = KSP_ro(brw,
+ stage_state->prog_offset);
+ }
+
+#if GEN_GEN >= 5
+ if (GEN_GEN == 6 || wm_prog_data->prog_offset_2) {
+ wm.KernelStartPointer2 =
+ KSP_ro(brw, stage_state->prog_offset +
+ wm_prog_data->prog_offset_2);
+ }
+#endif
+
+#if GEN_GEN == 6
wm.DualSourceBlendEnable =
wm_prog_data->dual_src_blend && (ctx->Color.BlendEnabled & 1) &&
ctx->Color.Blend[0]._UsesDualSrc;
@@ -1792,42 +1884,34 @@ genX(upload_wm)(struct brw_context *brw)
else
wm.PositionXYOffsetSelect = POSOFFSET_NONE;
+ wm.DispatchGRFStartRegisterForConstantSetupData2 =
+ wm_prog_data->dispatch_grf_start_reg_2;
+#endif
+
if (wm_prog_data->base.total_scratch) {
wm.ScratchSpaceBasePointer =
- render_bo(stage_state->scratch_bo,
- ffs(stage_state->per_thread_scratch) - 11);
+ render_bo(stage_state->scratch_bo, 0);
+ wm.PerThreadScratchSpace =
+ ffs(stage_state->per_thread_scratch) - 11;
}
wm.PixelShaderComputedDepth = writes_depth;
#endif
- wm.PointRasterizationRule = RASTRULE_UPPER_RIGHT;
-
/* _NEW_LINE */
wm.LineStippleEnable = ctx->Line.StippleFlag;
/* _NEW_POLYGON */
wm.PolygonStippleEnable = ctx->Polygon.StippleFlag;
- wm.BarycentricInterpolationMode = wm_prog_data->barycentric_interp_modes;
#if GEN_GEN < 8
- /* _NEW_BUFFERS */
- const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;
- wm.PixelShaderUsesSourceDepth = wm_prog_data->uses_src_depth;
+#if GEN_GEN >= 6
wm.PixelShaderUsesSourceW = wm_prog_data->uses_src_w;
- if (wm_prog_data->uses_kill ||
- _mesa_is_alpha_test_enabled(ctx) ||
- _mesa_is_alpha_to_coverage_enabled(ctx) ||
- wm_prog_data->uses_omask) {
- wm.PixelShaderKillsPixel = true;
- }
- /* _NEW_BUFFERS | _NEW_COLOR */
- if (brw_color_buffer_write_enabled(brw) || writes_depth ||
- wm_prog_data->has_side_effects || wm.PixelShaderKillsPixel) {
- wm.ThreadDispatchEnable = true;
- }
+ /* _NEW_BUFFERS */
+ const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;
+
if (multisampled_fbo) {
/* _NEW_MULTISAMPLE */
if (ctx->Multisample.Enabled)
@@ -1843,6 +1927,21 @@ genX(upload_wm)(struct brw_context *brw)
wm.MultisampleRasterizationMode = MSRASTMODE_OFF_PIXEL;
wm.MultisampleDispatchMode = MSDISPMODE_PERSAMPLE;
}
+#endif
+ wm.PixelShaderUsesSourceDepth = wm_prog_data->uses_src_depth;
+ if (wm_prog_data->uses_kill ||
+ _mesa_is_alpha_test_enabled(ctx) ||
+ _mesa_is_alpha_to_coverage_enabled(ctx) ||
+ (GEN_GEN >= 6 && wm_prog_data->uses_omask)) {
+ wm.PixelShaderKillsPixel = true;
+ }
+
+ /* _NEW_BUFFERS | _NEW_COLOR */
+ if (brw_color_buffer_write_enabled(brw) || writes_depth ||
+ wm.PixelShaderKillsPixel ||
+ (GEN_GEN >= 6 && wm_prog_data->has_side_effects)) {
+ wm.ThreadDispatchEnable = true;
+ }
#if GEN_GEN >= 7
wm.PixelShaderComputedDepthMode = wm_prog_data->computed_depth_mode;
@@ -1873,6 +1972,16 @@ genX(upload_wm)(struct brw_context *brw)
wm.EarlyDepthStencilControl = EDSC_PSEXEC;
#endif
}
+
+#if GEN_GEN <= 5
+ if (brw->wm.offset_clamp != ctx->Polygon.OffsetClamp) {
+ brw_batch_emit(brw, GENX(3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP), clamp) {
+ clamp.GlobalDepthOffsetClamp = ctx->Polygon.OffsetClamp;
+ }
+
+ brw->wm.offset_clamp = ctx->Polygon.OffsetClamp;
+ }
+#endif
}
static const struct brw_tracked_state genX(wm_state) = {
@@ -1880,17 +1989,23 @@ static const struct brw_tracked_state genX(wm_state) = {
.mesa = _NEW_LINE |
_NEW_POLYGON |
(GEN_GEN < 8 ? _NEW_BUFFERS |
- _NEW_COLOR |
0) |
- (GEN_GEN < 7 ? _NEW_PROGRAM_CONSTANTS : 0),
+ (GEN_GEN == 6 ? _NEW_PROGRAM_CONSTANTS : 0) |
+ (GEN_GEN < 6 ? _NEW_POLYGONSTIPPLE : 0) |
+ (GEN_GEN < 8 && GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0),
.brw = BRW_NEW_BLORP |
BRW_NEW_FS_PROG_DATA |
+ (GEN_GEN < 6 ? BRW_NEW_PUSH_CONSTANT_ALLOCATION |
+ BRW_NEW_FRAGMENT_PROGRAM |
+ BRW_NEW_PROGRAM_CACHE |
+ BRW_NEW_SAMPLER_STATE_TABLE |
+ BRW_NEW_STATS_WM
+ : 0) |
(GEN_GEN < 7 ? BRW_NEW_BATCH : BRW_NEW_CONTEXT),
},
.emit = genX(upload_wm),
};
-#endif
/* ---------------------------------------------------------------------- */
@@ -4475,7 +4590,7 @@ genX(init_atoms)(struct brw_context *brw)
&brw_vs_samplers,
/* These set up state for brw_psp_urb_cbs */
- &brw_wm_unit,
+ &genX(wm_state),
&genX(sf_clip_viewport),
&genX(sf_state),
&genX(vs_state), /* always required, enabled or not */
--
2.9.4
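
A note on the scratch-space hunk in the patch above: the old code passed `ffs(stage_state->per_thread_scratch) - 11` as the relocation delta, while the new code relocates with a delta of 0 and programs the same expression into `wm.PerThreadScratchSpace`. The following is a minimal sketch of that encoding only. The helper name is hypothetical, and the power-of-two, at-least-1-KiB precondition is an assumption read off the `- 11` bias, not something the thread states.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() */

/* Hypothetical helper, not part of the patch: encode a per-thread scratch
 * size in bytes the way the hunk does, i.e. ffs(size) - 11.
 * ffs() returns the 1-based index of the lowest set bit, so
 * 1 KiB (bit 10 set) gives 11 - 11 = 0, 2 KiB gives 1, and so on.
 * Assumption: size_bytes is a power of two and at least 1 KiB. */
uint32_t
encode_per_thread_scratch(uint32_t size_bytes)
{
   /* Reject zero, non-powers-of-two and sizes below 1 KiB. */
   assert(size_bytes >= 1024 && (size_bytes & (size_bytes - 1)) == 0);
   return (uint32_t)ffs((int)size_bytes) - 11;
}
```

Under those assumptions, each increment of the encoded value doubles the per-thread scratch size.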
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
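
One detail that is easy to miss when comparing the deleted `brw_upload_wm_unit()` with the new `WM_STATE` path: the old code computed the sampler count as `(sampler_count + 1) / 4`, while the merged code uses `DIV_ROUND_UP(stage_state->sampler_count, 4)` (gen5 keeps the field at 0 in both versions). The two expressions do not agree for every input. The sketch below compares only the arithmetic. The function names are made up for illustration, the macro is assumed to match Mesa's definition, and the thread does not settle whether the difference matters to the hardware.

```c
#include <stdint.h>

/* Assumed to match Mesa's DIV_ROUND_UP; repeated so the sketch compiles
 * on its own. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical name: the expression used by the deleted brw_wm_state.c. */
uint32_t
sampler_count_old(uint32_t n)
{
   return (n + 1) / 4;
}

/* Hypothetical name: the expression used by the merged genX code. */
uint32_t
sampler_count_new(uint32_t n)
{
   return DIV_ROUND_UP(n, 4);
}
```

For counts of 0, 3 and 4 the two agree. For 1, 2 and 5 the old expression yields one less than the rounded-up value.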
Kristian Høgsberg
2017-06-19 19:09:00 UTC
On Mon, Jun 19, 2017 at 11:17 AM, Rafael Antognolli wrote:
Post by Rafael Antognolli
Post by Kristian Høgsberg
On Fri, Jun 16, 2017 at 4:31 PM, Rafael Antognolli wrote:
Post by Rafael Antognolli
The code doesn't get exactly a lot simpler but at least it is in a single
place, and we delete more than we add.
Another good point is that you get rid of struct brw_wm_unit_state
which was a third mechanism for encoding GEN state. We used to have
GENXML, manual packing and these bitfield structs. Now we're down to
just GENXML and some manual packing.
Nice, I think I can add this to the commit message if you don't mind :)
Please do, that's why I brought it up ;-)
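
To illustrate the point about encoding mechanisms: the deleted `struct thread2` described one dword with C bitfields (`per_thread_scratch_space:4`, six pad bits, `scratch_space_base_pointer:22`), whereas generated or manual packing places each field explicitly by bit position. The following is only a sketch of explicit packing for that one dword. The `struct field` type and both function names are hypothetical, and the bit positions assume the bitfields were allocated starting from the least significant bit, which the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical field description: an inclusive bit range inside one 32-bit
 * dword, in the spirit of the start/end attributes in the genxml files. */
struct field {
   unsigned start;
   unsigned end;
};

/* Mask the value to the field width and shift it into position. */
static uint32_t
pack_field(uint32_t value, struct field f)
{
   unsigned width = f.end - f.start + 1;
   uint32_t mask = width == 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
   return (value & mask) << f.start;
}

/* Pack the dword that struct thread2 used to describe.
 * Assumed layout: bits 0..3 scratch size, bits 4..9 padding,
 * bits 10..31 scratch base pointer in 1 KiB units. */
uint32_t
pack_thread2(uint32_t per_thread_scratch_space, uint32_t scratch_base_1k)
{
   const struct field scratch_size = { 0, 3 };
   const struct field scratch_base = { 10, 31 };

   return pack_field(per_thread_scratch_space, scratch_size) |
          pack_field(scratch_base_1k, scratch_base);
}
```

Unlike a bitfield struct, the explicit form does not depend on how a given compiler lays out bitfields, and out-of-range values are masked rather than silently spilling into neighbouring fields.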
Post by Rafael Antognolli
Post by Kristian Høgsberg
Kristian
Post by Rafael Antognolli
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 121 ------------
src/mesa/drivers/dri/i965/brw_wm.h | 2 -
src/mesa/drivers/dri/i965/brw_wm_state.c | 274 --------------------------
src/mesa/drivers/dri/i965/genX_state_upload.c | 191 ++++++++++++++----
6 files changed, 153 insertions(+), 437 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_wm_state.c
diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 89be92e..c15b3ef 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -61,7 +61,6 @@ i965_FILES = \
brw_vs_surface_state.c \
brw_wm.c \
brw_wm.h \
- brw_wm_state.c \
brw_wm_surface_state.c \
gen4_blorp_exec.h \
gen6_clip_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 8f3bd7f..9588a51 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -89,7 +89,6 @@ extern const struct brw_tracked_state brw_wm_image_surfaces;
extern const struct brw_tracked_state brw_cs_ubo_surfaces;
extern const struct brw_tracked_state brw_cs_abo_surfaces;
extern const struct brw_tracked_state brw_cs_image_surfaces;
-extern const struct brw_tracked_state brw_wm_unit;
extern const struct brw_tracked_state brw_psp_urb_cbs;
diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h
index 5a0d91d..fb592be 100644
--- a/src/mesa/drivers/dri/i965/brw_structs.h
+++ b/src/mesa/drivers/dri/i965/brw_structs.h
@@ -65,127 +65,6 @@ struct brw_urb_fence
} bits1;
};
- */
-
-
-struct thread0
-{
- unsigned pad0:1;
- unsigned grf_reg_count:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer:26; /* Offset from GENERAL_STATE_BASE */
-};
-
-struct thread1
-{
- unsigned ext_halt_exception_enable:1;
- unsigned sw_exception_enable:1;
- unsigned mask_stack_exception_enable:1;
- unsigned timeout_exception_enable:1;
- unsigned illegal_op_exception_enable:1;
- unsigned pad0:3;
- unsigned depth_coef_urb_read_offset:6; /* WM only */
- unsigned pad1:2;
- unsigned floating_point_mode:1;
- unsigned thread_priority:1;
- unsigned binding_table_entry_count:8;
- unsigned pad3:5;
- unsigned single_program_flow:1;
-};
-
-struct thread2
-{
- unsigned per_thread_scratch_space:4;
- unsigned pad0:6;
- unsigned scratch_space_base_pointer:22;
-};
-
-
-struct thread3
-{
- unsigned dispatch_grf_start_reg:4;
- unsigned urb_entry_read_offset:6;
- unsigned pad0:1;
- unsigned urb_entry_read_length:6;
- unsigned pad1:1;
- unsigned const_urb_entry_read_offset:6;
- unsigned pad2:1;
- unsigned const_urb_entry_read_length:6;
- unsigned pad3:1;
-};
-
-struct brw_wm_unit_state
-{
- struct thread0 thread0;
- struct thread1 thread1;
- struct thread2 thread2;
- struct thread3 thread3;
-
- struct {
- unsigned stats_enable:1;
- unsigned depth_buffer_clear:1;
- unsigned sampler_count:3;
- unsigned sampler_state_pointer:27;
- } wm4;
-
- struct
- {
- unsigned enable_8_pix:1;
- unsigned enable_16_pix:1;
- unsigned enable_32_pix:1;
- unsigned enable_con_32_pix:1;
- unsigned enable_con_64_pix:1;
- unsigned pad0:1;
-
- /* These next four bits are for Ironlake+ */
- unsigned fast_span_coverage_enable:1;
- unsigned depth_buffer_clear:1;
- unsigned depth_buffer_resolve_enable:1;
- unsigned hierarchical_depth_buffer_resolve_enable:1;
-
- unsigned legacy_global_depth_bias:1;
- unsigned line_stipple:1;
- unsigned depth_offset:1;
- unsigned polygon_stipple:1;
- unsigned line_aa_region_width:2;
- unsigned line_endcap_aa_region_width:2;
- unsigned early_depth_test:1;
- unsigned thread_dispatch_enable:1;
- unsigned program_uses_depth:1;
- unsigned program_computes_depth:1;
- unsigned program_uses_killpixel:1;
- unsigned legacy_line_rast: 1;
- unsigned transposed_urb_read_enable:1;
- unsigned max_threads:7;
- } wm5;
-
- float global_depth_offset_constant;
- float global_depth_offset_scale;
-
- /* for Ironlake only */
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_1:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_1:26;
- } wm8;
-
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_2:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_2:26;
- } wm9;
-
- struct {
- unsigned pad0:1;
- unsigned grf_reg_count_3:3;
- unsigned pad1:2;
- unsigned kernel_start_pointer_3:26;
- } wm10;
-};
-
struct gen5_sampler_default_color {
uint8_t ub[4];
float f[4];
diff --git a/src/mesa/drivers/dri/i965/brw_wm.h b/src/mesa/drivers/dri/i965/brw_wm.h
index 613172a..113cdf3 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.h
+++ b/src/mesa/drivers/dri/i965/brw_wm.h
@@ -41,8 +41,6 @@
extern "C" {
#endif
-bool brw_color_buffer_write_enabled(struct brw_context *brw);
-
void
brw_upload_wm_prog(struct brw_context *brw);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c
deleted file mode 100644
index 69bbeb2..0000000
--- a/src/mesa/drivers/dri/i965/brw_wm_state.c
+++ /dev/null
@@ -1,274 +0,0 @@
-/*
- Copyright (C) Intel Corp. 2006. All Rights Reserved.
- Intel funded Tungsten Graphics to
- develop this 3D driver.
-
- Permission is hereby granted, free of charge, to any person obtaining
- a copy of this software and associated documentation files (the
- "Software"), to deal in the Software without restriction, including
- without limitation the rights to use, copy, modify, merge, publish,
- distribute, sublicense, and/or sell copies of the Software, and to
- permit persons to whom the Software is furnished to do so, subject to
- the following conditions:
-
- The above copyright notice and this permission notice (including the
- next paragraph) shall be included in all copies or substantial
- portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
- **********************************************************************/
- /*
- */
-
-
-
-#include "intel_batchbuffer.h"
-#include "intel_fbo.h"
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "brw_wm.h"
-#include "compiler/nir/nir.h"
-
-/***********************************************************************
- * WM unit - fragment programs and rasterization
- */
-
-bool
-brw_color_buffer_write_enabled(struct brw_context *brw)
-{
- struct gl_context *ctx = &brw->ctx;
- /* BRW_NEW_FRAGMENT_PROGRAM */
- const struct gl_program *fp = brw->fragment_program;
- unsigned i;
-
- /* _NEW_BUFFERS */
- for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
- struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
- uint64_t outputs_written = fp->info.outputs_written;
-
- /* _NEW_COLOR */
- if (rb && (outputs_written & BITFIELD64_BIT(FRAG_RESULT_COLOR) ||
- outputs_written & BITFIELD64_BIT(FRAG_RESULT_DATA0 + i)) &&
- (ctx->Color.ColorMask[i][0] ||
- ctx->Color.ColorMask[i][1] ||
- ctx->Color.ColorMask[i][2] ||
- ctx->Color.ColorMask[i][3])) {
- return true;
- }
- }
-
- return false;
-}
-
-/**
- * Setup wm hardware state. See page 225 of Volume 2
- */
-static void
-brw_upload_wm_unit(struct brw_context *brw)
-{
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
- struct gl_context *ctx = &brw->ctx;
- /* BRW_NEW_FRAGMENT_PROGRAM */
- const struct gl_program *fp = brw->fragment_program;
- /* BRW_NEW_FS_PROG_DATA */
- const struct brw_wm_prog_data *prog_data =
- brw_wm_prog_data(brw->wm.base.prog_data);
- struct brw_wm_unit_state *wm;
-
- wm = brw_state_batch(brw, sizeof(*wm), 32, &brw->wm.base.state_offset);
- memset(wm, 0, sizeof(*wm));
-
- if (prog_data->dispatch_8 && prog_data->dispatch_16) {
- /* These two fields should be the same pre-gen6, which is why we
- * only have one hardware field to program for both dispatch
- * widths.
- */
- assert(prog_data->base.dispatch_grf_start_reg ==
- prog_data->dispatch_grf_start_reg_2);
- }
-
- /* BRW_NEW_PROGRAM_CACHE | BRW_NEW_FS_PROG_DATA */
- wm->wm5.enable_8_pix = prog_data->dispatch_8;
- wm->wm5.enable_16_pix = prog_data->dispatch_16;
-
- if (prog_data->dispatch_8 || prog_data->dispatch_16) {
- wm->thread0.grf_reg_count = prog_data->reg_blocks_0;
- wm->thread0.kernel_start_pointer =
- brw_program_reloc(brw,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, thread0),
- brw->wm.base.prog_offset +
- (wm->thread0.grf_reg_count << 1)) >> 6;
- }
-
- if (prog_data->prog_offset_2) {
- wm->wm9.grf_reg_count_2 = prog_data->reg_blocks_2;
- wm->wm9.kernel_start_pointer_2 =
- brw_program_reloc(brw,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, wm9),
- brw->wm.base.prog_offset +
- prog_data->prog_offset_2 +
- (wm->wm9.grf_reg_count_2 << 1)) >> 6;
- }
-
- wm->thread1.depth_coef_urb_read_offset = 1;
- if (prog_data->base.use_alt_mode)
- wm->thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754;
- else
- wm->thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754;
-
- wm->thread1.binding_table_entry_count =
- prog_data->base.binding_table.size_bytes / 4;
-
- if (prog_data->base.total_scratch != 0) {
- wm->thread2.scratch_space_base_pointer =
- brw->wm.base.scratch_bo->offset64 >> 10; /* reloc */
- wm->thread2.per_thread_scratch_space =
- ffs(brw->wm.base.per_thread_scratch) - 11;
- } else {
- wm->thread2.scratch_space_base_pointer = 0;
- wm->thread2.per_thread_scratch_space = 0;
- }
-
- wm->thread3.dispatch_grf_start_reg =
- prog_data->base.dispatch_grf_start_reg;
- wm->thread3.urb_entry_read_length =
- prog_data->num_varying_inputs * 2;
- wm->thread3.urb_entry_read_offset = 0;
- wm->thread3.const_urb_entry_read_length =
- prog_data->base.curb_read_length;
- /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
- wm->thread3.const_urb_entry_read_offset = brw->curbe.wm_start * 2;
-
- if (brw->gen == 5)
- wm->wm4.sampler_count = 0; /* hardware requirement */
- else {
- wm->wm4.sampler_count = (brw->wm.base.sampler_count + 1) / 4;
- }
-
- if (brw->wm.base.sampler_count) {
- /* BRW_NEW_SAMPLER_STATE_TABLE - reloc */
- wm->wm4.sampler_state_pointer = (brw->batch.bo->offset64 +
- brw->wm.base.sampler_offset) >> 5;
- } else {
- wm->wm4.sampler_state_pointer = 0;
- }
-
- /* BRW_NEW_FRAGMENT_PROGRAM */
- wm->wm5.program_uses_depth = prog_data->uses_src_depth;
- wm->wm5.program_computes_depth = (fp->info.outputs_written &
- BITFIELD64_BIT(FRAG_RESULT_DEPTH)) != 0;
- /* _NEW_BUFFERS
- * Override for NULL depthbuffer case, required by the Pixel Shader Computed
- * Depth field.
- */
- if (!intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH))
- wm->wm5.program_computes_depth = 0;
-
- /* _NEW_COLOR */
- wm->wm5.program_uses_killpixel =
- prog_data->uses_kill || ctx->Color.AlphaEnabled;
-
- wm->wm5.max_threads = devinfo->max_wm_threads - 1;
-
- /* _NEW_BUFFERS | _NEW_COLOR */
- if (brw_color_buffer_write_enabled(brw) ||
- wm->wm5.program_uses_killpixel ||
- wm->wm5.program_computes_depth) {
- wm->wm5.thread_dispatch_enable = 1;
- }
-
- wm->wm5.legacy_line_rast = 0;
- wm->wm5.legacy_global_depth_bias = 0;
- wm->wm5.early_depth_test = 1; /* never need to disable */
- wm->wm5.line_aa_region_width = 0;
- wm->wm5.line_endcap_aa_region_width = 1;
-
- /* _NEW_POLYGONSTIPPLE */
- wm->wm5.polygon_stipple = ctx->Polygon.StippleFlag;
-
- /* _NEW_POLYGON */
- if (ctx->Polygon.OffsetFill) {
- wm->wm5.depth_offset = 1;
- /* Something weird going on with legacy_global_depth_bias,
- * offset_constant, scaling and MRD. This value passes glean
- * but gives some odd results elsewere (eg. the
- * quad-offset-units test).
- */
- wm->global_depth_offset_constant = ctx->Polygon.OffsetUnits * 2;
-
- /* This is the only value that passes glean:
- */
- wm->global_depth_offset_scale = ctx->Polygon.OffsetFactor;
- }
-
- /* _NEW_LINE */
- wm->wm5.line_stipple = ctx->Line.StippleFlag;
-
- /* BRW_NEW_STATS_WM */
- if (brw->stats_wm)
- wm->wm4.stats_enable = 1;
-
- /* Emit scratch space relocation */
- if (prog_data->base.total_scratch != 0) {
- brw_emit_reloc(&brw->batch,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, thread2),
- brw->wm.base.scratch_bo,
- wm->thread2.per_thread_scratch_space,
- I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER);
- }
-
- /* Emit sampler state relocation */
- if (brw->wm.base.sampler_count != 0) {
- brw_emit_reloc(&brw->batch,
- brw->wm.base.state_offset +
- offsetof(struct brw_wm_unit_state, wm4),
- brw->batch.bo,
- brw->wm.base.sampler_offset | wm->wm4.stats_enable |
- (wm->wm4.sampler_count << 2),
- I915_GEM_DOMAIN_INSTRUCTION, 0);
- }
-
- brw->ctx.NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
-
- /* _NEW_POLGYON */
- if (brw->wm.offset_clamp != ctx->Polygon.OffsetClamp) {
- BEGIN_BATCH(2);
- OUT_BATCH(_3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP << 16 | (2 - 2));
- OUT_BATCH_F(ctx->Polygon.OffsetClamp);
- ADVANCE_BATCH();
-
- brw->wm.offset_clamp = ctx->Polygon.OffsetClamp;
- }
-}
-
-const struct brw_tracked_state brw_wm_unit = {
- .dirty = {
- .mesa = _NEW_BUFFERS |
- _NEW_COLOR |
- _NEW_LINE |
- _NEW_POLYGON |
- _NEW_POLYGONSTIPPLE,
- .brw = BRW_NEW_BATCH |
- BRW_NEW_BLORP |
- BRW_NEW_PUSH_CONSTANT_ALLOCATION |
- BRW_NEW_FRAGMENT_PROGRAM |
- BRW_NEW_FS_PROG_DATA |
- BRW_NEW_PROGRAM_CACHE |
- BRW_NEW_SAMPLER_STATE_TABLE |
- BRW_NEW_STATS_WM,
- },
- .emit = brw_upload_wm_unit,
-};
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 4ff5394..bc64c5d 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1713,7 +1713,33 @@ static const struct brw_tracked_state genX(sf_state) = {
/* ---------------------------------------------------------------------- */
-#if GEN_GEN >= 6
+static bool
+brw_color_buffer_write_enabled(struct brw_context *brw)
+{
+ struct gl_context *ctx = &brw->ctx;
+ /* BRW_NEW_FRAGMENT_PROGRAM */
+ const struct gl_program *fp = brw->fragment_program;
+ unsigned i;
+
+ /* _NEW_BUFFERS */
+ for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
+ struct gl_renderbuffer *rb = ctx->DrawBuffer->_ColorDrawBuffers[i];
+ uint64_t outputs_written = fp->info.outputs_written;
+
+ /* _NEW_COLOR */
+ if (rb && (outputs_written & BITFIELD64_BIT(FRAG_RESULT_COLOR) ||
+ outputs_written & BITFIELD64_BIT(FRAG_RESULT_DATA0 + i)) &&
+ (ctx->Color.ColorMask[i][0] ||
+ ctx->Color.ColorMask[i][1] ||
+ ctx->Color.ColorMask[i][2] ||
+ ctx->Color.ColorMask[i][3])) {
+ return true;
+ }
+ }
+
+ return false;
+}
+
static void
genX(upload_wm)(struct brw_context *brw)
{
@@ -1725,11 +1751,10 @@ genX(upload_wm)(struct brw_context *brw)
UNUSED bool writes_depth =
wm_prog_data->computed_depth_mode != BRW_PSCDEPTH_OFF;
+ UNUSED struct brw_stage_state *stage_state = &brw->wm.base;
+ UNUSED const struct gen_device_info *devinfo = &brw->screen->devinfo;
-#if GEN_GEN < 7
- const struct brw_stage_state *stage_state = &brw->wm.base;
- const struct gen_device_info *devinfo = &brw->screen->devinfo;
-
+#if GEN_GEN == 6
/* We can't fold this into gen6_upload_wm_push_constants(), because
* according to the SNB PRM, vol 2 part 1 section 7.2.2
@@ -1748,27 +1773,94 @@ genX(upload_wm)(struct brw_context *brw)
}
#endif
+#if GEN_GEN >= 6
brw_batch_emit(brw, GENX(3DSTATE_WM), wm) {
- wm.StatisticsEnable = true;
wm.LineAntialiasingRegionWidth = _10pixels;
wm.LineEndCapAntialiasingRegionWidth = _05pixels;
+ wm.PointRasterizationRule = RASTRULE_UPPER_RIGHT;
+ wm.BarycentricInterpolationMode = wm_prog_data->barycentric_interp_modes;
+#else
+ ctx->NewDriverState |= BRW_NEW_GEN4_UNIT_STATE;
+ brw_state_emit(brw, GENX(WM_STATE), 64, &stage_state->state_offset, wm) {
+ if (wm_prog_data->dispatch_8 && wm_prog_data->dispatch_16) {
+ /* These two fields should be the same pre-gen6, which is why we
+ * only have one hardware field to program for both dispatch
+ * widths.
+ */
+ assert(wm_prog_data->base.dispatch_grf_start_reg ==
+ wm_prog_data->dispatch_grf_start_reg_2);
+ }
+
+ if (wm_prog_data->dispatch_8 || wm_prog_data->dispatch_16)
+ wm.GRFRegisterCount0 = wm_prog_data->reg_blocks_0;
+
+ if (stage_state->sampler_count)
+ wm.SamplerStatePointer =
+ instruction_ro_bo(brw->batch.bo, stage_state->sampler_offset);
+#if GEN_GEN == 5
+ if (wm_prog_data->prog_offset_2)
+ wm.GRFRegisterCount2 = wm_prog_data->reg_blocks_2;
+#endif
+
+ wm.SetupURBEntryReadLength = wm_prog_data->num_varying_inputs * 2;
+ wm.ConstantURBEntryReadLength = wm_prog_data->base.curb_read_length;
+ /* BRW_NEW_PUSH_CONSTANT_ALLOCATION */
+ wm.ConstantURBEntryReadOffset = brw->curbe.wm_start * 2;
+ wm.EarlyDepthTestEnable = true;
+ wm.LineAntialiasingRegionWidth = _05pixels;
+ wm.LineEndCapAntialiasingRegionWidth = _10pixels;
+
+ /* _NEW_POLYGON */
+ if (ctx->Polygon.OffsetFill) {
+ wm.GlobalDepthOffsetEnable = true;
+ /* Something weird going on with legacy_global_depth_bias,
+ * offset_constant, scaling and MRD. This value passes glean
+ * but gives some odd results elsewhere (e.g. the
+ * quad-offset-units test).
+ */
+ wm.GlobalDepthOffsetConstant = ctx->Polygon.OffsetUnits * 2;
+
+ /* This is the only value that passes glean:
+ */
+ wm.GlobalDepthOffsetScale = ctx->Polygon.OffsetFactor;
+ }
+
+ wm.DepthCoefficientURBReadOffset = 1;
+#endif
+
+ /* BRW_NEW_STATS_WM */
+ wm.StatisticsEnable = GEN_GEN >= 6 || brw->stats_wm;
+
#if GEN_GEN < 7
if (wm_prog_data->base.use_alt_mode)
- wm.FloatingPointMode = Alternate;
+ wm.FloatingPointMode = FLOATING_POINT_MODE_Alternate;
+
+ wm.SamplerCount = GEN_GEN == 5 ?
+ 0 : DIV_ROUND_UP(stage_state->sampler_count, 4);
- wm.SamplerCount = DIV_ROUND_UP(stage_state->sampler_count, 4);
- wm.BindingTableEntryCount = wm_prog_data->base.binding_table.size_bytes / 4;
+ wm.BindingTableEntryCount =
+ wm_prog_data->base.binding_table.size_bytes / 4;
wm.MaximumNumberofThreads = devinfo->max_wm_threads - 1;
wm._8PixelDispatchEnable = wm_prog_data->dispatch_8;
wm._16PixelDispatchEnable = wm_prog_data->dispatch_16;
wm.DispatchGRFStartRegisterForConstantSetupData0 =
wm_prog_data->base.dispatch_grf_start_reg;
- wm.DispatchGRFStartRegisterForConstantSetupData2 =
- wm_prog_data->dispatch_grf_start_reg_2;
- wm.KernelStartPointer0 = stage_state->prog_offset;
- wm.KernelStartPointer2 = stage_state->prog_offset +
- wm_prog_data->prog_offset_2;
+ if (GEN_GEN == 6 ||
+ wm_prog_data->dispatch_8 || wm_prog_data->dispatch_16) {
+ wm.KernelStartPointer0 = KSP_ro(brw,
+ stage_state->prog_offset);
+ }
+
+#if GEN_GEN >= 5
+ if (GEN_GEN == 6 || wm_prog_data->prog_offset_2) {
+ wm.KernelStartPointer2 =
+ KSP_ro(brw, stage_state->prog_offset +
+ wm_prog_data->prog_offset_2);
+ }
+#endif
+
+#if GEN_GEN == 6
wm.DualSourceBlendEnable =
wm_prog_data->dual_src_blend && (ctx->Color.BlendEnabled & 1) &&
ctx->Color.Blend[0]._UsesDualSrc;
@@ -1792,42 +1884,34 @@ genX(upload_wm)(struct brw_context *brw)
else
wm.PositionXYOffsetSelect = POSOFFSET_NONE;
+ wm.DispatchGRFStartRegisterForConstantSetupData2 =
+ wm_prog_data->dispatch_grf_start_reg_2;
+#endif
+
if (wm_prog_data->base.total_scratch) {
wm.ScratchSpaceBasePointer =
- render_bo(stage_state->scratch_bo,
- ffs(stage_state->per_thread_scratch) - 11);
+ render_bo(stage_state->scratch_bo, 0);
+ wm.PerThreadScratchSpace =
+ ffs(stage_state->per_thread_scratch) - 11;
}
wm.PixelShaderComputedDepth = writes_depth;
#endif
- wm.PointRasterizationRule = RASTRULE_UPPER_RIGHT;
-
/* _NEW_LINE */
wm.LineStippleEnable = ctx->Line.StippleFlag;
/* _NEW_POLYGON */
wm.PolygonStippleEnable = ctx->Polygon.StippleFlag;
- wm.BarycentricInterpolationMode = wm_prog_data->barycentric_interp_modes;
#if GEN_GEN < 8
- /* _NEW_BUFFERS */
- const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;
- wm.PixelShaderUsesSourceDepth = wm_prog_data->uses_src_depth;
+#if GEN_GEN >= 6
wm.PixelShaderUsesSourceW = wm_prog_data->uses_src_w;
- if (wm_prog_data->uses_kill ||
- _mesa_is_alpha_test_enabled(ctx) ||
- _mesa_is_alpha_to_coverage_enabled(ctx) ||
- wm_prog_data->uses_omask) {
- wm.PixelShaderKillsPixel = true;
- }
- /* _NEW_BUFFERS | _NEW_COLOR */
- if (brw_color_buffer_write_enabled(brw) || writes_depth ||
- wm_prog_data->has_side_effects || wm.PixelShaderKillsPixel) {
- wm.ThreadDispatchEnable = true;
- }
+ /* _NEW_BUFFERS */
+ const bool multisampled_fbo = _mesa_geometric_samples(ctx->DrawBuffer) > 1;
+
if (multisampled_fbo) {
/* _NEW_MULTISAMPLE */
if (ctx->Multisample.Enabled)
@@ -1843,6 +1927,21 @@ genX(upload_wm)(struct brw_context *brw)
wm.MultisampleRasterizationMode = MSRASTMODE_OFF_PIXEL;
wm.MultisampleDispatchMode = MSDISPMODE_PERSAMPLE;
}
+#endif
+ wm.PixelShaderUsesSourceDepth = wm_prog_data->uses_src_depth;
+ if (wm_prog_data->uses_kill ||
+ _mesa_is_alpha_test_enabled(ctx) ||
+ _mesa_is_alpha_to_coverage_enabled(ctx) ||
+ (GEN_GEN >= 6 && wm_prog_data->uses_omask)) {
+ wm.PixelShaderKillsPixel = true;
+ }
+
+ /* _NEW_BUFFERS | _NEW_COLOR */
+ if (brw_color_buffer_write_enabled(brw) || writes_depth ||
+ wm.PixelShaderKillsPixel ||
+ (GEN_GEN >= 6 && wm_prog_data->has_side_effects)) {
+ wm.ThreadDispatchEnable = true;
+ }
#if GEN_GEN >= 7
wm.PixelShaderComputedDepthMode = wm_prog_data->computed_depth_mode;
@@ -1873,6 +1972,16 @@ genX(upload_wm)(struct brw_context *brw)
wm.EarlyDepthStencilControl = EDSC_PSEXEC;
#endif
}
+
+#if GEN_GEN <= 5
+ if (brw->wm.offset_clamp != ctx->Polygon.OffsetClamp) {
+ brw_batch_emit(brw, GENX(3DSTATE_GLOBAL_DEPTH_OFFSET_CLAMP), clamp) {
+ clamp.GlobalDepthOffsetClamp = ctx->Polygon.OffsetClamp;
+ }
+
+ brw->wm.offset_clamp = ctx->Polygon.OffsetClamp;
+ }
+#endif
}
static const struct brw_tracked_state genX(wm_state) = {
@@ -1880,17 +1989,23 @@ static const struct brw_tracked_state genX(wm_state) = {
.mesa = _NEW_LINE |
_NEW_POLYGON |
(GEN_GEN < 8 ? _NEW_BUFFERS |
- _NEW_COLOR |
0) |
- (GEN_GEN < 7 ? _NEW_PROGRAM_CONSTANTS : 0),
+ (GEN_GEN == 6 ? _NEW_PROGRAM_CONSTANTS : 0) |
+ (GEN_GEN < 6 ? _NEW_POLYGONSTIPPLE : 0) |
+ (GEN_GEN < 8 && GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0),
.brw = BRW_NEW_BLORP |
BRW_NEW_FS_PROG_DATA |
+ (GEN_GEN < 6 ? BRW_NEW_PUSH_CONSTANT_ALLOCATION |
+ BRW_NEW_FRAGMENT_PROGRAM |
+ BRW_NEW_PROGRAM_CACHE |
+ BRW_NEW_SAMPLER_STATE_TABLE |
+ BRW_NEW_STATS_WM
+ : 0) |
(GEN_GEN < 7 ? BRW_NEW_BATCH : BRW_NEW_CONTEXT),
},
.emit = genX(upload_wm),
};
-#endif
/* ---------------------------------------------------------------------- */
@@ -4475,7 +4590,7 @@ genX(init_atoms)(struct brw_context *brw)
&brw_vs_samplers,
/* These set up state for brw_psp_urb_cbs */
- &brw_wm_unit,
+ &genX(wm_state),
&genX(sf_clip_viewport),
&genX(sf_state),
&genX(vs_state), /* always required, enabled or not */
--
2.9.4
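For readers who want to try the color-write check outside the driver, here is a minimal standalone sketch of the logic in brw_color_buffer_write_enabled() above. Everything in it is an assumption made for illustration: struct draw_state, MAX_DRAW_BUFFERS and the numeric FRAG_RESULT_* values are hypothetical stand-ins, not Mesa's real definitions. The condition is meant to mirror the patch: a draw buffer counts as written only if it is bound, the fragment program writes either the broadcast color output or that buffer's own data output, and at least one color channel is unmasked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for Mesa's fragment result slots (illustrative values). */
enum { FRAG_RESULT_COLOR = 2, FRAG_RESULT_DATA0 = 4 };
#define MAX_DRAW_BUFFERS 8

/* Simplified, assumed replacement for the gl_context fields the patch reads. */
struct draw_state {
   unsigned num_color_draw_buffers;
   bool buffer_bound[MAX_DRAW_BUFFERS];   /* renderbuffer pointer is non-NULL */
   bool color_mask[MAX_DRAW_BUFFERS][4];  /* per-buffer R, G, B, A write mask */
   uint64_t outputs_written;              /* bitfield of fragment outputs */
};

static bool
color_buffer_write_enabled(const struct draw_state *s)
{
   for (unsigned i = 0; i < s->num_color_draw_buffers; i++) {
      /* Either the broadcast color output or this buffer's own output. */
      const uint64_t relevant_outputs =
         (UINT64_C(1) << FRAG_RESULT_COLOR) |
         (UINT64_C(1) << (FRAG_RESULT_DATA0 + i));

      const bool any_channel_enabled =
         s->color_mask[i][0] || s->color_mask[i][1] ||
         s->color_mask[i][2] || s->color_mask[i][3];

      if (s->buffer_bound[i] &&
          (s->outputs_written & relevant_outputs) != 0 &&
          any_channel_enabled)
         return true;
   }

   return false;
}
```

With a single bound buffer and a program that writes FRAG_RESULT_COLOR, the function returns false while all four channels are masked and true as soon as one channel is enabled.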
Kenneth Graunke
2017-07-18 06:35:01 UTC
Post by Rafael Antognolli
The code doesn't get exactly a lot simpler but at least it is in a single
place, and we delete more than we add.
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 -
src/mesa/drivers/dri/i965/brw_state.h | 1 -
src/mesa/drivers/dri/i965/brw_structs.h | 121 ------------
src/mesa/drivers/dri/i965/brw_wm.h | 2 -
src/mesa/drivers/dri/i965/brw_wm_state.c | 274 --------------------------
src/mesa/drivers/dri/i965/genX_state_upload.c | 191 ++++++++++++++----
6 files changed, 153 insertions(+), 437 deletions(-)
delete mode 100644 src/mesa/drivers/dri/i965/brw_wm_state.c
Reviewed-by: Kenneth Graunke <***@whitecape.org>
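
As a side note on the arithmetic in the patch, the sketch below shows two of the field encodings in isolation: SamplerCount is programmed as DIV_ROUND_UP(sampler_count, 4), and PerThreadScratchSpace as ffs(per_thread_scratch) - 11. The helper names are hypothetical and not part of Mesa. The sketch assumes the scratch size is a power of two of at least 1 KB, in which case ffs(size) - 11 is the same as log2(size) - 10, so 1 KB encodes as 0, 2 KB as 1, and so on.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up integer division, the same idea as the DIV_ROUND_UP macro. */
static uint32_t
div_round_up(uint32_t n, uint32_t d)
{
   return (n + d - 1) / d;
}

/* Hypothetical helper: the sampler count is programmed in groups of four. */
static uint32_t
encode_sampler_count(uint32_t samplers)
{
   return div_round_up(samplers, 4);
}

/* Hypothetical helper: for a power-of-two size of at least 1 KB, this
 * matches ffs(bytes) - 11, i.e. log2(bytes) - 10.
 */
static uint32_t
encode_per_thread_scratch(uint32_t bytes)
{
   assert(bytes >= 1024 && (bytes & (bytes - 1)) == 0);

   uint32_t log2_bytes = 0;
   while ((bytes >> log2_bytes) != 1)
      log2_bytes++;

   return log2_bytes - 10;
}
```

Note that the patch itself programs SamplerCount as 0 when GEN_GEN == 5, so the grouping by four only applies to the other generations covered by that block.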
