Discussion:
[PATCH] drirc: add option to disable ARB_draw_indirect
(too old to reply)
Rob Clark
2017-12-05 12:54:12 UTC
Permalink
Raw Message
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.

Signed-off-by: Rob Clark <***@gmail.com>
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).

src/gallium/auxiliary/pipe-loader/driinfo_gallium.h | 1 +
src/gallium/include/state_tracker/st_api.h | 1 +
src/gallium/state_trackers/dri/dri_screen.c | 2 ++
src/mesa/state_tracker/st_extensions.c | 3 +++
src/util/drirc | 10 ++++++++++
src/util/xmlpool/t_options.h | 5 +++++
6 files changed, 22 insertions(+)

diff --git a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
index 003a3d7089e..9c1705bd9a8 100644
--- a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
+++ b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
@@ -32,4 +32,5 @@ DRI_CONF_SECTION_END
DRI_CONF_SECTION_MISCELLANEOUS
DRI_CONF_ALWAYS_HAVE_DEPTH_BUFFER("false")
DRI_CONF_GLSL_ZERO_INIT("false")
+ DRI_CONF_DISABLE_DRAW_INDIRECT("false")
DRI_CONF_SECTION_END
diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h
index 44d6b474f8f..20a7843992a 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -217,6 +217,7 @@ struct st_config_options
boolean disable_blend_func_extended;
boolean disable_glsl_line_continuations;
boolean disable_shader_bit_encoding;
+ boolean disable_draw_indirect;
boolean force_glsl_extensions_warn;
unsigned force_glsl_version;
boolean allow_glsl_extension_directive_midshader;
diff --git a/src/gallium/state_trackers/dri/dri_screen.c b/src/gallium/state_trackers/dri/dri_screen.c
index 1ca511612ad..035406a771e 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -68,6 +68,8 @@ dri_fill_st_options(struct dri_screen *screen)
driQueryOptionb(optionCache, "disable_glsl_line_continuations");
options->disable_shader_bit_encoding =
driQueryOptionb(optionCache, "disable_shader_bit_encoding");
+ options->disable_draw_indirect =
+ driQueryOptionb(optionCache, "disable_draw_indirect");
options->force_glsl_extensions_warn =
driQueryOptionb(optionCache, "force_glsl_extensions_warn");
options->force_glsl_version =
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 9ef0df1e926..0c7e7e3abf1 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -909,6 +909,9 @@ void st_init_extensions(struct pipe_screen *screen,
}
}

+ if (options->disable_draw_indirect)
+ extensions->ARB_draw_indirect = GL_FALSE;
+
/* Expose the extensions which directly correspond to gallium formats. */
init_format_extensions(screen, extensions, rendertarget_mapping,
ARRAY_SIZE(rendertarget_mapping), PIPE_TEXTURE_2D,
diff --git a/src/util/drirc b/src/util/drirc
index 9d27330036e..cea80ecc3d6 100644
--- a/src/util/drirc
+++ b/src/util/drirc
@@ -275,4 +275,14 @@ TODO: document the other workarounds.
<option name="radeonsi_clear_db_cache_before_clear" value="true" />
</application>
</device>
+ <device driver="msm">
+ <!--
+ hw supports both GL_ARB_base_instance and GL_ARB_draw_indirect,
+ but current firmware does not support both at the same time, so
+ we need to blacklist apps that are known to depend on both.
+ -->
+ <application name="SuperTuxKart" executable="supertuxkart">
+ <option name="disable_draw_indirect" value="true"/>
+ </application>
+ </device>
</driconf>
diff --git a/src/util/xmlpool/t_options.h b/src/util/xmlpool/t_options.h
index bd553085c86..0d1c597a830 100644
--- a/src/util/xmlpool/t_options.h
+++ b/src/util/xmlpool/t_options.h
@@ -379,6 +379,11 @@ DRI_CONF_OPT_BEGIN_B(glsl_zero_init, def) \
DRI_CONF_DESC(en,gettext("Force uninitialized variables to default to zero")) \
DRI_CONF_OPT_END

+#define DRI_CONF_DISABLE_DRAW_INDIRECT(def) \
+DRI_CONF_OPT_BEGIN_B(disable_draw_indirect, def) \
+ DRI_CONF_DESC(en, gettext("Disable the GL_ARB_draw_indirect extension")) \
+DRI_CONF_OPT_END
+
/**
* \brief Initialization configuration options
*/
--
2.13.6
Ilia Mirkin
2017-12-05 13:02:12 UTC
Permalink
Raw Message
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
That means you should only expose one of those features, probably the
draw-indirect one. If someone desperately wants that functionality,
they can force it with MESA_EXTENSION_OVERRIDE. ARB_base_instance has
a specc'd interaction with ARB_indirect_draw...

-ilia
Post by Rob Clark
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
src/gallium/auxiliary/pipe-loader/driinfo_gallium.h | 1 +
src/gallium/include/state_tracker/st_api.h | 1 +
src/gallium/state_trackers/dri/dri_screen.c | 2 ++
src/mesa/state_tracker/st_extensions.c | 3 +++
src/util/drirc | 10 ++++++++++
src/util/xmlpool/t_options.h | 5 +++++
6 files changed, 22 insertions(+)
diff --git a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
index 003a3d7089e..9c1705bd9a8 100644
--- a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
+++ b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
@@ -32,4 +32,5 @@ DRI_CONF_SECTION_END
DRI_CONF_SECTION_MISCELLANEOUS
DRI_CONF_ALWAYS_HAVE_DEPTH_BUFFER("false")
DRI_CONF_GLSL_ZERO_INIT("false")
+ DRI_CONF_DISABLE_DRAW_INDIRECT("false")
DRI_CONF_SECTION_END
diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h
index 44d6b474f8f..20a7843992a 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -217,6 +217,7 @@ struct st_config_options
boolean disable_blend_func_extended;
boolean disable_glsl_line_continuations;
boolean disable_shader_bit_encoding;
+ boolean disable_draw_indirect;
boolean force_glsl_extensions_warn;
unsigned force_glsl_version;
boolean allow_glsl_extension_directive_midshader;
diff --git a/src/gallium/state_trackers/dri/dri_screen.c b/src/gallium/state_trackers/dri/dri_screen.c
index 1ca511612ad..035406a771e 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -68,6 +68,8 @@ dri_fill_st_options(struct dri_screen *screen)
driQueryOptionb(optionCache, "disable_glsl_line_continuations");
options->disable_shader_bit_encoding =
driQueryOptionb(optionCache, "disable_shader_bit_encoding");
+ options->disable_draw_indirect =
+ driQueryOptionb(optionCache, "disable_draw_indirect");
options->force_glsl_extensions_warn =
driQueryOptionb(optionCache, "force_glsl_extensions_warn");
options->force_glsl_version =
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 9ef0df1e926..0c7e7e3abf1 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -909,6 +909,9 @@ void st_init_extensions(struct pipe_screen *screen,
}
}
+ if (options->disable_draw_indirect)
+ extensions->ARB_draw_indirect = GL_FALSE;
+
/* Expose the extensions which directly correspond to gallium formats. */
init_format_extensions(screen, extensions, rendertarget_mapping,
ARRAY_SIZE(rendertarget_mapping), PIPE_TEXTURE_2D,
diff --git a/src/util/drirc b/src/util/drirc
index 9d27330036e..cea80ecc3d6 100644
--- a/src/util/drirc
+++ b/src/util/drirc
@@ -275,4 +275,14 @@ TODO: document the other workarounds.
<option name="radeonsi_clear_db_cache_before_clear" value="true" />
</application>
</device>
+ <device driver="msm">
+ <!--
+ hw supports both GL_ARB_base_instance and GL_ARB_draw_indirect,
+ but current firmware does not support both at the same time, so
+ we need to blacklist apps that are known to depend on both.
+ -->
+ <application name="SuperTuxKart" executable="supertuxkart">
+ <option name="disable_draw_indirect" value="true"/>
+ </application>
+ </device>
</driconf>
diff --git a/src/util/xmlpool/t_options.h b/src/util/xmlpool/t_options.h
index bd553085c86..0d1c597a830 100644
--- a/src/util/xmlpool/t_options.h
+++ b/src/util/xmlpool/t_options.h
@@ -379,6 +379,11 @@ DRI_CONF_OPT_BEGIN_B(glsl_zero_init, def) \
DRI_CONF_DESC(en,gettext("Force uninitialized variables to default to zero")) \
DRI_CONF_OPT_END
+#define DRI_CONF_DISABLE_DRAW_INDIRECT(def) \
+DRI_CONF_OPT_BEGIN_B(disable_draw_indirect, def) \
+ DRI_CONF_DESC(en, gettext("Disable the GL_ARB_draw_indirect extension")) \
+DRI_CONF_OPT_END
+
/**
* \brief Initialization configuration options
*/
--
2.13.6
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Rob Clark
2017-12-05 13:15:16 UTC
Permalink
Raw Message
Post by Ilia Mirkin
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
That means you should only expose one of those features, probably the
draw-indirect one. If someone desperately wants that functionality,
they can force it with MESA_EXTENSION_OVERRIDE. ARB_base_instance has
a specc'd interaction with ARB_indirect_draw...
In that case, we should have driconf to force override extension (ie.
not rely on user to remember magic MESA_EXTENSION_OVERRIDE
incantations) for any games that might use base-instance but not
draw-indirect..

BR,
-R
Post by Ilia Mirkin
-ilia
Post by Rob Clark
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
src/gallium/auxiliary/pipe-loader/driinfo_gallium.h | 1 +
src/gallium/include/state_tracker/st_api.h | 1 +
src/gallium/state_trackers/dri/dri_screen.c | 2 ++
src/mesa/state_tracker/st_extensions.c | 3 +++
src/util/drirc | 10 ++++++++++
src/util/xmlpool/t_options.h | 5 +++++
6 files changed, 22 insertions(+)
diff --git a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
index 003a3d7089e..9c1705bd9a8 100644
--- a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
+++ b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
@@ -32,4 +32,5 @@ DRI_CONF_SECTION_END
DRI_CONF_SECTION_MISCELLANEOUS
DRI_CONF_ALWAYS_HAVE_DEPTH_BUFFER("false")
DRI_CONF_GLSL_ZERO_INIT("false")
+ DRI_CONF_DISABLE_DRAW_INDIRECT("false")
DRI_CONF_SECTION_END
diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h
index 44d6b474f8f..20a7843992a 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -217,6 +217,7 @@ struct st_config_options
boolean disable_blend_func_extended;
boolean disable_glsl_line_continuations;
boolean disable_shader_bit_encoding;
+ boolean disable_draw_indirect;
boolean force_glsl_extensions_warn;
unsigned force_glsl_version;
boolean allow_glsl_extension_directive_midshader;
diff --git a/src/gallium/state_trackers/dri/dri_screen.c b/src/gallium/state_trackers/dri/dri_screen.c
index 1ca511612ad..035406a771e 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -68,6 +68,8 @@ dri_fill_st_options(struct dri_screen *screen)
driQueryOptionb(optionCache, "disable_glsl_line_continuations");
options->disable_shader_bit_encoding =
driQueryOptionb(optionCache, "disable_shader_bit_encoding");
+ options->disable_draw_indirect =
+ driQueryOptionb(optionCache, "disable_draw_indirect");
options->force_glsl_extensions_warn =
driQueryOptionb(optionCache, "force_glsl_extensions_warn");
options->force_glsl_version =
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 9ef0df1e926..0c7e7e3abf1 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -909,6 +909,9 @@ void st_init_extensions(struct pipe_screen *screen,
}
}
+ if (options->disable_draw_indirect)
+ extensions->ARB_draw_indirect = GL_FALSE;
+
/* Expose the extensions which directly correspond to gallium formats. */
init_format_extensions(screen, extensions, rendertarget_mapping,
ARRAY_SIZE(rendertarget_mapping), PIPE_TEXTURE_2D,
diff --git a/src/util/drirc b/src/util/drirc
index 9d27330036e..cea80ecc3d6 100644
--- a/src/util/drirc
+++ b/src/util/drirc
@@ -275,4 +275,14 @@ TODO: document the other workarounds.
<option name="radeonsi_clear_db_cache_before_clear" value="true" />
</application>
</device>
+ <device driver="msm">
+ <!--
+ hw supports both GL_ARB_base_instance and GL_ARB_draw_indirect,
+ but current firmware does not support both at the same time, so
+ we need to blacklist apps that are known to depend on both.
+ -->
+ <application name="SuperTuxKart" executable="supertuxkart">
+ <option name="disable_draw_indirect" value="true"/>
+ </application>
+ </device>
</driconf>
diff --git a/src/util/xmlpool/t_options.h b/src/util/xmlpool/t_options.h
index bd553085c86..0d1c597a830 100644
--- a/src/util/xmlpool/t_options.h
+++ b/src/util/xmlpool/t_options.h
@@ -379,6 +379,11 @@ DRI_CONF_OPT_BEGIN_B(glsl_zero_init, def) \
DRI_CONF_DESC(en,gettext("Force uninitialized variables to default to zero")) \
DRI_CONF_OPT_END
+#define DRI_CONF_DISABLE_DRAW_INDIRECT(def) \
+DRI_CONF_OPT_BEGIN_B(disable_draw_indirect, def) \
+ DRI_CONF_DESC(en, gettext("Disable the GL_ARB_draw_indirect extension")) \
+DRI_CONF_OPT_END
+
/**
* \brief Initialization configuration options
*/
--
2.13.6
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Emil Velikov
2017-12-05 13:18:47 UTC
Permalink
Raw Message
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.

AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.

-Emil
Ilia Mirkin
2017-12-05 13:25:56 UTC
Permalink
Raw Message
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
Ian Romanick
2017-12-06 20:31:48 UTC
Permalink
Raw Message
Post by Ilia Mirkin
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
And I don't suppose there's a way to get updated firmware? i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.
Post by Ilia Mirkin
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Rob Clark
2017-12-06 22:52:55 UTC
Permalink
Raw Message
Post by Ian Romanick
Post by Ilia Mirkin
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
And I don't suppose there's a way to get updated firmware? i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.
I can ask.. I guess the odds are lower compared to a open src driver
effort supported by the hw vendor..

For now I've disabled exposing base-instance since that isn't needed
until later than draw-indirect..

BR,
-R
Rob Clark
2017-12-14 23:56:06 UTC
Permalink
Raw Message
Post by Ian Romanick
Post by Ilia Mirkin
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
And I don't suppose there's a way to get updated firmware? i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.
so after r/e the instruction set for the CP microcontrollers and
writing a disassembler and assembler[1], and figuring out how the fw
handles CP_DRAW_INDIRECT and CP_DRAW_INDX_INDIRECT packets, I've come
to the conclusion that the issue isn't actually with draw-indirect vs
base-instance (at least not w/ the fw from my pixel2 which md5sum
claims is the same as what is in linux-firmware.. it is possible that
I was using an earlier version of the fw before when I came to this
conclusion). On the plus side, the PFP/ME microcontrollers that parse
the cmdstream are pretty neat and I learned some useful stuff along
the way.

But thinking a bit about how stk is using GL_MAP_PERSISTENT_BIT to map
and update the draw-indirect buffers, it seems to me there are plenty
of ways this can go wrong w/ tilers (and even more when you throw
re-ordering into the mix). Possibly I should disable reordering when
the indirect buffer is mapped w/ PERSISTENT bit, although for games
like stk this is probably counter-productive vs just hiding the
draw-indirect extension.. for games that actually use the GPU to write
the draw-indirect buffer it shouldn't be a problem. So I think a
driconf patch like this probably still ends up being useful in the
end.

BR,
-R

[1] https://github.com/freedreno/envytools/tree/afuc/afuc
Nicolai Hähnle
2017-12-15 09:41:23 UTC
Permalink
Raw Message
Post by Rob Clark
Post by Ian Romanick
Post by Ilia Mirkin
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
And I don't suppose there's a way to get updated firmware? i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.
so after r/e the instruction set for the CP microcontrollers and
writing a disassembler and assembler[1], and figuring out how the fw
handles CP_DRAW_INDIRECT and CP_DRAW_INDX_INDIRECT packets, I've come
to the conclusion that the issue isn't actually with draw-indirect vs
base-instance (at least not w/ the fw from my pixel2 which md5sum
claims is the same as what is in linux-firmware.. it is possible that
I was using an earlier version of the fw before when I came to this
conclusion). On the plus side, the PFP/ME microcontrollers that parse
the cmdstream are pretty neat and I learned some useful stuff along
the way.
But thinking a bit about how stk is using GL_MAP_PERSISTENT_BIT to map
and update the draw-indirect buffers, it seems to me there are plenty
of ways this can go wrong w/ tilers (and even more when you throw
re-ordering into the mix). Possibly I should disable reordering when
the indirect buffer is mapped w/ PERSISTENT bit, although for games
like stk this is probably counter-productive vs just hiding the
draw-indirect extension.. for games that actually use the GPU to write
the draw-indirect buffer it shouldn't be a problem. So I think a
driconf patch like this probably still ends up being useful in the
end.
Can you detail a bit what you think could go wrong? I believe that the
intention of the GL spec is that reordering in tilers should be possible
at least for buffers that are mapped PERSISTENT but not COHERENT.

You may only have to block reordering if the buffer is mapped both
PERSISTENT *and* COHERENT -- and even then, reordering is probably possible.

Granted, the spec is unclear as usual when it comes to these memory
synchronization issues -- the description of the MAP_COHERENT_BIT in
section 6.2 does not mention WAR hazards (in particular, Write by client
after Read by server) -- but perhaps that can be fixed.

To go into a bit more detail, what I suspect you're worried about is
applications doing stuff like:

1. Write to indirect buffer (persistently & coherently mapped)
2. Draw*Indirect
3. Write to the same location in the indirect buffer
4. Draw*Indirect

... but this is bound to fail with "normal" GPUs (like ours) as well.
Perhaps you have a different scenario in mind?

Cheers,
Nicolai
Rob Clark
2017-12-15 11:37:50 UTC
Permalink
Raw Message
Post by Nicolai Hähnle
Post by Rob Clark
Post by Ian Romanick
Post by Ilia Mirkin
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
And I don't suppose there's a way to get updated firmware? i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.
so after r/e the instruction set for the CP microcontrollers and
writing a disassembler and assembler[1], and figuring out how the fw
handles CP_DRAW_INDIRECT and CP_DRAW_INDX_INDIRECT packets, I've come
to the conclusion that the issue isn't actually with draw-indirect vs
base-instance (at least not w/ the fw from my pixel2 which md5sum
claims is the same as what is in linux-firmware.. it is possible that
I was using an earlier version of the fw before when I came to this
conclusion). On the plus side, the PFP/ME microcontrollers that parse
the cmdstream are pretty neat and I learned some useful stuff along
the way.
But thinking a bit about how stk is using GL_MAP_PERSISTENT_BIT to map
and update the draw-indirect buffers, it seems to me there are plenty
of ways this can go wrong w/ tilers (and even more when you throw
re-ordering into the mix). Possibly I should disable reordering when
the indirect buffer is mapped w/ PERSISTENT bit, although for games
like stk this is probably counter-productive vs just hiding the
draw-indirect extension.. for games that actually use the GPU to write
the draw-indirect buffer it shouldn't be a problem. So I think a
driconf patch like this probably still ends up being useful in the
end.
Can you detail a bit what you think could go wrong? I believe that the
intention of the GL spec is that reordering in tilers should be possible at
least for buffers that are mapped PERSISTENT but not COHERENT.
You may only have to block reordering if the buffer is mapped both
PERSISTENT *and* COHERENT -- and even then, reordering is probably possible.
Granted, the spec is unclear as usual when it comes to these memory
synchronization issues -- the description of the MAP_COHERENT_BIT in section
6.2 does not mention WAR hazards (in particular, Write by client after Read
by server) -- but perhaps that can be fixed.
To go into a bit more detail, what I suspect you're worried about is
1. Write to indirect buffer (persistently & coherently mapped)
2. Draw*Indirect
3. Write to the same location in the indirect buffer
4. Draw*Indirect
... but this is bound to fail with "normal" GPUs (like ours) as well.
Perhaps you have a different scenario in mind?
yeah, this was basically the scenario I had in mind.. although I'm
perhaps more aggressive in deferring rendering, to the point of
re-ordering draws if unnecessary fbo switches are made. Normally I
track which buffers are read and written in a given batch (draw pass)
in order to preserve correctness (and in some cases shadowing or doing
a staging transfer to update buffers/textures to avoid splitting a
batch). Perhaps it is only an issue w/ persistent+coherent, but w/
cpu updating buffer without driver knowing when is kind of
sub-optimal.

I'm thinking I do need to keep track when there are outstanding
coherent+persistent transfers mapped and switch off some of the
cleverness.

All the same, the way stk uses draw-indirect is not useful and it
would be better to shut it off, at least for me. Although perhaps
instead of adding things like this to driconf extension by extension,
it would be more useful to have a way to provide a default
MESA_EXTENSION_OVERRIDE via driconf?

BR,
-R
Nicolai Hähnle
2017-12-15 18:08:03 UTC
Permalink
Raw Message
Post by Rob Clark
Post by Nicolai Hähnle
Post by Rob Clark
Post by Ian Romanick
Post by Ilia Mirkin
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
And I don't suppose there's a way to get updated firmware? i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.
so after r/e the instruction set for the CP microcontrollers and
writing a disassembler and assembler[1], and figuring out how the fw
handles CP_DRAW_INDIRECT and CP_DRAW_INDX_INDIRECT packets, I've come
to the conclusion that the issue isn't actually with draw-indirect vs
base-instance (at least not w/ the fw from my pixel2 which md5sum
claims is the same as what is in linux-firmware.. it is possible that
I was using an earlier version of the fw before when I came to this
conclusion). On the plus side, the PFP/ME microcontrollers that parse
the cmdstream are pretty neat and I learned some useful stuff along
the way.
But thinking a bit about how stk is using GL_MAP_PERSISTENT_BIT to map
and update the draw-indirect buffers, it seems to me there are plenty
of ways this can go wrong w/ tilers (and even more when you throw
re-ordering into the mix). Possibly I should disable reordering when
the indirect buffer is mapped w/ PERSISTENT bit, although for games
like stk this is probably counter-productive vs just hiding the
draw-indirect extension.. for games that actually use the GPU to write
the draw-indirect buffer it shouldn't be a problem. So I think a
driconf patch like this probably still ends up being useful in the
end.
Can you detail a bit what you think could go wrong? I believe that the
intention of the GL spec is that reordering in tilers should be possible at
least for buffers that are mapped PERSISTENT but not COHERENT.
You may only have to block reordering if the buffer is mapped both
PERSISTENT *and* COHERENT -- and even then, reordering is probably possible.
Granted, the spec is unclear as usual when it comes to these memory
synchronization issues -- the description of the MAP_COHERENT_BIT in section
6.2 does not mention WAR hazards (in particular, Write by client after Read
by server) -- but perhaps that can be fixed.
To go into a bit more detail, what I suspect you're worried about is
1. Write to indirect buffer (persistently & coherently mapped)
2. Draw*Indirect
3. Write to the same location in the indirect buffer
4. Draw*Indirect
... but this is bound to fail with "normal" GPUs (like ours) as well.
Perhaps you have a different scenario in mind?
yeah, this was basically the scenario I had in mind.. although I'm
perhaps more aggressive in deferring rendering, to the point of
re-ordering draws if unnecessary fbo switches are made. Normally I
track which buffers are read and written in a given batch (draw pass)
in order to preserve correctness (and in some cases shadowing or doing
a staging transfer to update buffers/textures to avoid splitting a
batch). Perhaps it is only an issue w/ persistent+coherent, but w/
cpu updating buffer without driver knowing when is kind of
sub-optimal.
I'm thinking I do need to keep track when there are outstanding
coherent+persistent transfers mapped and switch off some of the
cleverness.
I'm not convinced that this is actually necessary. You probably only
need to break for things like glFlushMappedBufferRange() and glFenceSync().

The reasoning is basically this: unless one of those synchronizing
commands is used, every desktop GPU will effectively re-order a sequence of:

1. Write #1 to persistent-mapped buffer
2. Draw #1
3. Write #2 to persistent-mapped buffer
4. Draw #2

to:

1. Write #1 to persistent-mapped buffer
2. Write #2 to persistent-mapped buffer
3. Draw #1
4. Draw #2

Correct me if I'm wrong, but the way I understand tiled GPUs, you might
in addition do some multi-passing that ends up as:

1. Write #1 to persistent-mapped buffer
2. Write #2 to persistent-mapped buffer
3.0. Draw #1, tile 0
4.0. Draw #2, tile 0
3.1. Draw #1, tile 1
4.1. Draw #2, tile 1
... etc. ...

So as far as interactions with persistent-mapped buffers are concerned,
tiling just doesn't make a difference.
Post by Rob Clark
All the same, the way stk uses draw-indirect is not useful and it
would be better to shut it off, at least for me. Although perhaps
instead of adding things like this to driconf extension by extension,
it would be more useful to have a way to provide a default
MESA_EXTENSION_OVERRIDE via driconf?
Yeah, that makes sense.

Cheers,
Nicolai
Post by Rob Clark
BR,
-R
Eric Anholt
2017-12-18 18:17:38 UTC
Permalink
Raw Message
Post by Nicolai Hähnle
Post by Rob Clark
Post by Nicolai Hähnle
Post by Rob Clark
Post by Ian Romanick
Post by Ilia Mirkin
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So add a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware or LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
And I don't suppose there's a way to get updated firmware? i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.
so after r/e the instruction set for the CP microcontrollers and
writing a disassembler and assembler[1], and figuring out how the fw
handles CP_DRAW_INDIRECT and CP_DRAW_INDX_INDIRECT packets, I've come
to the conclusion that the issue isn't actually with draw-indirect vs
base-instance (at least not w/ the fw from my pixel2 which md5sum
claims is the same as what is in linux-firmware.. it is possible that
I was using an earlier version of the fw before when I came to this
conclusion). On the plus side, the PFP/ME microcontrollers that parse
the cmdstream are pretty neat and I learned some useful stuff along
the way.
But thinking a bit about how stk is using GL_MAP_PERSISTENT_BIT to map
and update the draw-indirect buffers, it seems to me there are plenty
of ways this can go wrong w/ tilers (and even more when you throw
re-ordering into the mix). Possibly I should disable reordering when
the indirect buffer is mapped w/ PERSISTENT bit, although for games
like stk this is probably counter-productive vs just hiding the
draw-indirect extension.. for games that actually use the GPU to write
the draw-indirect buffer it shouldn't be a problem. So I think a
driconf patch like this probably still ends up being useful in the
end.
Can you detail a bit what you think could go wrong? I believe that the
intention of the GL spec is that reordering in tilers should be possible at
least for buffers that are mapped PERSISTENT but not COHERENT.
You may only have to block reordering if the buffer is mapped both
PERSISTENT *and* COHERENT -- and even then, reordering is probably possible.
Granted, the spec is unclear as usual when it comes to these memory
synchronization issues -- the description of the MAP_COHERENT_BIT in section
6.2 does not mention WAR hazards (in particular, Write by client after Read
by server) -- but perhaps that can be fixed.
To go into a bit more detail, what I suspect you're worried about is
1. Write to indirect buffer (persistently & coherently mapped)
2. Draw*Indirect
3. Write to the same location in the indirect buffer
4. Draw*Indirect
... but this is bound to fail with "normal" GPUs (like ours) as well.
Perhaps you have a different scenario in mind?
yeah, this was basically the scenario I had in mind.. although I'm
perhaps more aggressive in deferring rendering, to the point of
re-ordering draws if unnecessary fbo switches are made. Normally I
track which buffers are read and written in a given batch (draw pass)
in order to preserve correctness (and in some cases shadowing or doing
a staging transfer to update buffers/textures to avoid splitting a
batch). Perhaps it is only an issue w/ persistent+coherent, but w/
cpu updating buffer without driver knowing when is kind of
sub-optimal.
I'm thinking I do need to keep track when there are outstanding
coherent+persistent transfers mapped and switch off some of the
cleverness.
I'm not convinced that this is actually necessary. You probably only
need to break for things like glFlushMappedBufferRange() and glFenceSync().
The reasoning is basically this: unless one of those synchronizing
1. Write #1 to persistent-mapped buffer
2. Draw #1
3. Write #2 to persistent-mapped buffer
4. Draw #2
1. Write #1 to persistent-mapped buffer
2. Write #2 to persistent-mapped buffer
3. Draw #1
4. Draw #2
Correct me if I'm wrong, but the way I understand tiled GPUs, you might
1. Write #1 to persistent-mapped buffer
2. Write #2 to persistent-mapped buffer
3.0. Draw #1, tile 0
4.0. Draw #2, tile 0
3.1. Draw #1, tile 1
4.1. Draw #2, tile 1
... etc. ...
So as far as interactions with persistent-mapped buffers are concerned,
tiling just doesn't make a difference.
Yeah, that's right.

I don't see how being a tiler would affect persistent/coherent mappings.
Rob Clark
2017-12-18 20:45:43 UTC
Permalink
Raw Message
Post by Rob Clark
Post by Nicolai Hähnle
Post by Rob Clark
Post by Ian Romanick
On Tue, Dec 5, 2017 at 8:18 AM, Emil Velikov
Post by Emil Velikov
Hi Rob,
Post by Rob Clark
This is a bit sad/annoying. But with current GPU firmware (at least on
a5xx) we can support both draw-indirect and base-instance. But we can't
support draw-indirect with a non-zero base-instance specified. So
add
a
driconf option to hide the extension from games that are known to use
both.
---
Tbh, I'm also not really sure what to do when/if we got updated firmware
which handled draw-indirect with base-instance, since we'd need to make
this option conditional on fw version. For STK that probably isn't a
big deal since it doesn't use draw-indirect in a particularly useful way
(the indirect buffer is generated on CPU).
Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
disable the extension) as it detects buggy FW?
This is what radeons have been doing as they encounter iffy firmware
or
LLVM.
AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
Rob is this -><- close to ES 3.1, so that's not a great option.
And I don't suppose there's a way to get updated firmware? i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.
so after r/e the instruction set for the CP microcontrollers and
writing a disassembler and assembler[1], and figuring out how the fw
handles CP_DRAW_INDIRECT and CP_DRAW_INDX_INDIRECT packets, I've come
to the conclusion that the issue isn't actually with draw-indirect vs
base-instance (at least not w/ the fw from my pixel2 which md5sum
claims is the same as what is in linux-firmware.. it is possible that
I was using an earlier version of the fw before when I came to this
conclusion). On the plus side, the PFP/ME microcontrollers that parse
the cmdstream are pretty neat and I learned some useful stuff along
the way.
But thinking a bit about how stk is using GL_MAP_PERSISTENT_BIT to map
and update the draw-indirect buffers, it seems to me there are plenty
of ways this can go wrong w/ tilers (and even more when you throw
re-ordering into the mix). Possibly I should disable reordering when
the indirect buffer is mapped w/ PERSISTENT bit, although for games
like stk this is probably counter-productive vs just hiding the
draw-indirect extension.. for games that actually use the GPU to write
the draw-indirect buffer it shouldn't be a problem. So I think a
driconf patch like this probably still ends up being useful in the
end.
Can you detail a bit what you think could go wrong? I believe that the
intention of the GL spec is that reordering in tilers should be possible at
least for buffers that are mapped PERSISTENT but not COHERENT.
You may only have to block reordering if the buffer is mapped both
PERSISTENT *and* COHERENT -- and even then, reordering is probably possible.
Granted, the spec is unclear as usual when it comes to these memory
synchronization issues -- the description of the MAP_COHERENT_BIT in section
6.2 does not mention WAR hazards (in particular, Write by client after Read
by server) -- but perhaps that can be fixed.
To go into a bit more detail, what I suspect you're worried about is
1. Write to indirect buffer (persistently & coherently mapped)
2. Draw*Indirect
3. Write to the same location in the indirect buffer
4. Draw*Indirect
... but this is bound to fail with "normal" GPUs (like ours) as well.
Perhaps you have a different scenario in mind?
yeah, this was basically the scenario I had in mind.. although I'm
perhaps more aggressive in deferring rendering, to the point of
re-ordering draws if unnecessary fbo switches are made. Normally I
track which buffers are read and written in a given batch (draw pass)
in order to preserve correctness (and in some cases shadowing or doing
a staging transfer to update buffers/textures to avoid splitting a
batch). Perhaps it is only an issue w/ persistent+coherent, but w/
cpu updating buffer without driver knowing when is kind of
sub-optimal.
I'm thinking I do need to keep track when there are outstanding
coherent+persistent transfers mapped and switch off some of the
cleverness.
I'm not convinced that this is actually necessary. You probably only need to
break for things like glFlushMappedBufferRange() and glFenceSync().
The reasoning is basically this: unless one of those synchronizing commands
1. Write #1 to persistent-mapped buffer
2. Draw #1
3. Write #2 to persistent-mapped buffer
4. Draw #2
1. Write #1 to persistent-mapped buffer
2. Write #2 to persistent-mapped buffer
3. Draw #1
4. Draw #2
Correct me if I'm wrong, but the way I understand tiled GPUs, you might in
1. Write #1 to persistent-mapped buffer
2. Write #2 to persistent-mapped buffer
3.0. Draw #1, tile 0
4.0. Draw #2, tile 0
3.1. Draw #1, tile 1
4.1. Draw #2, tile 1
... etc. ...
I guess if you throw in FBO switches and draw-reordering things can be
happening a bit more out of order than this. Maybe not an issue w/ a
correctly written game/app..

Anyways, I'll have to look more closely at stk to understand how it is
expecting CPU writes to this indirect-params buffer to be
synchronized. I'm not quite sure what is going on, but I think the
main diff (depending on whether draw-indirect + base-instance is
exposed or not) is using glDrawElementsBaseVertex() vs
glDrawElementsIndirect()/glMultiDrawElementsIndirect(). And somehow
it goes fabulously wrong. I guess I should try mapping/unmapping each
time the buffer is updated to ensure it is synchronized w/ gpu..

BR,
-R
So as far as interactions with persistent-mapped buffers are concerned,
tiling just doesn't make a difference.
Post by Rob Clark
All the same, the way stk uses draw-indirect is not useful and it
would be better to shut it off, at least for me. Although perhaps
instead of adding things like this to driconf extension by extension,
it would be more useful to have a way to provide a default
MESA_EXTENSION_OVERRIDE via driconf?
Yeah, that makes sense.
Cheers,
Nicolai
Post by Rob Clark
BR,
-R
Loading...