Discussion:
[PATCH 00/29] anv: Rework resolves and fast clears
Jason Ekstrand
2017-11-28 03:05:50 UTC
This patch series is a major rework of the aux tracking and fast clear code
in our Vulkan driver. It's broken up into three basic pieces:

1) Patches 1-13 rework the way layout transitions work and add some
additional granularity to our aux tracking scheme. This is required to
support Y-tiled window system buffers where we have CCS_E but need to
do a full resolve prior to handing it off to the window system. The
current code does a partial resolve if and only if CCS_E is enabled.
These patches may get back-ported to 17.3 because it seems that people
are hitting issues with this somewhere in Chrome.

2) Patches 14-25 rework the code we have for doing fast clears, setting up
indirect clear colors, and doing implicit layout transitions. In
particular, we pull them all together into a single begin/end_subpass
function pair instead of scattering them across multiple functions in
genX_cmd_buffer.c and anv_blorp.c. This allows us to avoid the
redundant fast-clear that you get when you have LOAD_OP_CLEAR combined
with IMAGE_LAYOUT_UNDEFINED.

3) Patches 26-29 revive my old CCS ambiguate pass and make us use that
instead of a fast-clear for initializing CCS buffers on gen9+. This
should allow us to avoid some unneeded resolves in a couple of corner
cases. It also simplifies transition_color_buffer a decent bit.
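
To make item 1 concrete, here is a minimal sketch of the difference between the current rule (partial resolve if and only if CCS_E is enabled) and a finer-grained choice that also looks at what the consumer of the image can handle. Everything below is a toy model: the toy_* enums and the consumer_understands_compression flag are hypothetical names invented for illustration, not the real isl/anv types, and the series itself may well structure the decision differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the real isl_aux_usage / isl_aux_op enums. */
enum toy_aux_usage { TOY_AUX_NONE, TOY_AUX_CCS_D, TOY_AUX_CCS_E };
enum toy_resolve_op { TOY_RESOLVE_NONE, TOY_RESOLVE_PARTIAL, TOY_RESOLVE_FULL };

/* Current behaviour as described in the cover letter: the resolve type
 * depends only on whether the image uses CCS_E.
 */
static enum toy_resolve_op
toy_resolve_op_old(enum toy_aux_usage image_usage)
{
   if (image_usage == TOY_AUX_NONE)
      return TOY_RESOLVE_NONE;

   return image_usage == TOY_AUX_CCS_E ? TOY_RESOLVE_PARTIAL
                                       : TOY_RESOLVE_FULL;
}

/* Assumed finer-grained rule: a partial resolve is only enough when the
 * next user of the image understands compressed data.  A consumer such as
 * a window system that cannot read CCS_E needs a full resolve.
 */
static enum toy_resolve_op
toy_resolve_op_new(enum toy_aux_usage image_usage,
                   bool consumer_understands_compression)
{
   if (image_usage == TOY_AUX_NONE)
      return TOY_RESOLVE_NONE;

   if (image_usage == TOY_AUX_CCS_E && consumer_understands_compression)
      return TOY_RESOLVE_PARTIAL;

   return TOY_RESOLVE_FULL;
}
```

With the old rule, a CCS_E image always gets a partial resolve, which is not enough for the Y-tiled window system case; the sketched rule returns a full resolve there while keeping the cheaper partial resolve for consumers that can sample compressed data.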

I've organized this patch series in order of priority both in terms of time
and in terms of importance. If the third chunk doesn't land for a while, or
never lands at all, I'm not going to cry over it, but I do think it's quite a
bit better.
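
As a rough illustration of the redundant fast-clear mentioned in item 2, the sketch below counts fast clears for a single attachment under two schemes. It assumes the whole attachment is fast-clearable, and the struct and function names are hypothetical; it is only meant to show why handling the layout transition and the load op in one place lets the driver skip one of the two clears.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-attachment state, for illustration only. */
struct toy_attachment {
   bool initial_layout_undefined; /* comes in as IMAGE_LAYOUT_UNDEFINED */
   bool load_op_clear;            /* render pass uses LOAD_OP_CLEAR */
};

/* Transition and clear handled in separate places: the transition from
 * UNDEFINED initializes the aux surface with a fast clear, and then the
 * load op fast-clears the attachment a second time.
 */
static unsigned
toy_fast_clears_separate(const struct toy_attachment *att)
{
   unsigned count = 0;

   if (att->initial_layout_undefined)
      count++;
   if (att->load_op_clear)
      count++;

   return count;
}

/* Transition and clear handled together: if the attachment is about to be
 * cleared anyway, that clear also serves as the aux initialization.
 */
static unsigned
toy_fast_clears_combined(const struct toy_attachment *att)
{
   if (att->load_op_clear)
      return 1;

   return att->initial_layout_undefined ? 1 : 0;
}
```

For an attachment with both flags set, the first function counts two fast clears and the second counts one.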

Cc: Nanley Chery <***@intel.com>

Jason Ekstrand (29):
intel/isl: Codify AUX operations in an enum
anv/blorp: Rework image clear/resolve helpers
anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image
anv/blorp: Rework HiZ ops to look like MCS and CCS
anv/image: Update a comment
anv/image: Add a helper for determining when fast clears are supported
anv/image: Support color aspects in layout_to_aux_usage
anv/cmd_buffer: Recurse in transition_color_buffer instead of falling
through
anv/cmd_buffer: Generalize transition_color_buffer
anv/cmd_buffer: Add an anv_genX_call macro
anv/cmd_buffer: Add a mark_image_written helper
anv/cmd_buffer: Drop the genX from get/set_needs_resolve
anv/cmd_buffer: Rework aux tracking
anv/cmd_buffer: Apply subpass flushes before set_subpass
anv/cmd_buffer: Add begin/end_subpass helpers
anv/cmd_buffer: Pass a subpass id into begin_subpass
anv/cmd_buffer: Move the color portion of clear_subpass into
begin_subpass
intel/blorp: Add a blorp_hiz_clear_depth_stencil helper
anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass
anv/cmd_buffer: Decide whether or not to HiZ clear up-front
anv/cmd_buffer: Iterate all subpass attachments when clearing
anv/cmd_buffer: Add a concept of pending load aspects
anv/cmd_buffer: Sync clear values in begin_subpass
anv/cmd_buffer: Do subpass image transitions in begin/end_subpass
anv/cmd_buffer: Avoid unnecessary transitions before fast clears
intel/blorp: Add a CCS ambiguation pass
anv/cmd_buffer: Pull the undefined layout condition into the if
anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears
anv: Use blorp_ccs_ambiguate instead of fast-clears

src/intel/blorp/blorp.h | 16 +
src/intel/blorp/blorp_clear.c | 156 ++++++++
src/intel/isl/isl.h | 74 ++--
src/intel/vulkan/anv_blorp.c | 661 +++++++++++++++---------------
src/intel/vulkan/anv_cmd_buffer.c | 52 ++-
src/intel/vulkan/anv_genX.h | 6 +
src/intel/vulkan/anv_image.c | 108 ++++-
src/intel/vulkan/anv_private.h | 86 +++-
src/intel/vulkan/genX_cmd_buffer.c | 795 +++++++++++++++++++++++--------------
9 files changed, 1249 insertions(+), 705 deletions(-)
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:05:51 UTC
Right now, we have different entrypoints and enums in blorp for these
different operations. This provides us a central enum which we can
begin to transition to.
---
src/intel/isl/isl.h | 74 +++++++++++++++++++++++++++++++++++------------------
1 file changed, 49 insertions(+), 25 deletions(-)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index e3acb0e..fda2411 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -661,31 +661,8 @@ enum isl_aux_usage {
*
* Drawing with or without aux enabled may implicitly cause the surface to
* transition between these states. There are also four types of auxiliary
- * compression operations which cause an explicit transition:
- *
- * 1) Fast Clear: This operation writes the magic "clear" value to the
- * auxiliary surface. This operation will safely transition any slice
- * of a surface from any state to the clear state so long as the entire
- * slice is fast cleared at once. A fast clear that only covers part of
- * a slice of a surface is called a partial fast clear.
- *
- * 2) Full Resolve: This operation combines the auxiliary surface data
- * with the primary surface data and writes the result to the primary.
- * For HiZ, the docs call this a depth resolve. For CCS, the hardware
- * full resolve operation does both a full resolve and an ambiguate so
- * it actually takes you all the way to the pass-through state.
- *
- * 3) Partial Resolve: This operation considers blocks which are in the
- * "clear" state and writes the clear value directly into the primary or
- * auxiliary surface. Once this operation completes, the surface is
- * still compressed but no longer references the clear color. This
- * operation is only available for CCS.
- *
- * 4) Ambiguate: This operation throws away the current auxiliary data and
- * replaces it with the magic pass-through value. If an ambiguate
- * operation is performed when the primary surface does not contain 100%
- * of the data, data will be lost. This operation is only implemented
- * in hardware for depth where it is called a HiZ resolve.
+ * compression operations which cause an explicit transition which are
+ * described by the isl_aux_op enum below.
*
* Not all operations are valid or useful in all states. The diagram below
* contains a complete description of the states and all valid and useful
@@ -787,6 +764,53 @@ enum isl_aux_state {
ISL_AUX_STATE_AUX_INVALID,
};

+/**
+ * Enum which describes explicit aux transition operations.
+ */
+enum isl_aux_op {
+ ISL_AUX_OP_NONE,
+
+ /** Fast Clear
+ *
+ * This operation writes the magic "clear" value to the auxiliary surface.
+ * This operation will safely transition any slice of a surface from any
+ * state to the clear state so long as the entire slice is fast cleared at
+ * once. A fast clear that only covers part of a slice of a surface is
+ * called a partial fast clear.
+ */
+ ISL_AUX_OP_FAST_CLEAR,
+
+ /** Full Resolve
+ *
+ * This operation combines the auxiliary surface data with the primary
+ * surface data and writes the result to the primary. For HiZ, the docs
+ * call this a depth resolve. For CCS, the hardware full resolve operation
+ * does both a full resolve and an ambiguate so it actually takes you all
+ * the way to the pass-through state.
+ */
+ ISL_AUX_OP_FULL_RESOLVE,
+
+ /** Partial Resolve
+ *
+ * This operation considers blocks which are in the "clear" state and
+ * writes the clear value directly into the primary or auxiliary surface.
+ * Once this operation completes, the surface is still compressed but no
+ * longer references the clear color. This operation is only available
+ * for CCS_E.
+ */
+ ISL_AUX_OP_PARTIAL_RESOLVE,
+
+ /** Ambiguate
+ *
+ * This operation throws away the current auxiliary data and replaces it
+ * with the magic pass-through value. If an ambiguate operation is
+ * performed when the primary surface does not contain 100% of the data,
+ * data will be lost. This operation is only implemented in hardware for
+ * depth where it is called a HiZ resolve.
+ */
+ ISL_AUX_OP_AMBIGUATE,
+};
+
/* TODO(chadv): Explain */
enum isl_array_pitch_span {
ISL_ARRAY_PITCH_SPAN_FULL,
--
2.5.0.400.gff86faf
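
For readers who want to see the four operations as a state machine, here is a small sketch for the CCS case only, based on the enum comments in the patch above. The toy_* names and the reduced set of states are assumptions made for illustration; the real isl_aux_state enum has more states, and this model does not track whether the primary surface actually holds all of the data.

```c
#include <assert.h>

/* Toy mirror of the operations documented in isl_aux_op. */
enum toy_aux_op {
   TOY_OP_NONE,
   TOY_OP_FAST_CLEAR,
   TOY_OP_FULL_RESOLVE,
   TOY_OP_PARTIAL_RESOLVE,
   TOY_OP_AMBIGUATE,
};

/* Reduced, hypothetical set of states for a single CCS slice. */
enum toy_aux_state {
   TOY_STATE_CLEAR,
   TOY_STATE_COMPRESSED_CLEAR,
   TOY_STATE_COMPRESSED_NO_CLEAR,
   TOY_STATE_PASS_THROUGH,
};

static enum toy_aux_state
toy_ccs_apply_op(enum toy_aux_state state, enum toy_aux_op op)
{
   switch (op) {
   case TOY_OP_NONE:
      return state;
   case TOY_OP_FAST_CLEAR:
      /* Valid from any state as long as the whole slice is cleared. */
      return TOY_STATE_CLEAR;
   case TOY_OP_FULL_RESOLVE:
      /* For CCS the hardware resolve also ambiguates. */
      return TOY_STATE_PASS_THROUGH;
   case TOY_OP_PARTIAL_RESOLVE:
      /* Clear blocks get the clear value written out; the slice stays
       * compressed.  Leaving pass-through untouched is an assumption.
       */
      return state == TOY_STATE_PASS_THROUGH ? state
                                             : TOY_STATE_COMPRESSED_NO_CLEAR;
   case TOY_OP_AMBIGUATE:
      /* Throws the aux data away; only safe if the primary surface
       * already holds 100% of the data, which this toy does not check.
       */
      return TOY_STATE_PASS_THROUGH;
   }

   return state;
}
```

The point of a single enum is visible in the switch: callers name the operation they want, and the choice of which hardware path performs it can live elsewhere.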
Nanley Chery
2017-12-05 00:28:24 UTC
Post by Jason Ekstrand
Right now, we have different entrypoints and enums in blorp for these
different operations. This provides us a central enum which we can
begin to transition to.
---
src/intel/isl/isl.h | 74 +++++++++++++++++++++++++++++++++++------------------
1 file changed, 49 insertions(+), 25 deletions(-)
This patch is
Post by Jason Ekstrand
diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index e3acb0e..fda2411 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -661,31 +661,8 @@ enum isl_aux_usage {
*
* Drawing with or without aux enabled may implicitly cause the surface to
* transition between these states. There are also four types of auxiliary
- * compression operations which cause an explicit transition:
- *
- * 1) Fast Clear: This operation writes the magic "clear" value to the
- * auxiliary surface. This operation will safely transition any slice
- * of a surface from any state to the clear state so long as the entire
- * slice is fast cleared at once. A fast clear that only covers part of
- * a slice of a surface is called a partial fast clear.
- *
- * 2) Full Resolve: This operation combines the auxiliary surface data
- * with the primary surface data and writes the result to the primary.
- * For HiZ, the docs call this a depth resolve. For CCS, the hardware
- * full resolve operation does both a full resolve and an ambiguate so
- * it actually takes you all the way to the pass-through state.
- *
- * 3) Partial Resolve: This operation considers blocks which are in the
- * "clear" state and writes the clear value directly into the primary or
- * auxiliary surface. Once this operation completes, the surface is
- * still compressed but no longer references the clear color. This
- * operation is only available for CCS.
- *
- * 4) Ambiguate: This operation throws away the current auxiliary data and
- * replaces it with the magic pass-through value. If an ambiguate
- * operation is performed when the primary surface does not contain 100%
- * of the data, data will be lost. This operation is only implemented
- * in hardware for depth where it is called a HiZ resolve.
+ * compression operations which cause an explicit transition which are
+ * described by the isl_aux_op enum below.
*
* Not all operations are valid or useful in all states. The diagram below
* contains a complete description of the states and all valid and useful
@@ -787,6 +764,53 @@ enum isl_aux_state {
ISL_AUX_STATE_AUX_INVALID,
};
+/**
+ * Enum which describes explicit aux transition operations.
+ */
+enum isl_aux_op {
+ ISL_AUX_OP_NONE,
+
+ /** Fast Clear
+ *
+ * This operation writes the magic "clear" value to the auxiliary surface.
+ * This operation will safely transition any slice of a surface from any
+ * state to the clear state so long as the entire slice is fast cleared at
+ * once. A fast clear that only covers part of a slice of a surface is
+ * called a partial fast clear.
+ */
+ ISL_AUX_OP_FAST_CLEAR,
+
+ /** Full Resolve
+ *
+ * This operation combines the auxiliary surface data with the primary
+ * surface data and writes the result to the primary. For HiZ, the docs
+ * call this a depth resolve. For CCS, the hardware full resolve operation
+ * does both a full resolve and an ambiguate so it actually takes you all
+ * the way to the pass-through state.
+ */
+ ISL_AUX_OP_FULL_RESOLVE,
+
+ /** Partial Resolve
+ *
+ * This operation considers blocks which are in the "clear" state and
+ * writes the clear value directly into the primary or auxiliary surface.
+ * Once this operation completes, the surface is still compressed but no
+ * longer references the clear color. This operation is only available
+ * for CCS_E.
+ */
+ ISL_AUX_OP_PARTIAL_RESOLVE,
+
+ /** Ambiguate
+ *
+ * This operation throws away the current auxiliary data and replaces it
+ * with the magic pass-through value. If an ambiguate operation is
+ * performed when the primary surface does not contain 100% of the data,
+ * data will be lost. This operation is only implemented in hardware for
+ * depth where it is called a HiZ resolve.
+ */
+ ISL_AUX_OP_AMBIGUATE,
+};
+
/* TODO(chadv): Explain */
enum isl_array_pitch_span {
ISL_ARRAY_PITCH_SPAN_FULL,
--
2.5.0.400.gff86faf
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Jason Ekstrand
2017-11-28 03:05:53 UTC
If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old
behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what
we want for blits/copies.
---
src/intel/vulkan/anv_blorp.c | 22 ++++++----------------
1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 7c8a673..f10adf0 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -193,12 +193,13 @@ get_blorp_surf_for_anv_image(const struct anv_device *device,
{
uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);

- if (aux_usage == ANV_AUX_USAGE_DEFAULT)
+ if (aux_usage == ANV_AUX_USAGE_DEFAULT) {
aux_usage = image->planes[plane].aux_usage;

- if (aspect == VK_IMAGE_ASPECT_STENCIL_BIT ||
- aux_usage == ISL_AUX_USAGE_HIZ)
- aux_usage = ISL_AUX_USAGE_NONE;
+ if (aspect == VK_IMAGE_ASPECT_STENCIL_BIT ||
+ aux_usage == ISL_AUX_USAGE_HIZ)
+ aux_usage = ISL_AUX_USAGE_NONE;
+ }

const struct anv_surface *surface = &image->planes[plane].surface;
*blorp_surf = (struct blorp_surf) {
@@ -1593,18 +1594,7 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
struct blorp_surf surf;
get_blorp_surf_for_anv_image(cmd_buffer->device,
image, VK_IMAGE_ASPECT_DEPTH_BIT,
- ISL_AUX_USAGE_NONE, &surf);
-
- /* Manually add the aux HiZ surf */
- surf.aux_surf = &image->planes[0].aux_surface.isl,
- surf.aux_addr = (struct blorp_address) {
- .buffer = image->planes[0].bo,
- .offset = image->planes[0].bo_offset +
- image->planes[0].aux_surface.offset,
- .mocs = cmd_buffer->device->default_mocs,
- };
- surf.aux_usage = ISL_AUX_USAGE_HIZ;
-
+ ISL_AUX_USAGE_HIZ, &surf);
surf.clear_color.f32[0] = ANV_HZ_FC_VAL;

blorp_hiz_op(&batch, &surf, 0, 0, 1, op);
--
2.5.0.400.gff86faf
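
The behaviour change in get_blorp_surf_for_anv_image can be summarized by the sketch below: the HiZ/stencil override now applies only when the caller asks for the default, so an explicit request for HiZ is honoured. The toy_* enums are hypothetical stand-ins, and modelling the "default" request as an extra enumerator is a simplification of ANV_AUX_USAGE_DEFAULT made for this example.

```c
#include <assert.h>

/* Toy stand-ins, NOT the real isl/Vulkan enums. */
enum toy_aux_usage {
   TOY_AUX_NONE,
   TOY_AUX_HIZ,
   TOY_AUX_CCS_E,
   TOY_AUX_DEFAULT, /* "use whatever the image has" */
};

enum toy_aspect { TOY_ASPECT_COLOR, TOY_ASPECT_DEPTH, TOY_ASPECT_STENCIL };

/* Mirrors the control flow after the patch: the override that forces
 * NONE for depth/stencil sits inside the DEFAULT branch.
 */
static enum toy_aux_usage
toy_pick_aux_usage(enum toy_aux_usage requested,
                   enum toy_aux_usage image_usage,
                   enum toy_aspect aspect)
{
   if (requested == TOY_AUX_DEFAULT) {
      requested = image_usage;

      if (aspect == TOY_ASPECT_STENCIL || requested == TOY_AUX_HIZ)
         requested = TOY_AUX_NONE;
   }

   return requested;
}
```

A blit or copy that passes the default still gets no aux for a HiZ depth image, while the HiZ resolve path can pass HiZ explicitly and no longer has to patch the aux surface into the blorp_surf by hand.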
Nanley Chery
2017-12-06 00:11:41 UTC
Post by Jason Ekstrand
If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old
behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what
we want for blits/copies.
---
src/intel/vulkan/anv_blorp.c | 22 ++++++----------------
1 file changed, 6 insertions(+), 16 deletions(-)
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 7c8a673..f10adf0 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -193,12 +193,13 @@ get_blorp_surf_for_anv_image(const struct anv_device *device,
{
uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
- if (aux_usage == ANV_AUX_USAGE_DEFAULT)
+ if (aux_usage == ANV_AUX_USAGE_DEFAULT) {
aux_usage = image->planes[plane].aux_usage;
- if (aspect == VK_IMAGE_ASPECT_STENCIL_BIT ||
- aux_usage == ISL_AUX_USAGE_HIZ)
- aux_usage = ISL_AUX_USAGE_NONE;
+ if (aspect == VK_IMAGE_ASPECT_STENCIL_BIT ||
+ aux_usage == ISL_AUX_USAGE_HIZ)
I think the predicate no longer needs a check on the aspect. If the
aspect is stencil, the aux_usage will be either NONE or HIZ. If the
aux_usage is HIZ, it'll be caught by the aux_usage check. If it's NONE,
there's no work to be done. Either way, this patch is
Post by Jason Ekstrand
+ aux_usage = ISL_AUX_USAGE_NONE;
+ }
const struct anv_surface *surface = &image->planes[plane].surface;
*blorp_surf = (struct blorp_surf) {
@@ -1593,18 +1594,7 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
struct blorp_surf surf;
get_blorp_surf_for_anv_image(cmd_buffer->device,
image, VK_IMAGE_ASPECT_DEPTH_BIT,
- ISL_AUX_USAGE_NONE, &surf);
-
- /* Manually add the aux HiZ surf */
- surf.aux_surf = &image->planes[0].aux_surface.isl,
- surf.aux_addr = (struct blorp_address) {
- .buffer = image->planes[0].bo,
- .offset = image->planes[0].bo_offset +
- image->planes[0].aux_surface.offset,
- .mocs = cmd_buffer->device->default_mocs,
- };
- surf.aux_usage = ISL_AUX_USAGE_HIZ;
-
Thanks for getting rid of that!
Post by Jason Ekstrand
+ ISL_AUX_USAGE_HIZ, &surf);
surf.clear_color.f32[0] = ANV_HZ_FC_VAL;
blorp_hiz_op(&batch, &surf, 0, 0, 1, op);
--
2.5.0.400.gff86faf
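
The review comment about dropping the aspect check can be verified by brute force in a toy model. The sketch below compares the predicate with and without the stencil check over every combination that satisfies the stated assumption (a stencil aspect only ever sees NONE or HIZ). The enums and function names are hypothetical and exist only for this check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the real isl/Vulkan enums. */
enum toy_aux_usage { TOY_AUX_NONE, TOY_AUX_HIZ, TOY_AUX_CCS_E, TOY_AUX_COUNT };
enum toy_aspect {
   TOY_ASPECT_COLOR,
   TOY_ASPECT_DEPTH,
   TOY_ASPECT_STENCIL,
   TOY_ASPECT_COUNT,
};

/* Predicate as written in the patch. */
static enum toy_aux_usage
toy_with_aspect_check(enum toy_aspect aspect, enum toy_aux_usage usage)
{
   if (aspect == TOY_ASPECT_STENCIL || usage == TOY_AUX_HIZ)
      return TOY_AUX_NONE;
   return usage;
}

/* Simplified predicate suggested in the review. */
static enum toy_aux_usage
toy_without_aspect_check(enum toy_aux_usage usage)
{
   return usage == TOY_AUX_HIZ ? TOY_AUX_NONE : usage;
}

/* Returns true if both predicates agree on every input allowed by the
 * assumption that stencil only ever has NONE or HIZ as its aux usage.
 */
static bool
toy_predicates_agree(void)
{
   for (int a = 0; a < TOY_ASPECT_COUNT; a++) {
      for (int u = 0; u < TOY_AUX_COUNT; u++) {
         if (a == TOY_ASPECT_STENCIL &&
             u != TOY_AUX_NONE && u != TOY_AUX_HIZ)
            continue; /* excluded by the assumption */

         if (toy_with_aspect_check((enum toy_aspect)a,
                                   (enum toy_aux_usage)u) !=
             toy_without_aspect_check((enum toy_aux_usage)u))
            return false;
      }
   }

   return true;
}
```

The two forms only diverge for inputs the assumption rules out (for example, stencil paired with CCS_E in this toy), so the simplification is safe exactly as long as that assumption holds.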
Jason Ekstrand
2017-11-28 03:05:52 UTC
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op, whatever that may be, on CCS or MCS. This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.
---
src/intel/vulkan/anv_blorp.c | 218 ++++++++++++++++++++++---------------
src/intel/vulkan/anv_private.h | 23 ++--
src/intel/vulkan/genX_cmd_buffer.c | 28 +++--
3 files changed, 165 insertions(+), 104 deletions(-)
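
As a reading aid for the diff below, this sketch summarizes which operations each of the two new helpers accepts at this point in the series: the MCS helper only handles a fast clear, while the CCS helper handles a fast clear and both resolves, and neither handles an ambiguate yet. The toy_* names are hypothetical, and the function returns false where the real helpers hit unreachable().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy mirror of isl_aux_op, for illustration only. */
enum toy_aux_op {
   TOY_OP_NONE,
   TOY_OP_FAST_CLEAR,
   TOY_OP_FULL_RESOLVE,
   TOY_OP_PARTIAL_RESOLVE,
   TOY_OP_AMBIGUATE,
};

/* Which helper the operation is dispatched to. */
enum toy_aux_kind { TOY_KIND_CCS, TOY_KIND_MCS };

/* Returns true if the helper for this kind of aux surface has a case for
 * the given operation in this patch.
 */
static bool
toy_op_supported(enum toy_aux_kind kind, enum toy_aux_op op)
{
   switch (kind) {
   case TOY_KIND_MCS:
      return op == TOY_OP_FAST_CLEAR;
   case TOY_KIND_CCS:
      return op == TOY_OP_FAST_CLEAR ||
             op == TOY_OP_FULL_RESOLVE ||
             op == TOY_OP_PARTIAL_RESOLVE;
   }

   return false;
}
```

Callers in transition_color_buffer pick the helper from the sample count (one sample means CCS, more than one means MCS) and then pass the operation they want.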

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index e244468..7c8a673 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1439,75 +1439,6 @@ fast_clear_aux_usage(const struct anv_image *image,
}

void
-anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- const uint32_t base_level, const uint32_t level_count,
- const uint32_t base_layer, uint32_t layer_count)
-{
- assert(image->type == VK_IMAGE_TYPE_3D || image->extent.depth == 1);
-
- if (image->type == VK_IMAGE_TYPE_3D) {
- assert(base_layer == 0);
- assert(layer_count == anv_minify(image->extent.depth, base_level));
- }
-
- struct blorp_batch batch;
- blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
-
- struct blorp_surf surf;
- get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
- fast_clear_aux_usage(image, aspect),
- &surf);
-
- /* From the Sky Lake PRM Vol. 7, "Render Target Fast Clear":
- *
- * "After Render target fast clear, pipe-control with color cache
- * write-flush must be issued before sending any DRAW commands on
- * that render target."
- *
- * This comment is a bit cryptic and doesn't really tell you what's going
- * or what's really needed. It appears that fast clear ops are not
- * properly synchronized with other drawing. This means that we cannot
- * have a fast clear operation in the pipe at the same time as other
- * regular drawing operations. We need to use a PIPE_CONTROL to ensure
- * that the contents of the previous draw hit the render target before we
- * resolve and then use a second PIPE_CONTROL after the resolve to ensure
- * that it is completed before any additional drawing occurs.
- */
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
-
- uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
- uint32_t width_div = image->format->planes[plane].denominator_scales[0];
- uint32_t height_div = image->format->planes[plane].denominator_scales[1];
-
- for (uint32_t l = 0; l < level_count; l++) {
- const uint32_t level = base_level + l;
-
- const VkExtent3D extent = {
- .width = anv_minify(image->extent.width, level),
- .height = anv_minify(image->extent.height, level),
- .depth = anv_minify(image->extent.depth, level),
- };
-
- if (image->type == VK_IMAGE_TYPE_3D)
- layer_count = extent.depth;
-
- assert(level < anv_image_aux_levels(image, aspect));
- assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, level));
- blorp_fast_clear(&batch, &surf, surf.surf->format,
- level, base_layer, layer_count,
- 0, 0,
- extent.width / width_div,
- extent.height / height_div);
- }
-
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
-}
-
-void
anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
@@ -1681,36 +1612,153 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
}

void
-anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
- const struct anv_image * const image,
- VkImageAspectFlagBits aspect,
- const uint8_t level,
- const uint32_t start_layer, const uint32_t layer_count,
- const enum blorp_fast_clear_op op)
+anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op mcs_op, bool predicate)
{
- assert(cmd_buffer && image);
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ assert(image->samples > 1);
+ assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, 0));

- uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ /* We don't support planar images with multisampling yet */
+ assert(image->n_planes == 1);
+
+ struct blorp_batch batch;
+ blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
+ predicate ? BLORP_BATCH_PREDICATE_ENABLE : 0);
+
+ struct blorp_surf surf;
+ get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
+ fast_clear_aux_usage(image, aspect),
+ &surf);
+
+ /* From the Sky Lake PRM Vol. 7, "Render Target Fast Clear":
+ *
+ * "After Render target fast clear, pipe-control with color cache
+ * write-flush must be issued before sending any DRAW commands on
+ * that render target."
+ *
+ * This comment is a bit cryptic and doesn't really tell you what's going
+ * on or what's really needed. It appears that fast clear ops are not
+ * properly synchronized with other drawing. This means that we cannot
+ * have a fast clear operation in the pipe at the same time as other
+ * regular drawing operations. We need to use a PIPE_CONTROL to ensure
+ * that the contents of the previous draw hit the render target before we
+ * resolve and then use a second PIPE_CONTROL after the resolve to ensure
+ * that it is completed before any additional drawing occurs.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ switch (mcs_op) {
+ case ISL_AUX_OP_FAST_CLEAR:
+ blorp_fast_clear(&batch, &surf, surf.surf->format,
+ 0, base_layer, layer_count,
+ 0, 0, image->extent.width, image->extent.height);
+ break;
+ case ISL_AUX_OP_FULL_RESOLVE:
+ case ISL_AUX_OP_PARTIAL_RESOLVE:
+ case ISL_AUX_OP_AMBIGUATE:
+ default:
+ unreachable("Unsupported MCS operation");
+ }
+
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ blorp_batch_finish(&batch);
+}

- /* The resolved subresource range must have a CCS buffer. */
+static enum blorp_fast_clear_op
+isl_to_blorp_fast_clear_op(enum isl_aux_op isl_op)
+{
+ switch (isl_op) {
+ case ISL_AUX_OP_FAST_CLEAR: return BLORP_FAST_CLEAR_OP_CLEAR;
+ case ISL_AUX_OP_FULL_RESOLVE: return BLORP_FAST_CLEAR_OP_RESOLVE_FULL;
+ case ISL_AUX_OP_PARTIAL_RESOLVE: return BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL;
+ default:
+ unreachable("Unsupported fast clear aux op");
+ }
+}
+
+void
+anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op ccs_op, bool predicate)
+{
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ assert(image->samples == 1);
assert(level < anv_image_aux_levels(image, aspect));
- assert(start_layer + layer_count <=
+ assert(base_layer + layer_count <=
anv_image_aux_layers(image, aspect, level));
- assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV && image->samples == 1);
+
+ uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ uint32_t width_div = image->format->planes[plane].denominator_scales[0];
+ uint32_t height_div = image->format->planes[plane].denominator_scales[1];
+ uint32_t level_width = anv_minify(image->extent.width, level) / width_div;
+ uint32_t level_height = anv_minify(image->extent.height, level) / height_div;

struct blorp_batch batch;
blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
- BLORP_BATCH_PREDICATE_ENABLE);
+ predicate ? BLORP_BATCH_PREDICATE_ENABLE : 0);

struct blorp_surf surf;
get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
fast_clear_aux_usage(image, aspect),
&surf);
- surf.clear_color_addr = anv_to_blorp_address(
- anv_image_get_clear_color_addr(cmd_buffer->device, image, aspect, level));

- blorp_ccs_resolve(&batch, &surf, level, start_layer, layer_count,
- image->planes[plane].surface.isl.format, op);
+ if (ccs_op == ISL_AUX_OP_FULL_RESOLVE ||
+ ccs_op == ISL_AUX_OP_PARTIAL_RESOLVE) {
+ /* If we're doing a resolve operation, then we need the indirect clear
+ * color. The clear and ambiguate operations just stomp the CCS to a
+ * particular value and don't care about format or clear value.
+ */
+ const struct anv_address clear_color_addr =
+ anv_image_get_clear_color_addr(cmd_buffer->device, image,
+ aspect, level);
+ surf.clear_color_addr = anv_to_blorp_address(clear_color_addr);
+ }
+
+ /* From the Sky Lake PRM Vol. 7, "Render Target Fast Clear":
+ *
+ * "After Render target fast clear, pipe-control with color cache
+ * write-flush must be issued before sending any DRAW commands on
+ * that render target."
+ *
+ * This comment is a bit cryptic and doesn't really tell you what's going
+ * on or what's really needed. It appears that fast clear ops are not
+ * properly synchronized with other drawing. This means that we cannot
+ * have a fast clear operation in the pipe at the same time as other
+ * regular drawing operations. We need to use a PIPE_CONTROL to ensure
+ * that the contents of the previous draw hit the render target before we
+ * resolve and then use a second PIPE_CONTROL after the resolve to ensure
+ * that it is completed before any additional drawing occurs.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ switch (ccs_op) {
+ case ISL_AUX_OP_FAST_CLEAR:
+ blorp_fast_clear(&batch, &surf, surf.surf->format,
+ level, base_layer, layer_count,
+ 0, 0, level_width, level_height);
+ break;
+ case ISL_AUX_OP_FULL_RESOLVE:
+ case ISL_AUX_OP_PARTIAL_RESOLVE:
+ blorp_ccs_resolve(&batch, &surf, level, base_layer, layer_count,
+ surf.surf->format, isl_to_blorp_fast_clear_op(ccs_op));
+ break;
+ case ISL_AUX_OP_AMBIGUATE:
+ default:
+ unreachable("Unsupported CCS operation");
+ }
+
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;

blorp_batch_finish(&batch);
}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index ca3644d..dc44ab6 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2533,20 +2533,19 @@ void
anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
enum blorp_hiz_op op);
-void
-anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
- const struct anv_image * const image,
- VkImageAspectFlagBits aspect,
- const uint8_t level,
- const uint32_t start_layer, const uint32_t layer_count,
- const enum blorp_fast_clear_op op);

void
-anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- const uint32_t base_level, const uint32_t level_count,
- const uint32_t base_layer, uint32_t layer_count);
+anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op mcs_op, bool predicate);
+void
+anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op ccs_op, bool predicate);

void
anv_image_copy_to_shadow(struct anv_cmd_buffer *cmd_buffer,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index ab5590d..2e7a2cc 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -689,9 +689,22 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
"define an MCS buffer.");
}

- anv_image_fast_clear(cmd_buffer, image, aspect,
- base_level, level_count,
- base_layer, layer_count);
+ if (image->samples == 1) {
+ for (uint32_t l = 0; l < level_count; l++) {
+ const uint32_t level = base_level + l;
+ const uint32_t level_layer_count =
+ MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, level_layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
+ } else {
+ assert(image->samples > 1);
+ assert(base_level == 0 && level_count == 1);
+ anv_image_mcs_op(cmd_buffer, image, aspect,
+ base_layer, layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
}
/* At this point, some elements of the CCS buffer may have the fast-clear
* bit-arrangement. As the user writes to a subresource, we need to have
@@ -760,10 +773,11 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,

genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);

- anv_ccs_resolve(cmd_buffer, image, aspect, level, base_layer, layer_count,
- image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
- BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL :
- BLORP_FAST_CLEAR_OP_RESOLVE_FULL);
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, layer_count,
+ image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
+ ISL_AUX_OP_PARTIAL_RESOLVE : ISL_AUX_OP_FULL_RESOLVE,
+ true);

genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
}
--
2.5.0.400.gff86faf
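
The per-level arithmetic in anv_image_ccs_op and in the new loop in transition_color_buffer can be sketched as follows. This is a simplified model: toy_minify is assumed to behave like anv_minify (shift right by the level, clamped to at least 1), toy_aux_layers ignores whether a given level has an aux surface at all, and the struct is a hypothetical stand-in for the relevant anv_image fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of image properties used by this sketch. */
struct toy_image {
   uint32_t width, height, depth;
   uint32_t array_layers;
   bool is_3d;
   uint32_t width_div, height_div; /* per-plane denominator scales */
};

/* Assumed to match anv_minify: halve per level, never below 1. */
static uint32_t
toy_minify(uint32_t n, uint32_t level)
{
   if (n == 0)
      return 0;
   if (level >= 32)
      return 1; /* avoid an out-of-range shift */

   uint32_t m = n >> level;
   return m ? m : 1;
}

/* Simplified layer count: 3D images lose depth slices per level. */
static uint32_t
toy_aux_layers(const struct toy_image *image, uint32_t level)
{
   return image->is_3d ? toy_minify(image->depth, level)
                       : image->array_layers;
}

static uint32_t
toy_level_width(const struct toy_image *image, uint32_t level)
{
   return toy_minify(image->width, level) / image->width_div;
}

static uint32_t
toy_level_height(const struct toy_image *image, uint32_t level)
{
   return toy_minify(image->height, level) / image->height_div;
}

/* Same clamp as the MIN2() in the new loop. */
static uint32_t
toy_level_layer_count(const struct toy_image *image, uint32_t level,
                      uint32_t layer_count)
{
   uint32_t max_layers = toy_aux_layers(image, level);
   return layer_count < max_layers ? layer_count : max_layers;
}
```

For a 64x32x8 3D image, level 2 comes out as 16x8 with 2 layers, which is why the caller clamps layer_count per level instead of passing the same count to every level.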
Nanley Chery
2017-12-05 23:48:45 UTC
Post by Jason Ekstrand
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op, whatever that may be, on CCS or MCS. This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.
---
src/intel/vulkan/anv_blorp.c | 218 ++++++++++++++++++++++---------------
src/intel/vulkan/anv_private.h | 23 ++--
src/intel/vulkan/genX_cmd_buffer.c | 28 +++--
3 files changed, 165 insertions(+), 104 deletions(-)
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index e244468..7c8a673 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1439,75 +1439,6 @@ fast_clear_aux_usage(const struct anv_image *image,
}
void
-anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- const uint32_t base_level, const uint32_t level_count,
- const uint32_t base_layer, uint32_t layer_count)
-{
- assert(image->type == VK_IMAGE_TYPE_3D || image->extent.depth == 1);
-
- if (image->type == VK_IMAGE_TYPE_3D) {
- assert(base_layer == 0);
- assert(layer_count == anv_minify(image->extent.depth, base_level));
- }
-
- struct blorp_batch batch;
- blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
-
- struct blorp_surf surf;
- get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
- fast_clear_aux_usage(image, aspect),
- &surf);
-
- /* From the Sky Lake PRM Vol. 7, "Render Target Fast Clear":
- *
- * "After Render target fast clear, pipe-control with color cache
- * write-flush must be issued before sending any DRAW commands on
- * that render target."
- *
- * This comment is a bit cryptic and doesn't really tell you what's going
- * or what's really needed. It appears that fast clear ops are not
- * properly synchronized with other drawing. This means that we cannot
- * have a fast clear operation in the pipe at the same time as other
- * regular drawing operations. We need to use a PIPE_CONTROL to ensure
- * that the contents of the previous draw hit the render target before we
- * resolve and then use a second PIPE_CONTROL after the resolve to ensure
- * that it is completed before any additional drawing occurs.
- */
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
-
- uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
- uint32_t width_div = image->format->planes[plane].denominator_scales[0];
- uint32_t height_div = image->format->planes[plane].denominator_scales[1];
-
- for (uint32_t l = 0; l < level_count; l++) {
- const uint32_t level = base_level + l;
-
- const VkExtent3D extent = {
- .width = anv_minify(image->extent.width, level),
- .height = anv_minify(image->extent.height, level),
- .depth = anv_minify(image->extent.depth, level),
- };
-
- if (image->type == VK_IMAGE_TYPE_3D)
- layer_count = extent.depth;
-
- assert(level < anv_image_aux_levels(image, aspect));
- assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, level));
- blorp_fast_clear(&batch, &surf, surf.surf->format,
- level, base_layer, layer_count,
- 0, 0,
- extent.width / width_div,
- extent.height / height_div);
- }
-
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
-}
-
-void
anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
@@ -1681,36 +1612,153 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
}
void
-anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
- const struct anv_image * const image,
- VkImageAspectFlagBits aspect,
- const uint8_t level,
- const uint32_t start_layer, const uint32_t layer_count,
- const enum blorp_fast_clear_op op)
+anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op mcs_op, bool predicate)
{
- assert(cmd_buffer && image);
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ assert(image->samples > 1);
+ assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, 0));
- uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ /* We don't support planar images with multisampling yet */
+ assert(image->n_planes == 1);
+
Is this true? I can't find a similar restriction in anv_formats.c.
Post by Jason Ekstrand
+ struct blorp_batch batch;
+ blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
+ predicate ? BLORP_BATCH_PREDICATE_ENABLE : 0);
+
+ struct blorp_surf surf;
+ get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
+ fast_clear_aux_usage(image, aspect),
How about ANV_AUX_USAGE_DEFAULT instead? The fast_clear_aux_usage helper
seems only beneficial for CCS_D/CCS images. Not a big deal though.
Post by Jason Ekstrand
+ &surf);
+
+ /*
+ *
+ * "After Render target fast clear, pipe-control with color cache
+ * write-flush must be issued before sending any DRAW commands on
+ * that render target."
+ *
+ * This comment is a bit cryptic and doesn't really tell you what's going
+ * or what's really needed. It appears that fast clear ops are not
+ * properly synchronized with other drawing. This means that we cannot
+ * have a fast clear operation in the pipe at the same time as other
+ * regular drawing operations. We need to use a PIPE_CONTROL to ensure
+ * that the contents of the previous draw hit the render target before we
+ * resolve and then use a second PIPE_CONTROL after the resolve to ensure
+ * that it is completed before any additional drawing occurs.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ switch (mcs_op) {
Are you missing this case?
case ISL_AUX_OP_NONE:
return;

Seems like the NONE case is left out in a number of other switches. Was
this intentional?
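
To make the suggestion concrete, here is a minimal standalone sketch of a
switch that treats NONE as an explicit no-op. The enum and the function name
are hypothetical stand-ins chosen for illustration; they are not the real
isl/anv definitions, and the set of supported ops is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for enum isl_aux_op; values are illustrative only. */
enum aux_op {
   AUX_OP_NONE,
   AUX_OP_FAST_CLEAR,
   AUX_OP_FULL_RESOLVE,
   AUX_OP_PARTIAL_RESOLVE,
   AUX_OP_AMBIGUATE,
};

/* Returns true if the op is accepted by this sketch of an MCS helper.
 * *emits_work is set to true only when the op would actually emit work;
 * NONE is accepted but emits nothing.
 */
bool
mcs_op_supported(enum aux_op op, bool *emits_work)
{
   switch (op) {
   case AUX_OP_NONE:
      /* Explicit no-op: return early instead of hitting the default. */
      *emits_work = false;
      return true;
   case AUX_OP_FAST_CLEAR:
      *emits_work = true;
      return true;
   default:
      /* The real code would call unreachable() here. */
      *emits_work = false;
      return false;
   }
}
```

Whether NONE should be a silent no-op or stay unreachable is a design choice:
the early return tolerates callers that compute the op dynamically, while
leaving it out catches callers that should never have asked for it.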
Post by Jason Ekstrand
+ case ISL_AUX_OP_FAST_CLEAR:
+ blorp_fast_clear(&batch, &surf, surf.surf->format,
+ 0, base_layer, layer_count,
+ 0, 0, image->extent.width, image->extent.height);
+ break;
+ default:
+ unreachable("Unsupported CCS operation");
s/CCS/MCS/
Post by Jason Ekstrand
+ }
+
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ blorp_batch_finish(&batch);
+}
- /* The resolved subresource range must have a CCS buffer. */
+static enum blorp_fast_clear_op
+isl_to_blorp_fast_clear_op(enum isl_aux_op isl_op)
+{
+ switch (isl_op) {
Are you missing this case?
case ISL_AUX_OP_NONE: return BLORP_FAST_CLEAR_OP_NONE;
Post by Jason Ekstrand
+ case ISL_AUX_OP_FAST_CLEAR: return BLORP_FAST_CLEAR_OP_CLEAR;
+ case ISL_AUX_OP_FULL_RESOLVE: return BLORP_FAST_CLEAR_OP_RESOLVE_FULL;
+ case ISL_AUX_OP_PARTIAL_RESOLVE: return BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL;
+ default:
+ unreachable("Unsupported HiZ aux op");
+ }
+}
+
+void
+anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op ccs_op, bool predicate)
+{
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ assert(image->samples == 1);
assert(level < anv_image_aux_levels(image, aspect));
- assert(start_layer + layer_count <=
+ assert(base_layer + layer_count <=
anv_image_aux_layers(image, aspect, level));
- assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV && image->samples == 1);
+
+ uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ uint32_t width_div = image->format->planes[plane].denominator_scales[0];
+ uint32_t height_div = image->format->planes[plane].denominator_scales[1];
+ uint32_t level_width = anv_minify(image->extent.width, level) / width_div;
+ uint32_t level_height = anv_minify(image->extent.height, level) / height_div;
I can't find any spec text covering mipmaps and multi-planar images, but
the image level is no longer a valid YCbCr subresource if

(anv_minify(image->extent.width , level) % width_div ) != 0
(anv_minify(image->extent.height, level) % height_div) != 0

If this is an open issue, what do you think about some assertions for
this? This was a problem in the original code as well.
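
For reference, a rough standalone sketch of the kind of check I have in mind.
The helper names are hypothetical, and minify() is only assumed to behave like
anv_minify() (shift right by the level, clamped to 1); in the driver this
would be a pair of assert()s rather than a separate function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror anv_minify(): shift by the level, clamp to 1. */
uint32_t
minify(uint32_t n, uint32_t level)
{
   assert(level < 32);
   if (n == 0)
      return 0;
   uint32_t m = n >> level;
   return m > 0 ? m : 1;
}

/* True if the minified extent divides evenly by the plane's subsampling
 * denominators, i.e. the level still maps onto whole chroma samples.
 */
bool
level_is_valid_for_plane(uint32_t width, uint32_t height, uint32_t level,
                         uint32_t width_div, uint32_t height_div)
{
   assert(width_div > 0 && height_div > 0);
   return minify(width, level) % width_div == 0 &&
          minify(height, level) % height_div == 0;
}
```

For a 20x20 image with 2x2 subsampling, level 0 passes but level 2 fails,
because the minified extent is 5x5 and 5 is not divisible by 2.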
Post by Jason Ekstrand
struct blorp_batch batch;
blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
- BLORP_BATCH_PREDICATE_ENABLE);
+ predicate ? BLORP_BATCH_PREDICATE_ENABLE : 0);
struct blorp_surf surf;
get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
fast_clear_aux_usage(image, aspect),
&surf);
- surf.clear_color_addr = anv_to_blorp_address(
- anv_image_get_clear_color_addr(cmd_buffer->device, image, aspect, level));
- blorp_ccs_resolve(&batch, &surf, level, start_layer, layer_count,
- image->planes[plane].surface.isl.format, op);
+ if (ccs_op == ISL_AUX_OP_FULL_RESOLVE ||
+ ccs_op == ISL_AUX_OP_PARTIAL_RESOLVE) {
+ /* If we're doing a resolve operation, then we need the indirect clear
+ * color. The clear and ambiguate operations just stomp the CCS to a
+ * particular value and don't care about format or clear value.
+ */
+ const struct anv_address clear_color_addr =
+ anv_image_get_clear_color_addr(cmd_buffer->device, image,
+ aspect, level);
+ surf.clear_color_addr = anv_to_blorp_address(clear_color_addr);
+ }
+
+ /*
+ *
+ * "After Render target fast clear, pipe-control with color cache
+ * write-flush must be issued before sending any DRAW commands on
+ * that render target."
+ *
+ * This comment is a bit cryptic and doesn't really tell you what's going
+ * or what's really needed. It appears that fast clear ops are not
+ * properly synchronized with other drawing. This means that we cannot
+ * have a fast clear operation in the pipe at the same time as other
+ * regular drawing operations. We need to use a PIPE_CONTROL to ensure
+ * that the contents of the previous draw hit the render target before we
+ * resolve and then use a second PIPE_CONTROL after the resolve to ensure
+ * that it is completed before any additional drawing occurs.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
We now explicitly flush:
* Between the levels of a multi-level layout transition.
* Around resolves.
Is there any performance penalty associated with this coarser-grained
flushing?
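
To make the question concrete, here is a toy model of the flushing pattern.
Everything in it is an assumption made for illustration: the struct, the bit
values and the function names are hypothetical, and it assumes pending bits
are applied once before each aux op and once before the next draw. It only
counts flush packets; it says nothing about their actual cost on hardware.

```c
#include <assert.h>
#include <stdint.h>

#define PIPE_RT_CACHE_FLUSH_BIT (1u << 0)  /* hypothetical bit values */
#define PIPE_CS_STALL_BIT       (1u << 1)

struct model_cmd_buffer {
   uint32_t pending_pipe_bits;  /* flushes requested but not yet emitted */
   uint32_t flushes_emitted;    /* number of PIPE_CONTROL-like packets */
};

/* Emit one flush if anything is pending, then clear the pending bits. */
void
model_apply_pipe_flushes(struct model_cmd_buffer *cb)
{
   if (cb->pending_pipe_bits != 0) {
      cb->flushes_emitted++;
      cb->pending_pipe_bits = 0;
   }
}

/* One aux op bracketed by flush requests, as the helpers in the patch do. */
void
model_aux_op(struct model_cmd_buffer *cb)
{
   cb->pending_pipe_bits |= PIPE_RT_CACHE_FLUSH_BIT | PIPE_CS_STALL_BIT;
   model_apply_pipe_flushes(cb);  /* assumed to run before the op itself */
   cb->pending_pipe_bits |= PIPE_RT_CACHE_FLUSH_BIT | PIPE_CS_STALL_BIT;
}

/* Flushes emitted for a transition over level_count levels, then a draw. */
uint32_t
model_transition_flushes(uint32_t level_count)
{
   struct model_cmd_buffer cb = { 0, 0 };
   for (uint32_t l = 0; l < level_count; l++)
      model_aux_op(&cb);
   model_apply_pipe_flushes(&cb);  /* before the next draw */
   return cb.flushes_emitted;
}
```

Under these assumptions a transition over N levels emits N + 1 flushes, where
a single call covering all levels would emit 2.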

-Nanley
Post by Jason Ekstrand
+ switch (ccs_op) {
+ case ISL_AUX_OP_FAST_CLEAR:
+ blorp_fast_clear(&batch, &surf, surf.surf->format,
+ level, base_layer, layer_count,
+ 0, 0, level_width, level_height);
+ break;
+ case ISL_AUX_OP_FULL_RESOLVE:
+ case ISL_AUX_OP_PARTIAL_RESOLVE:
+ blorp_ccs_resolve(&batch, &surf, level, base_layer, layer_count,
+ surf.surf->format, isl_to_blorp_fast_clear_op(ccs_op));
+ break;
+ default:
+ unreachable("Unsupported CCS operation");
+ }
+
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
blorp_batch_finish(&batch);
}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index ca3644d..dc44ab6 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2533,20 +2533,19 @@ void
anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
enum blorp_hiz_op op);
-void
-anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
- const struct anv_image * const image,
- VkImageAspectFlagBits aspect,
- const uint8_t level,
- const uint32_t start_layer, const uint32_t layer_count,
- const enum blorp_fast_clear_op op);
void
-anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- const uint32_t base_level, const uint32_t level_count,
- const uint32_t base_layer, uint32_t layer_count);
+anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op mcs_op, bool predicate);
+void
+anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op ccs_op, bool predicate);
void
anv_image_copy_to_shadow(struct anv_cmd_buffer *cmd_buffer,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index ab5590d..2e7a2cc 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -689,9 +689,22 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
"define an MCS buffer.");
}
- anv_image_fast_clear(cmd_buffer, image, aspect,
- base_level, level_count,
- base_layer, layer_count);
+ if (image->samples == 1) {
+ for (uint32_t l = 0; l < level_count; l++) {
+ const uint32_t level = base_level + l;
+ const uint32_t level_layer_count =
+ MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, level_layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
+ } else {
+ assert(image->samples > 1);
+ assert(base_level == 0 && level_count == 1);
+ anv_image_mcs_op(cmd_buffer, image, aspect,
+ base_layer, layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
}
/* At this point, some elements of the CCS buffer may have the fast-clear
* bit-arrangement. As the user writes to a subresource, we need to have
@@ -760,10 +773,11 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);
- anv_ccs_resolve(cmd_buffer, image, aspect, level, base_layer, layer_count,
- image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
- BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL :
- BLORP_FAST_CLEAR_OP_RESOLVE_FULL);
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, layer_count,
+ image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
+ ISL_AUX_OP_PARTIAL_RESOLVE : ISL_AUX_OP_FULL_RESOLVE,
+ true);
genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
}
--
2.5.0.400.gff86faf
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Nanley Chery
2017-12-06 17:40:25 UTC
Post by Nanley Chery
Post by Jason Ekstrand
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op whatever that may be on CCS or MCS. This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.
+ uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ uint32_t width_div = image->format->planes[plane].denominator_scales[0];
+ uint32_t height_div = image->format->planes[plane].denominator_scales[1];
+ uint32_t level_width = anv_minify(image->extent.width, level) / width_div;
+ uint32_t level_height = anv_minify(image->extent.height, level) / height_div;
I can't find any spec text covering mipmaps and multi-planar images, but
the image level is no longer a valid YCbCr subresource if
(anv_minify(image->extent.width , level) % width_div ) != 0
(anv_minify(image->extent.height, level) % height_div) != 0
If this is an open issue, what do you think about some assertions for
this? This was a problem in the original code as well.
We're good. Lionel pointed out the relevant spec text, which limits
multi-planar images to a single mip level.

-Nanley
Nanley Chery
2017-12-06 17:56:53 UTC
Post by Nanley Chery
Post by Nanley Chery
Post by Jason Ekstrand
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op whatever that may be on CCS or MCS. This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.
+anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op mcs_op, bool predicate)
{
- assert(cmd_buffer && image);
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ assert(image->samples > 1);
+ assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, 0));
- uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ /* We don't support planar images with multisampling yet */
+ assert(image->n_planes == 1);
+
Is this true? I can't find a similar restriction in anv_formats.c.
I forgot I had another comment on the YCbCr parts of this patch. Sorry
for the multiple emails.

Lionel pointed out the spec text for this as well. According to it,
drivers aren't expected to support multisampled planar images, whereas
the code comment makes it sound like a TODO.

-Nanley
Post by Nanley Chery
Post by Nanley Chery
Post by Jason Ekstrand
+ struct blorp_batch batch;
+ blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
+ predicate ? BLORP_BATCH_PREDICATE_ENABLE : 0);
+
+ struct blorp_surf surf;
+ get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
+ fast_clear_aux_usage(image, aspect),
How about ANV_AUX_USAGE_DEFAULT instead? The fast_clear_aux_usage helper
seems only beneficial for CCS_D/CCS images. Not a big deal though.
Post by Jason Ekstrand
+ &surf);
+
+ /*
+ *
+ * "After Render target fast clear, pipe-control with color cache
+ * write-flush must be issued before sending any DRAW commands on
+ * that render target."
+ *
+ * This comment is a bit cryptic and doesn't really tell you what's going
+ * or what's really needed. It appears that fast clear ops are not
+ * properly synchronized with other drawing. This means that we cannot
+ * have a fast clear operation in the pipe at the same time as other
+ * regular drawing operations. We need to use a PIPE_CONTROL to ensure
+ * that the contents of the previous draw hit the render target before we
+ * resolve and then use a second PIPE_CONTROL after the resolve to ensure
+ * that it is completed before any additional drawing occurs.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ switch (mcs_op) {
Are you missing this case?
case ISL_AUX_OP_NONE:
return;
Seems like the NONE case is left out in a number of other switches. Was
this intentional?
Post by Jason Ekstrand
+ blorp_fast_clear(&batch, &surf, surf.surf->format,
+ 0, base_layer, layer_count,
+ 0, 0, image->extent.width, image->extent.height);
+ break;
+ unreachable("Unsupported CCS operation");
s/CCS/MCS/
Post by Jason Ekstrand
+ }
+
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ blorp_batch_finish(&batch);
+}
- /* The resolved subresource range must have a CCS buffer. */
+static enum blorp_fast_clear_op
+isl_to_blorp_fast_clear_op(enum isl_aux_op isl_op)
+{
+ switch (isl_op) {
Are you missing this case?
case ISL_AUX_OP_NONE: return BLORP_FAST_CLEAR_OP_NONE;
Post by Jason Ekstrand
+ case ISL_AUX_OP_FAST_CLEAR: return BLORP_FAST_CLEAR_OP_CLEAR;
+ case ISL_AUX_OP_FULL_RESOLVE: return BLORP_FAST_CLEAR_OP_RESOLVE_FULL;
+ case ISL_AUX_OP_PARTIAL_RESOLVE: return BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL;
+ unreachable("Unsupported HiZ aux op");
+ }
+}
+
+void
+anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op ccs_op, bool predicate)
+{
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ assert(image->samples == 1);
assert(level < anv_image_aux_levels(image, aspect));
- assert(start_layer + layer_count <=
+ assert(base_layer + layer_count <=
anv_image_aux_layers(image, aspect, level));
- assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV && image->samples == 1);
+
+ uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ uint32_t width_div = image->format->planes[plane].denominator_scales[0];
+ uint32_t height_div = image->format->planes[plane].denominator_scales[1];
+ uint32_t level_width = anv_minify(image->extent.width, level) / width_div;
+ uint32_t level_height = anv_minify(image->extent.height, level) / height_div;
I can't find any spec text covering mipmaps and multi-planar images, but
the image level is no longer a valid YCbCr subresource if
(anv_minify(image->extent.width , level) % width_div ) != 0
(anv_minify(image->extent.height, level) % height_div) != 0
If this is an open issue, what do you think about some assertions for
this? This was a problem in the original code as well.
We're good. Lionel pointed out the relevant spec text that limits the
level to one.
-Nanley
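For reference, the divisibility condition discussed above can be written as a small standalone check. This is only a sketch, not driver code: `minify_u32` is a hypothetical stand-in for what `anv_minify` is assumed to do (shift right by the level, clamp to 1), and `plane_level_divides_evenly` is a name made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for anv_minify(): shift the extent right by the
 * mip level and clamp to 1 (a zero extent stays zero). */
static uint32_t
minify_u32(uint32_t n, uint32_t level)
{
   if (n == 0)
      return 0;
   uint32_t m = n >> level;
   return m > 0 ? m : 1;
}

/* Returns true when the minified extent divides evenly by the plane's
 * subsampling denominators, i.e. both modulo expressions are zero. */
static bool
plane_level_divides_evenly(uint32_t width, uint32_t height, uint32_t level,
                           uint32_t width_div, uint32_t height_div)
{
   return minify_u32(width, level) % width_div == 0 &&
          minify_u32(height, level) % height_div == 0;
}
```

With 2x2 chroma subsampling, a 6x6 image passes at level 0 but its level 1 (3x3) does not, which is the situation the one-level limit mentioned above rules out.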
Post by Nanley Chery
Post by Jason Ekstrand
struct blorp_batch batch;
blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
- BLORP_BATCH_PREDICATE_ENABLE);
+ predicate ? BLORP_BATCH_PREDICATE_ENABLE : 0);
struct blorp_surf surf;
get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
fast_clear_aux_usage(image, aspect),
&surf);
- surf.clear_color_addr = anv_to_blorp_address(
- anv_image_get_clear_color_addr(cmd_buffer->device, image, aspect, level));
- blorp_ccs_resolve(&batch, &surf, level, start_layer, layer_count,
- image->planes[plane].surface.isl.format, op);
+ if (ccs_op == ISL_AUX_OP_FULL_RESOLVE ||
+ ccs_op == ISL_AUX_OP_PARTIAL_RESOLVE) {
+ /* If we're doing a resolve operation, then we need the indirect clear
+ * color. The clear and ambiguate operations just stomp the CCS to a
+ * particular value and don't care about format or clear value.
+ */
+ const struct anv_address clear_color_addr =
+ anv_image_get_clear_color_addr(cmd_buffer->device, image,
+ aspect, level);
+ surf.clear_color_addr = anv_to_blorp_address(clear_color_addr);
+ }
+
+ *
+ * "After Render target fast clear, pipe-control with color cache
+ * write-flush must be issued before sending any DRAW commands on
+ * that render target."
+ *
+ * This comment is a bit cryptic and doesn't really tell you what's going
+ * or what's really needed. It appears that fast clear ops are not
+ * properly synchronized with other drawing. This means that we cannot
+ * have a fast clear operation in the pipe at the same time as other
+ * regular drawing operations. We need to use a PIPE_CONTROL to ensure
+ * that the contents of the previous draw hit the render target before we
+ * resolve and then use a second PIPE_CONTROL after the resolve to ensure
+ * that it is completed before any additional drawing occurs.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
* Between the levels of a multi-level layout transition.
* Around resolves.
Is there any performance penalty associated with this coarser-grained
flushing?
-Nanley
Post by Jason Ekstrand
+ switch (ccs_op) {
+ blorp_fast_clear(&batch, &surf, surf.surf->format,
+ level, base_layer, layer_count,
+ 0, 0, level_width, level_height);
+ break;
+ blorp_ccs_resolve(&batch, &surf, level, base_layer, layer_count,
+ surf.surf->format, isl_to_blorp_fast_clear_op(ccs_op));
+ break;
+ unreachable("Unsupported CCS operation");
+ }
+
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
blorp_batch_finish(&batch);
}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index ca3644d..dc44ab6 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2533,20 +2533,19 @@ void
anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
enum blorp_hiz_op op);
-void
-anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
- const struct anv_image * const image,
- VkImageAspectFlagBits aspect,
- const uint8_t level,
- const uint32_t start_layer, const uint32_t layer_count,
- const enum blorp_fast_clear_op op);
void
-anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- const uint32_t base_level, const uint32_t level_count,
- const uint32_t base_layer, uint32_t layer_count);
+anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op mcs_op, bool predicate);
+void
+anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op ccs_op, bool predicate);
void
anv_image_copy_to_shadow(struct anv_cmd_buffer *cmd_buffer,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index ab5590d..2e7a2cc 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -689,9 +689,22 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
"define an MCS buffer.");
}
- anv_image_fast_clear(cmd_buffer, image, aspect,
- base_level, level_count,
- base_layer, layer_count);
+ if (image->samples == 1) {
+ for (uint32_t l = 0; l < level_count; l++) {
+ const uint32_t level = base_level + l;
+ const uint32_t level_layer_count =
+ MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, level_layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
+ } else {
+ assert(image->samples > 1);
+ assert(base_level == 0 && level_count == 1);
+ anv_image_mcs_op(cmd_buffer, image, aspect,
+ base_layer, layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
}
/* At this point, some elements of the CCS buffer may have the fast-clear
* bit-arrangement. As the user writes to a subresource, we need to have
@@ -760,10 +773,11 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);
- anv_ccs_resolve(cmd_buffer, image, aspect, level, base_layer, layer_count,
- image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
- BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL :
- BLORP_FAST_CLEAR_OP_RESOLVE_FULL);
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, layer_count,
+ image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
+ ISL_AUX_OP_PARTIAL_RESOLVE : ISL_AUX_OP_FULL_RESOLVE,
+ true);
genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
}
--
2.5.0.400.gff86faf
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Nanley Chery
2017-12-06 00:16:20 UTC
Post by Jason Ekstrand
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op, whatever that may be, on CCS or MCS. This
is a bit cleaner, as it separates performing the aux operation from which
blorp helper we have to call to do it.
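To illustrate the split described here, a minimal standalone sketch follows. It is not the driver code: the caller names an aux operation, and one helper decides which backend routine handles it. The enum and `ccs_backend_for_op` are hypothetical; only the grouping of fast clear versus the two resolves is taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the aux operations the CCS helper dispatches on. */
enum sketch_aux_op {
   SKETCH_AUX_OP_FAST_CLEAR,
   SKETCH_AUX_OP_FULL_RESOLVE,
   SKETCH_AUX_OP_PARTIAL_RESOLVE,
};

/* Returns the name of the backend routine a CCS op would be routed to.
 * In the patch the clear goes to blorp_fast_clear() and both resolves
 * share blorp_ccs_resolve(); this sketch only returns the name. */
static const char *
ccs_backend_for_op(enum sketch_aux_op op)
{
   switch (op) {
   case SKETCH_AUX_OP_FAST_CLEAR:
      return "blorp_fast_clear";
   case SKETCH_AUX_OP_FULL_RESOLVE:
   case SKETCH_AUX_OP_PARTIAL_RESOLVE:
      return "blorp_ccs_resolve";
   }
   return "unsupported";
}
```

The point of the pattern is that callers such as transition_color_buffer only state which operation they want; the choice of blorp entry point lives in one place.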
---
src/intel/vulkan/anv_blorp.c | 218 ++++++++++++++++++++++---------------
src/intel/vulkan/anv_private.h | 23 ++--
src/intel/vulkan/genX_cmd_buffer.c | 28 +++--
3 files changed, 165 insertions(+), 104 deletions(-)
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index e244468..7c8a673 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1439,75 +1439,6 @@ fast_clear_aux_usage(const struct anv_image *image,
}
void
-anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- const uint32_t base_level, const uint32_t level_count,
- const uint32_t base_layer, uint32_t layer_count)
-{
- assert(image->type == VK_IMAGE_TYPE_3D || image->extent.depth == 1);
-
- if (image->type == VK_IMAGE_TYPE_3D) {
- assert(base_layer == 0);
- assert(layer_count == anv_minify(image->extent.depth, base_level));
- }
-
- struct blorp_batch batch;
- blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
-
- struct blorp_surf surf;
- get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
- fast_clear_aux_usage(image, aspect),
- &surf);
-
- *
- * "After Render target fast clear, pipe-control with color cache
- * write-flush must be issued before sending any DRAW commands on
- * that render target."
- *
- * This comment is a bit cryptic and doesn't really tell you what's going
- * or what's really needed. It appears that fast clear ops are not
- * properly synchronized with other drawing. This means that we cannot
- * have a fast clear operation in the pipe at the same time as other
- * regular drawing operations. We need to use a PIPE_CONTROL to ensure
- * that the contents of the previous draw hit the render target before we
- * resolve and then use a second PIPE_CONTROL after the resolve to ensure
- * that it is completed before any additional drawing occurs.
- */
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
-
- uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
- uint32_t width_div = image->format->planes[plane].denominator_scales[0];
- uint32_t height_div = image->format->planes[plane].denominator_scales[1];
-
- for (uint32_t l = 0; l < level_count; l++) {
- const uint32_t level = base_level + l;
-
- const VkExtent3D extent = {
- .width = anv_minify(image->extent.width, level),
- .height = anv_minify(image->extent.height, level),
- .depth = anv_minify(image->extent.depth, level),
- };
-
- if (image->type == VK_IMAGE_TYPE_3D)
- layer_count = extent.depth;
-
- assert(level < anv_image_aux_levels(image, aspect));
- assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, level));
- blorp_fast_clear(&batch, &surf, surf.surf->format,
- level, base_layer, layer_count,
- 0, 0,
- extent.width / width_div,
- extent.height / height_div);
- }
-
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
-}
-
-void
anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
{
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
@@ -1681,36 +1612,153 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
}
void
-anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
- const struct anv_image * const image,
- VkImageAspectFlagBits aspect,
- const uint8_t level,
- const uint32_t start_layer, const uint32_t layer_count,
- const enum blorp_fast_clear_op op)
+anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op mcs_op, bool predicate)
{
- assert(cmd_buffer && image);
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ assert(image->samples > 1);
+ assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, 0));
- uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ /* We don't support planar images with multisampling yet */
+ assert(image->n_planes == 1);
+
+ struct blorp_batch batch;
+ blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
+ predicate ? BLORP_BATCH_PREDICATE_ENABLE : 0);
+
+ struct blorp_surf surf;
+ get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
+ fast_clear_aux_usage(image, aspect),
+ &surf);
+
+ *
+ * "After Render target fast clear, pipe-control with color cache
+ * write-flush must be issued before sending any DRAW commands on
+ * that render target."
+ *
+ * This comment is a bit cryptic and doesn't really tell you what's going
+ * or what's really needed. It appears that fast clear ops are not
+ * properly synchronized with other drawing. This means that we cannot
+ * have a fast clear operation in the pipe at the same time as other
+ * regular drawing operations. We need to use a PIPE_CONTROL to ensure
+ * that the contents of the previous draw hit the render target before we
+ * resolve and then use a second PIPE_CONTROL after the resolve to ensure
+ * that it is completed before any additional drawing occurs.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ switch (mcs_op) {
+ blorp_fast_clear(&batch, &surf, surf.surf->format,
+ 0, base_layer, layer_count,
+ 0, 0, image->extent.width, image->extent.height);
+ break;
+ unreachable("Unsupported CCS operation");
+ }
+
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ blorp_batch_finish(&batch);
+}
- /* The resolved subresource range must have a CCS buffer. */
+static enum blorp_fast_clear_op
+isl_to_blorp_fast_clear_op(enum isl_aux_op isl_op)
+{
+ switch (isl_op) {
+ case ISL_AUX_OP_FAST_CLEAR: return BLORP_FAST_CLEAR_OP_CLEAR;
+ case ISL_AUX_OP_FULL_RESOLVE: return BLORP_FAST_CLEAR_OP_RESOLVE_FULL;
+ case ISL_AUX_OP_PARTIAL_RESOLVE: return BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL;
+ unreachable("Unsupported HiZ aux op");
To align with isl_to_blorp_hiz_op() in your later patch, should the
unreachable say something like "Unsupported MCS/CCS aux op"?

-Nanley
Post by Jason Ekstrand
+ }
+}
+
+void
+anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op ccs_op, bool predicate)
+{
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ assert(image->samples == 1);
assert(level < anv_image_aux_levels(image, aspect));
- assert(start_layer + layer_count <=
+ assert(base_layer + layer_count <=
anv_image_aux_layers(image, aspect, level));
- assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV && image->samples == 1);
+
+ uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+ uint32_t width_div = image->format->planes[plane].denominator_scales[0];
+ uint32_t height_div = image->format->planes[plane].denominator_scales[1];
+ uint32_t level_width = anv_minify(image->extent.width, level) / width_div;
+ uint32_t level_height = anv_minify(image->extent.height, level) / height_div;
struct blorp_batch batch;
blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
- BLORP_BATCH_PREDICATE_ENABLE);
+ predicate ? BLORP_BATCH_PREDICATE_ENABLE : 0);
struct blorp_surf surf;
get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
fast_clear_aux_usage(image, aspect),
&surf);
- surf.clear_color_addr = anv_to_blorp_address(
- anv_image_get_clear_color_addr(cmd_buffer->device, image, aspect, level));
- blorp_ccs_resolve(&batch, &surf, level, start_layer, layer_count,
- image->planes[plane].surface.isl.format, op);
+ if (ccs_op == ISL_AUX_OP_FULL_RESOLVE ||
+ ccs_op == ISL_AUX_OP_PARTIAL_RESOLVE) {
+ /* If we're doing a resolve operation, then we need the indirect clear
+ * color. The clear and ambiguate operations just stomp the CCS to a
+ * particular value and don't care about format or clear value.
+ */
+ const struct anv_address clear_color_addr =
+ anv_image_get_clear_color_addr(cmd_buffer->device, image,
+ aspect, level);
+ surf.clear_color_addr = anv_to_blorp_address(clear_color_addr);
+ }
+
+ *
+ * "After Render target fast clear, pipe-control with color cache
+ * write-flush must be issued before sending any DRAW commands on
+ * that render target."
+ *
+ * This comment is a bit cryptic and doesn't really tell you what's going
+ * or what's really needed. It appears that fast clear ops are not
+ * properly synchronized with other drawing. This means that we cannot
+ * have a fast clear operation in the pipe at the same time as other
+ * regular drawing operations. We need to use a PIPE_CONTROL to ensure
+ * that the contents of the previous draw hit the render target before we
+ * resolve and then use a second PIPE_CONTROL after the resolve to ensure
+ * that it is completed before any additional drawing occurs.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
+
+ switch (ccs_op) {
+ blorp_fast_clear(&batch, &surf, surf.surf->format,
+ level, base_layer, layer_count,
+ 0, 0, level_width, level_height);
+ break;
+ blorp_ccs_resolve(&batch, &surf, level, base_layer, layer_count,
+ surf.surf->format, isl_to_blorp_fast_clear_op(ccs_op));
+ break;
+ unreachable("Unsupported CCS operation");
+ }
+
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
blorp_batch_finish(&batch);
}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index ca3644d..dc44ab6 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2533,20 +2533,19 @@ void
anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
enum blorp_hiz_op op);
-void
-anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
- const struct anv_image * const image,
- VkImageAspectFlagBits aspect,
- const uint8_t level,
- const uint32_t start_layer, const uint32_t layer_count,
- const enum blorp_fast_clear_op op);
void
-anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- const uint32_t base_level, const uint32_t level_count,
- const uint32_t base_layer, uint32_t layer_count);
+anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op mcs_op, bool predicate);
+void
+anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op ccs_op, bool predicate);
void
anv_image_copy_to_shadow(struct anv_cmd_buffer *cmd_buffer,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index ab5590d..2e7a2cc 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -689,9 +689,22 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
"define an MCS buffer.");
}
- anv_image_fast_clear(cmd_buffer, image, aspect,
- base_level, level_count,
- base_layer, layer_count);
+ if (image->samples == 1) {
+ for (uint32_t l = 0; l < level_count; l++) {
+ const uint32_t level = base_level + l;
+ const uint32_t level_layer_count =
+ MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, level_layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
+ } else {
+ assert(image->samples > 1);
+ assert(base_level == 0 && level_count == 1);
+ anv_image_mcs_op(cmd_buffer, image, aspect,
+ base_layer, layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
}
/* At this point, some elements of the CCS buffer may have the fast-clear
* bit-arrangement. As the user writes to a subresource, we need to have
@@ -760,10 +773,11 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);
- anv_ccs_resolve(cmd_buffer, image, aspect, level, base_layer, layer_count,
- image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
- BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL :
- BLORP_FAST_CLEAR_OP_RESOLVE_FULL);
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, layer_count,
+ image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
+ ISL_AUX_OP_PARTIAL_RESOLVE : ISL_AUX_OP_FULL_RESOLVE,
+ true);
genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
}
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:05:56 UTC
---
src/intel/vulkan/anv_image.c | 58 ++++++++++++++++++++++++++++++++++++++++++
src/intel/vulkan/anv_private.h | 5 ++++
2 files changed, 63 insertions(+)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index a872149..561da28 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -837,6 +837,64 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
unreachable("layout is not a VkImageLayout enumeration member.");
}

+/**
+ * This function returns true if the given image in the given VkImageLayout
+ * supports unresolved fast-clears.
+ *
+ * @param devinfo The device information of the Intel GPU.
+ * @param image The image that may contain a collection of buffers.
+ * @param aspect The aspect of the image to be accessed.
+ * @param layout The current layout of the image aspect(s).
+ */
+bool
+anv_layout_supports_fast_clear(const struct gen_device_info * const devinfo,
+ const struct anv_image * const image,
+ const VkImageAspectFlagBits aspect,
+ const VkImageLayout layout)
+{
+ /* The aspect must be exactly one of the image aspects. */
+ assert(_mesa_bitcount(aspect) == 1 && (aspect & image->aspects));
+
+ uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+
+ /* If there is no auxiliary surface allocated, there are no fast-clears */
+ if (image->planes[plane].aux_surface.isl.size == 0)
+ return false;
+
+ /* All images that use an auxiliary surface are required to be tiled. */
+ assert(image->tiling == VK_IMAGE_TILING_OPTIMAL);
+
+ /* Stencil has no aux */
+ assert(aspect != VK_IMAGE_ASPECT_STENCIL_BIT);
+
+ if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
+ /* For depth images (with HiZ), the layout supports fast-clears if and
+ * only if it supports HiZ.
+ */
+ enum isl_aux_usage aux_usage =
+ anv_layout_to_aux_usage(devinfo, image, aspect, layout);
+ return aux_usage == ISL_AUX_USAGE_HIZ;
+ }
+
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+
+ /* Multisample fast-clear is not yet supported. */
+ if (image->samples > 1)
+ return false;
+
+ /* The only layout which actually supports fast-clears today is
+ * VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL. Some day in the future
+ * this may change if our ability to track clear colors improves.
+ */
+ switch (layout) {
+ case VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL:
+ return true;
+
+ default:
+ return false;
+ }
+}
+

static struct anv_state
alloc_surface_state(struct anv_device *device)
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5dd95a3..461bfed 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2559,6 +2559,11 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
const struct anv_image *image,
const VkImageAspectFlagBits aspect,
const VkImageLayout layout);
+bool
+anv_layout_supports_fast_clear(const struct gen_device_info * const devinfo,
+ const struct anv_image * const image,
+ const VkImageAspectFlagBits aspect,
+ const VkImageLayout layout);

/* This is defined as a macro so that it works for both
* VkImageSubresourceRange and VkImageSubresourceLayers
--
2.5.0.400.gff86faf
Pohjolainen, Topi
2017-11-29 18:42:05 UTC
Post by Jason Ekstrand
---
src/intel/vulkan/anv_image.c | 58 ++++++++++++++++++++++++++++++++++++++++++
src/intel/vulkan/anv_private.h | 5 ++++
2 files changed, 63 insertions(+)
This seems to be pretty much in line with anv_layout_to_aux_usage(). At first I
wondered why we couldn't just call anv_layout_to_aux_usage() regardless of the
aspect, given the duplicated asserts and checks for the aux size.
But then I realized it hits unreachable() with VK_IMAGE_ASPECT_COLOR_BIT. So
Post by Jason Ekstrand
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index a872149..561da28 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -837,6 +837,64 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
unreachable("layout is not a VkImageLayout enumeration member.");
}
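+/**
+ * This function returns true if the given image in the given VkImageLayout
+ * supports unresolved fast-clears.
+ *
+ * @param devinfo The device information of the Intel GPU.
+ * @param image The image that may contain a collection of buffers.
+ * @param aspect The aspect of the image to be accessed.
+ * @param layout The current layout of the image aspect(s).
+ */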
+bool
+anv_layout_supports_fast_clear(const struct gen_device_info * const devinfo,
+ const struct anv_image * const image,
+ const VkImageAspectFlagBits aspect,
+ const VkImageLayout layout)
+{
+ /* The aspect must be exactly one of the image aspects. */
+ assert(_mesa_bitcount(aspect) == 1 && (aspect & image->aspects));
+
+ uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
+
+ /* If there is no auxiliary surface allocated, there are no fast-clears */
+ if (image->planes[plane].aux_surface.isl.size == 0)
+ return false;
+
+ /* All images that use an auxiliary surface are required to be tiled. */
+ assert(image->tiling == VK_IMAGE_TILING_OPTIMAL);
+
+ /* Stencil has no aux */
+ assert(aspect != VK_IMAGE_ASPECT_STENCIL_BIT);
+
+ if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
+ /* For depth images (with HiZ), the layout supports fast-clears if and
+ * only if it supports HiZ.
+ */
+ enum isl_aux_usage aux_usage =
+ anv_layout_to_aux_usage(devinfo, image, aspect, layout);
+ return aux_usage == ISL_AUX_USAGE_HIZ;
+ }
+
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+
+ /* Multisample fast-clear is not yet supported. */
+ if (image->samples > 1)
+ return false;
+
+ /* The only layout which actually supports fast-clears today is
+ * VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL. Some day in the future
+ * this may change if our ability to track clear colors improves.
+ */
+ switch (layout) {
+ case VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL:
+ return true;
+
+ default:
+ return false;
+ }
+}
+
static struct anv_state
alloc_surface_state(struct anv_device *device)
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5dd95a3..461bfed 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2559,6 +2559,11 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
const struct anv_image *image,
const VkImageAspectFlagBits aspect,
const VkImageLayout layout);
+bool
+anv_layout_supports_fast_clear(const struct gen_device_info * const devinfo,
+ const struct anv_image * const image,
+ const VkImageAspectFlagBits aspect,
+ const VkImageLayout layout);
/* This is defined as a macro so that it works for both
* VkImageSubresourceRange and VkImageSubresourceLayers
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:05:57 UTC
---
src/intel/vulkan/anv_image.c | 48 ++++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 561da28..7e89f75 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -748,12 +748,6 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
/* Stencil has no aux */
assert(aspect != VK_IMAGE_ASPECT_STENCIL_BIT);

- /* The following switch currently only handles depth stencil aspects.
- * TODO: Handle the color aspect.
- */
- if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV)
- return image->planes[plane].aux_usage;
-
switch (layout) {

/* Invalid Layouts */
@@ -773,28 +767,38 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,


/* Transfer Layouts
- *
- * This buffer could be a depth buffer used in a transfer operation. BLORP
- * currently doesn't use HiZ for transfer operations so we must use the main
- * buffer for this layout. TODO: Enable HiZ in BLORP.
*/
case VK_IMAGE_LAYOUT_GENERAL:
case VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL:
case VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL:
- return ISL_AUX_USAGE_NONE;
+ if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
+ /* This buffer could be a depth buffer used in a transfer operation.
+ * BLORP currently doesn't use HiZ for transfer operations so we must
+ * use the main buffer for this layout. TODO: Enable HiZ in BLORP.
+ */
+ assert(image->planes[plane].aux_usage == ISL_AUX_USAGE_HIZ);
+ return ISL_AUX_USAGE_NONE;
+ } else {
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ return image->planes[plane].aux_usage;
+ }


/* Sampling Layouts */
case VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL:
+ case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL_KHR:
assert((image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) == 0);
/* Fall-through */
case VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL:
- case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_STENCIL_ATTACHMENT_OPTIMAL_KHR:
- assert(aspect == VK_IMAGE_ASPECT_DEPTH_BIT);
- if (anv_can_sample_with_hiz(devinfo, image))
- return ISL_AUX_USAGE_HIZ;
- else
- return ISL_AUX_USAGE_NONE;
+ if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
+ if (anv_can_sample_with_hiz(devinfo, image))
+ return ISL_AUX_USAGE_HIZ;
+ else
+ return ISL_AUX_USAGE_NONE;
+ } else {
+ return image->planes[plane].aux_usage;
+ }
+

case VK_IMAGE_LAYOUT_PRESENT_SRC_KHR:
assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
@@ -819,8 +823,14 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,

/* Rendering Layouts */
case VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL:
- assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
- unreachable("Color images are not yet supported.");
+ assert(aspect & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ if (image->planes[plane].aux_usage == ISL_AUX_USAGE_NONE) {
+ assert(image->samples == 1);
+ return ISL_AUX_USAGE_CCS_D;
+ } else {
+ assert(image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_D);
+ return image->planes[plane].aux_usage;
+ }

case VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL:
case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_STENCIL_READ_ONLY_OPTIMAL_KHR:
--
2.5.0.400.gff86faf
Pohjolainen, Topi
2017-11-29 19:13:51 UTC
Post by Jason Ekstrand
---
src/intel/vulkan/anv_image.c | 48 ++++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 19 deletions(-)
I did a little grepping and it looks like anv_layout_to_aux_usage() is never
called on color images at this point (which makes sense, as it would crash
on the unreachable("")). In other words, this change can't take effect yet.
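For readers following the color path this patch adds, the COLOR_ATTACHMENT_OPTIMAL rule can be modelled on its own. This is a sketch with hypothetical names (`sketch_aux_usage`, `color_attachment_aux_usage`), not the driver's types, and it assumes the image already has an aux surface: if the plane records no aux usage, rendering still uses CCS_D; otherwise it uses whatever the plane records.

```c
#include <assert.h>

/* Hypothetical stand-in for the isl_aux_usage values named in the patch. */
enum sketch_aux_usage {
   SKETCH_AUX_NONE,
   SKETCH_AUX_CCS_D,
   SKETCH_AUX_CCS_E,
   SKETCH_AUX_MCS,
};

/* Models the color-attachment case for an image with an aux surface:
 * a plane with no recorded aux usage renders with CCS_D, any other
 * plane keeps its recorded usage. */
static enum sketch_aux_usage
color_attachment_aux_usage(enum sketch_aux_usage plane_usage)
{
   if (plane_usage == SKETCH_AUX_NONE)
      return SKETCH_AUX_CCS_D;

   /* The patch asserts that a plane never records CCS_D itself. */
   assert(plane_usage != SKETCH_AUX_CCS_D);
   return plane_usage;
}
```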
Post by Jason Ekstrand
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 561da28..7e89f75 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -748,12 +748,6 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
/* Stencil has no aux */
assert(aspect != VK_IMAGE_ASPECT_STENCIL_BIT);
- /* The following switch currently only handles depth stencil aspects.
- * TODO: Handle the color aspect.
- */
- if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV)
- return image->planes[plane].aux_usage;
-
switch (layout) {
/* Invalid Layouts */
@@ -773,28 +767,38 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
/* Transfer Layouts
- *
- * This buffer could be a depth buffer used in a transfer operation. BLORP
- * currently doesn't use HiZ for transfer operations so we must use the main
- * buffer for this layout. TODO: Enable HiZ in BLORP.
*/
- return ISL_AUX_USAGE_NONE;
+ if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
+ /* This buffer could be a depth buffer used in a transfer operation.
+ * BLORP currently doesn't use HiZ for transfer operations so we must
+ * use the main buffer for this layout. TODO: Enable HiZ in BLORP.
+ */
+ assert(image->planes[plane].aux_usage == ISL_AUX_USAGE_HIZ);
+ return ISL_AUX_USAGE_NONE;
+ } else {
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ return image->planes[plane].aux_usage;
+ }
/* Sampling Layouts */
assert((image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) == 0);
/* Fall-through */
- assert(aspect == VK_IMAGE_ASPECT_DEPTH_BIT);
- if (anv_can_sample_with_hiz(devinfo, image))
- return ISL_AUX_USAGE_HIZ;
- else
- return ISL_AUX_USAGE_NONE;
+ if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
+ if (anv_can_sample_with_hiz(devinfo, image))
+ return ISL_AUX_USAGE_HIZ;
+ else
+ return ISL_AUX_USAGE_NONE;
+ } else {
+ return image->planes[plane].aux_usage;
+ }
+
assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
@@ -819,8 +823,14 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
/* Rendering Layouts */
- assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
- unreachable("Color images are not yet supported.");
+ assert(aspect & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ if (image->planes[plane].aux_usage == ISL_AUX_USAGE_NONE) {
+ assert(image->samples == 1);
+ return ISL_AUX_USAGE_CCS_D;
+ } else {
+ assert(image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_D);
+ return image->planes[plane].aux_usage;
+ }
--
2.5.0.400.gff86faf
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Nanley Chery
2017-12-11 23:41:12 UTC
Post by Jason Ekstrand
---
src/intel/vulkan/anv_image.c | 48 ++++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 19 deletions(-)
This patch is
Post by Jason Ekstrand
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 561da28..7e89f75 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -748,12 +748,6 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
/* Stencil has no aux */
assert(aspect != VK_IMAGE_ASPECT_STENCIL_BIT);
- /* The following switch currently only handles depth stencil aspects.
- * TODO: Handle the color aspect.
- */
- if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV)
- return image->planes[plane].aux_usage;
-
switch (layout) {
/* Invalid Layouts */
@@ -773,28 +767,38 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
/* Transfer Layouts
- *
- * This buffer could be a depth buffer used in a transfer operation. BLORP
- * currently doesn't use HiZ for transfer operations so we must use the main
- * buffer for this layout. TODO: Enable HiZ in BLORP.
*/
- return ISL_AUX_USAGE_NONE;
+ if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
+ /* This buffer could be a depth buffer used in a transfer operation.
+ * BLORP currently doesn't use HiZ for transfer operations so we must
+ * use the main buffer for this layout. TODO: Enable HiZ in BLORP.
+ */
+ assert(image->planes[plane].aux_usage == ISL_AUX_USAGE_HIZ);
+ return ISL_AUX_USAGE_NONE;
+ } else {
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ return image->planes[plane].aux_usage;
+ }
/* Sampling Layouts */
assert((image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) == 0);
/* Fall-through */
- assert(aspect == VK_IMAGE_ASPECT_DEPTH_BIT);
- if (anv_can_sample_with_hiz(devinfo, image))
- return ISL_AUX_USAGE_HIZ;
- else
- return ISL_AUX_USAGE_NONE;
+ if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) {
+ if (anv_can_sample_with_hiz(devinfo, image))
+ return ISL_AUX_USAGE_HIZ;
+ else
+ return ISL_AUX_USAGE_NONE;
+ } else {
+ return image->planes[plane].aux_usage;
+ }
+
assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
@@ -819,8 +823,14 @@ anv_layout_to_aux_usage(const struct gen_device_info * const devinfo,
/* Rendering Layouts */
- assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
- unreachable("Color images are not yet supported.");
+ assert(aspect & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ if (image->planes[plane].aux_usage == ISL_AUX_USAGE_NONE) {
+ assert(image->samples == 1);
+ return ISL_AUX_USAGE_CCS_D;
+ } else {
+ assert(image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_D);
+ return image->planes[plane].aux_usage;
+ }
--
2.5.0.400.gff86faf
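For readers following along, the layout-to-aux mapping this patch introduces can be sketched outside the driver. This is a minimal sketch and not the code from anv_image.c: the enum values, struct image, and layout_to_aux_usage() are hypothetical stand-ins, the layouts are collapsed into three groups, and some of the asserts from the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the isl_aux_usage and VkImageLayout values. */
enum aux_usage { AUX_NONE, AUX_HIZ, AUX_CCS_D, AUX_CCS_E };
enum layout { LAYOUT_TRANSFER, LAYOUT_SAMPLED, LAYOUT_COLOR_ATTACHMENT };

struct image {
   bool is_depth;             /* depth aspect vs. color aspect */
   unsigned samples;
   enum aux_usage aux_usage;  /* aux usage chosen at image creation */
   bool can_sample_with_hiz;
};

static enum aux_usage
layout_to_aux_usage(const struct image *image, enum layout layout)
{
   switch (layout) {
   case LAYOUT_TRANSFER:
      /* Depth transfers go through the main buffer; color keeps its aux. */
      if (image->is_depth)
         return AUX_NONE;
      return image->aux_usage;

   case LAYOUT_SAMPLED:
      if (image->is_depth)
         return image->can_sample_with_hiz ? AUX_HIZ : AUX_NONE;
      return image->aux_usage;

   case LAYOUT_COLOR_ATTACHMENT:
      assert(!image->is_depth);
      if (image->aux_usage == AUX_NONE) {
         /* As in the patch: single-sampled images render with CCS_D. */
         assert(image->samples == 1);
         return AUX_CCS_D;
      }
      assert(image->aux_usage != AUX_CCS_D);
      return image->aux_usage;
   }
   return AUX_NONE;
}
```

The point of the sketch is only that the color attachment layout is the one place where the returned usage can differ from the usage stored on the image.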
Jason Ekstrand
2017-11-28 03:06:12 UTC
These are the same as pending_clear_aspects, but for the "load"
operation.
---
src/intel/vulkan/anv_private.h | 1 +
src/intel/vulkan/genX_cmd_buffer.c | 22 ++++++++++++++++------
2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index b881157..f4b0f90 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1670,6 +1670,7 @@ struct anv_attachment_state {

VkImageLayout current_layout;
VkImageAspectFlags pending_clear_aspects;
+ VkImageAspectFlags pending_load_aspects;
bool fast_clear;
VkClearValue clear_value;
bool clear_color_is_zero_one;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index e5e0d1c..0915d1a 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1050,26 +1050,36 @@ genX(cmd_buffer_setup_attachments)(struct anv_cmd_buffer *cmd_buffer,
struct anv_render_pass_attachment *att = &pass->attachments[i];
VkImageAspectFlags att_aspects = vk_format_aspects(att->format);
VkImageAspectFlags clear_aspects = 0;
+ VkImageAspectFlags load_aspects = 0;

if (att_aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
/* color attachment */
if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT;
+ } else if (att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
+ load_aspects |= VK_IMAGE_ASPECT_COLOR_BIT;
}
} else {
/* depthstencil attachment */
- if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
- att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
- clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
+ if (att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
+ if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
+ clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
+ } else if (att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
+ load_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
+ }
}
- if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) &&
- att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
- clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
+ if (att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
+ if (att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
+ clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
+ } else if (att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
+ load_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
+ }
}
}

state->attachments[i].current_layout = att->initial_layout;
state->attachments[i].pending_clear_aspects = clear_aspects;
+ state->attachments[i].pending_load_aspects = load_aspects;
if (clear_aspects)
state->attachments[i].clear_value = begin->pClearValues[i];
--
2.5.0.400.gff86faf
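The bookkeeping added by this patch can be illustrated in isolation. The following is a minimal sketch under stated assumptions: LOAD_OP_*, ASPECT_*, struct pending_aspects, and classify_load_ops() are hypothetical stand-ins for the Vulkan types and the loop body in cmd_buffer_setup_attachments, and the color check is simplified to a single bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkAttachmentLoadOp and VkImageAspectFlagBits. */
enum load_op { LOAD_OP_LOAD, LOAD_OP_CLEAR, LOAD_OP_DONT_CARE };

enum {
   ASPECT_COLOR   = 1u << 0,
   ASPECT_DEPTH   = 1u << 1,
   ASPECT_STENCIL = 1u << 2,
};

struct pending_aspects {
   uint32_t clear;
   uint32_t load;
};

/* Record one aspect in the clear or load mask; DONT_CARE lands in neither. */
static void
add_aspect(struct pending_aspects *p, uint32_t aspect, enum load_op op)
{
   if (op == LOAD_OP_CLEAR)
      p->clear |= aspect;
   else if (op == LOAD_OP_LOAD)
      p->load |= aspect;
}

static struct pending_aspects
classify_load_ops(uint32_t att_aspects, enum load_op load_op,
                  enum load_op stencil_load_op)
{
   struct pending_aspects p = { 0, 0 };

   if (att_aspects & ASPECT_COLOR) {
      /* In this simplified model a color format has no other aspects. */
      assert(att_aspects == ASPECT_COLOR);
      add_aspect(&p, ASPECT_COLOR, load_op);
   } else {
      /* Depth uses load_op, stencil uses its own stencil_load_op. */
      if (att_aspects & ASPECT_DEPTH)
         add_aspect(&p, ASPECT_DEPTH, load_op);
      if (att_aspects & ASPECT_STENCIL)
         add_aspect(&p, ASPECT_STENCIL, stencil_load_op);
   }
   return p;
}
```

With a depth/stencil attachment that clears depth and loads stencil, the two masks end up disjoint: depth is pending a clear, stencil is pending a load.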
Jason Ekstrand
2017-11-28 03:06:06 UTC
This is a bit less awkward than passing in the subpass because it means
we don't have to extract the subpass id from the subpass.
---
src/intel/vulkan/genX_cmd_buffer.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 6f2fa0a..56036f7 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3136,13 +3136,11 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
}
}

-
static void
cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
- struct anv_subpass *subpass)
+ uint32_t subpass_id)
{
- cmd_buffer->state.subpass = subpass;
- uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);
+ cmd_buffer->state.subpass = &cmd_buffer->state.pass->subpasses[subpass_id];

cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;

@@ -3222,7 +3220,7 @@ void genX(CmdBeginRenderPass)(

genX(flush_pipeline_select_3d)(cmd_buffer);

- cmd_buffer_begin_subpass(cmd_buffer, pass->subpasses);
+ cmd_buffer_begin_subpass(cmd_buffer, 0);
}

void genX(CmdNextSubpass)(
@@ -3236,9 +3234,9 @@ void genX(CmdNextSubpass)(

assert(cmd_buffer->level == VK_COMMAND_BUFFER_LEVEL_PRIMARY);

+ uint32_t prev_subpass = anv_get_subpass_id(&cmd_buffer->state);
cmd_buffer_end_subpass(cmd_buffer);
-
- cmd_buffer_begin_subpass(cmd_buffer, cmd_buffer->state.subpass + 1);
+ cmd_buffer_begin_subpass(cmd_buffer, prev_subpass + 1);
}

void genX(CmdEndRenderPass)(
--
2.5.0.400.gff86faf
Pohjolainen, Topi
2017-12-04 13:54:48 UTC
Post by Jason Ekstrand
This is a bit less awkward than passing in the subpass because it means
we don't have to extract the subpass id from the subpass.
---
src/intel/vulkan/genX_cmd_buffer.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 6f2fa0a..56036f7 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3136,13 +3136,11 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
}
}
-
static void
cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
- struct anv_subpass *subpass)
+ uint32_t subpass_id)
{
- cmd_buffer->state.subpass = subpass;
- uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);
+ cmd_buffer->state.subpass = &cmd_buffer->state.pass->subpasses[subpass_id];
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
@@ -3222,7 +3220,7 @@ void genX(CmdBeginRenderPass)(
genX(flush_pipeline_select_3d)(cmd_buffer);
- cmd_buffer_begin_subpass(cmd_buffer, pass->subpasses);
+ cmd_buffer_begin_subpass(cmd_buffer, 0);
}
void genX(CmdNextSubpass)(
@@ -3236,9 +3234,9 @@ void genX(CmdNextSubpass)(
assert(cmd_buffer->level == VK_COMMAND_BUFFER_LEVEL_PRIMARY);
+ uint32_t prev_subpass = anv_get_subpass_id(&cmd_buffer->state);
cmd_buffer_end_subpass(cmd_buffer);
-
- cmd_buffer_begin_subpass(cmd_buffer, cmd_buffer->state.subpass + 1);
+ cmd_buffer_begin_subpass(cmd_buffer, prev_subpass + 1);
}
void genX(CmdEndRenderPass)(
--
2.5.0.400.gff86faf
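The index-based interface from this patch can be sketched as follows. This is a rough illustration rather than the driver code: the structs are hypothetical, trimmed-down versions of the anv ones, and get_subpass_id() only assumes that the subpass id is recovered by pointer subtraction against the pass's subpass array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down versions of the driver structs. */
struct subpass { uint32_t color_count; };

struct render_pass {
   uint32_t subpass_count;
   struct subpass *subpasses;
};

struct cmd_state {
   const struct render_pass *pass;
   const struct subpass *subpass;
};

/* Recover the index of the current subpass from its address. */
static uint32_t
get_subpass_id(const struct cmd_state *state)
{
   assert(state->subpass >= state->pass->subpasses);
   uint32_t id = (uint32_t)(state->subpass - state->pass->subpasses);
   assert(id < state->pass->subpass_count);
   return id;
}

/* Index-based begin: the id is the input and the pointer is derived. */
static void
begin_subpass(struct cmd_state *state, uint32_t subpass_id)
{
   assert(subpass_id < state->pass->subpass_count);
   state->subpass = &state->pass->subpasses[subpass_id];
}

/* Advance the way CmdNextSubpass does in the patch: read the id first. */
static void
next_subpass(struct cmd_state *state)
{
   uint32_t prev_subpass = get_subpass_id(state);
   begin_subpass(state, prev_subpass + 1);
}
```

Taking the index as the parameter means the begin function never has to work backwards from a pointer, and the bounds check becomes a simple comparison.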
Jason Ekstrand
2017-11-28 03:05:58 UTC
---
src/intel/vulkan/genX_cmd_buffer.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 0c1ae83..be717eb 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -719,20 +719,19 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
if (image->samples == 1 &&
image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E &&
final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* The CCS_D buffer may not be enabled in the final layout. Continue
- * executing this function to perform a resolve.
+ /* The CCS_D buffer may not be enabled in the final layout. Call this
+ * function again with a initial layout of COLOR_ATTACHMENT_OPTIMAL
+ * to perform a resolve.
*/
anv_perf_warn(cmd_buffer->device->instance, image,
"Performing an additional resolve for CCS_D layout "
"transition. Consider always leaving it on or "
"performing an ambiguation pass.");
- } else {
- /* Writes in the final layout will be aware of the auxiliary buffer.
- * In addition, the clear buffer entries and the auxiliary buffers
- * have been populated with values that will result in correct
- * rendering.
- */
- return;
+ transition_color_buffer(cmd_buffer, image, aspect,
+ base_level, level_count,
+ base_layer, layer_count,
+ VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
+ final_layout);
}
} else if (initial_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
/* Resolves are only necessary if the subresource may contain blocks
--
2.5.0.400.gff86faf
Pohjolainen, Topi
2017-11-29 19:57:34 UTC
Post by Jason Ekstrand
---
src/intel/vulkan/genX_cmd_buffer.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 0c1ae83..be717eb 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -719,20 +719,19 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
if (image->samples == 1 &&
image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E &&
final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* The CCS_D buffer may not be enabled in the final layout. Continue
- * executing this function to perform a resolve.
+ /* The CCS_D buffer may not be enabled in the final layout. Call this
+ * function again with a initial layout of COLOR_ATTACHMENT_OPTIMAL
+ * to perform a resolve.
*/
anv_perf_warn(cmd_buffer->device->instance, image,
"Performing an additional resolve for CCS_D layout "
"transition. Consider always leaving it on or "
"performing an ambiguation pass.");
- } else {
- /* Writes in the final layout will be aware of the auxiliary buffer.
- * In addition, the clear buffer entries and the auxiliary buffers
- * have been populated with values that will result in correct
- * rendering.
- */
- return;
I must be missing something here. This now calls transition_color_buffer()
again also for the case that doesn't need resolves and after return goes
and falls thru and does resolves.
Post by Jason Ekstrand
+ transition_color_buffer(cmd_buffer, image, aspect,
+ base_level, level_count,
+ base_layer, layer_count,
+ VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
+ final_layout);
}
} else if (initial_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
/* Resolves are only necessary if the subresource may contain blocks
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-29 20:01:51 UTC
On Wed, Nov 29, 2017 at 11:57 AM, Pohjolainen, Topi <
Post by Jason Ekstrand
---
src/intel/vulkan/genX_cmd_buffer.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 0c1ae83..be717eb 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -719,20 +719,19 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
if (image->samples == 1 &&
image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E &&
final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* The CCS_D buffer may not be enabled in the final layout. Continue
- * executing this function to perform a resolve.
+ /* The CCS_D buffer may not be enabled in the final layout. Call this
+ * function again with a initial layout of COLOR_ATTACHMENT_OPTIMAL
+ * to perform a resolve.
*/
anv_perf_warn(cmd_buffer->device->instance, image,
"Performing an additional resolve for CCS_D layout "
"transition. Consider always leaving it on or "
"performing an ambiguation pass.");
- } else {
- /* Writes in the final layout will be aware of the auxiliary buffer.
- * In addition, the clear buffer entries and the auxiliary buffers
- * have been populated with values that will result in correct
- * rendering.
- */
- return;
I must be missing something here. This now calls transition_color_buffer()
again also for the case that doesn't need resolves and after return goes
and falls thru and does resolves.
Yikes! You're not missing anything. I'm missing a return statement.
Post by Jason Ekstrand
+ transition_color_buffer(cmd_buffer, image, aspect,
+ base_level, level_count,
+ base_layer, layer_count,
+ VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
+ final_layout);
}
} else if (initial_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
/* Resolves are only necessary if the subresource may contain blocks
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-29 20:04:22 UTC
Post by Jason Ekstrand
On Wed, Nov 29, 2017 at 11:57 AM, Pohjolainen, Topi <
Post by Jason Ekstrand
---
src/intel/vulkan/genX_cmd_buffer.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 0c1ae83..be717eb 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -719,20 +719,19 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
if (image->samples == 1 &&
image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E &&
final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* The CCS_D buffer may not be enabled in the final layout. Continue
- * executing this function to perform a resolve.
+ /* The CCS_D buffer may not be enabled in the final layout. Call this
+ * function again with a initial layout of COLOR_ATTACHMENT_OPTIMAL
+ * to perform a resolve.
*/
anv_perf_warn(cmd_buffer->device->instance, image,
"Performing an additional resolve for CCS_D layout "
"transition. Consider always leaving it on or "
"performing an ambiguation pass.");
- } else {
- /* Writes in the final layout will be aware of the auxiliary buffer.
- * In addition, the clear buffer entries and the auxiliary buffers
- * have been populated with values that will result in correct
- * rendering.
- */
- return;
I must be missing something here. This now calls transition_color_buffer()
again also for the case that doesn't need resolves and after return goes
and falls thru and does resolves.
Yikes! You're not missing anything. I'm missing a return statement.
Upon further inspection, it appears to get added in the next patch. I've
moved it to this one.
Post by Jason Ekstrand
+ transition_color_buffer(cmd_buffer, image, aspect,
+ base_level, level_count,
+ base_layer, layer_count,
+ VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
+ final_layout);
}
} else if (initial_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
/* Resolves are only necessary if the subresource may contain blocks
--
2.5.0.400.gff86faf
Nanley Chery
2017-12-11 23:57:58 UTC
Post by Jason Ekstrand
---
src/intel/vulkan/genX_cmd_buffer.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 0c1ae83..be717eb 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -719,20 +719,19 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
if (image->samples == 1 &&
image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E &&
final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* The CCS_D buffer may not be enabled in the final layout. Continue
- * executing this function to perform a resolve.
+ /* The CCS_D buffer may not be enabled in the final layout. Call this
+ * function again with a initial layout of COLOR_ATTACHMENT_OPTIMAL
+ * to perform a resolve.
*/
anv_perf_warn(cmd_buffer->device->instance, image,
"Performing an additional resolve for CCS_D layout "
"transition. Consider always leaving it on or "
"performing an ambiguation pass.");
- } else {
- /* Writes in the final layout will be aware of the auxiliary buffer.
- * In addition, the clear buffer entries and the auxiliary buffers
- * have been populated with values that will result in correct
- * rendering.
- */
- return;
By deleting this else, we have to build the command buffer for a no-op
resolve if the image has a CCS_E buffer. Perhaps this would fit better
in the next patch?

-Nanley
Post by Jason Ekstrand
+ transition_color_buffer(cmd_buffer, image, aspect,
+ base_level, level_count,
+ base_layer, layer_count,
+ VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
+ final_layout);
}
} else if (initial_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
/* Resolves are only necessary if the subresource may contain blocks
--
2.5.0.400.gff86faf
Nanley Chery
2017-12-12 00:13:04 UTC
Post by Nanley Chery
Post by Jason Ekstrand
---
src/intel/vulkan/genX_cmd_buffer.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 0c1ae83..be717eb 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -719,20 +719,19 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
if (image->samples == 1 &&
image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E &&
final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* The CCS_D buffer may not be enabled in the final layout. Continue
- * executing this function to perform a resolve.
+ /* The CCS_D buffer may not be enabled in the final layout. Call this
+ * function again with a initial layout of COLOR_ATTACHMENT_OPTIMAL
+ * to perform a resolve.
*/
anv_perf_warn(cmd_buffer->device->instance, image,
"Performing an additional resolve for CCS_D layout "
"transition. Consider always leaving it on or "
"performing an ambiguation pass.");
- } else {
- /* Writes in the final layout will be aware of the auxiliary buffer.
- * In addition, the clear buffer entries and the auxiliary buffers
- * have been populated with values that will result in correct
- * rendering.
- */
- return;
By deleting this else, we have to build the command buffer for a no-op
resolve if the image has a CCS_E buffer. Perhaps this would fit better
in the next patch?
Actually, the next patch doesn't seem to fix this issue...

-Nanley
Post by Nanley Chery
-Nanley
Post by Jason Ekstrand
+ transition_color_buffer(cmd_buffer, image, aspect,
+ base_level, level_count,
+ base_layer, layer_count,
+ VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
+ final_layout);
}
} else if (initial_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
/* Resolves are only necessary if the subresource may contain blocks
--
2.5.0.400.gff86faf
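The control-flow problem discussed in this sub-thread can be reproduced with a toy model. This is only a sketch and makes several assumptions: transition(), the layout enum, and resolve_count are hypothetical, the resolve is reduced to a counter, and the placement of the missing return (at the end of the undefined-layout branch) is a guess at what the fix looks like rather than something taken from the follow-up patch.

```c
#include <assert.h>
#include <stdbool.h>

enum layout { LAYOUT_UNDEFINED, LAYOUT_COLOR_ATTACHMENT, LAYOUT_SHADER_READ };

/* Counts how many resolves the toy transition emits. */
static int resolve_count;

static void
transition(enum layout initial, enum layout final, bool ccs_e,
           bool return_after_recursion)
{
   if (initial == LAYOUT_UNDEFINED) {
      /* (The real function initializes the aux surface here.) */
      if (!ccs_e && final != LAYOUT_COLOR_ATTACHMENT) {
         /* CCS_D may be off in the final layout: resolve via recursion. */
         transition(LAYOUT_COLOR_ATTACHMENT, final, ccs_e,
                    return_after_recursion);
      }
      /* Assumed location of the return the patch as posted is missing. */
      if (return_after_recursion)
         return;
   } else if (initial != LAYOUT_COLOR_ATTACHMENT) {
      /* Nothing could have been compressed or fast-cleared. */
      return;
   }

   /* Fall-through path: emit a resolve. */
   resolve_count++;
}
```

Without the return, the CCS_D case resolves twice (once in the recursive call, once on fall-through) and the CCS_E case emits one resolve that the old else branch used to skip. With the return, the counts drop to one and zero respectively.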
Jason Ekstrand
2017-11-28 03:06:02 UTC
They are static functions so there's no real need to have the genX and
it just makes the function names longer.
---
src/intel/vulkan/genX_cmd_buffer.c | 34 +++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7d040bd..6aeffa3 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -411,7 +411,7 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
* performed properly.
*/
static void
-genX(set_image_needs_resolve)(struct anv_cmd_buffer *cmd_buffer,
+set_image_needs_resolve(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect,
unsigned level, bool needs_resolve)
@@ -432,10 +432,10 @@ genX(set_image_needs_resolve)(struct anv_cmd_buffer *cmd_buffer,
}

static void
-genX(load_needs_resolve_predicate)(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- unsigned level)
+load_needs_resolve_predicate(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ unsigned level)
{
assert(cmd_buffer && image);
assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
@@ -488,8 +488,8 @@ init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
* to return incorrect data. The fast clear data in CCS_D buffers should
* be removed because CCS_D isn't enabled all the time.
*/
- genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level,
- aux_usage == ISL_AUX_USAGE_NONE);
+ set_image_needs_resolve(cmd_buffer, image, aspect, level,
+ aux_usage == ISL_AUX_USAGE_NONE);

/* The fast clear value dword(s) will be copied into a surface state object.
* Ensure that the restrictions of the fields in the dword(s) are followed.
@@ -812,12 +812,12 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
layer_count = MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
}

- genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);
+ load_needs_resolve_predicate(cmd_buffer, image, aspect, level);

anv_image_ccs_op(cmd_buffer, image, aspect, level,
base_layer, layer_count, resolve_op, true);

- genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
+ set_image_needs_resolve(cmd_buffer, image, aspect, level, false);
}

cmd_buffer->state.pending_pipe_bits |=
@@ -2992,15 +2992,15 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
* will match what's in every RENDER_SURFACE_STATE object when it's
* being used for sampling.
*/
- genX(set_image_needs_resolve)(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- false);
+ set_image_needs_resolve(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ false);
} else {
- genX(set_image_needs_resolve)(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- true);
+ set_image_needs_resolve(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ true);
}
} else if (rp_att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
/* The attachment may have been fast-cleared in a previous render
--
2.5.0.400.gff86faf
Nanley Chery
2017-11-28 19:31:41 UTC
Reply
Permalink
Raw Message
Post by Jason Ekstrand
They are static functions so there's no real need to have the genX and
it just makes the function names longer.
---
src/intel/vulkan/genX_cmd_buffer.c | 34 +++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)
This patch is
Post by Jason Ekstrand
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7d040bd..6aeffa3 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -411,7 +411,7 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
* performed properly.
*/
static void
-genX(set_image_needs_resolve)(struct anv_cmd_buffer *cmd_buffer,
+set_image_needs_resolve(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect,
unsigned level, bool needs_resolve)
@@ -432,10 +432,10 @@ genX(set_image_needs_resolve)(struct anv_cmd_buffer *cmd_buffer,
}
static void
-genX(load_needs_resolve_predicate)(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- unsigned level)
+load_needs_resolve_predicate(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ unsigned level)
{
assert(cmd_buffer && image);
assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
@@ -488,8 +488,8 @@ init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
* to return incorrect data. The fast clear data in CCS_D buffers should
* be removed because CCS_D isn't enabled all the time.
*/
- genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level,
- aux_usage == ISL_AUX_USAGE_NONE);
+ set_image_needs_resolve(cmd_buffer, image, aspect, level,
+ aux_usage == ISL_AUX_USAGE_NONE);
/* The fast clear value dword(s) will be copied into a surface state object.
* Ensure that the restrictions of the fields in the dword(s) are followed.
@@ -812,12 +812,12 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
layer_count = MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
}
- genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);
+ load_needs_resolve_predicate(cmd_buffer, image, aspect, level);
anv_image_ccs_op(cmd_buffer, image, aspect, level,
base_layer, layer_count, resolve_op, true);
- genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
+ set_image_needs_resolve(cmd_buffer, image, aspect, level, false);
}
cmd_buffer->state.pending_pipe_bits |=
@@ -2992,15 +2992,15 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
* will match what's in every RENDER_SURFACE_STATE object when it's
* being used for sampling.
*/
- genX(set_image_needs_resolve)(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- false);
+ set_image_needs_resolve(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ false);
} else {
- genX(set_image_needs_resolve)(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- true);
+ set_image_needs_resolve(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ true);
}
} else if (rp_att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
/* The attachment may have been fast-cleared in a previous render
--
2.5.0.400.gff86faf
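The reasoning behind dropping the genX() wrapper on static functions can be shown with a simplified macro. This is a sketch only: GEN_PREFIX, PASTE, and the function names below are hypothetical, and the macro is a stand-in for the real per-generation definition rather than a copy of it.

```c
#include <assert.h>

/* Hypothetical per-generation prefix; a real build would set it per target. */
#define GEN_PREFIX gen9
#define PASTE2(a, b) a##_##b
#define PASTE(a, b) PASTE2(a, b)
#define genX(name) PASTE(GEN_PREFIX, name)

/* External linkage: the same file is compiled once per generation, so the
 * symbol needs a per-generation name to avoid clashing at link time.
 */
int genX(scale)(int x) { return 4 * x; }

/* Internal linkage: each compilation gets its own private copy, so a plain
 * name cannot collide and the prefix only makes the name longer.
 */
static int helper_double(int x) { return 2 * x; }

int genX(example)(int x) { return helper_double(x) + genX(scale)(x); }
```

With GEN_PREFIX set to gen9, genX(example) expands to gen9_example, while helper_double keeps its short name in every compilation of the file.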
Jason Ekstrand
2017-11-28 03:06:13 UTC
This is quite a bit cleaner because we now sync the clear values at the
same time as we do the fast clear. For loading the clear values into
the surface state, we now do it once when we handle the LOAD_OP_LOAD
instead of every subpass.
---
src/intel/vulkan/anv_private.h | 8 ++
src/intel/vulkan/genX_cmd_buffer.c | 154 +++++++++++++------------------------
2 files changed, 60 insertions(+), 102 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index f4b0f90..4137a9a 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2823,6 +2823,14 @@ anv_subpass_view_count(const struct anv_subpass *subpass)
return MAX2(1, _mesa_bitcount(subpass->view_mask));
}

+static inline bool
+anv_subpass_att_is_color(const struct anv_subpass *subpass,
+ const VkAttachmentReference *att)
+{
+ return att >= subpass->color_attachments &&
+ att < subpass->color_attachments + subpass->color_count;
+}
+
struct anv_render_pass_attachment {
/* TODO: Consider using VkAttachmentDescription instead of storing each of
* its members individually.
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 0915d1a..7901b0c 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3096,99 +3096,6 @@ cmd_buffer_subpass_transition_layouts(struct anv_cmd_buffer * const cmd_buffer,
}
}

-/* Update the clear value dword(s) in surface state objects or the fast clear
- * state buffer entry for the color attachments used in this subpass.
- */
-static void
-cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
-{
- assert(cmd_buffer && cmd_buffer->state.subpass);
-
- const struct anv_cmd_state *state = &cmd_buffer->state;
-
- /* Iterate through every color attachment used in this subpass. */
- for (uint32_t i = 0; i < state->subpass->color_count; ++i) {
-
- /* The attachment should be one of the attachments described in the
- * render pass and used in the subpass.
- */
- const uint32_t a = state->subpass->color_attachments[i].attachment;
- if (a == VK_ATTACHMENT_UNUSED)
- continue;
-
- assert(a < state->pass->attachment_count);
-
- /* Store some information regarding this attachment. */
- const struct anv_attachment_state *att_state = &state->attachments[a];
- const struct anv_image_view *iview = state->framebuffer->attachments[a];
- const struct anv_render_pass_attachment *rp_att =
- &state->pass->attachments[a];
-
- if (att_state->aux_usage == ISL_AUX_USAGE_NONE)
- continue;
-
- /* The fast clear state entry must be updated if a fast clear is going to
- * happen. The surface state must be updated if the clear value from a
- * prior fast clear may be needed.
- */
- if (att_state->pending_clear_aspects && att_state->fast_clear) {
- /* Update the fast clear state entry. */
- genX(copy_fast_clear_dwords)(cmd_buffer, att_state->color.state,
- iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- true /* copy from ss */);
-
- /* Fast-clears impact whether or not a resolve will be necessary. */
- if (iview->image->planes[0].aux_usage == ISL_AUX_USAGE_CCS_E &&
- att_state->clear_color_is_zero) {
- /* This image always has the auxiliary buffer enabled. We can mark
- * the subresource as not needing a resolve because the clear color
- * will match what's in every RENDER_SURFACE_STATE object when it's
- * being used for sampling.
- */
- clear_image_needs_resolve_bits(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- ANV_IMAGE_HAS_FAST_CLEAR_BIT);
- } else {
- set_image_needs_resolve_bits(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- ANV_IMAGE_HAS_FAST_CLEAR_BIT);
- }
- } else if (rp_att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
- /* The attachment may have been fast-cleared in a previous render
- * pass and the value is needed now. Update the surface state(s).
- *
- * TODO: Do this only once per render pass instead of every subpass.
- */
- genX(copy_fast_clear_dwords)(cmd_buffer, att_state->color.state,
- iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- false /* copy to ss */);
-
- if (need_input_attachment_state(rp_att) &&
- att_state->input_aux_usage != ISL_AUX_USAGE_NONE) {
- genX(copy_fast_clear_dwords)(cmd_buffer, att_state->input.state,
- iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- false /* copy to ss */);
- }
- }
-
- /* We assume that if we're starting a subpass, we're going to do some
- * rendering so we may end up with compressed data.
- */
- genX(cmd_buffer_mark_image_written)(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- att_state->aux_usage,
- iview->planes[0].isl.base_level);
- }
-}
-
static void
cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
uint32_t subpass_id)
@@ -3218,15 +3125,6 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
*/
cmd_buffer_subpass_transition_layouts(cmd_buffer, false);

- /* Update clear values *after* performing automatic layout transitions.
- * This ensures that transitions from the UNDEFINED layout have had a chance
- * to populate the clear value buffer with the correct values for the
- * LOAD_OP_LOAD loadOp and that the fast-clears will update the buffer
- * without the aforementioned layout transition overwriting the fast-clear
- * value.
- */
- cmd_buffer_subpass_sync_fast_clear_values(cmd_buffer);
-
VkRect2D render_area = cmd_buffer->state.render_area;
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;

@@ -3254,6 +3152,30 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
iview->planes[0].isl.base_array_layer,
fb->layers,
ISL_AUX_OP_FAST_CLEAR, false);
+
+ genX(copy_fast_clear_dwords)(cmd_buffer, att_state->color.state,
+ image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ true /* copy from ss */);
+
+ /* Fast-clears impact whether or not a resolve will be necessary. */
+ if (image->planes[0].aux_usage == ISL_AUX_USAGE_CCS_E &&
+ att_state->clear_color_is_zero) {
+ /* This image always has the auxiliary buffer enabled. We can
+ * mark the subresource as not needing a resolve because the
+ * clear color will match what's in every RENDER_SURFACE_STATE
+ * object when it's being used for sampling.
+ */
+ clear_image_needs_resolve_bits(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT);
+ } else {
+ set_image_needs_resolve_bits(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT);
+ }
} else {
anv_image_clear_color(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
att_state->aux_usage,
@@ -3292,7 +3214,35 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
assert(att_state->pending_clear_aspects == 0);
}

+ if (att_state->pending_load_aspects) {
+ if (att_state->aux_usage != ISL_AUX_USAGE_NONE) {
+ genX(copy_fast_clear_dwords)(cmd_buffer, att_state->color.state,
+ image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ false /* copy to ss */);
+ }
+
+ if (need_input_attachment_state(&cmd_state->pass->attachments[a]) &&
+ att_state->input_aux_usage != ISL_AUX_USAGE_NONE) {
+ genX(copy_fast_clear_dwords)(cmd_buffer, att_state->input.state,
+ image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ false /* copy to ss */);
+ }
+ }
+
+ if (anv_subpass_att_is_color(subpass, &subpass->attachments[i])) {
+ /* We assume that if we're starting a subpass, we're going to do some
+ * rendering so we may end up with compressed data.
+ */
+ genX(cmd_buffer_mark_image_written)(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.base_level);
+ }
+
att_state->pending_clear_aspects = 0;
+ att_state->pending_load_aspects = 0;
}

cmd_buffer_emit_depth_stencil(cmd_buffer);
--
2.5.0.400.gff86faf
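The anv_subpass_att_is_color helper added in this patch classifies an attachment reference purely by pointer position. The sketch below illustrates that idea in isolation. The struct names are hypothetical stand-ins for VkAttachmentReference and struct anv_subpass, and it assumes (as the call with &subpass->attachments[i] suggests) that the color references form a contiguous sub-range of one larger array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VkAttachmentReference. */
struct att_ref {
   uint32_t attachment;
};

/* Hypothetical stand-in for struct anv_subpass. */
struct subpass {
   struct att_ref *attachments;       /* every reference, in one array */
   uint32_t attachment_count;
   struct att_ref *color_attachments; /* assumed to point into attachments[] */
   uint32_t color_count;
};

/* Returns true if att lies inside the color sub-range of the array. */
static inline bool
subpass_att_is_color(const struct subpass *subpass, const struct att_ref *att)
{
   /* Comparing pointers is only meaningful within the same array. */
   assert(att >= subpass->attachments &&
          att < subpass->attachments + subpass->attachment_count);

   return att >= subpass->color_attachments &&
          att < subpass->color_attachments + subpass->color_count;
}
```

With four references and a color range covering the middle two, only the pointers at indices 1 and 2 are reported as color attachments. The extra assert is an addition for the sketch and is not part of the patch.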
Jason Ekstrand
2017-11-28 03:06:16 UTC
---
src/intel/blorp/blorp.h | 5 ++
src/intel/blorp/blorp_clear.c | 106 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 111 insertions(+)

diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
index 208b2db..dda0bf9 100644
--- a/src/intel/blorp/blorp.h
+++ b/src/intel/blorp/blorp.h
@@ -215,6 +215,11 @@ blorp_ccs_resolve(struct blorp_batch *batch,
enum blorp_fast_clear_op resolve_op);

void
+blorp_ccs_ambiguate(struct blorp_batch *batch,
+ struct blorp_surf *surf,
+ uint32_t level, uint32_t layer);
+
+void
blorp_mcs_partial_resolve(struct blorp_batch *batch,
struct blorp_surf *surf,
enum isl_format format,
diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
index ec859c2..af4d1c7 100644
--- a/src/intel/blorp/blorp_clear.c
+++ b/src/intel/blorp/blorp_clear.c
@@ -931,3 +931,109 @@ blorp_mcs_partial_resolve(struct blorp_batch *batch,

batch->blorp->exec(batch, &params);
}
+
+/** Clear a CCS to the "uncompressed" state
+ *
+ * This pass is the CCS equivalent of a "HiZ resolve". It sets the CCS values
+ * for a given layer/level of a surface to 0x0 which is the "uncompressed"
+ * state which tells the sampler to go look at the main surface.
+ */
+void
+blorp_ccs_ambiguate(struct blorp_batch *batch,
+ struct blorp_surf *surf,
+ uint32_t level, uint32_t layer)
+{
+ struct blorp_params params;
+ blorp_params_init(&params);
+
+ assert(ISL_DEV_GEN(batch->blorp->isl_dev) >= 9);
+
+ const struct isl_format_layout *aux_fmtl =
+ isl_format_get_layout(surf->aux_surf->format);
+ assert(aux_fmtl->txc == ISL_TXC_CCS);
+
+ params.dst = (struct brw_blorp_surface_info) {
+ .enabled = true,
+ .addr = surf->aux_addr,
+ .view = {
+ .usage = ISL_SURF_USAGE_RENDER_TARGET_BIT,
+ .format = ISL_FORMAT_R32G32B32A32_UINT,
+ .base_level = 0,
+ .base_array_layer = 0,
+ .levels = 1,
+ .array_len = 1,
+ .swizzle = ISL_SWIZZLE_IDENTITY,
+ },
+ };
+
+ uint32_t z = 0;
+ if (surf->surf->dim == ISL_SURF_DIM_3D) {
+ z = layer;
+ layer = 0;
+ }
+
+ uint32_t offset_B, x_offset_el, y_offset_el;
+ isl_surf_get_image_offset_el(surf->aux_surf, level, layer, z,
+ &x_offset_el, &y_offset_el);
+ isl_tiling_get_intratile_offset_el(surf->aux_surf->tiling, aux_fmtl->bpb,
+ surf->aux_surf->row_pitch,
+ x_offset_el, y_offset_el,
+ &offset_B, &x_offset_el, &y_offset_el);
+ params.dst.addr.offset += offset_B;
+
+ const uint32_t width_px = minify(surf->surf->logical_level0_px.width, level);
+ const uint32_t height_px = minify(surf->surf->logical_level0_px.height, level);
+ const uint32_t width_el = DIV_ROUND_UP(width_px, aux_fmtl->bw);
+ const uint32_t height_el = DIV_ROUND_UP(height_px, aux_fmtl->bh);
+
+ /* We're going to map it as a regular RGBA32_UINT surface. We need to
+ * downscale a good deal. From the Sky Lake PRM Vol. 12 in the section on
+ * planes:
+ *
+ * "The Color Control Surface (CCS) contains the compression status
+ * of the cache-line pairs. The compression state of the cache-line
+ * pair is specified by 2 bits in the CCS. Each CCS cache-line
+ * represents an area on the main surface of 16x16 sets of 128 byte
+ * Y-tiled cache-line-pairs. CCS is always Y tiled."
+ *
+ * Each 2-bit surface element in the CCS corresponds to a single cache-line
+ * pair in the main surface. This means that 16x16 el block in the CCS
+ * maps to a Y-tiled cache line. Fortunately, CCS layouts are calculated
+ * with a very large alignment so we can round up without worrying about
+ * overdraw.
+ */
+ assert(x_offset_el % 16 == 0 && y_offset_el % 4 == 0);
+ const uint32_t x_offset_rgba_px = x_offset_el / 16;
+ const uint32_t y_offset_rgba_px = y_offset_el / 4;
+ const uint32_t width_rgba_px = DIV_ROUND_UP(width_el, 16);
+ const uint32_t height_rgba_px = DIV_ROUND_UP(height_el, 4);
+
+ MAYBE_UNUSED bool ok =
+ isl_surf_init(batch->blorp->isl_dev, &params.dst.surf,
+ .dim = ISL_SURF_DIM_2D,
+ .format = ISL_FORMAT_R32G32B32A32_UINT,
+ .width = width_rgba_px + x_offset_rgba_px,
+ .height = height_rgba_px + y_offset_rgba_px,
+ .depth = 1,
+ .levels = 1,
+ .array_len = 1,
+ .samples = 1,
+ .row_pitch = surf->aux_surf->row_pitch,
+ .usage = ISL_SURF_USAGE_RENDER_TARGET_BIT,
+ .tiling_flags = ISL_TILING_Y0_BIT);
+ assert(ok);
+
+ params.x0 = x_offset_rgba_px;
+ params.y0 = y_offset_rgba_px;
+ params.x1 = x_offset_rgba_px + width_rgba_px;
+ params.y1 = y_offset_rgba_px + height_rgba_px;
+
+ /* A CCS value of 0 means "uncompressed." */
+ memset(&params.wm_inputs.clear_color, 0,
+ sizeof(params.wm_inputs.clear_color));
+
+ if (!blorp_params_get_clear_kernel(batch->blorp, &params, true))
+ return;
+
+ batch->blorp->exec(batch, &params);
+}
--
2.5.0.400.gff86faf
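The downscaling arithmetic in blorp_ccs_ambiguate can be checked on its own. The following is a minimal sketch, not the blorp code itself: the divisors 16 and 4 are taken from the patch (one 128-bit RGBA32_UINT pixel holds 64 two-bit CCS elements, treated as 16 wide by 4 tall), while the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* CCS elements covered by one RGBA32_UINT pixel, per the patch's divisors. */
#define CCS_EL_PER_RGBA_PX_X 16
#define CCS_EL_PER_RGBA_PX_Y 4

/* Hypothetical rectangle type: x0/y0 inclusive, x1/y1 exclusive. */
struct rect {
   uint32_t x0, y0, x1, y1;
};

/* Convert an offset and size in CCS elements to a rectangle in RGBA32 pixels. */
static struct rect
ccs_el_rect_to_rgba_px(uint32_t x_offset_el, uint32_t y_offset_el,
                       uint32_t width_el, uint32_t height_el)
{
   /* Offsets must already sit on an RGBA32 pixel boundary. */
   assert(x_offset_el % CCS_EL_PER_RGBA_PX_X == 0 &&
          y_offset_el % CCS_EL_PER_RGBA_PX_Y == 0);

   const uint32_t x0 = x_offset_el / CCS_EL_PER_RGBA_PX_X;
   const uint32_t y0 = y_offset_el / CCS_EL_PER_RGBA_PX_Y;

   /* Sizes round up; the patch relies on CCS alignment to make this safe. */
   const uint32_t w = DIV_ROUND_UP(width_el, CCS_EL_PER_RGBA_PX_X);
   const uint32_t h = DIV_ROUND_UP(height_el, CCS_EL_PER_RGBA_PX_Y);

   return (struct rect) { x0, y0, x0 + w, y0 + h };
}
```

For example, an offset of (32, 8) elements with a size of 100x10 elements maps to the pixel rectangle (2, 2) to (9, 5). Rounding the size up can touch CCS elements past the requested level, which is why the patch comment leans on the large alignment of CCS layouts.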
Jason Ekstrand
2017-11-28 03:06:00 UTC
This is copied and pasted from the similar macro we added to ISL.
---
src/intel/vulkan/anv_cmd_buffer.c | 40 ++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 69acafa..7e7580c 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -323,24 +323,34 @@ VkResult anv_ResetCommandBuffer(
return anv_cmd_buffer_reset(cmd_buffer);
}

+#define anv_genX_call(devinfo, func, ...) \
+ switch ((devinfo)->gen) { \
+ case 7: \
+ if ((devinfo)->is_haswell) { \
+ gen75_##func(__VA_ARGS__); \
+ } else { \
+ gen7_##func(__VA_ARGS__); \
+ } \
+ break; \
+ case 8: \
+ gen8_##func(__VA_ARGS__); \
+ break; \
+ case 9: \
+ gen9_##func(__VA_ARGS__); \
+ break; \
+ case 10: \
+ gen10_##func(__VA_ARGS__); \
+ break; \
+ default: \
+ assert(!"Unknown hardware generation"); \
+ }
+
void
anv_cmd_buffer_emit_state_base_address(struct anv_cmd_buffer *cmd_buffer)
{
- switch (cmd_buffer->device->info.gen) {
- case 7:
- if (cmd_buffer->device->info.is_haswell)
- return gen75_cmd_buffer_emit_state_base_address(cmd_buffer);
- else
- return gen7_cmd_buffer_emit_state_base_address(cmd_buffer);
- case 8:
- return gen8_cmd_buffer_emit_state_base_address(cmd_buffer);
- case 9:
- return gen9_cmd_buffer_emit_state_base_address(cmd_buffer);
- case 10:
- return gen10_cmd_buffer_emit_state_base_address(cmd_buffer);
- default:
- unreachable("unsupported gen\n");
- }
+ anv_genX_call(&cmd_buffer->device->info,
+ cmd_buffer_emit_state_base_address,
+ cmd_buffer);
}

void anv_CmdBindPipeline(
--
2.5.0.400.gff86faf
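The dispatch pattern behind anv_genX_call is token pasting: the macro glues a per-generation prefix onto the function name and forwards the arguments. Below is a reduced sketch with hypothetical gen7_emit/gen75_emit/gen8_emit functions and a hypothetical devinfo struct. One deliberate difference from the patch is the do { } while (0) wrapper, added here so the macro behaves as a single statement after an unbraced if.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct gen_device_info. */
struct devinfo {
   int gen;
   bool is_haswell;
};

/* Hypothetical per-generation variants; each records which one ran. */
static void gen7_emit(int *out)  { *out = 70; }
static void gen75_emit(int *out) { *out = 75; }
static void gen8_emit(int *out)  { *out = 80; }

/* Paste the generation prefix onto func and forward the arguments. */
#define genX_call(devinfo, func, ...)              \
   do {                                            \
      switch ((devinfo)->gen) {                    \
      case 7:                                      \
         if ((devinfo)->is_haswell)                \
            gen75_##func(__VA_ARGS__);             \
         else                                      \
            gen7_##func(__VA_ARGS__);              \
         break;                                    \
      case 8:                                      \
         gen8_##func(__VA_ARGS__);                 \
         break;                                    \
      default:                                     \
         assert(!"Unknown hardware generation");   \
      }                                            \
   } while (0)

/* Generation-independent entry point, analogous to the wrapper in the patch. */
static int
emit_for(const struct devinfo *info)
{
   int result = 0;
   genX_call(info, emit, &result);
   return result;
}
```

Calling emit_for with gen 7 and is_haswell set reaches gen75_emit, while gen 8 reaches gen8_emit. Because the macro expands to statements rather than an expression, it only suits functions whose return value is not needed, which matches the void function converted in the patch.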
Pohjolainen, Topi
2017-11-30 16:22:09 UTC
Post by Jason Ekstrand
This is copied and pasted from the similar macro we added to ISL.
---
src/intel/vulkan/anv_cmd_buffer.c | 40 ++++++++++++++++++++++++---------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 69acafa..7e7580c 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -323,24 +323,34 @@ VkResult anv_ResetCommandBuffer(
return anv_cmd_buffer_reset(cmd_buffer);
}
+#define anv_genX_call(devinfo, func, ...) \
+ switch ((devinfo)->gen) { \
+ case 7: \
+ if ((devinfo)->is_haswell) { \
+ gen75_##func(__VA_ARGS__); \
+ } else { \
+ gen7_##func(__VA_ARGS__); \
+ } \
+ break; \
+ case 8: \
+ gen8_##func(__VA_ARGS__); \
+ break; \
+ case 9: \
+ gen9_##func(__VA_ARGS__); \
+ break; \
+ case 10: \
+ gen10_##func(__VA_ARGS__); \
+ break; \
+ default: \
+ assert(!"Unknown hardware generation"); \
+ }
+
void
anv_cmd_buffer_emit_state_base_address(struct anv_cmd_buffer *cmd_buffer)
{
- switch (cmd_buffer->device->info.gen) {
- if (cmd_buffer->device->info.is_haswell)
- return gen75_cmd_buffer_emit_state_base_address(cmd_buffer);
- else
- return gen7_cmd_buffer_emit_state_base_address(cmd_buffer);
- return gen8_cmd_buffer_emit_state_base_address(cmd_buffer);
- return gen9_cmd_buffer_emit_state_base_address(cmd_buffer);
- return gen10_cmd_buffer_emit_state_base_address(cmd_buffer);
- unreachable("unsupported gen\n");
- }
+ anv_genX_call(&cmd_buffer->device->info,
+ cmd_buffer_emit_state_base_address,
+ cmd_buffer);
}
void anv_CmdBindPipeline(
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:06:08 UTC
This is similar to blorp_gen8_hiz_clear_attachments except that it takes
actual images instead of trusting in the already set depth state.
---
src/intel/blorp/blorp.h | 11 ++++++++++
src/intel/blorp/blorp_clear.c | 50 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)

diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
index a1dd571..208b2db 100644
--- a/src/intel/blorp/blorp.h
+++ b/src/intel/blorp/blorp.h
@@ -170,6 +170,17 @@ blorp_can_hiz_clear_depth(uint8_t gen, enum isl_format format,
uint32_t num_samples,
uint32_t x0, uint32_t y0,
uint32_t x1, uint32_t y1);
+void
+blorp_hiz_clear_depth_stencil(struct blorp_batch *batch,
+ const struct blorp_surf *depth,
+ const struct blorp_surf *stencil,
+ uint32_t level,
+ uint32_t start_layer, uint32_t num_layers,
+ uint32_t x0, uint32_t y0,
+ uint32_t x1, uint32_t y1,
+ bool clear_depth, float depth_value,
+ bool clear_stencil, uint8_t stencil_value);
+

void
blorp_gen8_hiz_clear_attachments(struct blorp_batch *batch,
diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
index 8e7bc9f..ec859c2 100644
--- a/src/intel/blorp/blorp_clear.c
+++ b/src/intel/blorp/blorp_clear.c
@@ -612,6 +612,56 @@ blorp_can_hiz_clear_depth(uint8_t gen, enum isl_format format,
return true;
}

+void
+blorp_hiz_clear_depth_stencil(struct blorp_batch *batch,
+ const struct blorp_surf *depth,
+ const struct blorp_surf *stencil,
+ uint32_t level,
+ uint32_t start_layer, uint32_t num_layers,
+ uint32_t x0, uint32_t y0,
+ uint32_t x1, uint32_t y1,
+ bool clear_depth, float depth_value,
+ bool clear_stencil, uint8_t stencil_value)
+{
+ struct blorp_params params;
+ blorp_params_init(&params);
+
+ /* This requires WM_HZ_OP which only exists on gen8+ */
+ assert(ISL_DEV_GEN(batch->blorp->isl_dev) >= 8);
+
+ params.hiz_op = BLORP_HIZ_OP_DEPTH_CLEAR;
+ params.num_layers = 1;
+
+ params.x0 = x0;
+ params.y0 = y0;
+ params.x1 = x1;
+ params.y1 = y1;
+
+ for (uint32_t l = 0; l < num_layers; l++) {
+ const uint32_t layer = start_layer + l;
+ if (clear_stencil) {
+ brw_blorp_surface_info_init(batch->blorp, &params.stencil, stencil,
+ level, layer,
+ ISL_FORMAT_UNSUPPORTED, true);
+ params.stencil_mask = 0xff;
+ params.stencil_ref = stencil_value;
+ params.num_samples = params.stencil.surf.samples;
+ }
+
+ if (clear_depth) {
+ brw_blorp_surface_info_init(batch->blorp, &params.depth, depth,
+ level, layer,
+ ISL_FORMAT_UNSUPPORTED, true);
+ params.depth.clear_color.f32[0] = depth_value;
+ params.depth_format =
+ isl_format_get_depth_format(depth->surf->format, false);
+ params.num_samples = params.depth.surf.samples;
+ }
+
+ batch->blorp->exec(batch, &params);
+ }
+}
+
/* Given a depth stencil attachment, this function performs a fast depth clear
* on a depth portion and a regular clear on the stencil portion. When
* performing a fast depth clear on the depth portion, the HiZ buffer is simply
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:05:54 UTC
---
src/intel/vulkan/anv_blorp.c | 38 ++++++++++++++++++++++----------------
src/intel/vulkan/anv_private.h | 9 +++++----
src/intel/vulkan/genX_cmd_buffer.c | 11 ++++++-----
3 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index f10adf0..da273d6 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1568,26 +1568,30 @@ anv_image_copy_to_shadow(struct anv_cmd_buffer *cmd_buffer,
blorp_batch_finish(&batch);
}

-void
-anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- enum blorp_hiz_op op)
+static enum blorp_hiz_op
+isl_to_blorp_hiz_op(enum isl_aux_op isl_op)
{
- assert(image);
+ switch (isl_op) {
+ case ISL_AUX_OP_FAST_CLEAR: return BLORP_HIZ_OP_DEPTH_CLEAR;
+ case ISL_AUX_OP_FULL_RESOLVE: return BLORP_HIZ_OP_DEPTH_RESOLVE;
+ case ISL_AUX_OP_AMBIGUATE: return BLORP_HIZ_OP_HIZ_RESOLVE;
+ default:
+ unreachable("Unsupported HiZ aux op");
+ }
+}

+void
+anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op hiz_op)
+{
+ assert(aspect == VK_IMAGE_ASPECT_DEPTH_BIT);
+ assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, 0));
assert(anv_image_aspect_to_plane(image->aspects,
VK_IMAGE_ASPECT_DEPTH_BIT) == 0);

- /* Don't resolve depth buffers without an auxiliary HiZ buffer and
- * don't perform such a resolve on gens that don't support it.
- */
- if (cmd_buffer->device->info.gen < 8 ||
- image->planes[0].aux_usage != ISL_AUX_USAGE_HIZ)
- return;
-
- assert(op == BLORP_HIZ_OP_HIZ_RESOLVE ||
- op == BLORP_HIZ_OP_DEPTH_RESOLVE);
-
struct blorp_batch batch;
blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);

@@ -1597,7 +1601,9 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
ISL_AUX_USAGE_HIZ, &surf);
surf.clear_color.f32[0] = ANV_HZ_FC_VAL;

- blorp_hiz_op(&batch, &surf, 0, 0, 1, op);
+ blorp_hiz_op(&batch, &surf, level, base_layer, layer_count,
+ isl_to_blorp_hiz_op(hiz_op));
+
blorp_batch_finish(&batch);
}

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index dc44ab6..5dd95a3 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2530,10 +2530,11 @@ anv_can_sample_with_hiz(const struct gen_device_info * const devinfo,
}

void
-anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- enum blorp_hiz_op op);
-
+anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op hiz_op);
void
anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 2e7a2cc..0c1ae83 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -388,19 +388,20 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
anv_layout_to_aux_usage(&cmd_buffer->device->info, image,
VK_IMAGE_ASPECT_DEPTH_BIT, final_layout);

- enum blorp_hiz_op hiz_op;
+ enum isl_aux_op hiz_op;
if (hiz_enabled && !enable_hiz) {
- hiz_op = BLORP_HIZ_OP_DEPTH_RESOLVE;
+ hiz_op = ISL_AUX_OP_FULL_RESOLVE;
} else if (!hiz_enabled && enable_hiz) {
- hiz_op = BLORP_HIZ_OP_HIZ_RESOLVE;
+ hiz_op = ISL_AUX_OP_AMBIGUATE;
} else {
assert(hiz_enabled == enable_hiz);
/* If the same buffer will be used, no resolves are necessary. */
- hiz_op = BLORP_HIZ_OP_NONE;
+ hiz_op = ISL_AUX_OP_NONE;
}

if (hiz_op != BLORP_HIZ_OP_NONE)
- anv_gen8_hiz_op_resolve(cmd_buffer, image, hiz_op);
+ anv_image_hiz_op(cmd_buffer, image, VK_IMAGE_ASPECT_DEPTH_BIT,
+ 0, 0, 1, hiz_op);
}

#define MI_PREDICATE_SRC0 0x2400
--
2.5.0.400.gff86faf
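The patch splits the HiZ handling into two steps: transition_depth_buffer picks a generic aux operation, and isl_to_blorp_hiz_op translates it into the blorp-specific one. A minimal sketch of that split follows. The enums are hypothetical mirrors of isl_aux_op and blorp_hiz_op, and their numeric values are assumptions apart from both NONE values being 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of enum isl_aux_op. */
enum aux_op {
   AUX_OP_NONE = 0,
   AUX_OP_FAST_CLEAR,
   AUX_OP_FULL_RESOLVE,
   AUX_OP_AMBIGUATE,
};

/* Hypothetical mirror of enum blorp_hiz_op. */
enum hiz_op {
   HIZ_OP_NONE = 0,
   HIZ_OP_DEPTH_CLEAR,
   HIZ_OP_DEPTH_RESOLVE,
   HIZ_OP_HIZ_RESOLVE,
};

/* Pick the aux operation needed when HiZ usage changes across a transition. */
static enum aux_op
hiz_transition_op(bool hiz_enabled, bool enable_hiz)
{
   if (hiz_enabled && !enable_hiz)
      return AUX_OP_FULL_RESOLVE; /* leaving HiZ: make the depth buffer valid */
   if (!hiz_enabled && enable_hiz)
      return AUX_OP_AMBIGUATE;    /* entering HiZ: make the HiZ buffer valid */
   return AUX_OP_NONE;            /* same usage on both sides */
}

/* Translate a generic aux operation into the HiZ-specific one. */
static enum hiz_op
aux_to_hiz_op(enum aux_op op)
{
   switch (op) {
   case AUX_OP_FAST_CLEAR:   return HIZ_OP_DEPTH_CLEAR;
   case AUX_OP_FULL_RESOLVE: return HIZ_OP_DEPTH_RESOLVE;
   case AUX_OP_AMBIGUATE:    return HIZ_OP_HIZ_RESOLVE;
   default:
      assert(!"Unsupported HiZ aux op");
      return HIZ_OP_NONE;
   }
}
```

A caller would compare the result of hiz_transition_op against AUX_OP_NONE, in the same enum type, and only then call aux_to_hiz_op, since the translation has no case for NONE.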
Pohjolainen, Topi
2017-11-28 13:56:11 UTC
Post by Jason Ekstrand
---
src/intel/vulkan/anv_blorp.c | 38 ++++++++++++++++++++++----------------
src/intel/vulkan/anv_private.h | 9 +++++----
src/intel/vulkan/genX_cmd_buffer.c | 11 ++++++-----
3 files changed, 33 insertions(+), 25 deletions(-)
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index f10adf0..da273d6 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1568,26 +1568,30 @@ anv_image_copy_to_shadow(struct anv_cmd_buffer *cmd_buffer,
blorp_batch_finish(&batch);
}
-void
-anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- enum blorp_hiz_op op)
+static enum blorp_hiz_op
+isl_to_blorp_hiz_op(enum isl_aux_op isl_op)
{
- assert(image);
+ switch (isl_op) {
+ case ISL_AUX_OP_FAST_CLEAR: return BLORP_HIZ_OP_DEPTH_CLEAR;
+ case ISL_AUX_OP_FULL_RESOLVE: return BLORP_HIZ_OP_DEPTH_RESOLVE;
+ case ISL_AUX_OP_AMBIGUATE: return BLORP_HIZ_OP_HIZ_RESOLVE;
+ unreachable("Unsupported HiZ aux op");
+ }
+}
+void
+anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op hiz_op)
+{
+ assert(aspect == VK_IMAGE_ASPECT_DEPTH_BIT);
+ assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, 0));
assert(anv_image_aspect_to_plane(image->aspects,
VK_IMAGE_ASPECT_DEPTH_BIT) == 0);
- /* Don't resolve depth buffers without an auxiliary HiZ buffer and
- * don't perform such a resolve on gens that don't support it.
- */
- if (cmd_buffer->device->info.gen < 8 ||
- image->planes[0].aux_usage != ISL_AUX_USAGE_HIZ)
- return;
-
- assert(op == BLORP_HIZ_OP_HIZ_RESOLVE ||
- op == BLORP_HIZ_OP_DEPTH_RESOLVE);
-
struct blorp_batch batch;
blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
@@ -1597,7 +1601,9 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
ISL_AUX_USAGE_HIZ, &surf);
surf.clear_color.f32[0] = ANV_HZ_FC_VAL;
- blorp_hiz_op(&batch, &surf, 0, 0, 1, op);
+ blorp_hiz_op(&batch, &surf, level, base_layer, layer_count,
+ isl_to_blorp_hiz_op(hiz_op));
+
blorp_batch_finish(&batch);
}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index dc44ab6..5dd95a3 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2530,10 +2530,11 @@ anv_can_sample_with_hiz(const struct gen_device_info * const devinfo,
}
void
-anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- enum blorp_hiz_op op);
-
+anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op hiz_op);
void
anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 2e7a2cc..0c1ae83 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -388,19 +388,20 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
anv_layout_to_aux_usage(&cmd_buffer->device->info, image,
VK_IMAGE_ASPECT_DEPTH_BIT, final_layout);
- enum blorp_hiz_op hiz_op;
+ enum isl_aux_op hiz_op;
if (hiz_enabled && !enable_hiz) {
- hiz_op = BLORP_HIZ_OP_DEPTH_RESOLVE;
+ hiz_op = ISL_AUX_OP_FULL_RESOLVE;
} else if (!hiz_enabled && enable_hiz) {
- hiz_op = BLORP_HIZ_OP_HIZ_RESOLVE;
+ hiz_op = ISL_AUX_OP_AMBIGUATE;
} else {
assert(hiz_enabled == enable_hiz);
/* If the same buffer will be used, no resolves are necessary. */
- hiz_op = BLORP_HIZ_OP_NONE;
+ hiz_op = ISL_AUX_OP_NONE;
}
if (hiz_op != BLORP_HIZ_OP_NONE)
This should be ISL_AUX_OP_NONE.
Post by Jason Ekstrand
- anv_gen8_hiz_op_resolve(cmd_buffer, image, hiz_op);
+ anv_image_hiz_op(cmd_buffer, image, VK_IMAGE_ASPECT_DEPTH_BIT,
+ 0, 0, 1, hiz_op);
}
#define MI_PREDICATE_SRC0 0x2400
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 15:16:52 UTC
On Tue, Nov 28, 2017 at 5:56 AM, Pohjolainen, Topi <
Post by Jason Ekstrand
Post by Jason Ekstrand
---
src/intel/vulkan/anv_blorp.c | 38 ++++++++++++++++++++++----------------
src/intel/vulkan/anv_private.h | 9 +++++----
src/intel/vulkan/genX_cmd_buffer.c | 11 ++++++-----
3 files changed, 33 insertions(+), 25 deletions(-)
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index f10adf0..da273d6 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1568,26 +1568,30 @@ anv_image_copy_to_shadow(struct anv_cmd_buffer *cmd_buffer,
blorp_batch_finish(&batch);
}
-void
-anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- enum blorp_hiz_op op)
+static enum blorp_hiz_op
+isl_to_blorp_hiz_op(enum isl_aux_op isl_op)
{
- assert(image);
+ switch (isl_op) {
+ case ISL_AUX_OP_FAST_CLEAR: return BLORP_HIZ_OP_DEPTH_CLEAR;
+ case ISL_AUX_OP_FULL_RESOLVE: return BLORP_HIZ_OP_DEPTH_RESOLVE;
+ case ISL_AUX_OP_AMBIGUATE: return BLORP_HIZ_OP_HIZ_RESOLVE;
+ unreachable("Unsupported HiZ aux op");
+ }
+}
+void
+anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op hiz_op)
+{
+ assert(aspect == VK_IMAGE_ASPECT_DEPTH_BIT);
+ assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, 0));
assert(anv_image_aspect_to_plane(image->aspects,
VK_IMAGE_ASPECT_DEPTH_BIT) == 0);
- /* Don't resolve depth buffers without an auxiliary HiZ buffer and
- * don't perform such a resolve on gens that don't support it.
- */
- if (cmd_buffer->device->info.gen < 8 ||
- image->planes[0].aux_usage != ISL_AUX_USAGE_HIZ)
- return;
-
- assert(op == BLORP_HIZ_OP_HIZ_RESOLVE ||
- op == BLORP_HIZ_OP_DEPTH_RESOLVE);
-
struct blorp_batch batch;
blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
@@ -1597,7 +1601,9 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
ISL_AUX_USAGE_HIZ, &surf);
surf.clear_color.f32[0] = ANV_HZ_FC_VAL;
- blorp_hiz_op(&batch, &surf, 0, 0, 1, op);
+ blorp_hiz_op(&batch, &surf, level, base_layer, layer_count,
+ isl_to_blorp_hiz_op(hiz_op));
+
blorp_batch_finish(&batch);
}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index dc44ab6..5dd95a3 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2530,10 +2530,11 @@ anv_can_sample_with_hiz(const struct gen_device_info * const devinfo,
}
void
-anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- enum blorp_hiz_op op);
-
+anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op hiz_op);
void
anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 2e7a2cc..0c1ae83 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -388,19 +388,20 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
anv_layout_to_aux_usage(&cmd_buffer->device->info, image,
VK_IMAGE_ASPECT_DEPTH_BIT, final_layout);
- enum blorp_hiz_op hiz_op;
+ enum isl_aux_op hiz_op;
if (hiz_enabled && !enable_hiz) {
- hiz_op = BLORP_HIZ_OP_DEPTH_RESOLVE;
+ hiz_op = ISL_AUX_OP_FULL_RESOLVE;
} else if (!hiz_enabled && enable_hiz) {
- hiz_op = BLORP_HIZ_OP_HIZ_RESOLVE;
+ hiz_op = ISL_AUX_OP_AMBIGUATE;
} else {
assert(hiz_enabled == enable_hiz);
/* If the same buffer will be used, no resolves are necessary. */
- hiz_op = BLORP_HIZ_OP_NONE;
+ hiz_op = ISL_AUX_OP_NONE;
}
if (hiz_op != BLORP_HIZ_OP_NONE)
This should be ISL_AUX_OP_NONE.
Thanks! Fixed locally. Fortunately, it was never a problem because both
are 0, but it's a good catch nonetheless.
Post by Jason Ekstrand
Post by Jason Ekstrand
- anv_gen8_hiz_op_resolve(cmd_buffer, image, hiz_op);
+ anv_image_hiz_op(cmd_buffer, image, VK_IMAGE_ASPECT_DEPTH_BIT,
+ 0, 0, 1, hiz_op);
}
#define MI_PREDICATE_SRC0 0x2400
--
2.5.0.400.gff86faf
Nanley Chery
2017-12-06 00:48:18 UTC
Post by Jason Ekstrand
---
src/intel/vulkan/anv_blorp.c | 38 ++++++++++++++++++++++----------------
src/intel/vulkan/anv_private.h | 9 +++++----
src/intel/vulkan/genX_cmd_buffer.c | 11 ++++++-----
3 files changed, 33 insertions(+), 25 deletions(-)
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index f10adf0..da273d6 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1568,26 +1568,30 @@ anv_image_copy_to_shadow(struct anv_cmd_buffer *cmd_buffer,
blorp_batch_finish(&batch);
}
-void
-anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- enum blorp_hiz_op op)
+static enum blorp_hiz_op
+isl_to_blorp_hiz_op(enum isl_aux_op isl_op)
{
- assert(image);
+ switch (isl_op) {
Is the NONE case missing?
Post by Jason Ekstrand
+ case ISL_AUX_OP_FAST_CLEAR: return BLORP_HIZ_OP_DEPTH_CLEAR;
+ case ISL_AUX_OP_FULL_RESOLVE: return BLORP_HIZ_OP_DEPTH_RESOLVE;
+ case ISL_AUX_OP_AMBIGUATE: return BLORP_HIZ_OP_HIZ_RESOLVE;
+ unreachable("Unsupported HiZ aux op");
+ }
+}
+void
+anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op hiz_op)
+{
+ assert(aspect == VK_IMAGE_ASPECT_DEPTH_BIT);
Assuming this will handle fast depth/stencil clears, do we want to
future-proof this assert by replacing `==` with `&`?
Post by Jason Ekstrand
+ assert(base_layer + layer_count <= anv_image_aux_layers(image, aspect, 0));
^
I think we should replace `0` with `level`.

-Nanley
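For illustration, a minimal standalone sketch of the two assert suggestions above. Everything in it is a hypothetical stand-in (struct fake_image, the helper names, the per-level layer counts); only the aspect bit values are taken from Vulkan. It is not the driver code, just a way to see what changes when `==` becomes `&` and when the bounds check uses `level` instead of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in aspect bits; the values match VK_IMAGE_ASPECT_DEPTH_BIT and
 * VK_IMAGE_ASPECT_STENCIL_BIT.
 */
enum aspect_bits {
   ASPECT_DEPTH_BIT   = 0x2,
   ASPECT_STENCIL_BIT = 0x4,
};

#define MAX_LEVELS 4

/* Hypothetical image: number of aux layers available at each miplevel. */
struct fake_image {
   uint32_t aux_layers[MAX_LEVELS];
};

/* `==` accepts only a depth-only aspect mask. */
static bool
aspect_ok_strict(uint32_t aspect)
{
   return aspect == ASPECT_DEPTH_BIT;
}

/* `&` also accepts depth combined with other aspect bits. */
static bool
aspect_ok_masked(uint32_t aspect)
{
   return (aspect & ASPECT_DEPTH_BIT) != 0;
}

/* Bounds check against the level being operated on rather than level 0. */
static bool
layers_in_range(const struct fake_image *image, uint32_t level,
                uint32_t base_layer, uint32_t layer_count)
{
   assert(level < MAX_LEVELS);
   return base_layer + layer_count <= image->aux_layers[level];
}
```

With per-level counts of {8, 4, 2, 1}, the range [4, 8) passes when checked against level 0 but fails when checked against level 2, which is the kind of case a hard-coded 0 would let through.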
Post by Jason Ekstrand
assert(anv_image_aspect_to_plane(image->aspects,
VK_IMAGE_ASPECT_DEPTH_BIT) == 0);
- /* Don't resolve depth buffers without an auxiliary HiZ buffer and
- * don't perform such a resolve on gens that don't support it.
- */
- if (cmd_buffer->device->info.gen < 8 ||
- image->planes[0].aux_usage != ISL_AUX_USAGE_HIZ)
- return;
-
- assert(op == BLORP_HIZ_OP_HIZ_RESOLVE ||
- op == BLORP_HIZ_OP_DEPTH_RESOLVE);
-
struct blorp_batch batch;
blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
@@ -1597,7 +1601,9 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
ISL_AUX_USAGE_HIZ, &surf);
surf.clear_color.f32[0] = ANV_HZ_FC_VAL;
- blorp_hiz_op(&batch, &surf, 0, 0, 1, op);
+ blorp_hiz_op(&batch, &surf, level, base_layer, layer_count,
+ isl_to_blorp_hiz_op(hiz_op));
+
blorp_batch_finish(&batch);
}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index dc44ab6..5dd95a3 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2530,10 +2530,11 @@ anv_can_sample_with_hiz(const struct gen_device_info * const devinfo,
}
void
-anv_gen8_hiz_op_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- enum blorp_hiz_op op);
-
+anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect, uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ enum isl_aux_op hiz_op);
void
anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 2e7a2cc..0c1ae83 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -388,19 +388,20 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
anv_layout_to_aux_usage(&cmd_buffer->device->info, image,
VK_IMAGE_ASPECT_DEPTH_BIT, final_layout);
- enum blorp_hiz_op hiz_op;
+ enum isl_aux_op hiz_op;
if (hiz_enabled && !enable_hiz) {
- hiz_op = BLORP_HIZ_OP_DEPTH_RESOLVE;
+ hiz_op = ISL_AUX_OP_FULL_RESOLVE;
} else if (!hiz_enabled && enable_hiz) {
- hiz_op = BLORP_HIZ_OP_HIZ_RESOLVE;
+ hiz_op = ISL_AUX_OP_AMBIGUATE;
} else {
assert(hiz_enabled == enable_hiz);
/* If the same buffer will be used, no resolves are necessary. */
- hiz_op = BLORP_HIZ_OP_NONE;
+ hiz_op = ISL_AUX_OP_NONE;
}
if (hiz_op != BLORP_HIZ_OP_NONE)
- anv_gen8_hiz_op_resolve(cmd_buffer, image, hiz_op);
+ anv_image_hiz_op(cmd_buffer, image, VK_IMAGE_ASPECT_DEPTH_BIT,
+ 0, 0, 1, hiz_op);
}
#define MI_PREDICATE_SRC0 0x2400
--
2.5.0.400.gff86faf
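As a standalone illustration of the "NONE case" question, here is a minimal sketch of the op selection in transition_depth_buffer and of one possible mapping that handles NONE explicitly. The enums are stand-ins that reuse the names from the quoted patch (only the values that appear there, with arbitrary numbering), and the two function names are hypothetical. Note that in the quoted hunk the unchanged context line still compares hiz_op against BLORP_HIZ_OP_NONE even though its type is now enum isl_aux_op; the sketch compares like with like.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in enums: names follow the quoted patch, values are arbitrary. */
enum isl_aux_op {
   ISL_AUX_OP_NONE,
   ISL_AUX_OP_FAST_CLEAR,
   ISL_AUX_OP_FULL_RESOLVE,
   ISL_AUX_OP_AMBIGUATE,
};

enum blorp_hiz_op {
   BLORP_HIZ_OP_NONE,
   BLORP_HIZ_OP_DEPTH_CLEAR,
   BLORP_HIZ_OP_DEPTH_RESOLVE,
   BLORP_HIZ_OP_HIZ_RESOLVE,
};

/* Hypothetical variant of the mapping with an explicit NONE case. */
static enum blorp_hiz_op
isl_to_blorp_hiz_op_sketch(enum isl_aux_op isl_op)
{
   switch (isl_op) {
   case ISL_AUX_OP_NONE:         return BLORP_HIZ_OP_NONE;
   case ISL_AUX_OP_FAST_CLEAR:   return BLORP_HIZ_OP_DEPTH_CLEAR;
   case ISL_AUX_OP_FULL_RESOLVE: return BLORP_HIZ_OP_DEPTH_RESOLVE;
   case ISL_AUX_OP_AMBIGUATE:    return BLORP_HIZ_OP_HIZ_RESOLVE;
   }
   assert(!"Unsupported HiZ aux op");
   return BLORP_HIZ_OP_NONE;
}

/* Same decision as the if/else chain in the quoted hunk:
 * HiZ on -> off needs a full resolve, off -> on needs an ambiguate,
 * and an unchanged state needs nothing.
 */
static enum isl_aux_op
pick_hiz_transition_op(bool hiz_enabled, bool enable_hiz)
{
   if (hiz_enabled && !enable_hiz)
      return ISL_AUX_OP_FULL_RESOLVE;
   else if (!hiz_enabled && enable_hiz)
      return ISL_AUX_OP_AMBIGUATE;
   else
      return ISL_AUX_OP_NONE;
}
```

Whether the mapping should accept NONE or the caller should filter it out first (as the `!= NONE` guard does) is a design choice; the sketch shows the former only to make the question concrete.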
Jason Ekstrand
2017-11-28 03:06:14 UTC
---
src/intel/vulkan/genX_cmd_buffer.c | 187 +++++++++++++------------------------
1 file changed, 65 insertions(+), 122 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7901b0c..2c4ab38 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2982,120 +2982,6 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
cmd_buffer->state.hiz_enabled = info.hiz_usage == ISL_AUX_USAGE_HIZ;
}

-
-/**
- * @brief Perform any layout transitions required at the beginning and/or end
- * of the current subpass for depth buffers.
- *
- * TODO: Consider preprocessing the attachment reference array at render pass
- * create time to determine if no layout transition is needed at the
- * beginning and/or end of each subpass.
- *
- * @param cmd_buffer The command buffer the transition is happening within.
- * @param subpass_end If true, marks that the transition is happening at the
- * end of the subpass.
- */
-static void
-cmd_buffer_subpass_transition_layouts(struct anv_cmd_buffer * const cmd_buffer,
- const bool subpass_end)
-{
- /* We need a non-NULL command buffer. */
- assert(cmd_buffer);
-
- const struct anv_cmd_state * const cmd_state = &cmd_buffer->state;
- const struct anv_subpass * const subpass = cmd_state->subpass;
-
- /* This function must be called within a subpass. */
- assert(subpass);
-
- /* If there are attachment references, the array shouldn't be NULL.
- */
- if (subpass->attachment_count > 0)
- assert(subpass->attachments);
-
- /* Iterate over the array of attachment references. */
- for (const VkAttachmentReference *att_ref = subpass->attachments;
- att_ref < subpass->attachments + subpass->attachment_count; att_ref++) {
-
- /* If the attachment is unused, we can't perform a layout transition. */
- if (att_ref->attachment == VK_ATTACHMENT_UNUSED)
- continue;
-
- /* This attachment index shouldn't go out of bounds. */
- assert(att_ref->attachment < cmd_state->pass->attachment_count);
-
- const struct anv_render_pass_attachment * const att_desc =
- &cmd_state->pass->attachments[att_ref->attachment];
- struct anv_attachment_state * const att_state =
- &cmd_buffer->state.attachments[att_ref->attachment];
-
- /* The attachment should not be used in a subpass after its last. */
- assert(att_desc->last_subpass_idx >= anv_get_subpass_id(cmd_state));
-
- if (subpass_end && anv_get_subpass_id(cmd_state) <
- att_desc->last_subpass_idx) {
- /* We're calling this function on a buffer twice in one subpass and
- * this is not the last use of the buffer. The layout should not have
- * changed from the first call and no transition is necessary.
- */
- assert(att_state->current_layout == att_ref->layout ||
- att_state->current_layout ==
- VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL);
- continue;
- }
-
- /* The attachment index must be less than the number of attachments
- * within the framebuffer.
- */
- assert(att_ref->attachment < cmd_state->framebuffer->attachment_count);
-
- const struct anv_image_view * const iview =
- cmd_state->framebuffer->attachments[att_ref->attachment];
- const struct anv_image * const image = iview->image;
-
- /* Get the appropriate target layout for this attachment. */
- VkImageLayout target_layout;
-
- /* A resolve is necessary before use as an input attachment if the clear
- * color or auxiliary buffer usage isn't supported by the sampler.
- */
- const bool input_needs_resolve =
- (att_state->fast_clear && !att_state->clear_color_is_zero_one) ||
- att_state->input_aux_usage != att_state->aux_usage;
- if (subpass_end) {
- target_layout = att_desc->final_layout;
- } else if (iview->aspect_mask & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV &&
- !input_needs_resolve) {
- /* Layout transitions before the final only help to enable sampling as
- * an input attachment. If the input attachment supports sampling
- * using the auxiliary surface, we can skip such transitions by making
- * the target layout one that is CCS-aware.
- */
- target_layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
- } else {
- target_layout = att_ref->layout;
- }
-
- /* Perform the layout transition. */
- if (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
- transition_depth_buffer(cmd_buffer, image,
- att_state->current_layout, target_layout);
- att_state->aux_usage =
- anv_layout_to_aux_usage(&cmd_buffer->device->info, image,
- VK_IMAGE_ASPECT_DEPTH_BIT, target_layout);
- } else if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
- assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
- transition_color_buffer(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level, 1,
- iview->planes[0].isl.base_array_layer,
- iview->planes[0].isl.array_len,
- att_state->current_layout, target_layout);
- }
-
- att_state->current_layout = target_layout;
- }
-}
-
static void
cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
uint32_t subpass_id)
@@ -3120,11 +3006,6 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
cmd_buffer->state.pending_pipe_bits |=
cmd_buffer->state.pass->subpass_flushes[subpass_id];

- /* Perform transitions to the subpass layout before any writes have
- * occurred.
- */
- cmd_buffer_subpass_transition_layouts(cmd_buffer, false);
-
VkRect2D render_area = cmd_buffer->state.render_area;
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;

@@ -3139,6 +3020,39 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
struct anv_image_view *iview = fb->attachments[a];
const struct anv_image *image = iview->image;

+ /* A resolve is necessary before use as an input attachment if the clear
+ * color or auxiliary buffer usage isn't supported by the sampler.
+ */
+ const bool input_needs_resolve =
+ (att_state->fast_clear && !att_state->clear_color_is_zero_one) ||
+ att_state->input_aux_usage != att_state->aux_usage;
+
+ VkImageLayout target_layout;
+ if (iview->aspect_mask & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV &&
+ !input_needs_resolve) {
+ /* Layout transitions before the final only help to enable sampling
+ * as an input attachment. If the input attachment supports sampling
+ * using the auxiliary surface, we can skip such transitions by
+ * making the target layout one that is CCS-aware.
+ */
+ target_layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
+ } else {
+ target_layout = subpass->attachments[i].layout;
+ }
+
+ if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ transition_color_buffer(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level, 1,
+ iview->planes[0].isl.base_array_layer,
+ iview->planes[0].isl.array_len,
+ att_state->current_layout, target_layout);
+ } else if (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
+ transition_depth_buffer(cmd_buffer, image,
+ att_state->current_layout, target_layout);
+ }
+ att_state->current_layout = target_layout;
+
if (att_state->pending_clear_aspects & VK_IMAGE_ASPECT_COLOR_BIT) {
assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);

@@ -3251,13 +3165,42 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
static void
cmd_buffer_end_subpass(struct anv_cmd_buffer *cmd_buffer)
{
+ struct anv_cmd_state *cmd_state = &cmd_buffer->state;
+ struct anv_subpass *subpass = cmd_state->subpass;
uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);

anv_cmd_buffer_resolve_subpass(cmd_buffer);

- /* Perform transitions to the final layout after all writes have occurred.
- */
- cmd_buffer_subpass_transition_layouts(cmd_buffer, true);
+ struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
+ for (uint32_t i = 0; i < subpass->attachment_count; ++i) {
+ const uint32_t a = subpass->attachments[i].attachment;
+ if (a == VK_ATTACHMENT_UNUSED)
+ continue;
+
+ if (cmd_state->pass->attachments[a].last_subpass_idx != subpass_id)
+ continue;
+
+ assert(a < cmd_state->pass->attachment_count);
+ struct anv_attachment_state *att_state = &cmd_state->attachments[a];
+ struct anv_image_view *iview = fb->attachments[a];
+ const struct anv_image *image = iview->image;
+
+ /* Transition the image into the final layout for this render pass */
+ VkImageLayout target_layout =
+ cmd_state->pass->attachments[a].final_layout;
+
+ if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ transition_color_buffer(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level, 1,
+ iview->planes[0].isl.base_array_layer,
+ iview->planes[0].isl.array_len,
+ att_state->current_layout, target_layout);
+ } else if (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
+ transition_depth_buffer(cmd_buffer, image,
+ att_state->current_layout, target_layout);
+ }
+ }

/* Accumulate any subpass flushes that need to happen after the subpass.
* Yes, they do get accumulated twice in the NextSubpass case but since
--
2.5.0.400.gff86faf
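The two decisions this patch moves into begin/end_subpass can be sketched in isolation. The code below is a simplified model, not the driver: the layout enum, struct att_state_sketch, and the function names are hypothetical stand-ins, and aux usages are plain ints. It shows (1) how the begin-subpass target layout is chosen and (2) the end-subpass filter that only transitions attachments whose last use is the current subpass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a few VkImageLayout values. */
enum layout_sketch {
   LAYOUT_UNDEFINED,
   LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
   LAYOUT_SHADER_READ_ONLY_OPTIMAL,
   LAYOUT_PRESENT_SRC,
};

/* Hypothetical, trimmed-down per-attachment state. */
struct att_state_sketch {
   bool is_color;
   bool fast_clear;
   bool clear_color_is_zero_one;
   int aux_usage;
   int input_aux_usage;
   uint32_t last_subpass_idx;
};

/* A resolve is needed before input-attachment use if the sampler cannot
 * handle the clear color or the aux usage.
 */
static bool
input_needs_resolve(const struct att_state_sketch *att)
{
   return (att->fast_clear && !att->clear_color_is_zero_one) ||
          att->input_aux_usage != att->aux_usage;
}

/* Begin of subpass: color attachments that need no resolve stay in a
 * CCS-aware layout; everything else goes to the subpass's layout.
 */
static enum layout_sketch
begin_subpass_target_layout(const struct att_state_sketch *att,
                            enum layout_sketch subpass_layout)
{
   if (att->is_color && !input_needs_resolve(att))
      return LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
   return subpass_layout;
}

/* End of subpass: only the last subpass that uses an attachment
 * transitions it to the render pass's final layout.
 */
static bool
end_subpass_needs_final_transition(const struct att_state_sketch *att,
                                   uint32_t subpass_id)
{
   return att->last_subpass_idx == subpass_id;
}
```

Splitting the old subpass_end flag into two separate call sites is what lets each loop carry only the checks it needs.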
Jason Ekstrand
2017-11-28 18:07:44 UTC
This patch causes a perf drop in sascha gears. I'm investigating.
Jason Ekstrand
2017-11-28 18:13:52 UTC
Post by Jason Ekstrand
This patch causes a perf drop in sascha gears. I'm investigating.
Found it! Read below.
Post by Jason Ekstrand
Post by Jason Ekstrand
---
src/intel/vulkan/genX_cmd_buffer.c | 187 +++++++++++++------------------------
1 file changed, 65 insertions(+), 122 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7901b0c..2c4ab38 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2982,120 +2982,6 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
cmd_buffer->state.hiz_enabled = info.hiz_usage == ISL_AUX_USAGE_HIZ;
}
-
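-/**
- * @brief Perform any layout transitions required at the beginning and/or end
- * of the current subpass for depth buffers.
- *
- * TODO: Consider preprocessing the attachment reference array at render pass
- * create time to determine if no layout transition is needed at the
- * beginning and/or end of each subpass.
- *
- * @param cmd_buffer The command buffer the transition is happening within.
- * @param subpass_end If true, marks that the transition is happening at the
- * end of the subpass.
- */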
-static void
-cmd_buffer_subpass_transition_layouts(struct anv_cmd_buffer * const cmd_buffer,
- const bool subpass_end)
-{
- /* We need a non-NULL command buffer. */
- assert(cmd_buffer);
-
- const struct anv_cmd_state * const cmd_state = &cmd_buffer->state;
- const struct anv_subpass * const subpass = cmd_state->subpass;
-
- /* This function must be called within a subpass. */
- assert(subpass);
-
- /* If there are attachment references, the array shouldn't be NULL.
- */
- if (subpass->attachment_count > 0)
- assert(subpass->attachments);
-
- /* Iterate over the array of attachment references. */
- for (const VkAttachmentReference *att_ref = subpass->attachments;
- att_ref < subpass->attachments + subpass->attachment_count; att_ref++) {
-
- /* If the attachment is unused, we can't perform a layout transition. */
- if (att_ref->attachment == VK_ATTACHMENT_UNUSED)
- continue;
-
- /* This attachment index shouldn't go out of bounds. */
- assert(att_ref->attachment < cmd_state->pass->attachment_count);
-
- const struct anv_render_pass_attachment * const att_desc =
- &cmd_state->pass->attachments[att_ref->attachment];
- struct anv_attachment_state * const att_state =
- &cmd_buffer->state.attachments[att_ref->attachment];
-
- /* The attachment should not be used in a subpass after its last. */
- assert(att_desc->last_subpass_idx >= anv_get_subpass_id(cmd_state));
-
- if (subpass_end && anv_get_subpass_id(cmd_state) <
- att_desc->last_subpass_idx) {
- /* We're calling this function on a buffer twice in one subpass and
- * this is not the last use of the buffer. The layout should not have
- * changed from the first call and no transition is necessary.
- */
- assert(att_state->current_layout == att_ref->layout ||
- att_state->current_layout ==
- VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL);
- continue;
- }
-
- /* The attachment index must be less than the number of attachments
- * within the framebuffer.
- */
- assert(att_ref->attachment < cmd_state->framebuffer->attachment_count);
-
- const struct anv_image_view * const iview =
- cmd_state->framebuffer->attachments[att_ref->attachment];
- const struct anv_image * const image = iview->image;
-
- /* Get the appropriate target layout for this attachment. */
- VkImageLayout target_layout;
-
- /* A resolve is necessary before use as an input attachment if the clear
- * color or auxiliary buffer usage isn't supported by the sampler.
- */
- const bool input_needs_resolve =
- (att_state->fast_clear && !att_state->clear_color_is_zero_one) ||
- att_state->input_aux_usage != att_state->aux_usage;
- if (subpass_end) {
- target_layout = att_desc->final_layout;
- } else if (iview->aspect_mask & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV &&
- !input_needs_resolve) {
- /* Layout transitions before the final only help to enable sampling as
- * an input attachment. If the input attachment supports sampling
- * using the auxiliary surface, we can skip such transitions by making
- * the target layout one that is CCS-aware.
- */
- target_layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
- } else {
- target_layout = att_ref->layout;
- }
-
- /* Perform the layout transition. */
- if (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
- transition_depth_buffer(cmd_buffer, image,
- att_state->current_layout, target_layout);
- att_state->aux_usage =
- anv_layout_to_aux_usage(&cmd_buffer->device->info, image,
- VK_IMAGE_ASPECT_DEPTH_BIT, target_layout);
- } else if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
- assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
- transition_color_buffer(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level, 1,
- iview->planes[0].isl.base_array_layer,
- iview->planes[0].isl.array_len,
- att_state->current_layout, target_layout);
- }
-
- att_state->current_layout = target_layout;
- }
-}
-
static void
cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
uint32_t subpass_id)
@@ -3120,11 +3006,6 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
cmd_buffer->state.pending_pipe_bits |=
cmd_buffer->state.pass->subpass_flushes[subpass_id];
- /* Perform transitions to the subpass layout before any writes have
- * occurred.
- */
- cmd_buffer_subpass_transition_layouts(cmd_buffer, false);
-
VkRect2D render_area = cmd_buffer->state.render_area;
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
@@ -3139,6 +3020,39 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
struct anv_image_view *iview = fb->attachments[a];
const struct anv_image *image = iview->image;
+ /* A resolve is necessary before use as an input attachment if the clear
+ * color or auxiliary buffer usage isn't supported by the sampler.
+ */
+ const bool input_needs_resolve =
+ (att_state->fast_clear && !att_state->clear_color_is_zero_one) ||
+ att_state->input_aux_usage != att_state->aux_usage;
+
+ VkImageLayout target_layout;
+ if (iview->aspect_mask & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV &&
+ !input_needs_resolve) {
+ /* Layout transitions before the final only help to enable sampling
+ * as an input attachment. If the input attachment supports sampling
+ * using the auxiliary surface, we can skip such transitions by
+ * making the target layout one that is CCS-aware.
+ */
+ target_layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
+ } else {
+ target_layout = subpass->attachments[i].layout;
+ }
+
+ if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ transition_color_buffer(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level, 1,
+ iview->planes[0].isl.base_array_layer,
+ iview->planes[0].isl.array_len,
+ att_state->current_layout, target_layout);
+ } else if (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
+ transition_depth_buffer(cmd_buffer, image,
+ att_state->current_layout, target_layout);
I accidentally dropped a bit here:

+ att_state->aux_usage =
+ anv_layout_to_aux_usage(&cmd_buffer->device->info, image,
+ VK_IMAGE_ASPECT_DEPTH_BIT, target_layout);

I've added it locally.
Post by Jason Ekstrand
+ }
+ att_state->current_layout = target_layout;
+
if (att_state->pending_clear_aspects & VK_IMAGE_ASPECT_COLOR_BIT) {
assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);
@@ -3251,13 +3165,42 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
static void
cmd_buffer_end_subpass(struct anv_cmd_buffer *cmd_buffer)
{
+ struct anv_cmd_state *cmd_state = &cmd_buffer->state;
+ struct anv_subpass *subpass = cmd_state->subpass;
uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);
anv_cmd_buffer_resolve_subpass(cmd_buffer);
- /* Perform transitions to the final layout after all writes have occurred.
- */
- cmd_buffer_subpass_transition_layouts(cmd_buffer, true);
+ struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
+ for (uint32_t i = 0; i < subpass->attachment_count; ++i) {
+ const uint32_t a = subpass->attachments[i].attachment;
+ if (a == VK_ATTACHMENT_UNUSED)
+ continue;
+
+ if (cmd_state->pass->attachments[a].last_subpass_idx != subpass_id)
+ continue;
+
+ assert(a < cmd_state->pass->attachment_count);
+ struct anv_attachment_state *att_state = &cmd_state->attachments[a];
+ struct anv_image_view *iview = fb->attachments[a];
+ const struct anv_image *image = iview->image;
+
+ /* Transition the image into the final layout for this render pass */
+ VkImageLayout target_layout =
+ cmd_state->pass->attachments[a].final_layout;
+
+ if (image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV) {
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ transition_color_buffer(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level, 1,
+ iview->planes[0].isl.base_array_layer,
+ iview->planes[0].isl.array_len,
+ att_state->current_layout, target_layout);
+ } else if (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
+ transition_depth_buffer(cmd_buffer, image,
+ att_state->current_layout, target_layout);
+ }
+ }
/* Accumulate any subpass flushes that need to happen after the subpass.
* Yes, they do get accumulated twice in the NextSubpass case but since
--
2.5.0.400.gff86faf
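A small model of the dropped statement may help show why it matters. This is a sketch under stated assumptions: the enums, struct depth_att_sketch, and both functions are hypothetical, and the layout-to-aux mapping is deliberately simplified (it is not anv_layout_to_aux_usage). The point is only that the cached aux_usage has to be recomputed whenever the depth attachment's layout changes; otherwise later code keeps reading the old value, which is presumably how the regression came about.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for layouts and aux usages. */
enum layout_sketch {
   LAYOUT_UNDEFINED,
   LAYOUT_DEPTH_ATTACHMENT_OPTIMAL,
   LAYOUT_TRANSFER_SRC,
};

enum aux_usage_sketch {
   AUX_USAGE_NONE,
   AUX_USAGE_HIZ,
};

struct depth_att_sketch {
   enum layout_sketch current_layout;
   enum aux_usage_sketch aux_usage;
};

/* Simplified assumption: only the depth-attachment layout uses HiZ. */
static enum aux_usage_sketch
layout_to_aux_usage_sketch(enum layout_sketch layout)
{
   return layout == LAYOUT_DEPTH_ATTACHMENT_OPTIMAL ? AUX_USAGE_HIZ
                                                    : AUX_USAGE_NONE;
}

/* Models the depth branch at the start of a subpass. With
 * refresh_aux_usage == false it behaves like the patch as posted;
 * with true it behaves like the locally fixed version.
 */
static void
begin_subpass_depth_sketch(struct depth_att_sketch *att,
                           enum layout_sketch target_layout,
                           bool refresh_aux_usage)
{
   /* The real code would call transition_depth_buffer() here. */
   if (refresh_aux_usage)
      att->aux_usage = layout_to_aux_usage_sketch(target_layout);
   att->current_layout = target_layout;
}
```

In this model, an attachment that starts with AUX_USAGE_NONE and moves to the depth-attachment layout only ends up with AUX_USAGE_HIZ when the refresh is performed.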
Jason Ekstrand
2017-11-28 03:06:01 UTC
Currently, this helper does nothing but we call it every place where an
image is written through the render pipeline. This will allow us to
properly mark the aux state so that we can handle resolves correctly.
---
src/intel/vulkan/anv_blorp.c | 36 ++++++++++++++++++++++++++++++++++++
src/intel/vulkan/anv_cmd_buffer.c | 12 ++++++++++++
src/intel/vulkan/anv_genX.h | 6 ++++++
src/intel/vulkan/anv_private.h | 7 +++++++
src/intel/vulkan/genX_cmd_buffer.c | 17 +++++++++++++++++
5 files changed, 78 insertions(+)
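The calling pattern the commit message describes (mark the destination as written before every render-pipeline write) can be modelled in a few lines. This is only a sketch: struct fake_cmd_buffer and both functions are hypothetical, and unlike the helper in this patch, which does nothing yet, the stand-in counts its calls so the pattern is observable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command buffer that only records "written" marks. */
struct fake_cmd_buffer {
   uint32_t writes_marked;
   uint32_t last_level;
};

/* Stand-in for the mark-written hook. The helper in the patch is a no-op
 * for now; this one records the call so it can be checked.
 */
static void
mark_image_written_sketch(struct fake_cmd_buffer *cmd_buffer,
                          uint32_t aspect, int aux_usage, uint32_t level)
{
   (void)aspect;
   (void)aux_usage;
   cmd_buffer->writes_marked++;
   cmd_buffer->last_level = level;
}

/* Models a clear over a range of miplevels: one mark per level,
 * issued before the write itself.
 */
static void
clear_levels_sketch(struct fake_cmd_buffer *cmd_buffer,
                    uint32_t base_level, uint32_t level_count)
{
   for (uint32_t l = 0; l < level_count; l++) {
      mark_image_written_sketch(cmd_buffer, 0x1 /* color aspect */,
                                0 /* aux usage placeholder */,
                                base_level + l);
      /* The actual write (e.g. a blorp clear) would go here. */
   }
}
```

Putting the hook in place first and filling in its body later keeps the follow-up patches small, since every write site is already wired up.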

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index da273d6..46e2eb0 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -271,6 +271,7 @@ void anv_CmdCopyImage(

assert(anv_image_aspects_compatible(src_mask, dst_mask));

+ const uint32_t dst_level = pRegions[r].dstSubresource.mipLevel;
if (_mesa_bitcount(src_mask) > 1) {
uint32_t aspect_bit;
anv_foreach_image_aspect_bit(aspect_bit, src_image, src_mask) {
@@ -281,6 +282,10 @@ void anv_CmdCopyImage(
get_blorp_surf_for_anv_image(cmd_buffer->device,
dst_image, 1UL << aspect_bit,
ANV_AUX_USAGE_DEFAULT, &dst_surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image,
+ 1UL << aspect_bit,
+ dst_surf.aux_usage,
+ dst_level);

for (unsigned i = 0; i < layer_count; i++) {
blorp_copy(&batch, &src_surf, pRegions[r].srcSubresource.mipLevel,
@@ -298,6 +303,8 @@ void anv_CmdCopyImage(
ANV_AUX_USAGE_DEFAULT, &src_surf);
get_blorp_surf_for_anv_image(cmd_buffer->device, dst_image, dst_mask,
ANV_AUX_USAGE_DEFAULT, &dst_surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image, dst_mask,
+ dst_surf.aux_usage, dst_level);

for (unsigned i = 0; i < layer_count; i++) {
blorp_copy(&batch, &src_surf, pRegions[r].srcSubresource.mipLevel,
@@ -387,6 +394,12 @@ copy_buffer_to_image(struct anv_cmd_buffer *cmd_buffer,
extent.width, extent.height,
buffer_row_pitch, buffer_format,
&buffer.surf, &buffer_isl_surf);
+ if (dst->surf.aux_usage != ISL_AUX_USAGE_NONE) {
+ assert(dst == &image);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, anv_image,
+ aspect, dst->surf.aux_usage,
+ dst->level);
+ }

for (unsigned z = 0; z < extent.depth; z++) {
blorp_copy(&batch, &src->surf, src->level, src->offset.z,
@@ -497,6 +510,10 @@ void anv_CmdBlitImage(
get_blorp_surf_for_anv_image(cmd_buffer->device,
dst_image, dst_res->aspectMask,
ANV_AUX_USAGE_DEFAULT, &dst);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image,
+ dst_res->aspectMask,
+ dst.aux_usage,
+ dst_res->mipLevel);

struct anv_format_plane src_format =
anv_get_format_plane(&cmd_buffer->device->info, src_image->vk_format,
@@ -820,6 +837,10 @@ void anv_CmdClearColorImage(
layer_count = anv_minify(image->extent.depth, level);
}

+ anv_cmd_buffer_mark_image_written(cmd_buffer, image,
+ pRanges[r].aspectMask,
+ surf.aux_usage, level);
+
blorp_clear(&batch, &surf,
src_format.isl_format, src_format.swizzle,
level, base_layer, layer_count,
@@ -1215,6 +1236,11 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
} else {
assert(image->n_planes == 1);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.base_level);
+
blorp_clear(&batch, &surf, iview->planes[0].isl.format,
anv_swizzle_for_render(iview->planes[0].isl.swizzle),
iview->planes[0].isl.base_level,
@@ -1355,6 +1381,8 @@ resolve_image(struct anv_device *device,
uint32_t src_x, uint32_t src_y, uint32_t dst_x, uint32_t dst_y,
uint32_t width, uint32_t height)
{
+ struct anv_cmd_buffer *cmd_buffer = batch->driver_batch;
+
assert(src_image->type == VK_IMAGE_TYPE_2D);
assert(src_image->samples > 1);
assert(dst_image->type == VK_IMAGE_TYPE_2D);
@@ -1369,6 +1397,10 @@ resolve_image(struct anv_device *device,
ANV_AUX_USAGE_DEFAULT, &src_surf);
get_blorp_surf_for_anv_image(device, dst_image, 1UL << aspect_bit,
ANV_AUX_USAGE_DEFAULT, &dst_surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image,
+ 1UL << aspect_bit,
+ dst_surf.aux_usage,
+ dst_level);

assert(!src_image->format->can_ycbcr);
assert(!dst_image->format->can_ycbcr);
@@ -1498,6 +1530,10 @@ anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
get_blorp_surf_for_anv_image(cmd_buffer->device, dst_iview->image,
VK_IMAGE_ASPECT_COLOR_BIT,
dst_aux_usage, &dst_surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ dst_surf.aux_usage,
+ dst_iview->planes[0].isl.base_level);

assert(!src_iview->image->format->can_ycbcr);
assert(!dst_iview->image->format->can_ycbcr);
diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 7e7580c..2c9e919 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -353,6 +353,18 @@ anv_cmd_buffer_emit_state_base_address(struct anv_cmd_buffer *cmd_buffer)
cmd_buffer);
}

+void
+anv_cmd_buffer_mark_image_written(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ unsigned level)
+{
+ anv_genX_call(&cmd_buffer->device->info,
+ cmd_buffer_mark_image_written,
+ cmd_buffer, image, aspect, aux_usage, level);
+}
+
void anv_CmdBindPipeline(
VkCommandBuffer commandBuffer,
VkPipelineBindPoint pipelineBindPoint,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 0b7322e..85b893e 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -58,6 +58,12 @@ void genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer);
void genX(cmd_buffer_enable_pma_fix)(struct anv_cmd_buffer *cmd_buffer,
bool enable);

+void genX(cmd_buffer_mark_image_written)(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ unsigned level);
+
void
genX(emit_urb_setup)(struct anv_device *device, struct anv_batch *batch,
const struct gen_l3_config *l3_config,
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 461bfed..f805246 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2530,6 +2530,13 @@ anv_can_sample_with_hiz(const struct gen_device_info * const devinfo,
}

void
+anv_cmd_buffer_mark_image_written(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ unsigned level);
+
+void
anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect, uint32_t level,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 65cc85d..7d040bd 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -460,6 +460,15 @@ genX(load_needs_resolve_predicate)(struct anv_cmd_buffer *cmd_buffer,
}
}

+void
+genX(cmd_buffer_mark_image_written)(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ unsigned level)
+{
+}
+
static void
init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
@@ -3014,6 +3023,14 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
false /* copy to ss */);
}
}
+
+ /* We assume that if we're starting a subpass, we're going to do some
+ * rendering so we may end up with compressed data.
+ */
+ genX(cmd_buffer_mark_image_written)(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.base_level);
}
}
--
2.5.0.400.gff86faf
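The wiring this patch sets up (a generation-independent entry point that forwards to a per-generation hook whose body is still empty) can be sketched as follows. This is only an illustration under stated assumptions: `struct cmd_buffer`, `enum aux_usage`, the `gen8_`/`gen9_` functions, and the `hook_calls` counter are hypothetical stand-ins, not the driver's real types, and the explicit switch merely approximates what a dispatch macro such as anv_genX_call does.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's types; these are not the real
 * anv/isl definitions. */
enum aux_usage { AUX_USAGE_NONE, AUX_USAGE_CCS_D, AUX_USAGE_CCS_E };

struct cmd_buffer {
   int gen;              /* hardware generation this buffer targets */
   unsigned hook_calls;  /* illustration only: counts hook invocations */
};

/* Per-generation hooks. In the patch the body is empty; here it only
 * counts calls so that the wiring can be observed. */
static void
gen8_mark_image_written(struct cmd_buffer *cmd, enum aux_usage aux_usage,
                        unsigned level)
{
   (void)aux_usage;
   (void)level;
   cmd->hook_calls++;
}

static void
gen9_mark_image_written(struct cmd_buffer *cmd, enum aux_usage aux_usage,
                        unsigned level)
{
   (void)aux_usage;
   (void)level;
   cmd->hook_calls++;
}

/* Generation-independent entry point that every write site calls just
 * before it emits the actual copy, blit, clear, or resolve. */
static void
mark_image_written(struct cmd_buffer *cmd, enum aux_usage aux_usage,
                   unsigned level)
{
   assert(cmd);

   switch (cmd->gen) {
   case 8:
      gen8_mark_image_written(cmd, aux_usage, level);
      break;
   case 9:
      gen9_mark_image_written(cmd, aux_usage, level);
      break;
   default:
      assert(!"unsupported generation");
   }
}
```

Adding the call sites first and filling in the hook body in a later patch keeps each patch small: this one can be reviewed purely for whether every write path is covered.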
Pohjolainen, Topi
2017-11-30 16:20:51 UTC
Post by Jason Ekstrand
Currently, this helper does nothing but we call it every place where an
image is written through the render pipeline. This will allow us to
properly mark the aux state so that we can handle resolves correctly.
---
src/intel/vulkan/anv_blorp.c | 36 ++++++++++++++++++++++++++++++++++++
src/intel/vulkan/anv_cmd_buffer.c | 12 ++++++++++++
src/intel/vulkan/anv_genX.h | 6 ++++++
src/intel/vulkan/anv_private.h | 7 +++++++
src/intel/vulkan/genX_cmd_buffer.c | 17 +++++++++++++++++
5 files changed, 78 insertions(+)
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index da273d6..46e2eb0 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -271,6 +271,7 @@ void anv_CmdCopyImage(
assert(anv_image_aspects_compatible(src_mask, dst_mask));
+ const uint32_t dst_level = pRegions[r].dstSubresource.mipLevel;
if (_mesa_bitcount(src_mask) > 1) {
uint32_t aspect_bit;
anv_foreach_image_aspect_bit(aspect_bit, src_image, src_mask) {
@@ -281,6 +282,10 @@ void anv_CmdCopyImage(
get_blorp_surf_for_anv_image(cmd_buffer->device,
dst_image, 1UL << aspect_bit,
ANV_AUX_USAGE_DEFAULT, &dst_surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image,
+ 1UL << aspect_bit,
+ dst_surf.aux_usage,
+ dst_level);
for (unsigned i = 0; i < layer_count; i++) {
blorp_copy(&batch, &src_surf, pRegions[r].srcSubresource.mipLevel,
@@ -298,6 +303,8 @@ void anv_CmdCopyImage(
ANV_AUX_USAGE_DEFAULT, &src_surf);
get_blorp_surf_for_anv_image(cmd_buffer->device, dst_image, dst_mask,
ANV_AUX_USAGE_DEFAULT, &dst_surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image, dst_mask,
+ dst_surf.aux_usage, dst_level);
for (unsigned i = 0; i < layer_count; i++) {
blorp_copy(&batch, &src_surf, pRegions[r].srcSubresource.mipLevel,
@@ -387,6 +394,12 @@ copy_buffer_to_image(struct anv_cmd_buffer *cmd_buffer,
extent.width, extent.height,
buffer_row_pitch, buffer_format,
&buffer.surf, &buffer_isl_surf);
+ if (dst->surf.aux_usage != ISL_AUX_USAGE_NONE) {
At all the other call sites you call anv_cmd_buffer_mark_image_written()
regardless of whether the aux usage is NONE. Is there something special here?
Post by Jason Ekstrand
+ assert(dst == &image);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, anv_image,
+ aspect, dst->surf.aux_usage,
+ dst->level);
+ }
for (unsigned z = 0; z < extent.depth; z++) {
blorp_copy(&batch, &src->surf, src->level, src->offset.z,
@@ -497,6 +510,10 @@ void anv_CmdBlitImage(
get_blorp_surf_for_anv_image(cmd_buffer->device,
dst_image, dst_res->aspectMask,
ANV_AUX_USAGE_DEFAULT, &dst);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image,
+ dst_res->aspectMask,
+ dst.aux_usage,
+ dst_res->mipLevel);
struct anv_format_plane src_format =
anv_get_format_plane(&cmd_buffer->device->info, src_image->vk_format,
@@ -820,6 +837,10 @@ void anv_CmdClearColorImage(
layer_count = anv_minify(image->extent.depth, level);
}
+ anv_cmd_buffer_mark_image_written(cmd_buffer, image,
+ pRanges[r].aspectMask,
+ surf.aux_usage, level);
+
blorp_clear(&batch, &surf,
src_format.isl_format, src_format.swizzle,
level, base_layer, layer_count,
@@ -1215,6 +1236,11 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
} else {
assert(image->n_planes == 1);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.base_level);
+
blorp_clear(&batch, &surf, iview->planes[0].isl.format,
anv_swizzle_for_render(iview->planes[0].isl.swizzle),
iview->planes[0].isl.base_level,
@@ -1355,6 +1381,8 @@ resolve_image(struct anv_device *device,
uint32_t src_x, uint32_t src_y, uint32_t dst_x, uint32_t dst_y,
uint32_t width, uint32_t height)
{
+ struct anv_cmd_buffer *cmd_buffer = batch->driver_batch;
+
assert(src_image->type == VK_IMAGE_TYPE_2D);
assert(src_image->samples > 1);
assert(dst_image->type == VK_IMAGE_TYPE_2D);
@@ -1369,6 +1397,10 @@ resolve_image(struct anv_device *device,
ANV_AUX_USAGE_DEFAULT, &src_surf);
get_blorp_surf_for_anv_image(device, dst_image, 1UL << aspect_bit,
ANV_AUX_USAGE_DEFAULT, &dst_surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_image,
+ 1UL << aspect_bit,
+ dst_surf.aux_usage,
+ dst_level);
assert(!src_image->format->can_ycbcr);
assert(!dst_image->format->can_ycbcr);
@@ -1498,6 +1530,10 @@ anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
get_blorp_surf_for_anv_image(cmd_buffer->device, dst_iview->image,
VK_IMAGE_ASPECT_COLOR_BIT,
dst_aux_usage, &dst_surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, dst_iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ dst_surf.aux_usage,
+ dst_iview->planes[0].isl.base_level);
The indentation of the last three arguments seems to be one step too far. (Even
with that fixed, the last one seems to run over 80 columns.)
Post by Jason Ekstrand
assert(!src_iview->image->format->can_ycbcr);
assert(!dst_iview->image->format->can_ycbcr);
diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
index 7e7580c..2c9e919 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -353,6 +353,18 @@ anv_cmd_buffer_emit_state_base_address(struct anv_cmd_buffer *cmd_buffer)
cmd_buffer);
}
+void
+anv_cmd_buffer_mark_image_written(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ unsigned level)
+{
+ anv_genX_call(&cmd_buffer->device->info,
+ cmd_buffer_mark_image_written,
+ cmd_buffer, image, aspect, aux_usage, level);
+}
+
void anv_CmdBindPipeline(
VkCommandBuffer commandBuffer,
VkPipelineBindPoint pipelineBindPoint,
diff --git a/src/intel/vulkan/anv_genX.h b/src/intel/vulkan/anv_genX.h
index 0b7322e..85b893e 100644
--- a/src/intel/vulkan/anv_genX.h
+++ b/src/intel/vulkan/anv_genX.h
@@ -58,6 +58,12 @@ void genX(cmd_buffer_flush_compute_state)(struct anv_cmd_buffer *cmd_buffer);
void genX(cmd_buffer_enable_pma_fix)(struct anv_cmd_buffer *cmd_buffer,
bool enable);
+void genX(cmd_buffer_mark_image_written)(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ unsigned level);
+
void
genX(emit_urb_setup)(struct anv_device *device, struct anv_batch *batch,
const struct gen_l3_config *l3_config,
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 461bfed..f805246 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2530,6 +2530,13 @@ anv_can_sample_with_hiz(const struct gen_device_info * const devinfo,
}
void
+anv_cmd_buffer_mark_image_written(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ unsigned level);
+
+void
anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect, uint32_t level,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 65cc85d..7d040bd 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -460,6 +460,15 @@ genX(load_needs_resolve_predicate)(struct anv_cmd_buffer *cmd_buffer,
}
}
+void
+genX(cmd_buffer_mark_image_written)(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ unsigned level)
+{
+}
+
static void
init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
@@ -3014,6 +3023,14 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
false /* copy to ss */);
}
}
+
+ /* We assume that if we're starting a subpass, we're going to do some
+ * rendering so we may end up with compressed data.
+ */
+ genX(cmd_buffer_mark_image_written)(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.base_level);
}
}
--
2.5.0.400.gff86faf
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Jason Ekstrand
2017-11-28 03:06:18 UTC
---
src/intel/vulkan/genX_cmd_buffer.c | 31 ++++++++++++++-----------------
1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 9e584c1..dcd5a8f 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -823,29 +823,26 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
* We don't have any data to show that this is a problem, but we want to
* avoid causing difficult-to-debug problems.
*/
- if ((GEN_GEN >= 9 && image->samples == 1) || image->samples > 1) {
+ if (GEN_GEN >= 9 && image->samples == 1) {
+ for (uint32_t l = 0; l < level_count; l++) {
+ const uint32_t level = base_level + l;
+ const uint32_t level_layer_count =
+ MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
+ anv_image_ccs_op(cmd_buffer, image, aspect, level,
+ base_layer, level_layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ }
+ } else if (image->samples > 1) {
if (image->samples == 4 || image->samples == 16) {
anv_perf_warn(cmd_buffer->device->instance, image,
"Doing a potentially unnecessary fast-clear to "
"define an MCS buffer.");
}

- if (image->samples == 1) {
- for (uint32_t l = 0; l < level_count; l++) {
- const uint32_t level = base_level + l;
- const uint32_t level_layer_count =
- MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
- anv_image_ccs_op(cmd_buffer, image, aspect, level,
- base_layer, level_layer_count,
- ISL_AUX_OP_FAST_CLEAR, false);
- }
- } else {
- assert(image->samples > 1);
- assert(base_level == 0 && level_count == 1);
- anv_image_mcs_op(cmd_buffer, image, aspect,
- base_layer, layer_count,
- ISL_AUX_OP_FAST_CLEAR, false);
- }
+ assert(base_level == 0 && level_count == 1);
+ anv_image_mcs_op(cmd_buffer, image, aspect,
+ base_layer, layer_count,
+ ISL_AUX_OP_FAST_CLEAR, false);
}
/* At this point, some elements of the CCS buffer may have the fast-clear
* bit-arrangement. As the user writes to a subresource, we need to have
--
2.5.0.400.gff86faf
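The restructured condition in the diff above is meant to be behavior-preserving. A minimal sketch of the two shapes, reduced to "which initialization is chosen", is given below. The `enum aux_init` values and both function names are hypothetical and exist only for this comparison; the real code calls anv_image_ccs_op or anv_image_mcs_op in the corresponding branches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of which aux initialization the code performs. */
enum aux_init {
   AUX_INIT_NONE,
   AUX_INIT_CCS_FAST_CLEAR,
   AUX_INIT_MCS_FAST_CLEAR,
};

/* Shape of the condition before the patch: one combined outer test,
 * followed by a nested split on the sample count. */
static enum aux_init
aux_init_before(int gen, uint32_t samples)
{
   assert(samples >= 1);

   if ((gen >= 9 && samples == 1) || samples > 1) {
      if (samples == 1)
         return AUX_INIT_CCS_FAST_CLEAR;
      else
         return AUX_INIT_MCS_FAST_CLEAR;
   }
   return AUX_INIT_NONE;
}

/* Shape after the patch: the two cases are separate branches, so the
 * single-sample CCS path no longer sits inside the multisample check. */
static enum aux_init
aux_init_after(int gen, uint32_t samples)
{
   assert(samples >= 1);

   if (gen >= 9 && samples == 1)
      return AUX_INIT_CCS_FAST_CLEAR;
   else if (samples > 1)
      return AUX_INIT_MCS_FAST_CLEAR;
   return AUX_INIT_NONE;
}
```

Both functions agree for every generation and sample count, which is the property a reviewer would want to confirm for a pure restructuring like this one.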
Jason Ekstrand
2017-11-28 03:06:17 UTC
Now that this isn't a multi-case if and it's just the one case, it's a
bit clearer if the condition is just part of the if instead of being
pulled out into a boolean variable.
---
src/intel/vulkan/genX_cmd_buffer.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 3473fdd..9e584c1 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -793,20 +793,15 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
anv_image_aux_layers(image, aspect, base_level) - base_layer);
last_level_num = base_level + level_count;

- /* Record whether or not the layout is undefined. Pre-initialized images
- * with auxiliary buffers have a non-linear layout and are thus undefined.
- */
assert(image->tiling == VK_IMAGE_TILING_OPTIMAL);
- const bool undef_layout = initial_layout == VK_IMAGE_LAYOUT_UNDEFINED ||
- initial_layout == VK_IMAGE_LAYOUT_PREINITIALIZED;

- /* Do preparatory work before the resolve operation or return early if no
- * resolve is actually needed.
- */
- if (undef_layout) {
+ if (initial_layout == VK_IMAGE_LAYOUT_UNDEFINED ||
+ initial_layout == VK_IMAGE_LAYOUT_PREINITIALIZED) {
/* A subresource in the undefined layout may have been aliased and
* populated with any arrangement of bits. Therefore, we must initialize
* the related aux buffer and clear buffer entry with desirable values.
+ * An initial layout of PREINITIALIZED is the same as UNDEFINED for
+ * images with VK_IMAGE_TILING_OPTIMAL.
*
* Initialize the relevant clear buffer entries.
*/
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:05:59 UTC
This moves it to being based on layout_to_aux_usage instead of being
hard-coded based on bits of a priori knowledge of how transitions
interact with layouts. This conceptually simplifies things because
we're now using layout_to_aux_usage and layout_supports_fast_clear to
make resolve decisions so changes to those functions will do what one
expects.

This fixes a potential bug with window system integration on gen9+ where
we wouldn't do a resolve when transitioning to the PRESENT_SRC layout
because we just assume that everything that handles CCS_E can handle it
all the time. When handing a CCS_E image off to the window system, we
may need to do a full resolve if the window system does not support the
CCS_E modifier. The only reason why this hasn't been a problem yet is
because we don't support modifiers in Vulkan WSI and so we always get X
tiling which implies no CCS on gen9+.
---
src/intel/vulkan/genX_cmd_buffer.c | 53 +++++++++++++++++++++++++++++---------
1 file changed, 41 insertions(+), 12 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index be717eb..65cc85d 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -593,6 +593,7 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
VkImageLayout initial_layout,
VkImageLayout final_layout)
{
+ const struct gen_device_info *devinfo = &cmd_buffer->device->info;
/* Validate the inputs. */
assert(cmd_buffer);
assert(image && image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
@@ -733,17 +734,48 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
final_layout);
}
- } else if (initial_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* Resolves are only necessary if the subresource may contain blocks
- * fast-cleared to values unsupported in other layouts. This only occurs
- * if the initial layout is COLOR_ATTACHMENT_OPTIMAL.
- */
- return;
- } else if (image->samples > 1) {
- /* MCS buffers don't need resolving. */
return;
}

+ /* If initial aux usage is NONE, there is nothing to resolve */
+ enum isl_aux_usage initial_aux_usage =
+ anv_layout_to_aux_usage(devinfo, image, aspect, initial_layout);
+ if (initial_aux_usage == ISL_AUX_USAGE_NONE)
+ return;
+
+ enum isl_aux_op resolve_op = ISL_AUX_OP_NONE;
+
+ /* If the initial layout supports fast clear but the final one does not,
+ * then we need at least a partial resolve.
+ */
+ if (anv_layout_supports_fast_clear(devinfo, image, aspect, initial_layout) &&
+ !anv_layout_supports_fast_clear(devinfo, image, aspect, final_layout))
+ resolve_op = ISL_AUX_OP_PARTIAL_RESOLVE;
+
+ enum isl_aux_usage final_aux_usage =
+ anv_layout_to_aux_usage(devinfo, image, aspect, final_layout);
+ if (initial_aux_usage == ISL_AUX_USAGE_CCS_E &&
+ final_aux_usage != ISL_AUX_USAGE_CCS_E)
+ resolve_op = ISL_AUX_OP_FULL_RESOLVE;
+
+ /* CCS_D only supports full resolves and BLORP will assert on us if we try
+ * to do a partial resolve on a CCS_D surface.
+ */
+ if (resolve_op == ISL_AUX_OP_PARTIAL_RESOLVE &&
+ initial_aux_usage == ISL_AUX_USAGE_CCS_D)
+ resolve_op = ISL_AUX_OP_FULL_RESOLVE;
+
+ if (resolve_op == ISL_AUX_OP_NONE)
+ return;
+
+ /* Even though the above code can theoretically handle multiple resolve
+ * types such as CCS_D -> CCS_E, the predication code below can't. We only
+ * really handle a couple of cases.
+ */
+ assert(initial_aux_usage == ISL_AUX_USAGE_NONE ||
+ final_aux_usage == ISL_AUX_USAGE_NONE ||
+ initial_aux_usage == final_aux_usage);
+
/* Perform a resolve to synchronize data between the main and aux buffer.
* Before we begin, we must satisfy the cache flushing requirement specified
* in the Sky Lake PRM Vol. 7, "MCS Buffer for Render Target(s)":
@@ -774,10 +806,7 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);

anv_image_ccs_op(cmd_buffer, image, aspect, level,
- base_layer, layer_count,
- image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
- ISL_AUX_OP_PARTIAL_RESOLVE : ISL_AUX_OP_FULL_RESOLVE,
- true);
+ base_layer, layer_count, resolve_op, true);

genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
}
--
2.5.0.400.gff86faf
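The resolve decision added by the patch above can be read as a small pure function of four inputs. The sketch below restates it in isolation, assuming simplified stand-in enums (`enum aux_usage`, `enum aux_op`) and a hypothetical function name `pick_resolve_op`. In the driver the two booleans come from anv_layout_supports_fast_clear and the aux usages from anv_layout_to_aux_usage.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for enum isl_aux_usage and enum isl_aux_op. */
enum aux_usage { AUX_USAGE_NONE, AUX_USAGE_CCS_D, AUX_USAGE_CCS_E };
enum aux_op { AUX_OP_NONE, AUX_OP_PARTIAL_RESOLVE, AUX_OP_FULL_RESOLVE };

static enum aux_op
pick_resolve_op(enum aux_usage initial_aux, enum aux_usage final_aux,
                bool initial_fast_clear, bool final_fast_clear)
{
   /* Nothing is compressed or fast-cleared, so nothing to resolve. */
   if (initial_aux == AUX_USAGE_NONE)
      return AUX_OP_NONE;

   enum aux_op op = AUX_OP_NONE;

   /* Losing fast-clear support requires at least a partial resolve. */
   if (initial_fast_clear && !final_fast_clear)
      op = AUX_OP_PARTIAL_RESOLVE;

   /* Leaving CCS_E means the consumer cannot read compressed data. */
   if (initial_aux == AUX_USAGE_CCS_E && final_aux != AUX_USAGE_CCS_E)
      op = AUX_OP_FULL_RESOLVE;

   /* CCS_D has no partial resolve, so promote to a full resolve. */
   if (op == AUX_OP_PARTIAL_RESOLVE && initial_aux == AUX_USAGE_CCS_D)
      op = AUX_OP_FULL_RESOLVE;

   if (op == AUX_OP_NONE)
      return op;

   /* Mirrors the patch's assertion about which transitions the
    * predicated resolve path can handle. */
   assert(initial_aux == AUX_USAGE_NONE ||
          final_aux == AUX_USAGE_NONE ||
          initial_aux == final_aux);

   return op;
}
```

For the window-system case from the commit message, a CCS_E image moving to a layout whose aux usage is NONE yields a full resolve, whereas a CCS_E image that merely loses fast-clear support while staying CCS_E yields a partial resolve.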
Pohjolainen, Topi
2017-11-30 08:29:05 UTC
Post by Jason Ekstrand
This moves it to being based on layout_to_aux_usage instead of being
hard-coded based on bits of a priori knowledge of how transitions
interact with layouts. This conceptually simplifies things because
we're now using layout_to_aux_usage and layout_supports_fast_clear to
make resolve decisions so changes to those functions will do what one
expects.
This fixes a potential bug with window system integration on gen9+ where
we wouldn't do a resolve when transitioning to the PRESENT_SRC layout
because we just assume that everything that handles CCS_E can handle it
all the time. When handing a CCS_E image off to the window system, we
may need to do a full resolve if the window system does not support the
CCS_E modifier. The only reason why this hasn't been a problem yet is
because we don't support modifiers in Vulkan WSI and so we always get X
tiling which implies no CCS on gen9+.
---
src/intel/vulkan/genX_cmd_buffer.c | 53 +++++++++++++++++++++++++++++---------
1 file changed, 41 insertions(+), 12 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index be717eb..65cc85d 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -593,6 +593,7 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
VkImageLayout initial_layout,
VkImageLayout final_layout)
{
+ const struct gen_device_info *devinfo = &cmd_buffer->device->info;
/* Validate the inputs. */
assert(cmd_buffer);
assert(image && image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
@@ -733,17 +734,48 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
final_layout);
}
- } else if (initial_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* Resolves are only necessary if the subresource may contain blocks
- * fast-cleared to values unsupported in other layouts. This only occurs
- * if the initial layout is COLOR_ATTACHMENT_OPTIMAL.
- */
- return;
- } else if (image->samples > 1) {
- /* MCS buffers don't need resolving. */
return;
Okay, as you said, this "return" belongs to the preceding patch (and then
Post by Jason Ekstrand
}
+ /* If initial aux usage is NONE, there is nothing to resolve */
+ enum isl_aux_usage initial_aux_usage =
+ anv_layout_to_aux_usage(devinfo, image, aspect, initial_layout);
+ if (initial_aux_usage == ISL_AUX_USAGE_NONE)
+ return;
+
+ enum isl_aux_op resolve_op = ISL_AUX_OP_NONE;
+
+ /* If the initial layout supports fast clear but the final one does not,
+ * then we need at least a partial resolve.
+ */
+ if (anv_layout_supports_fast_clear(devinfo, image, aspect, initial_layout) &&
+ !anv_layout_supports_fast_clear(devinfo, image, aspect, final_layout))
+ resolve_op = ISL_AUX_OP_PARTIAL_RESOLVE;
+
+ enum isl_aux_usage final_aux_usage =
+ anv_layout_to_aux_usage(devinfo, image, aspect, final_layout);
+ if (initial_aux_usage == ISL_AUX_USAGE_CCS_E &&
+ final_aux_usage != ISL_AUX_USAGE_CCS_E)
+ resolve_op = ISL_AUX_OP_FULL_RESOLVE;
+
+ /* CCS_D only supports full resolves and BLORP will assert on us if we try
+ * to do a partial resolve on a CCS_D surface.
+ */
+ if (resolve_op == ISL_AUX_OP_PARTIAL_RESOLVE &&
+ initial_aux_usage == ISL_AUX_USAGE_CCS_D)
+ resolve_op = ISL_AUX_OP_FULL_RESOLVE;
+
+ if (resolve_op == ISL_AUX_OP_NONE)
+ return;
+
+ /* Even though the above code can theoretically handle multiple resolve
+ * types such as CCS_D -> CCS_E, the predication code below can't. We only
+ * really handle a couple of cases.
+ */
+ assert(initial_aux_usage == ISL_AUX_USAGE_NONE ||
+ final_aux_usage == ISL_AUX_USAGE_NONE ||
+ initial_aux_usage == final_aux_usage);
+
/* Perform a resolve to synchronize data between the main and aux buffer.
* Before we begin, we must satisfy the cache flushing requirement specified
@@ -774,10 +806,7 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);
anv_image_ccs_op(cmd_buffer, image, aspect, level,
- base_layer, layer_count,
- image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
- ISL_AUX_OP_PARTIAL_RESOLVE : ISL_AUX_OP_FULL_RESOLVE,
- true);
+ base_layer, layer_count, resolve_op, true);
genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
}
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:06:15 UTC
Previously, we would always apply the layout transition at the beginning
of the subpass and then do the clear whether fast or slow. This meant
that there were some cases, specifically when the initial layout is
VK_IMAGE_LAYOUT_UNDEFINED, where we would end up doing a fast-clear or
ambiguate followed immediately by a fast-clear. This probably isn't
terribly expensive, but it is a waste that we can avoid easily enough
now that we're doing everything at the same time in begin_subpass.
---
src/intel/vulkan/genX_cmd_buffer.c | 146 +++++++++++++++++++++++++++----------
1 file changed, 106 insertions(+), 40 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 2c4ab38..3473fdd 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3009,6 +3009,101 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
VkRect2D render_area = cmd_buffer->state.render_area;
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;

+ /* First, walk through the attachments and handle any full-surface
+ * fast-clears. These are special because they do an implicit layout
+ * transition and allow us to potentially avoid a resolve later.
+ */
+ for (uint32_t i = 0; i < subpass->attachment_count; ++i) {
+ const uint32_t a = subpass->attachments[i].attachment;
+ if (a == VK_ATTACHMENT_UNUSED)
+ continue;
+
+ assert(a < cmd_state->pass->attachment_count);
+ struct anv_attachment_state *att_state = &cmd_state->attachments[a];
+
+ struct anv_image_view *iview = fb->attachments[a];
+ const struct anv_image *image = iview->image;
+
+ if (!att_state->pending_clear_aspects)
+ continue;
+
+ if (!att_state->fast_clear)
+ continue;
+
+ if (render_area.offset.x != 0 ||
+ render_area.offset.y != 0 ||
+ render_area.extent.width != iview->extent.width ||
+ render_area.extent.height != iview->extent.height)
+ continue;
+
+ if (att_state->pending_clear_aspects & VK_IMAGE_ASPECT_COLOR_BIT) {
+ assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+
+ /* Multi-planar images are not supported as attachments */
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ assert(image->n_planes == 1);
+
+ anv_image_ccs_op(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers,
+ ISL_AUX_OP_FAST_CLEAR, false);
+
+ genX(copy_fast_clear_dwords)(cmd_buffer, att_state->color.state,
+ image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ true /* copy from ss */);
+
+ /* Fast-clears impact whether or not a resolve will be necessary. */
+ if (image->planes[0].aux_usage == ISL_AUX_USAGE_CCS_E &&
+ att_state->clear_color_is_zero) {
+ /* This image always has the auxiliary buffer enabled. We can
+ * mark the subresource as not needing a resolve because the
+ * clear color will match what's in every RENDER_SURFACE_STATE
+ * object when it's being used for sampling.
+ */
+ clear_image_needs_resolve_bits(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT);
+ } else {
+ set_image_needs_resolve_bits(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT);
+ }
+
+ att_state->pending_clear_aspects = 0;
+ att_state->current_layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
+ }
+
+ /* We only key the HiZ clear off of DEPTH_BIT because a fast stencil
+ * clear doesn't reset the HiZ buffer contents the way we want.
+ */
+ if (att_state->pending_clear_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
+ /* We currently only support HiZ for single-layer images */
+ assert(iview->planes[0].isl.base_level == 0);
+ assert(iview->planes[0].isl.base_array_layer == 0);
+ assert(fb->layers == 1);
+
+ anv_image_hiz_clear(cmd_buffer, image,
+ att_state->pending_clear_aspects,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers, render_area,
+ att_state->clear_value.depthStencil.stencil);
+
+ att_state->pending_clear_aspects = 0;
+ att_state->current_layout =
+ VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
+ }
+ }
+
+
+ /* Now that we've handled any implicit transitions due to full-surface fast
+ * clears, we walk the attachments again and handle all of the other clears
+ * and transitions.
+ */
for (uint32_t i = 0; i < subpass->attachment_count; ++i) {
const uint32_t a = subpass->attachments[i].attachment;
if (a == VK_ATTACHMENT_UNUSED)
@@ -3060,46 +3155,17 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
assert(image->n_planes == 1);

- if (att_state->fast_clear) {
- anv_image_ccs_op(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- iview->planes[0].isl.base_array_layer,
- fb->layers,
- ISL_AUX_OP_FAST_CLEAR, false);
-
- genX(copy_fast_clear_dwords)(cmd_buffer, att_state->color.state,
- image, VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- true /* copy from ss */);
-
- /* Fast-clears impact whether or not a resolve will be necessary. */
- if (image->planes[0].aux_usage == ISL_AUX_USAGE_CCS_E &&
- att_state->clear_color_is_zero) {
- /* This image always has the auxiliary buffer enabled. We can
- * mark the subresource as not needing a resolve because the
- * clear color will match what's in every RENDER_SURFACE_STATE
- * object when it's being used for sampling.
- */
- clear_image_needs_resolve_bits(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- ANV_IMAGE_HAS_FAST_CLEAR_BIT);
- } else {
- set_image_needs_resolve_bits(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- ANV_IMAGE_HAS_FAST_CLEAR_BIT);
- }
- } else {
- anv_image_clear_color(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
- att_state->aux_usage,
- iview->planes[0].isl.format,
- iview->planes[0].isl.swizzle,
- iview->planes[0].isl.base_level,
- iview->planes[0].isl.base_array_layer,
- fb->layers, render_area,
- vk_to_isl_color(att_state->clear_value.color));
- }
+ /* Color fast-clears are handled earlier */
+ assert(!att_state->fast_clear);
+
+ anv_image_clear_color(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.format,
+ iview->planes[0].isl.swizzle,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers, render_area,
+ vk_to_isl_color(att_state->clear_value.color));
} else if (att_state->pending_clear_aspects & (VK_IMAGE_ASPECT_DEPTH_BIT |
VK_IMAGE_ASPECT_STENCIL_BIT)) {
if (att_state->fast_clear) {
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:05:55 UTC
This got lost in all of the aspect vs. plane rebasing of YCBCR.
---
src/intel/vulkan/anv_image.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index ba932ba..a872149 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -710,7 +710,7 @@ void anv_GetImageSubresourceLayout(
*
* @param devinfo The device information of the Intel GPU.
* @param image The image that may contain a collection of buffers.
- * @param plane The plane of the image to be accessed.
+ * @param aspect The aspect of the image to be accessed.
* @param layout The current layout of the image aspect(s).
*
* @return The primary buffer that should be used for the given layout.
--
2.5.0.400.gff86faf
Nanley Chery
2017-11-28 19:05:34 UTC
Post by Jason Ekstrand
This got lost in all of the aspect vs. plane rebasing of YCBCR.
---
src/intel/vulkan/anv_image.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
This patch is
Post by Jason Ekstrand
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index ba932ba..a872149 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -710,7 +710,7 @@ void anv_GetImageSubresourceLayout(
*
*
--
2.5.0.400.gff86faf
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Jason Ekstrand
2017-11-28 03:06:04 UTC
This seems slightly more correct because it means that the flushes
happen before any clears or resolves implied by the subpass transition.
---
src/intel/vulkan/genX_cmd_buffer.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 2d47179..bbe97f5 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3197,10 +3197,10 @@ void genX(CmdBeginRenderPass)(

genX(flush_pipeline_select_3d)(cmd_buffer);

- genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
-
cmd_buffer->state.pending_pipe_bits |=
cmd_buffer->state.pass->subpass_flushes[0];
+
+ genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
}

void genX(CmdNextSubpass)(
@@ -3220,11 +3220,11 @@ void genX(CmdNextSubpass)(
*/
cmd_buffer_subpass_transition_layouts(cmd_buffer, true);

- genX(cmd_buffer_set_subpass)(cmd_buffer, cmd_buffer->state.subpass + 1);
-
uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);
cmd_buffer->state.pending_pipe_bits |=
cmd_buffer->state.pass->subpass_flushes[subpass_id];
+
+ genX(cmd_buffer_set_subpass)(cmd_buffer, cmd_buffer->state.subpass + 1);
}

void genX(CmdEndRenderPass)(
--
2.5.0.400.gff86faf
Pohjolainen, Topi
2017-12-01 13:47:57 UTC
Post by Jason Ekstrand
This seems slightly more correct because it means that the flushes
happen before any clears or resolves implied by the subpass transition.
After reading the next patch this patch seems incomplete both before
and after. Next patch seems to explicitly consider that flushes are
needed before and after whereas at this point it would be only
before (when this patch is applied) or after (without this patch).

I guess something else holds things together, I'm just not seeing
it?
Post by Jason Ekstrand
---
src/intel/vulkan/genX_cmd_buffer.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 2d47179..bbe97f5 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3197,10 +3197,10 @@ void genX(CmdBeginRenderPass)(
genX(flush_pipeline_select_3d)(cmd_buffer);
- genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
-
cmd_buffer->state.pending_pipe_bits |=
cmd_buffer->state.pass->subpass_flushes[0];
+
+ genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
}
void genX(CmdNextSubpass)(
@@ -3220,11 +3220,11 @@ void genX(CmdNextSubpass)(
*/
cmd_buffer_subpass_transition_layouts(cmd_buffer, true);
- genX(cmd_buffer_set_subpass)(cmd_buffer, cmd_buffer->state.subpass + 1);
-
uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);
cmd_buffer->state.pending_pipe_bits |=
cmd_buffer->state.pass->subpass_flushes[subpass_id];
+
+ genX(cmd_buffer_set_subpass)(cmd_buffer, cmd_buffer->state.subpass + 1);
}
void genX(CmdEndRenderPass)(
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-12-01 20:27:23 UTC
On Fri, Dec 1, 2017 at 5:47 AM, Pohjolainen, Topi <
Post by Pohjolainen, Topi
Post by Jason Ekstrand
This seems slightly more correct because it means that the flushes
happen before any clears or resolves implied by the subpass transition.
After reading the next patch this patch seems incomplete both before
and after. Next patch seems to explicitly consider that flushes are
needed before and after whereas at this point it would be only
before (when this patch is applied) or after (without this patch).
I guess something else holds things together, I'm just not seeing
it?
I think so. In either case, what really matters is that the subpass
flushes happen before the next draw call. The only change made here is
that before they would get triggered by the next draw call and now they may
get triggered by a clear or resolve that happens as part of set_subpass.
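To make the ordering argument concrete, here is a minimal CPU-side sketch, not driver code. It assumes that any emitted work (clear, resolve, or draw) first applies whatever is in pending_pipe_bits; the struct cmd_model type, the event names, and emit_work() are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event kinds recorded by the model. */
enum event { EV_FLUSH, EV_CLEAR, EV_DRAW };

/* Hypothetical stand-in for the relevant command buffer state. */
struct cmd_model {
   uint32_t pending_pipe_bits;
   enum event log[8];
   unsigned count;
};

/* Any piece of work applies the pending flushes first (assumption of
 * this sketch), then records itself.
 */
static void
emit_work(struct cmd_model *m, enum event work)
{
   if (m->pending_pipe_bits != 0 && m->count < 8) {
      m->log[m->count++] = EV_FLUSH;
      m->pending_pipe_bits = 0;
   }
   if (m->count < 8)
      m->log[m->count++] = work;
}
```

With the subpass flush bits ORed in before set_subpass, the first clear or resolve consumes them; with the old order they would sit in pending_pipe_bits until the next draw. Either way they land before the draw.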
Post by Pohjolainen, Topi
Post by Jason Ekstrand
---
src/intel/vulkan/genX_cmd_buffer.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 2d47179..bbe97f5 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3197,10 +3197,10 @@ void genX(CmdBeginRenderPass)(
genX(flush_pipeline_select_3d)(cmd_buffer);
- genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
-
cmd_buffer->state.pending_pipe_bits |=
cmd_buffer->state.pass->subpass_flushes[0];
+
+ genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
}
void genX(CmdNextSubpass)(
@@ -3220,11 +3220,11 @@ void genX(CmdNextSubpass)(
*/
cmd_buffer_subpass_transition_layouts(cmd_buffer, true);
- genX(cmd_buffer_set_subpass)(cmd_buffer, cmd_buffer->state.subpass + 1);
-
uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);
cmd_buffer->state.pending_pipe_bits |=
cmd_buffer->state.pass->subpass_flushes[subpass_id];
+
+ genX(cmd_buffer_set_subpass)(cmd_buffer, cmd_buffer->state.subpass + 1);
}
void genX(CmdEndRenderPass)(
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:06:19 UTC
Even though the blorp pass looks a bit on the sketchy side, the end
result in the Vulkan driver is very nice. Instead of having this weird
case where you do a fast clear and then maybe have to resolve, we just
do the ambiguate and are done with it. The ambiguate does exactly what
we want: it sets all the CCS values to 0, which puts the CCS into the
pass-through state.
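
As a rough illustration of the end state only: the toy model below treats the CCS as one contiguous slab of bytes per array layer and zeroes the requested layers. The real pass renders through blorp rather than calling memset; the slab layout, model_ccs_ambiguate(), and its parameters are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical CPU-side model: one contiguous CCS slab per array layer. */
static void
model_ccs_ambiguate(uint8_t *ccs, size_t bytes_per_layer,
                    uint32_t base_layer, uint32_t layer_count)
{
   /* Same layer walk as the ISL_AUX_OP_AMBIGUATE case in this patch. */
   for (uint32_t a = 0; a < layer_count; a++) {
      const uint32_t layer = base_layer + a;
      /* All-zero CCS values are the pass-through state. */
      memset(ccs + (size_t)layer * bytes_per_layer, 0, bytes_per_layer);
   }
}
```

Layers outside the [base_layer, base_layer + layer_count) range are left untouched, which matches the per-layer loop added to anv_image_ccs_op.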
---
src/intel/vulkan/anv_blorp.c | 5 +++++
src/intel/vulkan/genX_cmd_buffer.c | 40 ++------------------------------------
2 files changed, 7 insertions(+), 38 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 45d7b12..84ac720 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1705,6 +1705,11 @@ anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
surf.surf->format, isl_to_blorp_fast_clear_op(ccs_op));
break;
case ISL_AUX_OP_AMBIGUATE:
+ for (uint32_t a = 0; a < layer_count; a++) {
+ const uint32_t layer = base_layer + a;
+ blorp_ccs_ambiguate(&batch, &surf, level, layer);
+ }
+ break;
default:
unreachable("Unsupported CCS operation");
}
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index dcd5a8f..e8a4d90 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -614,17 +614,7 @@ init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
enum isl_aux_usage aux_usage = image->planes[plane].aux_usage;

- /* The resolve flag should updated to signify that fast-clear/compression
- * data needs to be removed when leaving the undefined layout. Such data
- * may need to be removed if it would cause accesses to the color buffer
- * to return incorrect data. The fast clear data in CCS_D buffers should
- * be removed because CCS_D isn't enabled all the time.
- */
- if (aux_usage == ISL_AUX_USAGE_NONE) {
- set_image_needs_resolve_bits(cmd_buffer, image, aspect, level, ~0);
- } else {
- clear_image_needs_resolve_bits(cmd_buffer, image, aspect, level, ~0);
- }
+ clear_image_needs_resolve_bits(cmd_buffer, image, aspect, level, ~0);

/* The fast clear value dword(s) will be copied into a surface state object.
* Ensure that the restrictions of the fields in the dword(s) are followed.
@@ -830,7 +820,7 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
anv_image_ccs_op(cmd_buffer, image, aspect, level,
base_layer, level_layer_count,
- ISL_AUX_OP_FAST_CLEAR, false);
+ ISL_AUX_OP_AMBIGUATE, false);
}
} else if (image->samples > 1) {
if (image->samples == 4 || image->samples == 16) {
@@ -844,32 +834,6 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
base_layer, layer_count,
ISL_AUX_OP_FAST_CLEAR, false);
}
- /* At this point, some elements of the CCS buffer may have the fast-clear
- * bit-arrangement. As the user writes to a subresource, we need to have
- * the associated CCS elements enter the ambiguated state. This enables
- * reads (implicit or explicit) to reflect the user-written data instead
- * of the clear color. The only time such elements will not change their
- * state as described above, is in a final layout that doesn't have CCS
- * enabled. In this case, we must force the associated CCS buffers of the
- * specified range to enter the ambiguated state in advance.
- */
- if (image->samples == 1 &&
- image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E &&
- final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
- /* The CCS_D buffer may not be enabled in the final layout. Call this
- * function again with a initial layout of COLOR_ATTACHMENT_OPTIMAL
- * to perform a resolve.
- */
- anv_perf_warn(cmd_buffer->device->instance, image,
- "Performing an additional resolve for CCS_D layout "
- "transition. Consider always leaving it on or "
- "performing an ambiguation pass.");
- transition_color_buffer(cmd_buffer, image, aspect,
- base_level, level_count,
- base_layer, layer_count,
- VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
- final_layout);
- }
return;
}
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:06:03 UTC
This makes us start tracking two bits per level for aux to describe both
whether or not something is fast-cleared and whether or not it is
compressed as opposed to a single "needs resolve" bit. The previous
scheme only worked because we assumed that CCS_E compressed images would
always be read with CCS_E enabled and so they didn't need any sort of
resolve if there was no fast clear color.

The assumptions of the previous scheme held because the one case where
we do need a full resolve when CCS_E is enabled is for window-system
images. Since we only ever allowed X-tiled window-system images, CCS
was entirely disabled on gen9+ and we never got CCS_E. With the advent
of Y-tiled window-system buffers, we now need to properly support doing
a full resolve of images marked CCS_E. This requires us to track things
more granularly because, if the client does two back-to-back transitions
where we first do a partial resolve and then a full resolve, we need
both resolves to happen.
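
The back-to-back case can be sketched on the CPU as follows. This is a model, not driver code: the enum values mirror the ones added below, but struct level_aux_state, resolve_is_needed(), and finish_resolve() are hypothetical, and the sketch assumes the predicate fires whenever the selected dwords are not all zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as enum anv_image_resolve_bits in this patch. */
enum resolve_bits {
   HAS_FAST_CLEAR_BIT = 0x1,
   HAS_COMPRESSION_BIT = 0x2,
   NEEDS_PARTIAL_RESOLVE_BITS = 0x1,
   NEEDS_FULL_RESOLVE_BITS = 0x3,
};

/* Hypothetical stand-in for the two metadata dwords kept per level. */
struct level_aux_state {
   uint32_t has_fast_clear;  /* dword 0 */
   uint32_t has_compression; /* dword 1 */
};

/* Load each dword only if its bit is requested, otherwise use 0.
 * The resolve is needed if anything nonzero was loaded.
 */
static bool
resolve_is_needed(const struct level_aux_state *s, enum resolve_bits needs)
{
   const uint32_t lo = (needs & HAS_FAST_CLEAR_BIT) ? s->has_fast_clear : 0;
   const uint32_t hi = (needs & HAS_COMPRESSION_BIT) ? s->has_compression : 0;
   return (lo | hi) != 0;
}

/* After a resolve, clear only the bits that resolve took care of. */
static void
finish_resolve(struct level_aux_state *s, enum resolve_bits needs)
{
   if (needs & HAS_FAST_CLEAR_BIT)
      s->has_fast_clear = 0;
   if (needs & HAS_COMPRESSION_BIT)
      s->has_compression = 0;
}
```

Starting from a level that is both fast-cleared and compressed, a partial resolve clears only the fast-clear dword, so a following full resolve still sees the compression dword set and runs. A single "needs resolve" flag would have been cleared by the first resolve.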
---
src/intel/vulkan/anv_private.h | 10 +--
src/intel/vulkan/genX_cmd_buffer.c | 158 ++++++++++++++++++++++++++++++-------
2 files changed, 135 insertions(+), 33 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index f805246..e875705 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2476,16 +2476,16 @@ anv_fast_clear_state_entry_size(const struct anv_device *device)
{
assert(device);
/* Entry contents:
- * +--------------------------------------------+
- * | clear value dword(s) | needs resolve dword |
- * +--------------------------------------------+
+ * +---------------------------------------------+
+ * | clear value dword(s) | needs resolve dwords |
+ * +---------------------------------------------+
*/

- /* Ensure that the needs resolve dword is in fact dword-aligned to enable
+ /* Ensure that the needs resolve dwords are in fact dword-aligned to enable
* GPU memcpy operations.
*/
assert(device->isl_dev.ss.clear_value_size % 4 == 0);
- return device->isl_dev.ss.clear_value_size + 4;
+ return device->isl_dev.ss.clear_value_size + 8;
}

static inline struct anv_address
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 6aeffa3..2d47179 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -404,6 +404,22 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
0, 0, 1, hiz_op);
}

+/** Bitfield representing the needed resolves.
+ *
+ * This bitfield represents the kinds of compression we may or may not have in
+ * a given image. The ANV_IMAGE_NEEDS_* can be ANDed with the ANV_IMAGE_HAS
+ * bits to determine when a given resolve actually needs to happen.
+ *
+ * For convenience of dealing with MI_PREDICATE, we blow these two bits out
+ * into two dwords in the image meatadata page.
+ */
+enum anv_image_resolve_bits {
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT = 0x1,
+ ANV_IMAGE_HAS_COMPRESSION_BIT = 0x2,
+ ANV_IMAGE_NEEDS_PARTIAL_RESOLVE_BITS = 0x1,
+ ANV_IMAGE_NEEDS_FULL_RESOLVE_BITS = 0x3,
+};
+
#define MI_PREDICATE_SRC0 0x2400
#define MI_PREDICATE_SRC1 0x2408

@@ -411,23 +427,62 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
* performed properly.
*/
static void
-set_image_needs_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- unsigned level, bool needs_resolve)
+set_image_needs_resolve_bits(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ unsigned level,
+ enum anv_image_resolve_bits set_bits)
{
assert(cmd_buffer && image);
assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
assert(level < anv_image_aux_levels(image, aspect));

- /* The HW docs say that there is no way to guarantee the completion of
- * the following command. We use it nevertheless because it shows no
- * issues in testing is currently being used in the GL driver.
- */
- anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
- sdi.Address = anv_image_get_needs_resolve_addr(cmd_buffer->device,
- image, aspect, level);
- sdi.ImmediateData = needs_resolve;
+ const struct anv_address resolve_flag_addr =
+ anv_image_get_needs_resolve_addr(cmd_buffer->device,
+ image, aspect, level);
+
+ if (set_bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {
+ anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
+ sdi.Address = resolve_flag_addr;
+ sdi.ImmediateData = 1;
+ }
+ }
+ if (set_bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {
+ anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
+ sdi.Address = resolve_flag_addr;
+ sdi.Address.offset += 4;
+ sdi.ImmediateData = 1;
+ }
+ }
+}
+
+static void
+clear_image_needs_resolve_bits(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ unsigned level,
+ enum anv_image_resolve_bits clear_bits)
+{
+ assert(cmd_buffer && image);
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ assert(level < anv_image_aux_levels(image, aspect));
+
+ const struct anv_address resolve_flag_addr =
+ anv_image_get_needs_resolve_addr(cmd_buffer->device,
+ image, aspect, level);
+
+ if (clear_bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {
+ anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
+ sdi.Address = resolve_flag_addr;
+ sdi.ImmediateData = 0;
+ }
+ }
+ if (clear_bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {
+ anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
+ sdi.Address = resolve_flag_addr;
+ sdi.Address.offset += 4;
+ sdi.ImmediateData = 0;
+ }
}
}

@@ -435,7 +490,8 @@ static void
load_needs_resolve_predicate(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect,
- unsigned level)
+ unsigned level,
+ enum anv_image_resolve_bits bits)
{
assert(cmd_buffer && image);
assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
@@ -450,9 +506,27 @@ load_needs_resolve_predicate(struct anv_cmd_buffer *cmd_buffer,
*/
emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1 , 0);
emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1 + 4, 0);
- emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0 , 0);
- emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4,
- resolve_flag_addr.bo, resolve_flag_addr.offset);
+
+ /* Conditionally load the two dwords into the high and low portions of
+ * MI_PREDICATE_SRC0. This effectively ANDs the bits passed into this
+ * function with the logical bits stored in the metadata page. Because
+ * they're split out with one bit per dword, we don't need to use any
+ * sort of MI math.
+ */
+ if (bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {
+ emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0,
+ resolve_flag_addr.bo, resolve_flag_addr.offset);
+ } else {
+ emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0, 0);
+ }
+
+ if (bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {
+ emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4,
+ resolve_flag_addr.bo, resolve_flag_addr.offset + 4);
+ } else {
+ emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4, 0);
+ }
+
anv_batch_emit(&cmd_buffer->batch, GENX(MI_PREDICATE), mip) {
mip.LoadOperation = LOAD_LOADINV;
mip.CombineOperation = COMBINE_SET;
@@ -467,6 +541,18 @@ genX(cmd_buffer_mark_image_written)(struct anv_cmd_buffer *cmd_buffer,
enum isl_aux_usage aux_usage,
unsigned level)
{
+ /* The only compression types with more than just fast-clears are MCS,
+ * CCS_E, and HiZ. With HiZ we just trust the layout and don't actually
+ * track the current fast-clear and compression state. This leaves us
+ * with just MCS and CCS_E.
+ */
+ if (aux_usage != ISL_AUX_USAGE_CCS_E &&
+ aux_usage != ISL_AUX_USAGE_MCS)
+ return;
+
+ set_image_needs_resolve_bits(cmd_buffer, image,
+ VK_IMAGE_ASPECT_COLOR_BIT, level,
+ ANV_IMAGE_HAS_COMPRESSION_BIT);
}

static void
@@ -488,8 +574,11 @@ init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
* to return incorrect data. The fast clear data in CCS_D buffers should
* be removed because CCS_D isn't enabled all the time.
*/
- set_image_needs_resolve(cmd_buffer, image, aspect, level,
- aux_usage == ISL_AUX_USAGE_NONE);
+ if (aux_usage == ISL_AUX_USAGE_NONE) {
+ set_image_needs_resolve_bits(cmd_buffer, image, aspect, level, ~0);
+ } else {
+ clear_image_needs_resolve_bits(cmd_buffer, image, aspect, level, ~0);
+ }

/* The fast clear value dword(s) will be copied into a surface state object.
* Ensure that the restrictions of the fields in the dword(s) are followed.
@@ -812,12 +901,25 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
layer_count = MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
}

- load_needs_resolve_predicate(cmd_buffer, image, aspect, level);
+ enum anv_image_resolve_bits resolve_bits;
+ switch (resolve_op) {
+ case ISL_AUX_OP_FULL_RESOLVE:
+ resolve_bits = ANV_IMAGE_NEEDS_FULL_RESOLVE_BITS;
+ break;
+ case ISL_AUX_OP_PARTIAL_RESOLVE:
+ resolve_bits = ANV_IMAGE_NEEDS_PARTIAL_RESOLVE_BITS;
+ break;
+ default:
+ unreachable("Invalid resolve op");
+ }
+ load_needs_resolve_predicate(cmd_buffer, image, aspect, level,
+ resolve_bits);

anv_image_ccs_op(cmd_buffer, image, aspect, level,
base_layer, layer_count, resolve_op, true);

- set_image_needs_resolve(cmd_buffer, image, aspect, level, false);
+ clear_image_needs_resolve_bits(cmd_buffer, image, aspect,
+ level, resolve_bits);
}

cmd_buffer->state.pending_pipe_bits |=
@@ -2992,15 +3094,15 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
* will match what's in every RENDER_SURFACE_STATE object when it's
* being used for sampling.
*/
- set_image_needs_resolve(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- false);
+ clear_image_needs_resolve_bits(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT);
} else {
- set_image_needs_resolve(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- true);
+ set_image_needs_resolve_bits(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT);
}
} else if (rp_att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
/* The attachment may have been fast-cleared in a previous render
--
2.5.0.400.gff86faf
Pohjolainen, Topi
2017-11-30 19:59:09 UTC
Reply
Permalink
Raw Message
Post by Jason Ekstrand
This makes us start tracking two bits per level for aux to describe both
whether or not something is fast-cleared and whether or not it is
compressed as opposed to a single "needs resolve" bit. The previous
scheme only worked because we assumed that CCS_E compressed images would
always be read with CCS_E enabled and so they didn't need any sort of
resolve if there was no fast clear color.
The assumptions of the previous scheme held because the one case where
we do need a full resolve when CCS_E is enabled is for window-system
images. Since we only ever allowed X-tiled window-system images, CCS
was entirely disabled on gen9+ and we never got CCS_E. With the advent
of Y-tiled window-system buffers, we now need to properly support doing
a full resolve of images marked CCS_E. This requires us to track things
more granularly because, if the client does two back-to-back transitions
where we first do a partial resolve and then a full resolve, we need
both resolves to happen.
---
src/intel/vulkan/anv_private.h | 10 +--
src/intel/vulkan/genX_cmd_buffer.c | 158 ++++++++++++++++++++++++++++++-------
2 files changed, 135 insertions(+), 33 deletions(-)
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index f805246..e875705 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2476,16 +2476,16 @@ anv_fast_clear_state_entry_size(const struct anv_device *device)
{
assert(device);
/* Entry contents:
- * +--------------------------------------------+
- * | clear value dword(s) | needs resolve dword |
- * +--------------------------------------------+
+ * +---------------------------------------------+
+ * | clear value dword(s) | needs resolve dwords |
+ * +---------------------------------------------+
*/
- /* Ensure that the needs resolve dword is in fact dword-aligned to enable
+ /* Ensure that the needs resolve dwords are in fact dword-aligned to enable
* GPU memcpy operations.
*/
assert(device->isl_dev.ss.clear_value_size % 4 == 0);
- return device->isl_dev.ss.clear_value_size + 4;
+ return device->isl_dev.ss.clear_value_size + 8;
}
static inline struct anv_address
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 6aeffa3..2d47179 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -404,6 +404,22 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
0, 0, 1, hiz_op);
}
+/** Bitfield representing the needed resolves.
+ *
+ * This bitfield represents the kinds of compression we may or may not have in
+ * a given image. The ANV_IMAGE_NEEDS_* can be ANDed with the ANV_IMAGE_HAS
+ * bits to determine when a given resolve actually needs to happen.
+ *
+ * For convenience of dealing with MI_PREDICATE, we blow these two bits out
+ * into two dwords in the image meatadata page.
s/meatadata/metadata/
Post by Jason Ekstrand
+ */
+enum anv_image_resolve_bits {
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT = 0x1,
+ ANV_IMAGE_HAS_COMPRESSION_BIT = 0x2,
+ ANV_IMAGE_NEEDS_PARTIAL_RESOLVE_BITS = 0x1,
+ ANV_IMAGE_NEEDS_FULL_RESOLVE_BITS = 0x3,
+};
+
#define MI_PREDICATE_SRC0 0x2400
#define MI_PREDICATE_SRC1 0x2408
@@ -411,23 +427,62 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
* performed properly.
*/
static void
-set_image_needs_resolve(struct anv_cmd_buffer *cmd_buffer,
- const struct anv_image *image,
- VkImageAspectFlagBits aspect,
- unsigned level, bool needs_resolve)
+set_image_needs_resolve_bits(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ unsigned level,
+ enum anv_image_resolve_bits set_bits)
{
assert(cmd_buffer && image);
assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
assert(level < anv_image_aux_levels(image, aspect));
- /* The HW docs say that there is no way to guarantee the completion of
- * the following command. We use it nevertheless because it shows no
- * issues in testing is currently being used in the GL driver.
- */
Why are we dropping this comment now?
Post by Jason Ekstrand
- anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
- sdi.Address = anv_image_get_needs_resolve_addr(cmd_buffer->device,
- image, aspect, level);
- sdi.ImmediateData = needs_resolve;
+ const struct anv_address resolve_flag_addr =
+ anv_image_get_needs_resolve_addr(cmd_buffer->device,
+ image, aspect, level);
+
+ if (set_bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {
+ anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
+ sdi.Address = resolve_flag_addr;
+ sdi.ImmediateData = 1;
+ }
+ }
+ if (set_bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {
+ anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
+ sdi.Address = resolve_flag_addr;
+ sdi.Address.offset += 4;
+ sdi.ImmediateData = 1;
+ }
+ }
+}
+
+static void
+clear_image_needs_resolve_bits(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ unsigned level,
+ enum anv_image_resolve_bits clear_bits)
+{
+ assert(cmd_buffer && image);
+ assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
+ assert(level < anv_image_aux_levels(image, aspect));
+
+ const struct anv_address resolve_flag_addr =
+ anv_image_get_needs_resolve_addr(cmd_buffer->device,
+ image, aspect, level);
+
+ if (clear_bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {
+ anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
+ sdi.Address = resolve_flag_addr;
+ sdi.ImmediateData = 0;
+ }
+ }
+ if (clear_bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {
+ anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
+ sdi.Address = resolve_flag_addr;
+ sdi.Address.offset += 4;
+ sdi.ImmediateData = 0;
+ }
Do set_image_needs_resolve_bits() and clear_image_needs_resolve_bits()
deviate in later patches? Now the only difference is the value set for
"sdi.ImmediateData". I can't help thinking of having one function and the
value as an extra argument. The argument could be an enum documenting for
the reader if SET or CLEAR is wanted. Not a big issue, just thinking aloud.
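
Roughly what I have in mind, as a standalone sketch only. A plain two-element array stands in for the two MI_STORE_DATA_IMM writes, and every name here (resolve_bits_op, update_needs_resolve_bits, ALL_RESOLVE_BITS) is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as in the patch; ALL_RESOLVE_BITS is a hypothetical helper. */
enum resolve_bits {
   HAS_FAST_CLEAR_BIT = 0x1,
   HAS_COMPRESSION_BIT = 0x2,
   ALL_RESOLVE_BITS = 0x3,
};

/* Hypothetical enum telling the reader which operation is wanted. */
enum resolve_bits_op {
   RESOLVE_BITS_CLEAR = 0,
   RESOLVE_BITS_SET = 1,
};

/* One helper instead of a set/clear pair: the op doubles as the
 * immediate value written to each selected dword.
 */
static void
update_needs_resolve_bits(uint32_t dwords[2], enum resolve_bits bits,
                          enum resolve_bits_op op)
{
   if (bits & HAS_FAST_CLEAR_BIT)
      dwords[0] = (uint32_t)op;
   if (bits & HAS_COMPRESSION_BIT)
      dwords[1] = (uint32_t)op;
}
```

The trade-off is one more argument at every call site against two nearly identical functions.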
Post by Jason Ekstrand
}
}
@@ -435,7 +490,8 @@ static void
load_needs_resolve_predicate(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect,
- unsigned level)
+ unsigned level,
+ enum anv_image_resolve_bits bits)
{
assert(cmd_buffer && image);
assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
@@ -450,9 +506,27 @@ load_needs_resolve_predicate(struct anv_cmd_buffer *cmd_buffer,
*/
emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1 , 0);
emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1 + 4, 0);
- emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0 , 0);
- emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4,
- resolve_flag_addr.bo, resolve_flag_addr.offset);
+
+ /* Conditionally load the two dwords into the high and low portions of
+ * MI_PREDICATE_SRC0. This effectively ANDs the bits passed into this
+ * function with the logical bits stored in the metadata page. Because
+ * they're split out with one bit per dword, we don't need to use any
+ * sort of MI math.
+ */
+ if (bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {
+ emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0,
+ resolve_flag_addr.bo, resolve_flag_addr.offset);
+ } else {
+ emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0, 0);
+ }
+
+ if (bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {
+ emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4,
+ resolve_flag_addr.bo, resolve_flag_addr.offset + 4);
+ } else {
+ emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4, 0);
+ }
+
anv_batch_emit(&cmd_buffer->batch, GENX(MI_PREDICATE), mip) {
mip.LoadOperation = LOAD_LOADINV;
mip.CombineOperation = COMBINE_SET;
@@ -467,6 +541,18 @@ genX(cmd_buffer_mark_image_written)(struct anv_cmd_buffer *cmd_buffer,
enum isl_aux_usage aux_usage,
unsigned level)
{
+ /* The only compression types with more than just fast-clears are MCS,
+ * CCS_E, and HiZ. With HiZ we just trust the layout and don't actually
+ * track the current fast-clear and compression state. This leaves us
+ * with just MCS and CCS_E.
+ */
+ if (aux_usage != ISL_AUX_USAGE_CCS_E &&
+ aux_usage != ISL_AUX_USAGE_MCS)
+ return;
+
+ set_image_needs_resolve_bits(cmd_buffer, image,
+ VK_IMAGE_ASPECT_COLOR_BIT, level,
+ ANV_IMAGE_HAS_COMPRESSION_BIT);
}
static void
@@ -488,8 +574,11 @@ init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
* to return incorrect data. The fast clear data in CCS_D buffers should
* be removed because CCS_D isn't enabled all the time.
*/
- set_image_needs_resolve(cmd_buffer, image, aspect, level,
- aux_usage == ISL_AUX_USAGE_NONE);
+ if (aux_usage == ISL_AUX_USAGE_NONE) {
+ set_image_needs_resolve_bits(cmd_buffer, image, aspect, level, ~0);
+ } else {
+ clear_image_needs_resolve_bits(cmd_buffer, image, aspect, level, ~0);
+ }
/* The fast clear value dword(s) will be copied into a surface state object.
* Ensure that the restrictions of the fields in the dword(s) are followed.
@@ -812,12 +901,25 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
layer_count = MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
}
- load_needs_resolve_predicate(cmd_buffer, image, aspect, level);
+ enum anv_image_resolve_bits resolve_bits;
+ switch (resolve_op) {
+ case ISL_AUX_OP_FULL_RESOLVE:
+ resolve_bits = ANV_IMAGE_NEEDS_FULL_RESOLVE_BITS;
+ break;
+ case ISL_AUX_OP_PARTIAL_RESOLVE:
+ resolve_bits = ANV_IMAGE_NEEDS_PARTIAL_RESOLVE_BITS;
+ break;
+ default:
+ unreachable("Invalid resolve op");
+ }
+ load_needs_resolve_predicate(cmd_buffer, image, aspect, level,
+ resolve_bits);
anv_image_ccs_op(cmd_buffer, image, aspect, level,
base_layer, layer_count, resolve_op, true);
- set_image_needs_resolve(cmd_buffer, image, aspect, level, false);
+ clear_image_needs_resolve_bits(cmd_buffer, image, aspect,
+ level, resolve_bits);
}
cmd_buffer->state.pending_pipe_bits |=
@@ -2992,15 +3094,15 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)
* will match what's in every RENDER_SURFACE_STATE object when it's
* being used for sampling.
*/
- set_image_needs_resolve(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- false);
+ clear_image_needs_resolve_bits(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT);
} else {
- set_image_needs_resolve(cmd_buffer, iview->image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- true);
+ set_image_needs_resolve_bits(cmd_buffer, iview->image,
+ VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ ANV_IMAGE_HAS_FAST_CLEAR_BIT);
}
} else if (rp_att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
/* The attachment may have been fast-cleared in a previous render
--
2.5.0.400.gff86faf
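For readers following the bit-based tracking in the patch above, here is a minimal CPU-side sketch of the idea. It is not the driver code: the real helpers emit GPU commands against per-level state in memory, and the enum values and grouping below are assumptions (the thread only shows the names ANV_IMAGE_HAS_COMPRESSION_BIT, ANV_IMAGE_HAS_FAST_CLEAR_BIT and the two NEEDS_*_RESOLVE_BITS masks). All identifiers in the sketch are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the per-level aux state bits. */
enum image_resolve_bits {
   IMAGE_HAS_COMPRESSION_BIT = (1 << 0),
   IMAGE_HAS_FAST_CLEAR_BIT  = (1 << 1),
   /* Assumed grouping: a partial resolve only removes fast-clear data,
    * a full resolve removes fast-clear data and compression.
    */
   IMAGE_NEEDS_PARTIAL_RESOLVE_BITS = IMAGE_HAS_FAST_CLEAR_BIT,
   IMAGE_NEEDS_FULL_RESOLVE_BITS    = IMAGE_HAS_COMPRESSION_BIT |
                                      IMAGE_HAS_FAST_CLEAR_BIT,
};

/* Record that the level now may contain the given kinds of aux data. */
static void
set_bits(uint32_t *state, uint32_t bits)
{
   assert(state != NULL);
   *state |= bits;
}

/* Record that a resolve has removed the given kinds of aux data. */
static void
clear_bits(uint32_t *state, uint32_t bits)
{
   assert(state != NULL);
   *state &= ~bits;
}

/* A resolve is needed only if some bit it would remove is still set. */
static int
needs_resolve(uint32_t state, uint32_t resolve_bits)
{
   return (state & resolve_bits) != 0;
}
```

With this model, a level that was only written with compression needs a full resolve but not a partial one, which is the extra granularity the patch is after.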
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Jason Ekstrand
2017-11-28 03:06:09 UTC
---
src/intel/vulkan/anv_blorp.c | 243 ++++++++++++++++---------------------
src/intel/vulkan/anv_private.h | 17 ++-
src/intel/vulkan/genX_cmd_buffer.c | 68 ++++++++++-
3 files changed, 188 insertions(+), 140 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 7401234..45d7b12 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1132,143 +1132,6 @@ enum subpass_stage {
SUBPASS_STAGE_RESOLVE,
};

-static bool
-subpass_needs_clear(const struct anv_cmd_buffer *cmd_buffer)
-{
- const struct anv_cmd_state *cmd_state = &cmd_buffer->state;
- uint32_t ds = cmd_state->subpass->depth_stencil_attachment.attachment;
-
- if (ds != VK_ATTACHMENT_UNUSED) {
- assert(ds < cmd_state->pass->attachment_count);
- if (cmd_state->attachments[ds].pending_clear_aspects)
- return true;
- }
-
- return false;
-}
-
-void
-anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
-{
- const struct anv_cmd_state *cmd_state = &cmd_buffer->state;
- const VkRect2D render_area = cmd_buffer->state.render_area;
-
-
- if (!subpass_needs_clear(cmd_buffer))
- return;
-
- /* Because this gets called within a render pass, we tell blorp not to
- * trash our depth and stencil buffers.
- */
- struct blorp_batch batch;
- blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
- BLORP_BATCH_NO_EMIT_DEPTH_STENCIL);
-
- VkClearRect clear_rect = {
- .rect = cmd_buffer->state.render_area,
- .baseArrayLayer = 0,
- .layerCount = cmd_buffer->state.framebuffer->layers,
- };
-
- struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
-
- const uint32_t ds = cmd_state->subpass->depth_stencil_attachment.attachment;
- assert(ds == VK_ATTACHMENT_UNUSED || ds < cmd_state->pass->attachment_count);
-
- if (ds != VK_ATTACHMENT_UNUSED &&
- cmd_state->attachments[ds].pending_clear_aspects) {
-
- VkClearAttachment clear_att = {
- .aspectMask = cmd_state->attachments[ds].pending_clear_aspects,
- .clearValue = cmd_state->attachments[ds].clear_value,
- };
-
-
- const uint8_t gen = cmd_buffer->device->info.gen;
- bool clear_with_hiz = gen >= 8 && cmd_state->attachments[ds].aux_usage ==
- ISL_AUX_USAGE_HIZ;
- const struct anv_image_view *iview = fb->attachments[ds];
-
- if (clear_with_hiz) {
- const bool clear_depth = clear_att.aspectMask &
- VK_IMAGE_ASPECT_DEPTH_BIT;
- const bool clear_stencil = clear_att.aspectMask &
- VK_IMAGE_ASPECT_STENCIL_BIT;
-
- /* Check against restrictions for depth buffer clearing. A great GPU
- * performance benefit isn't expected when using the HZ sequence for
- * stencil-only clears. Therefore, we don't emit a HZ op sequence for
- * a stencil clear in addition to using the BLORP-fallback for depth.
- */
- if (clear_depth) {
- if (!blorp_can_hiz_clear_depth(gen, iview->planes[0].isl.format,
- iview->image->samples,
- render_area.offset.x,
- render_area.offset.y,
- render_area.offset.x +
- render_area.extent.width,
- render_area.offset.y +
- render_area.extent.height)) {
- clear_with_hiz = false;
- } else if (clear_att.clearValue.depthStencil.depth !=
- ANV_HZ_FC_VAL) {
- /* Don't enable fast depth clears for any color not equal to
- * ANV_HZ_FC_VAL.
- */
- clear_with_hiz = false;
- } else if (gen == 8 &&
- anv_can_sample_with_hiz(&cmd_buffer->device->info,
- iview->image)) {
- /* Only gen9+ supports returning ANV_HZ_FC_VAL when sampling a
- * fast-cleared portion of a HiZ buffer. Testing has revealed
- * that Gen8 only supports returning 0.0f. Gens prior to gen8 do
- * not support this feature at all.
- */
- clear_with_hiz = false;
- }
- }
-
- if (clear_with_hiz) {
- blorp_gen8_hiz_clear_attachments(&batch, iview->image->samples,
- render_area.offset.x,
- render_area.offset.y,
- render_area.offset.x +
- render_area.extent.width,
- render_area.offset.y +
- render_area.extent.height,
- clear_depth, clear_stencil,
- clear_att.clearValue.
- depthStencil.stencil);
-
- /* From the SKL PRM, Depth Buffer Clear:
- *
- * Depth Buffer Clear Workaround
- * Depth buffer clear pass using any of the methods (WM_STATE,
- * 3DSTATE_WM or 3DSTATE_WM_HZ_OP) must be followed by a
- * PIPE_CONTROL command with DEPTH_STALL bit and Depth FLUSH bits
- * “set” before starting to render. DepthStall and DepthFlush are
- * not needed between consecutive depth clear passes nor is it
- * required if the depth-clear pass was done with “full_surf_clear”
- * bit set in the 3DSTATE_WM_HZ_OP.
- */
- if (clear_depth) {
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_DEPTH_CACHE_FLUSH_BIT | ANV_PIPE_DEPTH_STALL_BIT;
- }
- }
- }
-
- if (!clear_with_hiz) {
- clear_depth_stencil_attachment(cmd_buffer, &batch,
- &clear_att, 1, &clear_rect);
- }
-
- cmd_state->attachments[ds].pending_clear_aspects = 0;
- }
-
- blorp_batch_finish(&batch);
-}
-
static void
resolve_surface(struct blorp_batch *batch,
struct blorp_surf *src_surf,
@@ -1568,6 +1431,53 @@ anv_image_clear_color(struct anv_cmd_buffer *cmd_buffer,
}

void
+anv_image_clear_depth_stencil(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlags aspects,
+ enum isl_aux_usage depth_aux_usage,
+ uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ VkRect2D area,
+ float depth_value, uint8_t stencil_value)
+{
+ assert(aspects & (VK_IMAGE_ASPECT_DEPTH_BIT |
+ VK_IMAGE_ASPECT_STENCIL_BIT));
+
+ struct blorp_batch batch;
+ blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
+
+ struct blorp_surf depth, stencil;
+ if (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
+ get_blorp_surf_for_anv_image(cmd_buffer->device,
+ image, VK_IMAGE_ASPECT_DEPTH_BIT,
+ depth_aux_usage, &depth);
+ depth.clear_color.f32[0] = ANV_HZ_FC_VAL;
+ } else {
+ memset(&depth, 0, sizeof(depth));
+ }
+
+ if (image->aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
+ get_blorp_surf_for_anv_image(cmd_buffer->device,
+ image, VK_IMAGE_ASPECT_STENCIL_BIT,
+ ISL_AUX_USAGE_NONE, &stencil);
+ } else {
+ memset(&stencil, 0, sizeof(stencil));
+ }
+
+ blorp_clear_depth_stencil(&batch, &depth, &stencil,
+ level, base_layer, layer_count,
+ area.offset.x, area.offset.y,
+ area.offset.x + area.extent.width,
+ area.offset.y + area.extent.height,
+ aspects & VK_IMAGE_ASPECT_DEPTH_BIT,
+ depth_value,
+ (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) ? 0xff : 0,
+ stencil_value);
+
+ blorp_batch_finish(&batch);
+}
+
+void
anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect, uint32_t level,
@@ -1595,6 +1505,65 @@ anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
}

void
+anv_image_hiz_clear(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlags aspects,
+ uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ VkRect2D area, uint8_t stencil_value)
+{
+ assert(aspects & (VK_IMAGE_ASPECT_DEPTH_BIT |
+ VK_IMAGE_ASPECT_STENCIL_BIT));
+ assert(base_layer + layer_count <=
+ anv_image_aux_layers(image, VK_IMAGE_ASPECT_DEPTH_BIT, 0));
+
+ struct blorp_batch batch;
+ blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
+
+ struct blorp_surf depth;
+ get_blorp_surf_for_anv_image(cmd_buffer->device,
+ image, VK_IMAGE_ASPECT_DEPTH_BIT,
+ ISL_AUX_USAGE_HIZ, &depth);
+ depth.clear_color.f32[0] = ANV_HZ_FC_VAL;
+
+ struct blorp_surf stencil;
+ if (image->aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
+ get_blorp_surf_for_anv_image(cmd_buffer->device,
+ image, VK_IMAGE_ASPECT_STENCIL_BIT,
+ ISL_AUX_USAGE_NONE, &stencil);
+ } else {
+ memset(&stencil, 0, sizeof(stencil));
+ }
+
+ blorp_hiz_clear_depth_stencil(&batch, &depth, &stencil,
+ level, base_layer, layer_count,
+ area.offset.x, area.offset.y,
+ area.offset.x + area.extent.width,
+ area.offset.y + area.extent.height,
+ aspects & VK_IMAGE_ASPECT_DEPTH_BIT,
+ ANV_HZ_FC_VAL,
+ aspects & VK_IMAGE_ASPECT_STENCIL_BIT,
+ stencil_value);
+
+ blorp_batch_finish(&batch);
+
+ /* From the SKL PRM, Depth Buffer Clear:
+ *
+ * Depth Buffer Clear Workaround
+ * Depth buffer clear pass using any of the methods (WM_STATE, 3DSTATE_WM
+ * or 3DSTATE_WM_HZ_OP) must be followed by a PIPE_CONTROL command with
+ * DEPTH_STALL bit and Depth FLUSH bits “set” before starting to render.
+ * DepthStall and DepthFlush are not needed between consecutive depth clear
+ * passes nor is it required if the depth-clear pass was done with
+ * “full_surf_clear” bit set in the 3DSTATE_WM_HZ_OP.
+ */
+ if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT) {
+ cmd_buffer->state.pending_pipe_bits |=
+ ANV_PIPE_DEPTH_CACHE_FLUSH_BIT | ANV_PIPE_DEPTH_STALL_BIT;
+ }
+}
+
+void
anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect,
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index bc355bb..b881157 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1875,7 +1875,6 @@ anv_cmd_buffer_push_constants(struct anv_cmd_buffer *cmd_buffer,
struct anv_state
anv_cmd_buffer_cs_push_constants(struct anv_cmd_buffer *cmd_buffer);

-void anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer);
void anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer);

const struct anv_image_view *
@@ -2545,12 +2544,28 @@ anv_image_clear_color(struct anv_cmd_buffer *cmd_buffer,
uint32_t level, uint32_t base_layer, uint32_t layer_count,
VkRect2D area, union isl_color_value clear_color);
void
+anv_image_clear_depth_stencil(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlags aspects,
+ enum isl_aux_usage depth_aux_usage,
+ uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ VkRect2D area,
+ float depth_value, uint8_t stencil_value);
+void
anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect, uint32_t level,
uint32_t base_layer, uint32_t layer_count,
enum isl_aux_op hiz_op);
void
+anv_image_hiz_clear(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlags aspects,
+ uint32_t level,
+ uint32_t base_layer, uint32_t layer_count,
+ VkRect2D area, uint8_t stencil_value);
+void
anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 265ae44..57685bd 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3216,9 +3216,73 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
att_state->pending_clear_aspects = 0;
}

- cmd_buffer_emit_depth_stencil(cmd_buffer);
+ if (subpass->depth_stencil_attachment.attachment != VK_ATTACHMENT_UNUSED) {
+ const uint32_t a = subpass->depth_stencil_attachment.attachment;
+
+ assert(a < cmd_state->pass->attachment_count);
+ struct anv_attachment_state *att_state = &cmd_state->attachments[a];
+ struct anv_image_view *iview = fb->attachments[a];
+ const struct anv_image *image = iview->image;
+
+ assert(image->aspects & (VK_IMAGE_ASPECT_DEPTH_BIT |
+ VK_IMAGE_ASPECT_STENCIL_BIT));
+
+ if (att_state->pending_clear_aspects) {
+ bool clear_with_hiz = att_state->aux_usage == ISL_AUX_USAGE_HIZ;
+ if (clear_with_hiz &&
+ (att_state->pending_clear_aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {
+ if (!blorp_can_hiz_clear_depth(GEN_GEN,
+ iview->planes[0].isl.format,
+ iview->image->samples,
+ render_area.offset.x,
+ render_area.offset.y,
+ render_area.offset.x +
+ render_area.extent.width,
+ render_area.offset.y +
+ render_area.extent.height)) {
+ clear_with_hiz = false;
+ } else if (att_state->clear_value.depthStencil.depth != ANV_HZ_FC_VAL) {
+ clear_with_hiz = false;
+ } else if (GEN_GEN == 8 &&
+ anv_can_sample_with_hiz(&cmd_buffer->device->info,
+ iview->image)) {
+ /* Only gen9+ supports returning ANV_HZ_FC_VAL when sampling a
+ * fast-cleared portion of a HiZ buffer. Testing has revealed
+ * that Gen8 only supports returning 0.0f. Gens prior to gen8
+ * do not support this feature at all.
+ */
+ clear_with_hiz = false;
+ }
+ }

- anv_cmd_buffer_clear_subpass(cmd_buffer);
+ if (clear_with_hiz) {
+ /* We currently only support HiZ for single-layer images */
+ assert(iview->planes[0].isl.base_level == 0);
+ assert(iview->planes[0].isl.base_array_layer == 0);
+ assert(fb->layers == 1);
+
+ anv_image_hiz_clear(cmd_buffer, image,
+ att_state->pending_clear_aspects,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers, render_area,
+ att_state->clear_value.depthStencil.stencil);
+ } else {
+ anv_image_clear_depth_stencil(cmd_buffer, image,
+ att_state->pending_clear_aspects,
+ att_state->aux_usage,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers, render_area,
+ att_state->clear_value.depthStencil.depth,
+ att_state->clear_value.depthStencil.stencil);
+ }
+ }
+
+ att_state->pending_clear_aspects = 0;
+ }
+
+ cmd_buffer_emit_depth_stencil(cmd_buffer);
}

static void
--
2.5.0.400.gff86faf
Jason Ekstrand
2017-11-28 03:06:11 UTC
This unifies things a bit because we now handle depth and stencil at the
same time. It also ensures that clears happen for input attachments.
---
src/intel/vulkan/genX_cmd_buffer.c | 69 ++++++++++++++++----------------------
1 file changed, 28 insertions(+), 41 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 3f90c1a..e5e0d1c 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3219,58 +3219,43 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,

VkRect2D render_area = cmd_buffer->state.render_area;
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
- for (uint32_t i = 0; i < subpass->color_count; ++i) {
- const uint32_t a = subpass->color_attachments[i].attachment;
+
+ for (uint32_t i = 0; i < subpass->attachment_count; ++i) {
+ const uint32_t a = subpass->attachments[i].attachment;
if (a == VK_ATTACHMENT_UNUSED)
continue;

assert(a < cmd_state->pass->attachment_count);
struct anv_attachment_state *att_state = &cmd_state->attachments[a];

- if (!att_state->pending_clear_aspects)
- continue;
-
- assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);
-
struct anv_image_view *iview = fb->attachments[a];
const struct anv_image *image = iview->image;

- /* Multi-planar images are not supported as attachments */
- assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
- assert(image->n_planes == 1);
-
- if (att_state->fast_clear) {
- anv_image_ccs_op(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
- iview->planes[0].isl.base_level,
- iview->planes[0].isl.base_array_layer,
- fb->layers,
- ISL_AUX_OP_FAST_CLEAR, false);
- } else {
- anv_image_clear_color(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
- att_state->aux_usage,
- iview->planes[0].isl.format,
- iview->planes[0].isl.swizzle,
- iview->planes[0].isl.base_level,
- iview->planes[0].isl.base_array_layer,
- fb->layers, render_area,
- vk_to_isl_color(att_state->clear_value.color));
- }
-
- att_state->pending_clear_aspects = 0;
- }
-
- if (subpass->depth_stencil_attachment.attachment != VK_ATTACHMENT_UNUSED) {
- const uint32_t a = subpass->depth_stencil_attachment.attachment;
-
- assert(a < cmd_state->pass->attachment_count);
- struct anv_attachment_state *att_state = &cmd_state->attachments[a];
- struct anv_image_view *iview = fb->attachments[a];
- const struct anv_image *image = iview->image;
+ if (att_state->pending_clear_aspects & VK_IMAGE_ASPECT_COLOR_BIT) {
+ assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);

- assert(image->aspects & (VK_IMAGE_ASPECT_DEPTH_BIT |
- VK_IMAGE_ASPECT_STENCIL_BIT));
+ /* Multi-planar images are not supported as attachments */
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ assert(image->n_planes == 1);

- if (att_state->pending_clear_aspects) {
+ if (att_state->fast_clear) {
+ anv_image_ccs_op(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ } else {
+ anv_image_clear_color(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.format,
+ iview->planes[0].isl.swizzle,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers, render_area,
+ vk_to_isl_color(att_state->clear_value.color));
+ }
+ } else if (att_state->pending_clear_aspects & (VK_IMAGE_ASPECT_DEPTH_BIT |
+ VK_IMAGE_ASPECT_STENCIL_BIT)) {
if (att_state->fast_clear) {
/* We currently only support HiZ for single-layer images */
assert(iview->planes[0].isl.base_level == 0);
@@ -3293,6 +3278,8 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
att_state->clear_value.depthStencil.depth,
att_state->clear_value.depthStencil.stencil);
}
+ } else {
+ assert(att_state->pending_clear_aspects == 0);
}

att_state->pending_clear_aspects = 0;
--
2.5.0.400.gff86faf
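The unified loop in the patch above picks one clear path per attachment based on its pending aspects. A small standalone sketch of that dispatch, assuming simplified stand-in types (none of the names below are driver identifiers), might look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical aspect flags, standing in for VK_IMAGE_ASPECT_*_BIT. */
enum aspect_bits {
   ASPECT_COLOR   = 1 << 0,
   ASPECT_DEPTH   = 1 << 1,
   ASPECT_STENCIL = 1 << 2,
};

/* Which clear helper the loop would end up calling. */
enum clear_path {
   CLEAR_NONE,
   CLEAR_COLOR_FAST,   /* CCS fast clear */
   CLEAR_COLOR_SLOW,   /* regular color clear */
   CLEAR_DS_FAST,      /* HiZ clear */
   CLEAR_DS_SLOW,      /* regular depth/stencil clear */
};

/* Simplified model of the per-attachment state used by the loop. */
struct att_model {
   uint32_t pending_clear_aspects;
   int fast_clear;
};

static enum clear_path
pick_clear_path(struct att_model *att)
{
   enum clear_path path;

   if (att->pending_clear_aspects & ASPECT_COLOR) {
      /* Color never mixes with depth or stencil on one attachment. */
      assert(att->pending_clear_aspects == ASPECT_COLOR);
      path = att->fast_clear ? CLEAR_COLOR_FAST : CLEAR_COLOR_SLOW;
   } else if (att->pending_clear_aspects &
              (ASPECT_DEPTH | ASPECT_STENCIL)) {
      path = att->fast_clear ? CLEAR_DS_FAST : CLEAR_DS_SLOW;
   } else {
      assert(att->pending_clear_aspects == 0);
      path = CLEAR_NONE;
   }

   /* The clear is consumed regardless of which path was taken. */
   att->pending_clear_aspects = 0;
   return path;
}
```

Because the pending aspects are reset at the end, calling the function a second time on the same attachment yields CLEAR_NONE, mirroring how a clear only happens the first time an attachment is used.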
Jason Ekstrand
2017-11-28 03:06:05 UTC
Having begin/end_subpass is a bit nicer than the begin/next/end hooks
that Vulkan gives us.
---
src/intel/vulkan/genX_cmd_buffer.c | 55 +++++++++++++++++++++-----------------
1 file changed, 31 insertions(+), 24 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index bbe97f5..6f2fa0a 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3138,10 +3138,11 @@ cmd_buffer_subpass_sync_fast_clear_values(struct anv_cmd_buffer *cmd_buffer)


static void
-genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
- struct anv_subpass *subpass)
+cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
+ struct anv_subpass *subpass)
{
cmd_buffer->state.subpass = subpass;
+ uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);

cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;

@@ -3155,6 +3156,10 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
if (GEN_GEN == 7)
cmd_buffer->state.vb_dirty |= ~0;

+ /* Accumulate any subpass flushes that need to happen before the subpass */
+ cmd_buffer->state.pending_pipe_bits |=
+ cmd_buffer->state.pass->subpass_flushes[subpass_id];
+
/* Perform transitions to the subpass layout before any writes have
* occurred.
*/
@@ -3174,6 +3179,26 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
anv_cmd_buffer_clear_subpass(cmd_buffer);
}

+static void
+cmd_buffer_end_subpass(struct anv_cmd_buffer *cmd_buffer)
+{
+ uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);
+
+ anv_cmd_buffer_resolve_subpass(cmd_buffer);
+
+ /* Perform transitions to the final layout after all writes have occurred.
+ */
+ cmd_buffer_subpass_transition_layouts(cmd_buffer, true);
+
+ /* Accumulate any subpass flushes that need to happen after the subpass.
+ * Yes, they do get accumulated twice in the NextSubpass case but since
+ * genX_CmdNextSubpass just calls end/begin back-to-back, we just end up
+ * ORing the bits in twice so it's harmless.
+ */
+ cmd_buffer->state.pending_pipe_bits |=
+ cmd_buffer->state.pass->subpass_flushes[subpass_id + 1];
+}
+
void genX(CmdBeginRenderPass)(
VkCommandBuffer commandBuffer,
const VkRenderPassBeginInfo* pRenderPassBegin,
@@ -3197,10 +3222,7 @@ void genX(CmdBeginRenderPass)(

genX(flush_pipeline_select_3d)(cmd_buffer);

- cmd_buffer->state.pending_pipe_bits |=
- cmd_buffer->state.pass->subpass_flushes[0];
-
- genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
+ cmd_buffer_begin_subpass(cmd_buffer, pass->subpasses);
}

void genX(CmdNextSubpass)(
@@ -3214,17 +3236,9 @@ void genX(CmdNextSubpass)(

assert(cmd_buffer->level == VK_COMMAND_BUFFER_LEVEL_PRIMARY);

- anv_cmd_buffer_resolve_subpass(cmd_buffer);
-
- /* Perform transitions to the final layout after all writes have occurred.
- */
- cmd_buffer_subpass_transition_layouts(cmd_buffer, true);
-
- uint32_t subpass_id = anv_get_subpass_id(&cmd_buffer->state);
- cmd_buffer->state.pending_pipe_bits |=
- cmd_buffer->state.pass->subpass_flushes[subpass_id];
+ cmd_buffer_end_subpass(cmd_buffer);

- genX(cmd_buffer_set_subpass)(cmd_buffer, cmd_buffer->state.subpass + 1);
+ cmd_buffer_begin_subpass(cmd_buffer, cmd_buffer->state.subpass + 1);
}

void genX(CmdEndRenderPass)(
@@ -3235,14 +3249,7 @@ void genX(CmdEndRenderPass)(
if (anv_batch_has_error(&cmd_buffer->batch))
return;

- anv_cmd_buffer_resolve_subpass(cmd_buffer);
-
- /* Perform transitions to the final layout after all writes have occurred.
- */
- cmd_buffer_subpass_transition_layouts(cmd_buffer, true);
-
- cmd_buffer->state.pending_pipe_bits |=
- cmd_buffer->state.pass->subpass_flushes[cmd_buffer->state.pass->subpass_count];
+ cmd_buffer_end_subpass(cmd_buffer);

cmd_buffer->state.hiz_enabled = false;
--
2.5.0.400.gff86faf
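The flush accounting in the patch above can be modelled in a few lines. This is only a sketch under simplifying assumptions: the struct and function names are hypothetical, the pass has a fixed two subpasses, and pending bits are never consumed. It shows that begin ORs in entry [id], end ORs in entry [id + 1], and that NextSubpass ORing the same entry twice is harmless.

```c
#include <assert.h>
#include <stdint.h>

#define SUBPASS_COUNT 2

struct pass_model {
   /* One entry before each subpass plus one after the last subpass. */
   uint32_t subpass_flushes[SUBPASS_COUNT + 1];
};

struct cmd_model {
   const struct pass_model *pass;
   uint32_t subpass_id;
   uint32_t pending_pipe_bits;
};

static void
begin_subpass(struct cmd_model *cmd, uint32_t subpass_id)
{
   assert(subpass_id < SUBPASS_COUNT);
   cmd->subpass_id = subpass_id;

   /* Flushes that must happen before this subpass. */
   cmd->pending_pipe_bits |= cmd->pass->subpass_flushes[subpass_id];
}

static void
end_subpass(struct cmd_model *cmd)
{
   /* Flushes that must happen after this subpass. */
   cmd->pending_pipe_bits |= cmd->pass->subpass_flushes[cmd->subpass_id + 1];
}

static void
next_subpass(struct cmd_model *cmd)
{
   /* end ORs in entry [id + 1] and begin ORs in the same entry again;
    * OR is idempotent, so the second accumulation changes nothing.
    */
   end_subpass(cmd);
   begin_subpass(cmd, cmd->subpass_id + 1);
}
```

For a pass with flush entries {0x1, 0x2, 0x4}, begin(0) leaves 0x1 pending, next_subpass() leaves 0x3, and the final end leaves 0x7.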
Pohjolainen, Topi
2017-12-04 13:52:33 UTC
Post by Jason Ekstrand
Having begin/end_subpass is a bit nicer than the begin/next/end hooks
that Vulkan gives us.
---
src/intel/vulkan/genX_cmd_buffer.c | 55 +++++++++++++++++++++-----------------
1 file changed, 31 insertions(+), 24 deletions(-)
As I said on the previous patch, unlike there, here things make sense as
Jason Ekstrand
2017-11-28 03:06:10 UTC
This moves the decision out of begin_subpass and into BeginRenderPass
like the decision for color clears. We use a similar name for the
function for depth/stencil as for color even though no aux usage is
really getting computed.
---
src/intel/vulkan/genX_cmd_buffer.c | 84 +++++++++++++++++++++++---------------
1 file changed, 50 insertions(+), 34 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 57685bd..3f90c1a 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -346,6 +346,52 @@ color_attachment_compute_aux_usage(struct anv_device * device,
}
}

+static void
+depth_stencil_attachment_compute_aux_usage(struct anv_device *device,
+ struct anv_cmd_state *cmd_state,
+ uint32_t att, VkRect2D render_area)
+{
+ struct anv_attachment_state *att_state = &cmd_state->attachments[att];
+ struct anv_image_view *iview = cmd_state->framebuffer->attachments[att];
+
+ /* These will be initialized after the first subpass transition. */
+ att_state->aux_usage = ISL_AUX_USAGE_NONE;
+ att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
+
+ if (att_state->aux_usage != ISL_AUX_USAGE_HIZ) {
+ att_state->fast_clear = false;
+ return;
+ } else if (!(att_state->pending_clear_aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {
+ /* If we're just clearing stencil, we can always HiZ clear */
+ att_state->fast_clear = true;
+ return;
+ }
+
+ if (!blorp_can_hiz_clear_depth(GEN_GEN,
+ iview->planes[0].isl.format,
+ iview->image->samples,
+ render_area.offset.x,
+ render_area.offset.y,
+ render_area.offset.x +
+ render_area.extent.width,
+ render_area.offset.y +
+ render_area.extent.height)) {
+ att_state->fast_clear = false;
+ } else if (att_state->clear_value.depthStencil.depth != ANV_HZ_FC_VAL) {
+ att_state->fast_clear = false;
+ } else if (GEN_GEN == 8 &&
+ anv_can_sample_with_hiz(&device->info, iview->image)) {
+ /* Only gen9+ supports returning ANV_HZ_FC_VAL when sampling a
+ * fast-cleared portion of a HiZ buffer. Testing has revealed that Gen8
+ * only supports returning 0.0f. Gens prior to gen8 do not support this
+ * feature at all.
+ */
+ att_state->fast_clear = false;
+ } else {
+ att_state->fast_clear = true;
+ }
+}
+
static bool
need_input_attachment_state(const struct anv_render_pass_attachment *att)
{
@@ -1052,12 +1098,9 @@ genX(cmd_buffer_setup_attachments)(struct anv_cmd_buffer *cmd_buffer,
add_image_view_relocs(cmd_buffer, iview, 0,
state->attachments[i].color);
} else {
- /* This field will be initialized after the first subpass
- * transition.
- */
- state->attachments[i].aux_usage = ISL_AUX_USAGE_NONE;
-
- state->attachments[i].input_aux_usage = ISL_AUX_USAGE_NONE;
+ depth_stencil_attachment_compute_aux_usage(cmd_buffer->device,
+ state, i,
+ begin->renderArea);
}

if (need_input_attachment_state(&pass->attachments[i])) {
@@ -3228,34 +3271,7 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
VK_IMAGE_ASPECT_STENCIL_BIT));

if (att_state->pending_clear_aspects) {
- bool clear_with_hiz = att_state->aux_usage == ISL_AUX_USAGE_HIZ;
- if (clear_with_hiz &&
- (att_state->pending_clear_aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) {
- if (!blorp_can_hiz_clear_depth(GEN_GEN,
- iview->planes[0].isl.format,
- iview->image->samples,
- render_area.offset.x,
- render_area.offset.y,
- render_area.offset.x +
- render_area.extent.width,
- render_area.offset.y +
- render_area.extent.height)) {
- clear_with_hiz = false;
- } else if (att_state->clear_value.depthStencil.depth != ANV_HZ_FC_VAL) {
- clear_with_hiz = false;
- } else if (GEN_GEN == 8 &&
- anv_can_sample_with_hiz(&cmd_buffer->device->info,
- iview->image)) {
- /* Only gen9+ supports returning ANV_HZ_FC_VAL when sampling a
- * fast-cleared portion of a HiZ buffer. Testing has revealed
- * that Gen8 only supports returning 0.0f. Gens prior to gen8
- * do not support this feature at all.
- */
- clear_with_hiz = false;
- }
- }
-
- if (clear_with_hiz) {
+ if (att_state->fast_clear) {
/* We currently only support HiZ for single-layer images */
assert(iview->planes[0].isl.base_level == 0);
assert(iview->planes[0].isl.base_array_layer == 0);
--
2.5.0.400.gff86faf
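The decision chain that the patch above moves into BeginRenderPass can be written as a pure function, which makes the order of the checks easier to see. This is a sketch, not the driver function: the struct and names are hypothetical, the value 1.0f for the fast-clear depth is an assumption (the thread only names ANV_HZ_FC_VAL), and the rectangle check is reduced to a boolean input. Note that in the patch as posted, aux_usage is set to ISL_AUX_USAGE_NONE just before it is compared with ISL_AUX_USAGE_HIZ, so the sketch takes HiZ availability as an explicit input instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed fast-clear depth value; stands in for ANV_HZ_FC_VAL. */
#define HZ_FC_VAL 1.0f

/* Hypothetical bundle of the inputs the decision depends on. */
struct depth_clear_query {
   bool aux_is_hiz;           /* attachment uses HiZ */
   bool clears_depth;         /* depth aspect is pending a clear */
   bool rect_ok;              /* stands in for blorp_can_hiz_clear_depth() */
   float depth_value;         /* requested clear depth */
   int gen;                   /* hardware generation */
   bool can_sample_with_hiz;  /* stands in for anv_can_sample_with_hiz() */
};

static bool
can_fast_clear_depth_stencil(const struct depth_clear_query *q)
{
   assert(q != NULL);

   if (!q->aux_is_hiz)
      return false;

   /* A stencil-only clear can always take the HiZ path. */
   if (!q->clears_depth)
      return true;

   if (!q->rect_ok)
      return false;

   /* Only the one fast-clear depth value is supported. */
   if (q->depth_value != HZ_FC_VAL)
      return false;

   /* Gen8 returns 0.0f when sampling fast-cleared HiZ, so it cannot be
    * fast-cleared to HZ_FC_VAL if the image may be sampled with HiZ.
    */
   if (q->gen == 8 && q->can_sample_with_hiz)
      return false;

   return true;
}
```

Keeping the inputs explicit like this also makes it straightforward to evaluate the decision once per render pass and cache the result in a fast_clear flag, as the patch does.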
Jason Ekstrand
2017-11-28 03:06:07 UTC
This doesn't really change much now but it will give us more/better
control over clears in the future. The one interesting functional
change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFER and
friends for each clear. However, this only happens at begin_subpass
time so it shouldn't be substantially more expensive.
---
src/intel/vulkan/anv_blorp.c | 115 +++++++++++--------------------------
src/intel/vulkan/anv_private.h | 8 +++
src/intel/vulkan/genX_cmd_buffer.c | 46 ++++++++++++++-
3 files changed, 86 insertions(+), 83 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 46e2eb0..7401234 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1138,17 +1138,6 @@ subpass_needs_clear(const struct anv_cmd_buffer *cmd_buffer)
const struct anv_cmd_state *cmd_state = &cmd_buffer->state;
uint32_t ds = cmd_state->subpass->depth_stencil_attachment.attachment;

- for (uint32_t i = 0; i < cmd_state->subpass->color_count; ++i) {
- uint32_t a = cmd_state->subpass->color_attachments[i].attachment;
- if (a == VK_ATTACHMENT_UNUSED)
- continue;
-
- assert(a < cmd_state->pass->attachment_count);
- if (cmd_state->attachments[a].pending_clear_aspects) {
- return true;
- }
- }
-
if (ds != VK_ATTACHMENT_UNUSED) {
assert(ds < cmd_state->pass->attachment_count);
if (cmd_state->attachments[ds].pending_clear_aspects)
@@ -1182,77 +1171,6 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
};

struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
- for (uint32_t i = 0; i < cmd_state->subpass->color_count; ++i) {
- const uint32_t a = cmd_state->subpass->color_attachments[i].attachment;
- if (a == VK_ATTACHMENT_UNUSED)
- continue;
-
- assert(a < cmd_state->pass->attachment_count);
- struct anv_attachment_state *att_state = &cmd_state->attachments[a];
-
- if (!att_state->pending_clear_aspects)
- continue;
-
- assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);
-
- struct anv_image_view *iview = fb->attachments[a];
- const struct anv_image *image = iview->image;
- struct blorp_surf surf;
- get_blorp_surf_for_anv_image(cmd_buffer->device,
- image, VK_IMAGE_ASPECT_COLOR_BIT,
- att_state->aux_usage, &surf);
-
- if (att_state->fast_clear) {
- surf.clear_color = vk_to_isl_color(att_state->clear_value.color);
-
- /* From the Sky Lake PRM Vol. 7, "Render Target Fast Clear":
- *
- * "After Render target fast clear, pipe-control with color cache
- * write-flush must be issued before sending any DRAW commands on
- * that render target."
- *
- * This comment is a bit cryptic and doesn't really tell you what's
- * going or what's really needed. It appears that fast clear ops are
- * not properly synchronized with other drawing. This means that we
- * cannot have a fast clear operation in the pipe at the same time as
- * other regular drawing operations. We need to use a PIPE_CONTROL
- * to ensure that the contents of the previous draw hit the render
- * target before we resolve and then use a second PIPE_CONTROL after
- * the resolve to ensure that it is completed before any additional
- * drawing occurs.
- */
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
-
- assert(image->n_planes == 1);
- blorp_fast_clear(&batch, &surf, iview->planes[0].isl.format,
- iview->planes[0].isl.base_level,
- iview->planes[0].isl.base_array_layer, fb->layers,
- render_area.offset.x, render_area.offset.y,
- render_area.offset.x + render_area.extent.width,
- render_area.offset.y + render_area.extent.height);
-
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
- } else {
- assert(image->n_planes == 1);
- anv_cmd_buffer_mark_image_written(cmd_buffer, image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- att_state->aux_usage,
- iview->planes[0].isl.base_level);
-
- blorp_clear(&batch, &surf, iview->planes[0].isl.format,
- anv_swizzle_for_render(iview->planes[0].isl.swizzle),
- iview->planes[0].isl.base_level,
- iview->planes[0].isl.base_array_layer, fb->layers,
- render_area.offset.x, render_area.offset.y,
- render_area.offset.x + render_area.extent.width,
- render_area.offset.y + render_area.extent.height,
- vk_to_isl_color(att_state->clear_value.color), NULL);
- }
-
- att_state->pending_clear_aspects = 0;
- }

const uint32_t ds = cmd_state->subpass->depth_stencil_attachment.attachment;
assert(ds == VK_ATTACHMENT_UNUSED || ds < cmd_state->pass->attachment_count);
@@ -1617,6 +1535,39 @@ isl_to_blorp_hiz_op(enum isl_aux_op isl_op)
}

void
+anv_image_clear_color(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ enum isl_format format, struct isl_swizzle swizzle,
+ uint32_t level, uint32_t base_layer, uint32_t layer_count,
+ VkRect2D area, union isl_color_value clear_color)
+{
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+
+ /* We don't support planar images with multisampling yet */
+ assert(image->n_planes == 1);
+
+ struct blorp_batch batch;
+ blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
+
+ struct blorp_surf surf;
+ get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
+ aux_usage, &surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, image, aspect,
+ aux_usage, level);
+
+ blorp_clear(&batch, &surf, format, anv_swizzle_for_render(swizzle),
+ level, base_layer, layer_count,
+ area.offset.x, area.offset.y,
+ area.offset.x + area.extent.width,
+ area.offset.y + area.extent.height,
+ clear_color, NULL);
+
+ blorp_batch_finish(&batch);
+}
+
+void
anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect, uint32_t level,
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index e875705..bc355bb 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2537,6 +2537,14 @@ anv_cmd_buffer_mark_image_written(struct anv_cmd_buffer *cmd_buffer,
unsigned level);

void
+anv_image_clear_color(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ enum isl_format format, struct isl_swizzle swizzle,
+ uint32_t level, uint32_t base_layer, uint32_t layer_count,
+ VkRect2D area, union isl_color_value clear_color);
+void
anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect, uint32_t level,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 56036f7..265ae44 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3140,7 +3140,9 @@ static void
cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
uint32_t subpass_id)
{
- cmd_buffer->state.subpass = &cmd_buffer->state.pass->subpasses[subpass_id];
+ struct anv_cmd_state *cmd_state = &cmd_buffer->state;
+ struct anv_subpass *subpass = &cmd_state->pass->subpasses[subpass_id];
+ cmd_state->subpass = subpass;

cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;

@@ -3172,6 +3174,48 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
*/
cmd_buffer_subpass_sync_fast_clear_values(cmd_buffer);

+ VkRect2D render_area = cmd_buffer->state.render_area;
+ struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
+ for (uint32_t i = 0; i < subpass->color_count; ++i) {
+ const uint32_t a = subpass->color_attachments[i].attachment;
+ if (a == VK_ATTACHMENT_UNUSED)
+ continue;
+
+ assert(a < cmd_state->pass->attachment_count);
+ struct anv_attachment_state *att_state = &cmd_state->attachments[a];
+
+ if (!att_state->pending_clear_aspects)
+ continue;
+
+ assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+
+ struct anv_image_view *iview = fb->attachments[a];
+ const struct anv_image *image = iview->image;
+
+ /* Multi-planar images are not supported as attachments */
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ assert(image->n_planes == 1);
+
+ if (att_state->fast_clear) {
+ anv_image_ccs_op(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers,
+ ISL_AUX_OP_FAST_CLEAR, false);
+ } else {
+ anv_image_clear_color(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.format,
+ iview->planes[0].isl.swizzle,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers, render_area,
+ vk_to_isl_color(att_state->clear_value.color));
+ }
+
+ att_state->pending_clear_aspects = 0;
+ }
+
cmd_buffer_emit_depth_stencil(cmd_buffer);

anv_cmd_buffer_clear_subpass(cmd_buffer);
--
2.5.0.400.gff86faf
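For readers following the diff, the color-clear loop that this patch moves into cmd_buffer_begin_subpass() can be modelled in isolation. What follows is a minimal sketch, not driver code: struct att_state, enum clear_kind, ATTACHMENT_UNUSED and begin_subpass_color_clears() are hypothetical stand-ins, and the real anv_image_ccs_op() / anv_image_clear_color() calls are replaced by simply recording which kind of clear would have been chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VK_ATTACHMENT_UNUSED. */
#define ATTACHMENT_UNUSED UINT32_MAX

enum clear_kind { CLEAR_NONE, CLEAR_FAST, CLEAR_SLOW };

/* Hypothetical, trimmed-down analogue of anv_attachment_state. */
struct att_state {
   uint32_t pending_clear_aspects;  /* non-zero while a clear is pending */
   bool fast_clear;                 /* decided before the subpass begins */
   enum clear_kind last_clear;      /* what this sketch "emitted" */
};

/* Walk the subpass color attachments, pick a fast or slow clear for each
 * one that still has a pending clear, and mark the clear as done.
 * Returns the number of clears chosen.
 */
static unsigned
begin_subpass_color_clears(const uint32_t *color_attachments,
                           uint32_t color_count,
                           struct att_state *attachments,
                           uint32_t attachment_count)
{
   unsigned clears = 0;

   for (uint32_t i = 0; i < color_count; ++i) {
      const uint32_t a = color_attachments[i];
      if (a == ATTACHMENT_UNUSED)
         continue;

      assert(a < attachment_count);
      struct att_state *att = &attachments[a];

      if (!att->pending_clear_aspects)
         continue;

      /* The patch calls anv_image_ccs_op(..., ISL_AUX_OP_FAST_CLEAR, ...)
       * or anv_image_clear_color(...) here; the sketch only records which.
       */
      att->last_clear = att->fast_clear ? CLEAR_FAST : CLEAR_SLOW;
      att->pending_clear_aspects = 0;
      clears++;
   }

   return clears;
}
```

Because pending_clear_aspects is zeroed once a clear has been chosen, calling the helper a second time for the same subpass selects nothing, which matches the intent of the original loop.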
Pohjolainen, Topi
2017-12-08 14:16:04 UTC
Post by Jason Ekstrand
This doesn't really change much now but it will give us more/better
control over clears in the future. The one interesting functional
change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and
friends for each clear. However, this only happens at begin_subpass
time so it shouldn't be substantially more expensive.
---
src/intel/vulkan/anv_blorp.c | 115 +++++++++++--------------------------
src/intel/vulkan/anv_private.h | 8 +++
src/intel/vulkan/genX_cmd_buffer.c | 46 ++++++++++++++-
3 files changed, 86 insertions(+), 83 deletions(-)
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 46e2eb0..7401234 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1138,17 +1138,6 @@ subpass_needs_clear(const struct anv_cmd_buffer *cmd_buffer)
const struct anv_cmd_state *cmd_state = &cmd_buffer->state;
uint32_t ds = cmd_state->subpass->depth_stencil_attachment.attachment;
- for (uint32_t i = 0; i < cmd_state->subpass->color_count; ++i) {
- uint32_t a = cmd_state->subpass->color_attachments[i].attachment;
- if (a == VK_ATTACHMENT_UNUSED)
- continue;
-
- assert(a < cmd_state->pass->attachment_count);
- if (cmd_state->attachments[a].pending_clear_aspects) {
- return true;
- }
- }
-
if (ds != VK_ATTACHMENT_UNUSED) {
assert(ds < cmd_state->pass->attachment_count);
if (cmd_state->attachments[ds].pending_clear_aspects)
@@ -1182,77 +1171,6 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
};
struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
- for (uint32_t i = 0; i < cmd_state->subpass->color_count; ++i) {
- const uint32_t a = cmd_state->subpass->color_attachments[i].attachment;
- if (a == VK_ATTACHMENT_UNUSED)
- continue;
-
- assert(a < cmd_state->pass->attachment_count);
- struct anv_attachment_state *att_state = &cmd_state->attachments[a];
-
- if (!att_state->pending_clear_aspects)
- continue;
-
- assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);
-
- struct anv_image_view *iview = fb->attachments[a];
- const struct anv_image *image = iview->image;
- struct blorp_surf surf;
- get_blorp_surf_for_anv_image(cmd_buffer->device,
- image, VK_IMAGE_ASPECT_COLOR_BIT,
- att_state->aux_usage, &surf);
-
- if (att_state->fast_clear) {
- surf.clear_color = vk_to_isl_color(att_state->clear_value.color);
-
- /* From the Sky Lake PRM Vol. 7, "Render Target Fast Clear":
- *
- * "After Render target fast clear, pipe-control with color cache
- * write-flush must be issued before sending any DRAW commands on
- * that render target."
- *
- * This comment is a bit cryptic and doesn't really tell you what's
- * going or what's really needed. It appears that fast clear ops are
- * not properly synchronized with other drawing. This means that we
- * cannot have a fast clear operation in the pipe at the same time as
- * other regular drawing operations. We need to use a PIPE_CONTROL
- * to ensure that the contents of the previous draw hit the render
- * target before we resolve and then use a second PIPE_CONTROL after
- * the resolve to ensure that it is completed before any additional
- * drawing occurs.
- */
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
-
- assert(image->n_planes == 1);
- blorp_fast_clear(&batch, &surf, iview->planes[0].isl.format,
- iview->planes[0].isl.base_level,
- iview->planes[0].isl.base_array_layer, fb->layers,
- render_area.offset.x, render_area.offset.y,
- render_area.offset.x + render_area.extent.width,
- render_area.offset.y + render_area.extent.height);
-
- cmd_buffer->state.pending_pipe_bits |=
- ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
- } else {
- assert(image->n_planes == 1);
- anv_cmd_buffer_mark_image_written(cmd_buffer, image,
- VK_IMAGE_ASPECT_COLOR_BIT,
- att_state->aux_usage,
- iview->planes[0].isl.base_level);
-
- blorp_clear(&batch, &surf, iview->planes[0].isl.format,
- anv_swizzle_for_render(iview->planes[0].isl.swizzle),
- iview->planes[0].isl.base_level,
- iview->planes[0].isl.base_array_layer, fb->layers,
- render_area.offset.x, render_area.offset.y,
- render_area.offset.x + render_area.extent.width,
- render_area.offset.y + render_area.extent.height,
- vk_to_isl_color(att_state->clear_value.color), NULL);
- }
-
- att_state->pending_clear_aspects = 0;
- }
const uint32_t ds = cmd_state->subpass->depth_stencil_attachment.attachment;
assert(ds == VK_ATTACHMENT_UNUSED || ds < cmd_state->pass->attachment_count);
@@ -1617,6 +1535,39 @@ isl_to_blorp_hiz_op(enum isl_aux_op isl_op)
}
void
+anv_image_clear_color(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ enum isl_format format, struct isl_swizzle swizzle,
+ uint32_t level, uint32_t base_layer, uint32_t layer_count,
+ VkRect2D area, union isl_color_value clear_color)
+{
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+
+ /* We don't support planar images with multisampling yet */
+ assert(image->n_planes == 1);
+
+ struct blorp_batch batch;
+ blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
+
+ struct blorp_surf surf;
+ get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
+ aux_usage, &surf);
+ anv_cmd_buffer_mark_image_written(cmd_buffer, image, aspect,
+ aux_usage, level);
+
+ blorp_clear(&batch, &surf, format, anv_swizzle_for_render(swizzle),
+ level, base_layer, layer_count,
+ area.offset.x, area.offset.y,
+ area.offset.x + area.extent.width,
+ area.offset.y + area.extent.height,
+ clear_color, NULL);
+
+ blorp_batch_finish(&batch);
+}
+
+void
anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect, uint32_t level,
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index e875705..bc355bb 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2537,6 +2537,14 @@ anv_cmd_buffer_mark_image_written(struct anv_cmd_buffer *cmd_buffer,
unsigned level);
void
+anv_image_clear_color(struct anv_cmd_buffer *cmd_buffer,
+ const struct anv_image *image,
+ VkImageAspectFlagBits aspect,
+ enum isl_aux_usage aux_usage,
+ enum isl_format format, struct isl_swizzle swizzle,
+ uint32_t level, uint32_t base_layer, uint32_t layer_count,
+ VkRect2D area, union isl_color_value clear_color);
+void
anv_image_hiz_op(struct anv_cmd_buffer *cmd_buffer,
const struct anv_image *image,
VkImageAspectFlagBits aspect, uint32_t level,
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 56036f7..265ae44 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -3140,7 +3140,9 @@ static void
cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
uint32_t subpass_id)
{
- cmd_buffer->state.subpass = &cmd_buffer->state.pass->subpasses[subpass_id];
+ struct anv_cmd_state *cmd_state = &cmd_buffer->state;
+ struct anv_subpass *subpass = &cmd_state->pass->subpasses[subpass_id];
+ cmd_state->subpass = subpass;
cmd_buffer->state.dirty |= ANV_CMD_DIRTY_RENDER_TARGETS;
@@ -3172,6 +3174,48 @@ cmd_buffer_begin_subpass(struct anv_cmd_buffer *cmd_buffer,
*/
cmd_buffer_subpass_sync_fast_clear_values(cmd_buffer);
+ VkRect2D render_area = cmd_buffer->state.render_area;
+ struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
+ for (uint32_t i = 0; i < subpass->color_count; ++i) {
+ const uint32_t a = subpass->color_attachments[i].attachment;
+ if (a == VK_ATTACHMENT_UNUSED)
+ continue;
+
+ assert(a < cmd_state->pass->attachment_count);
+ struct anv_attachment_state *att_state = &cmd_state->attachments[a];
+
+ if (!att_state->pending_clear_aspects)
+ continue;
+
+ assert(att_state->pending_clear_aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+
+ struct anv_image_view *iview = fb->attachments[a];
+ const struct anv_image *image = iview->image;
+
+ /* Multi-planar images are not supported as attachments */
+ assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
+ assert(image->n_planes == 1);
+
+ if (att_state->fast_clear) {
+ anv_image_ccs_op(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers,
+ ISL_AUX_OP_FAST_CLEAR, false);
I didn't spot anything amiss elsewhere. Here I'm a little puzzled, though. It
looks like there is a functional change, as the fast clear now targets the
entire level. Before, in anv_cmd_buffer_clear_subpass(), the render area was
passed to blorp_fast_clear():

blorp_fast_clear(&batch, &surf, iview->planes[0].isl.format,
iview->planes[0].isl.base_level,
iview->planes[0].isl.base_array_layer, fb->layers,
render_area.offset.x, render_area.offset.y,
render_area.offset.x + render_area.extent.width,
render_area.offset.y + render_area.extent.height);

Clearing the entire level makes sense to me; I'm mostly wondering whether we
really did partial fast clears before, or whether the render_area in practice
covered the entire level.
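The question above comes down to whether the render area spans the whole miplevel. A minimal sketch of that check, assuming a VkRect2D-like offset/extent pair and the usual halve-per-level minification; struct rect2d, minify_dim() and render_area_covers_level() are hypothetical names for illustration, not anv helpers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of VkRect2D: an offset plus an extent. */
struct rect2d {
   int32_t x, y;
   uint32_t width, height;
};

/* Size of one dimension at a given miplevel, never smaller than 1. */
static uint32_t
minify_dim(uint32_t size, uint32_t level)
{
   assert(level < 32);
   const uint32_t v = size >> level;
   return v ? v : 1;
}

/* True if the area starts at the origin and is at least as large as the
 * miplevel, i.e. a clear of the area touches every pixel of the level.
 */
static bool
render_area_covers_level(struct rect2d area,
                         uint32_t base_width, uint32_t base_height,
                         uint32_t level)
{
   return area.x == 0 && area.y == 0 &&
          area.width >= minify_dim(base_width, level) &&
          area.height >= minify_dim(base_height, level);
}
```

For a 256x256 image, an area of (0, 0, 256, 256) covers level 0, while an area offset to x = 16 does not; the old blorp_fast_clear() call would have received the smaller rectangle in the second case.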
Post by Jason Ekstrand
+ } else {
+ anv_image_clear_color(cmd_buffer, image, VK_IMAGE_ASPECT_COLOR_BIT,
+ att_state->aux_usage,
+ iview->planes[0].isl.format,
+ iview->planes[0].isl.swizzle,
+ iview->planes[0].isl.base_level,
+ iview->planes[0].isl.base_array_layer,
+ fb->layers, render_area,
+ vk_to_isl_color(att_state->clear_value.color));
+ }
+
+ att_state->pending_clear_aspects = 0;
+ }
+
cmd_buffer_emit_depth_stencil(cmd_buffer);
anv_cmd_buffer_clear_subpass(cmd_buffer);
--
2.5.0.400.gff86faf
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Pohjolainen, Topi
2017-11-28 14:09:52 UTC
Post by Jason Ekstrand
This patch series is a major rework of the aux tracking and fast clear code
in our Vulkan driver. It's broken up into three basic pieces:
1) Patches 1-13 rework the way layout transitions work and add some
additional granularity to our aux tracking scheme. This is required to
support Y-tiled window system buffers where we have CCS_E but need to
do a full resolve prior to handing it off to the window system. The
current code does a partial resolve if and only if CCS_E is enabled.
These patches may get back-ported to 17.3 because it seems that people
are hitting issues with this somewhere in Chrome.
2) Patches 14-25 rework the code we have for doing fast clears, setting up
indirect clear colors, and doing implicit layout transitions. In
particular, we pull them all together into a single begin/end_subpass
function pair instead of scattering them across multiple functions in
genX_cmd_buffer.c and anv_blorp.c. This allows us to avoid the
redundant fast-clear that you get when you have LOAD_OP_CLEAR combined
with IMAGE_LAYOUT_UNDEFINED.
3) Patches 26-29 revive my old CCS ambiguate pass and make us use that
instead of a fast-clear for initializing CCS buffers on gen9+. This
should allow us to avoid some unneeded resolves in a couple of corner
cases. It also simplifies transition_color_buffer a decent bit.
I've organized this patch series in order of priority both in terms of time
and in terms of importance. If the third chunk doesn't land for a while or
never at all, I'm not going to cry over it, but I do think it's quite a bit
better.
intel/isl: Codify AUX operations in an enum
anv/blorp: Rework image clear/resolve helpers
anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image
anv/blorp: Rework HiZ ops to look like MCS and CCS
anv/image: Update a comment
I had to go back and forth in patch 2, but I really liked the end result. It
seems there is a case missed in patch 4, but patches 1-5 are:

Reviewed-by: Topi Pohjolainen <***@intel.com>

I will try to read the rest also. I'm not that familiar with the big picture
in anvil so I wouldn't put too much weight on it.
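Item 2 of the quoted cover letter mentions a redundant fast-clear when LOAD_OP_CLEAR is combined with IMAGE_LAYOUT_UNDEFINED. One way to picture it is the toy model below. It assumes, as item 3 of the cover letter suggests, that a transition out of the undefined layout initializes the CCS with a fast-clear, and that the load-op clear is itself eligible for a fast-clear. The enums and fast_clears_issued() are hypothetical and only count operations; they are not the driver's logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the Vulkan layout and load-op enums. */
enum layout { LAYOUT_UNDEFINED, LAYOUT_OTHER };
enum load_op { LOAD_OP_LOAD, LOAD_OP_CLEAR };

/* Count the fast-clears issued at the start of a subpass.
 *
 * combined == false: the layout transition and the load-op clear are
 *                    handled independently, so each may fast-clear.
 * combined == true:  both are decided in one place, so the initializing
 *                    fast-clear is skipped when a load-op clear follows.
 */
static unsigned
fast_clears_issued(enum layout initial, enum load_op op, bool combined)
{
   assert(op == LOAD_OP_LOAD || op == LOAD_OP_CLEAR);

   const bool init_clear = (initial == LAYOUT_UNDEFINED);
   const bool load_clear = (op == LOAD_OP_CLEAR);

   if (combined && init_clear && load_clear)
      return 1;

   return (unsigned)init_clear + (unsigned)load_clear;
}
```

In this model the UNDEFINED plus LOAD_OP_CLEAR case drops from two fast-clears to one once both decisions are made together, which is the saving the cover letter describes.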
Post by Jason Ekstrand
anv/image: Add a helper for determining when fast clears are supported
anv/image: Support color aspects in layout_to_aux_usage
anv/cmd_buffer: Recurse in transition_color_buffer instead of falling
through
anv/cmd_buffer: Generalize transition_color_buffer
anv/cmd_buffer: Add an anv_genX_call macro
anv/cmd_buffer: Add a mark_image_written helper
anv/cmd_buffer: Drop the genX from get/set_needs_resolve
anv/cmd_buffer: Rework aux tracking
anv/cmd_buffer: Apply subpass flushes before set_subpass
anv/cmd_buffer: Add begin/end_subpass helpers
anv/cmd_buffer: Pass a subpass id into begin_subpass
anv/cmd_buffer: Move the color portion of clear_subpass into
begin_subpass
intel/blorp: Add a blorp_hiz_clear_depth_stencil helper
anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass
anv/cmd_buffer: Decide whether or not to HiZ clear up-front
anv/cmd_buffer: Iterate all subpass attachments when clearing
anv/cmd_buffer: Add a concept of pending load aspects
anv/cmd_buffer: Sync clear values in begin_subpass
anv/cmd_buffer: Do subpass image transitions in begin/end_subpass
anv/cmd_buffer: Avoid unnecessary transitions before fast clears
intel/blorp: Add a CCS ambiguation pass
anv/cmd_buffer: Pull the undefined layout condition into the if
anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears
anv: Use blorp_ccs_ambiguate instead of fast-clears
src/intel/blorp/blorp.h | 16 +
src/intel/blorp/blorp_clear.c | 156 ++++++++
src/intel/isl/isl.h | 74 ++--
src/intel/vulkan/anv_blorp.c | 661 +++++++++++++++---------------
src/intel/vulkan/anv_cmd_buffer.c | 52 ++-
src/intel/vulkan/anv_genX.h | 6 +
src/intel/vulkan/anv_image.c | 108 ++++-
src/intel/vulkan/anv_private.h | 86 +++-
src/intel/vulkan/genX_cmd_buffer.c | 795 +++++++++++++++++++++++--------------
9 files changed, 1249 insertions(+), 705 deletions(-)
--
2.5.0.400.gff86faf