Discussion:
[PATCH] radv: enable denorms for 64-bit and 16-bit floats
Samuel Pitoiset
2017-12-28 21:55:27 UTC
Similar to RadeonSI.

This fixes:
dEQP-VK.image.texel_view_compatible.graphic.basic.attachment_read.bc*r16g16b16a16_sfloat
dEQP-VK.image.extended_usage_bit.attachment_write.r16_sfloat

Signed-off-by: Samuel Pitoiset <***@gmail.com>
---
src/amd/common/ac_nir_to_llvm.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index d9f2cb408c..9d9a1f911b 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -6879,6 +6879,20 @@ static void ac_compile_llvm_module(LLVMTargetMachineRef tm,
 	/* +3 for scratch wave offset and VCC */
 	config->num_sgprs = MAX2(config->num_sgprs,
 	                         shader_info->num_input_sgprs + 3);
+
+	/* Enable 64-bit and 16-bit denormals, because there is no performance
+	 * cost.
+	 *
+	 * If denormals are enabled, all floating-point output modifiers are
+	 * ignored.
+	 *
+	 * Don't enable denormals for 32-bit floats, because:
+	 * - Floating-point output modifiers would be ignored by the hw.
+	 * - Some opcodes don't support denormals, such as v_mad_f32. We would
+	 *   have to stop using those.
+	 * - SI & CI would be very slow.
+	 */
+	config->float_mode |= V_00B028_FP_64_DENORMS;
 }

static void
--
2.15.1
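For readers who do not know the register layout, here is a minimal sketch of what the added line does to the float mode. The constants are assumptions: they are written from memory of the FLOAT_MODE field (0x30 for the 32-bit denormal bits, 0xc0 for the bits shared by 64-bit and 16-bit floats) and should be checked against sid.h. The helper name enable_fp16_fp64_denorms is hypothetical and is not part of the patch.

```c
/* Assumed FLOAT_MODE bit values; verify against Mesa's sid.h before use. */
#define FP_32_DENORMS 0x30u /* denormal bits for 32-bit floats */
#define FP_64_DENORMS 0xc0u /* denormal bits shared by 64-bit and 16-bit floats */

/* Hypothetical helper: OR in the 64-bit/16-bit denormal bits and leave
 * every other bit of the float mode (including the 32-bit denormal
 * bits) exactly as it was. */
static unsigned enable_fp16_fp64_denorms(unsigned float_mode)
{
	return float_mode | FP_64_DENORMS;
}
```

Under those assumptions, a float mode that starts with the 32-bit denormal bits clear still has them clear afterwards, which matches the intent stated in the patch comment.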
Matt Arsenault
2017-12-28 22:08:53 UTC
Post by Samuel Pitoiset
+ config->float_mode |= V_00B028_FP_64_DENORMS;
}
This is set in the program binary. You should use that directly rather than ignoring it
Samuel Pitoiset
2017-12-28 22:13:14 UTC
Post by Matt Arsenault
This is set in the program binary. You should use that directly rather than ignoring it
Ah, I didn't know.
Samuel Pitoiset
2018-01-04 10:54:46 UTC
Post by Matt Arsenault
This is set in the program binary. You should use that directly rather than ignoring it
I'm not sure I understand where that flag is actually set, and RadeonSI
does a similar thing.
Bas Nieuwenhuizen
2018-01-04 11:15:30 UTC
Looking at AMDGPUAsmPrinter::EmitProgramInfoSI in LLVM, that flag is only
set for compute shaders. So fix radv to default to the proposed value,
and fix LLVM to pass it through for all shaders?

_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Bas Nieuwenhuizen
2018-01-04 18:54:40 UTC
Reviewed-by: Bas Nieuwenhuizen <***@basnieuwenhuizen.nl>

Samuel Pitoiset
2018-01-05 08:50:43 UTC
Yeah, I think it's easier to fix it that way for now.
Thanks!