Rhys Perry
2018-12-07 17:21:53 UTC
This series adds support for:
- VK_KHR_shader_float16_int8
- VK_AMD_gpu_shader_half_float
- VK_AMD_gpu_shader_int16
- VK_KHR_8bit_storage
on VI+. Half floats are currently disabled on LLVM 7 because of a bug
causing large memory usage and long (or unbounded) compilation times with
some tests.
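As a rough illustration of the gating described above, the check could look like the sketch below. The enum, the function names, and the explicit LLVM major-version parameter are all hypothetical and are not taken from the patches; it also assumes the bug is absent from LLVM releases after 7.

```c
#include <stdbool.h>

/* Hypothetical generation enum, ordered oldest to newest. */
enum gpu_gen {
    GEN_SI,
    GEN_CIK,
    GEN_VI,
    GEN_GFX9,
};

/* 8-bit and 16-bit integer support only depends on the hardware (VI+). */
static bool small_int_supported(enum gpu_gen gen)
{
    return gen >= GEN_VI;
}

/* Half floats additionally require an LLVM newer than 7, because of the
 * compile-time and memory-usage bug mentioned in the cover letter
 * (assumption: later LLVM releases are not affected). */
static bool half_float_supported(enum gpu_gen gen, unsigned llvm_major)
{
    return gen >= GEN_VI && llvm_major >= 8;
}
```

With these helpers, a Polaris (VI) card built against LLVM 7 would report the integer features but not half floats.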
It depends on the following patch series:
- https://patchwork.freedesktop.org/series/53454/
- https://patchwork.freedesktop.org/series/53602/
- https://patchwork.freedesktop.org/series/53660/
An older version was tested on my Polaris card, but due to hardware issues
I currently can't test the latest version of the series.
deqp-vk has no regressions and none of the newly enabled tests fail.
Rhys Perry (38):
ac: add various helpers for float16/int16/int8
ac/nir: implement 8-bit push constant, ssbo and ubo loads
ac/nir: implement 8-bit ssbo stores
ac/nir: fix 16-bit ssbo stores
ac/nir: implement 8-bit nir_load_const_instr
ac/nir: implement 8-bit conversions
ac/nir: fix 64-bit nir_op_f2f16_rtz
ac/nir: make ac_build_clamp work on all bit sizes
ac/nir: make ac_build_fract work on all bit sizes
ac/nir: make ac_build_isign work on all bit sizes
ac/nir: make ac_build_fsign work on all bit sizes
ac/nir: make ac_build_fdiv support 16-bit floats
ac/nir: implement half-float nir_op_frcp
ac/nir: implement half-float nir_op_frsq
ac/nir: implement half-float nir_op_ldexp
radv: lower 16-bit flrp
ac/nir: support half floats in emit_b2f
ac/nir: make emit_b2i work on all bit sizes
ac/nir: implement 16-bit shifts
compiler/nir: add lowering option for 16-bit ffma
ac/nir: implement 16-bit ac_build_ddxy
ac/nir: implement 8 and 16 bit ac_build_readlane
nir: make bitfield_reverse and ifind_msb work with all integers
ac/nir: make ac_find_lsb work on all bit sizes
ac/nir: make ac_build_umsb work on all bit sizes
ac/nir: implement 8 and 16 bit ac_build_imsb
ac/nir: make ac_build_bit_count work on all bit sizes
ac/nir: make ac_build_bitfield_reverse work on all bit sizes
ac/nir: implement 16-bit pack/unpack opcodes
ac/nir: add 8-bit and 16-bit types to glsl_base_to_llvm_type
ac/nir,radv: create an array of varying output types
ac/nir: store all outputs as f32
radv: store all fragment shader inputs as f32
radv: handle all fragment output types
ac,radv: run LLVM's SLP vectorizer
ac/nir: generate better code for nir_op_f2f16_rtz
ac/nir: have nir_op_f2f16 round to zero
radv: expose float16, int16 and int8 features and extensions
src/amd/common/ac_llvm_build.c | 355 ++++++++++++++------------
src/amd/common/ac_llvm_build.h | 22 +-
src/amd/common/ac_llvm_util.c | 9 +-
src/amd/common/ac_llvm_util.h | 1 +
src/amd/common/ac_nir_to_llvm.c | 258 +++++++++++++++----
src/amd/common/ac_shader_abi.h | 1 +
src/amd/vulkan/radv_device.c | 17 ++
src/amd/vulkan/radv_extensions.py | 4 +
src/amd/vulkan/radv_nir_to_llvm.c | 92 ++++---
src/amd/vulkan/radv_shader.c | 7 +
src/broadcom/compiler/nir_to_vir.c | 1 +
src/compiler/nir/nir.h | 1 +
src/compiler/nir/nir_opcodes.py | 4 +-
src/compiler/nir/nir_opt_algebraic.py | 4 +-
src/gallium/drivers/radeonsi/si_get.c | 1 +
src/gallium/drivers/vc4/vc4_program.c | 1 +
16 files changed, 516 insertions(+), 262 deletions(-)
--
2.19.2