[Mesa-dev] [Bug 94957] dEQP failures on llvmpipe
b***@freedesktop.org
2016-04-15 23:50:51 UTC
https://bugs.freedesktop.org/show_bug.cgi?id=94957

Bug ID: 94957
Summary: dEQP failures on llvmpipe
Product: Mesa
Version: unspecified
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Mesa core
Assignee: mesa-***@lists.freedesktop.org
Reporter: ***@alum.mit.edu
QA Contact: mesa-***@lists.freedesktop.org

Created attachment 122980
--> https://bugs.freedesktop.org/attachment.cgi?id=122980&action=edit
list of llvmpipe-related deqp fails

A number of dEQP tests fail on llvmpipe; it's up to the llvmpipe
maintainers whether they care or not. I'm guessing for a lot of these, it will
be "not". I've manually removed a number of failures that are msaa-related, or
are otherwise not llvmpipe's fault.

Also, due to dEQP testsuite bugs, I avoided running anything with mipmap_linear
in the name, as it leads to super-long-running tests that fail anyway. However,
it would seem that mipmap_linear filtering is non-functional. Perhaps that's
known.

I'm thinking that as these issues are triaged, dependent bugs will be created.
--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
b***@freedesktop.org
2016-04-15 23:55:49 UTC

--- Comment #1 from Ilia Mirkin <***@alum.mit.edu> ---
This was, by the way, the result of running mesa master with

LIBGL_ALWAYS_SOFTWARE=1 ./deqp-gles3 --deqp-visibility=hidden \
  --deqp-caselist-file=<(grep -v 'mipmap_linear' \
  ../../android/cts/master/gles3-master.txt)

You should be able to run any one of the failures directly by doing
--deqp-case='foobar' instead of the caselist-file.

b***@freedesktop.org
2016-04-16 00:31:06 UTC

--- Comment #2 from Roland Scheidegger <***@vmware.com> ---
Oh, linear mipmap filtering should work perfectly. For performance reasons
though we cheat, which is likely why it fails.
The cheats can be disabled via env var, preferably all 3 of them
(GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod).
If the vertex texturing tests use (non-constant) explicit lod or derivatives
that would explain the failures there as well.
Though I'm wondering about the blend failures; blending should work perfectly
(and gets quite a lot of test coverage from piglit). Or is the test being silly
and complaining about single-bit rounding errors?

b***@freedesktop.org
2016-04-16 00:36:23 UTC

--- Comment #3 from Ilia Mirkin <***@alum.mit.edu> ---
A random sampling of the texturing tests that failed all seem to pass with

GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod

I will start a fresh run with these parameters. A random blending test makes it
seem like it completely fails. I'd like to encourage you to grab a copy of deqp
and run it yourself to see the details - you get expected and actual images
among other things.

<TestCaseResult Version="0.3.3"
CasePath="dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.src.dst_color_one_minus_src_alpha"
CaseType="SelfValidate">
<Text>RGB equation = GL_FUNC_ADD</Text>
<Text>RGB src func = GL_DST_COLOR</Text>
<Text>RGB dst func = GL_ONE</Text>
<Text>Alpha equation = GL_FUNC_ADD</Text>
<Text>Alpha src func = GL_ONE_MINUS_SRC_ALPHA</Text>
<Text>Alpha dst func = GL_ONE</Text>
<Text>Blend color = (0.2, 0.4, 0.6, 0.8)</Text>
<Text>Image comparison failed: max difference = (97, 97, 97, 1), threshold =
(4, 4, 4, 4)</Text>

Looking at the error mask:

[error mask image: base64-encoded PNG data URL, truncated in this archive]

(paste that into the url bar of your browser) makes it seem like something
funny is going on. Green = good, red = bad. This is running on a SKL with LLVM
3.7.1 in case it matters.
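
For reference, the blend state listed in the log above can be worked through by hand. What follows is only a minimal sketch of the standard GL_FUNC_ADD blend equation with those factors; the function names and the sample colors are made up for illustration and are not taken from dEQP or llvmpipe.

```python
def blend_dst_color_one(src, dst):
    """GL_FUNC_ADD blend of one RGBA pixel.

    RGB factors:   src = GL_DST_COLOR,           dst = GL_ONE
    Alpha factors: src = GL_ONE_MINUS_SRC_ALPHA, dst = GL_ONE
    None of these factors reads the constant blend color, so the
    (0.2, 0.4, 0.6, 0.8) from the log does not enter the result.
    """
    def clamp(v):
        # Fixed-point framebuffers clamp the blended value to [0, 1].
        return min(max(v, 0.0), 1.0)

    rgb = tuple(clamp(s * d + d * 1.0) for s, d in zip(src[:3], dst[:3]))
    alpha = clamp(src[3] * (1.0 - src[3]) + dst[3] * 1.0)
    return rgb + (alpha,)


def exceeds_threshold(max_diff, threshold):
    """True if any channel difference is above its allowed threshold."""
    return any(d > t for d, t in zip(max_diff, threshold))


# Hypothetical sample colors, chosen so the arithmetic is exact.
expected = blend_dst_color_one((0.5, 0.25, 0.75, 0.5), (0.5, 0.5, 0.5, 0.25))

# Numbers reported in the log: far outside a rounding-error margin.
failed = exceeds_threshold((97, 97, 97, 1), (4, 4, 4, 4))
```

A difference of 97 out of 255 on the color channels against a threshold of 4 is what suggests a real bug here and not a precision issue.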

b***@freedesktop.org
2016-04-16 00:47:10 UTC

--- Comment #4 from Roland Scheidegger <***@vmware.com> ---
(In reply to Ilia Mirkin from comment #3)
Post by b***@freedesktop.org
I will start a fresh run with these parameters. A random blending test makes
it seem like it completely fails. I'd like to encourage you to grab a copy
of deqp and run it yourself to see the details - you get expected and actual
images among other things.
<TestCaseResult Version="0.3.3"
CasePath="dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.src.dst_color_one_minus_src_alpha"
CaseType="SelfValidate">
<Text>RGB equation = GL_FUNC_ADD</Text>
<Text>RGB src func = GL_DST_COLOR</Text>
<Text>RGB dst func = GL_ONE</Text>
<Text>Alpha equation = GL_FUNC_ADD</Text>
<Text>Alpha src func = GL_ONE_MINUS_SRC_ALPHA</Text>
<Text>Alpha dst func = GL_ONE</Text>
<Text>Blend color = (0.2, 0.4, 0.6, 0.8)</Text>
<Text>Image comparison failed: max difference = (97, 97, 97, 1), threshold
= (4, 4, 4, 4)</Text>
Yes, that looks quite wrong indeed - and not precision related of course.
I'll give it a try...
Post by b***@freedesktop.org
(paste that into the url bar of your browser) makes it seem like something
funny is going on. Green = good, red = bad. This is running on a SKL with
LLVM 3.7.1 in case it matters.
Ideally it should of course not, albeit different bugs with and without AVX are
possible, as quite different code paths may be used due to 8x32 vs. 4x32
vectors (if you use LP_NATIVE_VECTOR_WIDTH=128 it will disable avx). Of course
it's also possible llvm miscompiles things in which case the llvm version would
matter, but that should be rare.
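
Pulling together the settings mentioned so far in this thread, a single failing case could be re-run roughly as sketched below. The helper name and its parameters are hypothetical; only the environment variables, the binary name and the command-line options come from the comments above. The function just builds the environment and argument list and does not launch anything itself.

```python
import os


def deqp_single_case(case_path, disable_cheats=True, disable_avx=False,
                     base_env=None):
    """Build (env, argv) for running one dEQP GLES3 case on llvmpipe.

    Hypothetical helper: nothing here is part of dEQP or Mesa.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Force the software rasterizer, as in the original run.
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    if disable_cheats:
        # Turn off the three texture filtering shortcuts.
        env["GALLIVM_DEBUG"] = "no_rho_approx,no_brilinear,no_quad_lod"
    if disable_avx:
        # Restrict llvmpipe to 128-bit (4x32) vectors.
        env["LP_NATIVE_VECTOR_WIDTH"] = "128"
    argv = [
        "./deqp-gles3",
        "--deqp-visibility=hidden",
        f"--deqp-case={case_path}",
    ]
    return env, argv


env, argv = deqp_single_case(
    "dEQP-GLES3.functional.fbo.color.tex2d.rgba32i",
    disable_avx=True,
    base_env={},
)
```

The returned pair can be handed to subprocess.run(argv, env=env) from the directory that contains the deqp-gles3 binary.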

b***@freedesktop.org
2016-04-19 17:28:23 UTC

Ilia Mirkin <***@alum.mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #122980|0                           |1
        is obsolete|                            |

--- Comment #5 from Ilia Mirkin <***@alum.mit.edu> ---
Created attachment 123061
--> https://bugs.freedesktop.org/attachment.cgi?id=123061&action=edit
list of llvmpipe-related deqp fails

Updated dEQP fail list attached (again, filtered for msaa-related fails, but
left linear mipmaps in this time). This was run with a copy of mesa which
includes the LLVM 3.7 workaround for broken vector selects.

Also GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod was used in the
environment.

LLVM 3.7.1, Core i7-6700 (SKL)

b***@freedesktop.org
2016-04-19 17:58:30 UTC

--- Comment #6 from Roland Scheidegger <***@vmware.com> ---
I wonder what deqp doesn't like about our nearest_mipmap_linear implementation
(all filtering errors use that).
Also, I'm wondering if the test is overly picky about pow. The spec says right
there the error is derived as pow(x,y) = exp2(log2(y) * x) (note there is a
spec bug, x and y are swapped), which is exactly as we implement it. Therefore,
if our results are good enough for passing exp2 and log2, we should pass pow as
well.
The problems with 32bit integer formats look a bit odd as well (since there can
be no filtering or blending or whatever, the values should remain mostly
untouched), not sure what's up with that.

b***@freedesktop.org
2016-04-19 18:03:50 UTC

--- Comment #7 from Ilia Mirkin <***@alum.mit.edu> ---
(In reply to Roland Scheidegger from comment #6)
Post by b***@freedesktop.org
I wonder what deqp doesn't like about our nearest_mipmap_linear
implementation (all filtering errors use that).
Error mask for
dEQP-GLES3.functional.texture.filtering.2d.formats.rgba16f_nearest_mipmap_linear:

[error mask image: base64-encoded PNG data URL, truncated in this archive]

<Text>Texture coordinates: (-1, -2.7) -&gt; (2.03143, 4.76426)</Text>
<Text>ERROR: Result verification failed, got 3072 invalid pixels!</Text>
Post by b***@freedesktop.org
Also, I'm wondering if the test is overly picky about pow. The spec says
right there the error is derived as pow(x,y) = exp2(log2(y) * x) (note there
is a spec bug, x and y are swapped), which is exactly as we implement it.
Therefore, if our results are good enough for passing exp2 and log2, we
should pass pow as well.
pow() fails for inf ^ x == inf. I glanced at the gallivm code, and this appears
to be on purpose (i.e. you generate faster code that doesn't handle inf).
Post by b***@freedesktop.org
The problems with 32bit integer formats look a bit odd as well (since there
can be no filtering or blending or whatever, the values should remain mostly
untouched), not sure what's up with that.
Error mask for dEQP-GLES3.functional.fbo.color.tex2d.rgba32i:

[error mask image: base64-encoded PNG data URL, truncated in this archive]

b***@freedesktop.org
2016-04-19 19:36:58 UTC

--- Comment #8 from Roland Scheidegger <***@vmware.com> ---
(In reply to Ilia Mirkin from comment #7)
Post by b***@freedesktop.org
Post by b***@freedesktop.org
Also, I'm wondering if the test is overly picky about pow. The spec says
right there the error is derived as pow(x,y) = exp2(log2(y) * x) (note there
is a spec bug, x and y are swapped), which is exactly as we implement it.
Therefore, if our results are good enough for passing exp2 and log2, we
should pass pow as well.
pow() fails for inf ^ x == inf. I glanced at the gallivm code, and this
appears to be on purpose (i.e. you generate faster code that doesn't handle
inf).
Ahh right forgot about that - we hook up the safe log2 version for LG2 tgsi
opcode, but use the unsafe version for pow.
I think we did the lg2 safe version for d3d10 initially, since in gl it
traditionally didn't really matter. And pow doesn't exist in d3d10.
I suppose we could switch that if it's really worth it (too bad the special
values require 3 comparisons, 3 selects).
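
To illustrate why an "unsafe" log2 breaks pow for infinite inputs, here is a rough sketch. It is not the gallivm code: fast_log2 is a generic bit-extraction log2 that ignores special values, and the three checks in safe_log2 are only a guess at what special-value handling might look like. All function names are made up for this example.

```python
import math
import struct


def fast_log2(x):
    """log2 of a float32 via exponent/mantissa split, no special cases.

    Gives garbage for inf, NaN, zero, denormals and negative inputs.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = ((bits >> 23) & 0xFF) - 127
    # Rebuild the mantissa as a float in [1, 2).
    mantissa_bits = (bits & 0x7FFFFF) | 0x3F800000
    mantissa = struct.unpack("<f", struct.pack("<I", mantissa_bits))[0]
    return exponent + math.log2(mantissa)


def safe_log2(x):
    """fast_log2 plus three special-value checks (assumed, see above)."""
    result = fast_log2(x)
    if x == math.inf:
        result = math.inf
    if x == 0.0:
        result = -math.inf
    if x < 0.0 or x != x:  # negative or NaN
        result = math.nan
    return result


def exp2(t):
    """2**t, returning inf on overflow so it never raises."""
    try:
        return 2.0 ** t
    except OverflowError:
        return math.inf


def pow_via_log2(x, y, log2_fn):
    """pow(x, y) computed as exp2(y * log2(x))."""
    return exp2(y * log2_fn(x))


unsafe = pow_via_log2(math.inf, 0.5, fast_log2)  # finite: inf is read as 2**128
safe = pow_via_log2(math.inf, 0.5, safe_log2)    # inf, as expected
```

With the unsafe variant the infinity is treated like an ordinary number with exponent 128, so inf ^ 0.5 comes out as 2^64 and the result is finite.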

b***@freedesktop.org
2016-04-20 03:22:23 UTC

--- Comment #9 from Roland Scheidegger <***@vmware.com> ---
So, the r32i/ui failures are actually due to an overflow.
One example I've seen samples a rgb8 unorm texture, scales to int range,
converts to int and outputs this. The problem is that rescaling to 2^31 - 1
really ends up with 2^31 due to imprecise float math, which causes an overflow
when converted to an int.
I'm nearly certain this is undefined behavior by the glsl spec, albeit the spec
doesn't explicitly say so (but should probably follow from IEEE 754 math). d3d10
would require clamping, making it work.
So, I'm inclined to say that's just a test bug. But even if it's undefined but
all gpus clamp anyway we might want to fix it nonetheless...
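
The rounding step can be reproduced outside of any driver. This is a small sketch that emulates 32-bit float arithmetic with Python's struct module; the helper names are invented, and the exact shader arithmetic used by the test is an assumption based on the description above.

```python
import struct

INT32_MAX = 2**31 - 1


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def scale_unorm8_to_int_range(byte_value):
    """Scale an 8-bit unorm value to the int32 range in float32 math."""
    normalized = to_float32(byte_value / 255.0)
    # 2^31 - 1 has no exact float32 representation (24-bit significand),
    # so the scale factor itself already rounds up to 2^31.
    scale = to_float32(float(INT32_MAX))
    return to_float32(normalized * scale)


scaled = scale_unorm8_to_int_range(255)

# The float result is one past the largest int32, so a plain
# float-to-int conversion overflows.
overflows = scaled > INT32_MAX

# With clamping, as d3d10 would require, the result stays in range.
clamped = min(int(scaled), INT32_MAX)
```

Here scaled is exactly 2147483648.0, which no longer fits into a signed 32-bit integer.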

b***@freedesktop.org
2018-04-06 18:54:06 UTC

--- Comment #10 from ***@gmail.com ---
I am also seeing this same issue on Mesa 17.3.6. I wanted to know if there is
an update/patch available for this issue.
Thanks.

b***@freedesktop.org
2018-04-06 19:34:29 UTC

--- Comment #11 from Roland Scheidegger <***@vmware.com> ---
(In reply to msdhedhi007 from comment #10)
Post by b***@freedesktop.org
I am also seeing this same issue on Mesa 17.3.6. I wanted to know if there
is an update /patch available for this issue.
There is no goal as such to pass dEQP. As mentioned, some bugs are due to
performance optimizations (which can be disabled, albeit only on debug builds).
Some might not even be real bugs (also as mentioned, where I think dEQP relies
on behavior not guaranteed by the spec). For both of these types, there's no
interest in addressing these (albeit I suppose if you're talking about making
it possible to disable performance hacks on release builds, that could be
done).
As for the rest, patches welcome, but personally I've got little interest and
definitely no time to specifically look into dEQP failures.

b***@freedesktop.org
2018-04-13 10:20:55 UTC

Timothy Arceri <***@yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Mesa core                   |Drivers/Gallium/llvmpipe

b***@freedesktop.org
2018-12-07 16:02:39 UTC

--- Comment #12 from Emil Velikov <***@gmail.com> ---
The commit below allows us to disable the perf. optimisations (for release
builds), thus fixing the functional.texture tests.

Should we close this bug, or keep it open as all the failing tests have a
solution/workaround?

commit 8f77156c268356baf9df8490c52cc5d8475b9db8
Author: Gert Wollny <***@collabora.com>
Date: Fri Oct 5 15:08:51 2018 +0200

gallivm: Make it possible to disable some optimization shortcuts in release
builds

b***@freedesktop.org
2018-12-08 01:57:32 UTC

--- Comment #13 from Roland Scheidegger <***@vmware.com> ---
(In reply to Emil Velikov from comment #12)
Post by b***@freedesktop.org
The below commit allows us to disable the perf. optimisations (for release
builds), and thus fixing the functional.texture tests.
Should we close this bug, or keep it open as all the failing tests have a
solution/workaround?
If there's still failures with the perf optimizations disabled I think we
should keep it open.