[Mesa-dev] [Bug 102597] [Regression] mpv, high rendering times (two to three times higher)

Discussion:

b***@freedesktop.org

2017-09-07 22:34:19 UTC

https://bugs.freedesktop.org/show_bug.cgi?id=102597

Bug ID: 102597
Summary: [Regression] mpv, high rendering times (two to three
times higher)
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Mesa core
Assignee: mesa-***@lists.freedesktop.org
Reporter: ***@gmail.com
QA Contact: mesa-***@lists.freedesktop.org

Known good version: 17.1.6
Known bad version: 17.2

To reproduce, one need only run mpv with the arguments --vo=opengl-hq
--scale=ewa_lanczossharp

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

b***@freedesktop.org

2017-09-08 01:30:38 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

Kenneth Graunke <***@whitecape.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |NEEDINFO

--- Comment #1 from Kenneth Graunke <***@whitecape.org> ---
We can't really look into this unless you tell us what hardware you're using.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

b***@freedesktop.org

2017-09-08 10:58:14 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

--- Comment #2 from Cris <***@gmail.com> ---
(In reply to Kenneth Graunke from comment #1)

Post by b***@freedesktop.org
We can't really look into this unless you tell us what hardware you're using.

R9 280x, mpv git, linux 4.13. mpv-stats.lua is required to measure timings in a
hassle free manner.

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

b***@freedesktop.org

2017-09-11 23:23:39 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

--- Comment #3 from Niklas Haas <***@haasn.xyz> ---
Hello, this is the author of the affected `mpv` code.

I've reproduced and bisected the mesa issue, and the first bad commit is:

bd4b224fa6630262df2b70fd6a6fc8080ad59086 is the first bad commit
commit bd4b224fa6630262df2b70fd6a6fc8080ad59086
Author: Marek OlÅ¡Ã¡k <***@amd.com>
Date: Mon May 15 17:27:25 2017 +0200

gallium/radeon: use a top-of-pipe timestamp for the start of TIME_ELAPSED

Reviewed-by: Nicolai HÃ€hnle <***@amd.com>

:040000 040000 22f52d00a837608d7a43e50498519850bf6ccdf6
eebe49b7fe167a7d6f96fe2a18218f350dededce M src

This leads me to believe it's just a cosmetic issue, rather than a geneuine
performance regression. In order to figure out whether the results were more
correct before or after the change, I compared the numbers of all of my render
passes (added together) against the measurements of a single TIME_ELAPSED query
that surrounds the entire frame. The numbers matched perfectly before this
commit. Since the commit, the per-pass numbers are all way higher than they
should be.

I suspect this is because the driver is now âover-estimatingâ the amount of
time taken in between successive passes. Consider the following call sequence:

glBeginQuery(GL_TIME_ELAPSED)
// submit some draw calls
glEndQuery(GL_TIME_ELAPSED)
glBeginQuery(GL_TIME_ELAPSED)
// submit some more draw calls
glEndQuery(GL_TIME_ELAPSED)

If what I suspect is right, and these are assembled into the same command
buffer, then the ârecord TOP_OF_PIPE timestampâ for both queries would possibly
fire at the same time (out of order with the draw calls), leading to an
over-counting of the time taken for the second set of draw calls.

Perhaps it would be more correct to submit all of the timestamps with the
pipeline stage set to BOTTOM_OF_PIPE bit, so that the timestamps are
(respectively) only recorded once the previous commands in the command buffer
have been fully realized?

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

b***@freedesktop.org

2017-09-11 23:46:55 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

--- Comment #4 from Niklas Haas <***@haasn.xyz> ---
Also, even with this patch reverted, the reported timers are unrealistic: The
TIME_ELAPSED queries tell me my rendering takes ~8ms, but I'm getting 200 fps.
(With higher settings, it tells me it takes ~20ms but I'm getting 90 fps)

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

b***@freedesktop.org

2017-09-11 23:51:34 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

The TIME_ELAPSED queries tell me my rendering takes ~8ms, but I'm getting 200 fps.

Oops, my bad - I was measuring FPS incorrectly. With the measurement fixed, and
the patch in question reverted, the numbers match up perfectly. GL_TIME_ELAPSED
queries sum up to ~19ms and I'm getting ~50 fps.

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

b***@freedesktop.org

2017-09-12 00:10:55 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

--- Comment #6 from Marek OlÅ¡Ã¡k <***@gmail.com> ---
Here's why the times are different: Draw calls are not executed sequentially,
but instead they are executed in parallel when it's possible. If you start 10
draw calls wrapped around 10 TIME_ELAPSED queries, the start times will be
similar (they all should start quickly if compute units aren't overloaded), but
the end times will be different. The TIME_ELAPSED query then returns (end -
start).

Now some theory. The top-of-pipe timestamp means "before the next draw call
started". The bottom-of-pipe timestamp means "after the last draw call
finished". The driver subtracts top-of-pipe from bottom-of-pipe to get
TIME_ELAPSED.

That is an accurate implementation for measuring how much time each draw call
spent in this highly pipelined and parallel system that is the GPU. However,
it's actually not what the spec "suggests" in the Issues section of
ARB_timer_query. It suggests that TIME_ELAPSED should record both start and end
timestamps at bottom-of-pipe. That seems kinda silly, because it means that
TIME_ELAPSED should be (draw_finish_time - previous_draw_finish_time). What if
they finish nearly at the same time? TIME_ELAPSED would be close to 0, and that
is not how much time the draw call spent in the system.

This all becomes a non-issue if we simply assume that TIME_ELAPSED results are
not additive.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

b***@freedesktop.org

2017-09-12 04:53:40 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

That seems kinda silly, because it means that TIME_ELAPSED should be (draw_finish_time - previous_draw_finish_time). What if they finish nearly at the same time? TIME_ELAPSED would be close to 0, and that is not how much time the draw call spent in the system.

Doesn't it become a question of what these time_elapsed queries are supposed to
be useful for? If a pass becomes "free" because it's executed in parallel with
another pass, then I would (personally) expect the measured execution time to
be low. Basically, what I'm interested in measuring is the critical path;
rather than the âsumâ of all GPU time spent in the program. I also highly rely
on the additive property of pass queries, which is perhaps what khronos also
intended when they advised using bottom_of_pipe queries.

As for practical solutions - I've already experimented using GL_TIMESTAMP
instead of GL_TIME_ELAPSED - the main problem I had with this approach was that
it completely broke on OS X and Windows (ANGLE), because apparently google and
apple are unwilling to implement them, citing âlack of a way to communicate
overflowâ as a reason (and apparently oblivious to the possibility of using
64-bit timestamps like everybody else).

Worst comes to worst, I'll have to work around this particular issue by using
GL_TIMESTAMP where available and falling back to GL_TIME_ELAPSED otherwise, but
either way, I still think this is a rather silly violation of both âcommon
senseâ and the khronos specification.

When BeginQuery and EndQuery are called with a <target> of
TIME_ELAPSED, the GL prepares to start and stop the timer used for
timer queries. The timer is started or stopped when the effects from all
previous commands on the GL client and server state and the framebuffer
have been fully realized.

http://developer.download.nvidia.com/opengl/specs/GL_ARB_timer_query.txt

This is not just an advisal, this directly implies a dependency between the
BeginQuery and the previous draw calls. My application is coded around the
behavior described by this paragraph, which your patch appears to violate.

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

b***@freedesktop.org

2017-09-20 15:02:56 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

--- Comment #8 from Cris <***@gmail.com> ---
Well? Thousands of mpv users rely on mesa's now incorrect behaviour to diagnose
performance issues. Are you going to revert that commit in accordance with the
spec or are you going to be stubborn and continue to violate it?

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

b***@freedesktop.org

2017-10-02 21:38:05 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

Thousands of mpv users rely on mesa's now incorrect behaviour to diagnose performance issues.

Well, to be fair, that's going to end up being misleading one way or the other
- the timers are unreliable on so many platforms and configurations that using
them as the basis of a performance evaluation needs to be taken with a heavy
grain of salt. I also think these timers are only really relevant to developers
anyway; for users the only thing that really matters is whether or not it drops
frames (which mpv already displays separately).

Are you going to revert that commit in accordance with the spec or are you going to be stubborn and continue to violate it?

I don't think pestering the developers is going to help this issue progress; if
anything, it will accomplish the opposite.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

b***@freedesktop.org

2018-11-14 02:38:00 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

--- Comment #10 from Dieter NÃŒtzel <***@nuetzel-hh.de> ---
Code fix under way:

https://lists.freedesktop.org/archives/mesa-dev/2018-November/209473.html

With this patch mpv drops notably, apart that '--vo=opengl-hq' isn't available
any longer. Was replaced by '--vo=gpu'.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

b***@freedesktop.org

2018-12-07 16:09:54 UTC

Permalink

https://bugs.freedesktop.org/show_bug.cgi?id=102597

Emil Velikov <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEEDINFO |RESOLVED
Resolution|--- |FIXED

--- Comment #11 from Emil Velikov <***@gmail.com> ---
The following commit reverts to the original behaviour, this we can close this.
Feel free to reopen if it doesn't help.

commit ea9f95e2a67eca90bb84eea24e7b4b804b3b1345
Author: Marek OlÅ¡Ã¡k <***@amd.com>
Date: Tue Nov 13 16:19:42 2018 -0500

radeonsi: go back to using bottom-of-pipe for beginning of TIME_ELAPSED

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.