
AMD Radeon RX Hardware Acceleration


14 hours ago, debraspicher said:

I wonder if y'all run something like GPU-Z while running the benchmark/application, does it actually utilize the GPU?

I have a 5000 series card, I'm considering installing it in another rig we have to test.

From Windows 10 onward you can check this in Task Manager.

Current Workstation:
CPU: AMD Ryzen 5 5500 - MOBO: Asus B450 - RAM: 16GB DDR4 2667MHz - GPU: AMD Radeon 7850 1GB
NVMe SSD: Crucial P3 1TB M.2 -  SSD: Samsung Evo 850 256GB  - PSU: XFX TS450 - OS: Win10


14 hours ago, debraspicher said:

I wonder if y'all run something like GPU-Z while running the benchmark/application, does it actually utilize the GPU?

I have a 5000 series card, I'm considering installing it in another rig we have to test.

Yes, it does, as the GPU name in the benchmark screenshot suggests. If the GPU were not being used, you would see N/A instead of a score (and GPU name).
 

13 hours ago, ashf said:

I wonder if there is a future for OpenCL...
Blender has already removed it.

https://wiki.blender.org/wiki/Source/Render/Cycles/OpenCL

It probably has some future, but yes, HIP and Metal are a better way to go these days if you have the people, resources, and time to deal with low-level stuff. NGL, I was hoping for HIP and Metal in the V2 Affinity apps, but yeah ;)

OpenCL hardware acceleration in V2 works OK on a Vega 56, as expected (a bit low, since it should be closer to a GeForce GTX 1070 Ti, but OK), and still better than on 5XXX/6XXX series cards. I'm curious how 7XXX series cards will do...
[Image: benchmark_V2.png]


[Image: benchmark.PNG]

 

5700 XT and Ryzen 9 5900X.

This is my gaming machine. My workstation runs a Threadripper 2920X and a Vega 56, so at least OpenCL works a bit better there. But I'm quite fed up with Radeons for work: DaVinci Resolve doesn't run without stuttering either, and Capture One has started giving me artifacts with OpenCL acceleration again (thankfully not baked in when exporting; that was a fun time in 2019). I even had a GPU-related system lockup while editing. Time for team green; AMD just doesn't care.

 

I checked the GPU utilization: the card clocks up to its full 3D clocks and draws more power, but the load duration in the benchmark is extremely short. The GPU-only test is less than a second of load, and the combined test is maybe 2 seconds.


On 11/15/2022 at 3:57 AM, slizgi said:

Yes, it does, as the GPU name in the benchmark screenshot suggests. If the GPU were not being used, you would see N/A instead of a score (and GPU name).

Yes, that is from the post, but that assumption is exactly what I'm questioning. When I tested the old bench with different CPUs on the same setup (important), the GPU score seemed to scale linearly with CPU single-core performance. The old chart in the beta thread, where multiple results were compiled, shows the same thing. I don't know if that has changed in the new bench, but a low score could be what you'd see if the CPU were effectively being measured and the GPU score "fudged", if that makes sense. I just thought it was worth checking what the GPU is actually doing during the test.

Microsoft Windows 10 Home (Build 19045)
AMD Ryzen 7 5800X @ 3.8GHz (-30 all core +200MHz PBO); Mobo: Asus X470 Prime Pro
32GB DDR4 (3600MHz); EVGA NVIDIA GeForce RTX 3080 XC3 Ultra 12GB
Monitor 1 4K @ 125% due to a bug
Monitor 2 4K @ 150%
Monitor 3 (as needed) 1080p @ 100%

WACOM Intuos4 Large; X-rite i1Display Pro; NIKON D5600 DSLR


On 11/17/2022 at 5:51 PM, Duskstalker said:

5700 XT and Ryzen 9 5900X.

This is my gaming machine. My workstation runs a Threadripper 2920X and a Vega 56, so at least OpenCL works a bit better there. But I'm quite fed up with Radeons for work: DaVinci Resolve doesn't run without stuttering either, and Capture One has started giving me artifacts with OpenCL acceleration again (thankfully not baked in when exporting; that was a fun time in 2019). I even had a GPU-related system lockup while editing. Time for team green; AMD just doesn't care.

 

I checked the GPU utilization: the card clocks up to its full 3D clocks and draws more power, but the load duration in the benchmark is extremely short. The GPU-only test is less than a second of load, and the combined test is maybe 2 seconds.

Hmm. I just ran mine (again). You're right, it doesn't load the GPU for very long, but I did see a change in power state on my end. The GPU chart covers the duration of the benchmark.

[Image: image.png]

[Image: image.png]


4 hours ago, debraspicher said:

Hmm. I just ran mine (again). You're right, it doesn't load the GPU for very long, but I did see a change in power state on my end. The GPU chart covers the duration of the benchmark.

GPU-Z isn't the greatest tool for seeing load over time. OCCT lets you graph pretty much any of the sensor data it monitors. This is its graph of my GPU clock while running the benchmark tool. There are two big spikes in use that last about 4 to 5 seconds each, so it's definitely loading the GPU. (Caveat: this is on an Nvidia card, not a Radeon, but that isn't really relevant, as the point of this comment is that the benchmark is offloading to the GPU for a total of about 8 to 10 seconds in my case.)
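For anyone who would rather put numbers on a graph like this than eyeball it, here is a minimal sketch. It assumes you already have a log of utilization samples taken at a fixed interval; the function name `busy_spans`, the 50% threshold, and the trace values are all made up for illustration and are not measurements from this thread.

```python
from typing import List, Tuple


def busy_spans(samples: List[float], interval_s: float,
               threshold: float = 50.0) -> List[Tuple[float, float]]:
    """Return (start_s, duration_s) for each run of consecutive
    samples at or above the threshold."""
    spans = []
    run_start = None  # index where the current busy run began, if any
    for i, value in enumerate(samples):
        if value >= threshold and run_start is None:
            run_start = i
        elif value < threshold and run_start is not None:
            spans.append((run_start * interval_s, (i - run_start) * interval_s))
            run_start = None
    # Close a run that is still open when the log ends
    if run_start is not None:
        spans.append((run_start * interval_s,
                      (len(samples) - run_start) * interval_s))
    return spans


# Hypothetical utilization trace in percent, one sample every 0.5 s
trace = [3, 5, 90, 95, 97, 92, 4, 2, 88, 91, 6]
spans = busy_spans(trace, interval_s=0.5)
total_busy = sum(duration for _, duration in spans)
print(spans, total_busy)
```

The shorter the sampling interval, the better short bursts show up; with a one-second poll, a load that lasts under a second can be missed entirely.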

[Image: image.png]


24 minutes ago, rvst said:

GPU-Z isn't the greatest tool for seeing load over time. OCCT lets you graph pretty much any of the sensor data it monitors. This is its graph of my GPU clock while running the benchmark tool. There are two big spikes in use that last about 4 to 5 seconds each, so it's definitely loading the GPU. (Caveat: this is on an Nvidia card, not a Radeon, but that isn't really relevant, as the point of this comment is that the benchmark is offloading to the GPU for a total of about 8 to 10 seconds in my case.)

[Image: image.png]

lol, I was literally going to load that up next time. I was thinking, I know I can get a better-resolution graph in OCCT.

I do have an AMD card, but I need to install it in the other system (the one I just moved my other Ryzen to) when I get the chance. Then I can test there.


It is really nice that Serif has finally enabled OpenCL support for AMD GPUs in v2.0. Please keep it that way!

After some testing I came to the conclusion that it really works. Let's look not only at the benchmark but also at some real-world performance. I am using the latest AMD Pro driver, 22.Q4, but it still works great even with the Windows 11 default driver v.30 or v.31. Actually, the Windows 11 store driver v.30 might be slightly better. The benchmark result is really low for the GPU tested (I have an RX 6900 XT) compared to the results in the thread below:

When this GPU is used with Metal it is capable of a score of nearly 50000. On Windows 11 I got only slightly more than 1000 (results from different versions are not directly comparable, so this is only indicative). So where does the 50x difference come from, and does it really matter?

In the picture below, I ran the benchmark with the Windows performance monitor open alongside it to show what's going on. The graphs cover only the benchmark, for its full duration. A close look at them explains why the score is so low. The CPU is utilized at almost 100% for the entire benchmark, even while GPU execution is taking place. It seems to me that GPU performance is actually throttled by the CPU's ability to JIT-compile the OpenCL kernels. So no matter which GPU you have (RX 5500, RX 5700, or RX 6900), the result will be pretty much the same, because it is limited by how fast the CPU can compile the kernels. Conversely, if the CPU can compile and launch kernels at a higher rate, the GPU score will most likely increase. Looking at the GPU itself, utilization peaks at less than 15%. NVIDIA's implementation (and I believe the same applies to Apple's Metal) pre-caches compiled OpenCL kernels, so the benchmark result is substantially higher simply because the CPU doesn't need to recompile the kernels.
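To make the caching idea concrete, here is a toy sketch of what "compile once, reuse afterwards" looks like. It is only an illustration of the theory in this post, not how Affinity or any driver is known to work: `KernelCache`, `fake_compile`, and the kernel strings are hypothetical stand-ins. In real OpenCL host code the compile step would be `clBuildProgram`, and a built binary can be read back with `clGetProgramInfo(CL_PROGRAM_BINARIES)` and reloaded with `clCreateProgramWithBinary`.

```python
import hashlib


class KernelCache:
    """Toy cache keyed by a hash of (device, build options, kernel source)."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn
        self._binaries = {}
        self.misses = 0  # how many times we actually had to compile

    def get(self, device: str, source: str, options: str = "") -> str:
        raw = "\0".join((device, options, source)).encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key not in self._binaries:
            self.misses += 1
            self._binaries[key] = self._compile_fn(source)
        return self._binaries[key]


compile_calls = []


def fake_compile(source: str) -> str:
    # Stand-in for the expensive JIT compile step on the CPU
    compile_calls.append(source)
    return "binary:" + source


cache = KernelCache(fake_compile)
for _ in range(100):
    cache.get("gpu0", "kernel void blur() {}")   # compiled on first use only
cache.get("gpu0", "kernel void sharpen() {}")    # a different kernel: one more compile
print(cache.misses, len(compile_calls))
```

Without the cache, the same 101 requests would trigger 101 compiles, which is the kind of constant CPU load described above.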

But the most interesting question is: does it really matter? To find out whether the low benchmark result affects the user experience, I stitched a large panorama, exported it as a large 16-bit TIFF, reloaded it (without layers or anything else), and started testing the live blur filters. To my surprise, the application behaved excellently. The filters that are implemented to use the GPU worked smoothly and flawlessly. Monitoring GPU activity, I verified that there was copy and compute activity. The compute activity was actually higher than in the benchmark but still left the device under-utilized, which is pretty normal considering that I most likely didn't have enough data to fully utilize the GPU. To conclude, I think it's really great that Serif has finally enabled OpenCL acceleration on AMD GPUs. The benchmark results might be on the low side, but I believe this doesn't necessarily translate to a bad user experience, since in the typical usage scenario you spend at least a few seconds adjusting an OpenCL filter until you get the right result. There is of course plenty of room for optimization by the developers (e.g., manual kernel pre-caching and compilation at startup), although I don't think it's needed. The typical driver overhead (as I measured it using hello-world-like code) is in the range of tens of milliseconds (50-70 ms), so it should go completely unnoticed even if it happens at filter loading.

So don't stick to the benchmark, test it out :)

 

AP2.jpg


5 hours ago, kkoukos said:

But the most interesting question is does it really matter?

Nice analysis, although I think you need to be careful about drawing too emphatic a conclusion about the cause if you based it only on the low-resolution Performance tab of Task Manager. If I use OCCT and look at the CPU/GPU load during the benchmark, the CPU is for sure near 100% for a good part of the test, but there's a portion where the CPU load drops right down just as the load on the GPU spikes up.

Aren't OpenCL kernels compiled at runtime on the target device and not the CPU?

Your point about whether or not it matters is very well made. Benchmark workloads are a total mismatch to typical workloads. 

[Image: image.png]


Recently upgraded to 2.0 and enabled OpenCL on a 5600 XT.

When I use a drawing tablet (older Wacom Intuos) and enable any pressure-based jitter (e.g. size), the brush lags behind heavily. Turning off hardware acceleration makes the brush lag-less again.

Can anyone else confirm this?


@Eroica yes, brush performance on AMD cards is abysmal, especially with RDNA 1 and 2 cards.

I got my hands on an Intel Arc A750. I have to play around with it some more, but with a Ryzen 5 3600 @ 4.3 GHz the GPU scores 8000 points in raster single GPU. I'm going to upload the screenshot later; it's saved on my test bench in the basement.

Considering @debraspicher had an RTX 3080 score 10500 points here, with a faster CPU, the result for the Intel GPU is actually quite good when you look at the price of just $290 for the A750.


11 hours ago, Duskstalker said:

@Eroica yes, brush performance on AMD cards is abysmal, especially with RDNA 1 and 2 cards.

I got my hands on an Intel Arc A750. I have to play around with it some more, but with a Ryzen 5 3600 @ 4.3 GHz the GPU scores 8000 points in raster single GPU. I'm going to upload the screenshot later; it's saved on my test bench in the basement.

Considering @debraspicher had an RTX 3080 score 10500 points here, with a faster CPU, the result for the Intel GPU is actually quite good when you look at the price of just $290 for the A750.

Those do seem like interesting GPUs. Not so much for gaming yet (the drivers need a lot more love for something that demanding), but I have had my eye on them for my workstation for some time now. I'm just waiting to see them in my local market at normal prices, because right now they are unofficial imports at 3-4 times the MSRP.


  • 3 weeks later...
On 11/9/2022 at 11:51 PM, slizgi said:

Yup, OpenCL acceleration seems to be up and running in V2. Great news overall, just a bit late and behind a reasonable paywall, but OK. I also do not think it will be fixed in V1; it would be nice, but from a business perspective it makes zero sense. Thanks, Serif, for figuring it out and fixing it in V2.

So they have done the bait and switch and screwed over users of version 1 who have supported this product from the start.

I thought it smelled fishy; all of the excuses and blame they were putting on AMD didn't add up.

Remember this is NOT a new feature, it was part of version 1. They have pulled it because they want to make it a selling point for v2.

Serif, you've gone from hero to zero with this terrible move.


9 minutes ago, ziplock9000 said:

So they have done the bait and switch and screwed over users of version 1 who have supported this product from the start.

I thought it smelled fishy; all of the excuses and blame they were putting on AMD didn't add up.

Remember this is NOT a new feature, it was part of version 1. They have pulled it because they want to make it a selling point for v2.

Serif, you've gone from hero to zero with this terrible move.

Didn't we just confirm that it still has the same problem as in v1 and just has the option to turn it on? How is this a selling point for v2 if it is still not working?


1 minute ago, nitro912gr said:

Didn't we just confirm that it still has the same problem as in v1 and just has the option to turn it on? How is this a selling point for v2 if it is still not working?

I see a theory from minimal testing, not confirmation. That would come from a lot more testing and comments from Serif themselves.

We've been fooled before.

Even if it DID work, it should have worked in version 1 as it was part of the product features that were sold to us a long time ago now.


Just now, ziplock9000 said:

I see a theory from minimal testing, not confirmation. That would come from a lot more testing and comments from Serif themselves.

We've been fooled before.

Even if it DID work, it should have worked in version 1 as it was part of the product features that were sold to us a long time ago now.

There are a couple of answers above that confirm via testing that OpenCL on AMD GPUs can be turned on but performs worse than working with the CPU alone. I only upgraded Designer, which is my bread and butter, so I can't confirm it myself through the benchmark, but I think the above answers are enough to show that this is still not working.

Also, I see other app vendors are moving away from OpenCL altogether for AMD GPGPU (I think someone also mentioned it here somewhere?), so there is indeed a problem with AMD and not Affinity.


15 minutes ago, nitro912gr said:

There are a couple of answers above that confirm via testing that OpenCL on AMD GPUs can be turned on but performs worse than working with the CPU alone. I only upgraded Designer, which is my bread and butter, so I can't confirm it myself through the benchmark, but I think the above answers are enough to show that this is still not working.

Also, I see other app vendors are moving away from OpenCL altogether for AMD GPGPU (I think someone also mentioned it here somewhere?), so there is indeed a problem with AMD and not Affinity.

That's grand and certainly a step in the right direction, but such a limited set of data points is not confirmation of the situation. For that we'd need a bigger data set and multiple combinations of graphics cards, drivers and OS versions. A couple of data points is not 'confirmation'.

Whatever problems AMD has had, the industry has found a way to make GPU acceleration work just fine on their products, in a plethora of applications from video editing and streaming to, of course, vector and bitmap design, while Affinity notably hasn't.

The problem is very much in Affinity's court when those other competing products work fine.


  • Staff
9 hours ago, ziplock9000 said:

So they have done the bait and switch and screwed over users of version 1 who have supported this product from the start.

I thought it smelled fishy; all of the excuses and blame they were putting on AMD didn't add up.

Remember this is NOT a new feature, it was part of version 1. They have pulled it because they want to make it a selling point for v2.

Serif, you've gone from hero to zero with this terrible move.

That is not completely fair. AMD drivers were terrible and buggy. They were 50 times slower than other manufacturers', so we blocked you from turning it on for AMD in Windows V1 because it had already led to lots of support contacts complaining about the performance. Since about May '22 the AMD drivers have been improved and are now only about 3 times slower than other makers'. Consequently, during the internal V2 early beta we re-allowed AMD chips to have hardware acceleration on and made further changes to the code accordingly. So although it appears as though we could technically now "allow" AMD to turn on hardware acceleration in a Version 1 patch, it is not that straightforward. Other changes made during the V2 development cycle have also facilitated AMD drivers, and it is not a simple matter to port any of those back into the V1 codebase. The recent patch for critical OS issues is the only change we can make given our resources here. As it happens, there are lots of "fixes" in 2.0 for things that were a problem in 1.10, but we are trying hard not to sell the software based on those so much as on the many improvements and new features that you get. I think it is better that some things got fixed alongside the improvements and new features than that 2.0 would have all the same restrictions as 1.10.x.

So yes if this was the ONLY change in 2.0 over 1.10 I would agree we had put this "behind a paywall" but it is not how I would describe the development cycle where things change over time and occasionally we will ask for more money.

tl;dr: New AMD drivers mean that enabling AMD in 1.10 might now work better, but would likely lead to unforeseen problems making it still unusable.

Patrick Connor
Serif Europe Ltd

"There is nothing noble in being superior to your fellow man. True nobility lies in being superior to your previous self."  W. L. Sheldon

 


Early benchmarks on the new Radeon cards suggest they perform slower on OpenCL compared against their Nvidia direct competitor(s) (in this case, 4080). With Intel improving their ARC drivers and the clamoring about Nvidia's predatory pricing, maybe that will push AMD to put more care into their drivers... to be seen, hopefully.

[Image: image.png]

 


I can confirm that in one of the latest AMD driver updates (22.11.2) they did something with OpenCL 🤔

@Mark Ingram I see you are now ex-Serif (all the best in the new job); nonetheless, I want to let you know that your OpenCL test from GitHub now gives better results on my 6800 XT.

6800XT 22.11.1
Compiling kernel for device AMD Radeon RX 6800 XT (OpenCL 2.0 AMD-APP (3444.0)):
Run 1: 513.386ms
Run 2: 142.812ms
Run 3: 140.732ms
Run 4: 141.156ms
Run 5: 143.655ms
Run 6: 140.524ms
Run 7: 140.218ms
Run 8: 139.633ms
Run 9: 140.573ms
Run 10: 140.306ms
Average: 178.299ms

So still no ~50-60 ms average like on Nvidia, but ~180 ms is way better than ~1400 ms. Unfortunately, benchmark results in V2 are the same as they were before, so I don't know 😂
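As a quick sanity check on those numbers, here is a small sketch that parses the run times from the log above and recomputes the average, plus the average without the first (warm-up) run. The variable names are my own, and the ~1400 ms baseline is just the rough earlier figure mentioned above, not a new measurement.

```python
import re

log = """\
Run 1: 513.386ms
Run 2: 142.812ms
Run 3: 140.732ms
Run 4: 141.156ms
Run 5: 143.655ms
Run 6: 140.524ms
Run 7: 140.218ms
Run 8: 139.633ms
Run 9: 140.573ms
Run 10: 140.306ms
"""

# Pull the millisecond value out of every "Run N: X.XXXms" line
times = [float(t) for t in re.findall(r"Run \d+: ([\d.]+)ms", log)]

mean_all = sum(times) / len(times)             # includes the slow first run
mean_warm = sum(times[1:]) / len(times[1:])    # runs 2-10 only
improvement = 1400 / mean_all                  # versus the rough ~1400 ms from before

print(f"all runs: {mean_all:.3f} ms")
print(f"runs 2-10: {mean_warm:.3f} ms")
print(f"improvement vs ~1400 ms: {improvement:.1f}x")
```

The first run is more than three times slower than the rest, so the steady-state figure is closer to ~141 ms than to the reported average.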


Aww @Mark Ingram is gone?! Good luck on your new endeavors <3<3


Do you also face performance issues? Or is it mainly the benchmark? I just made the switch from macOS back to Windows on my machine and was utterly disappointed by the performance of Affinity Photo (V2). Moving around images and drawing with the pen tool is quite laggy.

Has anyone found a solution for the performance issue?

[Image: Affinity_Photo_Benchmark_Windows.png]
[Image: Affinity_Photo_Benchmark_MacOS.png]


1 hour ago, F1orian said:

Do you also face performance issues? Or is it mainly the benchmark? I just made the switch from macOS back to Windows on my machine and was utterly disappointed by the performance of Affinity Photo (V2). Moving around images and drawing with the pen tool is quite laggy.

Has anyone found a solution for the performance issue?

Well... yes... the whole topic (11 pages) is about this :)
Just turn off hardware acceleration on Windows with a Radeon (5XXX+). It will be fine with no lag, but you will lose some extra oomph in performance with more demanding tasks and bigger files.
FYI, AMD did improve things quite a bit lately (in an independent OpenCL bench, results are around 8 times better), but that did not change anything in the Photo benchmark results, so I guess it didn't change overall app performance either, unfortunately. Maybe after those AMD changes Affinity has to tune something in a patch to take advantage of the improvements, or maybe not. I don't know...
The other solution: switch back to macOS, or stay on Windows but use Adobe, or buy a GeForce / RTX A-series (formerly Quadro) or Intel Arc (sic!) card.


3 hours ago, F1orian said:

Do you also face performance issues? Or is it mainly the benchmark? I just made the switch from macOS back to Windows on my machine and was utterly disappointed by the performance of Affinity Photo (V2). Moving around images and drawing with the pen tool is quite laggy.

Has anyone found a solution for the performance issue?

[Image: Affinity_Photo_Benchmark_Windows.png]
[Image: Affinity_Photo_Benchmark_MacOS.png]

What do you mean? Did you have a Hackintosh?

