[PrimeGrid] Changes to GFN Apps

Post by BOINC_News »

I just installed the new Windows and Linux (but not Mac) CPU versions for the following GFN projects:

GFN-16 (Yes, you can run GFN-16 on CPUs again!)
GFN-17-LOW
GFN-17-MEGA (Yes, you can run GFN-17-MEGA on CPUs again!)
GFN-18
GFN-19
GFN-20

Genefer 3.3.5 provides substantial speed boosts for CPU on those projects.

This applies only to the CPU apps. GPU apps are not affected.

GFN-21 has not been updated because it's still capable of running the FP64 transforms, which are the fastest.

I have removed the CPU GFN-22 app. It was a mistake.

Mac versions of the new app will be made available when testing is completed, but this has been delayed by a lack of Mac testers. If you want to help out testing the Mac version, head over to the 3.3.5 testing thread.

Discussion about the new release can be found in the Genefer 3.3.5 release thread.

Source: http://www.primegrid.com/forum_thread.php?id=9463

Post by StefanR5R »

From the "Genefer 3.3.5 testing" thread:
Yves Gallot wrote: This new version is a CPU app. The transform of genefer 3.3.4 for n <= 20 was based on the x86 extended precision format (80-bit) implemented in the Intel 8087. "x87" is still available for backward compatibility but is slow today. genefer 3.3.5 implements a new transform which is five times faster with AVX/FMA3 instructions. This transform is an Irrational Base Discrete Transform (named irrational).

Four implementations are available for x64 CPUs: FMA, AVX, SSE4 and SSE2. F64 is also available for x86 CPUs (if a processor without SSE2 is still running?). Multithreading is not supported.
Note that the larger GFNs cannot make full use of a CPU's execution units. Their data footprint exceeds typical processor cache sizes, so they are bottlenecked by memory bandwidth:
Yves Gallot wrote: Allocated memory sizes are GFN20: 16 MB, GFN19: 8 MB and GFN18: 4 MB.
GFN-22 for CPU was removed, as mentioned in the news item. I.e. GFN-22 is a GPU-only project now.
GFN-21 for CPU (unchanged, as mentioned in the news) does support multithreading, and certainly needs it to be viable.
The lack of multithreading in GFN-18 through GFN-20 makes these less practical on CPUs, due to the large cache footprint per task.
GFN-17-Mega, GFN-17-Low, and GFN-16 should work quite well on CPUs, even compared with GPUs, I presume.
GFN-15 remains a GPU-only project.
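
The quoted allocation sizes are consistent with 16 bytes per transform element times 2^n elements (for GFN-20, 2^20 × 16 B = 16 MB). The following is only a rough sketch under that assumption: the 16-byte figure is inferred from the three quoted sizes, not taken from the Genefer source, and the 32 MB cache size in the example is a made-up value for illustration. It estimates the footprint for other n and how many single-threaded tasks would fit in a given amount of cache.

```python
# Assumption: 16 bytes per element, inferred from the quoted sizes
# (16 MB / 2**20 elements). Not taken from the Genefer source code.
BYTES_PER_ELEMENT = 16


def gfn_footprint_mib(n: int) -> float:
    """Estimated allocated memory in MiB for one GFN-n task (2**n elements)."""
    return (2 ** n) * BYTES_PER_ELEMENT / 2 ** 20


def tasks_fitting_in_cache(n: int, cache_mib: float) -> int:
    """How many GFN-n tasks fit entirely into cache_mib of cache (estimate)."""
    return int(cache_mib // gfn_footprint_mib(n))


if __name__ == "__main__":
    cache_mib = 32  # hypothetical last-level cache size, for illustration only
    for n in range(16, 21):
        print(f"GFN-{n}: ~{gfn_footprint_mib(n):g} MiB per task, "
              f"{tasks_fitting_in_cache(n, cache_mib)} task(s) fit in {cache_mib} MiB")
```

By this estimate a GFN-20 task needs as much cache as sixteen GFN-16 tasks, which is why the smaller subprojects look like the better fit for running one task per core.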

Post by biodoc »

BOINC_News wrote: Fri Dec 04, 2020 10:53 am
GFN-21 has not been updated because it's still capable of running the FP64 transforms, which are the fastest.
Another app for the Radeon VII. :) Maybe I'll try it out during the PG GPU challenge later this month unless, of course, it's needed elsewhere. :)

Post by StefanR5R »

biodoc wrote: Sun Dec 06, 2020 5:11 am
BOINC_News wrote: Fri Dec 04, 2020 10:53 am
GFN-21 has not been updated because it's still capable of running the FP64 transforms, which are the fastest.
Another app for the Radeon VII. :) Maybe I'll try it out during the PG GPU challenge later this month unless, of course, it's needed elsewhere. :)
I think the announcement refers to FP64 usage in the CPU application version.

The GPU application versions may be using INT64 nowadays, but I am not sure about that.

Post by biodoc »

This is an interesting thread on PG. I'm not clear on what it means exactly but it sounds like GFN-21 and GFN-22 tasks may use FP64 on some cards.

Post by StefanR5R »

Here is a current result of GFN-19 on Nvidia V100:
http://www.primegrid.com/result.php?resultid=1149989755

Stderr output

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<stderr_txt>
geneferocl 3.3.3-2 (Linux/OpenCL/64-bit)

Copyright 2001-2018, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Michael Goetz, Ronald Schneider
Copyright 2011-2018, Iain Bethune
Genefer is free source code, under the MIT license.

Running on platform 'NVIDIA CUDA', device 'GRID V100DX-8Q', vendor 'NVIDIA Corporation', version 'OpenCL 1.2 CUDA' and driver '410.92'.
80 computeUnits @ 1530MHz, memSize=8192MB, cacheSize=1280kB, cacheLineSize=128B, localMemSize=48kB, maxWorkGroupSize=1024.
Supported transform implementations: ocl ocl2 ocl3 ocl4 ocl5 

Command line: ../../projects/www.primegrid.com/geneferocl_linux64_3.3.3-2 -boinc -q 4022674^524288+1 --device 0 

Normal priority change failed (needs superuser privileges.
Checking available transform implementations...
OCL transform is past its b limit.
OCL4 transform is past its b limit.
A benchmark is needed to determine best transform, testing available transform implementations...
Testing OCL2 transform...
Testing OCL3 transform...
Testing OCL5 transform...
Benchmarks completed (3.998 seconds).
Using OCL5 transform
Starting initialization...
Initialization complete (1.437 seconds).
Testing 4022674^524288+1...
Estimated time for 4022674^524288+1 is 0:20:10                 
4022674^524288+1 is complete. (3462668 digits) (err = 0.0000) (time = 0:20:15) 21:33:34
21:33:34 (21486): called boinc_finish

</stderr_txt>
]]>
And GFN-extreme ("do you feel lucky"):
http://www.primegrid.com/result.php?resultid=1154401235

Stderr output

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<stderr_txt>
geneferocl 3.3.3-2 (Linux/OpenCL/64-bit)

Copyright 2001-2018, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Michael Goetz, Ronald Schneider
Copyright 2011-2018, Iain Bethune
Genefer is free source code, under the MIT license.

Running on platform 'NVIDIA CUDA', device 'GRID V100DX-8Q', vendor 'NVIDIA Corporation', version 'OpenCL 1.2 CUDA' and driver '410.92'.
80 computeUnits @ 1530MHz, memSize=8192MB, cacheSize=1280kB, cacheLineSize=128B, localMemSize=48kB, maxWorkGroupSize=1024.
Supported transform implementations: ocl ocl2 ocl3 ocl4 ocl5 

Command line: ../../projects/www.primegrid.com/geneferocl_linux64_3.3.3-2 -boinc -q 920852^4194304+1 --device 0 

Normal priority change failed (needs superuser privileges.
Checking available transform implementations...
OCL transform is past its b limit.
OCL4 transform is past its b limit.
A benchmark is needed to determine best transform, testing available transform implementations...
Testing OCL2 transform...
Testing OCL3 transform...
Testing OCL5 transform...
Benchmarks completed (23.083 seconds).
Using OCL5 transform
Starting initialization...
Initialization complete (27.298 seconds).
Testing 920852^4194304+1...
Estimated time for 920852^4194304+1 is 18:40:00                 
920852^4194304+1 is complete. (25015626 digits) (err = 0.0000) (time = 18:50:21) 17:42:23
17:42:23 (27824): called boinc_finish

</stderr_txt>
]]>
...and another one, at GFN-20:
http://www.primegrid.com/result.php?resultid=1143807952

Stderr output

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<stderr_txt>
geneferocl 3.3.3-2 (Linux/OpenCL/64-bit)

Copyright 2001-2018, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Michael Goetz, Ronald Schneider
Copyright 2011-2018, Iain Bethune
Genefer is free source code, under the MIT license.

Running on platform 'NVIDIA CUDA', device 'GRID V100DX-8Q', vendor 'NVIDIA Corporation', version 'OpenCL 1.2 CUDA' and driver '410.92'.
80 computeUnits @ 1530MHz, memSize=8192MB, cacheSize=1280kB, cacheLineSize=128B, localMemSize=48kB, maxWorkGroupSize=1024.
Supported transform implementations: ocl ocl2 ocl3 ocl4 ocl5 

Command line: ../../projects/www.primegrid.com/geneferocl_linux64_3.3.3-2 -boinc -q 1506022^1048576+1 --device 0 

Normal priority change failed (needs superuser privileges.
Checking available transform implementations...
OCL transform is past its b limit.
OCL4 transform is past its b limit.
A benchmark is needed to determine best transform, testing available transform implementations...
Testing OCL2 transform...
Testing OCL3 transform...
Testing OCL5 transform...
Benchmarks completed (6.607 seconds).
Using OCL5 transform
Starting initialization...
Initialization complete (3.571 seconds).
Testing 1506022^1048576+1...
Estimated time for 1506022^1048576+1 is 1:13:00                 
1506022^1048576+1 is complete. (6477926 digits) (err = 0.0000) (time = 1:13:48) 01:33:16
01:33:16 (18828): called boinc_finish

</stderr_txt>
]]>
They are all using the "OCL5" transform, whatever that means.
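
For comparing many results, the transform choice can be pulled out of the stderr text automatically. This is a minimal sketch that relies only on the line patterns visible in the logs above ("... transform is past its b limit.", "Testing ... transform...", "Using ... transform"). The function name `summarize_transforms` is made up for this example, and other Genefer versions may word these lines differently.

```python
import re

# Patterns copied from the wording seen in the geneferocl 3.3.3-2 logs above.
PAST_LIMIT_RE = re.compile(r"^(\w+) transform is past its b limit\.", re.MULTILINE)
TESTED_RE = re.compile(r"^Testing (\w+) transform\.\.\.", re.MULTILINE)
USING_RE = re.compile(r"^Using (\w+) transform\s*$", re.MULTILINE)


def summarize_transforms(stderr_text: str) -> dict:
    """Return which transforms were ruled out, benchmarked, and finally chosen."""
    chosen = USING_RE.search(stderr_text)
    return {
        "past_b_limit": PAST_LIMIT_RE.findall(stderr_text),
        "benchmarked": TESTED_RE.findall(stderr_text),
        "chosen": chosen.group(1) if chosen else None,
    }


# Excerpt of the GFN-19 log shown above.
SAMPLE = """Checking available transform implementations...
OCL transform is past its b limit.
OCL4 transform is past its b limit.
A benchmark is needed to determine best transform, testing available transform implementations...
Testing OCL2 transform...
Testing OCL3 transform...
Testing OCL5 transform...
Benchmarks completed (3.998 seconds).
Using OCL5 transform
Testing 4022674^524288+1...
"""

if __name__ == "__main__":
    print(summarize_transforms(SAMPLE))
```

The "Testing 4022674^524288+1..." line is not counted as a benchmark because the pattern requires the word "transform" after the name.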

Current results from Radeon VII:
GFN-17-Mega, http://www.primegrid.com/result.php?resultid=1155605729 (using the OCL2 transform)

Stderr output

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
geneferocl 3.3.3-2 (Windows/OpenCL/32-bit)

Copyright 2001-2018, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Michael Goetz, Ronald Schneider
Copyright 2011-2018, Iain Bethune
Genefer is free source code, under the MIT license.

Running on platform 'AMD Accelerated Parallel Processing', device 'gfx906', vendor 'Advanced Micro Devices, Inc.', version 'OpenCL 1.2 AMD-APP (3188.4)' and driver '3188.4 (PAL,HSAIL)'.
60 computeUnits @ 1801MHz, memSize=3072MB, cacheSize=16kB, cacheLineSize=64B, localMemSize=32kB, maxWorkGroupSize=256.
Supported transform implementations: ocl ocl2 ocl3 ocl4 ocl5 

Command line: projects/www.primegrid.com/geneferocl_windows_3.3.3-2.exe -boinc -q 81020774^131072+1 

Normal priority change succeeded.
Checking available transform implementations...
OCL transform is past its b limit.
OCL3 transform is past its b limit.
OCL4 transform is past its b limit.
OCL5 transform is past its b limit.
Using OCL2 transform
Starting initialization...
Initialization complete (0.335 seconds).
Testing 81020774^131072+1...
Estimated time for 81020774^131072+1 is 0:06:15                 
81020774^131072+1 is complete. (1036596 digits) (err = 0.0000) (time = 0:06:58) 04:14:24
04:14:24 (35372): called boinc_finish(0)

</stderr_txt>
]]>
GFN-22, http://www.primegrid.com/result.php?resultid=1152458165 (using the OCL4 transform)

Stderr output

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
geneferocl 3.3.3-2 (Windows/OpenCL/32-bit)

Copyright 2001-2018, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Michael Goetz, Ronald Schneider
Copyright 2011-2018, Iain Bethune
Genefer is free source code, under the MIT license.

Running on platform 'AMD Accelerated Parallel Processing', device 'gfx1030', vendor 'Advanced Micro Devices, Inc.', version 'OpenCL 1.2 AMD-APP (3188.4)' and driver '3188.4 (PAL,LC)'.
30 computeUnits @ 1815MHz, memSize=3072MB, cacheSize=16kB, cacheLineSize=64B, localMemSize=64kB, maxWorkGroupSize=256.
Supported transform implementations: ocl ocl2 ocl3 ocl4 ocl5 

Command line: projects/www.primegrid.com/geneferocl_windows_3.3.3-2.exe -boinc -q 185726^4194304+1 

Normal priority change succeeded.
Checking available transform implementations...
A benchmark is needed to determine best transform, testing available transform implementations...
Testing OCL transform...
Testing OCL2 transform...
Testing OCL3 transform...
Testing OCL4 transform...
Testing OCL5 transform...
Benchmarks completed (36.931 seconds).
Using OCL4 transform
Starting initialization...
Initialization complete (37.293 seconds).
Testing 185726^4194304+1...
Estimated time for 185726^4194304+1 is 18:00:00                 
185726^4194304+1 is complete. (22099254 digits) (err = 0.0000) (time = 18:29:14) 11:15:19
11:15:19 (37344): called boinc_finish(0)

</stderr_txt>
]]>
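
The candidates in these logs all have the form b^N+1 with N a power of two, so the GFN level and the digit count can be checked from the command line alone. This is a small sketch: n is log2(N), and the number of decimal digits is floor(N·log10(b)) + 1. The function names are made up for this example, and the digit count uses floating-point logarithms, so it could in principle be off by one if N·log10(b) lands extremely close to an integer.

```python
import math
import re

CANDIDATE_RE = re.compile(r"(\d+)\^(\d+)\+1")


def parse_candidate(text: str) -> tuple[int, int]:
    """Extract (b, N) from a string containing a candidate like '185726^4194304+1'."""
    match = CANDIDATE_RE.search(text)
    if match is None:
        raise ValueError("no b^N+1 candidate found")
    return int(match.group(1)), int(match.group(2))


def gfn_level(exponent: int) -> int:
    """Return n such that exponent == 2**n (the 'n' in GFN-n)."""
    if exponent <= 0 or exponent & (exponent - 1):
        raise ValueError("exponent is not a power of two")
    return exponent.bit_length() - 1


def decimal_digits(b: int, exponent: int) -> int:
    """Approximate decimal digit count of b**exponent + 1 via logarithms."""
    return math.floor(exponent * math.log10(b)) + 1


if __name__ == "__main__":
    for line in ("-boinc -q 4022674^524288+1 --device 0",
                 "-boinc -q 185726^4194304+1"):
        b, n_exp = parse_candidate(line)
        print(f"{b}^{n_exp}+1: GFN-{gfn_level(n_exp)}, "
              f"{decimal_digits(b, n_exp)} digits")
```

The digit counts this produces can be compared against the "(... digits)" figures in the completion lines of the logs.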