New OpenCL Miner

I created a new miner that needs some testing. If you are willing to run some benchmarks and post the results here, it would be much appreciated.

It should be compatible with all GPUs and iGPUs (yes, iGPUs like the HD6000). If it’s not working for one of your GPUs, please let me know.

Requires OpenCL 1.2

Link to miner here

Thanks all,
maztheman

I designed it for NVIDIA at the start, but AMD and Intel GPUs work pretty well too.

It’s programmed in OpenCL, so it should be compatible with any GPUs and CPUs with OpenCL 1.2 support.

The issue is I don’t have any “high-end” GPUs to test with, so I don’t know the performance on them.

GTX 650 - 13-17 sols/s
AMD Oland - ~20 sols/s
Intel HD4000 - 8-10 sols/s

It uses ~580MB of GPU memory and very little CPU time (except with some Intel GPUs, due to the driver).

For comparison, EWBF gets 16-17 sols/s on my GTX 650.

I need people to test with Intel Gen 5, 6, 7, 8 iGPUs and with GTX 480+.

Ok. Here is the GPU listing

C:\Users\Mark\Documents\igpu_v0.0.1>igpu -a
Platform #0: Intel(R) OpenCL : Device #0: Intel(R) HD Graphics 630 : Max Buf: 2047MiB
Platform #0: Intel(R) OpenCL : Device #1: Intel(R) Pentium(R) CPU G4620 @ 3.70GHz : Max Buf: 2018MiB
Platform #1: NVIDIA CUDA : Device #0: GeForce GTX 1080 Ti : Max Buf: 2816MiB
Platform #1: NVIDIA CUDA : Device #1: GeForce GTX 1080 Ti : Max Buf: 2816MiB

And the benchmark using -b2000
C:\Users\Mark\Documents\igpu_v0.0.1>igpu -spnvidia -sdgtx -b2000 -lus1-zcash.flypool.org -u XXXXXXXXXXXXXXXXXX
Setting log level to 2
[21:25:16][0x00001610] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1080 Ti GS=512
[21:25:16][0x00001ec8] Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1080 Ti GS=512

[21:25:17][0x00001610] Benchmark starting… this may take several minutes, please wait…
[21:25:24][0x00001610] Benchmark done!
[21:25:24][0x00001610] Total time : 6763 ms
[21:25:24][0x00001610] Total iterations: 2000
[21:25:24][0x00001610] Total solutions found: 3660
[21:25:24][0x00001610] Speed: 295.727 I/s
[21:25:24][0x00001610] Speed: 541.18 Sols/s

Normally I get approximately 750 sols/s out of each card. I’m not sure if it used both cards or just one. Also, I had to run these commands in the command line before I could get 541.18 sols/s out of it:
setx GPU_FORCE_64BIT_PTR 0 >nul 2>&1
setx GPU_MAX_HEAP_SIZE 100 >nul 2>&1
setx GPU_USE_SYNC_OBJECTS 1 >nul 2>&1
setx GPU_MAX_ALLOC_PERCENT 100 >nul 2>&1
setx GPU_SINGLE_ALLOC_PERCENT 100 >nul 2>&1
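
One thing to keep in mind: setx stores the variables permanently for the user and only takes effect in command windows opened afterwards, not in the window where it was run. As a rough sketch (not part of the miner), the same five variables could instead be passed to a single miner run from Python. The names `GPU_VARS`, `miner_env` and `run_benchmark` are made up for illustration, and whether these variables matter for a given OpenCL driver isn’t settled in this thread.

```python
import os
import subprocess

# The five variables from the setx commands above, as name -> value strings.
GPU_VARS = {
    "GPU_FORCE_64BIT_PTR": "0",
    "GPU_MAX_HEAP_SIZE": "100",
    "GPU_USE_SYNC_OBJECTS": "1",
    "GPU_MAX_ALLOC_PERCENT": "100",
    "GPU_SINGLE_ALLOC_PERCENT": "100",
}


def miner_env(base=None):
    """Return a copy of the environment with the GPU_* variables added."""
    env = dict(os.environ if base is None else base)
    env.update(GPU_VARS)
    return env


def run_benchmark(args):
    """Run one command with the variables set for that process only.

    `args` is the full command line as a list, e.g. the igpu.exe call you
    already use; nothing is written to the permanent user environment.
    """
    return subprocess.run(args, env=miner_env(), check=False)
```

Calling `run_benchmark([...])` with your usual igpu.exe arguments would start the miner with the variables applied, and any other program keeps its normal environment.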

That is just 1 card, to use both your cards you would need to go:
igpu.exe -p 1,1 -d 0,1 -b 2000
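
To show how the -p and -d values line up with the `igpu -a` listing, here is a small Python sketch that reads the listing text and builds the two arguments for every device on a platform. It only assumes the listing format shown earlier in the thread; `parse_listing` and `select_args` are hypothetical helper names, not something the miner provides.

```python
import re

# Matches one line of the `igpu -a` listing shown earlier in the thread.
LISTING_RE = re.compile(
    r"Platform #(\d+): (.+?) : Device #(\d+): (.+?) : Max Buf: (\d+)MiB"
)


def parse_listing(text):
    """Return (platform_id, platform_name, device_id, device_name) tuples."""
    devices = []
    for line in text.splitlines():
        m = LISTING_RE.search(line)
        if m:
            devices.append(
                (int(m.group(1)), m.group(2), int(m.group(3)), m.group(4))
            )
    return devices


def select_args(devices, platform_substring):
    """Build -p/-d arguments for every device whose platform name matches."""
    wanted = platform_substring.lower()
    chosen = [d for d in devices if wanted in d[1].lower()]
    if not chosen:
        raise ValueError("no device on a platform matching %r" % platform_substring)
    # One platform entry per device, in the same order as the device ids.
    platforms = ",".join(str(d[0]) for d in chosen)
    device_ids = ",".join(str(d[2]) for d in chosen)
    return ["-p", platforms, "-d", device_ids]
```

With the listing from the 2x 1080 Ti machine above, `select_args(parse_listing(listing), "nvidia")` gives `["-p", "1,1", "-d", "0,1"]`, the same values as in the command.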

could you also do a benchmark on your igpu?
igpu.exe -p 0 -d 0 -b 50

oh, and thanks for the setx commands, that is useful info

The maximum sols/s for your card is usually about 2x the I/s, so there is still some room for an increase with a longer benchmark.
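
For anyone who wants to check the numbers, the two speed lines follow directly from the totals in the benchmark output. A minimal Python sketch, using the 1080 Ti run above (6763 ms, 2000 iterations, 3660 solutions); the function name `benchmark_rates` is just for illustration, and the 2x figure is the rule of thumb from this post, not a measured limit.

```python
def benchmark_rates(total_ms, iterations, solutions):
    """Return (iterations per second, solutions per second) for one run."""
    seconds = total_ms / 1000.0
    return iterations / seconds, solutions / seconds


# Totals taken from the single 1080 Ti benchmark posted above.
its, sols = benchmark_rates(6763, 2000, 3660)
print(f"{its:.3f} I/s, {sols:.2f} Sols/s")

# Rule of thumb from this post: peak Sols/s is roughly twice the I/s.
print(f"rule-of-thumb peak: {2 * its:.0f} Sols/s")
```

This reproduces the 295.727 I/s and 541.18 Sols/s from the log. That run found 3660 / 2000 = 1.83 solutions per iteration, a little under the 2x used for the estimate.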

Hello!

Thank you for this opportunity to test a new miner!

I have 2x gtx 1070s and an integrated gpu in my old AMD processor:

Listing

igpu -a
Platform #0: AMD Accelerated Parallel Processing : Device #0: AMD A10-6790K APU with Radeon™ HD Graphics : Max Buf: 2800MiB
Platform #0: AMD Accelerated Parallel Processing : Device #1: Devastator : Max Buf: 512MiB
Platform #1: NVIDIA CUDA : Device #0: GeForce GTX 1070 : Max Buf: 2048MiB
Platform #1: NVIDIA CUDA : Device #1: GeForce GTX 1070 : Max Buf: 2048MiB

I made a bat file with the command that you have specified:
igpu.exe -p 1,1 -d 0,1 -b 2000

Here are the results for tests:

Test 1

igpu_v0.0.1>igpu.exe -p 1,1 -d 0,1 -b 2000
Setting log level to 2
[11:58:43][0x00002230] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1070 GS=512
[11:58:43][0x00002230] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#1 GeForce GTX 1070 GS=512
[11:58:43][0x00002178] Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1070 GS=512
[11:58:43][0x000000d0] Platform #1 'NVIDIA CUDA ', D#1 GeForce GTX 1070 GS=512
[11:58:44][0x00002230] Benchmark starting… this may take several minutes, please wait…
[11:58:50][0x00002230] Benchmark done!
[11:58:50][0x00002230] Total time : 5776 ms
[11:58:50][0x00002230] Total iterations: 2000
[11:58:50][0x00002230] Total solutions found: 3740
[11:58:50][0x00002230] Speed: 346.26 I/s
[11:58:50][0x00002230] Speed: 647.507 Sols/s

Test 2

igpu_v0.0.1>igpu.exe -p 1,1 -d 0,1 -b 2000
Setting log level to 2
[12:06:17][0x000028f4] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1070 GS=512
[12:06:18][0x000028f4] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#1 GeForce GTX 1070 GS=512
[12:06:18][0x000026cc] Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1070 GS=512

[12:06:18][0x0000151c] Platform #1 'NVIDIA CUDA ', D#1 GeForce GTX 1070 GS=512

[12:06:19][0x000028f4] Benchmark starting… this may take several minutes, please wait…
[12:06:24][0x000028f4] Benchmark done!
[12:06:24][0x000028f4] Total time : 5765 ms
[12:06:24][0x000028f4] Total iterations: 2000
[12:06:24][0x000028f4] Total solutions found: 3756
[12:06:24][0x000028f4] Speed: 346.921 I/s
[12:06:24][0x000028f4] Speed: 651.518 Sols/s

I’ve made a second batch file to test on a pool, similar to what MColeman did:

Pool test 1

F:\CRYPT\igpu_v0.0.1>igpu.exe -spnvidia -sdgtx -b2000 -lus1-zcash.flypool.org -u XXXXXXXXXX
Setting log level to 2
[12:12:32][0x000027fc] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1070 GS=512
[12:12:32][0x00000314] Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1070 GS=512

[12:12:33][0x000027fc] Benchmark starting… this may take several minutes, please wait…
[12:12:45][0x000027fc] Benchmark done!
[12:12:45][0x000027fc] Total time : 11261 ms
[12:12:45][0x000027fc] Total iterations: 2000
[12:12:45][0x000027fc] Total solutions found: 3626
[12:12:45][0x000027fc] Speed: 177.604 I/s
[12:12:45][0x000027fc] Speed: 321.996 Sols/s

Pool test 2

F:\CRYPT\igpu_v0.0.1>igpu.exe -spnvidia -sdgtx -b2000 -lus1-zcash.flypool.org -u XXXXXXXXXXXXX

Setting log level to 2
[12:12:51][0x000024e0] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1070 GS=512
[12:12:51][0x00002258] Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1070 GS=512

[12:12:52][0x000024e0] Benchmark starting… this may take several minutes, please wait…
[12:13:03][0x000024e0] Benchmark done!
[12:13:03][0x000024e0] Total time : 11296 ms
[12:13:03][0x000024e0] Total iterations: 2000
[12:13:03][0x000024e0] Total solutions found: 3676
[12:13:03][0x000024e0] Speed: 177.054 I/s
[12:13:03][0x000024e0] Speed: 325.425 Sols/s

I’m not much of a command prompt person, even though I’ve been using computers since Windows 3.11 (I used Norton Commander most of the time).
Not really sure if the tests used both or only one of the 1070s.
Nor do I know why there is such a big difference in Sols/s between Tests and Pool tests.

Hope this helps.

Thanks for the info. It looks like the first 2 tests used both 1070s and the last 2 tests used just 1 GPU. The -sp option is like a text search that looks up the platform by name. I thought it might be useful, but it seems that people just want to use all GPUs attached to the platform, so I might change it to work like that. The first couple of tests are interesting, since the iteration rate is 346 I/s, which means the peak would be around 700 sols/s.
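
A quick way to tell how many GPUs a run used is to count the "Benchmarking OPENCL worker" lines in the output, one per device. Here is a rough Python sketch that does this and divides the final Sols/s by the count. It relies only on the log format posted in this thread, and `summarize_benchmark` is a hypothetical helper name.

```python
import re


def summarize_benchmark(log_text):
    """Return (gpu_count, total_sols_per_s, sols_per_s_per_gpu) for one run."""
    # Each device in the run gets its own "Benchmarking OPENCL worker" line.
    gpu_count = sum(
        1 for line in log_text.splitlines() if "Benchmarking OPENCL worker" in line
    )
    # The I/s line is skipped because the pattern requires the Sols/s unit.
    match = re.search(r"Speed:\s*([\d.]+)\s*Sols/s", log_text)
    if gpu_count == 0 or match is None:
        raise ValueError("not a complete benchmark log")
    total = float(match.group(1))
    return gpu_count, total, total / gpu_count
```

On Test 1 above this reports 2 GPUs at 647.507 Sols/s in total, about 324 each, and on Pool test 1 it reports 1 GPU at 321.996 Sols/s. The per-GPU figure is nearly the same in both, which fits the reading that only the number of cards differed.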

Thank you for submitting your results.

Oh, and I almost forgot to mention that with both 1070s I usually get around 990 Sols/s (± 15 Sols/s) on Flypool with the EWBF miner:

Flypool sample Zcash mining

INFO 12:47:26: GPU0 Accepted share 211ms [A:11, R:0]
INFO 12:47:31: GPU1 Accepted share 212ms [A:18, R:0]
Temp: GPU0: 52C GPU1: 49C
GPU0: 492 Sol/s GPU1: 491 Sol/s
Total speed: 983 Sol/s
INFO 12:47:32: GPU1 Accepted share 213ms [A:19, R:0]
INFO 12:47:58: GPU0 Accepted share 210ms [A:12, R:0]
Temp: GPU0: 53C GPU1: 50C
GPU0: 502 Sol/s GPU1: 492 Sol/s
Total speed: 994 Sol/s
INFO 12:48:14: GPU0 Accepted share 212ms [A:13, R:0]
INFO 12:48:22: GPU1 Accepted share 2547ms [A:20, R:0]
Temp: GPU0: 53C GPU1: 50C
GPU0: 496 Sol/s GPU1: 496 Sol/s
Total speed: 992 Sol/s
INFO 12:48:38: GPU1 Accepted share 214ms [A:21, R:0]
INFO 12:48:38: GPU0 Accepted share 211ms [A:14, R:0]
INFO 12:48:40: GPU0 Accepted share 212ms [A:15, R:0]
INFO: Detected new work: 2b143f9997751f745440
INFO 12:48:57: GPU1 Accepted share 214ms [A:22, R:0]
Temp: GPU0: 54C GPU1: 50C
GPU0: 494 Sol/s GPU1: 502 Sol/s
Total speed: 996 Sol/s
INFO 12:49:05: GPU0 Accepted share 219ms [A:16, R:0]
INFO 12:49:06: GPU1 Accepted share 213ms [A:23, R:0]
INFO 12:49:13: GPU1 Accepted share 212ms [A:24, R:0]
INFO 12:49:27: GPU0 Accepted share 211ms [A:17, R:0]
INFO 12:49:28: GPU1 Accepted share 212ms [A:25, R:0]
Temp: GPU0: 54C GPU1: 51C
GPU0: 501 Sol/s GPU1: 496 Sol/s
Total speed: 997 Sol/s
INFO 12:49:34: GPU0 Accepted share 211ms [A:18, R:0]
INFO 12:49:49: GPU0 Accepted share 211ms [A:19, R:0]
INFO 12:49:50: GPU0 Accepted share 210ms [A:20, R:0]
INFO 12:49:55: GPU0 Accepted share 211ms [A:21, R:0]
Temp: GPU0: 54C GPU1: 51C
GPU0: 494 Sol/s GPU1: 500 Sol/s
Total speed: 994 Sol/s
INFO 12:50:20: GPU1 Accepted share 1658ms [A:26, R:0]
INFO 12:50:25: GPU1 Accepted share 211ms [A:27, R:0]
Temp: GPU0: 54C GPU1: 50C
GPU0: 498 Sol/s GPU1: 496 Sol/s
Total speed: 994 Sol/s
INFO 12:50:32: GPU1 Accepted share 212ms [A:28, R:0]
INFO 12:50:32: GPU0 Accepted share 212ms [A:22, R:0]
INFO 12:50:39: GPU1 Accepted share 211ms [A:29, R:0]
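
To put one number on that sample, the "Total speed" lines can be averaged. A small Python sketch, assuming only the EWBF output format shown above; `average_total_speed` is a made-up name for illustration.

```python
import re


def average_total_speed(log_text):
    """Average every 'Total speed: N Sol/s' reading found in the log text."""
    speeds = [int(s) for s in re.findall(r"Total speed:\s*(\d+)\s*Sol/s", log_text)]
    if not speeds:
        raise ValueError("no 'Total speed' lines found")
    return sum(speeds) / len(speeds)
```

The seven readings in the sample (983, 994, 992, 996, 997, 994, 994) add up to 6950, so the average comes out at roughly 993 Sol/s for the two cards together.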

OK, thanks for the comparison. There are definitely more optimizations I can do for higher-end NVIDIA GPUs.

Great :smiley:

I’ll be glad to help with further testing in the future.
I’ll follow this thread here and the project on GitHub.

May I know what your settings are for the 1070? I can only get 420 sol/s in Windows 10.

Sure :smiley:
My 1070s are:
Gigabyte GTX 1070 G1 Gaming and Asus GTX 1070 Strix.
The weird thing is that even though the Asus card has slightly higher base and boost clocks, it still doesn’t overclock as well as the Gigabyte one.
Almost a year ago, when I was buying the Gigabyte card, I read somewhere that they differ in power phases.
Here, I found it:

Comment by Chillsabre @ Partpicker

https://pcpartpicker.com/forums/topic/170290-asus-strix-gaming-oc-gtx-1070-vs-gigabyte-g1-gaming-gtx-1070#cx1819212

Dunno if that is true or not though.

I have increased the power limit to 111%, but I try to keep the cards colder than 60 C with a cheapo Chinese made fan.

Here are the settings and the fan curve for both GPUs

OC Settings - Album on Imgur

Hope this helps :))

Sure. Here you go
C:\Users\Mark\Documents\igpu_v0.0.1>igpu.exe -p 1,1 -d 0,1 -b 2000
Setting log level to 2
[18:40:33][0x000005a8] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1080 Ti GS=512
[18:40:33][0x000005a8] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#1 GeForce GTX 1080 Ti GS=512
[18:40:33][0x000014bc] Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1080 Ti GS=512

[18:40:33][0x00001618] Platform #1 'NVIDIA CUDA ', D#1 GeForce GTX 1080 Ti GS=512

[18:40:34][0x000005a8] Benchmark starting… this may take several minutes, please wait…
[18:40:38][0x000005a8] Benchmark done!
[18:40:38][0x000005a8] Total time : 3436 ms
[18:40:38][0x000005a8] Total iterations: 2000
[18:40:38][0x000005a8] Total solutions found: 3661
[18:40:38][0x000005a8] Speed: 582.072 I/s
[18:40:38][0x000005a8] Speed: 1065.48 Sols/s

C:\Users\Mark\Documents\igpu_v0.0.1>setx GPU_FORCE_64BIT_PTR 0 >nul 2>&1

C:\Users\Mark\Documents\igpu_v0.0.1>
C:\Users\Mark\Documents\igpu_v0.0.1>setx GPU_MAX_HEAP_SIZE 100 >nul 2>&1

C:\Users\Mark\Documents\igpu_v0.0.1>setx GPU_USE_SYNC_OBJECTS 1 >nul 2>&1

C:\Users\Mark\Documents\igpu_v0.0.1>setx GPU_MAX_ALLOC_PERCENT 100 >nul 2>&1

C:\Users\Mark\Documents\igpu_v0.0.1>setx GPU_SINGLE_ALLOC_PERCENT 100 >nul 2>&1

C:\Users\Mark\Documents\igpu_v0.0.1>igpu.exe -p 1,1 -d 0,1 -b 2000
Setting log level to 2
[18:43:26][0x00000abc] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1080 Ti GS=512
[18:43:27][0x00000abc] Benchmarking OPENCL worker (OCL-KAK) Platform #1 'NVIDIA CUDA ', D#1 GeForce GTX 1080 Ti GS=512
[18:43:27][0x00001ef4] Platform #1 'NVIDIA CUDA ', D#1 GeForce GTX 1080 Ti GS=512

[18:43:27][0x000021ac] Platform #1 'NVIDIA CUDA ', D#0 GeForce GTX 1080 Ti GS=512

[18:43:28][0x00000abc] Benchmark starting… this may take several minutes, please wait…
[18:43:31][0x00000abc] Benchmark done!
[18:43:31][0x00000abc] Total time : 3430 ms
[18:43:31][0x00000abc] Total iterations: 2000
[18:43:31][0x00000abc] Total solutions found: 3677
[18:43:31][0x00000abc] Speed: 583.09 I/s
[18:43:31][0x00000abc] Speed: 1072.01 Sols/s

and with the Intel GPU
C:\Users\Mark\Documents\igpu_v0.0.1>igpu.exe -p 0 -d 0 -b 50
Setting log level to 2
[18:44:49][0x00000698] Benchmarking OPENCL worker (OCL-KAK) Platform #0 'Intel(R) OpenCL ', D#0 Intel(R) HD Graphics 630 GS=256
[18:44:49][0x00000da4] Platform #0 'Intel(R) OpenCL ', D#0 Intel(R) HD Graphics 630 GS=256

[18:44:50][0x00000698] Benchmark starting… this may take several minutes, please wait…
[18:44:58][0x00000698] Benchmark done!
[18:44:58][0x00000698] Total time : 8319 ms
[18:44:58][0x00000698] Total iterations: 50
[18:44:58][0x00000698] Total solutions found: 99
[18:44:58][0x00000698] Speed: 6.01034 I/s
[18:44:58][0x00000698] Speed: 11.9005 Sols/s

For the NVIDIA cards that’s about 500 sols/s less than I get with the EWBF miner. Card settings are 80% power, +200 core clock, +550 mem clock.

OK, thanks for the feedback. I only use 512 threads instead of 1024, so I’m guessing that’s robbing a lot of performance. When I get my 1070 I’ll be able to tune it much higher.

Thanks all!

I have to consider electricity cost when adjusting the power limit. Right now the power efficiency is OK, at around 3.7 to 4. If I raise the limit that high, my Gainward 1070 is not stable.

Fair enough, mate.

I suggest you fine tune the OC settings to find the stable and optimal parameters and then lower them just one step down.
I read this somewhere online and have been following it since.

Also, from my experience, stability varies per algorithm:
Last month I was mining ZEC for a week with what I thought were stable settings. Then I decided to mine through NiceHash for some time, and when Excavator switched to Lyra2rev2, one of the GPUs crashed and reverted to default settings, so I had to adjust the settings.

Best of luck!

Each algo has different stable settings. When you’re using a multipool like NiceHash, you need a very small overclock. When you are only mining 1 coin, you can use an aggressive OC.

just wanted to bump the thread to see if there is any new information or updates on your miner?

@CitricAcid
I haven’t posted any new binaries. I just got a GTX 1070 and I’m playing around with that at the moment. Hopefully I will get performance increases in the next week or so.

Thanks for the interest!

Dev notes:

Performance on my initial binary was around 300 sols/s on the 1070, which is significantly less than EWBF, but that miner uses a different row count and slot size than what I had created. I have figured out a way of using fewer rows and more shared memory to make the most of the shared memory on Pascal cards. I just haven’t finished recoding it all.

Also, I want to make it easier to start the miner with all attached GPUs that are capable of running it, with a way of overriding that behavior from the command line.

During this testing phase there is no dev fee (not that you would want to use it yet, given the low performance), and I look forward to getting it out there and seeing where I can go with it.

I have a single high-end 1080 Ti, so I will give it a try on the weekend. Also, a nice excuse to get hacking on OpenCL :wink: