omaralv that’s great! I have seen some people interested in CUDA for Nvidia, and the other mining software isn’t touching it yet since most miners run AMD.
You are correct, I haven’t committed any of the kernel code that I have locally, simply because it isn’t ready yet. If I have enough time, I hope to commit a basic generalized birthday problem solver soon.
I’m not familiar with bitonic sort, or with any limitations it might have when applied to the generalized birthday problem (GBP) for PoW. I will look into this more!
Items:
More research on kernel development.
Improved javascript example (will source soon!)
Current solvers run in two steps. There is an issue, since deprioritized, to combine these steps as an optimization. I am not sure whether the current GPU miners are already doing this.
1. Sort the current round's list using std::sort (which uses a quicksort variant).
2. In a single pass, transform the list into the next round's list.
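To make the two steps concrete, here is a minimal Python sketch of a single round. It is not str4d’s code: the name `next_round`, the use of plain integers for hash values, and the `(value, indices)` row layout are all assumptions made for illustration.

```python
from itertools import combinations

def next_round(rows, remaining_bits, collision_bits):
    """One round of the basic solver (illustrative sketch, hypothetical layout).

    rows: list of (value, indices), where value is an int that fits in
    `remaining_bits` bits and indices is a tuple of original list positions.
    """
    shift = remaining_bits - collision_bits

    # Step 1: sort the current round's list on its leading collision_bits.
    rows = sorted(rows, key=lambda row: row[0] >> shift)

    # Step 2: a single pass over the sorted list. Entries that share the same
    # leading bits sit next to each other, so each group is scanned once.
    out = []
    i = 0
    while i < len(rows):
        j = i + 1
        while j < len(rows) and rows[j][0] >> shift == rows[i][0] >> shift:
            j += 1
        for (va, ia), (vb, ib) in combinations(rows[i:j], 2):
            if set(ia).isdisjoint(ib):
                # The leading bits are equal, so the XOR clears them and the
                # result already fits in `shift` bits.
                merged = ia + ib if ia < ib else ib + ia
                out.append((va ^ vb, merged))
        i = j
    return out
```

Combining the steps would mean producing the next list without a full sort, for example by bucketing on the leading bits, but that is a guess at what the deprioritized issue has in mind.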
There have been some good suggestions here for debuggers and sorting algorithms. From what I can tell, omaralv’s suggestion of bitonic sort is a good pick for parallel sorting on GPU. I have a lot more reading to do on this.
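For anyone else reading up on it, here is a small CPU-side sketch of a bitonic sorting network in Python. It is only meant to show the compare-and-swap pattern; the function name is made up, and it assumes the input length is a power of two (which holds for a list of 131072 = 2^17 entries).

```python
def bitonic_sort(values):
    """Sort a power-of-two-length sequence with a bitonic network (sketch)."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2  # distance between compared elements
        while j > 0:
            # For a fixed (k, j), every iteration of this loop touches a
            # distinct pair, so on a GPU each i could be its own thread.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

The comparison order never depends on the data, which is why this maps well to parallel hardware even though it does more comparisons than quicksort.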
Another important piece of the solver is generating the list before any sorting happens. This requires feeding a personalized blake2b hash (digestLength 64) the nonce and an index, where person is a little-endian binary C struct, struct.pack('<ii', n, k), the nonce ranges between 0 and 49 or so, and the index is computed as i/indices_per_hash_output, where i ranges from 0 to 2^(collision_length + 1). All of this is in the basic_gpb functions in both the C++ and Python implementations from str4d. I have been having some trouble with the node blake2 wrapper, but soon I’ll post a javascript example too so people can study the algorithm in C++, Python, or javascript.
This list L has length N, where N = 2^(n/(k+1)+1). A list of this size yields roughly two solutions on average. In practice N=131072.
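Here is a rough Python sketch of the list generation described above, using the standard library’s `hashlib.blake2b`. Several things in it are assumptions and should be checked against str4d’s code: that the list size is 2^(collision_length + 1) with collision_length = n/(k+1) (which gives the 131072 entries quoted for n=96, k=5), that each entry is an n/8-byte slice of the digest, how the nonce and index are serialized, and the exact personalization bytes. The function name and the `header` argument are made up for the example.

```python
import hashlib
import struct

def generate_list(header, nonce, n=96, k=5, digest_size=64):
    """Build the initial list of (hash_slice, (index,)) rows (sketch only)."""
    collision_length = n // (k + 1)
    list_size = 2 ** (collision_length + 1)          # 131072 for n=96, k=5
    indices_per_hash_output = (digest_size * 8) // n  # slices per digest
    chunk = n // 8                                    # bytes per list entry

    # Assumption: personalization is just the packed (n, k) pair as described
    # in the post; the real zcash personalization may carry extra bytes.
    person = struct.pack('<ii', n, k)
    base = hashlib.blake2b(digest_size=digest_size, person=person)
    base.update(header)
    base.update(struct.pack('<i', nonce))  # assumption: 4-byte LE nonce

    rows = []
    for i in range(list_size):
        h = base.copy()  # reuse the state that already absorbed header+nonce
        h.update(struct.pack('<i', i // indices_per_hash_output))
        digest = h.digest()
        offset = (i % indices_per_hash_output) * chunk
        rows.append((digest[offset:offset + chunk], (i,)))
    return rows
```

Each digest is recomputed for every slice here to keep the code short; a real solver would hash once per index and cut out all the slices in one go.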
However, blake differs from blake2b in many ways. Right now I am looking through John the Ripper, which may have a blake2b implementation. I can see here that they were benchmarking it, but I’m not sure where the code resides yet: john-dev - Re: PHC: Argon2 on GPU
If anyone knows of other blake2b OpenCL kernels we could use, that would be majorly helpful, even more so if they can do the process mentioned above.
Thank you for pointing me to Sia! I had no idea they used blake2b as their hashing algo. Really appreciate any input you give me, tromp. I’m finding this all pretty cool to work on and you are a pro at this.
Correct, sorry, I have been testing with params k=5 n=96, as str4d did when he began creating the equihash miner in zcashd. So N = 2^17 = 131072, but yeah, in practice this list will be much larger. ty!
Thanks for this work. I’ve been meaning to build my own version of this. Given time constraints I think I will dedicate my time to helping you guys. Give me a few days to catch up to your code implementation. Let me know if there are specific tasks you need done.
This anecdote from Taek (from Sia) is interesting, I haven’t seen it elsewhere. It’s about Sia miners finding it profitable to share GPU resources between Eth (memory bound) and Sia (compute bound). BotBot.me + Startup Resources | Startup Resources
today with Sia, the professional miners do a lot better than the home miners, because they have the right GPU, and they actually dual-mine Sia and Ethereum, as Sia is a computation-bottleneck and Eth is a memory bottleneck. Mining both gets you less coins of each, but more value overall
Great news! Although I think we should organize in some way, if not we will end up working on the same thing. Any ideas? A slack or something like that?
Sure, slack or gitter.
I’ve had a quick look over the open sourced code and there seems to be nothing there…
I don’t want to start writing code that is probably already written somewhere. I guess a forum to discuss work would be a good place to start.
I have a lot of test work locally that I didn’t feel comfortable committing since it’s fragmented. We are in a good place to assign tasks out and work together. We can discuss putting together a real roadmap in chat or assign github issues with features toward finishing / improving the miner.
Okay slack was a bad idea, I didn’t realize you couldn’t open email domain signups to public domains like gmail… wtf is that about, slack sucks. Preferred channel of communication will be FREENODE IRC #ZOGMINER
At this time we only have a javascript implementation of str4d’s basic equihash solver implemented in the zcash wallet. We are working on getting this running on GPU over the weekend. If all goes well we will have some software to distribute for testing. Wish us luck!
Today I was able to get zcash equihash solutions in javascript (it solves in ~18 seconds on CPU). I am going to be spending the rest of the night cleaning up and documenting the code. Something that would be nice to have is better documents on zcash’s basic equihash solver. Now people can study the algorithm in their choice of c++, python, or javascript.
Hopefully these can get others thinking about optimizing the process laid out in those repos.
I also had a chat with Omar, who will be looking into a blake2b kernel that matches zcash specs. Over the weekend I plan to try to hack together the other sorting and comparison kernels needed for the basic solver.
If we can do this then we will at least have something to test on cards. This may also give us an idea of how much optimization the other miners have done on the equihash solver. If our results are somewhat close, then we can probably assume others are still using the basic, unoptimized solver and we are good to go! This is optimistic.
As always thanks to anyone who donates to me. It helps me put my contract work aside to work longer on this.