I promise to finish this topic with this post! (part 1, part 2, part 3)
We ended with a rather efficient (but still double exponential) code that gave us hope to extend the sequence a little bit further. So, what can we improve? Constant, a little, by efficient parallelisation.
Each step of our BFS consists of independent processing of a huge number of objects, and occasional writes to a hashtable.
We refactor our code, so it works in a pipeline, reading one layer of bitmasks from stdin
(antimatroids of size ) and writing another to stdout
(of size ), so we have intermediate results stored on disk.
We use new additions to a C++:
#include <mutex> #include <thread>
We need to guard hashtable for race conditions:
const int magic = 2000003; class hashtable { public: vector<vector > table; int size; mutex mtx_; hashtable() { table.resize(magic); size = 0; } void insert(const u128& X) { int h = hash_(X); bool ok = true; mtx_.lock(); for(auto& i : table[h]) if(i==X) { ok = false; break; } if(ok) { table[h].push_back(X); size++; } mtx_.unlock(); } };
We also provide ourselves with an object to read and distribute bitmasks from standard input:
class producer { mutex mtx_; public: int s; bool get(u128& x) { lock_guard guard(mtx_); if(s==0) return false; s--; x.scan(); return true; } };
lock_guard
creates an object that will be automatically destructed at the end of scope of the get()
.
And running consumers in parallel is very simple:
vector threads; for(int i=0;i<thread_cnt;i++) threads.emplace_back(consumer); for (auto& t: threads) t.join();
In the end we have this code.
Running it on CPU able to efficiently 12 threads, we calculated:
64649980092538 12884849614
It took only 2.5 days, and at the critical point, used 16GB of RAM (delights of double-exponentiality!).
So we end with:
all[] = 1, 1, 3, 22, 485, 59386, 133059751, 64649980092538
nonisomorphic[] = 1, 1, 2, 6, 34, 672, 199572, 12884849614
Is it possible to calculate further values? Not without any breakthrough, and doing actually something smart here, instead of just optimizing brute force. Even exponential calculations would be a huge speed-up.
(I would like to thank Michał Bartoszkiewicz for a huge help he provided while writing the code above, and for all the help in performing final computations.)