Enumeration of Antimatroids, part IV

I promise to finish this topic with this post! (part 1, part 2, part 3)

We ended with rather efficient (but still doubly exponential) code that gave us hope to extend the sequence a little bit further. So, what can we improve? The constant factor, a little, by efficient parallelisation.

Each step of our BFS consists of independent processing of a huge number of objects, plus occasional writes to a hashtable.
We refactor the code so that it works as a pipeline stage: it reads one layer of bitmasks from stdin (antimatroids of size k) and writes another to stdout (antimatroids of size k-1), so the intermediate results are stored on disk between layers.
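Before diving into the pieces, here is a rough sketch of how one such pipeline stage could be wired together. The instance names prod and results, the exact layer format (a count followed by that many bitmasks), and the u128::print() method are my guesses, not code from the actual program; the producer and hashtable classes and the thread-spawning loop are the real parts shown below (cstdio assumed included).

producer prod;      // defined below: hands out the input bitmasks
hashtable results;  // defined below: collects the output layer

int main()
{
    scanf("%d", &prod.s);          // the input layer: a count, then the bitmasks
    // spawn the worker threads here, exactly as shown at the end of the post
    printf("%d\n", results.size);  // the output layer, in the same format
    for(auto& bucket : results.table)
        for(auto& X : bucket)
            X.print();
    return 0;
}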

We use the new C++11 additions:

#include <mutex>
#include <thread>

We need to guard the hashtable against race conditions:

const int magic = 2000003;
class hashtable
{
    public:
        vector<vector<u128> > table;  // one bucket per hash value
        int size;
        mutex mtx_;                   // a single lock guards the whole table
        hashtable()
        {
            table.resize(magic);
            size = 0;
        }
        void insert(const u128& X)    // insert X if it is not there yet, under the lock
        {
            int h = hash_(X);
            bool ok = true;
            mtx_.lock();
            for(auto& i : table[h])
                if(i==X)
                {
                    ok = false;
                    break;
                }
            if(ok)
            {
                table[h].push_back(X);
                size++;
            }
            mtx_.unlock();
        }
};
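The hash_ function used above is not shown in the post; a minimal sketch of what it might look like, assuming (purely as a guess) that u128 exposes its value as two 64-bit words lo and hi:

int hash_(const u128& X)
{
    // mix the two halves with an arbitrary odd multiplier and
    // reduce modulo the (prime) table size
    unsigned long long mixed = X.lo ^ (X.hi * 0x9e3779b97f4a7c15ULL);
    return (int)(mixed % magic);
}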

We also provide ourselves with an object to read and distribute bitmasks from standard input:

class producer
{
    mutex mtx_;
    public:
    int s;  // number of bitmasks left to read from stdin
    bool get(u128& x)
    {
        lock_guard<mutex> guard(mtx_);  // hold the lock until we return
        if(s==0)
            return false;
        s--;
        x.scan();
        return true;
    }
};

lock_guard locks the mutex and creates an object that is automatically destructed (releasing the lock) at the end of get()'s scope.
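The same idiom could replace the explicit lock()/unlock() pair in hashtable::insert; a sketch of that equivalent variant:

void insert(const u128& X)
{
    int h = hash_(X);
    lock_guard<mutex> guard(mtx_);  // unlocked automatically on every return path
    for(auto& i : table[h])
        if(i==X)
            return;                 // already present, nothing to do
    table[h].push_back(X);
    size++;
}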

And running consumers in parallel is very simple:

vector<thread> threads;
for(int i=0;i<thread_cnt;i++)
    threads.emplace_back(consumer);
for (auto& t: threads) t.join();  // wait for all workers to finish
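For completeness, the consumer passed to the threads above is just the single-threaded BFS step from the previous parts wrapped in a loop over the producer. A sketch (prod and results are the placeholder instance names from the earlier sketch, and the actual per-bitmask processing is the code from parts 1-3, only hinted at here):

void consumer()
{
    u128 x;
    while(prod.get(x))      // grab the next size-k antimatroid
    {
        // ... generate every antimatroid of size k-1 derivable from x
        //     (the code from the previous parts), and for each such y:
        // results.insert(y);
    }
}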

In the end we have this code.
Running it on a CPU able to efficiently run 12 threads, we calculated:

64649980092538
12884849614

It took only 2.5 days and, at the critical point, used 16 GB of RAM (the delights of double exponentiality!).

So we end with:

all[] = 1, 1, 3, 22, 485, 59386, 133059751, 64649980092538
nonisomorphic[] = 1, 1, 2, 6, 34, 672, 199572, 12884849614

Is it possible to calculate further values? Not without a breakthrough: one would have to actually do something smart here instead of just optimizing brute force. Even a merely exponential algorithm would be a huge speed-up.

(I would like to thank Michał Bartoszkiewicz for the huge help he provided while writing the code above, and for all the help in performing the final computations.)

edit: A119770 was updated, A224913 was added.
