Pepin

Pepin is a DNF streaming counting tool that gives the number of approximate points in a volume given a set of n-dimensional boxes that may intersect. Think of it as a tool that takes a set of cubes in space and approximates the total volume of all cubes. The related research paper, published at IJCAI 2023 is Engineering an Efficient Approximate DNF-Counter.

The tool takes in a set of cubes from a DNF file, such as this, that has 10 dimensions, and 2 cubes:

$ cat myfile.dnf
p dnf 20 2
1 2 3 0
1 -5 0

And outputs a probabilistically approximate count.

You can build&run it with:

sudo apt-get install libgmp-dev zlib1g-dev libboost-program-options-dev cmake build-essential git
git clone https://github.com/meelgroup/pepin
mkdir build
cd build
cmake ..
make

Then you can run it like:

./pepin --epsilon 0.15 --delta 0.1 myfile.dnf
[...]
c [dnfs] Low-precision approx num points: 348672

Notice that the cube 1 2 3 has 2**17=131072 points, and 1 -5 has 2**18=262144, so a total of 393216. But they overlap, 1 2 3 -5 is counted twice. So the exact number is: 327680. Hence, we over-approximated a bit here. The error is 1.0-348672/327680=-0.064, so about 6.4%. This is well below the advertised 15% error allowed (i.e. epsilon 0.15).

Weighted Counting

Pepin is a streaming, weighted approximate model. Because it works with data streams, you must declare the weights of all literals before you start the stream. Hence, the weights need to be declared at the top of the DNF file. Here is an example:

$ cat myfile-weighted.dnf
p dnf 20 2
w 1 2/3
w 2 3/4
1 2 3 0
1 -5 0

The weights of variables 1 and 2 are now declared to be 2/3 and 3/4, respectively. The rest of the variables will have weight 1/2 by default.

Current Limitations

Currently, the algorithm can only return estimate after the exact number of clauses have been passed in as promised. This means you must have the p dnf VARS CLS header correct in your DNF file or the tool will not work.

Library Use

You can install the library with sudo make install. Then, the installed header file pepin/pepin.h can be used as a library:

#include <pepin/pepin.h>
#include <iostream>
#include <iomanip>
#include <vector>
using namespace PepinNS;

int main() {
  Pepin pepin;
  pepin.new_vars(20);
  pepin.set_n_cls(2);

  typedef std::vector<Lit> CL;
  pepin.add_clause(CL{itol(1), itol(2), itol(3)});
  pepin.add_clause(CL{itol(1), itol(-5)});
  auto weigh_nsols = pepin.get_low_prec_appx_weighted_sol();
  std::cout << "Solution: " << std::scientific << *weigh_nsols << "std::endl;

  return 0;
}

The library is clean -- it cleans up after itself, you can have more than one in memory, you can even run them in parallel, if you wish.

Fuzzing

You can fuzz by building DNFKLM from here, putting the resulting DNFKLM binary into build/, and then:

cd build
ln -s ../scripts/* .
./build_norm.sh
mkdir -f tests
./fuzz_test.py

You can check fuzz_test.py to adjust etc.

License

MIT license all the way through

Name		Name	Last commit message	Last commit date
Latest commit History 58 Commits
.github/workflows		.github/workflows
cmake		cmake
scripts		scripts
src		src
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
LICENSE.txt		LICENSE.txt
README.md		README.md
pepinConfig.cmake.in		pepinConfig.cmake.in

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github/workflows

.github/workflows

cmake

cmake

scripts

scripts

src

src

.gitignore

.gitignore

CMakeLists.txt

CMakeLists.txt

LICENSE.txt

LICENSE.txt

README.md

README.md

pepinConfig.cmake.in

pepinConfig.cmake.in

Repository files navigation

Pepin

Weighted Counting

Current Limitations

Library Use

Fuzzing

License

About

Releases 3

Contributors 2

Languages

License

meelgroup/pepin

Folders and files

Latest commit

History

Repository files navigation

Pepin

Weighted Counting

Current Limitations

Library Use

Fuzzing

License

About

Topics

Resources

License

Stars

Watchers

Forks

Languages