Throughput and compression ratio of the high-level API

Hello, I am testing the high-level API on a V100 GPU (Summit) with a very simple benchmark. The input data consists of uniform random numbers between 0 and 1. I have a few questions, and it would be very helpful if you could shed some light on them.

I get ~0.1 GB/s for both compression and decompression throughput. I am not sure what a typical throughput for MGARD would be, but does this seem low?
The API takes a host or managed pointer. I guess the host-device copies (assuming compression and decompression happen on the GPU) might lower the throughput. Is there a way to pass a device pointer directly and do all the work on the GPU?
With the ABS error bound, if I set the tolerance below 1.0e-4, the data is inflated rather than compressed, i.e., the compression ratio is < 1.0. What causes this? Is there a lower bound on the tolerance?
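
As a back-of-envelope reference for the compression-ratio question, here is a minimal sketch of the best ratio one could hope for on this kind of input. It assumes a plain uniform scalar quantizer with bin width 2 * tol applied to i.i.d. uniform values in (0, 1); it does not model MGARD's multilevel transform or any per-stream metadata, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Entropy, in bits per value, left after uniformly quantizing i.i.d. samples
// from U(0, 1) with bin width 2 * tol. All bins are equally likely under this
// model, so the entropy is log2 of the number of bins.
inline double quantized_uniform_entropy_bits(double tol) {
  assert(tol > 0.0 && tol < 0.5);
  return std::log2(1.0 / (2.0 * tol));
}

// Upper bound on the ratio a lossless coder can reach on that quantized
// stream, relative to storing each value raw in value_bytes bytes.
inline double best_case_ratio(double tol, std::size_t value_bytes) {
  return 8.0 * static_cast<double>(value_bytes) /
         quantized_uniform_entropy_bits(tol);
}
```

Under these assumptions, doubles with tol = 1.0e-6 give about 18.9 bits per value, so the ratio cannot exceed roughly 3.4 even before any container overhead. Random input is therefore close to a worst case for any error-bounded compressor.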

Below is the test I am using. Thank you so much!

#include <algorithm>  // std::max
#include <cmath>      // fabs
#include <iostream>
#include <limits>
#include <random>
#include <vector>
#include "mgard/compress_x.hpp"

const double eps = std::numeric_limits<double>::epsilon();

int main()
{
  mgard_x::SIZE ni = 128;
  mgard_x::SIZE nj = 128;
  mgard_x::SIZE nk = 16;
  mgard_x::SIZE nCell = ni * nj * nk;
  std::vector<mgard_x::SIZE> shape({ni, nj, nk});

  std::random_device rd;
  std::default_random_engine eng(rd());
  std::uniform_real_distribution<double> gen(0.0, 1.0);

  double *arr_h = new double [nCell];
  for (mgard_x::SIZE i = 0; i < nCell; ++i) arr_h[i] = gen(eng);

  mgard_x::Config config;
  config.dev_type = mgard_x::device_type::CUDA;
  config.lossless = mgard_x::lossless_type::Huffman;
  config.uniform_coord_mode = 1;
  config.timing = true;

  void*  compArr = nullptr;
  size_t compSz;
  mgard_x::compress(3, mgard_x::data_type::Double, shape, 1.0e-6, 0.0,
                    mgard_x::error_bound_type::ABS, arr_h, compArr,
                    compSz, config, false);

  double ratio = (double)(nCell*sizeof(double)) / compSz;
  std::cout << "ratio " << ratio << "\n";

  void* decompArr = nullptr;
  mgard_x::decompress(compArr, compSz, decompArr, config, false);

  double  maxabs = 0.0, avgabs = 0.0;
  double  maxrel = 0.0, avgrel = 0.0;
  //double* output = decompArr;
  for (mgard_x::SIZE i = 0; i < nCell; ++i) {
    double err = std::fabs(arr_h[i] - ((double*)decompArr)[i]);
    maxabs  = std::max(err, maxabs);
    avgabs += err;
    // eps guards against division by zero when the original value is 0
    maxrel  = std::max(err / (std::fabs(arr_h[i]) + eps), maxrel);
    avgrel += err / (std::fabs(arr_h[i]) + eps);
  }
  avgabs /= nCell;
  avgrel /= nCell;
  std::cout << "max abs err " << maxabs << " avg abs err " << avgabs << "\n";
  std::cout << "max rel err " << maxrel << " avg rel err " << avgrel << "\n";

  delete [] arr_h;
  return 0;
}
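
For the throughput question, a minimal sketch of one way such a figure could be measured independently of the library's own timing output. It assumes wall-clock timing with std::chrono and one untimed warm-up call; the helper names are hypothetical and nothing here is part of the MGARD API.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Convert a byte count and an elapsed time in seconds to GB/s (1 GB = 1e9 bytes).
inline double throughput_gbps(std::size_t bytes, double seconds) {
  assert(seconds > 0.0);
  return static_cast<double>(bytes) / seconds / 1.0e9;
}

// Call fn once untimed as a warm-up, then return the mean wall-clock time
// in seconds over `repeats` timed calls.
template <typename Fn>
double mean_seconds(Fn&& fn, int repeats) {
  assert(repeats > 0);
  fn();  // warm-up: absorbs one-time costs such as device initialization
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < repeats; ++r) fn();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count() / repeats;
}
```

The call under test would be wrapped in a lambda and passed to mean_seconds. Note that the input in the test above is only 128 * 128 * 16 * 8 bytes = 2 MiB, so 0.1 GB/s corresponds to about 21 ms per call, and fixed per-call overheads may account for much of that.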
