
TPCC from "remote" node causes less than expected data to be loaded #47

Open

tylarb opened this issue Sep 3, 2020 · 4 comments
tylarb commented Sep 3, 2020

When I run a TPCC load from a node that is undersized and in a different zone than the cluster, less data than expected gets loaded. Given 100 warehouses, I expect orders = 30k × number of warehouses, new_orders = 9k × number of warehouses, and order_lines ≈ 300k × number of warehouses.

These numbers, given random inserts, will have some variability, in the realm of 1-2%, decreasing as load/threads go up.
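As a sanity check (not part of the original report), the expected initial-load cardinalities above can be computed from the warehouse count. The helper name and the ~10 order lines per order average are my assumptions, based on the standard TPC-C population rules:

```python
# Sketch of expected TPC-C initial-load row counts, assuming the standard
# per-warehouse population: 30,000 orders, 9,000 new_orders, and
# ~300,000 order_lines (on average ~10 lines per order).
def expected_counts(warehouses: int) -> dict:
    return {
        "orders": 30_000 * warehouses,
        "new_orders": 9_000 * warehouses,
        "order_lines": 300_000 * warehouses,  # approximate; line count per order is random
    }

print(expected_counts(100))
# orders and new_orders match the healthy run below: 3,000,000 and 900,000
```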

Data representing this will be uploaded shortly.

Here is the cluster setup and tpcc load module:
cluster:

US-West
3 nodes
i3.8xlarge: 32 vCPU, ~240 GB RAM

Node which tpcc is running from:

US-East
c5.xlarge: 4 vCPU, 8 GB RAM

Issue happens with small numbers of threads.
Tested and failing with 4, 8, and 48 threads.

100 warehouses

Command for example:

time ./tpccbenchmark --create=true --load=true --nodes=[node1, node2, node3]  --warehouses=100   --loaderthreads 8

The tpcc benchmark reports success, but with the above shortfalls in row counts.


tylarb commented Sep 3, 2020

Here is a healthy run, with counts as expected:

select count(*) as warehouses from warehouse;
 warehouses
------------
        100
(1 row)
select count ( distinct  d_w_id ) as warehouses, count(*) as districts from district;
 warehouses | districts
------------+-----------
        100 |      1000
(1 row)
select count ( distinct  c_w_id ) as warehouses, count(*) as customers from customer;
 warehouses | customers
------------+-----------
        100 |   3000000
(1 row)
select count ( distinct  s_w_id ) as warehouses, count(*) as stocks from stock;
 warehouses | stocks
------------+---------
        100 | 4920832
(1 row)
select count ( distinct  h_w_id ) as warehouses, count(*) as history from history;
 warehouses | history
------------+---------
        100 | 3000000
(1 row)
select count ( distinct  o_w_id ) as warehouses, count(*) as orders from oorder;
 warehouses | orders
------------+---------
        100 | 3000000
(1 row)
select count ( distinct  no_w_id ) as warehouses, count(*) as new_orders from new_order;
 warehouses | new_orders
------------+------------
        100 |     900000
(1 row)


tylarb commented Sep 3, 2020

Here's an unhealthy run, from the node in a different region:

{$some different smaller node}$ time ./tpccbenchmark --create=true --load=true --nodes=172.151.59.4,172.151.49.178,172.151.55.80 --warehouses=100 --loaderthreads 48

yugabyte=# select count(*) as warehouses from warehouse;
 warehouses 
------------
        100
(1 row)

yugabyte=# select count ( distinct  d_w_id ) as warehouses, count(*) as districts from district;
 warehouses | districts 
------------+-----------
        100 |      1000
(1 row)

yugabyte=# select count ( distinct  c_w_id ) as warehouses, count(*) as customers from customer;

 warehouses | customers 
------------+-----------
        100 |   3000000
(1 row)

yugabyte=# 
yugabyte=# select count ( distinct  s_w_id ) as warehouses, count(*) as stocks from stock;

 warehouses | stocks  
------------+---------
        100 | 7281152
(1 row)

yugabyte=# select count ( distinct  h_w_id ) as warehouses, count(*) as history from history;

 warehouses | history 
------------+---------
        100 | 3000000
(1 row)


yugabyte=# 
yugabyte=# select count ( distinct  o_w_id ) as warehouses, count(*) as orders from oorder;

 warehouses | orders  
------------+---------
        100 | 2978220
(1 row)

yugabyte=# 
yugabyte=# select count ( distinct  no_w_id ) as warehouses, count(*) as new_orders from new_order;

 warehouses | new_orders 
------------+------------
        100 |     892920
(1 row)


tylarb commented Sep 3, 2020

Here's the same data, this time from a tpcc run with 8 threads (other values the same). The performance was abysmal: over 2 hours to completion, compared to about 17 minutes running locally, and plenty of errors, so I expect the two issues to be related.

yugabyte=# select count(*) as warehouses from warehouse;
 warehouses 
------------
        100
(1 row)
yugabyte=# select count ( distinct  c_w_id ) as warehouses, count(*) as customers from customer;
 warehouses | customers 
------------+-----------
        100 |   3000000
(1 row)

yugabyte=# select count ( distinct  s_w_id ) as warehouses, count(*) as stocks from stock;
 warehouses |  stocks  
------------+----------
        100 | 10000000
(1 row)

yugabyte=#  select count ( distinct  h_w_id ) as warehouses, count(*) as history from history;
 warehouses | history 
------------+---------
        100 | 3000000
(1 row)

yugabyte=# select count ( distinct  o_w_id ) as warehouses, count(*) as orders from oorder;
 warehouses | orders  
------------+---------
        100 | 2860457
(1 row)

yugabyte=# select count ( distinct  no_w_id ) as warehouses, count(*) as new_orders from new_order;
 warehouses | new_orders 
------------+------------
        100 |     856106
(1 row)

This time, orders and new_orders come in well under the local run's counts, about 5% below.
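For reference, the shortfall works out to roughly 5% for both tables (a quick check using the numbers from the query output above; the variable names are mine):

```python
# Shortfall of the remote 8-thread load vs. the expected counts for
# 100 warehouses, using the figures from the query output above.
expected_orders, loaded_orders = 3_000_000, 2_860_457
expected_new, loaded_new = 900_000, 856_106

orders_short = 1 - loaded_orders / expected_orders   # ~4.7% missing
new_short = 1 - loaded_new / expected_new            # ~4.9% missing
print(f"orders short by {orders_short:.1%}, new_orders short by {new_short:.1%}")
```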


tylarb commented Sep 4, 2020

CC issue #46
