predict function doesn't return the correct predictions with num_actors >1 #231

faaany · 2022-08-24T09:21:26Z

Hi, when I train XGBoost with the following code snippet, the predict function returns different results depending on the number of actors I set. In my case, I have to set the number of actors to 1 in the predict function to get the correct predictions.

    import ray
    from xgboost_ray import RayDMatrix, RayFileType, RayParams, predict, train

    # train_path, valid_path, name, feature_list, numlabel, xgb_parms,
    # model_save_path and oof are defined elsewhere in my script.
    ray.init()
    cpus_per_actor = 15
    num_actors = 10
    ray_params = RayParams(
        num_actors=num_actors,
        cpus_per_actor=cpus_per_actor,
        elastic_training=True,
        max_failed_actors=1,
        max_actor_restarts=1)

    dtrain = RayDMatrix(
        train_path,
        label=name,
        columns=feature_list[numlabel],
        filetype=RayFileType.PARQUET)
    dvalid = RayDMatrix(
        valid_path,
        label=name,
        columns=feature_list[numlabel],
        filetype=RayFileType.PARQUET)

    print("Training.....")
    model = train(
        xgb_parms,
        dtrain,
        evals=[(dtrain, 'train'), (dvalid, 'valid')],
        num_boost_round=250,
        early_stopping_rounds=25,
        verbose_eval=25,
        ray_params=ray_params)

    model.save_model(f"{model_save_path}/xgboost_{name}_stage1.model")

    print('Predicting...')
    # Fresh RayDMatrix over the validation data, used only for prediction.
    dvalid = RayDMatrix(
        valid_path,
        label=name,
        columns=feature_list[numlabel],
        filetype=RayFileType.PARQUET)

    oof[:, numlabel] = predict(
        model, dvalid,
        ray_params=RayParams(num_actors=num_actors, cpus_per_actor=1))

The returned predictions for num_actors=1:

[0.00197015 0.00656855 0.00210109 ... 0.00132486 0.00912175 0.03348438]

The returned predictions for num_actors=10:

[0.00253869 0.02829305 0.0060115 ... 0.00152305 0.01026866 0.03538961]

Is this a bug or am I setting the number of actors wrong? Thanks for your review!


Yard1 · 2022-08-26T18:29:22Z

I tried running the following code locally:

from sklearn import datasets
from sklearn.model_selection import train_test_split

import numpy as np

from xgboost_ray import RayDMatrix, RayParams
from xgboost import XGBClassifier

from xgboost_ray.main import predict


# Load dataset
data, labels = datasets.load_breast_cancer(return_X_y=True)
# Split into train and test set
train_x, test_x, train_y, test_y = train_test_split(
    data, labels, test_size=0.25)

xgb = XGBClassifier()
xgb.fit(train_x, train_y)
pred = xgb.predict_proba(test_x)[:, 1]
print(pred)

pred_1 = predict(xgb.get_booster(), RayDMatrix(test_x), ray_params=RayParams(num_actors=1))
print(pred_1)

pred_8 = predict(xgb.get_booster(), RayDMatrix(test_x), ray_params=RayParams(num_actors=8))
print(pred_8)

assert np.allclose(pred, pred_1)
assert np.allclose(pred, pred_8)

and got the same results. I will try in a distributed setting, and with the HIGGS dataset.
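The np.allclose checks above compare predictions position by position. When two runs disagree, a complementary check is whether they contain the same values in a different order, which would point at row ordering rather than at the model. The following is a minimal sketch of such a check; same_up_to_order is a hypothetical helper written for this illustration, not part of xgboost_ray, and the sample values are made up.

```python
import math


def same_up_to_order(a, b, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if a and b hold the same values, ignoring their order."""
    if len(a) != len(b):
        return False
    # Sort both sequences, then compare them pairwise with a tolerance.
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(sorted(a), sorted(b))
    )


reference = [0.10, 0.80, 0.35, 0.62]
shuffled = [0.35, 0.10, 0.62, 0.80]   # same values, different row order
different = [0.10, 0.80, 0.35, 0.99]  # one value actually changed

print(same_up_to_order(reference, shuffled))   # True
print(same_up_to_order(reference, different))  # False
```

The helper accepts any sequence of floats, so it should also work on 1-D NumPy arrays such as pred_1 and pred_8. A True result only shows that the values match as a multiset; it does not show which rows were moved.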

Yard1 · 2022-08-26T20:44:38Z

I can reproduce this

Yard1 · 2022-08-26T20:56:05Z

As a workaround, you can either use the new Ray AIR API, or set sharding=RayShardingMode.BATCH in the RayDMatrix used for prediction.
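This thread does not state the root cause, so the following is only a toy model of why the sharding mode could matter for the order of predictions. It assumes that an interleaved mode hands rows to actors round-robin, that a batch mode hands each actor a contiguous block, and that per-actor results are concatenated in actor order. The functions shard_interleaved, shard_batch and naive_concat are hypothetical names for this sketch, not xgboost_ray's implementation.

```python
def shard_interleaved(rows, num_actors):
    """Round-robin split: actor i gets rows i, i + num_actors, ..."""
    return [rows[i::num_actors] for i in range(num_actors)]


def shard_batch(rows, num_actors):
    """Contiguous split: each actor gets one consecutive block of rows."""
    base, extra = divmod(len(rows), num_actors)
    shards, start = [], 0
    for i in range(num_actors):
        # The first `extra` actors take one additional row.
        size = base + (1 if i < extra else 0)
        shards.append(rows[start:start + size])
        start += size
    return shards


def naive_concat(shards):
    """Join per-actor results in actor order, without any reordering."""
    return [row for shard in shards for row in shard]


rows = list(range(10))  # stand-in for row indices of the prediction data

print(naive_concat(shard_interleaved(rows, 3)))  # [0, 3, 6, 9, 1, 4, 7, 2, 5, 8]
print(naive_concat(shard_batch(rows, 3)))        # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(naive_concat(shard_interleaved(rows, 1)))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

In this toy model, contiguous blocks keep the original row order under plain concatenation, and a single actor keeps it trivially, which is consistent with num_actors=1 giving the expected output. Whether this is what actually happens inside predict is an assumption, not something confirmed here.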

faaany · 2022-08-29T06:09:42Z

It works after adding sharding=RayShardingMode.BATCH to the prediction RayDMatrix. Closing this issue.

faaany · 2022-08-29T06:09:57Z

thanks!

Yard1 · 2022-08-29T12:21:56Z

Let's keep this open as this is still a bug :)

Yard1 self-assigned this Aug 26, 2022

faaany closed this as completed Aug 29, 2022

Yard1 reopened this Aug 29, 2022
