
new_loss_class - ValueError: Invalid Reduction Key when calling loss as a function #1342

Open
krzjoa opened this issue Jul 16, 2022 · 2 comments


krzjoa commented Jul 16, 2022

I'm trying to define a custom loss class. Normally, a loss from keras can be called in two ways:

  • directly as a function, e.g. keras::loss_mean_absolute_error(c(1,2,3), c(3,4,5))
  • as a callable object, for instance keras::loss_mean_absolute_error(reduction = 'sum')(c(1,2,3), c(3,4,5))

When I define a custom loss class, the first way doesn't work, and I don't know why. I've tried skipping the initialize function definition (example 1), but that hasn't solved the problem. I'm using the development version of keras (2.9.0.9000).

Below I attach an example. In the first case it even raises an error when calling the loss as an object, but that's not the main problem: I used a custom quantiles parameter, so that error can easily be avoided by adding an initialize method. Unfortunately, I cannot get past ValueError: Invalid Reduction Key when calling the loss as a function.

library(keras)
library(tensorflow)

# =============================================================================
#                                   DUMMY DATA
# =============================================================================

y_pred <- array(runif(60), c(2, 10, 3))
y_true <- array(runif(20), c(2, 10, 1))

# =============================================================================
#                         INITIALIZE METHOD NOT DEFINED
# =============================================================================

loss_quantile_1 <- keras::new_loss_class(
  
  classname = "QuantileLoss",

  call = function(y_true, y_pred, quantiles){
    
    if (missing(quantiles))
      quantiles <- self$quantiles
    else
      quantiles <- self$.validate_quantiles(quantiles)
    
    quantiles <- array(quantiles, c(1, 1, length(quantiles)))
    quantiles <- keras_array(quantiles)
    
    errors <- tf$subtract(y_pred, y_true)
    errors <- k_cast(errors, 'float32')
    
    loss   <- tf$maximum(tf$subtract(quantiles, 1) * errors,
                         quantiles * errors)
    loss
  },
  
  .validate_quantiles = function(quantiles){
    if (any(quantiles > 1) || any(quantiles < 0)) {
      stop("It contains quantiles out of the [0, 1] range!")
    }
    quantiles
  }
  
)
#> Loaded Tensorflow version 2.8.0

# As a callable object
loss_quantile_1(quantiles = c(0.1, 0.5, 0.9), reduction = 'auto')(y_true, y_pred)
#> Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: __init__() got an unexpected keyword argument 'quantiles'
loss_quantile_1(quantiles = c(0.1, 0.5, 0.9), reduction = 'sum')(y_true, y_pred)
#> Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: __init__() got an unexpected keyword argument 'quantiles'

# As a function
loss_quantile_1(y_true, y_pred)
#> Error in py_call_impl(callable, dots$args, dots$keywords): ValueError: Invalid Reduction Key: [[[0.10292229]
#>   [0.49193897]
#>   [0.80630794]
#>   [0.99013599]
#>   [0.33000854]
#>   [0.83075855]
#>   [0.24299352]
#>   [0.26481244]
#>   [0.85807844]
#>   [0.99448096]]
#> 
#>  [[0.45570282]
#>   [0.34256617]
#>   [0.6961025 ]
#>   [0.10391989]
#>   [0.34779687]
#>   [0.87304018]
#>   [0.10824608]
#>   [0.21774361]
#>   [0.30273686]
#>   [0.02866354]]]. Expected keys are "('auto', 'none', 'sum', 'sum_over_batch_size')"

# =============================================================================
#                           INITIALIZE METHOD DEFINED
# =============================================================================


loss_quantile_2 <- keras::new_loss_class(
  
  classname = "QuantileLoss",

  initialize = function(quantiles = NULL, ...){
    super()$`__init__`(...)
    self$quantiles <- self$.validate_quantiles(quantiles)
  },
  
  call = function(y_true, y_pred, quantiles){
    
    if (missing(quantiles))
      quantiles <- self$quantiles
    else
      quantiles <- self$.validate_quantiles(quantiles)
    
    quantiles <- array(quantiles, c(1, 1, length(quantiles)))
    quantiles <- keras_array(quantiles)
    
    errors <- tf$subtract(y_pred, y_true)
    errors <- k_cast(errors, 'float32')
    
    loss   <- tf$maximum(tf$subtract(quantiles, 1) * errors,
                         quantiles * errors)
    loss
  },
  
  .validate_quantiles = function(quantiles){
    if (any(quantiles > 1) || any(quantiles < 0)) {
      stop("It contains quantiles out of the [0, 1] range!")
    }
    quantiles
  }
  
)

# As a callable object
loss_quantile_2(quantiles = c(0.1, 0.5, 0.9), reduction = 'auto')(y_true, y_pred)
#> tf.Tensor(0.16841231, shape=(), dtype=float32)
loss_quantile_2(quantiles = c(0.1, 0.5, 0.9), reduction = 'sum')(y_true, y_pred)
#> tf.Tensor(10.104739, shape=(), dtype=float32)

# As a function
loss_quantile_2(y_true, y_pred)
#> Error in py_call_impl(callable, dots$args, dots$keywords): RuntimeError: Evaluation error: ValueError: Invalid Reduction Key: [[[0.08136004 0.18284503 0.15212684]
#>   [0.23967848 0.77847213 0.82365409]
#>   [0.02822829 0.0940565  0.02737745]
#>   [0.56501074 0.65513322 0.0235039 ]
#>   [0.04813642 0.4485324  0.6318595 ]
#>   [0.09528749 0.32116325 0.49146111]
#>   [0.43723217 0.87765376 0.33272954]
#>   [0.23735788 0.16774845 0.13686183]
#>   [0.31339923 0.16587593 0.7988089 ]
#>   [0.03746263 0.84035165 0.06726819]]
#> 
#>  [[0.61504559 0.78687068 0.07096574]
#>   [0.59740435 0.02917441 0.44003368]
#>   [0.04047055 0.92696131 0.94110391]
#>   [0.4742021  0.41549199 0.09645652]
#>   [0.70882213 0.39135755 0.53949784]
#>   [0.61301041 0.60270889 0.74573307]
#>   [0.70965792 0.27351185 0.10086689]
#>   [0.28157228 0.54496708 0.12288983]
#>   [0.53166794 0.59471474 0.29636166]
#>   [0.71853539 0.23627646 0.86413276]]]. Expected keys are "('auto', 'none', 'sum', 'sum_over_batch_size')"
#> .

Created on 2022-07-16 by the reprex package (v2.0.1)


t-kalinowski commented Aug 1, 2022

Thank you very much for the thoughtful issue! There is indeed a difference between how the built-in loss functions in the package behave and how the objects returned by new_loss_class() behave. I spent some time thinking about how to reconcile the difference and support both use modes of the loss_* builtins.

The difficulty is that the call() method of a new Loss subclass will need to be different from the stateless pure function expressing the same computation. Two big differences are:

  • The class method has a strict signature requirement (only (y_true, y_pred)), while the function has no such requirement.
  • The pure function would not have self, private, ClassName, or other attributes and methods that would typically be resolved as part of self.

Attempting to patch a supplied call method to reconcile these differences is difficult and would be hard to maintain. Creating a new loss instance on each invocation of the user-created loss object as a stateless function also seems like a poor choice, because it invites users into a slow and unsustainable workflow (especially if the loss object creates new variables... it would be antithetical to the "pit of success" principle).

I see two possible interfaces to enable this: we could ask users to supply a separate pure-function method, maybe like this:

new_loss_class(..., 
  call = function(y_true, y_pred) { self$ ... },
  call_fn = function(y_true, y_pred, extra_arg1, extra_arg2) { ... }
)

But this has its downsides too (I think having both call and call_fn would quickly become confusing).

Perhaps the best approach is to create a new function in keras, as_loss_class(), that takes a pure function and returns a callable loss class:

as_loss_class(function(y_true, y_pred, extra_arg, ...) { ... }, name = "loss_name")

This would be analogous to the LossFunctionWrapper used internally in Python Keras, and also officially exported through tensorflow_addons.
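For illustration, here is a rough sketch of what such a wrapper might look like in R. Note that as_loss_class() does not exist yet, and the internals here (fn_args, the reduction handling) are hypothetical, not an existing API:

```r
# Hypothetical sketch only: as_loss_class() is a proposed function, not part
# of the keras R package. Mirroring Python Keras' LossFunctionWrapper, it
# wraps a pure function in a Loss subclass, storing extra arguments on the
# instance so that call() can stay restricted to (y_true, y_pred).
as_loss_class <- function(fn, name = "custom_loss") {
  keras::new_loss_class(
    classname = name,
    initialize = function(..., reduction = "auto") {
      super()$`__init__`(reduction = reduction)
      self$fn_args <- list(...)  # extra args captured at construction time
    },
    call = function(y_true, y_pred) {
      do.call(fn, c(list(y_true, y_pred), self$fn_args))
    }
  )
}

# Usage (hypothetical):
# quantile_loss <- as_loss_class(
#   function(y_true, y_pred, quantiles) { ... },
#   name = "QuantileLoss"
# )
# quantile_loss(quantiles = c(0.1, 0.5, 0.9))(y_true, y_pred)
```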

This also has its downsides, because it grows the already large namespace, and having both new_loss_class() and as_loss_class() might confuse new users too.

Note that today, you can pass a custom R function to compile(loss = ) directly, without creating a loss instance. The additional complexity of defining custom loss instances via new_loss_class() only becomes worthwhile if you need to keep track of state between invocations (e.g., you have a custom training loop), or if you are working in a distributed context and want the framework to take care of the reduction op; both are advanced use cases where the stateless pure-function form wouldn't make sense. A third option is to leave things as they are and improve the documentation to make this distinction clearer.
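As a minimal sketch of that first option (the model shape and the fixed quantile here are invented for illustration):

```r
library(keras)
library(tensorflow)

# A plain R function passed straight to compile(loss = ); Keras wraps it
# internally, so no Loss subclass is needed for this common case.
pinball_loss <- function(y_true, y_pred) {
  q <- 0.5  # a fixed quantile; other parameters can be baked in via a closure
  e <- y_pred - y_true
  tf$reduce_mean(tf$maximum(q * e, (q - 1) * e))
}

model <- keras_model_sequential() %>%
  layer_dense(units = 8, activation = "relu", input_shape = 3) %>%
  layer_dense(units = 1)

model %>% compile(optimizer = "adam", loss = pinball_loss)
```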

A similar situation exists with the metrics: for backwards compatibility we preserved the stateless pure-function mode for the metrics that started out that way, but the more recently added metrics only return a Metric instance and cannot be used as pure functions directly.

I'm eager to hear about your use-case. Is this because you want to export a custom loss function from the keras.tft package?


krzjoa commented Aug 3, 2022

Thank you for the clarification 😃

Yes, I'd like to use it in keras.tft. After I created the loss functions for this project, I also prepared a couple of tests to check whether they work correctly, and the result was the discovery of the aforementioned discrepancies between the losses delivered by the keras package and the custom ones based on new_loss_class.

I think I'll stick with new_loss_class and simply add a relevant explanation in the docs about the differences in the loss objects' behaviour.
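One pattern that may help in the meantime (a sketch, not from the thread above): keep new_loss_class() for the stateful, framework-managed form, and separately expose a closure-based factory for the pure-function form, so both call styles remain available to users:

```r
library(tensorflow)

# A closure returning a stateless loss function; the quantiles are validated
# and reshaped once at construction, then captured by the returned function.
make_quantile_loss <- function(quantiles = c(0.1, 0.5, 0.9)) {
  stopifnot(all(quantiles >= 0 & quantiles <= 1))
  q <- tf$constant(array(quantiles, c(1, 1, length(quantiles))),
                   dtype = "float32")
  function(y_true, y_pred) {
    errors <- tf$cast(y_pred - y_true, "float32")
    tf$maximum((q - 1) * errors, q * errors)
  }
}

# make_quantile_loss()(y_true, y_pred)  # pure-function form, no Loss instance
```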
