We can see that __add__ was called once and received an xr.DataArray object, but __radd__ was called 8 times and received ints. This causes two problems:
Performance issues on large xr.DataArray objects
No access to the xr.DataArray coords, which are needed in a more realistic use case
Describe the solution you'd like
I would like a mechanism so that DemoObj.__radd__ is called only once and receives the xr.DataArray instance in the above example.
Describe alternatives you've considered
Option 1:
The most naive workaround is to call obj.__radd__(da) directly to achieve da + obj, which defeats the purpose of implementing the reflexive operator and hurts readability.
Option 2:
As xr.DataArray._binary_op relies on numpy's operator resolution mechanism under the hood, I could improve the situation by setting __array_ufunc__ = None on my class, e.g.:
This makes __radd__ get called once with an np.ndarray instead of 8 times with ints. This addresses the potential performance concern; however, it still doesn't cover the case where xr.DataArray.coords is needed.
I'm happy with a similar property if you prefer to make it xarray-specific. I'm also happy to make the PR once you confirm the mechanism / property name you prefer.
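The effect described under Option 2 can be reproduced with plain NumPy. The following is only a minimal sketch, not the snippet from the original post: it uses a bare np.ndarray in place of xr.DataArray, and the class names PlainObj and OptOutObj are hypothetical, made up for illustration.

```python
import numpy as np


class PlainObj:
    """Hypothetical class whose reflected add records what it receives."""

    def __init__(self):
        self.received = []

    def __radd__(self, other):
        # Store only the type name of the left-hand operand.
        self.received.append(type(other).__name__)
        return 0  # placeholder result, irrelevant for the demonstration


class OptOutObj(PlainObj):
    # Opt out of NumPy's ufunc/operator handling: ndarray.__add__ then
    # returns NotImplemented and Python falls back to __radd__.
    __array_ufunc__ = None


arr = np.arange(8)

plain = PlainObj()
arr + plain  # NumPy treats `plain` as an object scalar and adds element by element
print(len(plain.received))

opt_out = OptOutObj()
arr + opt_out  # NumPy defers, so __radd__ sees the whole array in one call
print(opt_out.received)
```

With this setup, plain.received ends up with one entry per element, while opt_out.received holds a single entry for the whole np.ndarray. The array passed to __radd__ carries no coords, which is the remaining gap noted above.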
Many thanks in advance!
Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!
Indeed, currently Xarray very aggressively attempts to take control of all binary arithmetic operations (by applying them to the wrapped .data of the xarray object). I agree that this is definitely not ideal.
Xarray should only attempt to do this for objects with an API that works like multi-dimensional arrays. I see at least two ways to determine this:
As you suggest, we could use __array_ufunc__ = None like NumPy to indicate that an object explicitly does not have an API like NumPy arrays.
Alternatively, we could return NotImplemented except for types that explicitly indicate that they do work like NumPy arrays, which in principle should be the same set of types that are valid when wrapped inside xarray objects, because they implement one of two generations of NumPy compatibility APIs (__array_ufunc__/__array_function__ or __array_namespace__). Here is where the current code to check for compatibility with these objects lives:
My inclination would be to try the second solution first (I think it's a little cleaner / more comprehensive) but if that doesn't work I would be OK to fall back to the first one.
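To make the first approach concrete, here is a toy sketch in pure Python. ToyArray and CoordAwareObj are hypothetical stand-ins invented for this example, and the check below is not xarray's actual _binary_op. It only illustrates the idea of returning NotImplemented when the other operand sets __array_ufunc__ = None, so that Python hands the whole wrapper to the reflected method.

```python
import operator

_MISSING = object()  # sentinel: the attribute is absent (distinct from None)


class ToyArray:
    """Toy array wrapper with data and coords; NOT xarray's implementation."""

    def __init__(self, data, coords):
        self.data = list(data)
        self.coords = coords

    def _binary_op(self, other, op):
        # Assumed opt-out rule: defer to operands whose type sets
        # __array_ufunc__ = None instead of applying op to the wrapped data.
        if getattr(type(other), "__array_ufunc__", _MISSING) is None:
            return NotImplemented
        return ToyArray([op(x, other) for x in self.data], self.coords)

    def __add__(self, other):
        return self._binary_op(other, operator.add)


class CoordAwareObj:
    """Hypothetical user class that needs the wrapper's coords."""

    __array_ufunc__ = None

    def __init__(self):
        self.received = []

    def __radd__(self, other):
        self.received.append(other)
        return sorted(other.coords)  # coords are reachable here


arr = ToyArray(range(8), {"x": list(range(8))})
obj = CoordAwareObj()

result = arr + obj       # ToyArray defers, so obj.__radd__(arr) runs once
shifted = (arr + 1).data  # ordinary scalars still go through the wrapped data
```

The second approach would invert the check: return NotImplemented by default and only proceed for operand types known to behave like arrays.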
Is your feature request related to a problem?
I would like to implement a reflexive operator (e.g. __radd__) on a custom class so that it works when applied to xarray objects.
Following is a demo snippet:
Actual Output:
Additional context
Considering that xr.DataArray._binary_op already returns NotImplemented for a list of classes:
https://github.com/pydata/xarray/blob/v2025.01.1/xarray/core/dataarray.py#L4808-L4809
I'm wondering whether we should do the same for classes that have __array_ufunc__ = None, i.e.: