Remove `patch_method_to_DataFrame` and use piping for functions in `computation.operations` · Issue #166 · OSOceanAcoustics/echopop · GitHub

Remove patch_method_to_DataFrame and use piping for functions in computation.operations #166


Open
leewujung opened this issue Jan 25, 2024 · 1 comment

Comments

@leewujung
Member

See #164 (review). To keep the code easily traceable, it seems better to just use .pipe rather than a specialized decorator for these operations.

This is low priority since all functions are working fine now. We can do this after v0.4.2 is released.

@brandynlucca
Collaborator

Just to provide additional context that I can reference later: this is the logic/rationale for why I implemented these functions using a sort of monkey patch rather than just using the native pandas.DataFrame.pipe method.

There is certainly overlap between the patch_method_to_DataFrame decorator with its associated functions (which will be renamed via #164, but ones like discretize_variable) and the flexibility/extension of pandas.DataFrame that .pipe provides. My brain definitely prefers moving through pipes with chained methods, e.g. object.function(*args), rather than as independent functions/mutations, e.g. function(object, *args). As you've pointed out, .pipe accomplishes the same task (it's almost as if pipe was intentionally designed that way or something...). So the (kind of) monkey patch feature in patch_method_to_DataFrame is (almost) entirely an aesthetic choice, such that the difference in how discretize_variable is written out would be:

# the current (sort of) monkey patched pandas.DataFrame extension approach
specimen_grouped = (
    specimen_df_copy
    .assign( arbitrary_value = 1 )
    .discretize_variable( bin_values = length_intervals , bin_variable = 'length' )
    .assign( group = lambda x: np.where( x['sex'] == int(1) , 'male' , 'female' ) )
    .pipe( lambda df: pd.concat( [ df.loc[ df[ 'sex' ] != 3 ] , df.assign( group = 'all' ) ] ) )
    .assign( station = 2 )
)

# the "let's not be beholden to Brandyn's mostly trivial and idiosyncratic code aesthetic preferences" approach
specimen_grouped = (
    specimen_df_copy
    .assign( arbitrary_value = 1 )
    .pipe( lambda df: discretize_variable( df , bin_values = length_intervals , bin_variable = 'length' ) )
    .assign( group = lambda x: np.where( x['sex'] == int(1) , 'male' , 'female' ) )
    .pipe( lambda df: pd.concat( [ df.loc[ df[ 'sex' ] != 3 ] , df.assign( group = 'all' ) ] ) )
    .assign( station = 2 )
)
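
For later reference, here is a rough, self-contained sketch of the two styles side by side. It is not the actual echopop code: the patch_method_to decorator, the body of discretize_variable (a simple pd.cut binning that adds a length_bin column), and the toy data are all assumptions made up for illustration. It also shows that .pipe forwards keyword arguments to the function, so the lambda wrapper in the second version above is probably not needed.

```python
import numpy as np
import pandas as pd


def patch_method_to(cls):
    """Hypothetical decorator: attach the wrapped function to `cls` as a method."""
    def decorator(func):
        setattr(cls, func.__name__, func)
        return func  # the plain function stays usable on its own
    return decorator


@patch_method_to(pd.DataFrame)
def discretize_variable(df, bin_values, bin_variable):
    """Stand-in binning step (assumed behavior): add a `<bin_variable>_bin` column."""
    binned = pd.cut(df[bin_variable], bin_values)
    return df.assign(**{f"{bin_variable}_bin": binned})


# Toy data, invented for this example
specimen_df = pd.DataFrame({"length": [12.0, 18.5, 27.0], "sex": [1, 2, 3]})
length_intervals = np.array([10.0, 20.0, 30.0])

# Patched-method style: reads like a native DataFrame method,
# but the method only exists after the decorator has run.
patched = specimen_df.discretize_variable(
    bin_values=length_intervals, bin_variable="length"
)

# .pipe style: nothing is attached to pandas.DataFrame, and the keyword
# arguments are forwarded to the function, so no lambda is required.
piped = specimen_df.pipe(
    discretize_variable, bin_values=length_intervals, bin_variable="length"
)

# Both routes call the same function, so the results are identical.
assert patched.equals(piped)
```

The trade-off is mostly about traceability: with .pipe, a reader can see at the call site which function is being applied, whereas the patched method looks like part of pandas until you go find the decorator.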
