Update "Differences between R and Patsy Formulas" to state that dot on RHS is not supported #134

Open
tmelconian opened this issue Oct 22, 2018 · 1 comment

@tmelconian

As discussed in #10, the dot/period on the RHS of a formula is not currently supported in Patsy. Could you update the docs, in the Comparison to R section https://patsy.readthedocs.io/en/latest/R-comparison.html, to mention this difference? Dot on the RHS is used quite extensively in R — in fact, most basic examples are written with it — so this is definitely a difference people will notice.

@steve-the-bayesian

Here is some somewhat clunky code that can handle most of what the dot operator in R can do (albeit not as gracefully). I agree that the absence of '.' is a fairly large omission from the formula language.

```python
def dot(data_frame, omit=()):
    """Build a formula string by "summing" all columns except those in `omit`.

    The omit list would typically include the name of the variable on the
    left-hand side of the formula.

    This function is named for the 'dot' operator in R, where a formula given
    as 'y ~ .' means "regress y on all other variables".  The R dot operator
    can also be used to specify interactions, as in y ~ .^2.  To allow for
    similar specifications, the return value of this function is wrapped in
    parentheses "()".

    Args:
      data_frame: The data frame from which to build the formula.  A list or
        array of column names is also acceptable.
      omit: A list of names to omit.

    Returns:
      A string containing the names in data_frame, separated by '+'.  The
      return value begins with '(' and ends with ')' so that
      "y ~ " + dot(data, omit=["y"]) + "**2" can be used to specify all
      2-way interactions.
    """
    # Fall back to the argument itself when it has no .columns attribute,
    # so that a plain list or array of names works as documented.
    columns = getattr(data_frame, "columns", data_frame)
    vnames = [str(x) for x in columns if x not in omit]
    return "(" + "+".join(vnames) + ")"
```
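One gap in the helper above is column names that are not valid Python identifiers (for example R-style names containing a period), which Patsy expects to be wrapped in its `Q()` quoting builtin. The following is a minimal sketch of a variant that takes the column names directly, builds the complete formula string, and quotes such names. The names `quote_term` and `dot_formula` and the `power` parameter are hypothetical and not part of Patsy.

```python
import keyword


def quote_term(name):
    """Return name as-is if it is a plain identifier, else wrap it in Q()."""
    name = str(name)
    if name.isidentifier() and not keyword.iskeyword(name):
        return name
    # repr() yields a valid Python string literal for the Q() argument.
    return f"Q({name!r})"


def dot_formula(columns, response, omit=(), power=1):
    """Emulate R's 'response ~ .' (or 'response ~ .^power') as a string.

    columns:  iterable of column names, e.g. df.columns (assumed input).
    response: name of the left-hand-side variable; always excluded from the RHS.
    omit:     further names to leave out of the right-hand side.
    power:    values above 1 append '**power' to request interactions.
    """
    skip = {response, *omit}
    terms = [quote_term(c) for c in columns if c not in skip]
    if not terms:
        raise ValueError("no predictor columns left after omissions")
    rhs = "(" + " + ".join(terms) + ")"
    if power > 1:
        rhs += f"**{power}"
    return f"{quote_term(response)} ~ {rhs}"


columns = ["y", "x1", "x2", "weight.kg"]
print(dot_formula(columns, "y"))
print(dot_formula(columns, "y", omit=["x2"], power=2))
```

The resulting string can then be passed wherever a Patsy formula is accepted, such as `patsy.dmatrices(formula, df)`. Unlike R's dot, the expansion happens once, when the string is built, so it has to be regenerated if the data frame's columns change.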
