Minor (in?)consistency in terms naming with Treatment scheme #151

Open
m-dz opened this issue Aug 28, 2019 · 0 comments

m-dz commented Aug 28, 2019

We recently ran into a pretty silly problem with term naming when using the Treatment scheme, see below:

Imports and data prep.:

import numpy as np
from patsy import dmatrices, dmatrix, demo_data
data = demo_data("a", "b", "x1", "x2", "y", "z column")
  1. Single quotation marks snippet:
dmatrix("C(a, Treatment('a1')) + x1 + x2", data)
  2. Double quotation marks snippet:
dmatrix('C(a, Treatment("a1")) + x1 + x2', data)

Skipping the full printout, 1) gives the following term names:

  Terms:
    'Intercept' (column 0)
    "C(a, Treatment('a1'))" (column 1)
    'x1' (column 2)
    'x2' (column 3)

while 2) gives:

  Terms:
    'Intercept' (column 0)
    'C(a, Treatment("a1"))' (column 1)
    'x1' (column 2)
    'x2' (column 3)
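
For what it's worth, the outer quotes in the two printouts look like ordinary Python `repr` behaviour rather than something patsy picks: the term name keeps whichever quote character was typed in the formula, and `repr` then wraps the string in the other one to avoid escaping. A minimal sketch with plain strings (no patsy needed; it assumes the printout shows the `repr` of each term name, which I haven't checked in the patsy source):

```python
# Term names as they would be stored for the two formulas
# (inner quotes copied verbatim from the formula strings).
name_single = "C(a, Treatment('a1'))"
name_double = 'C(a, Treatment("a1"))'

# repr() wraps a string in whichever quote character avoids escaping,
# so the two names are displayed with different outer quotes.
print(repr(name_single))
print(repr(name_double))

# The underlying strings differ only in the inner quote character.
assert name_single != name_double
assert name_single.replace("'", '"') == name_double
```

So the two names really are different strings, not just differently printed ones.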

This inconsistency in the quotation marks used in the output caused us some trouble when post-processing/cleaning term names. I understand the output is consistent with the input, but it might be beneficial to standardise the output here (as in "C(a, Treatment('a1'))" -> 'C(a, Treatment("a1"))').
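
In the meantime, a workaround on the user side could look roughly like the sketch below. `normalize_term_name` is a hypothetical helper, not part of patsy, and it assumes that the factor level names themselves never contain quote characters:

```python
def normalize_term_name(name, quote='"'):
    """Return `name` with every single or double quote replaced by `quote`.

    Hypothetical helper, not a patsy function. It assumes level names
    contain no quote characters of their own.
    """
    return name.replace("'", quote).replace('"', quote)


# Term names as reported for the two formulas above; with patsy these
# could be taken from the design matrix's design_info.term_names.
names_1 = ["Intercept", "C(a, Treatment('a1'))", "x1", "x2"]
names_2 = ["Intercept", 'C(a, Treatment("a1"))', "x1", "x2"]

# After normalisation both formulas yield the same list of names.
cleaned_1 = [normalize_term_name(n) for n in names_1]
cleaned_2 = [normalize_term_name(n) for n in names_2]
assert cleaned_1 == cleaned_2
```

This only touches the names after the fact; the formula strings and the design matrix itself are left alone.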

This seems loosely related to e.g. #40 with its long categorical names, and if the answer is similar, i.e. better not to fix things that aren't broken, maybe this can at least be mentioned in the docs? Happy to make a PR for that.

Edit:
patsy 0.5.1
Python 2.7.5 (I know...)
