Question: Programmatically creating splines and applying knots to new data #121

Open
BrianMiner opened this issue Mar 9, 2018 · 1 comment

Comments

@BrianMiner

I have found that I can create a spline on training data and then apply to test data like this:

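from patsy import dmatrix, build_design_matrices

# Create the spline basis on the training data
x1 = dmatrix("cr(x, df=3) - 1", {"x": TRAIN_DATA.VARIABLE.values})
# Apply the same basis (same knots) to the test data; the result is a one-element list
xx1 = build_design_matrices([x1.design_info], {"x": TEST_DATA.VARIABLE.values})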

This works, but of course it requires either creating the variables by hand or building formula strings programmatically.

Is there any way to do something like this:
patsy.cr(x, df=5)

and grab the knots to apply to new data using the same function cr()?

@thequackdaddy
Contributor

thequackdaddy commented Mar 11, 2018

I'm not really an expert, so there's likely an oversight here.

First, do you need to know the knots for some reason? If not, I think the canonical way would be to do something like...

import numpy as np
import patsy

# Build the design matrix
x = np.arange(100)
dm = patsy.dmatrix('cr(x, df=5)', {'x': x})

# Apply design matrix to new data... 
new_data = np.arange(25, 75)
patsy.dmatrix(dm.design_info, {'x': new_data})
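As a rough check that the knots really are reused, here is a minimal sketch that rebuilds the matrix on new data and compares it with the matching rows of the original. The variable names `new_dm`, `new_dm_alt` and `same_as_training_rows` are only for illustration, and I use float inputs just to be safe.

```python
import numpy as np
import patsy

# Fit the spline basis (and memorize its knots) on the original data
x = np.arange(100, dtype=float)
dm = patsy.dmatrix("cr(x, df=5)", {"x": x})

# Rebuild the same columns for new data by passing the DesignInfo back in
new_data = np.arange(25, 75, dtype=float)
new_dm = patsy.dmatrix(dm.design_info, {"x": new_data})

# Lower-level equivalent; build_design_matrices returns a list of matrices
new_dm_alt = patsy.build_design_matrices([dm.design_info], {"x": new_data})[0]

# new_data equals x[25:75], so if the knots were reused the rows should agree
same_as_training_rows = np.allclose(np.asarray(dm)[25:75], np.asarray(new_dm))
print(same_as_training_rows)
```

If the knots had been recomputed from `new_data` instead, the rows would generally not match, so this is a cheap sanity check before relying on the approach.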

If you really want to know what the knots were, you could probably dig through the dm.design_info object and find them.

However, it may be a little easier to pull the CR class out of the cr stateful transform function.

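# Instantiate the class behind the cr() stateful transform
cr = patsy.cr.__patsy_stateful_transform__()
# Memorize the data and compute the knots
cr.memorize_chunk(x, df=5)
cr.memorize_finish()

# Knots learned from x (private attribute, so it may change between versions)
cr._all_knots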

You could also apply to the new data using...

cr.transform(new_data)
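To avoid building formula strings for several columns, the same steps could be wrapped in a small helper and looped over. This is only a sketch: `fit_cr`, the column names `a` and `b`, and the toy data are made up for illustration, and it leans on the private `__patsy_stateful_transform__` hook and `_all_knots` attribute, which are not a documented API.

```python
import numpy as np
import patsy


def fit_cr(values, df):
    """Hypothetical helper: memorize cr() knots for one column of training data."""
    spline = patsy.cr.__patsy_stateful_transform__()  # private hook, not public API
    spline.memorize_chunk(values, df=df)
    spline.memorize_finish()
    return spline


# Toy training and test columns (the test values lie inside the training range)
train = {"a": np.linspace(0.0, 10.0, 200), "b": np.sqrt(np.linspace(1.0, 100.0, 200))}
test = {"a": np.linspace(2.0, 8.0, 50), "b": np.linspace(2.0, 9.0, 50)}

# Fit one spline per column, with no formula strings involved
splines = {name: fit_cr(values, df=5) for name, values in train.items()}

# Knots learned from the training data (private attribute)
knots = {name: spline._all_knots for name, spline in splines.items()}

# Apply the training knots to the test data
test_bases = {name: splines[name].transform(test[name]) for name in splines}

# Cross-check column "a" against the formula-based route
ref = patsy.dmatrix("cr(a, df=5) - 1", {"a": train["a"]})
ref_test = patsy.build_design_matrices([ref.design_info], {"a": test["a"]})[0]
matches_formula = np.allclose(test_bases["a"], np.asarray(ref_test))
```

The cross-check at the end compares the helper's output with what the formula interface produces for the same column, which seems worth keeping around since the helper depends on patsy internals.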
