FR Thompson sampling example #592

Open
fritzo opened this issue Apr 9, 2022 · 0 comments
Labels
examples Examples and tutorials

fritzo commented Apr 9, 2022

I came across this cute example of nested semiring dynamic programming in the context of Bayesian optimization. Thompson sampling first performs Bayesian regression, fitting a posterior p(θ|Xs,ys) over parameters θ given a fully-supervised dataset of (X,y) pairs, then repeatedly samples parameters θ and optimizes expected reward E[y|X,θ] over X. That is, at each step

X_optimal <- arg max_X int_y int_θ y p(y|θ,X)  p(θ) prod_i p(yi|θ,Xi)
# new choice                       --reward-- prior ---likelihood---
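
To make the loop concrete, here is a minimal sketch of one Thompson sampling step in its sample-then-optimize form. It rests on strong simplifying assumptions that are not part of this issue: X ranges over a small finite set, θ has one independent entry per choice with a Normal(0, prior_var) prior, and y ~ Normal(θ[X], noise_var), so the posterior is available in closed form and the argmax is a table lookup. The names `thompson_step`, `run_bandit`, `prior_var`, and `noise_var` are hypothetical, and the code is plain Python rather than Funsor or Pyroed.

```python
import random


def thompson_step(xs, ys, num_choices, rng, prior_var=1.0, noise_var=0.25):
    """Pick the next X for the toy model y ~ Normal(theta[X], noise_var)."""
    sampled_theta = []
    for k in range(num_choices):
        # Conjugate Gaussian posterior over theta[k] given the data seen at X = k.
        obs = [y for x, y in zip(xs, ys) if x == k]
        precision = 1.0 / prior_var + len(obs) / noise_var
        mean = (sum(obs) / noise_var) / precision
        # Draw one posterior sample of theta[k].
        sampled_theta.append(rng.gauss(mean, precision ** -0.5))
    # E[y | X, theta] = theta[X], so optimizing over X is an argmax over the sample.
    return max(range(num_choices), key=lambda k: sampled_theta[k])


def run_bandit(num_steps=200, seed=0, true_theta=(0.1, 0.5, 0.9), noise_var=0.25):
    """Alternate between choosing X and observing a noisy reward y."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(num_steps):
        x = thompson_step(xs, ys, len(true_theta), rng, noise_var=noise_var)
        y = rng.gauss(true_theta[x], noise_var ** 0.5)  # simulated experiment
        xs.append(x)
        ys.append(y)
    return xs
```

Calling `run_bandit()` returns the sequence of choices made. In this toy setting neither the integral nor the argmax needs dynamic programming; that only becomes interesting once X is structured.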

For example, in Pyroed p(θ) prod_i p(yi|θ,Xi) is approximated variationally, and E[y|θ,X] is just a tensor contraction. (There it is optimized via annealed Gibbs sampling rather than via dynamic programming. As an aside, we could add an annealed Gibbs sampling interpretation to Funsor for sum-product contractions.)

What would be a good example here, where we might leverage Funsor's dynamic programming to implement both the integral and argmax operators?
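
One possible candidate, sketched here under assumptions of my own: X is a short sequence of discrete choices and the reward decomposes into unary and pairwise terms along a chain. A single variable-elimination routine, parameterized by a semiring, then computes the sum (sum-product), the max (max-plus), and the argmax (max-plus on `(score, path)` pairs). This is plain Python, not Funsor's API; `contract_chain`, `brute_force`, and the factor tables are hypothetical names and made-up numbers for illustration only.

```python
import itertools
import math
import operator


def contract_chain(unary, pair, add, mul):
    """Eliminate x_1..x_T left to right in the semiring (add, mul).

    unary[t][i] is the factor on x_t = i and pair[t][i][j] the factor on
    (x_t = i, x_{t+1} = j). `add` reduces an iterable; `mul` is binary.
    """
    message = list(unary[0])
    for t in range(1, len(unary)):
        message = [
            mul(
                add(mul(message[i], pair[t - 1][i][j]) for i in range(len(message))),
                unary[t][j],
            )
            for j in range(len(unary[t]))
        ]
    return add(message)


def brute_force(unary, pair, add, mul):
    """Same quantity by enumerating every X; used only as a check."""
    def score(X):
        total = unary[0][X[0]]
        for t in range(1, len(X)):
            total = mul(mul(total, pair[t - 1][X[t - 1]][X[t]]), unary[t][X[t]])
        return total
    return add(score(X) for X in itertools.product(*[range(len(u)) for u in unary]))


# Made-up log-space factors for a chain of three binary choices.
unary = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.2]]
pair = [[[0.0, 2.0], [1.0, 0.0]], [[0.3, 0.0], [0.0, 1.5]]]

# Sum-product semiring on exponentiated factors: the "integral" over X.
exp_unary = [[math.exp(u) for u in row] for row in unary]
exp_pair = [[[math.exp(p) for p in row] for row in mat] for mat in pair]
total = contract_chain(exp_unary, exp_pair, sum, operator.mul)

# Max-plus semiring: the best achievable score.
best_score = contract_chain(unary, pair, max, operator.add)

# Max-plus on (score, path) pairs: the argmax comes out with the max.
arg_unary = [[(u, (i,)) for i, u in enumerate(row)] for row in unary]
arg_pair = [[[(p, ()) for p in row] for row in mat] for mat in pair]
arg_mul = lambda a, b: (a[0] + b[0], a[1] + b[1])
best_score_with_path, best_X = contract_chain(arg_unary, arg_pair, max, arg_mul)
```

Carrying paths inside the semiring elements keeps the sketch short; a practical implementation would more likely store backpointers. Nesting the two semirings as in the objective above (integrate out y and θ, then maximize over X) is the part this sketch leaves open.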

@fritzo fritzo added the examples Examples and tutorials label Apr 9, 2022