Google Summer of Code'20 Highlights with NumFOCUS¶
This post summarizes the work I did over the GSoC coding period. Let's get started real quick.
About the project¶
My GSoC proposal was about adding a Variational Inference (VI) interface to PyMC4. Unlike MCMC algorithms, which sample from the posterior, VI fits an approximating distribution to it. The plan was to implement two Variational Inference algorithms - Mean Field ADVI and Full Rank ADVI.
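As a one-line refresher (standard VI background, nothing specific to this project): the approximating distribution $q(\theta)$ is fit by maximizing the evidence lower bound (ELBO),

$$\mathrm{ELBO}(q) = \mathbb{E}_{q(\theta)}\big[\log p(x, \theta) - \log q(\theta)\big],$$

which is equivalent to minimizing $\mathrm{KL}\big(q(\theta)\,\|\,p(\theta \mid x)\big)$.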
Resolving Key challenges¶
Key Challenges | Solutions proposed | How it's resolved
---|---|---
`theano.clone` equivalent for TF2 | Model execution with replaced inputs | The Normal distribution's `sample` method is executed over a flattened view of the parameters
Flattened view of parameters | Use `tf.reshape()` | Used `tf.concat()` with `tf.reshape()` (see the sketch after the table)
Optimizers for ELBO | Use `tf.keras.optimizers` | Optimizers from either TFv1 or TFv2, with defaults from `pymc3.updates`, can be used
Initialization of MeanField and FullRank ADVI | Manually set bijectors | Relied on `tfp.util.TransformedVariable`
Progress bar | Use `tqdm` or `tf.keras.utils.Progbar` | A small hack over `tf.print`
Minibatch processing of data | Capture slice in memory | This is the only incomplete feature; maybe the `tf.data.Dataset` API has to be explored more, or we implement our own `tfp.vi.fit_surrogate_posterior` function
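To make the flattened-view row concrete, here is a minimal sketch (illustrative shapes and names, not PyMC4's actual code) of flattening parameters with `tf.concat()` and `tf.reshape()`, and recovering the original shapes afterwards:

```python
import tensorflow as tf

# Illustrative parameters; PyMC4 would collect these from the model itself.
params = {"mu": tf.zeros([3]), "sigma": tf.ones([2, 2])}

# Flatten: reshape each parameter to 1-D and concatenate into one vector.
flat = tf.concat([tf.reshape(v, [-1]) for v in params.values()], axis=0)

# Unflatten: slice the vector and reshape each slice to its original shape.
unflattened, offset = {}, 0
for name, value in params.items():
    size = int(tf.size(value))
    unflattened[name] = tf.reshape(flat[offset:offset + size], value.shape)
    offset += size
```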
Community Bonding Period¶
- This was a super interesting period. I got to know many PyMC core developers through Slack.
- I spent the entire time learning about the basics of Bayesian statistics, prior, posterior predictive checks, and the theory of Variational Inference.
- I had also written a blog post during this interval about the nuts and bolts of VI and the implementation of Mean Field ADVI in TensorFlow Probability. Here is the blog post - Demystify Variational Inference.
- The most difficult part of learning VI was understanding the transformations, because PyMC3 and TFP handle transformations differently.
Month 1¶
The coding period started on June 1, and my goal for this period was to add a very basic and general Variational Inference interface to PyMC4. Here is the PR: #280. The workflow of the basic interface was as follows (a minimal sketch follows the list):
- Get the vectorized `log_prob` of the model.
- For each parameter of the model, have a Normal distribution with the same shape, then build a posterior using `tfd.JointDistributionSequential`.
- Add optimizers with defaults from PyMC3 and perform VI using `tfp.vi.fit_surrogate_posterior`.
- Sample from `tfd.JointDistributionSequential`; there is no need for an equivalent of `theano.clone`.
- Transform the samples by querying the `SamplingState`; `Deterministics` have to be added as well.
- Resolve shape issues with ArviZ; in short, making `chains=1`.
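To make the workflow concrete, here is a minimal sketch of the same steps against a toy target distribution; the target, shapes, and hyperparameters are illustrative, not PyMC4's actual internals:

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd, tfb = tfp.distributions, tfp.bijectors

# Toy stand-in for the model's vectorized log prob.
target = tfd.MultivariateNormalDiag(loc=[1., -1., 0.5])

# One Normal per parameter (here just one), wrapped in a joint surrogate.
loc = tf.Variable(tf.zeros([3]))
scale = tfp.util.TransformedVariable(tf.ones([3]), bijector=tfb.Softplus())
surrogate = tfd.JointDistributionSequential(
    [tfd.Independent(tfd.Normal(loc, scale), reinterpreted_batch_ndims=1)]
)

# Maximize the ELBO; sampling the surrogate afterwards is what removes the
# need for a theano.clone equivalent.
losses = tfp.vi.fit_surrogate_posterior(
    target_log_prob_fn=lambda x: target.log_prob(x),
    surrogate_posterior=surrogate,
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    num_steps=200,
)
posterior_samples = surrogate.sample(1000)
```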
I got the basic interface merged by late June, and then it was time to work on Full Rank ADVI. I managed to open PR #289 with the Full Rank ADVI interface by the end of June.
Month 2¶
This was the most dramatic month of the GSoC coding period, because the Full Rank ADVI proposed in PR #289 resulted in errors most of the time. Here is the gist of the workflow that was followed to get some useful insights about the errors:
- Instead of solving the shape issues independently and posing a `MvNormal` distribution for each parameter, build the posterior using a flattened view of the parameters.
- There were lots of NaNs in the ELBO because of improper handling of transformations. As a result, `Interval`, `LowerBounded` and `UpperBounded` transformations were added as well.
- Then came the issue of Cholesky decomposition errors while working with Gaussian Processes and Variational Inference. Here are a few insights after rigorous testing with different inputs (a sketch of the jitter trick follows this list):
  - Use dtype `tf.float64` with FullRank ADVI to maintain positive definiteness of the covariance matrix.
  - Avoid aggressive optimization of the ELBO. Keep learning rates around `1e-3`.
  - Stabilize the diagonal of the covariance matrix by adding a small jitter.
  - Double check for NaNs in the data.
- Here are the results after trying reparametrization and different jitter amounts while doing VI.
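Here is a minimal sketch of the jitter trick from the list above, applied to a generic covariance matrix (the helper name is mine, not PyMC4's):

```python
import tensorflow as tf

def stable_cholesky(cov, jitter=1e-6):
    # Work in float64 and lift the diagonal slightly so the matrix stays
    # numerically positive definite before factorization.
    cov = tf.cast(cov, tf.float64)
    n = tf.shape(cov)[0]
    return tf.linalg.cholesky(cov + jitter * tf.eye(n, dtype=tf.float64))
```

If the decomposition still fails, raising the jitter by powers of ten (say, from `1e-6` up to `1e-4`) is the usual fallback.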
I got this PR merged by the end of July, and then it was time to work on adding some features to ADVI.
Month 3¶
After adding the missing transformations in PR #289, my mentor asked me to write a proposal so that Bounded distributions are inherited instead of us applying transformations manually to each distribution. I explored every possibility to make a generalized version of transformations (as is done in PyMC3) using `tf.cond`. Since we do not have values before model execution, it was difficult to use `tf.cond`. Here is the proposal's source.
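To illustrate the difficulty: with statically known bounds, the bijector can be picked with plain Python `if`/`else` at graph-construction time, whereas `tf.cond` needs its predicate as a concrete tensor, which is exactly what is unavailable before model execution. Here is a sketch of the eager branching (the helper name and bijector chains are my own, not the proposal's):

```python
import tensorflow_probability as tfp

tfb = tfp.bijectors

def bijector_for_bounds(lower=None, upper=None):
    # Map the unconstrained real line onto the distribution's support,
    # branching on bounds that are known before the graph is built.
    if lower is not None and upper is not None:
        return tfb.Chain([tfb.Shift(lower), tfb.Scale(upper - lower), tfb.Sigmoid()])
    if lower is not None:
        return tfb.Chain([tfb.Shift(lower), tfb.Exp()])
    if upper is not None:
        return tfb.Chain([tfb.Shift(upper), tfb.Scale(-1.0), tfb.Exp()])
    return tfb.Identity()
```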
After getting an interface to use MeanField and FullRank ADVI, some features were included in PR #310 (a sketch of the `tf.print` hack follows the list):

- Add a progress bar (a small hack over `tf.print`).
- Test the progress bar on different operating systems.
- Add a `ParameterConvergence` criterion to test convergence.
- Add a LowRank Approximation.
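For reference, a minimal sketch of the kind of trick behind the progress-bar item (not the exact PR #310 code): `tf.print` works inside compiled `tf.function` graphs where ordinary `print` or `tqdm` do not, and ending the message with a carriage return rewrites the same console line on every step:

```python
import tensorflow as tf

@tf.function
def train_step(step, num_steps, loss):
    # "\r" rewrites one console line instead of scrolling.
    tf.print("Step", step, "/", num_steps, "| loss:", loss, end="\r")

num_steps = tf.constant(100)
for step in tf.range(1, num_steps + 1):
    loss = 1.0 / tf.cast(step, tf.float32)  # placeholder loss
    train_step(step, num_steps, loss)
```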
I am still working on adding examples of hierarchical models, and I hope to get the PR merged soon.
Contributions¶
These are the Pull Requests I opened and got merged during GSoC. I have explained each one above, but here is a summary:
- Add Variational Inference Interface: #280
- Add Full Rank Approximation: #289
- Add features to ADVI: #310 (WIP)
- Remove transformations for Discrete distributions: #314
Gists created¶
Whatever experiments I perform to aid my learning, I polish and share through GitHub gists. I do not know why, but I have started to love sharing code through gists rather than Colab or a GitHub repo. Here are all the experiments I performed with ADVI during this summer:
- Comparison of MeanField ADVI in TFP, PyMC3, PyMC4: Source
- Demonstration of shape issues while working with InferenceData: Source
- Playing around Convergence and Optimizers: Source
- Tracking all parameters including deterministics: Source
- Implementation of FullRank ADVI in TFP: Source
- Comparison of MeanField and FullRank ADVI over correlated Gaussians: Source
- Model flattening and Full Rank ADVI in PyMC4: Source
- Missing transformations in PyMC4: Source
- Testing transformations in PyMC4: Source
- Distribution Enhancement Proposal: Source
- Hacking `tf.print` for progress bar: Source
- Parameter Convergence Checks in TFP: Source
Future Goals¶
Some future tasks I would like to work on:

- Configure minibatch processing of data.
- Add Normalizing Flows to the variational inference interface.
- Add support for Variational AutoEncoders to PyMC4.
Conclusion¶
It was an incredible experience contributing to open source, and I have improved my Python skills along the way. I want to thank my mentors @ferrine and @twiecki for being extremely supportive throughout this entire journey. I am loving my time with the PyMC community. I also want to thank the @numfocus community for sharing this opportunity via Google Summer of Code.
Thank you for being a part of this fantastic summer.
With ❤️, Sayam Kumar