ivallesp/papers


Paper Notes

This repository contains my personal notes on the papers I have read since the start of 2020. The papers below span different subfields of artificial intelligence (e.g. deep learning and reinforcement learning).

I usually complement my notes with excerpts from the paper that I consider important and, in some cases, with additional material from other papers, videos, or blogs that helped me understand the paper. I usually link these additional resources in the references section of the notes. If you find your content in my notes and I didn't reference it properly, please feel free to open an issue and I will update my notes accordingly.

Most of my notes are written in Markdown and contain LaTeX formulas. GitHub doesn't parse the formulas, so if you want to read my notes more comfortably, you should use a Markdown renderer that supports them. I personally use VSCode with the LaTeX Workshop extension.

Papers

[Notes] [Paper] - 2023 – Modular Deep Learning – Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Maria Ponti – arXiv Preprint

[Notes] [Paper] - 2023 – Mamba: Linear-Time Sequence Modeling with Selective State Spaces – Albert Gu, Tri Dao – arXiv Preprint

[Notes] [Paper] - 2019 – Parameter-Efficient Transfer Learning for NLP – Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly – arXiv Preprint

[Notes] [Paper] - 2023 – RWKV: Reinventing RNNs for the Transformer Era – Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Stella Biderman, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Jiaju Lin, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Guangyu Song, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanislaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu – NeurIPS 2023

[Notes] [Paper] - 2024 – Mixtral of Experts – Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed – Mistral report

[Notes] [Paper] - 2023 – GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints – Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, Sumit Sanghai – arXiv Preprint

[Notes] [Paper] - 2024 – ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering – Yakun Song, Zhuo Chen, Xiaofei Wang, Ziyang Ma, Xie Chen – arXiv Preprint

[Notes] [Paper] - 2021 – Align before Fuse: Vision and Language Representation Learning with Momentum Distillation – Junnan Li, Ramprasaath R. Selvaraju, Akhilesh Deepak Gotmare, Shafiq Joty, Caiming Xiong, Steven Hoi – arXiv Preprint

[Notes] [Paper] - 2021 – Efficiently Modeling Long Sequences with Structured State Spaces – Albert Gu, Karan Goel, Christopher Ré – ICLR 2022

[Notes] [Paper] - 2023 – Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale – Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu – arXiv Preprint

[Notes] [Paper] - 2023 – Boosting Large Language Model for Speech Synthesis: An Empirical Study – Hongkun Hao, Long Zhou, Shujie Liu, Jinyu Li, Shujie Hu, Rui Wang, Furu Wei – arXiv Preprint

[Notes] [Paper] - 2023 – Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias – Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao – arXiv Preprint

[Notes] [Paper] - 2022 – FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness – Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré – arXiv Preprint

[Notes] [Paper] - 2020 – Conformer: Convolution-augmented Transformer for Speech Recognition – Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang – Interspeech 2020

[Notes] [Paper] - 2023 – SoundStorm: Efficient Parallel Audio Generation – Zalán Borsos, Matt Sharifi, Damien Vincent, Eugene Kharitonov, Neil Zeghidour, Marco Tagliasacchi – Google Research Report

[Notes] [Paper] - 2023 – HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec – Dongchao Yang, Songxiang Liu, Rongjie Huang, Jinchuan Tian, Chao Weng, Yuexian Zou – arXiv Preprint

[Notes] [Paper] - 2023 – Consistency Models – Yang Song, Prafulla Dhariwal, Mark Chen, Ilya Sutskever – arXiv Preprint

[Notes] [Paper] - 2023 – Scaling Speech Technology to 1,000+ Languages – Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli – arXiv Preprint

[Notes] [Paper] - 2022 – Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning – Utku Evci, Vincent Dumoulin, Hugo Larochelle, Michael C. Mozer – arXiv Preprint

[Notes] [Paper] - 2020 – Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis – Rafael Valle, Kevin Shih, Ryan Prenger, Bryan Catanzaro – International Conference on Learning Representations, 2021

[Notes] [Paper] - 2016 – Gaussian Error Linear Units (GELUs) – Dan Hendrycks, Kevin Gimpel – arXiv Preprint

[Notes] [Paper] - 2020 – Review of research on lightweight convolutional neural networks – Yan Zhou, Shaochang Chen, Yiming Wang, Wenming Huan – ITOEC 2020

[Notes] [Paper] - 2022 – Cross-Lingual Text-to-Speech Using Multi-Task Learning and Speaker Classifier Joint Training – J. Yang, Lei He – Microsoft Azure Speech report

[Notes] [Paper] - 2021 – Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN – Congyi Wang, Yu Chen, Bin Wang, Yi Shi – Report 2021

[Notes] [Paper] – 2022 – IQDUBBING: Prosody modeling based on discrete self-supervised speech representation for expressive voice conversion – Wendong Gan, Bolong Wen, Ying Yan, Haitao Chen, Zhichao Wang, Hongqiang Du, Lei Xie, Kaixuan Guo, Hai Li – ICASSP 2022

[Notes] [Paper] – 2021 – Speech-T: Transducer for Text to Speech and Beyond – Jiawei Chen, Xu Tan, Yichong Leng, Jin Xu, Guihua Wen, Tao Qin, Tie-Yan Liu – NeurIPS 2021

[Notes] [Paper] – 2017 – Density estimation using Real NVP – Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio – Google Brain 2017

[Notes] [Paper] – 2015 – MADE: Masked Autoencoder for Distribution Estimation – Mathieu Germain, Karol Gregor, Iain Murray, Hugo Larochelle – ICML 2015

[Notes] [Paper] – 2017 – Masked Autoregressive Flow for Density Estimation – George Papamakarios, Theo Pavlakou, Iain Murray – NIPS 2017

[Notes] [Paper] – 2021 – MLP-Mixer: An all-MLP Architecture for Vision – Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy – Google Preprint

[Notes] [Paper] – 2021 – Meta Pseudo Labels – Hieu Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le – IEEE Conference on Computer Vision and Pattern Recognition 2021

[Notes] [Paper] – 2013 – Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks – Dong-Hyun Lee – ICML 2013

[Notes] [Paper] – 2018 – Glow: Generative Flow with Invertible 1x1 Convolutions – Diederik P. Kingma, Prafulla Dhariwal – NeurIPS 2018

[Notes] [Paper] – 2020 – Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech – Vatsal Aggarwal, Marius Cotescu, Nishant Prateek, Jaime Lorenzo-Trueba, Roberto Barra-Chicote – ICASSP 2020

[Notes] [Paper] – 2021 – Randomized Automatic Differentiation – Deniz Oktay, Nick McGreivy, Joshua Aduol, Alex Beatson, Ryan P. Adams – ICLR 2021

[Notes] [Paper] – 2021 – GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech Synthesis – Rui Liu, Berrak Sisman, Haizhou Li – ICASSP 2021

[Notes] [Paper] – 1993 – Signature Verification using a "Siamese" Time Delay Neural Network – Jane Bromley, Isabelle Guyon, Yann LeCun, Eduard Säckinger, Roopak Shah – NIPS, 1993

[Notes] [Paper] – 2016 – Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning – Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi – CVPR, 2016

[Notes] [Paper] – 2017 – Learning Transferable Architectures for Scalable Image Recognition – Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le – CVPR, 2018

[Notes] [Paper] – 2019 – EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks – Mingxing Tan, Quoc V. Le – International Conference on Machine Learning, 2019

[Notes] [Paper] – 2018 – MobileNetV2: Inverted Residuals and Linear Bottlenecks – Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen – The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

[Notes] [Paper] – 2017 – MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications – Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam – Google Report, 2017

[Notes] [Paper] – 2015 – Training Very Deep Networks (highway networks) – Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber – NIPS 2015

[Notes] [Paper] – 2017 – Densely Connected Convolutional Networks – Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger – CVPR 2017

[Notes] [Paper] – 2020 – Controllable neural text to speech synthesis using intuitive prosodic features – Tuomo Raitio, Ramya Rasipuram, Dan Castellani – Apple report 2020

[Notes] [Paper] – 2020 – Knowledge distillation: A survey – Jianping Gou, Baosheng Yu, Stephen J. Maybank, Dacheng Tao – Sydney AI Centre report 2020

[Notes] [Paper] – 2015 – Distilling the Knowledge in a Neural Network – Geoffrey Hinton, Oriol Vinyals and Jeff Dean – Neural Information Processing Systems 2014

[Notes] [Paper] – 2020 – Glow-TTS: A generative flow for text-to-speech via Monotonic Alignment Search – Jaehyeon Kim, Sungwon Kim, Jungil Kong, Sungroh Yoon – NeurIPS 2020

[Notes] [Paper] – 2020 – Sequence to sequence singing synthesis using the feed forward transformer – Merlijn Blaauw, Jordi Bonada – ICASSP 2020

[Notes] [Paper] – 2019 – A Generalized Framework for Self-Play Training – Daniel Hernandez, Kevin Denamganaï, Yuan Gao, Peter York, Sam Devlin, Spyridon Samothrakis, James Alfred Walker – IEEE 2019

[Notes] [Paper] – 2020 – Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling – Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu, Xunying Liu, Helen Meng – IEEE 2020

[Notes] [Paper] – 2020 – Controllable Neural Prosody Synthesis – Max Morrison, Zeyu Jin, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore – Interspeech 2020

[Notes] [Paper] – 2020 – A Spectral Energy Distance for Parallel Speech Synthesis – Alexey A. Gritsenko, Tim Salimans, Rianne van den Berg, Jasper Snoek, Nal Kalchbrenner – 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada

[Notes] [Paper] – 2019 – Text Normalization Using Memory Augmented Neural Networks – Subhojeet Pramanik, Aman Hussain – Speech Communication

[Notes] [Paper] – 2020 – Voice conversion with a transformer network – Ruolan Liu, Xiao Chen, Xue Wen – IEEE

[Notes] [Paper] – 2020 – DeepSinger: Singing Voice Synthesis with Data Mined From the Web – Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu – KDD 2020

[Notes] [Paper] – 2018 – Efficient Neural Audio Synthesis – Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu – International Conference on Machine Learning 2018

[Notes] [Paper] – 2017 – Parallel WaveNet: Fast High-Fidelity Speech Synthesis – Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis – International Conference on Machine Learning 2018

[Notes] [Paper] – 2018 – Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions – Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu – ICASSP 2018

[Notes] [Paper] – 2017 – Tacotron: Towards End-to-End Speech Synthesis – Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous – INTERSPEECH 2017

[Notes] [Paper] – 2020 – TalkNet: Fully-Convolutional Non-Autoregressive Speech Synthesis Model – Stanislav Beliaev, Yurii Rebryk, Boris Ginsburg – INTERSPEECH 2020

[Notes] [Paper] – 2015 – A comprehensive survey of clustering algorithms – Dongkuan Xu, Yingjie Tian – Annals of Data Science, Springer, 2015

[Notes] [Paper] – 1989 – Optimal brain damage – Yann LeCun, John S. Denker, Sara A. Solla – Neural Information Processing Systems (NIPS 1989)

[Notes] [Paper] – 2019 – Towards achieving robust universal neural vocoding – Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote, Alexis Moinet, Vatsal Aggarwal – Interspeech 2019

[Notes] [Paper] – 2019 – Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech – Daniel Korzekwa, Roberto Barra-Chicote, Bozena Kostek, Thomas Drugman, Mateusz Lajszczak – Interspeech 2019

[Notes] [Paper] – 2019 – In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data – Nishant Prateek, Mateusz Łajszczak, Roberto Barra-Chicote, Thomas Drugman, Jaime Lorenzo-Trueba, Thomas Merritt, Srikanth Ronanki, Trevor Wood – North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

[Notes] [Paper] – 2019 – Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask – Hattie Zhou, Janice Lan, Rosanne Liu, Jason Yosinski – Neural Information Processing Systems (NeurIPS) 2019

[Notes] [Paper] – 2016 – WaveNet: A Generative Model for Raw Audio – Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu – DeepMind report 2016

[Notes] [Paper] – 2019 – BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding – Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova – North American Chapter of the Association for Computational Linguistics (NAACL) 2019

[Notes] [Paper] – 2019 – Weight agnostic neural networks – Adam Gaier and David Ha – Neural Information Processing Systems (NeurIPS) 2019

[Notes] [Paper] – 2020 – Once-for-All: Train One Network and Specialize it for Efficient Deployment – Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, Song Han – International Conference on Learning Representations (ICLR) 2020

[Notes] [Paper] – 2017 – Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles – Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell – Neural Information Processing Systems (NeurIPS) 2017

[Notes] [Paper] – 2019 – A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks – Angelos Filos, Sebastian Farquhar, Aidan N. Gomez, Tim G. J. Rudner, Zachary Kenton, Lewis Smith, Milad Alizadeh, Arnoud de Kroon, Yarin Gal – Neural Information Processing Systems (NeurIPS, Bayesian DL workshop) 2019

[Notes] [Paper] – 2016 – Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning – Yarin Gal, Zoubin Ghahramani – International Conference on Machine Learning (ICML) 2016

[Notes] [Paper] – 2018 – Optimization for deep learning: theory and algorithms – Ruoyu Sun – University of Illinois report

[Notes] [Paper] – 2018 – The lottery ticket hypothesis: finding sparse, trainable neural networks – Jonathan Frankle, Michael Carbin – International Conference on Learning Representations (ICLR) 2019

[Notes] [Paper] – 2019 – Temporal Pattern Attention for Multivariate Time Series Forecasting – Shun-Yao Shih, Fan-Keng Sun, Hung-yi Lee – European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2019

[Notes] [Paper] – 2020 – Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case – Neo Wu, Bradley Green, Xue Ben, Shawn O'Banion – Proceedings of the International Conference on Machine Learning (ICML), 2020

[Notes] [Paper] – 2018 – Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor – Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine – International Conference on Machine Learning, 2018

[Notes] [Paper] – 2016 – Continuous control with Deep Reinforcement Learning – Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra – International Conference on Learning Representations, 2016

[Notes] [Paper] – 2014 – Deterministic Policy Gradient Algorithms – David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin Riedmiller – International Conference on Machine Learning, 2014

[Notes] [Paper] – 2017 – Proximal Policy Optimization Algorithms – John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov – OpenAI Report

[Notes] [Paper] – 2015 – Trust Region Policy Optimization – John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel – Proceedings of the 32nd International Conference on Machine Learning

[Notes] [Paper] – 2016 – Asynchronous Methods for Deep Reinforcement Learning – Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu – Proceedings of The 33rd International Conference on Machine Learning

[Notes] [Paper] – 2020 – A Survey of Deep Learning for Scientific Discovery – Maithra Raghu, Eric Schmidt – Cornell University and Schmidt Futures report

[Notes] [Paper] – 2017 – Attention Is All You Need – Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin – Neural Information Processing Systems (NIPS) 2017

[Notes] [Paper] – 2017 – Convolutional Sequence to Sequence Learning – Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin – International Conference on Machine Learning (ICML) 2017

[Notes] [Paper] – 2005 – Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method – Martin Riedmiller – Springer-Verlag Berlin Heidelberg 2005

[Notes] [Paper] – 2019 – Challenges of Real-World Reinforcement Learning – Gabriel Dulac-Arnold, Daniel Mankowitz, Todd Hester – International Conference on Machine Learning (ICML) 2019

[Notes] [Paper] – 2019 – Off-Policy Deep Reinforcement Learning without Exploration – Scott Fujimoto, David Meger, Doina Precup – ICML 2019

[Notes] [Paper] – 2019 – Way Off-Policy Batch Reinforcement Learning of Implicit Human Preferences in Dialog – Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard – Cambridge, preprint. Under review

[Notes] [Paper] – 2019 – Striving for simplicity in off-policy deep reinforcement learning – Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi – Neural Information Processing Systems (NeurIPS) 2019

[Notes] [Paper] – 2016 – Doubly Robust Off-Policy Value Evaluation for Reinforcement Learning – Nan Jiang, Lihong Li – Proceedings of the 33rd International Conference on Machine Learning

[Notes] [Paper] – 2020 – Up to two billion times acceleration of scientific simulations with deep neural architecture search – M. F. Kasim, D. Watson-Parris, L. Deaconu, S. Oliver, P. Hatfield, D. H. Froula, G. Gregori, M. Jarvis, S. Khatiwala, J. Korenaga, J. Topp-Mugglestone, E. Viezzer, S. M. Vinko – Science

[Notes] [Paper] – 2016 – Deep Reinforcement Learning in Large Discrete Action Spaces – Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin – Google Deepmind

[Notes] [Paper] – 2020 – Q-Learning in enormous action spaces via amortized approximate maximization – Tom Van de Wiele, David Warde-Farley, Andriy Mnih, Volodymyr Mnih – DeepMind report

[Notes] [Paper] – 2015 – Neural Machine Translation by Jointly Learning to Align and Translate – Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio – International Conference on Learning Representations (ICLR 2015)

[Notes] [Paper] – 2011 – Doubly Robust Policy Evaluation and Learning – Miroslav Dudík, John Langford, Lihong Li – Proceedings of the 28th International Conference on Machine Learning

[Notes] [Paper] – 2016 – Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads – Ji He, Mari Ostendorf, Xiaodong He, Jianshu Chen, Jianfeng Gao, Lianfeng Gao, Lihong Li, Li Deng – Microsoft Research and University of Washington

[Notes] [Paper] – 2016 – Offline Evaluation of Online Reinforcement Learning Algorithms – Travis Mandel, Yun-En Liu, Emma Brunskill, Zoran Popovic – Proceedings of the 30th AAAI Conference on Artificial Intelligence

[Notes] [Paper] – 2018 – Handling Large-Scale Action Space in Deep Q Network – Zhiheng Zhao, Yi Liang, Xiaoming Ji – 2018 International Conference on Artificial Intelligence and Big Data

[Notes] [Paper] – 2017 – Deep Learning with Depthwise Separable Convolutions – François Chollet – Google report