Today we're joined by Stephen Merity, startup founder and independent researcher, with a focus on NLP and Deep Learning.
Late last month, Stephen released his latest paper, Single Headed Attention RNN: Stop Thinking With Your Head, which we break down extensively in this conversation. Stephen details his primary motivations for writing the paper: NLP research has recently been dominated by transformer models, and these models are not the most accessible or trainable for broad use. We discuss the architecture of transformer models, how he came to the decision to use SHA-RNNs in his research, how he built and trained the model, his approach to benchmarking, and finally his goals for the research within the broader research community.