Single Headed Attention RNN: Stop Thinking With Your Head with Stephen Merity

EPISODE 325

About this Episode

Today we're joined by Stephen Merity, a startup founder and independent researcher focused on NLP and deep learning. Late last month, Stephen released his latest paper, Single Headed Attention RNN: Stop Thinking With Your Head, which we break down extensively in this conversation. Stephen details his primary motivations for writing the paper: NLP research has recently been dominated by transformer models, and these models are not the most accessible or trainable for broad use. We discuss the architecture of transformer models, how he came to the decision to use SHA-RNNs for his research, how he built and trained the model, his approach to benchmarking, and finally his goals for the research in the broader research community.
Connect with Stephen