Paper: Attention Is All You Need (63/365)

I can’t believe that, after my gripe yesterday, this is the paper I’ve picked today. This paper claims that a network built entirely from self-attention and feed-forward layers (the Transformer) trains faster and performs better than recurrent or convolutional networks on machine translation tasks. I’ll have to come back to this one at a later time, though, because I need to brush up on much of the terminology.
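
For my own reference, a minimal sketch of the paper’s central operation, scaled dot-product attention, i.e. softmax(QKᵀ/√d_k)·V. This is not from the post itself; the function name, shapes, and the toy example below are purely illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # pairwise similarities, scaled by sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the keys
    return weights @ V                                   # weighted sum of the values

# Toy example: 4 tokens with 8-dimensional queries, keys, and values
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)              # shape (4, 8)
```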
