Author: Marcello Politi
Originally published on Towards AI.
A gentle dive into how attention helps neural networks remember better and forget less
The attention mechanism is often associated with the transformer architecture, but it was already used in RNNs. In machine translation (MT) tasks (e.g., English to Italian), when you want to predict the next Italian word, you need your model to focus, or pay attention, on the English words that are most useful for a good translation.
I will not go into the details of RNNs, but attention helped these models mitigate the vanishing gradient problem and capture more relationships between words.
At some point, we understood that the only really important thing was the attention mechanism, and that the entire RNN architecture was overkill. Hence, Attention Is All You Need!
Classic attention indicates which words in the input sequence the words in the output sequence should focus on. This is important in sequence-to-sequence tasks such as MT.
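As a rough illustration of this idea (a minimal sketch, not the exact formulation from any particular paper), the snippet below scores a single decoder state against every encoder state, turns the scores into attention weights with a softmax, and mixes the encoder states into a context vector. All arrays here are random placeholders rather than real model outputs.

```python
# Minimal sketch of classic (cross-)attention in an RNN encoder-decoder.
# Encoder states and the decoder state are random stand-ins.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)

# Hidden states for a 5-word English input sentence (one 8-dim vector per word).
encoder_states = rng.normal(size=(5, 8))

# Current decoder hidden state while predicting the next Italian word.
decoder_state = rng.normal(size=(8,))

# Dot-product scores: how relevant is each input word to this decoding step?
scores = encoder_states @ decoder_state          # shape: (5,)

# Softmax turns the scores into attention weights that sum to 1.
weights = softmax(scores)                        # shape: (5,)

# The context vector is the attention-weighted mix of the encoder states.
context = weights @ encoder_states               # shape: (8,)

print("attention weights:", np.round(weights, 3))
print("context vector shape:", context.shape)
```

In a real encoder-decoder, this context vector would then be combined with the decoder state to help predict the next output word.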
Self-attention is a specific type of attention. It operates between elements of the same sequence, and it tells us how “correlated” the words within the same sentence are.
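To make this concrete, here is a minimal NumPy sketch of scaled dot-product self-attention within a single sentence. The token embeddings and projection matrices are random placeholders, not trained weights.

```python
# Minimal sketch of self-attention: every token in a sentence scores
# every other token in the same sentence.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(42)

tokens = ["the", "cat", "sat", "down"]
d_model = 8
X = rng.normal(size=(len(tokens), d_model))      # token embeddings (placeholder)

# Query/key/value projections (learned in a real model, random here).
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Scaled dot-product attention: row i holds the attention that token i
# pays to every token in the sentence (including itself).
attn = softmax(Q @ K.T / np.sqrt(d_model), axis=-1)   # shape: (4, 4)

# New, context-aware representation of each token.
output = attn @ V                                      # shape: (4, 8)

print(np.round(attn, 2))
```

Each row of `attn` is the attention vector for one token: how strongly that token attends to every word in the sentence.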
For a given token (or word) in the sequence, self-attention generates an attention vector that covers all the other tokens in the sequence. This … Read the full blog for free on Medium.
Published via Towards AI