The Transformer, first proposed by Google in 2017, has become a prominent and widely used model in
natural language processing (NLP) owing to its exceptional performance. Transformers are the core
building blocks of the most popular large language models.
Structurally, the Transformer follows an encoder-decoder
design.
The encoder block features a multi-head self-attention mechanism and
a position-wise feed-forward network,
while the decoder block additionally includes a multi-head cross-attention
mechanism.
These elements are linked by a combination of two operations: residual
connection and layer normalization.
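The way these two operations wrap a sub-layer can be sketched as follows. This is a minimal illustration, not TransforLearn's code: it assumes the post-norm arrangement LayerNorm(x + Sublayer(x)), omits the learned gain and bias of layer normalization, and uses a hypothetical toy_sublayer in place of a real attention or feed-forward sub-layer.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and (almost) unit variance.
    The learned gain and bias are omitted in this sketch."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def add_and_norm(x, sublayer):
    """Residual connection followed by layer normalization:
    LayerNorm(x + sublayer(x))."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])

def toy_sublayer(x):
    """Hypothetical stand-in for an attention or feed-forward sub-layer."""
    return [0.5 * v for v in x]

out = add_and_norm([1.0, 2.0, 3.0, 4.0], toy_sublayer)
```

The residual path lets the input pass through unchanged alongside the sub-layer output, and the normalization keeps the scale of each position's vector stable from one block to the next.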
In machine translation tasks, the input text is converted into numerical representations by combining
word embeddings with positional encodings.
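As an illustration of this step, the sketch below adds a positional encoding to each token embedding. It assumes the sinusoidal encoding of the original Transformer; the vocabulary, the embedding table, and the function names are hypothetical toy values, and the embedding scaling used in the original model is left out for brevity.

```python
import math

def positional_encoding(position, d_model):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    pe = []
    for i in range(d_model):
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

# Hypothetical toy vocabulary and embedding table (d_model = 4).
vocab = {"hello": 0, "world": 1}
embedding_table = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
]

def encode(tokens, d_model=4):
    """Look up each token's embedding and add its positional encoding."""
    rows = []
    for pos, tok in enumerate(tokens):
        emb = embedding_table[vocab[tok]]
        pe = positional_encoding(pos, d_model)
        rows.append([e + p for e, p in zip(emb, pe)])
    return rows

inputs = encode(["hello", "world"])
```

Each row of inputs is the vector that would enter the first encoder block for the corresponding token. The same word at a different position receives a different vector, which is how the model distinguishes word order.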
We propose TransforLearn, a tutorial tool that introduces Transformers to beginners.
TransforLearn takes a visual approach and offers learners a better learning experience through
interactive exploration.
Our major contributions are as follows:
TransforLearn is built on the foundational Transformer model. We visualize the forward propagation process of the model, that is, how an input text is converted into its translation. To help beginners overcome the hurdle of aligning model input and output with task requirements, we provide two exploration modes: architecture-driven exploration and task-driven exploration.
In architecture-driven exploration, the system provides an overview of the Transformer architecture and presents the detailed modules. The overview and the detailed modules are related hierarchically: blue tabbed views show the topmost structures of the Transformer, orange tabbed views show the first-level expanded state, and green tabbed views show the second-level detailed operations. Users can drill down from the architecture overview to the detailed module views through specific interactions.
In task-driven exploration, users gain a deeper understanding of the data flow transformation and the model structure through a concrete downstream task (machine translation in this system). By changing the decoding time step, users can observe how the data flow within each module and the final output change.
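To make the notion of a decoding time step concrete, the sketch below shows a greedy autoregressive decoding loop. It is an illustrative assumption rather than TransforLearn's implementation: toy_next_token is a hypothetical stand-in for a full decoder forward pass, and it simply copies the source tokens in upper case.

```python
BOS, EOS = "<bos>", "<eos>"

def toy_next_token(source_tokens, generated):
    """Hypothetical stand-in for one decoder forward pass.
    A real model would attend over the source and the tokens generated so far."""
    step = len(generated) - 1  # tokens produced so far, excluding BOS
    if step < len(source_tokens):
        return source_tokens[step].upper()
    return EOS

def greedy_decode(source_tokens, max_steps=10):
    """Generate one token per time step until EOS or max_steps is reached."""
    generated = [BOS]
    history = []  # decoder input/output sequence after each time step
    for _ in range(max_steps):
        token = toy_next_token(source_tokens, generated)
        generated.append(token)
        history.append(list(generated))
        if token == EOS:
            break
    return generated, history

output, history = greedy_decode(["guten", "tag"])
```

Each entry of history corresponds to one decoding time step. Stepping through it shows how the sequence fed back into the decoder grows by one token at a time until the end-of-sequence token is produced.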
This work is supported by the National Natural Science Foundation of China (NSFC No. 62202105), the
Shanghai Municipal Science and Technology Major Project (2021SHZDZX0103), the General Program
(No. 21ZR1403300), the Sailing Program (No. 21YF1402900), and ZJLab.
Visit our website: FDU-VIS