multi-head attention
Multi-head attention is a mechanism used in deep learning models, most notably in transformer architectures, that lets the model attend to different parts of the input sequence simultaneously and capture several kinds of relationships at once. It does this by running multiple attention heads in parallel, each with its own learned projections of the queries, keys, and values; the heads attend to the input independently, and their outputs are concatenated and projected to produce the final attention-based output.
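As a rough illustration of that description, here is a minimal sketch of multi-head attention in plain NumPy. The function name, shapes, and randomly initialized weights are illustrative assumptions for this sketch, not any particular library's API.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads, rng):
    """x: (seq_len, d_model) input sequence; returns (seq_len, d_model)."""
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads

    # Independent projection weights per head for queries, keys, and values
    # (randomly initialized here purely for illustration).
    W_q = rng.standard_normal((num_heads, d_model, d_head)) / np.sqrt(d_model)
    W_k = rng.standard_normal((num_heads, d_model, d_head)) / np.sqrt(d_model)
    W_v = rng.standard_normal((num_heads, d_model, d_head)) / np.sqrt(d_model)
    W_o = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)

    head_outputs = []
    for h in range(num_heads):
        Q = x @ W_q[h]                      # (seq_len, d_head)
        K = x @ W_k[h]
        V = x @ W_v[h]
        scores = Q @ K.T / np.sqrt(d_head)  # scaled dot-product attention
        weights = softmax(scores, axis=-1)  # each head attends independently
        head_outputs.append(weights @ V)    # (seq_len, d_head)

    # Concatenate the per-head outputs and mix them with an output projection.
    concat = np.concatenate(head_outputs, axis=-1)  # (seq_len, d_model)
    return concat @ W_o

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 16))            # 5 tokens, model dimension 16
out = multi_head_attention(x, num_heads=4, rng=rng)
print(out.shape)                            # (5, 16)
```

Because each head works in a lower-dimensional subspace (d_model / num_heads), the total cost is comparable to single-head attention while allowing the heads to specialize in different patterns.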
Similar Concepts
- attention-based models
- attention-based sequence-to-sequence models
- attentional capture
- attentional focus
- attentional networks
- cross-modal attention
- crossmodal attention
- divided attention
- focus and attention
- generative adversarial networks with attention
- hierarchical attention networks
- recurrent neural networks with attention
- reinforcement learning with attention
- selective attention
- self-attention