Column: 学姐带你玩AI

Kuaishou Machine Learning Algorithm Engineer: First-Round Interview

学姐带你玩AI · WeChat Official Account · 2024-07-11 18:08

Source: reader submission; author: LSC
Editor: 学姐

First-Round Interview

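1. Introduce yourself.

2. Describe your understanding of the recommendation pipeline.

3. Explain how diffusion models work.

4. Write pseudocode for diffusion.

5. Write pseudocode for Masked Attention.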

import math

import torch
import torch.nn.functional as F


def masked_attention(Q, K, V, mask):
    """
    Compute masked scaled dot-product attention.

    Args:
    - Q: query tensor, shape: [batch_size, num_heads, seq_length, head_dim]
    - K: key tensor, shape: [batch_size, num_heads, seq_length, head_dim]
    - V: value tensor, shape: [batch_size, num_heads, seq_length, head_dim]
    - mask: mask tensor used to hide future positions (nonzero = keep, 0 = mask out),
      shape: [batch_size, 1, seq_length, seq_length]

    Returns:
    - output: masked attention output, shape: [batch_size, num_heads, seq_length, head_dim]
    """
    # Dot product of Q and K
    scores = torch.matmul(Q, K.transpose(-1, -2))  # [batch_size, num_heads, seq_length, seq_length]

    # Scale by sqrt(head_dim)
    scores = scores / math.sqrt(Q.size(-1))

    # Apply the mask: positions where mask == 0 are set to -inf,
    # so softmax gives them zero weight (this hides future positions)
    scores = scores.masked_fill(mask == 0, float('-inf'))

    # Normalize over the key dimension with softmax
    attention_weights = F.softmax(scores, dim=-1)  # [batch_size, num_heads, seq_length, seq_length]

    # Apply the attention weights to the value vectors
    output = torch.matmul(attention_weights, V)  # [batch_size, num_heads, seq_length, head_dim]

    return output
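
To make the effect of the mask concrete, here is a minimal single-head sketch in plain Python (no PyTorch). It assumes a causal, lower-triangular mask in which 1 means "may attend" and 0 means "masked out". The helper names `causal_mask`, `softmax`, and `masked_attention_1head`, and the toy values of Q, K, and V, are illustrative assumptions and are not from the original post.

```python
import math


def causal_mask(seq_length):
    """Lower-triangular mask: position i may attend to positions j <= i."""
    return [[1 if j <= i else 0 for j in range(seq_length)]
            for i in range(seq_length)]


def softmax(row):
    """Numerically stable softmax; -inf entries receive exactly zero weight."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention_1head(Q, K, V, mask):
    """Single-head masked attention on nested lists of shape [seq_length, head_dim]."""
    head_dim = len(Q[0])
    output, weights = [], []
    for i, q in enumerate(Q):
        # Scaled dot-product scores of query i against every key
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(head_dim) for k in K]
        # Masked positions get -inf so softmax assigns them zero weight
        scores = [s if mask[i][j] else float("-inf") for j, s in enumerate(scores)]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors
        output.append([sum(w[j] * V[j][d] for j in range(len(V)))
                       for d in range(len(V[0]))])
    return output, weights


# Toy inputs (hypothetical values): 3 positions, head_dim = 2
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

output, weights = masked_attention_1head(Q, K, V, causal_mask(3))
```

With this mask, the first row of `weights` puts all of its mass on position 0, so the first output row equals the first value vector and no later position leaks in. Note that a row whose entries are all masked would produce NaN under softmax; a causal mask avoids this because the diagonal is always kept.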