Machine Translation Explanations

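This notebook demonstrates model explanations for a text-to-text scenario: machine translation with pretrained Transformer models. We show explanations for two different models: English to Spanish (https://hugging-face.cn/Helsinki-NLP/opus-mt-en-es) and English to French (https://hugging-face.cn/Helsinki-NLP/opus-mt-en-fr).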

[1]:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

import shap

English to Spanish model

[2]:

# load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-es")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-es").cuda()

# define the input sentences we want to translate
data = [
    "Transformers have rapidly become the model of choice for NLP problems, replacing older recurrent neural network models"
]
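
The cell above calls `.cuda()`, which assumes a CUDA-capable GPU is present. A minimal sketch of a CPU fallback, assuming PyTorch is the backend; `pick_device` is a hypothetical helper introduced here for illustration and is not part of transformers or shap:

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU to use from this helper.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# Hypothetical usage in place of `.cuda()`:
# model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-es").to(device)
```

Explanations will likely run noticeably slower on CPU, since the explainer evaluates the model on many masked variants of the input.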

Explain the model's predictions

[3]:

# we build an explainer by passing the model we want to explain and
# the tokenizer we want to use to break up the input strings
explainer = shap.Explainer(model, tokenizer)

# explainers are callable, just like models
shap_values = explainer(data, fixed_context=1)

floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
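
For a text-to-text model, the explanation of one sample can be read as a matrix of attributions with one row per input token and one column per output token. The sketch below assumes that layout for `shap_values.values[0]`, which is worth confirming by checking the shapes in your own run. It shows one way to find, for each output token, the input token with the largest absolute attribution. `strongest_input_per_output` is a hypothetical helper, and the numbers are made-up toy values, not results from the model.

```python
def strongest_input_per_output(values, input_tokens, output_tokens):
    """Pair each output token with the input token of largest |attribution|.

    `values` is a nested list indexed as values[input_index][output_index].
    Returns a list of (output_token, input_token) pairs, so repeated output
    tokens (such as "de") are kept separately.
    """
    pairs = []
    for j, out_tok in enumerate(output_tokens):
        column = [row[j] for row in values]
        best = max(range(len(column)), key=lambda i: abs(column[i]))
        pairs.append((out_tok, input_tokens[best]))
    return pairs


# Toy values, invented for illustration only.
input_tokens = ["the", "model"]
output_tokens = ["el", "modelo"]
values = [
    [0.9, 0.1],  # contributions of "the" to "el" and to "modelo"
    [0.2, 1.5],  # contributions of "model" to "el" and to "modelo"
]

links = strongest_input_per_output(values, input_tokens, output_tokens)
print(links)
```

With real explanations, `values` could be taken from `shap_values.values[0].tolist()` and the token lists from the explanation object, assuming the layout described above holds.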

Visualize the SHAP explanations

[4]:

shap.plots.text(shap_values)

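[0]

[SHAP text plot: interactive attribution view for sample 0]

Output tokens: Los | transformador | es | se | han | convertido | rápidamente | en | el | modelo | de | elección | para | problemas | N | LP | , | reemplaza | ndo | modelos | de | red | neuro | nal | recurrente | s | más | antiguos

Input tokens: Transform | ers | have | rapidly | become | the | model | of | choice | for | N | LP | problems | , | replacing | older | recurrent | neural | network | models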
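
The text plot is interactive and renders inline in a notebook. To keep a copy outside the notebook, one option is to ask `shap.plots.text` for the HTML instead of displaying it and write that to a file. The sketch below assumes that `shap.plots.text(shap_values, display=False)` returns an HTML string; `save_text_plot` and the file name are hypothetical, introduced only for illustration.

```python
from pathlib import Path


def save_text_plot(html: str, path: str) -> Path:
    """Wrap an HTML fragment in a minimal page and write it to disk."""
    page = (
        "<html><head><meta charset='utf-8'></head>"
        f"<body>{html}</body></html>"
    )
    out = Path(path)
    out.write_text(page, encoding="utf-8")
    return out


# Assumed usage with the objects from this notebook:
# html = shap.plots.text(shap_values, display=False)
# save_text_plot(html, "en_es_explanation.html")
```

Writing with an explicit UTF-8 encoding keeps accented tokens such as "rápidamente" intact when the page is opened in a browser.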

English to French

[5]:

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-fr").cuda()

[6]:

explainer = shap.Explainer(model, tokenizer)
shap_values = explainer(data)

Partition explainer: 2it [00:12,  6.35s/it]

[7]:

shap.plots.text(shap_values)

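[0]

[SHAP text plot: interactive attribution view for sample 0]

Output tokens: Les | transformateurs | sont | rapidement | devenus | le | modèle | de | choix | pour | les | problèmes | de | N | LP | , | remplaçant | les | anciens | modèles | de | réseaux | neuro | naux | récurrent | s

Input tokens: Trans | former | s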