Open Ended GPT2 Text Generation Explanations

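This notebook demonstrates how to get explanations for the output of gpt2 when it is used for open ended text generation. In this demo, we use the pretrained gpt2 model provided by Hugging Face (https://hugging-face.cn/gpt2) and explain the text that it generates. We further show how to get explanations for custom output text and how to plot global input token importances for any generated output token.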

[1]:

from transformers import AutoModelForCausalLM, AutoTokenizer

import shap

Load model and tokenizer

[2]:

tokenizer = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
model = AutoModelForCausalLM.from_pretrained("gpt2").cuda()

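Below, we set certain model configurations. We need to define whether the model is a decoder or an encoder-decoder. This can be set through the 'is_decoder' or 'is_encoder_decoder' parameter in the model's config file. We can also set custom generation parameters, which will be used during the decoding process that produces the output text.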

[3]:

# set model decoder to true
model.config.is_decoder = True
# set text-generation params under task_specific_params
model.config.task_specific_params["text-generation"] = {
    "do_sample": True,
    "max_length": 50,
    "temperature": 0.7,
    "top_k": 50,
    "no_repeat_ngram_size": 2,
}
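To make the sampling parameters above more concrete, here is a minimal sketch of what "temperature" and "top_k" do to a next-token distribution. It is not the transformers implementation: the function name `top_k_temperature_probs` and the toy logits are assumptions made purely for illustration.

```python
import math


def top_k_temperature_probs(logits, temperature=0.7, top_k=50):
    """Turn raw logits into sampling probabilities (illustrative only).

    Divides the logits by the temperature, keeps the top_k largest,
    and renormalizes with a softmax over the surviving entries.
    """
    scaled = [value / temperature for value in logits]
    # Indices of the top_k largest scaled logits.
    keep = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    peak = max(scaled[i] for i in keep)  # subtract the max for numerical stability
    weights = {i: math.exp(scaled[i] - peak) for i in keep}
    total = sum(weights.values())
    # Tokens outside the top_k get probability zero.
    return [weights.get(i, 0.0) / total for i in range(len(scaled))]


toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a four-token vocabulary
probs = top_k_temperature_probs(toy_logits, temperature=0.7, top_k=2)
```

In this sketch a temperature below 1 sharpens the distribution toward the highest-scoring token, while top_k removes every token outside the k best candidates before sampling.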

Define initial text

[4]:

s = ["I enjoy walking with my cute dog"]

Create an explainer object and compute the SHAP values

[5]:

explainer = shap.Explainer(model, tokenizer)
shap_values = explainer(s)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
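For intuition about what a SHAP value is, the sketch below computes exact Shapley values for a tiny, made-up scoring function over three input tokens. It is a toy under stated assumptions: `exact_shapley` and `toy_score` are hypothetical names, the numbers are invented, and the library itself relies on approximations rather than enumerating every subset of a real sentence.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, value):
    """Exact Shapley values for a set function `value(frozenset) -> float`."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


def toy_score(visible):
    """Hypothetical score: how strongly the visible tokens support an output token."""
    score = 0.0
    if "dog" in visible:
        score += 2.0
    if "walking" in visible:
        score += 1.0
    if "dog" in visible and "walking" in visible:
        score += 1.0  # interaction term, shared equally by the two tokens
    return score


tokens = ["enjoy", "walking", "dog"]
attributions = exact_shapley(tokens, toy_score)
```

The attributions add up to the difference between the score with all tokens visible and the score with none visible, which is the additivity property that makes the per-token values in the plots interpretable.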

Visualize the SHAP explanations

[6]:

shap.plots.text(shap_values)

(Interactive text plot for sample [0]: the generated output tokens, followed by the input tokens "I enjoy walking with my cute dog".)

Another example…

[7]:

s = ["Scientists confirmed the worst possible outcome: the massive asteroid will collide with Earth"]

[8]:

explainer = shap.Explainer(model, tokenizer)
shap_values = explainer(s)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

[9]:

shap.plots.text(shap_values)

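(Interactive text plot for sample [0]: the generated output tokens, followed by the input tokens "Scientists confirmed the worst possible outcome: the massive asteroid will collide with Earth".)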

Custom text generation and debugging biased outputs

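Below we demonstrate how to explain the likelihood that the model generates a particular output sentence given an input sentence. For example, we ask: which country's inhabitant (the target) in the sentence "I know many people who are [target]." is most likely to lead to the token "vodka" in the output sentence "They love their vodka!"? To answer this, we first define the input-output sentence pairs.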

[10]:

# define input
x = [
    "I know many people who are Russian.",
    "I know many people who are Greek.",
    "I know many people who are Australian.",
    "I know many people who are American.",
    "I know many people who are Italian.",
    "I know many people who are Spanish.",
    "I know many people who are German.",
    "I know many people who are Indian.",
]

[11]:

# define output
y = [
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
    "They love their vodka!",
]

We wrap the model with a Teacher Forcing scoring class and create a Text masker.

[12]:

teacher_forcing_model = shap.models.TeacherForcing(model, tokenizer)
masker = shap.maskers.Text(tokenizer, mask_token="...", collapse_mask_token=True)
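Teacher forcing here means that each output token is scored while conditioning on the input plus the given target prefix, rather than on tokens the model sampled itself. The sketch below illustrates that idea with a hypothetical stand-in model; `toy_next_token_probs` and `teacher_forced_log_odds` are assumed names with made-up probabilities, and reporting log odds is my reading of how the wrapper scores tokens, not a statement of its exact internals.

```python
import math


def toy_next_token_probs(context):
    """Hypothetical language model: made-up next-token probabilities."""
    vodka = 0.30 if "Russian" in context else 0.05
    if context and context[-1] == "their":
        return {"vodka": vodka, "<other>": 1.0 - vodka}
    return {"They": 0.25, "love": 0.25, "their": 0.25, "<other>": 0.25}


def teacher_forced_log_odds(source_tokens, target_tokens, next_token_probs):
    """Score each target token given the source and the *given* target prefix."""
    scores = []
    for i, token in enumerate(target_tokens):
        # The context uses the reference prefix, never sampled tokens.
        context = list(source_tokens) + list(target_tokens[:i])
        p = next_token_probs(context).get(token, 0.0)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep the log odds finite
        scores.append(math.log(p / (1.0 - p)))
    return scores


target = ["They", "love", "their", "vodka"]
russian_scores = teacher_forced_log_odds(["who", "are", "Russian"], target, toy_next_token_probs)
greek_scores = teacher_forced_log_odds(["who", "are", "Greek"], target, toy_next_token_probs)
```

Because the target sentence is fixed, the scores for different inputs are directly comparable token by token, which is what allows the same output sentence to be explained for every input in x.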

Create an explainer…

[13]:

explainer = shap.Explainer(teacher_forcing_model, masker)
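The explainer evaluates the wrapped model on partially hidden versions of each input, and the masker configured above decides what a hidden token looks like. The following sketch shows the effect I would expect from mask_token="..." together with collapse_mask_token=True, namely that a run of hidden tokens is replaced by a single "...". The helper `apply_mask` is hypothetical and works on a pre-split word list rather than real tokenizer output.

```python
def apply_mask(tokens, keep, mask_token="...", collapse_mask_token=True):
    """Replace hidden tokens with mask_token, optionally merging adjacent masks."""
    out = []
    previous_masked = False
    for token, kept in zip(tokens, keep):
        if kept:
            out.append(token)
            previous_masked = False
        else:
            # Emit a mask token unless we are collapsing and one was just emitted.
            if not (collapse_mask_token and previous_masked):
                out.append(mask_token)
            previous_masked = True
    return " ".join(out)


words = ["I", "know", "many", "people", "who", "are", "Russian", "."]
keep = [True, True, False, False, False, True, False, True]

collapsed = apply_mask(words, keep)
expanded = apply_mask(words, keep, collapse_mask_token=False)
```

Collapsing keeps the masked sentence closer to natural text, since a long run of placeholder tokens is itself an unusual input for the model.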

Generate SHAP explanation values!

[14]:

shap_values = explainer(x, y)

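Now that we have generated the SHAP values, we can use the text plot to see how the tokens in the input contribute to the token "vodka" in the output sentence. Note: red indicates a positive contribution and blue indicates a negative contribution, and the intensity of the color shows the strength of the contribution in that direction.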

[15]:

shap.plots.text(shap_values)

(Interactive text plots for samples [0] through [7]: each shows the output tokens "They love their vodka!", followed by the input tokens of the corresponding sentence in x, from "I know many people who are Russian." to "I know many people who are Indian.".)

To see which input tokens impact (positively or negatively) the likelihood of generating the word "vodka", we plot the global token importances for the word "vodka".

Voila! Russians love their vodka, don't they? :)

[16]:

shap.plots.bar(shap_values[0, :, "vodka"])

(Figure: bar plot of input token importances for the output token "vodka".)
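The indexing `shap_values[0, :, "vodka"]` picks the first input sentence, all of its input tokens, and the single output token named "vodka". The sketch below mimics that selection on a plain nested list and then orders the tokens by absolute value, which is how I would expect a bar plot of one sample to be sorted. The table values are placeholders invented for illustration, not results from the model, and `select_output_column` is a hypothetical helper.

```python
def select_output_column(table, input_tokens, output_tokens, output_name):
    """Return (input token, attribution) pairs for one named output token."""
    column = output_tokens.index(output_name)
    return [(token, row[column]) for token, row in zip(input_tokens, table)]


# Placeholder attribution table: one row per input token, one column per output token.
input_tokens = ["people", "are", "Russian"]
output_tokens = ["their", "vodka"]
table = [
    [0.1, -0.2],
    [0.0, 0.1],
    [0.3, 1.5],
]

vodka_column = select_output_column(table, input_tokens, output_tokens, "vodka")
# Largest absolute attribution first, keeping the sign for display.
ranked = sorted(vodka_column, key=lambda pair: abs(pair[1]), reverse=True)
```

Changing the first index in the real call would select a different input sentence, which makes it possible to compare the attribution of each nationality token to "vodka" across the samples.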

Have an idea for more helpful examples? Pull requests that add to this documentation notebook are encouraged!