Using custom functions and tokenizers

This notebook demonstrates how to use the Partition explainer for a multiclass text classification scenario where we are using a custom python function as our model.

[1]:
import datasets
import numpy as np
import pandas as pd
import scipy as sp
import torch
import transformers

import shap

# load the emotion dataset
dataset = datasets.load_dataset("emotion", split="train")
data = pd.DataFrame({"text": dataset["text"], "emotion": dataset["label"]})
Using custom data configuration default
Reusing dataset emotion (/home/slundberg/.cache/huggingface/datasets/emotion/default/0.0.0/aa34462255cd487d04be8387a2d572588f6ceee23f784f37365aa714afeb8fe6)

Define our model

While we use the transformers package here, any python function that takes a list of strings and outputs scores will work.

[2]:
# load the model and tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained("nateraw/bert-base-uncased-emotion", use_fast=True)
model = transformers.AutoModelForSequenceClassification.from_pretrained("nateraw/bert-base-uncased-emotion").cuda()
labels = sorted(model.config.label2id, key=model.config.label2id.get)


# this defines an explicit python function that takes a list of strings and outputs scores for each class
def f(x):
    tv = torch.tensor([tokenizer.encode(v, padding="max_length", max_length=128, truncation=True) for v in x]).cuda()
    attention_mask = (tv != 0).type(torch.int64).cuda()
    outputs = model(tv, attention_mask=attention_mask)[0].detach().cpu().numpy()
    scores = (np.exp(outputs).T / np.exp(outputs).sum(-1)).T
    val = sp.special.logit(scores)
    return val
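Any function with this interface works, not just a transformers model. As a minimal sketch (using a hypothetical keyword-count scorer in place of a trained classifier), a custom model function only needs to map a list of strings to an array of per-class logit scores:

```python
import numpy as np
from scipy.special import logit, softmax

# hypothetical keyword lists standing in for a trained classifier
KEYWORDS = {
    "sadness": {"sad", "hopeless", "humiliated"},
    "joy": {"happy", "hopeful", "glad"},
}
CLASSES = list(KEYWORDS)


def toy_model(texts):
    """Take a list of strings; return an (n_texts, n_classes) array of logits."""
    raw = np.array(
        [[sum(w in KEYWORDS[c] for w in t.lower().split()) for c in CLASSES] for t in texts],
        dtype=float,
    )
    probs = softmax(raw, axis=-1)  # normalize keyword counts to probabilities
    return logit(probs)  # same logit output space as f above


scores = toy_model(["i feel hopeless", "so happy and glad"])
print(scores.shape)  # (2, 2)
```

Because the output shape is `(n_texts, n_classes)`, SHAP can explain each output class separately, exactly as it does for `f` above.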

Create an explainer

To build an Explainer we need both a model and a masker (the masker specifies how to hide portions of the input). Since we are using a custom function as our model, SHAP cannot infer a masker automatically, so we need to provide one, either implicitly by passing a transformers tokenizer, or explicitly by building a shap.maskers.Text object.

[3]:
method = "custom tokenizer"

# build an explainer by passing a transformers tokenizer
if method == "transformers tokenizer":
    explainer = shap.Explainer(f, tokenizer, output_names=labels)

# build an explainer by explicitly creating a masker
elif method == "default masker":
    masker = shap.maskers.Text(r"\W")  # this will create a basic whitespace tokenizer
    explainer = shap.Explainer(f, masker, output_names=labels)

# build a fully custom tokenizer
elif method == "custom tokenizer":
    import re

    def custom_tokenizer(s, return_offsets_mapping=True):
        """Custom tokenizers conform to a subset of the transformers API."""
        pos = 0
        offset_ranges = []
        input_ids = []
        for m in re.finditer(r"\W", s):
            start, end = m.span(0)
            offset_ranges.append((pos, start))
            input_ids.append(s[pos:start])
            pos = end
        if pos != len(s):
            offset_ranges.append((pos, len(s)))
            input_ids.append(s[pos:])
        out = {}
        out["input_ids"] = input_ids
        if return_offsets_mapping:
            out["offset_mapping"] = offset_ranges
        return out

    masker = shap.maskers.Text(custom_tokenizer)
    explainer = shap.Explainer(f, masker, output_names=labels)
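As a quick sanity check, the custom tokenizer above splits on non-word characters and reports the character span of each token. Reproducing it standalone and running it on a sample sentence:

```python
import re


def custom_tokenizer(s, return_offsets_mapping=True):
    """Same splitter as above: tokens are maximal runs of word characters."""
    pos = 0
    offset_ranges = []
    input_ids = []
    for m in re.finditer(r"\W", s):
        start, end = m.span(0)
        offset_ranges.append((pos, start))
        input_ids.append(s[pos:start])
        pos = end
    if pos != len(s):
        offset_ranges.append((pos, len(s)))
        input_ids.append(s[pos:])
    out = {"input_ids": input_ids}
    if return_offsets_mapping:
        out["offset_mapping"] = offset_ranges
    return out


out = custom_tokenizer("i didnt feel humiliated")
print(out["input_ids"])  # ['i', 'didnt', 'feel', 'humiliated']
print(out["offset_mapping"])  # [(0, 1), (2, 7), (8, 12), (13, 23)]
```

The `offset_mapping` entries are what let SHAP map each token's contribution back onto the exact character span in the original string.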

Compute SHAP values

Explainers have the same method signature as the model they are explaining, so we just pass in a list of strings for which to explain the classifications.

[4]:
shap_values = explainer(data["text"][:3])

Visualize the impact on all the output classes

In the plots below, when you hover over an output class you get the explanation for that class. When you click an output class name, that class remains the focus of the explanation visualization until you click another class.

The base value is what the model outputs when the entire input text is masked, while \(f_{\text{output class}}(\text{inputs})\) is the output of the model for the full original input. The SHAP values explain, in an additive fashion, how the impact of unmasking each word changes the model output from the base value (where the entire input is masked) to the final predicted value.
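This additivity can be checked directly: the base value plus the sum of the per-token SHAP values reproduces the model output for the full input. Using the (rounded) values reported for the first explained sentence, "i didnt feel humiliated":

```python
import numpy as np

# rounded values as displayed in the plot for the first example
base_value = -1.84234
token_shap = np.array([-0.197, -0.092, 1.217, 6.54])  # i, didnt, feel, humiliated
f_full_input = 5.62544  # f_sadness on the unmasked text

# additivity: base value + sum of SHAP values ≈ model output (up to display rounding)
reconstructed = base_value + token_shap.sum()
print(round(reconstructed, 3))  # ≈ 5.626
```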

[5]:
shap.plots.text(shap_values)


[Interactive text plots for the three explained inputs. Hovering over an output class (sadness, joy, anger, fear, surprise) shows that class's explanation; the per-token SHAP values shown for the sadness class are:

[0] base value -1.842, f_sadness(inputs) = 5.625: "humiliated" +6.54, "feel" +1.217, "i" -0.197, "didnt" -0.092
[1] base value -1.842, f_sadness(inputs) = 5.354: "hopeless" +8.14, "feeling" +1.899, "hopeful" -1.984, plus smaller contributions from the remaining tokens
[2] base value -1.842, f_sadness(inputs) = -6.083: "greedy" -3.75, "i" +0.545, "wrong" -0.421, "feel" -0.367, plus smaller contributions from the remaining tokens]
Have an idea for more helpful examples? Pull requests that add to this documentation notebook are encouraged!