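Machine Translation Explanations
This notebook demonstrates model explanations for a text-to-text scenario, using pretrained Transformer models for machine translation. It shows explanations for two different models: English to Spanish (https://hugging-face.cn/Helsinki-NLP/opus-mt-en-es) and English to French (https://hugging-face.cn/Helsinki-NLP/opus-mt-en-fr).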
[1]:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import shap
English to Spanish model
[2]:
# load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-es")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-es").cuda()
# define the input sentences we want to translate
data = [
"Transformers have rapidly become the model of choice for NLP problems, replacing older recurrent neural network models"
]
Explain the model's predictions
[3]:
# we build an explainer by passing the model we want to explain and
# the tokenizer we want to use to break up the input strings
explainer = shap.Explainer(model, tokenizer)
# explainers are callable, just like models
shap_values = explainer(data, fixed_context=1)
floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
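The explainer call in the cell above attributes the score of each output token to the input tokens. As a rough intuition only, the sketch below computes exact Shapley values by brute force for a tiny, hypothetical scoring function (`toy_score`, which stands in for the model's score for one output token when only some input tokens are visible). It is not what `shap.Explainer` does internally: the progress output in this notebook names a Partition explainer, which approximates attributions instead of enumerating every ordering. All names in the block are illustrative assumptions.

```python
from itertools import permutations


def shapley_values(n_tokens, score):
    """Exact Shapley values by averaging marginal contributions over all
    orderings of the tokens. Only feasible for very small inputs."""
    totals = [0.0] * n_tokens
    orderings = list(permutations(range(n_tokens)))
    for order in orderings:
        present = set()          # indices of tokens that are currently unmasked
        prev = score(present)
        for i in order:
            present.add(i)
            cur = score(present)
            totals[i] += cur - prev   # marginal contribution of token i
            prev = cur
    return [t / len(orderings) for t in totals]


# Hypothetical input tokens and scoring function (not taken from the real model).
TOKENS = ["Transformers", "have", "rapidly"]


def toy_score(present):
    s = 0.0
    if 0 in present:
        s += 2.0                      # "Transformers" alone helps a lot
    if 2 in present:
        s += 0.5                      # "rapidly" helps a little
    if 0 in present and 1 in present:
        s += 1.0                      # interaction between tokens 0 and 1
    return s


values = shapley_values(len(TOKENS), toy_score)
for token, value in zip(TOKENS, values):
    print(f"{token}: {value}")
```

The interaction term is split evenly between the two tokens that take part in it, and the values add up to the difference between the fully unmasked score and the fully masked score, which is the additivity property that SHAP attributions are built on.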
Visualize the SHAP explanations
[4]:
shap.plots.text(shap_values)
[0]
[Interactive SHAP text plot. Output tokens: Los transformador es se han convertido rápidamente en el modelo de elección para problemas N LP , reemplaza ndo modelos de red neuro nal recurrente s más antiguos. Selecting an output token shows the SHAP value of each input token: Transform, ers, have, rapidly, become, the, model, of, choice, for, N, LP, problems, ",", replacing, older, recurrent, neural, network, models.]
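One way to read the plot is to ask, for each output token, which input token received the largest attribution. The sketch below does this in plain Python for three output tokens, using a handful of values transcribed from the English-to-Spanish text plot output of this notebook (only three input tokens are kept per row, so this is a partial view). The dictionary layout and the helper name `top_input_token` are assumptions made for illustration, not part of the SHAP API.

```python
# Output token -> {input token: SHAP value}, transcribed from the plot output.
# Only three of the input tokens are kept for each output token.
attributions = {
    "rápidamente": {"rapidly": 12.239, "model": -0.227, "choice": 0.125},
    "modelo": {"rapidly": 0.385, "model": 8.907, "choice": 0.934},
    "elección": {"rapidly": -0.49, "model": 3.675, "choice": 13.188},
}


def top_input_token(row):
    """Return the input token with the largest (most positive) attribution."""
    return max(row, key=row.get)


strongest = {out: top_input_token(row) for out, row in attributions.items()}
print(strongest)
```

In this subset, each Spanish output token is driven mainly by its English counterpart, which is the pattern the plot makes visible.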
English to French
[5]:
# load the English-to-French model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-fr").cuda()
[6]:
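# build a new explainer for the French model and explain the same sentence
# (no fixed_context argument is passed this time)
explainer = shap.Explainer(model, tokenizer)
shap_values = explainer(data)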
Partition explainer: 2it [00:12, 6.35s/it]
[7]:
shap.plots.text(shap_values)
[0]
[Interactive SHAP text plot. Output tokens: Les transformateurs sont rapidement devenus le modèle de choix pour les problèmes de N LP , remplaçant les anciens modèles de réseaux neuro naux récurrent s. Selecting an output token shows the SHAP value of each input token: Trans, former, s, have, rapidly, become the, model, of, choice, for, NLP, problems, ",", replacing, older, recurrent, ne, ural, network, models. The grouped input tokens "become the" and "NLP" carry labels of the form "value / 2".]
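In the English-to-French plot, some input tokens are shown as a group ("become the", "NLP") with a label such as "0.9 / 2". A likely reading, stated here as an assumption and not confirmed by this notebook, is that the first number is the attribution of the whole group and the second is the number of tokens sharing it. Under that assumption, the sketch below parses such labels and computes an equal per-token share. The helper names are hypothetical; the labels are taken from the first output token of the French plot.

```python
def parse_label(label):
    """Split a plot label such as '0.9 / 2' into (group_total, group_size).
    A plain label such as '-0.292' is treated as a group of size 1."""
    if "/" in label:
        total, size = label.split("/")
        return float(total), int(size)
    return float(label), 1


def per_token_share(label):
    """Equal share of the group attribution per token (assumed interpretation)."""
    total, size = parse_label(label)
    return total / size


# Labels transcribed from the plot output for the first French output token.
labels = {"become the": "0.9 / 2", "NLP": "0.205 / 2", "model": "-0.292"}
shares = {token: per_token_share(label) for token, label in labels.items()}
print(shares)
```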
Have an idea for more helpful examples? Pull requests that add to this documentation notebook are welcome!