[crazy][hobby][spam] Automated Reverse Engineering

k gmkarl at gmail.com
Sat Jan 1 14:35:38 PST 2022


i'm looking at the flax summarization example,
https://github.com/huggingface/transformers/blob/master/examples/flax/summarization/run_summarization_flax.py#L534
and noting that the decoder input ids are the labels shifted right by
one position.  i'm thinking that summarization is basically the same
as translation: both are seq2seq.  don't really know.
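
to make the shifting concrete, here's a rough numpy sketch of what i
mean.  this is not their code: the function name `shift_right`, the
token ids, and the start token id of 1 are all made up for
illustration.

```python
import numpy as np

def shift_right(labels, decoder_start_token_id):
    """Build decoder inputs by shifting labels one position to the right.

    Hypothetical helper for illustration, not the library's function.
    """
    labels = np.asarray(labels)
    shifted = np.empty_like(labels)
    shifted[:, 1:] = labels[:, :-1]          # label at step t becomes the input at step t+1
    shifted[:, 0] = decoder_start_token_id   # every sequence begins with the start token
    return shifted

# two toy target sequences (made-up token ids)
labels = np.array([[5, 6, 7, 2],
                   [8, 9, 2, 0]])
decoder_input_ids = shift_right(labels, decoder_start_token_id=1)
print(decoder_input_ids)
```

so at each step the decoder sees the previous target token and is
trained to predict the current one.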

here's their flax loss function:
https://github.com/huggingface/transformers/blob/master/examples/flax/summarization/run_summarization_flax.py#L688
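
i haven't copied their function here.  as a rough idea of what a
seq2seq loss of this kind usually looks like, here's a numpy sketch of
token-level cross entropy averaged over non-padding positions.  the
name `masked_cross_entropy`, the shapes, and the toy values are my
assumptions, and whatever extras their version has are left out.

```python
import numpy as np

def masked_cross_entropy(logits, labels, padding_mask):
    """Mean negative log-likelihood over positions where padding_mask == 1.

    logits: (batch, seq_len, vocab), labels and padding_mask: (batch, seq_len).
    Hypothetical helper for illustration only.
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    padding_mask = np.asarray(padding_mask, dtype=np.float64)
    # numerically stable log-softmax over the vocabulary axis
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # negative log-probability assigned to each target token
    token_nll = -np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    # zero out padded positions, then average over the real tokens
    return (token_nll * padding_mask).sum() / padding_mask.sum()

# toy batch: 1 sequence, 3 positions, vocabulary of 4, last position is padding
logits = np.zeros((1, 3, 4))
labels = np.array([[1, 2, 0]])
padding_mask = np.array([[1, 1, 0]])
loss = masked_cross_entropy(logits, labels, padding_mask)
print(loss)
```

with all-zero logits the prediction is uniform over 4 tokens, so the
loss comes out as log(4), and the padded position doesn't contribute.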

and at https://github.com/huggingface/transformers/blob/master/examples/flax/summarization/run_summarization_flax.py#L714
the labels are removed from the batch ('pop'), so i'm thinking they
are _not_ passed to the model.  that lines up with not seeing a labels
parameter in the source code for the flax model.
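
a tiny plain-python sketch of the pattern i think i'm seeing.
`toy_model` is a stand-in i made up (the real model is not called like
this exactly), and the batch values are invented.

```python
def toy_model(input_ids, attention_mask, decoder_input_ids):
    """Stand-in for a model whose call signature has no `labels` parameter."""
    # return one fake row of "logits" per decoder position
    return [[0.0, 0.0] for _ in decoder_input_ids]

batch = {
    "input_ids": [3, 4, 5],
    "attention_mask": [1, 1, 1],
    "decoder_input_ids": [1, 7, 8],
    "labels": [7, 8, 2],
}

labels = batch.pop("labels")   # take the labels out so they are not forwarded
logits = toy_model(**batch)    # would raise TypeError if "labels" were still in batch
print(len(logits), labels)
```

the labels are then only used outside the model, when computing the
loss from the logits.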

