29 Dec 2021
9:31 p.m.
Ok, this was great motion. I think vanilla transformer models have a maximum sequence length. That limit can be pushed out by altering the algorithm so the attention function isn't O(n^2) in memory; there's a paper out there on one approach to this. Another idea is to chunk the text in some way and train models to handle the chunked text. That could even become two small projects instead of one if something like reinforcement learning is used to make the chunking effective.
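A rough sketch of the sub-quadratic-memory idea, assuming the approach in question is something like the chunked/online-softmax trick from Rabe & Staats, "Self-attention Does Not Need O(n^2) Memory" (2021): process keys/values in chunks while keeping running softmax statistics, so the full n x n score matrix is never materialized. Single head, no masking, NumPy only; the function name and sizes are just illustrative.

```python
import numpy as np

def chunked_attention(q, k, v, chunk_size=128):
    """Attention over (n, d) arrays without building the full (n, n)
    score matrix: keys/values are consumed in chunks with a running
    (online) softmax, so peak extra memory is O(n * chunk_size)."""
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)
    m = np.full((n, 1), -np.inf)        # running max of scores per query
    l = np.zeros((n, 1))                # running sum of exp(scores - m)
    out = np.zeros((n, v.shape[1]))     # running weighted sum of values
    for start in range(0, k.shape[0], chunk_size):
        k_c = k[start:start + chunk_size]
        v_c = v[start:start + chunk_size]
        s = (q @ k_c.T) * scale                         # (n, chunk) scores
        m_new = np.maximum(m, s.max(axis=1, keepdims=True))
        alpha = np.exp(m - m_new)       # rescale old accumulators to new max
        p = np.exp(s - m_new)
        l = l * alpha + p.sum(axis=1, keepdims=True)
        out = out * alpha + p @ v_c
        m = m_new
    return out / l

# quick check against the naive O(n^2)-memory version
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((512, 64)) for _ in range(3))
s = (q @ k.T) / np.sqrt(64)
p = np.exp(s - s.max(axis=1, keepdims=True))
naive = (p / p.sum(axis=1, keepdims=True)) @ v
assert np.allclose(chunked_attention(q, k, v), naive, atol=1e-6)
```

The output matches ordinary softmax attention exactly (up to floating point); only the memory profile changes.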
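And a minimal sketch of the chunking idea, under one possible interpretation: split a long token sequence into fixed-size overlapping windows so each piece fits under the model's maximum sequence length. The helper name and the window/overlap sizes are made up here.

```python
def chunk_tokens(tokens, max_len=512, overlap=64):
    """Yield overlapping windows of at most max_len tokens; the overlap
    carries some context from one chunk into the next. The -overlap in
    the range bound avoids emitting a final chunk that is entirely
    contained in the previous one."""
    step = max_len - overlap
    for start in range(0, max(len(tokens) - overlap, 1), step):
        yield tokens[start:start + max_len]

chunks = list(chunk_tokens(list(range(2000)), max_len=512, overlap=64))
assert all(len(c) <= 512 for c in chunks)
assert chunks[1][0] == 512 - 64   # each chunk restarts `overlap` tokens back
```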