[ot][personal] updates on aiml discovery

Wed Feb 15 03:43:27 PST 2023

i wanted to learn about huggingface’s rlhf project and visited their
blog. no post yet.

[note: i used to prefer the opencog approach to AI which needs no
nvidia cards. i would add subcontextual knowledgesets and a sugar
interface to simplify names and use.]

- i found huggingface’s peft (parameter-efficient fine-tuning) work.
their peft library is a more integrated form of adapters that they are
actively working on. just like adapters it lets you train large models
on low-end systems by working off pretrained ones,

- i found their demo of decision transformers. this is a paper from
2021 which normalizes a structure for using autoregressive models as
reinforcement learning policies (they plug gpt2 into an agent that
learns to walk, without trial and error, using only recorded data). since this is
integrated into formal and normative instructions i’m interested in
exploring it, and planning to. we’ll see how this share influences
that decision,
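
a rough sketch of the data side of decision transformers, as i understand it from the paper: each recorded episode is turned into a sequence of (return-to-go, state, action) entries, and the autoregressive model is trained to predict the next action. the function names and the toy episode below are made up for illustration and are not huggingface’s api.

```python
# hypothetical sketch: prepare one recorded episode the way a
# decision transformer expects it. no model or training here.

def returns_to_go(rewards):
    """For each timestep, sum the rewards from that step to the episode end."""
    rtg = []
    running = 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave(states, actions, rewards):
    """Build the (return-to-go, state, action) sequence for one episode."""
    sequence = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        sequence.append(("return_to_go", g))
        sequence.append(("state", s))
        sequence.append(("action", a))
    return sequence


# toy recorded episode (assumed data, stands in for logged walker rollouts)
episode_states = ["s0", "s1", "s2"]
episode_actions = ["a0", "a1", "a2"]
episode_rewards = [1.0, 0.0, 2.0]

tokens = interleave(episode_states, episode_actions, episode_rewards)
for kind, value in tokens:
    print(kind, value)
```

at inference time the idea is to put a desired return in the first slot and let the model fill in actions, lowering the return-to-go by each reward received.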