[ot][personal] updates on aiml discovery
i wanted to learn about huggingface’s rlhf project and visited their blog. no post yet. [note: i used to prefer the opencog approach to AI, which needs no nvidia cards. i would add subcontextual knowledgesets and a sugar interface to simplify names and use.]
- i found huggingface’s peft work. their peft library is a more integrated form of adapters that they are actively developing. like adapters, it lets you train large models on low-end systems by training only a small number of added parameters on top of pretrained ones.
- i found their demo of decision transformers. this is a 2021 paper that standardizes a structure for using autoregressive models as reinforcement learning policies (they plug gpt-2 into an agent that learns to walk, without trial and error, using only recorded data). since this is now integrated into formally documented, standard tooling, i’m interested in exploring it, and planning to. we’ll see how this share influences that decision.
regarding peft, it is sad that huggingface rejected adapterhub’s attempt to reach out and combine codebases, and instead implemented their own solution independently. i briefly engaged with them around this on an issue i opened and closed this morning. i failed to draw their attention sufficiently to adapterhub’s prior attempt to work together. it kind of looks like they wanted to do it completely differently.
it is much harder for me to continue learning about decision transformers after sharing these things. i’m still deciding whether to work publicly or privately.
i skimmed through decision transformers and have a simple conception of it, and i made https://github.com/xloem/datadecisions which is just a stub with a wrapper providing for multi-valued reward. it was pleasant to accomplish something. i have an appointment today.
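to show the interface the stub wraps, here is a sketch using the decision transformer model shipped in `transformers`. the model conditions action predictions on returns-to-go; the multi-valued-reward part here is my own hypothetical illustration (a weight vector collapsing several reward channels into the scalar return stream), not necessarily what the datadecisions repo does.

```python
import torch
from transformers import DecisionTransformerConfig, DecisionTransformerModel

# small random model: 3-dim states, 2-dim actions
config = DecisionTransformerConfig(state_dim=3, act_dim=2,
                                   hidden_size=64, n_layer=2, n_head=2)
model = DecisionTransformerModel(config)

B, T = 1, 5  # batch size, trajectory length

# hypothetical multi-valued reward: two channels per step, collapsed to a
# scalar with a weight vector, then summed over the future (returns-to-go)
rewards_vec = torch.randn(B, T, 2)
weights = torch.tensor([0.7, 0.3])
scalar_rewards = (rewards_vec * weights).sum(-1, keepdim=True)
returns_to_go = scalar_rewards.flip(1).cumsum(1).flip(1)  # suffix sums

out = model(
    states=torch.randn(B, T, 3),
    actions=torch.randn(B, T, 2),
    rewards=scalar_rewards,
    returns_to_go=returns_to_go,
    timesteps=torch.arange(T).unsqueeze(0),
    attention_mask=torch.ones(B, T, dtype=torch.long),
)
print(out.action_preds.shape)  # one predicted action per timestep
```

in the paper’s setup you feed a desired return at inference time and the model autoregressively emits actions it associates with achieving it, which is the "no trial and error, only recorded data" property.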
after i posted my small complaint about hf’s handling of peft, somebody linked to colossalai in an unrelated discord where i hadn’t mentioned this. colossalai is a similar model-training acceleration framework with some support for peft that says it is looking to connect with other contributors. it is based on energonai for inference, which appears to have a unified architecture for popular gpt, opt, and bert pretrained models ( https://github.com/hpcaitech/EnergonAI/blob/master/energonai/model/model_fac... ). the colossalai news is that they are actively developing yet another clone of the chatgpt training framework using their acceleration library. https://github.com/hpcaitech/ColossalAI/tree/main/applications/ChatGPT
I tried posting an idea publicly at https://github.com/hwchase17/langchain/issues/1111#issue-1589016921 . quote:
Hi,
I would like to propose a relatively simple idea to increase the power of the library. I am not an ML researcher, so would appreciate constructive feedback on where the idea goes.
If langchain’s existing caching mechanism were augmented with optional output labels, a chain could be made that would generate improved prompts by prompting with past examples of model behavior. This would be especially useful for transferring existing example code to new models or domains.
A first step might be to add an API interface and cache field for users to report back on the quality of an output or provide a better output.
Then, a chain could be designed and contributed that makes use of this data to provide improved prompts.
Finally, this chain could be integrated into the general prompt system, so that prompts which underperform could be improved automatically.
This would require a degree of generalization comparable to what is already present in the project’s subsystems.
What do you think?
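for concreteness, a hypothetical sketch of the proposal in plain python (none of these names are langchain’s API): a cache that stores a quality label alongside each output, and a prompt builder that folds the best-rated entries back into the prompt as few-shot examples.

```python
from dataclasses import dataclass, field

@dataclass
class LabeledCache:
    """Hypothetical cache augmented with user-reported quality labels."""
    entries: list = field(default_factory=list)  # (prompt, output, rating)

    def report(self, prompt, output, rating):
        self.entries.append((prompt, output, rating))

    def best(self, n=2):
        # highest-rated entries become few-shot demonstrations
        return sorted(self.entries, key=lambda e: e[2], reverse=True)[:n]

def improved_prompt(cache, new_input):
    """Prepend the best past examples of model behavior to a new prompt."""
    examples = "\n".join(f"Input: {p}\nOutput: {o}" for p, o, _ in cache.best())
    return f"{examples}\nInput: {new_input}\nOutput:"

cache = LabeledCache()
cache.report("2+2", "4", rating=1.0)   # user confirmed a good output
cache.report("1+1", "2", rating=0.9)
cache.report("2+2", "5", rating=0.0)   # user flagged a bad output
print(improved_prompt(cache, "3+3"))
```

the badly-rated output never reaches the new prompt, which is the sense in which labeled caching would transfer working behavior to new models or domains.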
Undescribed Horrific Abuse, One Victim & Survivor of Many