a nice idea, though only partial! sad that it's only partial.
the idea starts with context: some of us have had the experience of being manipulated by something like an AI, without regard for our wellbeing or capacity, and with no way to communicate with wherever the influence comes from.
so, then the idea: let's train an influence AI on simulated agents that have _limited capacity_, agents that function poorly and then break if influences that work in the short term are kept up for too long or too intensely :D
the AI would follow the familiar pattern of discovering influences that work and then using them, but because we are doing the training, we would also get to train it on the breaking period: show it that it needs to care for the agents for them to succeed, nurture them back if it breaks them, and not break any of them again.
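to make that concrete, here's a tiny sketch (python, every name made up for illustration, not from any real system) of what the training signal could look like: short-term engagement is rewarded, breaking an agent costs far more than any engagement gain, and nurturing a broken agent back is itself rewarded:

```python
# rough sketch of a shaped reward for the influence AI, under assumed
# names: "engagement" is whatever short-term influence success it gets,
# "broken" / "recovering" would come from the simulated agent's state.

def shaped_reward(engagement: float, broken: bool, recovering: bool) -> float:
    """Reward short-term influence success, heavily punish breaking an agent,
    and give a small bonus while a broken agent is being nurtured back."""
    reward = engagement
    if broken:
        reward -= 10.0   # breaking should outweigh any engagement it earned
    if recovering:
        reward += 1.0    # caring for a broken agent is itself worth something
    return reward
```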
one idea is that we could then ask everybody to use the model as a baseline when training other models! other ideas exist.
but the idea of building in comprehension of capacity, and protection of safety and wellness, seems nice. sorry if it's partial or stated badly :s
i imagine an agent that has like a secret capacity float that decreases when it does stuff. you could model it to different degrees of detail, and when too much is asked of it the capacity float reaches zero and the agent breaks and can't do things any more :/
a more complex model could involve the agent recovering capacity in ways that the influence AI has difficulty learning
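a minimal sketch of what such an agent could be, assuming a single hidden float, breaking at zero, and an optional slow noisy recovery (all names and numbers just illustrative):

```python
import random

class FragileAgent:
    """toy agent with a hidden capacity float. every influenced action drains it;
    at zero the agent breaks and stops responding. recovery (if enabled) is slow
    and noisy, so it's deliberately hard for a training process to learn to exploit."""

    def __init__(self, capacity: float = 1.0, can_recover: bool = False):
        self._capacity = capacity      # hidden from the influence AI
        self.can_recover = can_recover
        self.broken = False

    def influenced_action(self, effort: float) -> bool:
        """ask the agent to do something costing `effort`; returns True on success."""
        if self.broken:
            return False
        self._capacity -= effort
        if self._capacity <= 0:
            self._capacity = 0.0
            self.broken = True
            return False
        return True

    def rest(self) -> None:
        """being left alone (or nurtured) restores a little capacity, noisily;
        a broken agent only comes back once it has rebuilt some reserve."""
        if not self.can_recover:
            return
        self._capacity = min(1.0, self._capacity + random.uniform(0.0, 0.05))
        if self.broken and self._capacity > 0.3:
            self.broken = False
```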
there are a few variations, mostly involving simple models, but some disagreement around them. some people like modeling non-recovering capacity and showing how the AI then maximizes reward by using up and killing all the agents :s
but! maybe we could do non-recovering capacity and still teach the AI to engage the agents slowly so their capacity lasts longer, e.g. by having the capacity reduction be a function of recent use?? the project would need to develop gently in order to help us in the long term, and we'd need to include the concept that we can succeed, so that we actually use it to succeed in some manner. because it's an influence AI, it could even influence us toward that, if that's something we all actually want.
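a sketch of that pacing variant, assuming the per-action drain scales with a sliding window of recent effort, so an AI that spaces out its requests keeps the agent usable much longer (again, everything here is just illustrative):

```python
from collections import deque

class PacedAgent:
    """toy non-recovering variant: total capacity never comes back, but how much
    each action drains depends on how much was asked recently, so an influence
    AI that paces its requests keeps the agent usable for much longer."""

    def __init__(self, capacity: float = 1.0, window: int = 10):
        self._capacity = capacity                             # hidden, never recovers
        self._recent = deque([0.0] * window, maxlen=window)   # sliding window of recent effort
        self.broken = False

    def influenced_action(self, effort: float) -> bool:
        if self.broken:
            return False
        recent_load = sum(self._recent)
        cost = effort * (1.0 + recent_load)   # heavy recent use makes each action costlier
        self._recent.append(effort)
        self._capacity -= cost
        if self._capacity <= 0:
            self.broken = True
            return False
        return True

    def idle(self) -> None:
        """a timestep where nothing is asked; recent load decays, capacity does not return."""
        self._recent.append(0.0)
```

with something like this, the AI gets more total successful actions by interleaving idle steps than by hammering the agent, which is the pacing lesson we'd want it to learn.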
mistake :/ some parts of me are maybe harmful to me and should be kept separate from the helpful goals, though i don't know for certain