The original model defined the objective as minimizing communications costs while getting an estimate of the environment state, and sharing that estimate with all nodes. This objective has a flaw, in that it is always cheaper to let a single agent read an environment node, making that agent a single point of failure. This flaw is exacerbated by the Directed Acyclic Graph limitation, which means since
A0 must always read from the environment, it is always cheapest to have the entire network rely on
A0 for information.
I recently worked on developing a welfare function emphasizing robustness, or in this case, the ability of the network to complete its objective when a random agent is suddenly removed. The result should be a network without any single points of failure, although I am not accounting for multi-agent failures.
The result is as follows:
In this diagram, all agents receive information from
A0. However, most of them also receive information from
A2, which receives information from
A1, which is monitoring the entire environment. As a result, when
A0 is disabled, only nodes
A5 are negatively effected.
To force robustness I created eleven parallel versions of the graph. They have identical listen weights (the amount any agent tries to listen to any other agent), and begin with identical state weights (how information from a particular agent is factored in to the estimate of the environment), and identical output weights (how different inputs are factored in to the message that is sent).
The difference is that in each of these parallel graphs (except the first one) a single Agent is disabled, by setting all of its output weights to zero. The welfare of the solution is the average of the welfare for each graph.
But why are the state weights and output weights allowed to diverge? Aren’t we measuring ten completely different graphs then?
Not quite. The topology of the network is defined by its listen weights, so we will end up with the same graph layout in each configuration. To understand why the other weights are allowed to diverge, consider an analogy to a corporate scenario:
You are expected to get reports from several employees (Orwell, Alice, and Jim) and combine them to make your own final report. When you hear Jim has been fired, you no longer wait for his report to make your own. Taking input (which isn’t coming) from Jim in to consideration would be foolish.
Similarly, each graph adjusts its state weights and output weights to no longer depend on information from the deleted agent, representing how a real organization would immediately respond to the event.
Then why can’t the listen weights change, too?
This model represents instantaneous reaction to an agent being removed. While over time the example corporation would either replace Jim or restructure around his absence, you cannot instantly redesign the corporate hierarchy and change who is reporting to who. Meetings take time to schedule, emails need to be sent, and so on.
This objective is still limited by the acyclic graph structure, but provides a baseline for valuing resiliency mathematically. Once the acyclic problem is tackled this solution will be revisited.
This post is based off of research I am working on at the Santa Fe Institute led by David Wolpert and Justin Grana. Future posts on this subject are related to the same research, even if they do not explicitly contain this attribution.