Social networks should not be DAGs

This post continues the discussion of modeling social organizations using neural networks.

The Setup

Over the past week, I have been working with a neural network that represents a DAG of the following scenario:

  • There are ten agents (people, computers, etc) in an organization

  • There are five “environment” nodes, representing information in the world

  • Each agent can either listen to an environment node or listen to messages from another agent

  • There is a cost to all communication, as described in my previous post

  • Each agent is calculating the average of the five environment values

The neural network attempts to minimize communication costs while maximizing the accuracy of each agent’s information. The loss is computed as follows:

for each agent:
    difference += abs(mean(environment) - mean(agent's estimate of each environment node))
    cost += listening_cost_for_agent + speaking_cost_for_agent
total_loss = (difference / num_agents) + cost

(The absolute value matters: without it, overestimates and underestimates would cancel, and the network could drive the loss down without actually being accurate.)

To start with, all agents will listen to each agent under them (A9 can listen to everyone, A8 to everyone except A9, …), and will listen to all environment nodes. The neural network will then remove edges until it reaches maximum efficiency. We’ll begin assuming messages between agents are just as expensive as listening to the environment.
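As a sketch of this starting topology (names like `listens_to` are my own, not from the model), the initial dense structure can be written as a mapping from each agent to the set of sources it listens to: every lower-indexed agent plus all five environment nodes.

```python
NUM_AGENTS = 10
NUM_ENV = 5

# listens_to[i] is the set of sources agent i starts out listening to:
# every lower-indexed agent ("A0".."A{i-1}") plus all environment nodes.
listens_to = {
    i: {f"A{j}" for j in range(i)} | {f"E{k}" for k in range(NUM_ENV)}
    for i in range(NUM_AGENTS)
}

# A9 starts with the most edges: 9 agents + 5 environment nodes = 14.
print(len(listens_to[9]))  # 14
# A0 cannot listen to any agent, only the 5 environment nodes.
print(len(listens_to[0]))  # 5
```

Training then amounts to pruning edges out of this structure until the loss stops improving.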

Results

In an ideal world, the solution to the problem is as follows:

Absolute Minimum

This has the lowest total cost because each agent only has to listen to one other node, except for A0, who listens to all the environment nodes and distributes the information. However, the neural network regularly produces solutions like the following:
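To see why the star shape around A0 is cheapest, assume (as in this post) a unit cost per edge and count edges. The names below are illustrative, not from the model code.

```python
NUM_AGENTS = 10
NUM_ENV = 5

# Star topology: A0 listens to all five environment nodes, and each of
# the other nine agents listens only to A0.
star_edges = NUM_ENV + (NUM_AGENTS - 1)  # 5 + 9 = 14

# Naive alternative: every agent listens to the environment directly.
naive_edges = NUM_AGENTS * NUM_ENV       # 10 * 5 = 50

print(star_edges, naive_edges)  # 14 50
```

Under equal edge costs, no configuration can beat the star: every agent needs at least one incoming edge, and the environment must be read at least five times somewhere.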

Local Minimum

In this model, A0, A1, and A2 all read from the environment nodes, and everyone else listens to A2. Why?

This is a problem known as a local minimum. As the neural network slowly took optimization steps (“learning”), it found it could reduce cost if A1 stopped listening to A0. Unfortunately, this means A1 must listen to all the environment nodes to get an accurate estimate. To switch back to listening to A0, it would first have to add an edge to A0, which would temporarily increase cost and is therefore unlikely to be chosen. This leaves the entire network “stuck.”
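The barrier can be made concrete with a little arithmetic, again assuming a unit cost per edge (the variable names are mine):

```python
# A1's two stable options once it has stopped listening to A0:
cost_listen_env = 5   # listen to all five environment nodes
cost_listen_a0 = 1    # listen to A0 alone (the better end state)

# A single greedy optimization step can only add one edge at a time,
# so the intermediate state keeps the five environment edges AND adds
# the new edge to A0.
cost_intermediate = cost_listen_env + 1  # 6

# The intermediate state costs more than staying put, so greedy
# descent never takes the step, even though the end state is cheaper.
print(cost_listen_a0 < cost_listen_env < cost_intermediate)  # True
```

The loss has to get worse before it can get better, which is exactly what a locally greedy optimizer refuses to do.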

Problems with the Model

The main issue leading to the above local minimum is that we are optimizing for an unnatural situation. With a directed acyclic graph, A0 must always listen to all the environment nodes, since it is forbidden from listening to any other agents. Therefore, we should not be optimizing for the lowest communication cost, but for the fastest path from each agent to A0.

The model is also inaccurate in that it minimizes only total cost, not per-agent cost. If we grow the environment to, say, a million nodes, it becomes ridiculous to suggest a single agent should be listening to the whole environment. A more accurate representation of the world would minimize the cost for each agent, distributing the work, and possibly coming up with something closer to the following:

Communist Neural Network

This produces a slightly higher total message cost, but each agent on the inner ring only needs to listen to two messages and send two, while those on the outer ring only need to listen to a single message.
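A rough comparison of per-agent load makes the trade-off visible. The numbers below assume five agents on the inner ring and five on the outer, with unit costs, matching the description above; the exact split is my reading of the figure, not something stated in the model.

```python
# Per-agent load = messages listened to + messages sent.

# Star: A0 listens to 5 environment nodes and sends to 9 agents;
# everyone else listens to one message and sends nothing.
star_loads = [5 + 9] + [1 + 0] * 9

# Ring: each inner-ring agent listens to 2 sources and sends 2
# messages; each outer-ring agent listens to a single message.
ring_loads = [2 + 2] * 5 + [1 + 0] * 5

print(sum(star_loads), max(star_loads))  # 23 14
print(sum(ring_loads), max(ring_loads))  # 25 4
```

The ring costs a couple of extra messages in total, but the worst-off agent carries a load of 4 instead of 14, which is the sense in which the work is “distributed.”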

Next Steps

Moving forward, we need a new model for network communication costs, preferably one not based on a DAG. From there we can develop other models representing fragmented information (not every agent needs to know everything about the environment), redundancy problems, and so on.

Attribution

This post is based on research I am working on at the Santa Fe Institute, led by David Wolpert and Justin Grana. Future posts on this subject are related to the same research, even if they do not explicitly contain this attribution.

Posted 6/23/17