 ## Social networks should not be DAGs

This post continues the discussion of modeling social organizations using Neural Networks.

### The Setup

Over the past week, I have been working with a neural network that represents a DAG of the following scenario:

• There are ten agents (people, computers, etc) in an organization

• There are five “environment” nodes, representing information in the world

• Each agent can either listen to an environment node, or listen to messages from another node

• There is a cost to all communication, as described in my previous post

• Each agent is calculating the average of the five environment values

The neural network is attempting to minimize communications costs, while maximizing the accuracy of each agent’s information. The equation for loss looks like the following:

``````for each agent:
difference += mean(environment) - mean(agent's estimate of each environment node)
cost += listening_cost_for_agent + speaking_cost_for_agent
total_loss = (difference / num_agents) + cost
``````

To start with, all agents will listen to each agent under them (`A9` can listen to everyone, `A8` to everyone except `A9`, …), and will listen to all environment nodes. The neural network will then remove edges until it reaches maximum efficiency. We’ll begin assuming messages between agents are just as expensive as listening to the environment.

### Results

In an ideal world, the solution to the problem is as follows: This has the lowest total cost because each agent only has to listen to one other node, except for `A0`, who listens to all the environment nodes and distributes the information. However, the neural network regularly produces solutions like the following: In this model, `A0`, `A1`, and `A2` all read from the environment nodes, and everyone else listens to `A2`. Why?

This is a problem known as a local minima. As the neural network slowly took optimization steps (“learning”), it realized it could reduce cost if `A1` stopped listening to `A0`. Unfortunately, this means `A1` must listen to all the environment nodes to get an accurate estimate. If it wanted to listen to `A0` instead it would first have to add an edge to listen to `A0`, which would increase cost, and is therefore unlikely to be chosen. This gets the entire network “stuck”.

### Problems with the Model

The main issue leading to the above local-minima is that we are optimizing for an unnatural situation. With a directed acyclic graph, `A0` must always listen to all the environment nodes, since it is forbidden from listening to any other agents. Therefore, we should not be optimizing for lowest communications cost, but for the fastest path from each agent to `A0`.

The model is also inaccurate in that it minimizes only total cost, not per-agent cost. If we grow the environment, to say a million nodes, it becomes ridiculous to suggest a single agent should be listening to the whole environment. A more accurate representation of the world would minimize the cost for each agent, distributing work, and possibly coming up with something closer to the following: This produces slightly higher total message costs, but each agent on the inner ring only needs to listen to two and send two messages, and those on the outer ring only need to listen to a single message.

### Next Steps

Moving forward, we need a new model for network communication costs - preferably one not based on a DAG. From there we can develop other models, representing fragmented information (not every agent needs to know everything about the environment), redundancy problems, and so on.