Although the framework proposed by Bello et al. in Neural Combinatorial Optimization with Reinforcement Learning works well on TSP, it is not applicable to more complicated combinatorial optimization problems in which the system representation varies over time, such as the Vehicle Routing Problem (VRP). This paper therefore proposes a new model to overcome that drawback.
Reference:
As Figure 1 shows, once some of the input elements change, the representation of the whole input must be updated.
Therefore, in the new model, the authors simply leave out the encoder RNN and use the embedded inputs directly instead of the RNN hidden states. With this modification, many of the computational complications disappear, without decreasing the model’s efficiency.
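To make the difference concrete, here is a minimal NumPy sketch, not the paper's implementation. It compares a toy tanh RNN encoder with a plain per-element embedding when one node's demand changes. The sizes, the random weights, and the helper names (`embed`, `rnn_encode`, `changed_rows`) are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_EMB = 3, 8  # assumed sizes: (x, y, demand) -> 8-dim embedding

# Random weights stand in for learned parameters (illustration only).
W_embed = rng.normal(size=(D_IN, D_EMB))
W_h = rng.normal(size=(D_EMB, D_EMB)) * 0.1


def embed(inputs):
    """Embed each input element independently, with no recurrence."""
    return inputs @ W_embed


def rnn_encode(inputs):
    """Toy tanh RNN over the embedded inputs.

    Each hidden state depends on the current element and on every
    element that came before it in the sequence.
    """
    h = np.zeros(D_EMB)
    states = []
    for e in embed(inputs):
        h = np.tanh(e + h @ W_h)
        states.append(h)
    return np.stack(states)


def changed_rows(before, after):
    """Indices of rows that differ between two representations."""
    return [i for i in range(len(before))
            if not np.allclose(before[i], after[i])]


nodes = rng.uniform(size=(5, D_IN))  # five hypothetical customers
updated = nodes.copy()
updated[1, 2] = 0.0                  # demand of node 1 has been served

emb_changed = changed_rows(embed(nodes), embed(updated))
rnn_changed = changed_rows(rnn_encode(nodes), rnn_encode(updated))

print("embedding rows to recompute:", emb_changed)
print("RNN states to recompute:   ", rnn_changed)
```

With independent embeddings only the row for the modified node changes. In the RNN encoder the change propagates to the hidden states that follow it, so that part of the sequence has to be re-encoded after every update.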