Although the framework proposed by Bello at Neural Combiantorial Optimization with Reinforcement Learning works well on TSP, it is not applicable to more complicated combinatorial optimization problems in which the system representation varies over time, such as Vehicle Routing Problem(VRP). Thus, this paper propose a new model to overcome drawback of previous.
Reference: * Reinforcement Learning for Solving the Vehicle Routing Problem * code