Yixiong's Blog

Reinforcement Learning for Solving the Vehicle Routing Problem

Posted on 2024-06-11 Edited on 2025-05-08 In Machine Learning
Symbols count in article: 700 Reading time ≈ 1 mins.

Although the framework proposed by Bello at Neural Combiantorial Optimization with Reinforcement Learning works well on TSP, it is not applicable to more complicated combinatorial optimization problems in which the system representation varies over time, such as Vehicle Routing Problem(VRP). Thus, this paper propose a new model to overcome drawback of previous.

Reference: * Reinforcement Learning for Solving the Vehicle Routing Problem * code

Pointer Networks

Posted on 2024-06-05 Edited on 2025-05-08 In Machine Learning
Symbols count in article: 2.5k Reading time ≈ 4 mins.

Pointer Network(Ptr-Net) uses attention as a pointer to select a member of the input sequence as the output. The author shows that Ptr-Net could solve three challenging geometric problems - finding planner convex hulls, computing Delunary triangulations, and the planner Travelling Salesman Problem.

Reference: * Pointer Networks * Pointer Networks简介及其应用 * TSP问题从DP算法到深度学习3：Pointer Network * 2022最新版-李宏毅机器学习深度学习课程

Learning to Perform Local Rewriting for Combinatorial Optimization

Posted on 2024-06-03 Edited on 2025-05-08 In Machine Learning
Symbols count in article: 3.7k Reading time ≈ 7 mins.

这篇文章提出了NeuRewriter的方法，使用强化学习中 actor-critic 的方式训练。

感觉类似于 “A Monte Carlo Policy Gradient Method with Local Search for Binary Optimization” 中的操作。当然了，NeuRewriter是最先提出来的，时间跨度基本都有4年了。

Reference: * Learning to Perform Local Rewriting for Combinatorial Optimization * github地址

Research Reviews of Combinatorial Optimization Methods Based on Deep Reinforcement Learning

Posted on 2024-06-01 Edited on 2025-05-08 In Machine Learning
Symbols count in article: 4.3k Reading time ≈ 8 mins.

The application of deep reinforcement learning to combinatorial optimization methods.

Reference: - Machine Learning for Combinatorial Optimization: a Methodological Tour d’Horizon

Link:
- Learning to Perform Local Rewriting for Combinatorial Optimization
- graph networks
- pointer networks
- Neural Combiantorial Optimization with Reinforcement Learning

Feynman path integrals

Posted on 2024-05-26 Edited on 2025-05-08 In Physics
Symbols count in article: 33k Reading time ≈ 1:01

考虑到状态的演化，最直接的想法是通过经典力学通过拉格朗日量来构建经典的运动路径。如何处理的问题不是精确的状态，而是概率密度的分布，如何求概率密度分布的演化？路径积分则是十分自然的想法。将每一个小的状态，以任意的路径进行演化，最后得到最终转态的概率分布。这样处理的麻烦在于，不是每一条路径都是等价的，有一些路径发生的概率高一些，另一些则低一些。如何表示这样的概率分布，以及实际得到演化后的概率分布，则是本文要探讨的。

当然，这只是一个简单的版本，需要更多的讨论。

参考： * QM - 路径积分 (Path Integral) PT. 1 - 基本构架 * QM - 路径积分 (Path Integral) PT. 2 - 求解实例

KAN: Kolmogorov–Arnold Networks

Posted on 2024-05-08 Edited on 2025-05-08 In Machine Learning
Symbols count in article: 4.8k Reading time ≈ 9 mins.

文献地址： * KAN: Kolmogorov–Arnold Networks * GitHub地址

传统的全连接神经网络用于拟合非线性函数，然而全连接网络真的是最好的架构了么？对于全连接网络有明显的缺点，显存开销大，可解释性差。

本文作者提出了KAN网络架构，其基于 Kolmogorov-Arnold representation theorem。该网络架构将“参数进行非线性激活”，然后通过全连接。

Multi-Layer Perceptrons (MLPs) vs. Kolmogorov-Arnold Networks (KANs)

Machine-learning-assisted Monte Carlo fails at sampling computationally hard problems

Posted on 2024-04-19 Edited on 2025-05-08 In Machine Learning
Symbols count in article: 5.8k Reading time ≈ 11 mins.

展示利用机器学习提升的MC在一些模型中的失败： * 为什么说失败？哪些指标说明失败？ * 在怎样的模型中？这种模型具有什么样的特点？ * 实验条件是什么？

参考文献： * Machine-learning-assisted Monte Carlo fails at sampling computationally hard problems 配套代码10.5281/zenodo.7567683 * Neural Annealing and Visualization of Autoregressive Neural Networks in the Newman–Moore Model * Glassy dynamics and aging in an exactly solvable spin model * Boundary conditions dependence of the phase transition in the quantum Newman-Moore mode * Visualizing the Loss Landscape of Neural Nets

A Survey of Monte Carlo Tree Search Methods

Posted on 2024-04-17 Edited on 2025-05-08 In Machine Learning
Symbols count in article: 7k Reading time ≈ 13 mins.

本文为蒙特卡洛树搜索综述节选。

蒙特卡洛树搜索（Monte Carlo Tree Search，简称MCTS）是一种用于决策过程的模拟和搜索算法，特别适用于解决复杂的顺序决策问题。以下是MCTS的几个核心步骤：

选择（Selection）：从根节点开始，根据某种策略（如UCB1公式），选择一条路径直到达到一个未扩展的节点。
拓展（Expansion）：在当前选中的节点上添加一个新的子节点。
模拟（Simulation）：从新扩展的节点开始进行随机模拟（即走子），直到游戏结束，并记录结果。
反向传播（Backpropagation）：将模拟的结果回传，更新沿途节点的统计信息。

值得一提的是，MCTS通过不断重复上述过程，逐渐构建出一个代表不同行动及其结果概率的搜索树。最终，算法会选择具有最高胜率的行动作为推荐的行动。

此外，MCTS与传统的博弈树搜索相比，它不需要构建整个博弈树，而是通过模拟来评估局面，这使得它能够处理更大规模的搜索空间。然而，MCTS也存在一些局限性，比如对于非零和游戏的适应性、在面对随机性或不完全信息时的表现等问题。尽管如此，MCTS仍然是现代人工智能领域的一个重要工具，尤其是在棋类游戏中的应用，如AlphaGo/Zero等著名程序就是基于MCTS的变体。

Machine Learning with Graphs

Posted on 2024-04-15 Edited on 2025-05-08 In Machine Learning
Symbols count in article: 27k Reading time ≈ 49 mins.

图论（Graphs）是数学的一个分支，它研究的是图这种数学结构。图论中的图是由点（也称为顶点）和连接这些点的线（也称为边）组成的。点通常用来代表事物，而边则表示事物之间的关系。图可以是有向的，也可以是无向的，这取决于边是否有方向。

图论的研究内容包括图的各种性质，如度（一个点连接的边的数量）、连通性（图中的点是否都通过边相互可达）、路径（点与点之间的连线序列）等。图论不仅在数学领域内有广泛的应用，还在计算机科学、工程学、经济学等多个领域发挥着重要作用。

因多次发现与图论相关的内容，故在这里记录学习图论的笔记与想法。

参考文献： * 图深度学习（葡萄书） * The Healthy Birds Trio, Jure Leskovec, Notes for Stanford CS224W Machine Learning with Graphs, 2020 * 《图强化学习：原理与实践入门》谢文杰、周炜星，清华大学出版社不推荐阅读，东拼西凑强行逻辑合理，很多内容点不到就止，同时不给深入阅读的材料。 * Spectral and Algebraic Graph Theory Incomplete Draft

感谢Datawhale开源社区提供学习平台与相关资源。

Ising formulations of many NP problems

Posted on 2024-04-15 Edited on 2025-04-03 In Math
Symbols count in article: 3.9k Reading time ≈ 7 mins.

为许多NP-complete和NP-hard问题提供了Ising形式，包括Karp的21个NP完全问题中的所有问题。这收集并扩展了从分区、覆盖和可满足性到Ising模型的映射。在每种情况下，所需的自旋数最多是问题大小的三次方。这项工作可能对设计绝热量子优化算法有用。别的优化算法同样有用。

文献： * 本文Ising formulations of many NP problems * On the computational complexity of Ising spin glass models * E. Boros and P.L. Hammer. “Pseudo-Boolean optimization”, Discrete Applied Mathematics 123 155 (2002). * Reducibility among Combinatorial Problems * M. Me ́zard and A. Montanari. Information, Physics and Computation (2009) * Y. Fu and P.W. Anderson. “Application of statistical mechanics to NP-complete problems in combinatorial optimisation”, Journal of Physics A19 1605 (1986).