The Engineer's Philosophy of Learning

The Engineer’s Philosophy of Learning: The “Ship of Theseus” and Iterative Replacement

In Computer Science education, we often face a deep-seated tradition of “Bottom-Up” teaching. From Operating Systems to Computer Graphics, compilers, and especially Artificial Intelligence, the curriculum usually starts with mathematical proofs and low-level theories, only revealing the practical application at the very end.

The intention behind this is noble: a shaky foundation builds a shaky house. However, for students with an engineering mindset, this approach often hits a major pain point: it is easy to get lost in the maze of details before ever seeing the big picture.

It is like being forced to memorize thermodynamic formulas before you’ve ever seen a car engine run. This process is often dry, devoid of context, and leads to unnecessary frustration.

Based on my personal experience, I believe there is a more efficient learning path for engineers: Start with top-down application to build intuition, then use “Iterative Replacement” to master the underlying principles.

I call this the “Ship of Theseus” Learning Method.

What is “Iterative Replacement”?

The paradox of the “Ship of Theseus” asks: if every plank of wood in a ship is gradually replaced until no original timber remains, is it still the same ship?

In learning, we can turn this idea around and use it deliberately:

Start with the Black Box: Use mature libraries or frameworks to build a complete, functioning system.
Isolate: Select a specific module within that system.
Study & Reproduce: Dive into the theory (math/algorithms/protocols) behind just that one module, and then hand-write a reproduction of it using low-level code (the “White Box”).
Replace & Verify: Swap the original library function with your hand-written module. If the system still runs correctly, you have mastered that component.
Loop: Repeat this process for the next module until you have effectively “rewritten” the system.
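
To make the loop concrete, here is a minimal sketch of the five steps on a deliberately tiny "system". Everything in it is an illustrative assumption rather than a prescribed tool: the system just standardizes a list of numbers, the black-box modules come from Python's statistics library, and the names run_system, replace_and_verify, my_mean, and my_stdev are made up for this example.

```python
import math
import statistics

# Step 1 (Black Box): a working system assembled from library functions.
modules = {
    "mean": statistics.fmean,
    "stdev": statistics.pstdev,  # population standard deviation
}

def run_system(data, modules):
    """The 'system': convert raw numbers to z-scores using the given modules."""
    mu = modules["mean"](data)
    sigma = modules["stdev"](data)
    return [(x - mu) / sigma for x in data]

# Step 3 (Study & Reproduce): hand-written versions of individual modules.
def my_mean(data):
    total = 0.0
    for x in data:
        total += x
    return total / len(data)

def my_stdev(data):
    mu = my_mean(data)
    return math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))

# Step 4 (Replace & Verify): swap one module, compare against the baseline.
def replace_and_verify(name, candidate, data, modules, tol=1e-9):
    baseline = run_system(data, modules)
    trial = dict(modules)
    trial[name] = candidate
    result = run_system(data, trial)
    ok = all(
        math.isclose(a, b, rel_tol=tol, abs_tol=tol)
        for a, b in zip(baseline, result)
    )
    # Keep the swap only if the whole system still behaves the same.
    return (trial if ok else modules), ok

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Steps 2 and 5 (Isolate, Loop): take one module at a time until all are replaced.
for name, candidate in [("mean", my_mean), ("stdev", my_stdev)]:
    modules, ok = replace_and_verify(name, candidate, data, modules)
    print(f"{name}: {'replaced' if ok else 'mismatch, keep studying'}")
```

The point of the design is that verification happens at the system level: the library version acts as the "standard answer", and a replacement only counts once the end-to-end output is unchanged.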

Case Study: The AI/ML Example

This method is particularly effective when learning Artificial Intelligence, where traditional education often creates a high barrier to entry with heavy linear algebra and statistics. Instead, we can apply the “Iterative Replacement” strategy:

Phase 1: The High-Level Context

First, be a “Script Kiddie” (in a good way). Use high-level APIs like PyTorch or scikit-learn to quickly get a Linear Regression or Neural Network model running.

Goal: Understand the Input, understand the Output, and get immediate positive feedback from a working model.

Phase 2: Modular Refactoring

This is the core of the methodology. I do not advocate for “blind reproduction” followed by catching up on the math later. You cannot reproduce what you do not understand. The correct order is: pick a module, study exactly the math that module needs, and write the code as you learn.

For example, let’s say we decide to replace the Loss Function in our PyTorch pipeline:

Theoretical Study: Go to the textbooks. Look up the specific formula for MSE (Mean Squared Error) or Cross-Entropy. Because you now have the context of the code, you know exactly what the variables $y$ and $\hat{y}$ represent in your pipeline.
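Code Reproduction: Implement that formula using pure NumPy operations. Keep in mind that a NumPy loss sits outside PyTorch's autograd, so while the rest of the pipeline is still PyTorch you will need to either write the formula with elementary tensor operations or supply the gradient yourself.
Verification: Replace torch.nn.MSELoss with your custom my_mse_loss function. Does the model still converge? Does the loss curve match the original to within floating-point tolerance? If so, you have truly internalized this concept.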
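
As a rough sketch of what this replacement might look like, the block below implements MSE and its gradient in NumPy. The name my_mse_loss comes from the text above; my_mse_grad and the sample arrays are my own assumptions for illustration. The hand-derived gradient is included because a NumPy function cannot take part in PyTorch's automatic differentiation. The comparison against torch.nn.MSELoss only runs if PyTorch happens to be installed.

```python
import numpy as np

def my_mse_loss(y_pred, y_true):
    """Mean squared error: the average of (y_hat - y)^2 over all elements."""
    y_pred = np.asarray(y_pred, dtype=np.float64)
    y_true = np.asarray(y_true, dtype=np.float64)
    return float(np.mean((y_pred - y_true) ** 2))

def my_mse_grad(y_pred, y_true):
    """Gradient of the loss with respect to y_pred: 2 * (y_hat - y) / N."""
    y_pred = np.asarray(y_pred, dtype=np.float64)
    y_true = np.asarray(y_true, dtype=np.float64)
    return 2.0 * (y_pred - y_true) / y_pred.size

# Made-up sample values, just to have something to compare.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

loss = my_mse_loss(y_pred, y_true)
print("my_mse_loss:", loss)  # 0.375

# Verification against the library's "standard answer", if it is available.
try:
    import torch
    reference = torch.nn.MSELoss()(
        torch.from_numpy(y_pred), torch.from_numpy(y_true)
    ).item()
    print("matches torch.nn.MSELoss:", bool(np.isclose(loss, reference)))
except ImportError:
    print("PyTorch not installed; skipping the library comparison.")
```

A single matching number is only the first check. The stronger test is the one described above: plug the replacement into the training loop and confirm the loss curve follows the original.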

Next, you move on to replace the Activation Function (study derivatives), then the Optimizer (study Gradient Descent and matrix multiplication).
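
Here is one way those next two replacements could look in NumPy. This is a sketch under my own assumptions, not a reproduction of any library's internals: I picked sigmoid as the activation, plain (non-momentum) gradient descent as the optimizer, and a toy dataset generated from y = 2x + 1. The names sigmoid_grad and sgd_step are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of sigmoid: s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def sgd_step(params, grads, lr):
    """Vanilla gradient descent update: theta <- theta - lr * gradient."""
    return [p - lr * g for p, g in zip(params, grads)]

# Toy data from a known line, so we can tell whether the optimizer works.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(64, 1))
y = 2.0 * x + 1.0

w = np.zeros((1, 1))
b = np.zeros(1)

for _ in range(500):
    y_hat = x @ w + b                    # forward pass: a matrix multiplication
    err = y_hat - y
    grad_w = 2.0 * x.T @ err / len(x)    # dL/dw for L = mean(err^2)
    grad_b = 2.0 * err.mean(axis=0)      # dL/db
    w, b = sgd_step([w, b], [grad_w, grad_b], lr=0.1)

print("learned w, b:", w.item(), b.item())  # should approach 2 and 1
```

The activation is not used in this linear fit; it is shown separately because its derivative is what you would verify (for example with a finite-difference check) before wiring it into a network.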

Eventually, once you have swapped every block, you have essentially written a micro-Deep Learning framework using only NumPy. Those abstract mathematical symbols have transformed into concrete logic in your code.

The Universality of the Method

This methodology is not limited to AI. It applies to almost any complex domain in Computer Science:

Learning Web Frameworks:
- Start by building an API with Flask or Express.
- Then, try hand-writing a simple HTTP request handler using socket programming to replace the framework’s routing logic. You will quickly see how a server accepts TCP connections and how an HTTP message is structured.
Learning Databases:
- Start by storing data with SQLite.
- Then, try replacing the SQL queries with a simple file-backed store that uses a B-Tree or hash index you implement yourself. You will gain a much deeper understanding of database indexing.
Learning Frontend:
- Start by building a page with React.
- Try hand-writing a simple Virtual DOM Diff algorithm to replace the rendering layer. You will understand exactly what the framework is doing under the hood.
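
To give a flavor of one of these, here is a minimal sketch of the database exercise using the hash-index option. It is an assumption-laden toy, not how SQLite works internally: the class name HashIndexStore, the kv table, the data.log file, and the tab-separated record format are all invented for this example, and keys and values are assumed to contain no tabs or newlines. SQLite stays in the picture as the reference answer.

```python
import os
import sqlite3
import tempfile

class HashIndexStore:
    """Append-only data file plus an in-memory hash index of byte offsets."""

    def __init__(self, path):
        self.path = path
        self.index = {}            # key -> byte offset of its latest record
        open(path, "ab").close()   # create the file if it does not exist

    def put(self, key, value):
        record = f"{key}\t{value}\n".encode("utf-8")
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # where this record will start
            f.write(record)
        self.index[key] = offset             # newer writes shadow older ones

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        with open(self.path, "rb") as f:
            f.seek(offset)                   # jump straight to the record
            line = f.readline().decode("utf-8").rstrip("\n")
        _, value = line.split("\t", 1)
        return value

rows = [("alice", "42"), ("bob", "7"), ("alice", "43")]  # later write wins

# The black-box baseline: SQLite holding the same data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
db.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)", rows)

results = {}
with tempfile.TemporaryDirectory() as tmp:
    store = HashIndexStore(os.path.join(tmp, "data.log"))
    for k, v in rows:
        store.put(k, v)
    # Replace & Verify: both stores must answer every lookup the same way.
    for key in ("alice", "bob", "carol"):
        row = db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        expected = row[0] if row else None
        results[key] = (store.get(key), expected)
        print(key, "->", results[key])
db.close()
```

Even a toy like this raises the questions that real storage engines answer: what happens to the index after a restart, how stale records get cleaned up, and why range queries push you toward a tree instead of a hash.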

Conclusion

Using libraries (“API calling”) is not the goal; it is the scaffolding. It provides us with a working baseline and a “standard answer” to check our work against.

With this scaffolding in place, we break down the massive wall of knowledge into concrete “replacement tasks.” In each task, we synchronize theoretical learning with code implementation.

This approach transforms Math and Theory from a Gatekeeper that blocks your entry into a Tool that helps you solve specific engineering problems. For engineers, code is the language we use to understand the world. Verifying theory through code leaves a far deeper impression than only reading about it in a book.

