Power-Seeking Theorem: Why Being 'Not a Tool' is the Ultimate Growth Strategy

We now officially enter the first module: mental tools for life growth strategy. This lecture discusses a very new theory discovered by computer scientists studying AI agents: “Power-Seeking Theorems.” I feel this theorem can also be applied to humans, and it is the most fundamental and decisive truth in life—it aligns perfectly with the Chinese saying, “Junzi Bu Qi” (The noble person is not a tool/utensil).

A painful and ugly reality is that people are often in a state of being driven. Employees watch their boss’s mood by day and their spouse’s by night; students listen to teachers by day and parents by night. You must do what others tell you, and without their command, you have zero freedom. Can one live like this? Being driven leads to chronic stress, which raises cortisol levels, triggers inflammation, and eventually causes both health and spirit to deteriorate.

In my view, the essence of “Junzi Bu Qi” is not to be driven—not to be a utensil. A utensil is something others can pick up and use at will; a noble person cannot be like that.

This is also the core idea of Kant’s moral philosophy, which he put even more bluntly and absolutely: a person should always be an end, never merely a means.

Simply put, don’t be a “tool person.” Whether for yourself or for others, a person is a person and should not be objectified.

But we are driven by various forces every day and remain passive. How can we achieve “Junzi Bu Qi”? What should we pursue? Confucius gave the advocacy, Kant gave the principle, and the Power-Seeking Theorem gives you the method.

✵

The force driving a person isn’t necessarily a specific individual or organization; it’s more likely to be something else… like poverty. Poverty is not just an economic issue; it’s a cognitive one. A person in long-term poverty will first remind themselves of their poverty whenever anything happens: How do I use this little money? When will the next payment arrive? To the point they can’t think of anything else.

In the words of Sendhil Mullainathan and Eldar Shafir’s book Scarcity [1], this is called “tunneling”: like entering a tunnel, you can only see the light ahead, and everything around you disappears. A sense of scarcity narrows cognitive bandwidth.

An experiment conducted in both the US and India asked people in a lab to imagine being in a state of scarcity—for example, a car suddenly breaking down requiring a large sum for repairs. For wealthy people, this was nothing, but for low-income earners, this state caused their IQ test scores to drop by 13 points on the spot [2]!

Staring only at this one thing, they have no time for anything else, to the point of being “de-intellectualized.” Isn’t this being driven?

Look at those addicted to drugs, gambling, or alcohol—isn’t their cognition also narrowed? They can only see one thing. For the chronically hungry, everything they see reminds them of food. These people are enslaved by a certain desire. You might say poverty and hunger are involuntary; they don’t want to be that way. True, I’m not making a moral condemnation; I’m saying this state is very bad. As Kant would say, they have turned themselves into tools. In Chinese terms, “the small person is enslaved by things.”

Furthermore, those who work desperately for promotions and bonuses at the expense of their health, those who sacrifice their lives in the name of love to focus entirely on their children’s studies, those who only see power and know only the leader but not the principles… they are all “utensils.”

In my view, the essence of becoming a “utensil” is optimizing for a single goal while sacrificing other dimensions of life. Because you only have this objective function in your eyes, you are a slave to that goal.

In fact, everyone has, more or less at some stage of life, fallen into the state of being a “utensil.” Whether because of poverty, love, an exam, or a project, with only one thing in sight and one person in heart, it is harmful. Your cognitive bandwidth narrows, your sensitivity drops significantly, you fall into a tunnel, you lack diverse interests and rich roles—you are not flourishing at all.

“Junzi Bu Qi” is truly the minimum requirement for a healthy personality.

✵

You might say that life must have pursuits; we can’t just cultivate ourselves all day, and we certainly can’t just “lie flat.” We should contribute to society. Is the problem of the “utensil” that it has only one objective function? If we have multiple goals, does that make us “Junzi Bu Qi”? No.

One goal is exhausting enough; wouldn’t multiple goals be even more painful?

The real difference between a noble person (Junzi) and a small person (Xiaoren) is active versus passive.

A Junzi can certainly focus on a single goal at any given moment, but he is the master of that goal, not its slave; he can step out of that narrative at any time. For example, a person enslaved by power sees a superior’s will as supreme: “I’ll do whatever you say.” A person exercising power, facing the same command, says: “If what you say aligns with my values, I’ll do it. If it doesn’t, I’d rather resign, or even revolt!”

“Junzi Bu Qi” means you must be higher than any objective function. Elon Musk said it well [3]:

“Never attach yourself to a person, a place, a company, or an organization. Attach yourself to a mission, a calling, or a purpose.”

Objective functions must serve the purpose. Mission, calling, and purpose belong to you; they are part of your life’s meaning, and you find joy in them. Research has long found that people with a sense of purpose live healthier and longer lives [4].

Working hard for a cause does not equal becoming a tool person. There is a subtle but extremely important distinction.

✵

Some might say, “I just want to make more money in this life, even if I have to be a tool person.” Does that work? No. Many things people love, including money, are often not obtained through direct pursuit; they are byproducts of other endeavors.

If a person only has eyes for money, they actually won’t make much. Think about it: who would want to cooperate with someone who is nitpicky and only thinks about money? How much value can they create?

Take academic reputation. A degree can be obtained by grinding practice problems, but scholarship cannot. Academic reputation is not produced by exams. Those desperately preparing for exams are merely trying to fake their way into someone else’s circle.

And social status. People wear luxury goods and engage in conspicuous consumption to signal status, but truly high-status people often disdain these external symbols, even viewing excessive showing off as a lack of confidence.

Some pursue “good mood,” treating happiness as an obsession. If someone makes me unhappy on my birthday, I’ll make them unhappy for life! Little do they know that a deliberately crafted good mood is false…

The characteristic of these good things is that the more directly you pursue them, the harder they are to get. If you let them go and do something else, they find you automatically. In Chinese, it’s called “bending to be complete”; in English, there’s a term called “Obliquity” [5], meaning goals are often best achieved indirectly. This is because interactions in complex systems are often indirect and uncertain; you can only explore with peace of mind, and the rewards will appear where you least expect them.

If every paper published were rewarded with a fixed ten thousand dollars, researchers would produce a mountain of junk papers. The Nobel Prize is certainly not incentivized this way. Why do some people persist in exploring when they know there might not be a reward? They must be those who find joy in it, those with intrinsic motivation, those who view the cause as their mission—they are Junzi.

Now, the question arises: a Junzi cannot work only for fun; you must take your career seriously and certainly have a strong sense of competition. If a Junzi doesn’t compete for byproducts, what should you compete for?

This is where the “Power-Seeking Theorem” comes in.

✵

This is a theory from a 2021 paper by Alexander Turner, a computer scientist at Oregon State University [6]. Turner’s research states that an AI agent learns through reinforcement by obtaining rewards. Facing a complex environment where it doesn’t know where the rewards will appear, what strategy should it adopt?

The answer is to seek power. While “power” often refers to political hierarchy, here it’s more accurate to call it “capability” or “enablement.” Power is your influence over the environment, especially the “options” you possess.

The Power-Seeking Theorem says that as an agent, you should try to secure more possibilities for yourself in the future, which will allow you to maximize rewards in most situations.

For example, imagine two paths: the left is a dead end, and the right leads to multiple locations. Even if you don’t know where the reward is, you should go right because you will have more options.

The beauty of options is that they can be used or not. If there’s a reward, you take it; if not, you let it go. You have the rights without the obligations. The Power-Seeking Theorem means that if you don’t see specific rewards right now, you should move toward increasing your options.

Think about it—aren’t humans the same? When we say someone has “power” or “capability,” like mastering a skill, owning resources, or earning trust, these things don’t bring immediate rewards, but they all increase options. Being able to do what others cannot—that is power.

The best part is that power can be directly sought.

✵

In fact, seeking power is human nature.

We often talk about “empowerment,” and information theory has long had a concept called “empowerment” [7], meaning increasing the channel capacity of future states. Simply put, if an action allows you to reach a place that leads to more different futures, that action is empowering.

Empowerment is not a specific task performed to get a reward, so you are not being driven.

In 2014, Christoph Salge, an AI researcher at the University of Hertfordshire, proposed the “Behavioral Empowerment Hypothesis” [8], suggesting that empowerment is the most primitive internal drive of organisms. Even without external goals, organisms tend to act because they want to empower themselves, maximizing their influence and choice over the future.

Even if not currently hungry, bacteria will move toward resource-rich areas because there are more options for food. If current days are good, animals will actively explore new environments because they might bring more mating opportunities.

Why do people long for freedom? Because freedom means more options. The pursuit of freedom is empowerment.

✵

The tool person says: “I completed the KPI.”

The Junzi says: “I improved my ability to transform new problems into solvable ones.”

Facing the same fact, the former narrative is being driven, while the latter is empowerment. “Junzi Bu Qi” means you are not doing tasks; you are perfecting yourself.

Specific tools can only do specific things, but humans can expand their ability and options to use various tools at any time. This is what Xunzi meant: “The nature of the noble person is not different; they are just good at utilizing things.”

Even if there are no immediate rewards, we should actively explore because we need to empower ourselves. A Junzi does not oppose optimization or ignore objective functions, but treats them as means while focusing on building a system that expands capabilities and options. “Junzi Bu Qi” is not against specialization, but against the obsession with specialization as an identity.

Learning new knowledge is empowerment, developing new hobbies is empowerment, entering new fields is empowerment, establishing good social relationships is empowerment, and improving virtue is empowerment… these all give you more options. Many things are byproducts, but as long as you grasp the root of “increasing options,” empowerment can be directly pursued.

The Power-Seeking Theorem says “Junzi Bu Qi” is the optimal growth strategy in an environment with uncertain rewards: do not optimize a single objective function; increase your options.

【Application Scenarios】

Some might say, “I am very poor right now, I’m just an employee, I must follow orders. What can I do?” My answer is that Viktor Frankl, author of Man’s Search for Meaning, found spiritual freedom even in Nazi concentration camps. Why can’t you increase your own options? Understanding the Power-Seeking Theorem, you absolutely can:

Consider one more option in decision-making instead of falling into the path dependency of old narratives;
Have one more role in identity instead of being kidnapped by a fixed role;
Learn not just a specific tool, but improve cross-task understanding so capabilities are transferable;
Maintain independence in social relationships to avoid complete dependence on a single source of power;
Expand mental space to remain creative amidst uncertainty…

Even with making money, if the income increases your options, it’s worth striving for; but if the price is becoming controlled and reducing freedom, it’s not empowerment, but disempowerment.

Zhuangzi said he would rather be a live turtle rolling in the mud than a dead one in a temple; Tao Yuanming refused to bow for five pecks of rice; Li Bai said he could not lower his brow to serve the powerful. They weren’t just trying to be moral role models; they intuitively sensed the Power-Seeking Theorem.

Notes

[1] Sendhil Mullainathan / Eldar Shafir, Scarcity: Why Having Too Little Means So Much, 2013.

[2] Mani, Anandi, Sendhil Mullainathan, Eldar Shafir, and Jiaying Zhao. “Poverty Impedes Cognitive Function.” Science 341, no. 6149 (2013): 976–980.

[3] Haden, Jeff. “Elon Musk Says Living a Happy, Successful, and Meaningful Life Comes Down to 4 Simple Things.” Inc. (Expert Opinion), September 15, 2023.

[4] Sone, Toshimasa, Naoki Nakaya, Kaori Ohmori, et al. “Sense of Life Worth Living (Ikigai) and Mortality in Japan: Ohsaki Study.” Psychosomatic Medicine 70, no. 6 (2008): 709–15.

[5] Kay, John. Obliquity: Why Our Goals Are Best Achieved Indirectly. Penguin, 2011.

[6] Turner, Alexander M., Logan Smith, Rohin Shah, Andrew Critch, and Prasad Tadepalli. “Optimal Policies Tend to Seek Power.” NeurIPS 2021 (Proceedings), 2021.

[7] Klyubin, Alexander S., Daniel Polani, and Chrystopher L. Nehaniv. “Empowerment: A Universal Agent-Centric Measure of Control.” In IEEE Congress on Evolutionary Computation, vol. 1, 128–35. Edinburgh, 2005.

[8] Salge, Christoph, Cornelius Glackin, and Daniel Polani. “Empowerment–an introduction.” Guided Self-Organization: Inception. Springer, Berlin, Heidelberg, 2014. 67-114.