Tuesday, September 19, 2006

How to Make a Robot with Independent Will?

Cognitive scientists and computer programmers have managed to create robots and computer programs that solve various problems. We have computers that play chess or even a sort of football, that build automobiles, that are worthy opponents in strategy games, that help fighter plane pilots, that suggest which other books to buy on Amazon or that place personalized Google ads on web pages. But all these robots are subordinated to the goal ascribed to them – they cannot invent their own goals and pursue them, and they can only rarely invent their own methods for solving a certain problem. Our smartest robot is dumber than the dumbest ant. How could this be overcome?

The main problem is how to ascribe a certain value to some goal. How do we rank our goals, how do we decide what is worth pursuing and what is less worthy of being pursued?

Most computer scientists think about this problem the other way around: the goal is given and the problem is to determine the best method for reaching that goal. It is supposed that when one makes a decision, one chooses the means for attaining a certain goal (the goals are given). This is why any decision is supposed to be just a matter of computation – one just needs to compute the most efficient path toward that goal, the path that uses the minimum amount of resources or which has the least cost or which has the least amount of unwanted side-effects etc. But where do the goals themselves come from? And where do the criteria for what is the best path toward a goal come from?

Economics offers a different perspective. One assumes that at a certain moment in time, the resources are given and one needs to decide what to use these resources for. In other words, the means are given and one chooses the goals. Of course, not everybody aims for the same things and this fact seems to destroy from the start any possibility of a robot with independent will: the fact that in the same situation (the same resources are given) different people choose different things seems to imply that the act of making a choice is not the result of an algorithmic process that could be mimicked or approximated by a computer program.

But again, economics offers a different clue.

The law of marginal utility



In all textbooks, this law is named the law of diminishing marginal utility because people focus on one particular aspect of the law. The textbook law says that something seems more valuable if it is less accessible. By "something" is meant either a product or a service.

The word "marginal" refers to the fact that we don't judge the value of a product or of a service in general (for example we don't think about the value of air or of clothes or of bread in general), but we consider what we can actually use (we think about the value of the air we need in order to breathe now, of the clothes we use, of one loaf of bread and so on). In other words, the reason why gold for instance is more valuable (and thus costs more) than air is that we don't compare the value of gold in general to the value of air in general, but we compare the gold that we have access to with the air we have access to. Air is much more accessible than gold and thus it seems much less valuable (air is free, gold is expensive).

The law of diminishing marginal utility is the reason behind the law of supply and demand in economics (the idea that the price is determined by the ratio between supply and demand). The same product or service has a different accessibility to different people. For instance, shoes are highly accessible to a shoemaker and bread is highly accessible to a baker but not the other way around. Thus, bread has a higher value for the shoemaker and the shoes have a higher value for the baker – and thus, they are both willing to make an exchange. This law explains, therefore, why people are willing to exchange things. Furthermore, when one has the idea of money in place, which is a general medium of exchange (it is a product everybody is willing to accept because everybody else is willing to accept it), this law eventually translates into the law of supply and demand.

The image above shows the law of diminishing marginal utility for three different products (or services). As you can see, in some cases the value decreases rapidly as the product becomes more and more accessible, while in other cases it decreases more slowly. When the accessibility is very close to zero, the product or service has some initial value which can be higher or lower (scarce water for instance has a very high value, much higher than scarce gold). Moreover, there can be too much of a product or a service – so much that its value becomes negative. For example, plastic bags are useful, but when there are too many of them, we call them garbage and we are even willing to pay to get rid of them; or commercials are useful to some extent but when there are too many, they become such an annoyance that we are even willing to pay for TV stations that don't have them.

But what happens when the accessibility becomes negative? This is the domain that is missing from the standard economics textbooks. A negative accessibility means that the product or service does not (yet) exist. The more inaccessible it is, and thus the more unrealistic it seems, the less we value its potential. Thus, we have a law of increasing marginal utility when accessibility is negative.

The general law is a curve (a parabola). This curve has some portions that are approximately linear and the economists usually focus on these portions (they think about actual products that are not very close to zero accessibility). This is why they speak about the diminishing marginal utility and usually take this law to be linear.
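To make the shape of the general law concrete, here is a minimal sketch in Python. It assumes the simplest possible curve: a downward-opening parabola with its peak at zero accessibility. The function name `marginal_value` and the parameters `initial_value` and `steepness` are my own illustrative choices, not standard economic notation, and a real curve need not be symmetric.

```python
def marginal_value(accessibility, initial_value=10.0, steepness=1.0):
    """Value of a product or service as a function of its accessibility.

    Assumption: a downward-opening parabola peaking at accessibility 0.
    - accessibility > 0: the thing exists; more access means less value,
      and with enough abundance the value turns negative.
    - accessibility < 0: the thing does not exist yet; the further out
      of reach it is, the less its potential is valued.
    """
    return initial_value - steepness * accessibility ** 2


# Scarce but existing, abundant, and not-yet-existing goods.
for a in (0.0, 2.0, 4.0, -2.0):
    print(a, marginal_value(a))
```

The two parameters stand for the two things that differ between products: how high the value is near zero accessibility, and how quickly it drops.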



This law tells you how to get from accessibility to a judgment of value. But how do we judge the accessibility given that we know what the resources are? The point is that different individuals have different skills and thus differ in their ability to use the same resources. Give flour to a baker and bread becomes very accessible to him; give the same flour to a shoemaker and bread doesn't become much more accessible to him. Thus, if you know the skills an individual has, you can (at least in principle) get from the available resources to accessibility and finally to value judgments.

And the individuals will first choose the products and services that have the highest value, and then those that have smaller and smaller values until all the available resources have been used. They can even choose to pursue things that don't yet exist because, as the general law shows, potential products or services that seem within reach (their accessibility is not too far on the negative side) are assigned quite a large value. When new resources become available (because their skills have grown or because various prices have dropped – i.e. the skills of other people have grown) the goals at the bottom of this hierarchy get activated. And of course, the hierarchy itself can change from time to time for the same reasons.
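The selection rule described here can be sketched as a simple greedy procedure. This is only an illustration under strong assumptions: every goal has a single numeric value and a single numeric cost in one shared resource, and the names `Goal` and `choose_goals` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    name: str
    value: float  # value judgment, e.g. derived from accessibility
    cost: float   # amount of the shared resource the goal consumes


def choose_goals(goals, resources):
    """Pick goals from the highest value downward until resources run out.

    Goals that are not affordable right now are skipped, not discarded:
    calling the function again with more resources "activates" them.
    """
    chosen = []
    for goal in sorted(goals, key=lambda g: g.value, reverse=True):
        if goal.value <= 0:
            break  # nothing of positive value is left
        if goal.cost <= resources:
            chosen.append(goal.name)
            resources -= goal.cost
    return chosen, resources


goals = [Goal("bread", 8, 3), Goal("shoes", 5, 4), Goal("gold", 9, 10)]
print(choose_goals(goals, 8))   # gold is out of reach for now
print(choose_goals(goals, 12))  # with more resources, gold gets activated
```

A greedy pass like this ignores the fact that goals influence one another, which is exactly the complication the robot has to deal with.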



The robot

Thus, in order to have a robot capable of assuming goals on its own, it has to incorporate the law of marginal utility. It would have to assess the accessibility of various goals given the available resources and its skills, and the law of marginal utility will allow the robot to make value judgments – to assign a certain value to each goal. Having done this, it would have a hierarchy of goals.

But the robot would also have to make some resource management judgments – it would have to assess in what ways following one goal (allocating some resources in some direction) limits the possibility of following other goals (that need the same resources) or raises the possibility of following other goals (a more complex goal can be achieved step by step – it can be divided into sub-goals). Thus, the situation is quite complex, as goals are not independent of one another – they are part of a structure that has to be determined by the resource management program. We can say this: there is a constant feedback between the value assigned to a certain goal (and thus the hypothetical decision to pursue it) and the accessibility of other goals. This feedback mechanism raises and lowers the values and accessibilities of all goals until a state of equilibrium is reached – at that moment one has determined the hierarchy of goals.
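The feedback between values and accessibilities can be sketched as a fixed-point iteration. Everything specific here is an assumption made for illustration: the function name `settle_hierarchy`, the `influence` table (a positive entry means pursuing one goal makes another more accessible, a negative entry means they compete), the linear way influences add up, and the damping factor. Such an iteration is not guaranteed to converge in general, which is why there is a cap on the number of rounds.

```python
def settle_hierarchy(base_access, influence, value_of,
                     damping=0.5, tol=1e-6, max_iter=1000):
    """Iterate accessibilities and values until they stop changing.

    base_access: {goal: accessibility if no other goal were pursued}
    influence:   {(source, target): how much each unit of the source's
                  value shifts the target's accessibility}
    value_of:    function mapping accessibility to value
    """
    names = list(base_access)
    access = dict(base_access)
    for _ in range(max_iter):
        values = {n: value_of(access[n]) for n in names}
        new_access = {}
        for n in names:
            # Only goals worth pursuing (positive value) affect others.
            shift = sum(influence.get((m, n), 0.0) * max(values[m], 0.0)
                        for m in names if m != n)
            target = base_access[n] + shift
            # Move only part of the way to the target to avoid oscillation.
            new_access[n] = access[n] + damping * (target - access[n])
        delta = max(abs(new_access[n] - access[n]) for n in names)
        access = new_access
        if delta < tol:
            break  # equilibrium reached (to within tol)
    values = {n: value_of(access[n]) for n in names}
    ranking = sorted(names, key=lambda n: values[n], reverse=True)
    return ranking, values


# "learn" is a sub-goal that brings the not-yet-reachable "build" closer.
ranking, values = settle_hierarchy(
    base_access={"learn": -1.0, "build": -3.0},
    influence={("learn", "build"): 0.1},
    value_of=lambda a: 10.0 - a ** 2,  # assumed parabola peaking at zero
)
print(ranking, values)
```

In this toy case the sub-goal pulls the larger goal's accessibility from -3.0 toward -2.1, which raises its value without changing the order of the two goals.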

But there is something more: the time factor. In some sense, time is nothing but one of the resources: If you are willing to pursue something for a long time (if you "have time"), that thing becomes more accessible. For how long are you willing to pursue various goals? What determines this time factor or the time allocation for various goals?

It's hard to tell what the complete answer to this question is in the case of people, but the simple answer, the one that could be used by the first robots and the one that is used in economics, is this: The time factor is determined by the predictive abilities of the robot's models of the world. In other words, if the robot can accurately predict how various factors change over a time period T, but no longer than that, it would be risky to make plans that involve those factors over a period longer than T.

For example, a construction company (or the town hall that pays the company) can invest a lot in a bridge because it can predict that the bridge will be useful for a long time. But a fashion company could not invest too much based on assumptions about what people will wear two years from now. Thus, predictability determines (at least approximately) the time factors.
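One way to read the time factor T in code: take the robot's forecast error for each successive period ahead and plan only as far as the error stays tolerable. This is a sketch under assumptions of my own. The function name `planning_horizon`, the per-period error lists and the tolerance are all made up for illustration, and the numbers are not measurements of anything.

```python
def planning_horizon(errors_by_period, tolerance):
    """Longest horizon T such that the forecast error stays within
    tolerance for every period from 1 up to T."""
    horizon = 0
    for period, error in enumerate(errors_by_period, start=1):
        if error > tolerance:
            break  # predictions beyond this point are too unreliable
        horizon = period
    return horizon


# Illustrative forecast errors per year ahead (hypothetical numbers).
bridge_usefulness = [0.1] * 20   # stays predictable for a long time
fashion_trends = [0.2, 0.7, 0.9]  # unpredictable after one year

print(planning_horizon(bridge_usefulness, tolerance=0.5))
print(planning_horizon(fashion_trends, tolerance=0.5))
```

A goal whose pursuit would take longer than the horizon of the factors it depends on would then be treated as risky, or have its accessibility lowered accordingly.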

When I think about such issues as making a robot with independent will, a robot capable of deciding for itself what goals to pursue, I get a mixture of two feelings. On the one hand, I think it is possible – as I tried to illustrate here. On the other hand, it seems so complicated! It seems to me that the problem is less a conceptual matter than one of sheer computation. At this point, we can only marvel at how our brains manage to solve the resource management problem and figure out how different goals hamper or help each other, how they assign values to goals (probably using something more complex than the law of marginal utility), how they estimate the time factors and account for the possible errors, and also re-evaluate the entire system from time to time, questioning the most basic goals and wondering about the meaning of life. And not only is our moist computer, weighing only a little more than one kilogram, capable of doing all this, but it also does it so fast!

