Having designated chess, with its clearly defined rules and winning conditions, as a game conducive to attack by artificially intelligent systems, we will now consider the idea of winning conditions in a more general light, specifically with regard to linguistic tasks. The question, in short, is, “Can we define linguistic games and/or tasks with clearly formalized success conditions?” (The term “success conditions” has been substituted for “winning conditions” because not all of the examples we’ll be looking at share an equal degree of family resemblance to games. Nevertheless, a task with a success condition and a game with a winning condition will often be similarly tractable for similar reasons, so the substitution should be a valid one in all the following cases.)

When dealing with success conditions, it often becomes all too easy to beg the question. Consider a hypothetical machine that takes a word-pair as input. For example: (“dog”, “cat”), (“knife”, “blue”), (“kill”, “murder”), and (“fuse”, “join”). The machine’s task is to output whether or not the words in the pair are synonyms. To program a context-free success condition for this task, the success-condition-checker algorithm would have to perform a task equivalent to the very task whose success it is designed to check. In other words, it would have to (figuratively) say to itself, “I just outputted that ‘kill’ and ‘murder’ are synonyms. Are these words, in fact, synonyms? Let me check…” This inquiry merely repeats the inquiry originally being tested.

An ideal success condition is one that is completely context-free. A computer that takes as input a chess position and outputs whether or not one of the players is in checkmate is almost trivial to implement. The fact that chess’s success condition is trivially defined allows those who research the problem domain of chess to focus entirely on the heuristics used to reach the winning condition, not on defining the winning condition itself. However, finding linguistic domains with this kind of clearly defined success condition is a pipe dream.

Luckily, there are several ways to skirt the issue of creating a context-free success condition.

1) Man-Made Datasets

The first (and perhaps the most common) solution is to fake a context-free success condition by creating a large dataset of context-dependent success conditions. For the synonym-checker-program above, the need for a built-in success condition can be somewhat alleviated by using prejudged, man-made datasets like these: (“dog”,”cat”, FALSE), (“knife”,”blue”, FALSE), (“kill”,”murder”, TRUE), (“fuse”,”join”, TRUE), etc.

The interesting thing about this method is that for every such dataset constructed, there exists a hypothetical (yet trivial) algorithm that can perform with 100% success on that dataset. The algorithm would simply be allowed to access a copy of the very dataset against which it is being tested. This is similar to a chess program that is equipped with a very extensive opening book but no other decision-making procedures. Such a program would be able to play high-level chess for several moves before its reservoir of data ran dry. Likewise, a synonym checker would be able to successfully recognize synonyms for as long as the word-pairs happened to exist in its dataset.
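A sketch of just how trivial such an algorithm is; the pairs below are the sample entries from above, standing in for a much larger prejudged dataset:

```python
# A trivial "synonym checker" that scores perfectly on its own test set,
# because it simply carries a copy of that set. The entries are the
# sample word-pairs from the text.
DATASET = {
    ("dog", "cat"): False,
    ("knife", "blue"): False,
    ("kill", "murder"): True,
    ("fuse", "join"): True,
}

def are_synonyms(word_a, word_b):
    """Look the pair up in the prejudged dataset; fail on unseen pairs."""
    key = (word_a, word_b)
    if key not in DATASET:
        raise KeyError("pair not in dataset -- the reservoir has run dry")
    return DATASET[key]
```

The moment the input falls outside the memorized data, the "reservoir runs dry" and the algorithm has nothing to say.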

The goal of using such a dataset, however, is not usually to make a trivial algorithm that operates only on predefined input data but to check (and help train) a less trivial algorithm that operates on a more general dataset. In effect, using such a dataset can turn a problem domain which is not game-like into one with reasonably well-defined success conditions.

As a side note, in the particular case of the synonym-checker, the best way to stupefy the whole task might actually be to make a more-or-less exhaustive list of every possible word-pair. Thanks to many centuries of stellar lexicographic work, almost every non-technical word in the English language has been collected, documented, and defined. And to make matters easier, the average educated person only knows about 20,000 of these words, so making a table of every possible word-pair would entail about 20,000^2 entries. If each English word is, on average, 7 letters long and if each letter were represented as an eight-bit ASCII character, then the number of total bits used to store all of the words (two per entry) would equal 20,000^2 * 2 * 7 * 8. To each entry, we would need to append a TRUE/FALSE bit in order to designate whether the pair is a synonym or not. That’s another 20,000^2 = 400,000,000 bits. And I suppose we would need a special ASCII character used as a separator (like a comma) between the words. That’s another eight bits per entry: 3,200,000,000 bits. So the grand total is:

(20,000^2) * 2 * 7 * 8 + 20,000^2 + (20,000^2) * 8 = 48,400,000,000 bits

Because there are 8,589,934,592 bits in a gigabyte, we would require a database totaling a measly 5.6 gigabytes. These days, this kind of hard drive space is pretty small. And even then, we could greatly reduce the size by including in the database only valid synonyms — with the assumption that anything not in the database is a non-synonym. Granted, there are actually about 500,000 words in the Oxford English Dictionary (which, at 625 times as many entries, would require a word-pair database of more than 3,500 gigabytes) — but using a database with only the 20,000-word vocabulary possessed by an average intelligent human suggests that the algorithm and a human being would score equally well in the synonym-recognizing game. And that’s usually the goal, after all: human-level performance on intelligence-requiring tasks.
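The arithmetic can be checked mechanically. A small sketch under the same assumptions: a 20,000-word vocabulary, 7 letters per word, 8-bit ASCII, one separator character and one TRUE/FALSE bit per entry, and two words per entry:

```python
# Back-of-the-envelope check of the storage estimate above.
VOCAB = 20_000
entries = VOCAB ** 2                 # every possible word-pair
word_bits = entries * 2 * 7 * 8      # two 7-letter ASCII words per entry
flag_bits = entries * 1              # one TRUE/FALSE bit per entry
sep_bits = entries * 8               # one 8-bit separator per entry
total_bits = word_bits + flag_bits + sep_bits
gigabytes = total_bits / 2**33       # 8,589,934,592 bits per gigabyte
print(total_bits, round(gigabytes, 1))  # prints: 48400000000 5.6
```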

2) Partially Checkable Success Conditions

Instead of making the success condition context-independent by rolling together a lot of context-dependent conditions, another option is to make use of context-free conditions that aren’t quite sufficient. Before looking at an example, here’s a sample task:

Given a statement, produce a statement in response that is both 1) something an intelligent human might say and 2) doesn’t repeat any previous outputs. For example:

Input: “What’s your name?”

Output: “I’m Stephen.”

Input: “That cat is pretty.”

Output: “I hate cats.”

This is a stripped-down version of the Turing test — a version that completely disregards the context of any larger discourse. Nonetheless, this is a hard problem for a computer to deal with. Certainly, we can write an algorithm that checks whether a possible output has been previously outputted; but checking whether the output is something an intelligent human might say presupposes an ability to recognize something fundamental about intelligent language production, which is precisely the ability this particular task is designed to test. Still, we can build a potentially adequate success condition by checking quasi-necessary conditions such as grammaticality. There exist part-of-speech taggers and grammar checkers which can judge whether “I hate cats” and “I’m Stephen” are syntactically valid, and grammaticality can plausibly be stipulated as a necessary condition within the realm of this particular task. But grammaticality can, by no means, be considered a sufficient condition. After all, many unintelligent responses may be grammatically correct. But at least it’s something. If an algorithm always produces grammatical responses, it’s quite a bit closer to its goal than an algorithm that produces gibberish.
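A minimal sketch of such a partially checkable success condition. The `looks_grammatical` function below is a crude stand-in for a real part-of-speech tagger or grammar checker, so the filter captures only necessary conditions (novelty plus a superficial grammaticality test), never sufficiency:

```python
def looks_grammatical(sentence):
    # Placeholder for a real grammar checker: accept any non-empty
    # sentence that starts with a capital letter and ends with
    # terminal punctuation. A real checker would parse the sentence.
    return bool(sentence) and sentence[0].isupper() and sentence[-1] in ".!?"

def make_checker():
    """Build a checker enforcing the two conditions of the sample task."""
    seen = set()
    def check(response):
        # Pass iff the response is novel and survives the grammar filter.
        if response in seen or not looks_grammatical(response):
            return False
        seen.add(response)
        return True
    return check
```

A grammatical but vacuous response still passes, which is exactly the sense in which this condition is necessary-but-not-sufficient.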

3) Tackle the Success Condition as a Separate Problem

The problem of defining a success condition can sometimes be tackled as its own problem — perhaps as a prerequisite to converting the original problem domain into a game-like problem domain.

This route suggests a combination of the two previously mentioned ways of dealing with an elusive success condition. The first method above should be recognizable as part-one of the two-part task of stupefication: available context-dependent knowledge must be distilled into a database-like form. And the second method above should be recognizable as part-two of the two-part task of stupefication: discoverable context-independent heuristics must be applied to the problem domain whenever the previously mentioned context-dependent knowledge is inadequate. If problem X has a success condition whose verification is itself a task in need of stupefication, then perhaps problem X is not ready to be tackled yet. The problem of making a program that will pass the Turing test provides a nice example of where abandoning the problem, in favor of tackling the evaluation of the problem’s success condition, could lead to an extremely fruitful simplification.

Instead of making a Turing Test Passer (TTP), suppose we turned our attention to making an Automated Turing Test Judge (ATTJ), which would take a conversation as input and would output whether or not the conversation’s interlocutors were both human beings. This is hardly an easy problem, but it is certainly an easier one. Why? Because its success conditions are extremely conducive to being stupefied by constructing a large dataset. There exist hundreds of thousands of literary dialogues, television scripts, recorded conversations, transcribed interrogations, instant messenger exchanges, and online forum debates — all available for use in a multifarious dataset of different linguistic interactions, each of which an intelligent human being would likely find intelligible. A dataset of conversations that a human being would likely not find intelligible would be harder to come by, but one could be generated automatically by scrambling random words and/or sentences in the valid data. This dataset of good and bad input could then be used to test any number of artificially intelligent systems designed to be an ATTJ. This kind of automated testing is precisely what a TTP project lacks. A TTP can only be evaluated according to the judgment of a human being who either is or is not fooled by it. With a more clearly definable success condition, an ATTJ project would have a definite leg up on its less clearly defined counterpart. And who knows? If a successful ATTJ were to be created, it might constitute a major step toward creating a successful TTP. Formally defining the winning condition of the Turing test would be like programming a chess computer to recognize checkmate — an ability which it could hardly survive without.
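The negative-example generator described above might be sketched like this. The turn-list conversation format and the whole-conversation word shuffle are illustrative assumptions:

```python
import random

# Scramble the words of a valid conversation to produce text that a
# human judge would likely find unintelligible, yielding negative
# examples for an ATTJ dataset.
def scramble_conversation(turns, seed=0):
    """Shuffle all words across the conversation, preserving turn lengths."""
    rng = random.Random(seed)          # seeded for reproducibility
    words = [w for turn in turns for w in turn.split()]
    rng.shuffle(words)
    scrambled, i = [], 0
    for turn in turns:
        n = len(turn.split())
        scrambled.append(" ".join(words[i:i + n]))
        i += n
    return scrambled
```

Each scrambled conversation keeps the surface statistics of the original (same words, same turn lengths) while destroying its intelligibility, which is what makes it a useful hard negative.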

In closing, it is important to realize that creating an ATTJ and creating a TTP are not equivalent tasks. An off-the-cuff and incorrect supposition would be that if a computer can judge a Turing test then it can pass a Turing test. But this simply isn’t necessarily the case. We can see this clearly by assuming the existence of a theoretically perfect ATTJ — a program that can (for any inputted conversation) output whether or not the conversation’s interlocutors were human beings. In general, the task of deciding whether an input has a given property may be impressive — but it is much less difficult than generating an output with a given property. Suppose we attempted to bootstrap our ATTJ into a TTP by using the following method:

1) At any given point during the Turing test, if it is the ATTJ’s turn to talk, generate all possible responses that are one word in length (all 20,000 of them, let’s say.) We’ll call this set “R” (for “responses”) and we’ll call any given word in it “r.”

2) Give the ATTJ the following input: The conversation-so-far plus the first “r” in “R.” If the ATTJ decides that the conversation is one that a human being would have, then the ATTJ should respond with “r.” Otherwise, try the next “r” and the next “r” until one works, or until “R” is exhausted.

3) If a valid “r” is not found, proceed to generate all possible responses that are two words in length (all 20,000^2 of them, let’s suppose.) Call this set “R-2” and repeat step 2 above for all “r-2.” Then repeat step 3 again for “R-3” and “R-4” and so on until one of the following occurs: A) a valid response is found, in which case the process starts over at step 1; B) the size of “R-N” becomes larger than the computer’s available memory; C) The time it takes for the ATTJ algorithm to evaluate the validity of the current “r-N” is longer than an acceptable time limit.
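The three-step method can be sketched as a generate-and-test loop. `attj_accepts` stands in for the hypothetical perfect ATTJ, and the vocabulary and length cap are toy-sized; the point of the sketch is the cost, since the candidate space grows as |vocabulary|^N:

```python
from itertools import product

def next_response(conversation, attj_accepts, vocabulary, max_n=3):
    """Bootstrap a TTP from an ATTJ by exhaustive generate-and-test."""
    for n in range(1, max_n + 1):                   # step 3: lengthen responses
        for words in product(vocabulary, repeat=n):  # steps 1 and 3: all n-word responses
            r = " ".join(words)
            if attj_accepts(conversation + [r]):     # step 2: ask the judge
                return r
    return None                                      # "R" exhausted up to max_n
```

With a 20,000-word vocabulary, the inner loop runs 20,000^n times at length n, which is why the method collapses for anything but small N.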

Assuming the existence of a perfect ATTJ, this method might actually work for small N-values. But larger N-values quickly become intractable. And certain statements like “What’s your favorite 100-word sentence?” would seem to require responses with a large number of words. (An “R-N” of size 20,000^100 is likely to be intractable on most modern hardware.)

This is hardly a formal proof that the existence of an ATTJ does not imply the existence of a TTP. Nonetheless it is strong evidence that a very simple algorithm would fail to convert an ATTJ into a TTP. Furthermore, the need for a more complex algorithm suggests with even more force that an ATTJ is not a sufficient condition for a TTP.

Additionally, those familiar with processes like evolutionary computation and neural net training will recognize that the creation of an ATTJ program is a domain with characteristics that make it very conducive to problem solving techniques that specialize in pattern recognition. A successful ATTJ, if one is ever created, is likely to utilize the pattern recognition abilities of neural nets and/or complex statistical models capable of capturing the common mathematical features of valid human-generated conversations. These methods, while historically highly successful at recognizing the key features of various systems, do not immediately imply the ability to generate new systems with the same features. For example, a neural net with the ability to differentiate between a human face and a non-human face would not immediately be able to produce pictures of human faces. And a statistical model that could evaluate whether words in close proximity were words that tended (in common usage) to appear in close proximity would not necessarily be able to generate its own sensical sentences.

The upshot of everything is that we can gain a lot by first trying to turn a problem domain into a game-like domain with formally definable success conditions. And, failing this, much can be gained by tackling the evaluation of a domain’s success conditions as a unique problem domain in itself. This shifting of domains may lead to a useful reduction in the required level of algorithmic complexity, just as creating an Automated Turing Test Judge represents an easier milestone for artificial intelligence researchers than does the creation of a Turing Test Passer.

In chess, controlling the center isn’t always a good idea. For example, when you have the opportunity to effectively sacrifice your bishop on h6, or when you’re at risk of getting checkmated, controlling the center ought to be far from your mind. It would seem that, in addition to the heuristic “Control the center,” a computer also needs heuristical methods for choosing which heuristics to apply in a given situation: i.e., “When the sacrifice on h6 doesn’t lead to a win, try to control the center” or “When you’re about to get checkmated, forget about controlling the center.” Notice how each of these second-order heuristics serves to locate the first-order heuristics in a particular context. By contextualizing essentially context-independent chess wisdom like “Control the center,” second-order heuristics help raise chess-playing machines to a much higher level of play than any collection of first-order heuristics would be capable of. The problem, however, should be evident: which second-order heuristic should be applied at any given time? For example, what if I’m about to get checkmated but I also have a potentially fruitful bishop sacrifice on h6? The answer is, it depends on the context. Does the bishop sacrifice put the other king in check? Just how close are you to getting checkmated anyway? So a third-order heuristic is necessary. Etc.
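One way to picture a second-order heuristic is as an explicit dispatch rule that inspects the context and selects a first-order heuristic. The position flags below are stand-ins, not real chess evaluation:

```python
# A second-order heuristic as a dispatch rule: it looks at the context
# and decides which first-order heuristic applies. The dictionary keys
# are hypothetical flags, not a real position representation.
def choose_heuristic(position):
    """Return the name of the first-order heuristic to apply."""
    if position.get("about_to_be_mated"):
        return "escape mate"       # "forget about controlling the center"
    if position.get("h6_sac_wins"):
        return "sacrifice on h6"   # "when the sacrifice leads to a win..."
    return "control the center"    # the default first-order heuristic
```

Note that the ordering of the two checks is itself a decision: precisely the kind of decision a third-order heuristic would have to make.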

It may sound like I’m setting up an ad infinitum proof that computers cannot play chess. But that, of course, would be silly, thanks to Deep Blue, which proved such arguments to be deeply suspect. Instead, I’m trying to show that the act of stupefication involves, at least in part, the navigation and formalization of a potentially infinite recursive stack of heuristics. That the layers of heuristical guidance are not, and indeed need not be, piled infinitely deep is proof that there exists a discoverable stopping place amidst the recursive descent. After all, we know that a computer can, with a finite stack of recursive rules, accomplish a high level of chess play.

This should indicate to us that arguments which seek to predict “what computers can’t do” and which cite as evidence the infinitely recursive nature of a particular problem domain may need to be reevaluated — given that a finite number of recursive heuristics may be sufficient to emulate human-caliber performance. And any suppositions about a potentially infinite stack of heuristics may be wholly irrelevant.

I mention this because linguistic systems are known to possess potentially unbridled (and largely uncharted) recursive structures. And incidentally, linguistic systems happen to be an important frontier in AI research. So my question (to paraphrase Turing) is, can a machine be programmed with enough heuristics (of various orders) to function at an apparently high level of linguistic competence? If the answer is yes (which is a huge “if”), then perhaps one way to effect such a breakthrough would be to first begin by programming systems to play very simple language games that have clearly defined winning conditions and clearly defined transition rules (as mentioned in this post.)

More to come regarding language games.

Simply put, heuristics are rules for governing how a formal system ought to move (where “move” is understood to mean “transition from one state to another according to valid transition rules.”) But for our purposes, we should understand heuristics to have a quality of being definable fully within the system itself. In other words, the heuristic “Control the center” would have been useless to Deep Blue if the idea of “control” and the idea of “center” were impossible to define syntactically. (Note: This should be reminiscent of our similar stipulation that winning conditions be definable within the system itself — an idea mentioned here.)

Furthermore, as we have already mentioned, heuristics that govern a formal system designed to play a game (like chess) can be understood to be much more context-independent than rules of the form, “Look up your position in an opening book and make the indicated move.” It is important to note, however, that context-dependence is not a binary quality — but rather a spectrum. Consider a simple game in which two players are competing to acquire the largest number of stones, which are initially residing in a central pile. The rules are that, on your turn, you may 1) take a stone from the central pile, or 2) put a stone back into the central pile. The best strategy in this (rather boring) game should be obvious: “On your turn, take a stone. Never put one back.”

Within this simple game, the winning heuristic can be considered context-free. It doesn’t matter how many stones are in the central pile, or how many stones your opponent has, or how many previous moves have been played — you always want to take one stone. This heuristic depends on no other game-related concerns.
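The stone game makes this concrete. The heuristic below ignores every argument describing the game state, which is exactly what makes it context-free (the function signatures are illustrative):

```python
# The winning heuristic for the stone game. It receives the full game
# context but depends on none of it: always take a stone.
def best_move(pile, my_stones, opponent_stones, move_history):
    return "take"   # never put one back

def play(initial_pile, turns):
    """Both players follow the heuristic until the pile is empty."""
    pile, scores = initial_pile, [0, 0]
    for t in range(turns):
        if pile == 0:
            break
        if best_move(pile, scores[t % 2], scores[(t + 1) % 2], None) == "take":
            pile -= 1
            scores[t % 2] += 1
    return scores
```

Because `best_move` consults nothing, the first player always ends up with at least as many stones as the second, no matter the pile size or history.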

In chess, most (if not all) rules are more dependent on context than in the previous example. Some rules are more context-dependent than others because they have their context built into them explicitly: i.e. “If your position looks like this (context), always play this move (transition).” But most of the truly interesting heuristics are more context-independent: like, “Control the center,” and “Don’t bring your queen out early.”

Ironically, the really tricky parts of stupefying a task involve trying to apply such more-or-less context-independent heuristics in their proper context. (We should certainly note that, “more-or-less context-independent heuristics” are a far cry from context-free heuristics.) When is it best to control the center? How early is too early with regard to bringing the queen out? Should one bring out one’s queen early if that leads to controlling the center? How does one decide between two context-independent heuristics? How does one decide just how context-independent a particular heuristic is? This will be the subject of the next post.

Note: This post is a further unpacking of a concept introduced in this post. That concept can be stated briefly as follows: Stupefication works, in part, via the distillation of context-dependent human knowledge into a database. But the computer makes use of such “knowledge” in ways that do not mirror human usages. And to draw a naive parallel between the way computers use databases and the way human beings use databases is to commit a fallacy which I see committed all too often by those who espouse the glories of Good Old Fashioned Artificial Intelligence.

Human beings use chess opening books like maps to a territory. We recognize an intrinsic connection between games of chess and strings of numbers and letters like

“1) e4 e5 2) Nf3 Nc6 3) Bb5”

A human being familiar with algebraic notation will see these symbols and connect them with their meaning, just as many of us, upon seeing the word “apple,” conjure up an image of a red fruit. Knowing the connection between symbol and meaning, between the syntax and its semantics, the human chess player can use the recorded chess game like a map, locating a previous game that reached a position identical to (or very similar to) her current game’s position. Then she can use that game to follow a previously cut path through the thicket of chess variations which lie ahead of her.

Computers don’t. By this, I mean that a computer “follows” the map in the sense that I might “follow” a straight line when falling down a well. The machine makes no semiotic connection between the syntax of algebraic notation and the semantics of chess activity. It merely obeys physical laws — albeit a very complex set of them. That’s the neat thing about algorithms. They allow computer scientists to subject computers to a mind-bogglingly complex set of physical laws (in this case, a series of high and low voltages, represented as ones and zeros) such that they can play chess, emulate bad conversations, and display web pages like this one. But despite the many layers of abstraction that lie between Deep Blue’s spectacular middle-game play and the succession of high and low voltages that make it all possible, the fact remains that the computer functions in a purely deterministic way. Incidentally, this is exactly how you or I would function when subjected to the law of gravity after being dropped down a well.

“Ah, yes,” say the gallant defenders of Strong AI (in Searle’s sense of the term.) “But how can you prove that human beings are actually doing anything different? Even human beings may operate according to purely deterministic rules — in which case, we all ‘fall down wells’ when we think.”

The only weight this objection carries is the weight of confusion. When I say, “Humans do semantics; computers just operate deterministically on syntax,” the objection says, “Ah, but how do you know semantics isn’t just a series of deterministic operations on syntax?” Philosophers like Searle and Floridi have tackled this issue before. Here’s my stab at dealing with it concisely:

Even if I grant that when we say “semantics” we are actually referring to a deterministic and essentially digital process, the fact remains that the deterministic-and-essentially-digital-process that I am engaged in when doing “semantics” is of a fundamentally different nature than the deterministic-and-essentially-digital-process that Deep Blue is engaged in when its software obeys the lines of code that tell it to refer to its opening book and to play 2) Nf3 in response to 1) … e5. So can I prove that the universe doesn’t operate deterministically? No. Can I prove that our minds aren’t, by nature, digital? No. But I can reveal the aforementioned objection for the house of smoke and mirrors that it is. My point (and Searle’s point) is that human beings do one thing and computers do something very different. And making such a proposition doesn’t require a complete knowledge of the nature of semantics or any amount of empirical evidence that thought isn’t digitally based. The objection mentioned above implies the need for such a presupposition even though no such need exists. And the onus lies on the objector to prove that human beings and computers engage in processes that share any kind of functional similarity.

Dreyfus attacks several of the foundational presuppositions of AI in his book What Computers Can’t Do.

1) The Biological Assumption — That we act, on the biological level, according to formal rules, i.e., that our brain is a digital computer and our minds are analogous to software.

2) The Psychological Assumption — That, regardless of whether our brains are digital computers, our minds function by performing calculations, i.e., by running algorithms.

3) The Epistemological Assumption — That regardless of whether our brains are digital computers or whether our minds run algorithms, the things our minds do can be described according to formal rules (and hence, by algorithms.) This is, naturally, a weaker assumption, yet one required by the idea of stupefication.

4) The Ontological Assumption — That “the world can be exhaustively analysed in terms of context free data or atomic facts” (205).

The epistemological assumption is the one that we ought to be concerned with at the moment, as evidenced by this quotation on the matter:

“[The question] is not, as Turing seemed to think, merely a question of whether there are rules governing what we should do, which can be legitimately ignored. It is a question of whether there can be rules even describing what speakers in fact do” (203).

In light of the previous post on descriptive rules, we can posit that stupefication requires a kind of epistemological assumption: that mental tasks like that of playing chess and (perhaps) of communicating in a natural language can be described by formal rules, even if those formal rules have nothing to do with what we happen to be doing while performing that task.

In Dreyfus’s book, he undermines the epistemological assumption (along with the three other assumptions) by showing that they cannot be held in all cases and with regard to all human activities. However, I don’t think this is necessarily very crippling. Even if there can be no comprehensive set of rules or formal systems that fully describe all intelligent human behaviour, AI is hardly done for. The questions merely change from ones like “Can we make a formal system that fully describes task X?” to “How close can we get to describing task X in a formal system?” And this may well put us back in Turing’s court, where the benchmark is how many people the formal system can fool.

In other words, I’m questioning the validity of this proposition:

“[The] assumption that the world can be exhaustively analyzed in terms of context free data or atomic facts is the deepest assumption underlying work in AI and the whole philosophical tradition” (Dreyfus, 205).

Dreyfus is probably right in terms of the majority of the research that has been done in AI over the past century. But this ontological assumption need not be the “deepest assumption” underlying projects that seek to stupefy. For in stupefication, one takes up the mantle of the epistemological assumption while relegating the ontological assumption to a hypothesis that must be empirically verified, not necessarily assumed — and certainly not assumed “exhaustively,” as Dreyfus suggests.


Dreyfus, Hubert L. What Computers Can’t Do. New York: Harper and Row, 1979.

“Consider the planets. They are not solving differential equations as they swing around the sun. They are not following any rules at all; but their behavior is nonetheless lawful, and to understand their behavior we find a formalism — in this case, differential equations — which expresses their behavior according to a rule” (Dreyfus, 189).

In other words, rules are descriptive, not prescriptive. Given the proper descriptive rules, computer scientists and mathematicians can model the movements of the planets, even though the planets never do any mathematical calculations. In a similar way, given the proper descriptive rules, computer scientists might be able to model the movements of a human chess player and (perhaps) the “movements” of a human interlocutor. But the planets, the chess player, and the interlocutor need not have anything whatsoever to do with the formal systems that describe them. This is the fallacy that stupefication helps us skirt and which traditional GOFAI often fails to skirt.
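A minimal illustration of descriptive rules, assuming nothing beyond elementary kinematics: a falling body computes nothing, yet a few lines of numerical integration describe its behavior quite accurately.

```python
# The falling object "follows" no rules; this integration of dv/dt = -g
# merely describes its behavior, just as differential equations describe
# the planets without the planets doing any mathematics.
def simulate_fall(height, dt=0.001, g=9.81):
    """Step the object forward in time until it reaches the ground."""
    x, v, t = height, 0.0, 0.0
    while x > 0:
        v -= g * dt     # update velocity under constant gravity
        x += v * dt     # update position
        t += dt
    return t            # approximates the closed form t = sqrt(2h/g)
```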

Thus, the kind of language games mentioned at the end of my previous post, and which we’ll talk about later, need not be games that human beings play and need not be governed by rules that govern human linguistic practices.


Dreyfus, Hubert L. What Computers Can’t Do. New York: Harper and Row, 1979.

Hubert Dreyfus wrote his book What Computers Can’t Do long before Luciano Floridi came onto the scene. Yet the following point seems specifically constructed to shed light on the problem of relevance (mentioned in this post):

“As long as the domain in question can be treated as a game, i.e., as long as what is relevant is fixed, and the possibly relevant factors can be defined in terms of context-free primitives, then computers can do well in the domain” (Dreyfus, 27).

Dreyfus doesn’t expound upon exactly what kinds of games he has in mind; but I think it’s safe to say that he isn’t talking about all games. After all, there are certainly games like soccer (which is analogue) and Nomic (which is unstable) that would foil a computer readily.

But there are certain games with qualities that make them ideal domains for attack by projects in artificial intelligence. Chess is one of these games. Let us try to itemize the qualities that make such games so conducive to formalization:

1) Such games consist of states.

2) Such games have rules that govern changes in state.

3) Such games are stable, i.e., the rules either stay constant or change only in correspondence with other rules that do stay constant.

4) Such games are transparent, i.e., the rules can be known because they are simple enough to understand.

5) Such games have a bounded set of rules, i.e., the rules can be itemized because they are finite in number.

6) Such games have a bounded set of states, i.e., the number of possible game states is finite, even if astronomical.

7) Such games have winning conditions that can be assessed from within the system itself, i.e., there are rules that can designate some states as won and others as lost. (Note: we can weaken this condition to include games that cannot be won or lost; but there must still exist rules that designate some states as better than others or worse than others, in order for such games to be conducive to productive computational analysis.)

To wrap all of this into a tidy package: such games (considered to be a collection of states, transition rules, and evaluation rules) must be representable as a finite state machine. If so, then they can be represented syntactically. And algorithms can be written for their governance.
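As a sketch of what such a syntactic representation looks like, here is a toy game written down directly as a finite state machine; the states, moves, and values are purely illustrative:

```python
# A game satisfying criteria 1-7, represented as a finite state machine:
# a finite set of states, a transition function over (state, move) pairs,
# and an in-system evaluation of terminal states.
GAME = {
    "states": {"start", "mid", "won", "lost"},
    "transitions": {                 # rules governing changes in state
        ("start", "advance"): "mid",
        ("mid", "push"): "won",
        ("mid", "blunder"): "lost",
    },
    "evaluation": {"won": 1, "lost": -1},   # winning condition, assessed within the system
}

def play_moves(game, moves, state="start"):
    """Apply a sequence of moves and report the final state and its value."""
    for m in moves:
        state = game["transitions"][(state, m)]
    return state, game["evaluation"].get(state)  # None for non-terminal states
```

Because the whole game is a lookup table, algorithms "can be written for its governance" trivially; chess differs only in the astronomical size of the same kind of table.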

Bear in mind, however, that this is a necessary condition, not a sufficient one. The above criteria merely distinguish games that can be formalized from ones that can’t. But within the set of games that can be formalized, there can (and most likely do) exist games with such complex states or such complex transition rules that they are computationally intractable. So we must add another necessary condition:

8) Such games must be tractable, i.e., not only must they have a finite number of states, transition rules, and evaluation rules; these states and rules must also be few enough and simple enough that the state-to-state transitions required for playing the game can be computed effectively.
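A back-of-envelope calculation shows why finiteness alone doesn’t buy tractability. The figures below are commonly cited estimates (a typical chess position allows roughly 35 legal moves, and Deep Blue was reported to examine on the order of 200 million positions per second), not numbers from the text:

```python
# Why a finite game can still be intractable: the game tree of chess
# grows roughly as 35^depth, which quickly dwarfs any search budget.
BRANCHING_FACTOR = 35            # rough average legal moves per position
POSITIONS_PER_SECOND = 200e6     # Deep Blue's reported search speed

def nodes(depth):
    """Approximate positions in a full game tree of the given depth."""
    return BRANCHING_FACTOR ** depth

for depth in (2, 4, 8, 16):
    seconds = nodes(depth) / POSITIONS_PER_SECOND
    print(f"depth {depth:2d}: ~{nodes(depth):.1e} positions, "
          f"~{seconds:.1e} s of search")
```

Even at Deep Blue’s speed, exhaustive search runs out of time within a handful of moves — which is why evaluation heuristics, and not mere formalization, carried the day.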

But even this addition doesn’t guarantee that the game will be a domain in which artificial intelligence projects can thrive. Formalization and tractability don’t imply that an artificially intelligent system (or its creators) will be capable of applying heuristics and/or strategic rules necessary for a high level of play.

Nonetheless, considering Deep Blue’s success in the face of so much skepticism, a little optimism might be in order if the above conditions happen to be met.

In closing, my food-for-thought question of the day is, “Can linguistic domains be transformed into games that meet the above criteria?” I think we’ll visit Wittgenstein soon. He has quite a bit to say about language games.


Dreyfus, Hubert L. What Computers Can’t Do. New York: Harper and Row, 1979.

The Problem of Relevance

April 26, 2008

Let me start by giving a quotation we’ve dealt with before:

“[For an AI project to be successful, it is necessary that] all relevant knowledge/understanding that is presupposed and required by the successful performance of the intelligent task, can be discovered, circumscribed, analyzed, formally structured and hence made fully manageable through computable processes” (Floridi, 146).

Floridi’s use of the word “relevant” is suspect here. He gives us no indication of whether he means “that which is relevant to us when performing a task X” or “that which is relevant to a computer when performing the stupefied version of task X.” Considering that Floridi advocates a non-mimetic approach to artificial intelligence, I think we should assume that he means the latter.

But this leaves the would-be creators of an artificially intelligent system in a pickle:

Q: How do we stupefy task X?

A: You need to make a new task or series of tasks which require less intelligence overall.

Q: Ah. How do we do that?

A: Luciano Floridi might suggest (given the above quotation) that you need to first discover, circumscribe, analyze, and formally structure all the knowledge relevant to the stupefied task.

Q: But how do I know what’s relevant before the stupefied task even exists?

In other words, I know what’s relevant to me when, say, playing chess. But I also know that the stupefied task of chess-playing bears little resemblance to the task I perform when playing chess. So I can conclude that the things relevant to the stupefied task of chess-playing might be things that aren’t at all relevant to me when playing chess.

Here’s a more formal statement of the problem of relevance:

How can we discover rules for governing a formal system that does X if we don’t yet know how the formal system will do X?

I don’t have an answer except to say that this problem reveals that the task of stupefication — in addition to being one that can require tremendous human intelligence — is also one that can require a great deal of human creativity. I think you have to say, “Hmmm. Maybe a chess-playing machine could benefit from a complicated formula for weighing the comparative values of a chess position’s material, structural, and temporal imbalances.” Human beings don’t use such a strict formula, so it’s simply an educated guess that a computer could effectively put one into practice. The reason the guess is educated, though, is that the following mental maneuver gets performed: “If a human being could evaluate a position based on a complicated formula, precisely and accurately calculated, she might play a better game of chess.” This hypothetical reasoning is where the creative act resides.
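To make the “educated guess” vivid, here is a hypothetical sketch of the kind of complicated formula described above: a linear score that weighs material, structural, and temporal imbalances. Every feature name and weight is illustrative — a guess a human designer might make, not Deep Blue’s actual evaluation:

```python
# A hypothetical chess evaluation: weighted material, structural, and
# temporal terms. All weights and features are illustrative guesses.
MATERIAL_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def evaluate(position):
    """Score a position from White's point of view, given simple
    feature counts supplied by the caller."""
    material = (sum(MATERIAL_VALUES[p] for p in position["white_pieces"])
                - sum(MATERIAL_VALUES[p] for p in position["black_pieces"]))
    structure = (position["black_doubled_pawns"]
                 - position["white_doubled_pawns"])
    tempo = position["white_developed"] - position["black_developed"]
    # The weights encode the designer's hypothesis about what matters.
    return 1.0 * material + 0.3 * structure + 0.2 * tempo

pos = {
    "white_pieces": ["Q", "R", "R", "B", "N", "P", "P", "P", "P", "P"],
    "black_pieces": ["Q", "R", "R", "B", "B", "P", "P", "P", "P", "P"],
    "white_doubled_pawns": 1, "black_doubled_pawns": 0,
    "white_developed": 3, "black_developed": 2,
}
print(evaluate(pos))
```

No human calculates such a sum over the board; the creative act lies in hypothesizing that a machine which could calculate it precisely might thereby play well.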

Of course, some such guesses are more educated than others. For example, it makes sense to assume that a computer could benefit (at least in the opening phase of a chess game) from the ability to reference the moves of a few thousand grandmaster games and to copy those moves in its own games. This assumption makes sense because a human too would benefit from such grandmaster assistance — which is why this activity is generally frowned upon by chess tournament administrators.
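The grandmaster-reference idea can be sketched as an “opening book”: a lookup table from the moves played so far to a previously recorded reply. The lines below are illustrative fragments, not a real game database:

```python
# A minimal opening-book sketch: copy grandmaster moves while the game
# stays in the book; fall back on search once it leaves.
OPENING_BOOK = {
    (): "e4",
    ("e4",): "c5",                    # Sicilian Defence
    ("e4", "c5"): "Nf3",
    ("e4", "c5", "Nf3"): "d6",
}

def book_move(moves_so_far):
    """Return a recorded reply for this move sequence, or None once
    the game has left the book."""
    return OPENING_BOOK.get(tuple(moves_so_far))

print(book_move(["e4", "c5"]))   # still in book
print(book_move(["a3"]))         # out of book
```

The guess is well educated precisely because the analogy holds: a human consulting such a table mid-game would benefit too, which is why tournaments forbid it.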

Unfortunately, the problem of relevance (and the creativity required to surmount it) can suddenly become critical when the domain of the problem is such that we cannot reason by analogy, cannot consider that which is relevant to us and hypothesize that something similar might be relevant to a formal system. For example, when trying to stupefy linguistic tasks instead of chess-playing tasks, we are immediately confronted by the problem that we don’t even know what’s relevant to us when we speak, let alone what might be relevant to a formal system that functions nothing like us. Why do we say what we say? How do we know what we say is correct? How do we understand each other? These are questions that have been attacked in countless ways by philosophers for over two thousand years.

And the upshot of the problem of relevance is that, even if some lucky philosopher happened to have gotten it right, happened to have discovered the essence of language and how we use it, that answer may or may not have anything to do with how we might go about creating a formal system that mimics human linguistic practices.


Floridi, Luciano. Philosophy and Computing: An Introduction. London and New York: Routledge, 1999.

When we stupefy a task, we construct another task or series of tasks, the sum of which requires less intelligence overall.

The trivial example:

If my task is to produce a novel, you could stupefy the task for me by writing the novel and letting me copy it. The act of copying a novel requires less intelligence than creating a novel, yet it appears — if examined only in terms of its final product — identical to the act of creation. By letting me copy your novel, you have stupefied a potentially intelligent act by extracting the intellectual requirement from it, taking the burden of intelligence upon your own shoulders, doing the intelligence-requiring work, and letting me produce the output.

Some people may rightly object to this example by pointing out that, overall, the amount of intelligence utilized has not decreased at all (and may in fact have increased). For whether or not I wrote the novel, a novel was written; and writing a novel requires an expenditure of intelligence no matter who writes it. But let us remember that stupefication is often a task that requires intelligence (see this post). Thus, you may have stupefied my task of creating a novel at the expense of great expenditures of your own intelligence. Likewise, the makers of Deep Blue created a computer that could trounce any of its creators in a chess game — but making the machine was a long and difficult road.

But no matter how you slice it, my aptly named “trivial example” is so absurdly simple as to be useless in the context of manufacturing artificial intelligence. After all, virtually any task (with a few notable exceptions) can be stupefied in this way. But doing so will never constitute the creation of an artificially intelligent system.

Deep Blue would have been a silly machine indeed if its “internal” calculations had been performed in real-time by human grandmasters hiding inside its belly (which, by the way, would have been merely a modern resurrection of the famous Mechanical Turk hoax from the 18th century). It is unlikely that even the most naive of individuals would have considered The Turk an example of artificial intelligence. For The Turk is merely an example of old-fashioned human intelligence.

That being said, this idea of hidden human intelligence connects in an interesting way to the quotation in this post, which suggests that when Kasparov played against Deep Blue, he was actually playing against “the ghosts of grandmasters past.” If so, then although there were no literal grandmasters hiding in the belly of the machine, years of powerful over-the-board intellect managed, in some form or another, to find its way into Deep Blue’s physical and/or virtual innards.

I offer the trivial example in order to initiate a more complex discussion later and because it demonstrates clearly that stupefication of a task is not a sufficient condition for the creation of artificial intelligence.

Artificial intelligence is not always aptly defined, and I’m not going to enter the ongoing debate about what constitutes artificial intelligence. I will merely suggest that the trivial example above is not it.

For one thing, the machinery of The Turk was no different from a rock, in that it obeyed purely physical laws and moved only when the grandmaster hiding inside manipulated it with his hands and feet. The Turk made no decisions. I will tentatively suggest here that in order to create an artificially intelligent system, the process of stupefication must result in 1) a new task, which 2) can be performed by a different entity and which 3) involves some level of decision making.

I mention this last qualification because I could easily make a machine that “writes novels” by programming it to print out novels that I had written. This would be The Turk all over again. And few people would be impressed.

I will address the idea of decision making later. To that end, consider the following tasks.

A Less Trivial Task: Creating a novel that has never been written before.

An Even Less Trivial Task: Stupefying the previous task.