A Theory of Mechanized Thought
A Blueprint for Building a Creating Subject
The creating subject (the English by van Dalin) it was called the
'creative subject' by Kreisel') was imagined (by Brouwer) to be a thinking
entity that (over time) would come to learn things about its world. I have
chosen this as a metaphor for our goal. Our thinker is not as smart as the
original. It can forget and might have to retreat from 'mistakes'. But the
reason that this idea is relavent is that it considers the dynamic aspect
of reasoning and recognizes that the semantics of such a system is, of
necesity, different than the 'classical' understanding of reasoning.
For us, reasoning is not intuitionistic. Below we discuss finite structures,
called contexts, which can be used to reason and talk about both classical and
intuitionistic theories. It answers the question of how intuitionists can
understand enought classical reasoning to say they believe its 'wrong'. Any
thinker worth their salt would have to be able to understand and discuss this
difference.
Introduction
One of the origins of this work was a remark of John McCarthy (I don't know
if has been published but it has certainly been articulated to me and others
many times). To paraphase: "You can't expect a computer system to find a proof
if you can't actually tell it the proof explicitly". This thinking led to the
development of the FOL computer program in the early 1970s (ref). As a newly
minted PhD in logic (from the mathematics department under the direction of
Soloman Feferman at Stanford), I thought it would be an easy task to write a
computer system that could 1) repeat the work of Russell and Whitehead in
Principa Mathematica using modern notation and 2) add 'modern'
formalisms for set theory and mathematics and have the system that would be
able to accept and print out proofs of undergraduate and graduate math
homework problems that were good enough to be handed in and graded. Was I
mistaken (details in [MATH-HOMEWORK] !!!!!
The present work aims at reformulating the ideas of logic, recursion theory
and model theory in a way that can be used as a blueprint for building an
independent/autonomous artifact—a physical system—that has an
understanding of its world; can come to know new things; and can influence the
world around it. Furthermore, its understanding of the world, of nesessity,
must be good enough so that it can survive over time (perhaps a very long time
(maybe forever?)) and can be a sucesseful participant in society.
We start with the problem of building the mind/head of our artifact. We
understand full well that our artifact will have a boundry and both sensors
and effectors, but we beg (to start with) the question of how this 'head' is
connected to such devices and instead consentrate on what John McCrthy called
the 'epistemolical adaquacy' of our ideas.
Among the goals of This work is:
To describe how to build such an intelligent artifact
and
To give a view of sets, computation and theories that can act as a blueprint for this project.
We start this adventure by listing some properties our head/reasoner needs to
have:
Seeds of Hope
What does it mean to provide a blueprint for building a Creating
Subject? It is here that we driven to develop a shift in the way we look at
the notion of theory and how it is xsused.
We start this adventure by listing some properties our thinker needs to
have:
1) Heads and computer systems are finite structures. The
consequences of this observation cannot be overestimated. It gets directly
to the heart of the philosophical discussions of nominalism, finitism,
intuitionism, ... and the pragmatic discussions of system building. If a
reasoner is finite then any 'theory' whose model theory requires an infinite
collection of things to be present simply cannot be used as a blueprint for
building it. Every AI researcher laughs when I say this because they all know
that infinite collections of things cannot be put into a computer (and in
practice they don't even consider it), but the reality is that AI researchers
frequently try to explain what they are doing by using the vocabulary of
logic and by implication its infinitary notions, ignoring the fact that
logical ideas like deductive closure, function, the integers, ... all
explicitly involve infinite collections. We need to rethink this.
The systems AI researchers build and the descriptions they give of how
these systems work generally don't match. The attempt to use the vocabulary
and notions of logic is not satisfying and has frequently been a mistake. In
fairness there is little else in the literature for them to use. This
disconnect has resulted in many AI researchers disparaging logic and
proposing to reformulate the workings of reasoning using generally far
less thought out solutions. I believe The analyses logicians have made of
reasoning is, in many ways, correct but their use of formalism has been too
narrow and needs rethinking in light of what we know about real physical
computers. One of the goals of this work to explain and eliminate these
differences.
2) Mental Images Our thinker must be able to form 'mental images'
or 'mental models' of things. These images must be realizable as finite
data structures. Our proposal introduces a notion of 'type' and use them to
represent mental models. Our thinker will use these models for thinking
about (and reasoned about) things using other mental models. We call the
mental models of the things used in reasoning contexts.
When a speaker points and says "Let me introduce you to Fred." ("Meet
Fred") they are clearly making a statement about how the speaker proposes to
use language later in the conversation. The 'meaning' of this interaction is
that the speaker wants the listener to know that the word 'Fred' should be
used to refer to whatever 'mental model' the listener has of the thing the
speaker is pointing at. This is always the case when one person introduces
something new to someone else. In building our 'creating subject', this kind
of interaction has to be accomodated because the connection between words
and the 'mental images' we conjure up when we hear them is a fundamental
part of communication. One of the results of this work is to describe a
collection of data structures (types) and show how they can serve as our
thinker's mental models of things.
This seems like a good place to mention the note
[WHY-LISP]
which describes the idea of interning and its importance to
communication (and natural language processing.)
3) Metatheory. The conversation described in the previous section
is metamathematical in nature. The sentence quoted is about how we should
use language. Logicians are just beginning to take seriously the
`formalization of metatheory'. Kleene explicitally states that IM is an
informal mathematical discussion of formalization. (p.
65). It is my belief that a large percentage (40%) of natural language
is about how the words used in future dialog should be understood. This
implies that understanding conversation uses metatheoretical ideas in a
fundamental way. Any implicit assumption that the sentences of discourse
mainly relay 'facts' is not well grounded.
In addition, more straitforwardly, we use metatheory (eg to describe new
subsidiary deduction rules) all the time in our discussions of theories.
For the more logically minded how exactly do we seamlessly switch between
elementary and non-elementry theories of things. Certainly anything that
can discuss mathematics must be able to do this, but what principle is
being used here and in what place in out 'thinker' is it found. For AI
researchers involved in natural language processing this issue appears all
the time. (we discuss this in detail below)
While we are on the subject of metatheory, I think we also need to address
the question of meta-metatheory, meta-meta-metatheory, ... and how we can put
this 'infinite' tower of theories in a head. In the original FOL computer
system we showed how this 'hierarchy' was finitized and gave examples of it
use. (Prolegamina (the preprint) 'self reflection' Section 10 p 25.)
4) Reasoning. Reasonoing must be carried out using finite structures.
We propose structures (mental images) of a type we call contexts and we
develop logic in a way that is both consistent with ordinary first order
semantics and describle as operations on finite structures. Satisfaction can
be defined for contexts and we propose taking operations that are satisfaction
preserving (but not nesceearily validity proserving) as fundemental. Ordinary
validity preserving rules of inference the become special cases, but forgeting
is an acceptable operation.
5) Contextualization. Our thinker must be able to isolate the
context in which any reasoning takes place. The idea of context was used in
1974 in the computer program developed at the Stanford AI lab called FOL
(ref) which formalized the notion of a rule of inference as a consistency,
i.e., satisfaction, preserving map on contexts.
6) Contradictions. We believe that all reasoning takes place in some
context. There is no global mutually accepted collection of facts. Locally the
facts of a context are imagined to be consistent. We believe that if a context
implies a contradiction, i.e., 'False' is a consequence of the facts of the
context, then either we can look at the justifications of the assumbtions used
in the derivation of 'False' and determine which to 'blamed' (in the usual
Natural deduction style) or just declare the context is unusable for problem
solving. In this way contradictions never impede our ability to do problem
solving. The development of 'papa-consistent' logic is yet another instance of
attempting to imagine that all 'facts' are in one big (happy?) theory (the
swamp revisited).
The idea of context was later discussed by McCarthy (ref) where
7) Reasoning over Time. Human understanders (the only real examples
of understanders we have) reason over time. They forget things and make new
discoveries. Any adaquate description of reasoning must be able to account for
this behavior. The currently fashional ideas of logic (e.g., using the idea
that theories are deductively closed, for example) do not account for this.
8) Interruptability without Death. We are able to recover from wide
variety of interuptions but neither computers nor functions are very good at
it. In Peano arithmetic the addition and multiplication functions are not
computed. They are simply lookup tables!!!! The effort to multiply two numbers
does not depend on their size. From this point of view the arithmetic we
learned in elementary school (as well as what even the most hardened
mathematicians use in their every day life) cannot be Peano arithmetic or we
could all multiply arbitrarily large integers in our heads. So what did we
learn when we learned 'arithmetic'? Functions and prograns which claim to
'compute' sums are not prima facie interruptable. In case of functions
interuption doesn't even seem relevant notion and in the usual computation
theories of CS programs are imagined to compute 'functions' and so run to
completion.
Our thinker has to be more grounded. It must be able to realize that
computing the Ackermann function for the argument 300 will take longer than
it has and even if it has started to 'compute' its value it has to have a
mechanism to abort the attempt. The question I ask is by what machinery might
our thinker notice that it wasn't going to finish in the 'available' time? It
must be able to use what I called "the gossip principle" which means, that no
matter what it's doing, a system needs to be able to interupt itself and ask
questions about what it has been doing lately, i.e., gossip about itself, and
change its behavior on the basis of what it notices. Of course, the hard part
is that (by the gossip principle itself) this 'gossiping' must also be
interuptable!!!! I believe a theory that would allow you to formalize a
general version of this principle is not straight forward and implementing it
is problematic. The proposal of this work includes a theory of computation
systems that includes a form of interuption.
9) Information theoretic vs set theoretic semantics We propose to
base our semantics on information content rather then membership. Instead of
subset we use the notion of subtype. One thing is a subtype of another just in
case it provides more information. This type of thinking occurs in
mathematical algebra (not high school algebra). In our theory 3D space is a
subtype of 2D space because it contains more details, i.e., information about
its third dimension.
10) static vs before/after semantics Formalizing behaviors, actions and events.
11) perception
So, having made our list, the question is what to do?
Reasoning Questions
1) Distinguishing between fact and consequence
2) Mixing theories The usual proof of Wilson's theorem.
3) Elementary ve non-elementary theories Axions of group theory vs
theorems about normal subgroups - what principle was used to 'switch' and in
what reasong environment was this principle applied.
4) Classical Rreasoning We believe that basic reasoning is
5) Contextualized reasoning We intend to show that contexts can be used to formalize non-monotonic reasoning, default logic, modal logic, cross world indiviuation problens, ...
Representation Questions
1) representing one's own structure or meta theory fails.
1a) implementation issues
1a-1) representations ie colored quotes
2) what can I see
Other questions
1) What is 'natural language'
Philosophy issuse
1) classical reasoning carried out in a finite contextualized
reasoning environment.
2) de re/de dicto - there is ni de re its all de dicto.
3) foundations vs providing a model The very idea of foundations
hides the fact that most of the time this means reductionist foundations.
That is, we strive to describe a small collection of notions (a foundation)
and then proceed to 'build' everything from there. So, for example, the
usual formulations of first order senantics omit the use of 'if the else'
for two reasons: 1) it isn't a binary propositional function and 2) it
requires the definition of satisfaction and valitity to involve a mutual
recursive between the 'value' of a term and the 'truth' value of a formula.
On the other hand every computer programming language has some form of
conditional and no one bats an eye. Logicians frequently simplify the
'complexity' of something by 'reducing' it to previously understood notions.
This reduction actually affects our task in some bothersome ways.
Logicians renounce the use of complex data structures and reduce thinking
to PRA because 'any other reasonable data type can be encoded using Godel
numbering' and so there is no 'need' for anything but numbers. This also
is reflected in the use of set theory where it has become 'obvious' that
(at lesst much) of mathematics including integers can be encoded into a
set theory built entirerly out of the empty set (and some 'large' cardinals
eg omega). Of course there are set theories with urelements but as a
statistical matter their use is religated to special problems.
For us this way of looking at foundations is an interesting collection
of observations but misses the point. Things for us (and our creating
subject) had better include (at least mental images of) numbers, trees,
people, ... So we have adopted the following:
1) we need mental images of things (we will build these out of types)
that we assign an information theoretic semantics. A type deliniates
a collection of objects call examples. In general these collections
are 'infinite'. They are distinguished by the existence of a recognizer
for each type - that is, for any object the epistemic question we can ask
is 'is this object of type T' We build recognizers as programs in a specified
computation system.
Note that this discription mentions both computations and criterioa
(propositions) as part of the 'meaning' of a type. This is an example
of our use of mutual recursion to define our basic notions,
A simpler version of this idea of type was first articulated by Luca
Cardelli when he was thinking about designing programming languages.
The recognizer semantics of types can be understood is a realization of
the ideas in Skolem 1928 where we use a computation system to realize
the semantics. The logic of FOL contexts is many sorted with a partial
order on sorts that mimics the notion of subtype for types (which in tern
are used to define the semantics of FOL contexts). The use of partialy
ordered many sorted first order logic was developed in the original
implementation of FOL contexts and was used for (among other things)
formalizing GBN, which specifies 'element of' in a style characterized
by CS as polymorphic. The canonical description of many sorted first
order logic (Wang 19??) insists that different sorts are disjoint. (Which
clearly requires 'encoding' to formalize GBN.) (see [doing Math homework])
============================================================================
Artifacts exist in time and space and they change over time.
Theories do not provide an answer to these questions!!!!
Our goal is not to understand how correct reasonong can happen but rather to buikd a reasoner.
The question then becomes in what sense a theory might constitute a blueprint for building
something that thinks.
What do we know now
What we learn from computers is that we can actually build physical structures
in machines (computers)
What's different now that we have computers
'I see no wffs here' ==>
'I see no wffs here BUT I can make one'
Actual Objects
bool
integer
rational
char
string
symbol
pair
list
sexp
:true :false :idb
-10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10
make-rational(int,int)
rational-get-deno,eter
rational-get-numerater
#\1 ...
""
make-pair(a,b)
pair-get-1st(p)
pair-get-2nd(p)
What We Mean by a Head
What are Things - Types
How do we 'do' Anything - Restartable Computations
How do we Representate Theories - Contexts
Heads
Why do we do Anything
Boundries
Physics
2) New World
a) TYPEsystem (replaces set theory)
Set Theory - Tsys
element set class - type
example - object - example
subset - subtype (computaable)
epsilon - exampleOf (computable)
comprehension - recognizer
new notions:
type-name
type-def
refinement
scaler
isin
ex
b) COMPsystem (replaces recursion theory)
Recursion Theory - Csys
domain - type
function - program for restartable computation
S/T predicate - stepper
new notions:
computation (as data)
continuation
c) FOLsystems (replaces theory and deduction)
Logic - FOLsystem
theory - context
language - language signature
sort - type-name
domain - type
model - simulation structure
interpretation - IDK logic
validity - satisfaction
proof - fact
rules of inference - satisfaction preserving maps
new notions:
IDK logic
d) ReasoningSystem = < a , b , c > appapp
3) Plan
a) 2.a-2.d) descriptive papers
b) formal docs
c) realization/implematation
This is weak idea HERE we want to actuallY build a reasoning structure!!!
|