The biological evolution of information processing in the evolution of language & learning, and my search for the missing link between body and mind

Section 1: Asking the right question about the basis for language comprehension and learning

Rainer von Königslöw, Ph.D.

Abstract

I speculate that there is a missing link, something that connects physical activity to mental activity. Furthermore, I speculate that this missing link is related to the biological evolution of language and learning. I investigate questions that are usually addressed in the field of neuroscience with empirical investigations. I propose a paradigm that investigates these questions from the perspective of information processing, and thus also fits into the field of artificial intelligence. I propose a design that uses an ‘inner language’ to control action sequences and to integrate visual perception into action. I investigate how this ‘inner language’ facilitates and enhances learning. The investigation demonstrates and validates the feasibility and benefits of the ‘inner language’ design with working prototypes.

TOC

Chapter 1: Discovering the right question to investigate
Chapter 2: Experimentation based on social-engineering
Chapter 3: Thought and action: the mental and the physical
Chapter 4: Feelings and action
Chapter 5: Body and mind, the missing link
Chapter 6: The research paradigm – experimentation & validation
Chapter 7: The research paradigm – modeling information content, flow, and processing
Chapter 8: Notes & comments – status & future plans

Expanded TOC

Chapter 1: Discovering the right question to investigate
    Topic 1: complexity versus chaos: evolution
    Topic 2: evolution and learning
    Topic 3: Learning and forgetting is like evolution and the second law of thermodynamics
    Topic 4: structures -- complexity and information content
    Topic 5: information content for innate functions and instincts
    Topic 6: complexity and information content
    Topic 7: investigating where theories come from
    Topic 8: measuring the information content of theories
    Topic 9: exploring computer-simulated language comprehension
    Topic 10: separating the thesis contribution from programming implementation
    Topic 11: verification, falsification, and the Turing test
    Topic 12: wrong question: search for a universal theory of language comprehension
Chapter 2: Experimentation based on social-engineering
    Topic 2: learning from instructions, copying, and experimentation
    Topic 3: knowledge engineering, expert systems, computer-aided work
    Topic 4: knowledge, language, and tasks – metaphors for successful applications
    Topic 5: knowledge engineering vs. task re-engineering
    Topic 6: Language learning and task learning
Chapter 3: Thought and action: the mental and the physical
    Topic 1: language comprehension as a mental activity
    Topic 2: reactive and predictive learning
Chapter 4: Feelings and action
    Topic 1: feelings as motivator and selector of actions
Chapter 5: Body and mind, the missing link
    Topic 1: my dream about finding the ‘missing link’
    Topic 2: the ‘missing link’ – connecting mental controls to observable physical action
    Topic 4: the mind - growing up in a submarine
Chapter 6: The research paradigm – experimentation & validation
    Topic 2: minimalist feasibility
Chapter 7: The research paradigm – modeling information content, flow, and processing
    Topic 1: complexity and information content
    Topic 2: layering and information content
    Topic 3: layering, compression, and information storage capacity
    Topic 4: layering and information flow
    Topic 5: layering and information processing
    Topic 6: learning and the evolution of layering
Chapter 8: Notes & comments – status & future plans
    Topic 1: present work and future plans

Let me introduce myself. I am a generalist rather than a specialist. I have degrees in Social Psychology, Computer Science, and Mathematical Physics. I submitted an M.A. thesis in Sociology and Anthropology, and have done work in Measurement Theory. I had an office in Linguistics, where my most active thesis supervisor hung out. I worked in aerospace, automotive, chemical, and mining. I developed technology for banks, Bingo halls, and Beauty Pageants. I have done things in legal, medical, and food. I think I have been involved in most sectors of society, mostly at the research end and with the introduction of new, computer-based technology. All of this is a long-winded apology for drawing analogies from all over. Please bear with me as I zig-zag through a variety of topics. I hope I can pull it all together and make sense for you at the end.

I have been around computers for a long time. I got to help Arthur Burks fire up a chunk of the ENIAC at

A while ago, when I was teaching a 4th year / graduate course on measurement theory, a student raised his hand and said “There is one thing that is still clear.” The whole class broke up with laughter, but it has stuck with me ever since. I have tried my hand at an informal style, with editorial help from my daughter. I hope it is clear, but I would appreciate any comments on what needs further clarification.

Even as a kid I noticed that my room always ended up in a mess, and that my model airplanes and other constructions usually broke. As a second year student in Physics, I learned about the second law of thermodynamics, that everything is sinking into chaos over time. That made a lot of sense.

I learned only a little about evolution, where life gets more and more complex over time. The principle made some sense. There is trial and error. The successful life forms get selected and copied over and over again. The theory supports the spontaneous development of more complex living systems.

There is a missing part, the evolution of complex atoms and molecules in physics and chemistry. The big bang theory is not easy to understand. Anytime I made a big bang by mixing chemicals it only broke things and made a big mess, as my sisters can attest.

The question about the role of evolution versus the second law of thermodynamics has stayed with me ever since. I never did make much sense of it, but it led to great late night discussions with just a drink or two.

I see evolution as nature’s learning to adapt species to the environment through continuous and automatic experimentation over a long span of time. There is some randomization (mutations etc.) and then there is selection that favours the more successful variants. From an engineering perspective I see this as a very complex search or optimization process, where all the species are searching for better, more satisficing variants. Each search proceeds on its own, yet all of them are interdependent. The third part is copying, where the proportion of more optimal variants increases over time.

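The three parts named above (randomization, selection, copying) can be sketched as a tiny search loop. This is only an illustration under stated assumptions, not a model of biological evolution: the function `evolve`, the one-number "variant", the toy `fitness` function, and all parameter values are hypothetical choices made for the example.

```python
import random

def evolve(fitness, population, generations=50, mutation_scale=0.1, seed=0):
    """Toy search loop: randomization, then selection, then copying.

    Assumes an even-sized population of plain numbers (hypothetical 'variants').
    """
    rng = random.Random(seed)
    for _ in range(generations):
        # Randomization: every variant produces one slightly mutated offspring.
        offspring = [x + rng.gauss(0.0, mutation_scale) for x in population]
        # Selection: rank parents and offspring, keep the more successful ones.
        pool = sorted(population + offspring, key=fitness, reverse=True)
        survivors = pool[: len(population) // 2]
        # Copying: survivors are duplicated, so better variants gain in proportion.
        population = survivors + survivors
    return population

def fitness(x):
    # Hypothetical environment: variants closer to 3.0 are more successful.
    return -(x - 3.0) ** 2

start = [0.0] * 10
final = evolve(fitness, start)
print(sum(fitness(x) for x in final) / len(final))
```

Because parents stay in the selection pool, the average fitness of the population cannot get worse from one generation to the next in this sketch; real populations offer no such guarantee.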
I see a parallel process for individuals within a species where learning helps to adapt the individual to the environment. Automatically generated experimentation reappears in learning how to crawl, walk, or fly. I see this learning as making the individual more complex, from newborn to adult. In some species this learned complexity can be passed from one individual to another, typically from an older to a younger, in a kind of apprenticeship training.

I see the tension between learning and forgetting as a parallel to the tension between evolution and the second law of thermodynamics. Learning increases complexity and the ability of individuals to adapt to the environment. Forgetting increases uncertainty and chaos. Forgetting seems to increase with time and aging.

It would be nice to have a measure of complexity so that we can compare species and individuals. For species we have an approximate measure through the DNA. I would compare this to an architect’s blueprint for a building. A simple building only needs a few pages with relatively few lines. A large and complex building needs many drawings with many lines. The builder can read these drawings and build the building. I can describe these pages and lines, and I can enter this description in a computer. On the computer the description would be stored as a long string of ones and zeros. I can then count how many ones and zeros, or bits, it takes for the complete description. The more bits, the longer is the description. The longer the description, the more complex is the object described.

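The counting step described above can be sketched in a few lines. This is a minimal illustration, assuming the description is plain text stored as UTF-8; the function name `description_bits` and the two example descriptions are made up for the example.

```python
def description_bits(description: str) -> int:
    """Count the ones and zeros needed to store the description as UTF-8 text."""
    return len(description.encode("utf-8")) * 8

# Hypothetical blueprints: the complex one simply needs a longer description.
simple_building = "one room, four walls, one door"
complex_building = simple_building + "; three floors, twelve rooms, two staircases, one elevator"

print(description_bits(simple_building))
print(description_bits(complex_building))
```

The raw bit count depends on how the description is written down, so it is at best a rough upper bound on complexity; a wordy description of a simple building would still score high.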
The DNA acts like the architect’s blueprint. It is sufficient to ‘construct’ the individual. I have been asking around for estimates of the information content of the DNA for different species, but I have not yet gotten a very satisfactory reply. I have seen estimates of 20,000 to 30,000 genes for human DNA, but then I would like to know how much information is in a gene. How many different things can it specify? I read that most of this information is structural such as for building proteins.

Is there functional information such as ‘walking in a straight line’, or ‘recognizing birds’, and is it stored in genes? We have the concept of innate behaviours such as instincts. I think that we should find out where and how that information is stored in the DNA. As a second approach, and as a way of triangulating, we should find out how much ‘spare’ information capacity is left, after the information for the construction is subtracted from the total information capacity of the DNA.

Learning increases the capabilities of an individual. Learning presumably reduces the randomness of behaviour. Learning also differentiates individuals, as they learn different skills. Some individuals may learn more than others. This holds true for animals as well as humans. There should therefore be an information measure for the added knowledge or capabilities of an individual due to learning.

This concept of information content due to learning only came to me quite gradually. All non-random behaviour presumably is either innate or learned. The total information content associated with the non-random behaviour under control of the individual therefore must be the sum of the information content conveyed by innate behaviours encoded in the DNA and the information content added through learning.

In about 1963, while studying theoretical physics at the
I decided to do a master’s in Sociology and Anthropology at UBC. I started to study religious groups that have theories that are clearly counterfactual. An example is a theory that the end of the world will happen on a certain date, but the date comes and goes without any cataclysmic event. Talking in tongues is another example. The people in these groups were nice and the conversations were interesting, but it was very hard to understand what was going on. There seemed to be a lot of persuasion and mutual support. To follow these concepts, I turned to experiments with small groups in laboratory settings.

As a recent convert from theory in math and physics, I wanted to generate formal, falsifiable theories rather than just developing personal understanding written up in a nice story. This secondary endeavor became the main focus for my thesis, building computer simulation models of small group phenomena, and working out formal measures of the information content in such theories and models. I presented this work at the national sociology convention in

In the formal part of the thesis, I valiantly tried to develop a measure of how much information a given theory would contribute over the null hypothesis. The measure was based on formal logic and proposed that a correct theory would reduce the randomness of the null hypothesis and thus add a certain number of bits of information. Today I find my efforts of more than 40 years ago amusing and somewhat silly (my first effort at using formal logic). But the question of how many bits of information are added by new knowledge is still relevant.

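One way to make "bits added over the null hypothesis" concrete is to compare Shannon entropies. To be clear, this is a stand-in and not the formal-logic measure from the thesis, which is not reproduced here; the function names and the four-outcome example are assumptions made for illustration.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def information_added(null_distribution, theory_distribution):
    """Bits by which the theory reduces the randomness of the null hypothesis."""
    return entropy_bits(null_distribution) - entropy_bits(theory_distribution)

# Hypothetical experiment with four possible outcomes.
null = [0.25, 0.25, 0.25, 0.25]   # null hypothesis: all outcomes equally likely
theory = [0.5, 0.5, 0.0, 0.0]     # the theory rules out two of the outcomes

print(information_added(null, theory))  # 1.0
```

Ruling out half of the equally likely outcomes adds exactly one bit. Note that the sketch only measures how much a theory narrows things down; whether the theory is correct has to be established separately.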
I did learn that theories, and a lot of learning, are expressed in language, at least for humans.

Since most theories are cast in language, I decided to develop a computer simulation of natural language to find a way to further formalize this approach. I was convinced that mathematical models in theories are just extensions of ideas expressed in language. I was fortunate to convince the
So I had a lot of latitude and a lot of cliffs to fall off.

I wanted to combine understanding instructions with understanding descriptions and questions, and with understanding evaluations. I built an interactive system based on LISP, a computational grammar, and logic that can do simple tasks specified with simple English sentences (commands and questions). The system interprets the sentences and automatically writes and executes a program in LISP to do what it is requested to do by the sentences. It also interprets descriptive sentences so it can remember the content and answer questions about that content.

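The general idea of turning a sentence into a small program and then running it can be illustrated with a toy. This is emphatically not the thesis system: it is written in Python rather than LISP, and the three sentence patterns, the function `compile_sentence`, and the dictionary used as memory are all hypothetical simplifications.

```python
def compile_sentence(sentence):
    """Translate one sentence into a small program (a function over a memory dict)."""
    words = sentence.strip().rstrip(".?").lower().split()
    if len(words) == 3 and words[1] == "is":
        # Descriptive sentence, e.g. "Anna is tall.": remember the content.
        subject, _, value = words
        def program(memory):
            memory[subject] = value
            return "ok"
    elif len(words) == 3 and words[0] == "is":
        # Question, e.g. "Is Anna tall?": answer from what was remembered.
        _, subject, value = words
        def program(memory):
            if subject not in memory:
                return "unknown"
            return "yes" if memory[subject] == value else "no"
    elif len(words) == 2 and words[0] == "forget":
        # Command, e.g. "Forget Anna.": carry out the requested action.
        subject = words[1]
        def program(memory):
            memory.pop(subject, None)
            return "ok"
    else:
        raise ValueError(f"cannot interpret: {sentence!r}")
    return program

memory = {}
answers = [compile_sentence(s)(memory)
           for s in ["Anna is tall.", "Is Anna tall?", "Forget Anna.", "Is Anna tall?"]]
print(answers)  # ['ok', 'yes', 'ok', 'unknown']
```

Separating "write the program" from "execute the program" mirrors the two steps named in the paragraph above; everything that made the original hard (grammar, pronouns, negation) is left out.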
One of my thesis advisors had a very good question: “what is the main contribution of the thesis?” At the most abstract level, the question is easy to answer: “the thesis shows how computers can demonstrate language comprehension by doing tasks based on language-based instructions and descriptions, where the same task can be done by human subjects using the same instructions and descriptions, and where the output of the task is similar.”

At a less abstract, more detailed level, the question is somewhat harder to answer. The program underlying the thesis uses a computational grammar and some logic to interpret written instructions into a high-level program. The high-level program that has been constructed in turn uses grammar and logic to interpret written descriptions. The program (and the written instructions) can also deal with additional tasks such as answering questions based on the written material.

At a yet more detailed level, the question is much harder to answer. The program ‘understands’ and responds to commands (instructions), assertions (descriptions), and questions. The program ‘understands’ references to people, tasks, etc. The references can be direct, descriptive, or indexical. Descriptive components can include comparisons and negations. This list can go on, and describe the capabilities and limitations of the program underlying the thesis. But it is less clear how and why this list answers the question.

At the most detailed level, my program has almost ten thousand lines of code, with thousands of complex data representations. The program modifies itself and adds code while interpreting instructions and commands. Despite using formal logic etc., the program is almost incomprehensible. There are no clear boundaries between substantive theory and heuristic computational implementation. It worked reasonably well for modeling the target behaviour and for generalizing to other, similar behaviour. This type of LISP and self-modifying programming has been almost completely abandoned, except for some artificial intelligence projects, because it is too difficult to get bug-free and to maintain.

Being a thesis in social psychology, it had to be tested, which means experimentation and statistics. I had to come up with an empirical method. Fortunately, the British mathematician Turing had come up with the concept that programs like mine could be tested by giving the same information to the computer model as to humans. If an observer could not tell by the responses whether they were generated by a human or by a computer, then the computer model passed the test.

I therefore simulated the language capabilities of a student participating in a typical experiment in social psychology. The student and the computer get written instructions on what to do in the task, then a description to understand, and then an evaluation component. In the evaluation the student and the computer make choices on various evaluative scales (dimensions), where the choices are based on the description. The computer model got the same written instructions, also expressed in grammatical English sentences but slightly simplified from the typical experimental instructions and descriptions.

These were early days in computing. Most people were still using punched cards. I was lucky to get a CRT terminal and another terminal that could print on thermal paper. However, the input and the output would not have fooled anybody. My program could do the task, barely, but it would not look natural. It could do different tasks, and produce different evaluations. But, give it exactly the same task twice and it would do exactly the same thing. So it would pass the Turing test only if the observer was watching from a great distance.

The model does support language comprehension, as long as the sentences do not get too complex or involve ambiguities. The model also supports some learning, but is restricted to language-based learning from explicit instructions. So it fits some school instruction models but cannot show learning by trial and error, or learning by mimicking others (apprenticeship learning).

Social psychologists would have preferred real experimentation, with experimental groups, treatments, and formal measurements as well as statistical comparisons. However, I barely managed to get all the programming done to allow general language comprehension, including figuring out who or what pronouns refer to in context, and to sort out what is meant when the sentence includes negation. I never did figure out good experimentation that would be appropriate, and random selection of subjects into groups just didn’t fit.

Underlying the thesis research was the assumption that there is a universal method for understanding and working with language. Alternatively, there might be a single individual somewhere who understands language just like my computer program, while all other individuals do it differently, in their own way. Subsequent research has mostly followed the universality assumption, but I no longer believe it.

One of the most powerful tests of theories in physics is based on their application in engineering practice. The practical theories survive. Following this idea, I thought of another approach to verification and validity -- testing my language comprehension theory and model in social-engineering applications.

I experimented with using my thesis computer model for undergraduate instruction. I was lucky to get some funding for the educational application of computers. Most such research focused on teaching and testing, i.e., on presenting material and testing recognition and recall. I proposed having the students learn by exploring and using trial and error. One such experiment used the computer system to teach about scientific theories and experimentation at a very basic level. A class of undergrads was split into groups. Each group made up its own theory of the world, and entered it through commands and descriptions into their version of the computer model. The group then handed over its version to another group, and that group had to discover the theories by asking questions of the model, and running very simple experiments. It was interesting how difficult students found it to discover the correct set of theories, and how illuminating they found it for understanding research.

From my thesis work there were two directions to go. One was to increase the sophistication of the language comprehension. An example of such a challenge is the disambiguation of indexical references. Another is to deal with the ambiguous scope of negation. An example illustrating both is a sentence like “He is not stupid about this” toward the end of a conversation.

The other direction was to further study the utilization of language in the world (an engineering validation of language-based models). I chose to investigate learning with and through the utilization of language. I developed an interactive computer system to simulate a research assistant. Interacting with the computer system is like role playing. The student is the senior researcher. TARA (The Automated Research Assistant) is his or her research assistant who actually runs the experiment.

The simulations are written by graduate students for their specific research topics. I also included some more simulations from the Michigan Experiment Simulation System, a project I had been involved with in my graduate years.

I explored three kinds of learning with this system. The first one is learning from explicit instruction, where the students are told explicitly and precisely what to enter into the computer to run a pre-selected experiment.

The second approach is learning by imitation. My approach focused on moving students from lecture-based learning (language-based learning) to an apprenticeship-type learning, where the students copy tutors and each other. The class is told what kind of problem to solve or explore, and the students talk to the tutor and to each other and copy the approaches used by fellow students that seem to work.

The third type of learning is learning by trial and error. The students generate experiments they can do with the system. The students design, run, and analyze experiments on their own within the capabilities of

The system (also known as QUEST – Queens University Experiment Simulation for Tutorials) was used by about 1000 students over a number of years at Queen’s University. It demonstrated that learning the language of experimentation and doing virtual experiments with language and numeric data can nicely complement hands-on experiments in labs. This system was considered successful in complementing the usual approach of lecturing and of teaching laboratory techniques.

I was at a high-level meeting with a dean and senior academics, including department heads, when a very senior colleague of mine compared computers to flush toilets, and did not think this kind of work fit into academia. ‘Ouch!’ Needless to say my work, while deemed successful, did not get me the required academic credit (1974-1980, before Cognitive Science) to lead to promotion. In all fairness, I spent too much time in developing the core system, developing interactive capabilities when the campus was still using punched cards, developing a typesetting system for maintaining the documentation, and developing a data analysis and graphing system for helping students analyze and interpret the data.

Fortunately for me, industry was interested in my skills and experience, and there were many opportunities to develop new, computer-based technologies. I wanted to explore knowledge engineering across most, if not all, social and industrial sectors while making a living and raising a family. Fortunately for me, expert systems, process control, and electronic books were popular and allowed me to explore many types of applications while introducing new technology through designing and building working prototypes. I developed working prototypes of systems for process control, electronic books, quality control, simulations, inventory control, and communications. The technology was and is being used for automating operations in aerospace, automotive, chemicals, oil, mining, banks, bingo halls, beauty pageants, legal firms, publishing, medical, and food import and distribution.

Changing manual labour to computer-assisted labour, especially in white collar jobs and professions, involves a culture change. The people interacting with the computer have to be comfortable with the change in their tasks. At the most trivial level this is captured by the question: Are you working for the computer or is the computer working for you? At another level this is captured by the question: Is the computer behaving like a person you know and understand?

Developing computer applications involves redesigning tasks, which in turn involves restructuring the metaphors underlying the work. These critical metaphors are language based and link into beliefs (and myths) about who can do what, and how it works. Successful computer applications have to tie into these metaphors, and might change them gradually over time.

The computers have to ‘talk’ to people in terms they understand, in sequences that are a natural part of their tasks as they understand them. Computers have to be seen as supporting the people in their tasks (rather than people supporting computers in doing their tasks), i.e. we have to anthropomorphize computers. In dividing tasks and responsibilities between people and computers, both have to be able to do what is requested of them, usually because the new task is an extension or variation of a task the people have done in the past.

During this 20-30 year process I became convinced that it was unlikely that there was a single semantic space, with a single basis of knowledge and understanding. I became convinced that there are islands of shared tasks. Rather than looking at descriptive knowledge, it is better to look at the ability and competence to do tasks. Many of these tasks will be interdependent with other tasks and require communication to coordinate tasks by synchronizing actions in both space and time. Much of the work I had done with expert systems and computer-based semi-automation was to help people coordinate tasks where it was difficult to synchronize in time and space, but where tasks had to be performed in different places and at different times. Language communication plays a major role in this remote and asynchronous coordination.

I became convinced that language learning was a component of learning how to do tasks, rather than a separate form of learning. This fits in well with evolution, where successful actions matter more than communication per se. From this perspective, learning to use language is learning the appropriate speech acts or writing acts. One of my students exclaimed “I don’t have time to think, I’ve got to study”. What I have taken out of that is that his goal was to do well on an upcoming examination, rather than just developing a deep understanding. This got me to thinking that the evolution of language as a biological capability might be important for understanding language comprehension.

I have always thought of language comprehension and use as a mental activity, like thought and consciousness. Action, such as walking, is a physical activity. Talking is a physical activity, moving the jaw etc., that presumably is linked to mental activity in some fashion. This link is not well understood, and has been hotly debated, starting with Greek philosophers.

Learning can be seen as improving the success of actions. Reactive learning is based on past experience. Predictive learning is based on thinking and problem solving. One can imagine reactive learning without mental activity, but it is hard to imagine predictive learning without mental activity. Reactive learning is based on trial and error and has some inherent limitations. For some actions, errors can be fatal, which is fine for evolution but tough for the individual involved. It is also a very slow form of learning since it presumably involves multiple generations. Quick learning, such as single trial learning, would seem ideal, but it has its own problems, including being inherently unstable. Predictive learning has real advantages, but it raises the question of how animals learn.

I see feelings such as hunger and tiredness as independent of and separate from actions. I see feelings as influencing the selection of action sequences that are recalled from memory and considered for execution. So feelings such as hunger and tiredness are input that is not the result of an action.

As someone who has studied both physics in the natural sciences and social psychology in the social sciences, I would like to connect these disciplines. Physics and the natural sciences have led nicely to engineering, where we can apply what we have learned about the world to change the world, thus also validating our knowledge. The social sciences have been far less successful in leading to social engineering. So it is more difficult to have confidence in the knowledge obtained in the social sciences.

In the natural sciences there has long been a dream to connect all the different branches through a unified theory. I have a similar dream that it might be possible to link the social sciences to the natural sciences. This work represents my effort to address this issue.

The theory is that ‘inner language’ instructions for actions are translated into successively more detailed instructions that eventually are converted into precise instructions to the muscles that control joint rotation. This addresses one direction of the missing link, going from mental and inner-language-based action plans to physical action observable in the world.

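The idea of translating an instruction into successively more detailed instructions can be sketched as a recursive expansion. This is only a toy under stated assumptions: the rule table `EXPANSIONS`, the instruction names, and the joint angles are all invented for the example and say nothing about how the actual prototypes represent the ‘inner language’.

```python
# Hypothetical expansion rules: each instruction maps to more detailed ones.
EXPANSIONS = {
    "wave hand": ["raise arm", "swing forearm", "lower arm"],
    "raise arm": ["rotate shoulder +90"],
    "swing forearm": ["rotate elbow +30", "rotate elbow -30"],
    "lower arm": ["rotate shoulder -90"],
}

def expand(instruction):
    """Recursively translate an instruction into primitive joint rotations."""
    if instruction not in EXPANSIONS:
        # No further rule applies: treat it as a primitive joint-level command.
        return [instruction]
    primitives = []
    for step in EXPANSIONS[instruction]:
        primitives.extend(expand(step))
    return primitives

print(expand("wave hand"))
# ['rotate shoulder +90', 'rotate elbow +30', 'rotate elbow -30', 'rotate shoulder -90']
```

Each level of the table only needs to know about the level directly below it, which is the sense in which the instructions become "successively more detailed".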
The theory is that visual and other perceptions can be translated into ‘inner language’ descriptions that in turn can lead to instructions for actions. We used the example of mimicry that leads to a description of an observed action so that the action can be copied. This addresses the other direction of the missing link, going from physical action observable in the world to mental and inner-language-based descriptions and action plans.

For years I imagined the mind as a person growing up alone in a submarine (or in a cave). To learn about the world he could stick out the periscope and look, or he could listen to sounds from outside, or maybe from a radio. It always seemed difficult to me to be in this position, especially in the childhood years. Once you know about the world and can manage spoken and written language, it is easier to make sense out of what you see through the periscope or hear over the radio. So an important question is how you get started.

I have met some people who seriously thought that some initial knowledge was encoded in the DNA, i.e. was innate. I always found that claim to be unreasonable and unlikely. But there is a catch-22 in that it is easier to learn additional things once there is a fair amount of initial knowledge, including the skill to learn. This challenge is even greater for species that do not have a shared language for communication and thus cannot benefit from language-based learning.

I always assumed that we get started with post-birth learning by trial and error, and that the same would be true for animals as for humans. The question then becomes: when you have no skills and little control even over your own body, how do you start learning? The second question is: what capabilities, skills and knowledge do you need to be able to learn from others (apprenticeship learning)? We end up with 3 types of learning: learning by ourselves (learning by trial and error), learning from others (apprenticeship learning), and formal instructions (language-based learning). Only the first 2 types seem to apply to many other species.

In the late 60s I took a course on abstract computing, with automata and
Turing machines. One of the more interesting assignments was to prove that
any mathematical / numerical calculation could be done with a
transformational grammar. We also used McCulloch-Pitts neurons to solve
computing problems. What I really learned was to abstract computing
problems from any specific hardware or computer language. I see the neurons
and the brain as a large and complex biological computing device. If I can
show how an information processing problem can be solved with a
conventional computer, then I can infer that the biological computer could
solve the same problem.

Using this approach, support for the theory is garnered by showing that it
is feasible for an abstract system to behave in a manner consistent with
the behaviour observed in an individual of the species. Building a
corresponding simulation model that exhibits the desired behaviour does not
prove that the body and brain solve the same computational problem.
However, it shows that it is feasible that the brain solves the same
abstract computational problem with similar information processing
algorithms. Any competing theorist therefore faces the challenge of
developing a theory whose feasibility can be demonstrated with a simulation
model that exhibits similar behaviour.

It helps if the theory and feasibility model are minimalist, i.e. as simple
as possible. In general, if a competing theory and model can reproduce the
same behaviours and are much simpler, with fewer elements and fewer
interconnections, then that theory and model are preferable. Galileo and
the two competing models of the solar system are a good example.

A theory is falsified if the corresponding simulation system is not capable
of simulating the targeted features of the behaviour. Of course there is
always the hope that one can fix or improve the simulation and thus rescue
the theory.

The theory was and is being tested with a successive set of simulations.
Each simulation makes different simplifying assumptions and, in general,
adds functionality. All of the simulations work on simplified stick-figure
skeletons. The current ‘working’ version assumes that there is a pair of
muscles for every plane of rotation for every joint, and that the angle of
rotation is controlled by the difference in tension of the two muscles. The
model calculates the different frames of reference relative to the ‘ideal’,
starting with the hips and going outward to the hands, feet, and head. The
next version under development is integrating limited stick-figure
comparisons for simple mimicry.

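The muscle-pair assumption and the outward chain of reference frames can be
illustrated with a minimal sketch. It is not the simulation itself. It
assumes a single plane of rotation, a linear relation between tension
difference and joint angle, and an ‘ideal’ pose in which the limb hangs
straight down. The names GAIN, joint_angle and chain_positions, and all
numeric values, are hypothetical.

```python
import math

# Hypothetical gain: radians of rotation per unit of tension difference.
GAIN = math.pi / 2


def joint_angle(flexor_tension, extensor_tension, gain=GAIN):
    """Angle of one joint in one plane, set by the difference in tension
    of its two opposing muscles (assumed linear for illustration)."""
    return gain * (flexor_tension - extensor_tension)


def chain_positions(origin, segments):
    """Walk outward from the origin (e.g. the hip) along a limb.

    Each segment is (length, flexor_tension, extensor_tension). Every
    joint rotates relative to the frame of the segment before it, so the
    heading accumulates from the hip outward.
    """
    x, y = origin
    heading = -math.pi / 2  # assumed 'ideal' pose: limb hangs straight down
    points = [(x, y)]
    for length, flexor, extensor in segments:
        heading += joint_angle(flexor, extensor)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points


# Thigh and shin with balanced muscle pairs: the leg hangs straight.
straight = chain_positions((0.0, 1.0), [(0.45, 0.5, 0.5), (0.40, 0.5, 0.5)])

# Tighten the hip flexor: the thigh swings forward by GAIN * 0.5 radians.
bent = chain_positions((0.0, 1.0), [(0.45, 1.0, 0.5), (0.40, 0.5, 0.5)])
```

With balanced tensions every joint angle is zero and the foot ends up
directly below the hip. Raising the tension of one muscle in the hip pair
moves the knee, and the shin follows because its frame of reference is
attached to the thigh.
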
A simple eye-ball verification is like the Turing test: does the simulation
model produce realistic, natural-looking actions (motion sequences)? The
simulation can be used to produce output that can be read by an animation
program such as Autodesk (Alias) Maya software to produce more
realistic-looking motion sequences.

Kinesiology uses cameras and body markers to record the precise trajectory
of limbs and joints during a specific activity. Modern animation techniques
based on the motion of real actors have made further advances. Because of
these quite accurate measurements over time, we can know quite precisely
where any given limb was at any given time relative to the stage. The
information, even though it might reflect the contour of the limbs and the
body, is very similar to the information generated by our simulated
stick-figure skeleton. Similar measurements are made for golf swings or for
investigating competitive sports such as running and swimming.

I have set up the simulation so that data can be collected to allow such
comparisons to be made. I have not collected kinesiology data, and I don’t
have the facilities. I have also not yet developed the data analysis tools
to support making the comparisons. These are potential future projects,
especially if I can find a lab with the equipment that is interested in
making such comparisons.

The general approach should be extendable to investigating action sequences
for other vertebrates.

Evolution has made individuals across successive species more complex.
Since the architecture and design for each of these individuals is carried
by the DNA at the time of their conception, we can estimate the innate
complexity of the individual through the information content of their DNA.

Learning and skill acquisition add to the complexity of individuals.
Finding a measure for the information content of individuals at different
stages in their lives would be an interesting challenge but is not
addressed here. It is relevant to this investigation because it seems
likely that some infancy and early childhood skills must have been learned,
since DNA is unlikely to carry enough information to account for those
skills.

I am always amazed how much information is flowing around in my body. At
the rate of sending a complete set of instructions to the muscles 60 times
a second, a lot of information is needed for the 90 minutes of The
Nutcracker. At the same time and the same rate we are receiving a complete
information update flowing in from the eyes. Even with the 10^11 neurons we
have, they would soon be filled up just from memorizing and performing the
ballet.

The layering design discussed above means I do not have to keep all that
information. Using the factors of 10 in the model, the second layer reduces
the information by a factor of 10, the third by 100, and the fourth by
1000. Reducing the information storage required means that I can remember
more, and a greater variety of action sequences, which in turn gives me a
better chance at survival and success, and thus provides an evolutionary
advantage.

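The arithmetic behind these figures can be worked through in a short sketch.
The rate of 60 updates a second, the 90 minutes, the factor of 10 per layer
and the four layers all come from the text. The function name and the
reading of "information" as a simple count of complete instruction sets are
assumptions made for illustration.

```python
FRAME_RATE_HZ = 60   # complete instruction sets per second (from the text)
DURATION_MIN = 90    # length of the ballet in minutes (from the text)
LAYER_FACTOR = 10    # reduction from one layer to the next (from the text)
LAYERS = 4           # number of layers considered (from the text)


def updates_per_layer(rate_hz, minutes, factor, layers):
    """Count the instruction sets each layer handles over the performance.

    Layer 1 works at the full muscle rate; each higher layer handles
    `factor` times fewer sets than the one below it.
    """
    base = rate_hz * 60 * minutes
    return [base // factor**k for k in range(layers)]


counts = updates_per_layer(FRAME_RATE_HZ, DURATION_MIN, LAYER_FACTOR, LAYERS)
for layer, count in enumerate(counts, start=1):
    print(f"layer {layer}: {count} instruction sets")
# layer 1: 324000 instruction sets
# layer 2: 32400 instruction sets
# layer 3: 3240 instruction sets
# layer 4: 324 instruction sets
```

Under these assumptions the top layer needs to hold only a few hundred items
for the whole ballet, where the muscle layer works through several hundred
thousand.
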
Layering reduces the requirements for information flow. A lower rate of
flow means that we can get speedy responses with relatively slow neurons.
The layers allow for local computers to manage high-speed communication
while the upper layers work more slowly but focus more on integrating
information flows, such as combining vision with action or coordinating the
left leg with the right leg.

The first advantage of layered computing is that the layers can work in
parallel. This is somewhat analogous to the evolution of computers, where
we have also gone to using more parallelism to solve separate problems. For
instance, the graphics card typically has its own processor and memory.

A second advantage is that the upper layers work at slower data rates and
thus have more time to solve somewhat more complex interactive problems.

There is gradualism in evolution. Successive species show incremental
changes. This also applies to the capacity for information processing. I
suspect that the layered approach to information processing evolved very
gradually. However, I suspect that all vertebrates have a somewhat similar
information processing architecture since they have to solve very similar
problems both in controlling action and in integrating perception with
action.

Layering involves multiple stages of information processing. Individual
learning adds the flexibility of responding with action sequences
appropriate for the local situation, i.e. learned responses. This is an
evolutionary advantage. The challenge is to show how the system might
acquire the information processing capabilities we hypothesize. The
capabilities must come from DNA and/or from learning. For our model we
hypothesize that higher-level choreographic instructions are translated
into low-level instructions for each of the muscle pairs.

At present, the investigation is focused on the integration of perception,
and on the representation of geometric information both for mimicry and for
the calculation of joint angles. As with language, the information should
be layered, with more detail at some levels and less at others, and with a
simple transformation between the layers. The idea of a dual clock seems to
apply to vision as well as to muscle management. At the retina we assume a
fast clock with probably the same timing as for direct muscle control. At
the higher layer of vectors and action plans we assume a much slower clock.

Other topics for future investigation include the widely shared notion that
we can understand each other and that we have a shared reality. An easier
topic is the notion of cooperation and social roles. The role of
institutional learning including play and lecturing could be of interest.

During my undergraduate years I was very interested in philosophical
discussions. Solipsism seemed attractive at least as a starting point for
infants, with access to others and the external world only through action
and perception (Kant, logical positivists, existentialists). Dualism is an
obvious challenge.

I had a very close friend, an artist, who was very interested in eastern
thought with its different approaches to reality. I even had a reasonably
well-paying job to investigate the aura (halo) that is seen by some people.
I devised an experimental paradigm that showed that almost all subjects
could not beat chance in aura perception. However, I had one subject who
could consistently beat chance with very significant probabilities. (Go
figure!)

At least conceptually, and from experience with interpreter design, I can
see how macro expansion with simple rules might generate the instructions
that are required by the successive lower-level process control computers.
I can also see how the timing would be about right.

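The idea of macro expansion with simple rules can be sketched as follows.
This is only an illustration of the general technique, not the design used
in the prototypes. The rule table, the instruction names and the function
expand are all invented for the example.

```python
# Hypothetical rule table: each higher-level instruction expands into a
# sequence of lower-level ones. All names are invented for illustration.
RULES = {
    "step_forward": ["swing_leg", "plant_foot"],
    "swing_leg": ["hip_flex", "knee_flex", "knee_extend"],
    "plant_foot": ["hip_extend", "ankle_flex"],
}


def expand(instruction, rules=RULES):
    """Recursively expand an instruction until only primitives remain.

    A primitive is any instruction that has no rule of its own; it stands
    for something a lowest-level controller could act on directly.
    """
    if instruction not in rules:
        return [instruction]
    expanded = []
    for part in rules[instruction]:
        expanded.extend(expand(part, rules))
    return expanded


primitives = expand("step_forward")
# primitives == ["hip_flex", "knee_flex", "knee_extend",
#                "hip_extend", "ankle_flex"]
```

One short instruction at the top becomes five at the bottom, which is the
kind of fan-out that lets an upper layer run on a slower clock than the
layers below it.
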
This investigation is congruent with the idea that evolution is a search
process that selects in favour of more optimal solutions for biological
structures and processes.
– satisficing --

Chomsky’s work on transformational grammars was interpreted to reflect a
universal structure for grammatical structures and judgements. A similar
universality was hoped for and anticipated for semantic structures. I
decided to work with logic as the basic approach, having the fortune to
learn from people like Arthur Burks and Joyce Friedman. Winograd, drawing
on the work of Hewitt, was trying a slightly different approach at MIT, to
model the ability of children to build towers with blocks, etc.

My interactive system started working about 1971, and further improvements
and the write-up meant that the thesis was finished in 1974.

The biological evolution of
language & learning – section 1 version 1