An Attempt at a First Essay
The goal of this essay is to look
at knowledge engineering from a theoretician’s perspective. More specifically, we would like to discuss
the notion of a minimally competent theory to help ‘engineer’ practical solutions
based on working with ‘knowledge’.
Our understanding of the world, and
the way in which we maneuver in it, is based on a string of ideas and theories.
I will begin by describing a
groundbreaking idea, and an uncommon occurrence. (This thread can be taken up again in a
discussion of how to manufacture these uncommon occurrences.)
Albert Einstein had a remarkable
theory: maybe light, in the presence of matter, does not travel in a straight
line.  In 1919, observation lent credence to that
theory.  During the solar eclipse of that year, the scientific community
measured the deflection of light from stars otherwise
hidden by the glare of the sun.  The results gathered were inconsistent with
the notion, accepted for hundreds of years, that light always travels in
straight-line trajectories.
The world of physics was tilted off its axis. The rest of the world was content to buy the
poster.
Theories can also come in smaller
denominations.
My high school science teacher had
a theory (though less original, I suspect): if we were to add equal parts
vinegar and baking soda, we would produce carbon dioxide and water. She lent credence to this theory with the help of the class.
In groups of two or three, we added various quantities of vinegar and
baking soda, and attempted to measure the amounts of carbon dioxide and water
produced. We found a spread of results,
all of which, uncertainty included, convinced us that our teacher was right.  The class might have been stunned, but probably
was not, as we took it for granted that our teacher would not mislead us.
Theories are, in fact, common
occurrences.
Architects and builders are in the
business of theories: they believe that they can plan the requisite steps in
order to erect a house. This plan is
multi-faceted and multi-layered.
Included must be a structure whose walls will not fall, even if
subjected to a wide variety of stresses, wiring that will provide future
inhabitants with electricity and telephone use, pipes that will allow running
water, etc. The spread of results for
builders and architects must be of a much smaller range. The design and use of the structures may vary
widely, but the buildings erected must be inhabitable and must abide by
established safety regulations.
In the following dissertation, I
would like to consider that which is involved in postulating a good theory.
First, let us dissect those
theories stated above.
All three theories are postulated
in a fashion understandable to members of the same academic or employment
community.
Einstein formulated his theory of
general relativity in the language of Mathematics. My high school science teacher expressed her
theory in the symbols understood by beginner chemists. Builders and architects write their plans in
charts and diagrams readable by other professionals. This is a description of the meta-theory of
the individual theories.
The meta-theory consists of the
components of the theory and the formalism.
A formulation of the meta-theory should include descriptions of
what the theory (or family of theories) should be responsible for and the
language in which it should be expressed.
The above theories postulate
something; they predict behavior or occurrence.
Albert Einstein postulated that
light curves around matter.  My high school
science teacher predicted that two substances combine to produce foreseeable
products.  Architects and builders
postulate that they can draft a finite number of steps that, once completed, will
lead to the creation of a house.
The theory itself should represent
a predictor, usable and understandable to anyone in a general population. It should be expressed in the language
decided upon in the meta-theory.
A theory should be, in some
fashion, verifiable.
Einstein’s theory was satisfying
because it was verifiable. He chose a
phenomenon that could not be explained using contemporary theory, and he
proposed a new theory. My science
teacher directed our class to verify the result that she postulated in
laboratory experiments. Architects and
builders produce structures whose lasting existence proves that their theories
are sound. Therefore, the above theories
fit in a model of behavior.
The model is the situation in which
the theory is evaluated. It is an
application of the theory to a particular situation in which its prediction can
be measured and its accuracy evaluated.
We will now treat another important
theory that has existed in various forms for millennia.
My parents seem to have theorized
that if I were subjected to between approximately
fifteen and twenty-six years of education, then I would become a productive
member of society. My community agrees;
it is not socially acceptable to drop out of school before a given age.  My country agrees; in fact, it is not
legal.
What is involved in this
theory? More aptly put: what would be
involved in constructing a theory of education?
Note that education is a good model
with which to begin, because it is institutionalized in stages. Passing from one stage into another is
dependent almost solely on passing a (standard) set of tests. Our model will thus not be artificial.
Therefore, in order to formulate an
answer to the above questions, we will first consider the components of the
theory. In other words, we will define
the terms, and the manner in which we will proceed to talk about this theory of
education.
The meta-theory:
If we were to postulate a good
theory of education, then it would handle the following components. It would predict progress in increments (i.e.
in years and semesters) and it would make long-term predictions. It would follow the learning cycles and
curves of real students. It would treat
a wide range of students, differentiated by intelligence, learning styles and
capacity, and rates of maturity.
A good theory of education should
be expressed in terms of intervals of semesters, intelligence, and maturity.
The theory:
A theory of education should
predict the course content that would allow students to pass the tests required
to pass to the next level of education.
It should predict the difference in test scores of students given a
range of intelligence and maturity. It
should in fact predict the course of study and the student-type required to
manufacture the next Albert Einstein or Mozart.
The model:
A theory of education may be
evaluated within the framework of public education. We can evaluate a given grade ten course
content administered to children who have demonstrated a grade nine competence
level (i.e. monitor the spread of test scores at the end of the administration
of the course.)
To begin to maneuver
inside our model, we must first decide what we would primarily like to
accomplish.  We will begin by formulating
a minimal theory.
We would like to start with an
educational policy that should translate into educational strategies. These educational strategies should translate
into educational tactics. These
educational tactics should finally be used to produce a curriculum (i.e. a course content for some particular course to satisfy some
particular educational goal).
We will begin by imagining or
modeling the behavior of a student. Our
model is solipsist. We will therefore
not represent any particular student, nor will we represent the average
student. The behavior of the student
that we imagine must have ‘something to do’ with the behavior of the students
more generally. (Although in a more
refined model, we would like to consider a wide spectrum of students in
consideration of energy, motivation, intelligence, IQ scores, etc.)
We will initially place our student
in grade ten.  If he enters a grade ten
level of education with a grade nine level competence (i.e. he passes the tests
required of a student leaving the grade nine level), then we would like to know
the requisite material that we must administer for the model student to pass the
tests required of grade ten students.
We must now make some limiting
assumptions.  We will model this student after
one who is given only textbooks, and we will thus not consider the way in which
material is presented.  One important
limiting assumption is that the information must be presented in sentences, as it
would be to a corresponding real student.
Therefore, our computer model of a student must demonstrate the skill of
sentence comprehension. The model student
must also learn in a method comparable to corresponding real students. We will thus present material and test the
model student in stages. Like
corresponding real students, we do not expect perfect scores from our model
student; in order to pass to the next stage in education, our computer model
must merely pass the tests to which it is subject.
Testing for understanding, even in
current educational practices, is a difficult matter with which educators
contend. We will require that our model
be cognitively competent.  We don’t know
whether our program truly understands what we are saying, but we require
it to ‘fake it’ and to answer our questions correctly.  We will thus require only that our computer
model of a student have the skills necessary to ‘fake’ understanding on the
tests to which we subject it, to act, within given constraints, as if it
understands.
Our model should be parallel in
measurement to the rules of evidence.
Consider two students at the same starting position, both vested with
the same information (i.e. in this example both given the same knowledge that
yields grade nine competence). If we provide the students with different
grade ten course contents, then one student should test higher on the tests to
which both students are subject at the end of grade ten.
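The comparison described above can be sketched in a few lines of Python.  This is only a toy illustration under strong assumptions: knowledge is treated as a set of named facts, a student retains every fact presented, and the names score and compare_curricula, as well as the sample facts, are hypothetical and not part of the theory itself.

```python
def score(knowledge, exam_facts):
    """Fraction of examined facts present in the student's knowledge."""
    return len(knowledge & exam_facts) / len(exam_facts)


def compare_curricula(start_knowledge, curriculum_a, curriculum_b, exam_facts):
    """Give two identical students different course contents and score both.

    Toy assumption: studying a curriculum simply adds its facts to what
    the student already knows.
    """
    student_a = set(start_knowledge) | set(curriculum_a)
    student_b = set(start_knowledge) | set(curriculum_b)
    return score(student_a, exam_facts), score(student_b, exam_facts)


# Illustrative, made-up facts: both students start at the same position.
grade_nine = {"fractions", "basic algebra"}
exam = {"basic algebra", "quadratics", "trigonometry", "vectors"}
result = compare_curricula(
    grade_nine,
    {"quadratics", "trigonometry"},  # course content A
    {"quadratics"},                  # course content B
    exam,
)
```

Because both students begin with identical knowledge, any difference between the two scores in result can be attributed to the course content alone.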
Now we must consider how to
serialize this model. Our model should
grow in accordance with real people. How
did the student get to a grade nine competence level? This should work in the same manner in which
a kindergarten student progresses; at the end of each step of educational
training, the student should have a possible cognitive competence of the
corresponding child at the same level, following the same curriculum.
How can courses be evaluated? Clearly, if we were to evaluate the entire
science curriculum of a school experience, or similarly, two years of physics
at the high school level, we cannot isolate the physics from the other courses
for which a corresponding real student would be responsible.  Understanding the physics depends on a
requisite amount of math and chemistry (for an adequate understanding of the
core problems treated in the physics course) and English (for sentence
comprehension and for communication skills).
Hence, if we would like to isolate courses, we must do so in the
smallest possible increment.
Why is this a
knowledge engineering problem? We should
be able to optimize the curriculum content for a given course, by selecting and
ordering the material and trying different sequences to maximize performance on
the examinations. Optimization can take
into consideration different entry qualifications, so that we can optimize
across a known distribution of students, resulting in an optimal distribution
of examination results.
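As a minimal sketch of what such an optimization might look like, consider the following Python fragment.  Everything in it is an assumption made for illustration: the three-unit curriculum, the rule that a unit is learned only when its prerequisites are already known, the entry distribution, and the function names.  It tries every sequence exhaustively, which is feasible only for very small curricula.

```python
from itertools import permutations

# Hypothetical toy curriculum: unit -> set of prerequisite units.
PREREQS = {
    "fractions": set(),
    "algebra": {"fractions"},
    "kinematics": {"algebra"},
}


def exam_score(order, entry_knowledge):
    """Fraction of units mastered after studying them in the given order.

    Toy assumption: a unit is learned only if all of its prerequisites
    are already known at the moment it is presented.
    """
    known = set(entry_knowledge)
    for unit in order:
        if PREREQS[unit] <= known:
            known.add(unit)
    return sum(unit in known for unit in PREREQS) / len(PREREQS)


def expected_score(order, student_distribution):
    """Average exam score over (entry_knowledge, weight) pairs."""
    return sum(w * exam_score(order, entry) for entry, w in student_distribution)


def best_ordering(student_distribution):
    """Try every sequence of units and keep the one with the best average."""
    return max(
        permutations(PREREQS),
        key=lambda order: expected_score(order, student_distribution),
    )


# Assumed entry qualifications: 75% know nothing, 25% already know fractions.
STUDENTS = [(frozenset(), 0.75), (frozenset({"fractions"}), 0.25)]
best = best_ordering(STUDENTS)
```

In this toy setting the best sequence is simply the one that respects the prerequisites; the interest of the full problem lies in replacing exam_score with the behavior of the model student itself.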
The theory is verifiable since we
can compare it to actual results.
The scope of the theory is limited,
but fairly wide.  Variants can be used to
optimize instructions for complex products, for operating equipment, and for learning new
procedures and jobs.  It applies where
the instruction is explicit, in manuals or other verbal form. It can be evaluated where the performance can
be measured.
The scope of the theory is limited,
since it does not deal with collaborations, or with learning situations where
the information is not presented in manuals but may be visual or is based on
learning by doing.
How would the theory work? Since it is very complex, it will have to be
computational rather than mathematical.
In other words, the theory is represented by a computer software
application for which the program or algorithms instantiate the theory in a
model. The program behaves as the theory
specifies it should behave.
What are the components of such a
theory/model?  There must be software
algorithms and data that instantiate the knowledge and capabilities at any
given stage. Rather than ‘dissecting’
the specimen, it should be possible to examine these algorithms and data. At the appropriate level of abstraction, the
theory is represented. At a lower level
of abstraction, the syntactic and mechanistic elements represent the notation
and infrastructure for the theory, rather than the essential content of the
theory.
How is such a theory/model
used? For our grade ten example, the
starting point must be a model/program which can pass grade nine
examinations. It must then be possible
to feed in the grade ten curriculum materials – as textual material. It must then be possible to feed in grade ten
examinations, and the system must be able to produce appropriate answers – not
perfect, but sufficient to fake it as a student.
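The cycle just described (start from an existing model, feed in textual material, then feed in an examination) can be sketched as follows.  This is a deliberately crude stand-in, not the model the essay calls for: the class ModelStudent, its keyword-matching answer method, the pass mark of one half, and the sample sentences are all hypothetical.  The point is only to show a program that answers well enough to pass without any claim to understanding.

```python
class ModelStudent:
    """Toy stand-in for the computer model of a student (hypothetical)."""

    def __init__(self, known_sentences=()):
        # Sentences the model has already absorbed, e.g. from earlier grades.
        self.sentences = list(known_sentences)

    def study(self, text):
        """Feed in curriculum material as plain text, split into sentences."""
        self.sentences += [s.strip() for s in text.split(".") if s.strip()]

    def answer(self, keywords):
        """Return the first stored sentence containing every keyword, else ''."""
        for sentence in self.sentences:
            lowered = sentence.lower()
            if all(k.lower() in lowered for k in keywords):
                return sentence
        return ""


def passes(student, exam, pass_mark=0.5):
    """exam is a list of (keywords, expected_substring) pairs.

    The pass mark is an assumption; a perfect score is not required.
    """
    correct = sum(
        expected.lower() in student.answer(keywords).lower()
        for keywords, expected in exam
    )
    return correct / len(exam) >= pass_mark


# Illustrative, made-up material and examination.
student = ModelStudent(["Water boils at 100 degrees Celsius"])
grade_ten_exam = [
    (["force", "mass"], "acceleration"),
    (["light", "speed"], "300,000"),
]
before = passes(student, grade_ten_exam)
student.study("Force equals mass times acceleration. Sound needs a medium to travel.")
after = passes(student, grade_ten_exam)
```

Here the model fails the grade ten examination before studying and passes it afterwards with one of two answers correct, which is the sense of 'not perfect, but sufficient' used above.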
In conclusion, we hope that we have
demonstrated how a competent theory would function, and what role it would play
in dealing with knowledge engineering tasks.