Some of the projects undertaken by Rainer von Königslöw at Task Re-engineering Inc.

Meccano Set expert system shell

Allied-Fibers, Virginia.

Allied-Fibers, Frankford, Philadelphia.

Precarn.

NSERC - Natural Sciences & Engineering Research Council

Hutchison Research - CanTech.

Syncrude Research.

CIBC.

York Bingo – fraud detection.

Ontario Cancer Foundation – an expert system to analyse data.

Extending expert system technology to handle text-based representation needed for legal reasoning and similar text-based information processing.

.Sort & sort_l

.sort_p

.trans

.qa & qa_index

.char_map

.tag_ex

.char_ex

CAE: Computer-Aided Editing for large & complex books.

TARA: The Automated Research Assistant

i_Link: The Intelligent Link.

Computer-Aided Support for maintaining large & complex books in multiple languages - (The Translation Assistant)

The Scoring System.

IDPC: Internet based wide area distributed process control

Meccano Set expert system shell

The purpose of the project is to develop an expert system shell that is suited both to Task Re-engineering consulting projects and to release as a software product.

The original capabilities of the shell allow expert systems to be created for process control and similar applications in manufacturing and other industries. Such an expert system:

•           supports extensions in process control to include feed-forward controls (predictive controls) with trend detection

•           can be added onto, and integrated with, existing process control or other data acquisition and information processing systems

•           is inexpensive: it runs on a PC and involves inexpensive projects to develop and integrate

•           follows the KISS principle in engineering - and consequently is easily understood not just by engineers but also by people in operations or on the shop floor

•           supports a clear layout of temporal data

•           supports a clear layout and documentation of control rules

These objectives have been extended to include problems of task re-engineering in banks and in professional practices such as law offices and accounting firms. In these contexts the output is not numeric controls but text documents. To apply expert system technology to these problems one must solve the problem of manipulating textual data in an automated fashion, i.e., of applying process control to the generation of text documents. To do so, the expert systems must:

•           recognize where there are alternatives or options in the text

•           allow the insertion or substitution of alternative text fragments

•           represent the logic for choosing alternatives in the IF-THEN format supported by expert systems

•           provide an "expert's" user interface that is appropriate for text-handling professionals such as lawyers

As in traditional areas of application, the expert system must continue to:

•           support a methodology of rapid prototyping, staging, and good project management, including good documentation and testing

•           provide special utilities and support for documentation

•           provide special utilities and support for quality assurance

•           support a dynamic (rather than static) lifecycle, with provisions for local customization and constant adaptation to a changing environment

•           support local documentation with desk-top publishing

•           support local quality assurance - with a special facility
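To make the text-handling requirements concrete, here is a minimal illustrative sketch in Python of IF-THEN rules choosing among alternative text fragments and substituting them into a template. The fragment names, rules, and template are invented for illustration and are not the Meccano Set's actual representation.

    # Minimal sketch of rule-driven text assembly: IF-THEN rules select
    # among alternative text fragments and substitute them into a template.
    # All fragment names, conditions, and wording are invented for illustration.

    FRAGMENTS = {
        "payment_net30": "Payment is due within thirty (30) days of the invoice date.",
        "payment_net10": "Payment is due within ten (10) days of the invoice date.",
        "late_interest": "Overdue amounts bear interest at 1.5% per month.",
    }

    # Each rule: (condition over the case facts, fragment to insert, slot in the template)
    RULES = [
        (lambda case: case["client_type"] == "preferred", "payment_net30", "payment_clause"),
        (lambda case: case["client_type"] != "preferred", "payment_net10", "payment_clause"),
        (lambda case: case["charge_interest"],            "late_interest", "interest_clause"),
    ]

    TEMPLATE = "TERMS\n\n{payment_clause}\n\n{interest_clause}\n"

    def assemble(case):
        slots = {"payment_clause": "", "interest_clause": ""}
        for condition, fragment, slot in RULES:
            if condition(case):
                slots[slot] = FRAGMENTS[fragment]
        return TEMPLATE.format(**slots)

    print(assemble({"client_type": "preferred", "charge_interest": True}))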

The technological problems to be solved include:

a) development of a special user interface - for the "expert" (not the end-user), for understanding the data and the logic of the application, for making local adaptations and testing them, and for good project management. This "expert" interface includes data representation in files, knowledge representation in rules and other structures, and message representation both from & to the end-user.

We also need procedures and formats for working with these in ways that the expert is used to, i.e., that do not require extensive training. To do so, we have to leverage existing knowledge and skills in handling traditional documents that portray the data, knowledge & messages. We need to:

•           automatically produce the documents, using desk-top publishing

•           allow word processing of the documents to manipulate the programs

•           alternatively, allow spreadsheet processing to represent and manipulate the programs

b) development of facilities for semi-automatic documentation, and for simplified quality assurance, so that non-computer experts in operations are able to do local customization under good project control and management

c) development of computer-assisted methodologies to explore and apply new methods of control, such as predictive controls, or controls appropriate for larger plant integrations (dynamic adaptations of control strategies).
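As a sketch of the spreadsheet idea above (control rules maintained in a table that the expert edits with familiar tools and that the shell reads back in), here is an illustrative Python fragment. The column names, variables, and messages are assumptions for illustration, not the Meccano Set's actual format.

    import csv, io

    # A rule table as an expert might maintain it in a spreadsheet and export as CSV.
    # Columns (assumed for illustration): variable, comparison, threshold, message
    RULE_TABLE = """variable,comparison,threshold,message
    reactor_temp,>,180,Reduce feed rate: temperature trending high
    feed_rate,<,2.5,Check feed pump: flow below minimum
    """

    def load_rules(text):
        return list(csv.DictReader(io.StringIO(text.replace("    ", ""))))

    def evaluate(rules, readings):
        ops = {">": lambda a, b: a > b, "<": lambda a, b: a < b}
        for rule in rules:
            value = readings[rule["variable"]]
            if ops[rule["comparison"]](value, float(rule["threshold"])):
                print(rule["message"])

    evaluate(load_rules(RULE_TABLE), {"reactor_temp": 185.0, "feed_rate": 3.1})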

The technological problems are addressed in the context of solving practical problems - based on experiences gained solving related problems for clients in consulting projects.

Personnel:

Project management & much of the development is done by R. von Königslöw, with occasional assistance from others.

Project status & plans:

The project started in Jan. 1989.

Data representation and testing: a facility for temporal data that supports the development and testing of trend detection algorithms was developed.  A novel interface with a spreadsheet format, intended for both data representation and knowledge representation, was completed for data.  A similar interface for the knowledge base has been completed, but was not found to be very satisfactory in a number of experiments; this line of experimentation has been suspended until further progress has been made exploring text embedding and free-text representations.

Several modules for message control were built to deal with persistence, suppression, delay, and redundancy.  Several experiments were conducted for applications in process control, where the expert system detects trends & provides advice.  (The advice may take some time to implement and take effect.  Message persistence ensures that the message 'gets through'.  Message suppression ensures that the advice does not become a nuisance.  The delay ensures that a reminder can be issued if a persistent trend continues.)  The first successful small experiments led to a larger experiment which constitutes a novel application of process control to the management of large mainframe computers, using a PC to monitor & send messages to operators. Initial results from this experiment are very positive.

Work on text embedding and document manipulation has started, and is anticipated to last through 1993 into early to mid 1994. A first small-scale experiment with this approach was performed early in the 2nd quarter of 1993.
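The status above mentions trend detection and the message-control behaviours (persistence, suppression, delay, redundancy). The following Python sketch shows one plausible shape for such mechanisms; it is illustrative only, with invented window sizes, thresholds, and messages, and does not reproduce the actual Meccano Set modules.

    from collections import deque

    class TrendDetector:
        """Detects a persistent upward trend over a sliding window of samples."""
        def __init__(self, window=4, min_rise=2.0):
            self.samples = deque(maxlen=window)
            self.min_rise = min_rise

        def add(self, value):
            self.samples.append(value)
            full = len(self.samples) == self.samples.maxlen
            return full and (self.samples[-1] - self.samples[0]) >= self.min_rise

    class MessageController:
        """Persistence: keep advising while the condition holds.
        Suppression: do not repeat the same advice every sample.
        Delay: issue a reminder if the condition persists past a deadline."""
        def __init__(self, remind_after=3):
            self.active_since = None
            self.remind_after = remind_after

        def update(self, tick, condition, advice):
            if not condition:
                self.active_since = None        # condition cleared, reset
                return None
            if self.active_since is None:
                self.active_since = tick        # first detection: advise once
                return advice
            if tick - self.active_since >= self.remind_after:
                self.active_since = tick        # persistent trend: remind, then wait again
                return "REMINDER: " + advice
            return None                         # suppressed in between

    detector = TrendDetector()
    controller = MessageController()
    for tick, temp in enumerate([70, 71, 72, 74, 75, 77, 78, 80, 81]):
        trending = detector.add(temp)
        msg = controller.update(tick, trending, "Temperature trending up: reduce heat input")
        if msg:
            print(tick, msg)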

Allied-Fibers, Virginia

1. Objectives: the project has two objectives:

•           to develop a system to aid design houses such as Catalytic Inc. in the specification of process measurement and control instruments, primary elements and control valves - for Allied Fibers projects

•           to develop an advisory system to aid engineers in assembling and reviewing cost estimates, and to aid in training junior people taking on this function.

2. The technological problems & risks:

Both of these applications are new targets for the application of AI and expert system technology. The main technological objective is to adapt the technology to be able to provide a functionally satisfactory and cost effective solution to the problems. Both problems include effective technology transfer. In the first case the knowledge transfer is from the client firm to the design house. In the second case the transfer is within the client firm - from senior engineers to junior engineers. In both cases the transfer of data alone is not sufficient; the transfer of decision-making rules must be included.

The technological problem involves providing a computer-assisted system with an "expert" interface that can guide junior engineers through making correct decisions - ensuring that important alternatives are not overlooked, and that the junior engineers understand the logic and the relevant data.

Present expert system technologies and methodologies have to be adapted to fit the requirements (in line with the concepts discussed for the Meccano Set above).

3. Personnel:

The basic design and some of the development is done by Rainer von Königslöw acting as consultant, with technical support from Allied Fibers at the Virginia Tech Center.

4. Project status & plans:

The basic design and some of the requirements have been determined in a feasibility study. The project is awaiting approval for further development.

 

Allied-Fibers, Frankford, Philadelphia

1. Objectives:

To develop a system to alert operators to potential safety hazards in the operation of a chemical plant. The process is exothermic and potentially quite dangerous. The plant is located in a suburb of Philadelphia, and a previous problem caused extensive damage.

To develop a system to help operators tune the process to improve process control, and thus to increase both production volume and quality.

2. The technological problems & risks:

Both of these objectives are addressed through the application of expert system technology. The main technological objective is to adapt the technology to be able to provide a functionally satisfactory and cost effective solution to the problems. Both objectives require close integration into existing process control and effective technology transfer. The production personnel in the plant have to understand and utilize the messages from the system. They also have to understand the logic, so that the system can be improved in reaction to changes in the plant, through the introduction of improved sensors and other engineering improvements.

Present expert system technologies and methodologies have to be adapted to fit the requirements, both in adapting to existing process control data and systems, and in meeting special needs such as delaying repeated messages to allow operators enough time to correct problems - yet reminding them if the problem does not go away.

Difficulties arise at three levels:

•           in extracting data from the constantly changing data stream of the process control system, while respecting its timing constraints (sampling rates)

•           in integrating the messages from the system to the operator into the messages received from the process control system, including avoiding false alarms and nuisance messages

•           in integrating the logic of the expert system into operations, so that it can be understood, maintained, and trusted

3. Personnel:

Project management & much of the development is done by Rainer von Königslöw acting as consultant, utilizing the Meccano Set.

4. Project status & plans:

The requirements evolve with experience with the system. The first version was installed in 1987, and has been evolving at a steady pace, as requirements become better understood.

 

Precarn

1. Objectives:

Precarn is a consortium of Canadian companies. Precarn is dedicated to precompetitive research in AI and robotics. Projects generally involve two member companies and a university. The intent is to develop technology that can be exploited by all member companies, and secondarily by other Canadian companies. To aid in facilitating exploitation, a separate project was organized under the exploitation committee, to help member companies and others understand the technologies, and to recognize and prepare for integrating the technologies into their own operations. In other words, the objective of the exploitation component is to help member companies re-engineer parts of their processes to take advantage of the new technologies. A second component is to investigate requirements arising from anticipated applications of the technologies and to feed them into the research projects.

2. The technological problems & risks:

In order to apply new technologies successfully, one has to understand the applicability of the technologies in redesigning processes. One also has to manage the projects of applying the technologies, where a component of the project consists in leading edge research.

A second area of problems and risks arises in developing technologies, where it is easy to "miss the boat" unless the requirements for the eventual applications are understood clearly.

3. Personnel:

Rainer von Königslöw acted as "industrial awareness" consultant, reporting to the Exploitation Committee.

4. Project status & plans:

The project involved developing and presenting feasibility prototypes to demonstrate the feasibility of applying the technologies. The majority of the activity was funded by NRC, the National Research Council. A second component involved interacting with NRC, the member companies and other potential users of the technologies to explore how the technologies might address their needs, and to help them understand the potential applications and benefits.

 

NSERC - Natural Sciences & Engineering Research Council

NSERC administers research funding for academic research through a variety of grants, using a peer group evaluation system for allocating the funds. There are committees, acting like juries, for different granting areas.  Rainer von Königslöw participated on the Computing and Information Science committee, on the Collaborative Project Grants committee, and on the Industrial Research Chair in Intelligent Software Systems committee.

Hutchison Research - CanTech

1. Objectives:

To develop an inexpensive method, applying desk-top publishing technologies and PostScript, to map information from a database into printed directories. The directories have to contain a lot of information in multiple columns and fonts, using other typesetting devices as well. The process of producing the directories has to be sufficiently inexpensive to allow the production of low volume runs for specialized audiences (e.g., Metro Toronto & the province of Alberta).

2. The technological problems & risks:

Several new computer technologies were integrated to migrate the production of the directories from traditional typesetting to inexpensive methods using "office" equipment: extracting variable-length database fields, mapping them into word processing format, mapping that into PostScript, and printing onto a 300 DPI laser printer with special paper to produce inexpensive masters for photocopying. It is a novel application of new technologies.
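As an illustration of the kind of pipeline involved (variable-length database fields flowed into a directory-entry layout), here is a small Python sketch. The field names and layout are invented, and the actual project produced word-processing and PostScript output rather than plain text.

    import textwrap

    # Variable-length database records, as they might come out of the directory database.
    records = [
        {"name": "Acme Analytical Labs", "city": "Toronto", "services": "soil assay; water testing"},
        {"name": "Borealis Drilling",    "city": "Calgary", "services": "core sampling"},
    ]

    def directory_entry(rec, width=38):
        # Bold face and font changes would be word-processing / PostScript codes;
        # here they are reduced to simple text conventions.
        header = f"{rec['name'].upper()}, {rec['city']}"
        body = textwrap.fill(rec["services"], width=width,
                             initial_indent="  ", subsequent_indent="  ")
        return header + "\n" + body

    # Multi-column page masters would be imposed downstream; this just emits the entries.
    print("\n\n".join(directory_entry(r) for r in records))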

3. Personnel:

Much of the development is done by Rainer von Königslöw acting as consultant.

4. Project status & plans:

The process has been developed and transferred for routine production use. Unfortunately even this less expensive process is still marginally too costly for the volumes and the price supported by the market. Still less expensive production methods may emerge from new technologies that are still in the research and development stage.

 

Syncrude Research

1. Objectives:

The project focuses on process control and has two objectives:

•           to develop an approach that allows plant integration, i.e., where information about the process at one stage in the plant is used to improve process control in another part of the plant. Material goes through the plant, and information from earlier stages of processes can be helpful in selecting processing strategies for later stages.

•           the approach must allow the integration of processing stages with different types of sensors and controls, e.g., the integration of a mining stage with a refining stage.

•           the approach must allow for intelligent path planning, for different types of feedstocks, and for different environments, including equipment failures and variations in the processing environment.

2. The technological problems & risks:

This type of application is a reasonable candidate for the application of AI and expert system technology, especially since information-poor mining stages are meant to provide input for information-rich refining stages. An integration of qualitative and quantitative information is needed. Existing expert system technology has to be adapted to allow for better representation of time and of pattern-recognition windows sliding over time. The main objective is to adapt the technology to be able to provide a functionally satisfactory and cost effective solution to the problems.

3. Personnel:

Much of the development is done by Rainer von Königslöw acting as consultant, under contract to Syncrude Research, which in turn is doing the applied research for the main operating unit.

4. Project status & plans:

A special facility supporting the development and testing of an expert system for predictive process control was developed. The facility was utilized for developing pilot prototypes of an expert system for predicting upsets in centrifugal extractors, and for providing advice to operators. The expert system is being developed by Syncrude Research and integrated into the plant process control system.

 

CIBC

1. Objectives:

There are a number of projects, all with the general objective of exploring how expert systems and related technologies can benefit the bank:

•           integrating expert systems into a future architecture for developing banking applications.

•           a sales workbench, to aid branch personnel working at the side counter or in an open style office to increase customer support, especially for complex financial products, and to increase cross selling of financial products where appropriate.

•           increasing the automation of mainframe systems operations, including performance, software maintenance, production support, contingency planning, etc.

2. The technological problems & risks:

These and other potential applications are targets for the application of AI and expert system technology. Most banking is still delivered with traditional COBOL based software on the mainframe, and with limited, C based software on PCs.

The main challenge is in the architecture and design of new applications that fit well into both the business environment of the bank as well as into the information processing environment of the bank. The "expert" interface can be utilized by banking experts to increase the responsiveness, accountability, and maintainability of application systems.

A better methodology is needed, avoiding the long cycle of user requirements, development, testing, and implementation. Effective technology transfer is also important, since there is an established software group that understands the banking applications.

Present expert system technologies and methodologies have to be adapted to fit the requirements and to fit into a demanding transaction oriented environment.

3. Personnel:

Rainer von Königslöw acts as consultant to the Architecture and Consulting group, which in turn lends his services to different groups within the bank.

4. Project status & plans:

This is an ongoing program, with a number of pilot projects, each with their own timetables. The basic program has been going on since about 1986, with occasional stops and starts. It still has a long way to go before expert systems and related technologies and methodologies are firmly integrated into the banking information processing environment.

 

York Bingo – fraud detection

1. Objectives:

To develop a system to aid in managing Bingo game cards, primarily to reduce opportunities for fraud and other potential abuses.

2. The technological problems & risks:

Bingo games are run by non profit organizations using volunteer labour. The halls may also be run by a non profit organization, using the proceeds for charitable purposes (e.g., York Bingo is run by the Fairbanks Rotary Club). The hall is limited in the percentage of the proceeds it can use as overhead to run the hall. The net effect is that Bingo halls & games are run by people with relatively little training or experience, at a relatively low level of professionalism.

On the other hand, there is quite a bit of cash involved, so that there is a certain amount of temptation for abuses. The challenge is to re-engineer the process using expert system technologies and methodologies so that the system can be run with little training, and so that it prevents potential abuses through closer controls on the management of Bingo game cards.

The challenge is in restructuring the tasks and devising an application so that it fits all the regulatory constraints, is easy for volunteers and low-skill personnel to understand, prevents potential abuses, and provides tighter accounting over game cards.

3. Personnel:

Project management & the development is done by Rainer von Königslöw acting as consultant.

4. Project status & plans:

The basic design and some of the requirements have been determined in a feasibility study, and explored with a manual, spreadsheet based procedure. The next stage of integration will involve the Meccano set to automate the spreadsheet data and controls. The previous stage of the project has been implemented in one Bingo hall and has been migrated to a second. The basic facilities are being built into the Meccano set and should be completed in early 1992. The final integration phase of the project is planned for late 1992.

Ontario Cancer Foundation – an expert system to analyse data

1. Objectives:

To develop a system to diagnose cancer patients from source records gathered in investigating potential cases of cancer.

2. The technological problems & risks:

The system not only has to simulate medical diagnosis, it also has to document the logic in such a fashion that medical practitioners and other researchers are satisfied that the system simulates their reasoning. A second challenge is that the data will be interpreted with statistical analysis methods. The logic in the inferential program must not violate important measurement assumptions, since doing so could invalidate subsequent statistical interpretations.

3. Personnel:

Project management & the development is done by internal staff, assisted by Rainer von Königslöw acting as design consultant.

4. Project status & plans:

An initial conceptual design is being drafted, at the same time that user requirements are being collected. A concept for testing the validity of the inferences still needs to be developed. The next stage of devising and testing prototypes for the medical reasoning is anticipated to start in mid 1992.

Extending expert system technology to handle text-based representation needed for legal reasoning and similar text-based information processing.

3 sub-projects:

a) inventing and developing a knowledge representation which can integrate text, including huge text bodies

a) Jan 1, 1994 to Feb. 19, 1995 (continued from 1993)

a) the first project (continued from 1993) involved inventing and developing a text-based knowledge and information representation that is appropriate for huge text-based information sets. A scheme was developed, along with suitable technology to transform information to fit the scheme. The scheme involves "naming" fairly small individual chunks of information so that they can be individually addressed and jumped to or retrieved. The scheme extends the concept of cataloguing and can be used to time-stamp information. To the best of our knowledge the scheme is novel and unique. The development of the supporting technology and the first test was slated to be completed by end of 1993, but in fact was not completed till mid February 1994. A first report on this approach is appended.

 

b) developing a set of software tools for automatic and semi-automatic text transformations

b) Feb 7, 1994 to June 22, 1995; -- phase 2: Aug. 8 to Nov. 18, 1994

b) the second project involved the development of a set of tools which allow automatic and semi-automatic text transformations. The first goal involved building a set of tools which could fully automatically transform the complete set of Ontario Statutes from a WordPerfect representation into a Folio electronic book representation for CDs. The transformation included the extraction of defined terms. Both English and French text would be transformed, and hypertext links would be built automatically. Development of the first tools started in early February. Full tool development started in April, with first phase completion on June 22. (This project did not include building an automatic index.)

The first experiments used Rexx, but these failed due to slow performance and limitations in large text handling. The second experiment used Awk and PolyAwk, with success in the transformations but with still inadequate performance. The goal was to do the transformation in 1.5 days, so that a partial update could be done overnight. This first version was too slow by a factor of 2.5 or 3.

The second phase of the project started in the middle of August with the development of special sort routines in C to help in building indexes and in speeding up the performance of the transformation. This phase was completed in the middle of November with a significant speed-up, but performance was not yet fully satisfactory. (The next phase is not anticipated until sometime in 1995, when transformation into HTML will be included.)

.Sort & sort_l -- these DOS routines are for lexically sorting an index, e.g., an index of defined terms. Each entry to be sorted must be represented in a single line, and the lines are sorted relative to one another. The target string on which a line is to be sorted must always have the same relative offset from the beginning of the line. This routine can handle large files, is relatively fast, and can ignore initial tags (C based). Unlike commercially available sort routines, this routine can handle very long lines and does not suffer from the usual size restrictions.
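A minimal Python sketch of the same idea (sorting whole index lines on a key at a fixed offset, optionally ignoring an initial tag). The sample lines and tag are invented, and the real sort and sort_l routines are C-based DOS programs built for much larger files.

    import re

    def sort_index_lines(lines, offset=0, strip_tag=True):
        """Sort whole lines by the string starting at a fixed offset,
        optionally ignoring an initial <tag> when computing the key."""
        def key(line):
            body = re.sub(r"^<[^>]*>", "", line) if strip_tag else line
            return body[offset:].lower()
        return sorted(lines, key=key)

    lines = ['<DT>zoning by-law | 12', '<DT>abatement | 7', '<DT>easement | 3']
    print("\n".join(sort_index_lines(lines)))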

.sort_p -- this DOS routine is for lexically sorting larger multi-line chunks of information. The line on which the sorting will be based must be uniquely identified with initial tags. This line must be the first line for the chunk of information, (or it must be a fixed number of lines from the beginning of the chunk.) The chunk can contain a variable number of lines, as it is bounded by the next unique sorting line or the end of the file. To give an example, the "record" entries in FileLaw (a Carswell electronic publication soon to be published) were not in alphabetical order. This routine was used to sort the files within provinces and within industry types to put the entry records in the correct lexical order. (C based)
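The chunk-sorting idea can be sketched as follows in Python. The "<RD>" tag marking the sort line and the sample records are assumptions for illustration, and the real sort_p is a C-based DOS routine.

    import re

    def sort_chunks(text, sort_tag="<RD>"):
        """Split a file into multi-line chunks, each beginning with a line that
        carries the sort tag, and sort the chunks by that line."""
        chunks, current = [], []
        for line in text.splitlines():
            if line.startswith(sort_tag) and current:
                chunks.append(current)
                current = []
            current.append(line)
        if current:
            chunks.append(current)
        chunks.sort(key=lambda c: re.sub(r"^<[^>]*>", "", c[0]).lower())
        return "\n".join("\n".join(c) for c in chunks)

    sample = "<RD>Payroll records\nretain 7 years\n<RD>Accident reports\nretain 10 years\n"
    print(sort_chunks(sample))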

.trans -- this DOS system translates from a specially formatted ASCII source format into Folio flat files for a design. During the same pass it may report errors and prepare index files. An associated err routine translates and helps to interpret the error reports. This system is complex and needs to be bundled with consulting for customization. The system is also highly novel, using parallel automatic editorial processes in synchronization. The system should be protected as proprietary, and thus can be licensed without technology transfer. This system is based on PolyAwk. The system comes with special utilities to prepare batch files automatically and to interpret error log files.

.trans2 -- this DOS system can also translate from source into Folio flat files, but it is much less powerful than trans. It is based on Rexx.

 

c) developing a set of tools for quality assurance and change management for huge bodies of text

c) Dec. 5, 1994 to Dec. 31, 1994 (continued into 1995)

c) the third project involves the development of a set of tools for quality assurance and change management of large bodies of text. This project started in the beginning of December, 1994 and is planned to complete in March or April 1995. Unlike change management in traditional engineering disciplines, there is very little formal change management control even in editorial productions which demand high accuracy, e.g., the publication of legal texts. Quality assurance is handled in a very labour intensive fashion by copy reading galley proofs. This process can be and must be automated if one is to handle large bodies of text in a timely and cost-effective fashion. The first approach is to automate and semi-automate the copy to copy reading and chunk checking that is now done on printed page proofs. A multiple window approach allows the textual material to flow in parallel, and to note differences into a change management report.

c) Here is a list of tools for quality assurance and change management of large bodies of text. (Note: Most of them were completed in 1995 or are still under development. None of them have yet been used by any client.)

.qa & qa_index -- these Windows routines are for quality assurance and change management. The routine compares two files and detects differences. The differences are noted in a file which can be saved or printed. The routine can also automatically prepare a change index which indicates the differences between successive versions and allows users to use jump links to go directly to the chunks of information which have been changed in the new release. (C based)

Example 1: qa could make a comparison between two successive releases of TaxPartner (Carswell) which indicates precisely what was changed in the new release. This change record can be used as a log of changes, and to get appropriate management approval and sign-off for the changes.

Example 2: qa_index can do what qa can do, and at the same time automatically prepare an index of the changes that can be integrated into the TaxPartner product. qa_index is still under development. It would need customization for each product to which it would be applied.
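A rough Python sketch of the comparison and change-index idea follows. The actual qa and qa_index routines are C-based Windows programs; the "<LN ...>" chunk-naming convention below is an assumption for illustration.

    import difflib

    def compare_versions(old_text, new_text, chunk_prefix="<LN "):
        """Report differing lines between two releases and list the named chunks
        (lines beginning with chunk_prefix) in which changes occur."""
        old_lines, new_lines = old_text.splitlines(), new_text.splitlines()
        report = list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))

        # Map each line of the new file to the most recent chunk name above it.
        chunk_of, current = [], None
        for line in new_lines:
            if line.startswith(chunk_prefix):
                current = line
            chunk_of.append(current)

        changed = set()
        for tag, _i1, _i2, j1, j2 in difflib.SequenceMatcher(None, old_lines, new_lines).get_opcodes():
            if tag != "equal":
                for j in range(j1, j2):
                    if chunk_of[j]:
                        changed.add(chunk_of[j])
        return report, sorted(changed)

    old = "<LN s.1>\nEvery employer shall keep records.\n<LN s.2>\nRecords are kept 3 years.\n"
    new = "<LN s.1>\nEvery employer shall keep records.\n<LN s.2>\nRecords are kept 7 years.\n"
    report, index = compare_versions(old, new)
    print("\n".join(report))
    print("Changed chunks:", index)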

.char_map -- this Windows routine checks the character set of a document and translates extended ASCII characters into the equivalent ANSI Latin 1 character set required for Folio flat files. The ANSI character set is the standard for Windows and for the Mac. The routine can also map string or tag encoded characters used in SGML and other representations. The routine can also map special characters into tag strings, such as mapping special characters into characters in special symbol fonts or using super and subscript tags. (C based)
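The character-mapping idea can be sketched in Python as follows. The mapping tables and tag strings below are tiny invented samples, not the routine's actual tables, which cover the full extended-ASCII and symbol ranges.

    # Map a few DOS extended-ASCII code points to ANSI (Latin 1) characters,
    # and map a few special characters to tag strings (tag names invented for illustration).
    DOS_TO_ANSI = {0x82: "\u00e9", 0x8a: "\u00e8", 0x85: "\u00e0"}   # é, è, à
    CHAR_TO_TAG = {"\u00b2": "<SU>2</SU>", "\u00b5": '<FT name="Symbol">m</FT>'}

    def char_map(raw_bytes):
        out = []
        for b in raw_bytes:
            ch = DOS_TO_ANSI.get(b, chr(b)) if b >= 0x80 else chr(b)
            out.append(CHAR_TO_TAG.get(ch, ch))
        return "".join(out)

    print(char_map(bytes([0x63, 0x61, 0x66, 0x82])))   # "caf" + DOS 0x82 -> "café"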

.tag_ex -- this DOS routine is for extracting all SGML or folio tags from a file and listing them. This list can be sorted with sort above. The routine is used for document analysis and QA. It is also used for converting from strange typesetting systems by first extracting the tags in preparation for subsequent translation and clean-up. (REXX based)

.char_ex -- this DOS routine is for extracting all SGML based special characters from a file and listing them (&xxx; format). The routine supports document analysis and QA. The routine can be used to customize char_map to assign Folio characters and strings to electronic versions of documents. (Rexx based)
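Both extraction routines amount to pattern matching over the file. A compact Python sketch of the idea follows; the sample tags and entities are invented, and the real tag_ex and char_ex are Rexx-based DOS programs.

    import re
    from collections import Counter

    def extract_tags(text):
        """List all SGML/Folio-style <...> tags with their frequencies (tag_ex idea)."""
        return Counter(re.findall(r"<[^>]+>", text))

    def extract_entities(text):
        """List all SGML character entities of the form &xxx; (char_ex idea)."""
        return Counter(re.findall(r"&[A-Za-z0-9#]+;", text))

    sample = "<BH>D&eacute;finitions</BH> <JL>see &sect; 12</JL>"
    print(extract_tags(sample))
    print(extract_entities(sample))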

CAE: Computer-Aided Editing for large & complex books

To discover and explore schemes and algorithms for semi-automating quality assurance on large and complex texts. SGML only supports internal structural consistency checks, which did not provide sufficient support. An approach is needed to allow testing documentation against previous versions and against software, by content, format, and structure. A scheme based on successively marking text, extracting lists, and substituting information was developed. The first coding scheme worked well for English but could not handle French (accented characters). A new coding scheme was developed.

1. A methodology based both on files (information representation) and tasks (information transformations) was developed and explored. The common approach is based on tasks and resources, but does not handle information and knowledge.

2. A quality assurance scheme was developed looking both at internal consistency of the information and at the correspondence between small information chunks (components of the document) and software. The information extraction and comparison tasks were automated with software tools and utilities, and with lists.

3. A list based information insertion - transformation scheme was conceptualized and explored for automated maintenance and transformation tasks.

 

TARA: The Automated Research Assistant

To produce a software product for high schools and the undergraduate curriculum of universities to help students learn about the language and reasoning involved in scientific experimentation. It was theorized that it should be possible to represent the knowledge of a researcher / research assistant using a knowledge-based system. If domain knowledge and knowledge about probabilistic models was included, then it should be possible to do experiment simulation based on this knowledge representation. Dr. Rainer von Königslöw developed an experiment simulation system in the early 70's in Fortran IV for mainframes. The system was used very successfully for some of the earliest interactive learning experiments at Queen's University, and other places. (Reported at conferences around 1976). Graduate students could add new models with help from Dr. Rainer von Königslöw. Graduate students tutored undergraduates in learning about scientific experimentation.

 

The expert system shell has to be extended and redesigned to allow representing abstract knowledge about experimentation and statistical analysis as well as more concrete knowledge about various experiment domains. Rather than depending on knowledge engineers, the knowledge representation must be simplified so that the models can be developed by teachers and students. In other words, the models must be very easy to develop. To fit the classroom, and to make experiment simulation affordable, the system must run on PCs instead of mainframes. Graphics must be incorporated.

The system focuses on the language and reasoning underlying experimentation, rather than on specific techniques. This supports more abstract learning about science and complements current laboratory and classroom approaches.

The system is student based, and encourages the student to learn through exploration and by making mistakes. The student is empowered by asking an "automated research assistant" to do the actual experimentation. Experiments can be done quickly (20 minutes or so), so that the student can do several, and thus improve over time. Scientific experimentation is represented as an expert system - knowledge-based representation. This allows advanced students to understand the underlying modeling, and to assist in adding models.

Reasoning about experimentation must be represented as expert system based knowledge representation. This includes 3 levels of reasoning:

At the abstract experiment design level, the system has to reason about topics such as surveys, within-group or between-group designs, etc. The reasoning structures must carry forward into reasoning about data analysis.

At the specific domain level, the system has to represent and reason with independent and dependent variables, apparatus, materials, etc.

At the specific event level, the system has to support probabilistic reasoning to support modeling different types of behaviours. Special events and special cases that can cause experimental error must be supported (e.g., contamination, overdoses, etc.).

The challenge is to represent the different levels of abstraction involved in experimental design. We also need to represent reasoning with probabilistic models -- and incorporating this kind of predictive reasoning into a goal-oriented, backward-chaining expert system shell. (In this case the model has to analyse the design and predict the data -- in contrast to a trend detecting expert system, where probabilistic trends are analysed from the data and reasoned about.)
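As a sketch of the event-level probabilistic reasoning (predicting data from a design), here is a toy Python simulation of a between-group experiment. The domain model, parameters, and contamination event are invented for illustration and are not TARA's actual knowledge representation.

    import random

    # Toy domain model: a treatment raises the mean response, with a small chance of a
    # "special event" (e.g., contamination) that produces an outlier observation.
    DOMAIN = {
        "control":   {"mean": 300.0, "sd": 30.0},
        "treatment": {"mean": 340.0, "sd": 30.0},
    }
    CONTAMINATION_RATE = 0.05

    def simulate_subject(group, rng):
        value = rng.gauss(DOMAIN[group]["mean"], DOMAIN[group]["sd"])
        if rng.random() < CONTAMINATION_RATE:        # experimental-error event
            value += rng.choice([-150.0, 150.0])
        return value

    def simulate_experiment(n_per_group=20, seed=1):
        rng = random.Random(seed)
        return {g: [simulate_subject(g, rng) for _ in range(n_per_group)] for g in DOMAIN}

    data = simulate_experiment()
    for group, values in data.items():
        print(group, round(sum(values) / len(values), 1))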

i_Link: The Intelligent Link

The specific objective for this year's research project is to provide a method, with appropriate tools and utilities, to allow the interlinking of separate complex electronic books. These books may be developed and maintained by separate organizations. They may also be distributed independently of one another, at different times. They may also be based on different technologies, such as Folio infobases, the Windows Help engine, word processing, or Internet browsers. Yet it is desirable to enable the user to link seamlessly from one book to another. This involves searching for the book on a network, finding the most recent copy, opening the book, and going to the appropriate topic (bookmark, heading). There is no such tool or method available at present.

The research was broken into phases. The sequence of phases allowed a systematic and graduated exploration not only of technical internal complexity, but also of increasing complexity in the user interface.

The first phase objective was to link from a context sensitive Windows Help document to an electronic book from an outside publisher to provide more in depth information. Users of the help documentation may or may not subscribe to the publication. There is a monthly update to the publication. The publication can expire. The first phase focused on finding the most recent version of the publication, checking for permission and expiry, and linking to the publication.

The second phase objective was to link to two Windows help files and to display them side by side. This objective arose from the need to do quality assurance on translated documents. In large and complex documents with thousands of topics quality assurance becomes problematic. It occurred to me that the same linking mechanism could be used to solve this problem.

The third phase objective was to link to word processing documents, with the ability to jump to specific topics or headings. The target document should be user maintainable, even if it is initiated from a template and supported with macros. The links should be able to originate from Windows programs, Windows help documents, or Folio infobases. The same link mechanism (program, etc.) should support multiple originating and destination documents for the same user.

The fourth phase objective was to link from word processing documents with a simple, user friendly user interface to all the supported destinations.

The fifth phase objective was to link to internet documents.

Future phases might involve automatically retrieving and sending parts of documents.

Other concepts involve decision support for selecting documents, and automatic background processes for filling in documents (e.g., a screen print & a debug log automatically inserted into a problem report - support request document, which might be opened on a word processor for the user to add comments and other details, and which might then be sent or faxed automatically).

•           The fourth and fifth phases were not addressed in 1996.

Functionality and usability: Two types of users must be supported with appropriate functionality and user interfaces. The end-user uses the links and may maintain one or more of the relevant word-processing documents.  The documentation specialist prepares and maintains the documentation that is distributed to the end-user.  The document specialist is the target client for the output from this research, but the end-user, the client's client, is the prime target of the usability of this technology. Usability for the document specialist has to be kept in mind, but can be improved later.

The primary focus in all four phases was functionality and usability for the end-user.  The goal was to make the link look natural in context, and to minimize the need for user intervention.

A secondary focus was on creating automated utilities for preparing the files and templates needed to support the linking functionality and which have to be distributed with the electronic document(s).  Text transformation processes, a primary research focus for 1994, were adapted and applied to extract lists of topics and to prepare templates for managing topic, title, and query translation for the linking process.

Advanced & distinctive features:

Phase 1: Relevant files are found on the network, and paths are noted. Links can stale date and trigger searches for new versions (updates) of documents.

Phase 2: Context sensitive help can chain across multiple documents, some of which are client maintainable.

Phase 3: Users can annotate and maintain links to published documents, even though the electronic files are replaced with updates and new releases.

 

The long term objective is to discover and develop both a methodology and a technology for interlinking large and complex documents representing interconnected knowledge structures rather than just data.

The initial (phase 1) subgoal for this taxation year was to focus on linking from a large and complex electronic text to a commercially available electronic publication.  There are three technical challenges:

•           To design and test an unobtrusive user interface with minimal interactions for users who are not technically sophisticated (e.g., legal researchers, payroll clerks), who just want information quickly and do not want to be distracted by a linking process.

•           To design and test algorithms which set up the link between electronic books, keep linkage information between uses, track when the linkage information might be "stale", i.e., when a new release should be looked for.
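A simplified Python sketch of the phase 1 behaviour (searching for the publication, preferring the most recent copy, and flagging stale link information) follows. The search roots, file name, and expiry convention are assumptions, and the real i_Link is a Windows program written in C.

    import os, time

    def find_most_recent(filename, roots):
        """Search the given roots (local drives, network shares) for filename
        and return the path of the most recently modified copy, or None."""
        best_path, best_mtime = None, -1.0
        for root in roots:
            for dirpath, _dirnames, filenames in os.walk(root):
                if filename in filenames:
                    path = os.path.join(dirpath, filename)
                    mtime = os.path.getmtime(path)
                    if mtime > best_mtime:
                        best_path, best_mtime = path, mtime
        return best_path

    def link_is_stale(cached_path, max_age_days=31):
        """Linkage information goes stale if the cached copy is missing or older
        than one monthly update cycle; a stale link triggers a new search."""
        if cached_path is None or not os.path.exists(cached_path):
            return True
        return (time.time() - os.path.getmtime(cached_path)) > max_age_days * 86400

    cached = None                      # no linkage information kept from a previous use
    if link_is_stale(cached):
        cached = find_most_recent("payroll_manual.nfo", roots=["."])
    print("link target:", cached)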

A second (phase 2) subgoal for this taxation year was to experiment with opening more than one link to parallel destinations from the same source document. If both targets can be available simultaneously, then it can be used for comparing documents, e.g., quality assurance for translated documents by displaying the content for the same topic side by side. Another use is to compare sequential versions of the same document. The first linkage target chosen was the Windows Help engine, because of the perceived need and difficulty for performing quality assurance.

•           To design and test algorithms which open two Windows Help files side by side on the screen, at the same topic. (This functionality is not supported by the Windows Help engine embedded in Windows.) It must also be possible to randomly access topics, or to step through topics sequentially.

A third (phase 3) subgoal for this taxation year was to experiment with multiple links to multiple destinations from the same source document. If multiple targets are available sequentially, then it can be used for linking to multiple publications, or to also link to user maintained documentation. For this latter use, the link has to be to common word processors.

•           To design and test algorithms which open a word processor in a separate window, find and load a selected document, and jump to a chosen topic. It must also be possible to randomly access topics in a large document containing thousands of topics.

•           To design and test a combination of automation and a simple user interface which tracks the word processor used by the end user and adjusts the link accordingly (Word, WordPerfect, different versions). This selection and adjustment process should be automatic or make minimal demands on the end-user.

•           To evaluate and optimize the resource requirements and performance of the linking mechanism so that it could work in context of appropriate usage. For instance, the end-user may be working with an application, ask for context sensitive help, and from the help documentation link to other publications or to client maintained, word processor based documentation. All of this has to happen on a low end to average workstation (for a clerical position), with limited resources and in a network environment.

A next (phase 4) subgoal for this taxation year was to allow the user to link out from word processing documents, using the same linkage mechanisms.

•           This objective was not directly addressed in 1996, but it was anticipated in the designs.

 

Advance:  To discover and explore methods and algorithms for automatic and semi-automatic links between large and complex documents on diverse software platforms, and from different sources.

•           A method and tools for linking from Folio infobases or from Windows help to a bound Folio infobase. (An instance of this is installed in the Royal Bank Payroll product: "Images", linking to the Carswell product: "The Canadian Payroll Manual on disk".)

•           A method and tools for linking to two documents side by side and randomly or sequentially accessing the same topic. (This method was used in early 1997 to do quality assurance on the Royal Bank Images documentation, comparing the English and French Help topics - more than 1600 topics.)

•           A method and tools for linking to Word and WordPerfect documents. (Because of technical limits in WordPerfect, only WordPerfect 7 and up can support the linking functionality.)

 

1. The core components were designed and programmed in C: Finding files on all hard drives and across the network, selecting the most recent version; Displaying bitmap graphics while waiting; Displaying metafile graphics while waiting; Managing the user interface and the Windows messages, both for the main loop and for the dialog boxes; Assembling the link call string and calling. The interface was designed to allow dynamic language switching: English & French.  The logic, performance, and user interface were tested in a sequence of prototypes.  The call to i_Link was embedded both in Windows Help and Folio documents as hyperlink calls, and more tests were done. Finally the system was tested and adapted with a number of networks.

2. For phase 2, a topic translation map was designed for the random and sequential access, and so that it could be automatically extracted from Folio or WinHelp documents. A dual link call was designed and tested. This latter component gave a lot of problems since Windows does not allow the user to use the WinHelp engine re-entrantly. After many experiments and not much (expensive) help from Microsoft technical support, a method was found to open two simultaneous copies of the WinHelp engine, and to drive them in parallel. The user interface had to go through many alternative prototypes, since the link mechanisms had a tendency to obscure or interfere with the side by side document displays. A satisfactory solution was found, using a single character height control strip at the bottom of the screen.

3. For phase 3, the call on the word processor could be managed in C within i_Link, but the jump to the appropriate topic had to be done with word processor macros. An appropriate ini file had to be designed to pass information to the word processor. Unfortunately, different word processors, and even different versions of the same word processors have different capabilities, so that designing a scheme that is transparent to the user proved quite challenging. Also, making the system work for large and complex documents, with more than a thousand topics proved to be quite difficult. We do not yet have a satisfactory solution for WordPerfect 6.1 or earlier.

 

Experiments were performed with computing environments, to find the most recent versions of electronic books. Some networks, and some CD readers (especially when not containing data CD's) presented some challenges that had to be overcome.

Experiments were performed with large and complex bodies of text, including electronic publications and user manuals for complex software systems. A range of linking methods and user interface approaches were explored. A generic scheme for extracting topics and linking between documents was investigated.

Experiments were performed with usability, focusing primarily on the end-user. Special cases, such as finding some but not all of the files, proved to be difficult to keep simple enough for the intended users. The user interface for the quality assurance testing also needed a number of experiments to minimize complexity and to make it work fast enough so that testers could compare large documents with more than 1000 topics in a reasonable time span.

Computer-Aided Support for maintaining large & complex books in multiple languages -- (The Translation Assistant)

The specific objective for this year's research project is to provide a method, with appropriate tools and utilities, for reducing the time and cost for maintaining translations of large and complex texts. There is no tool or method available at present which reduces the translation costs and time for preparing a second or third version of a document.

Many large and complex documents are maintained in both English and French through a succession of versions or editions.  For example, software products go through a sequence of releases of equivalent English and French versions. For each such release, both English and French documentation has to be prepared.  Traditionally the software is developed first, then the English (or French) documentation, then a translation into French (or English).  (For ease of presentation we will talk about English development and translation into French, with the understanding that the same concepts apply to the reverse).  For each successive release of the software, the English documentation has to be updated and then sent for translation.  The translation adds expense and delays.

Similar considerations apply to other large and complex, synchronously updated multi-lingual documents, such as legal texts.  For simplicity in presentation we will continue with the software example.

Concepts and processes:

In preparing a new release, we are dealing with four distinct documents:

1. The old English document is the documentation for the previous English release.

2. The new English document is the documentation being developed for the 'soon to be released' English version of the software.

3. The old French document is the documentation for the previous French release of the software.

4. The new French document is the documentation for the 'soon to be released' French version of the software.

The process of preparing the translated new version has several phases, after the new base version is ready:

•           pre-processing: preparing the material (document) to be submitted for translation, including any instructions for the translators.  Some of these instructions may be embedded in the document.

•           translating the material as submitted, according to the instructions

•           post-processing the translated materials, as received from translations.  This may include merging newly translated parts with unchanged parts from the previous version.  It may also include deleting embedded instructions, and doing quality assurance.

This is followed by the normal processes for preparing the version, i.e., 'publishing' the material as printed document, as Windows Help file, etc.

There are two traditional approaches to translation:

•           Complete translation is the process of generating the new French document (4) from the new English document (2).  This is the most expensive and time consuming process.  It must be done for new software, but it is also commonly used for software updates.  (Since the complete document is translated, there is no pre-processing or post-processing stage required.)

•           Selective translation is the process of generating the new French document (4) by merging unchanged portions of the old French document (3) with translations of the changed parts of the new English documents (2).  This approach is faster and less expensive, but it is very difficult to manage, since it is easy to miss minor changes or to make mistakes in merging.

A complicating factor is that document development and translation is often done by different groups in different locations under different contracts, so that the process must involve clean hand-overs and minimal interaction requirements.

Goal:

The overall goal is to make the translation process both faster and less expensive by using computer aided text editing both to prepare the text and to merge the translated text.

A secondary goal is to make the translation process 'safer'.  Quality assurance is a major problem for large and complex documents that have to be produced fast and yet be very accurate.  Reuse of previous material reduces the risk, but the process of splitting and merging can increase the risk.

An added constraint is that any technical pre- and post-processing cannot increase either the time or cost as compared to a complete translation or as compared to the approach currently used.

•           There are two users with this approach: the person doing the pre- and post-processing, and the translator.  The new process must have minimal impact on the translator and the translation process.

 

The long term objective is to discover and develop both a methodology and a technology for developing, maintaining, enhancing, and interlinking large and complex documents representing interconnected knowledge structures rather than just data.  Maintaining documents in multiple languages is part of this environment.

The goal for this taxation year was to focus on the translation process.  There are three technical challenges:

•           To design and test an unobtrusive user interface with minimal instructions and training requirements for translators, who are not technically sophisticated, who just want to translate and do not want to be distracted by a new process.  Use of word processing is assumed, and standard word processing systems must be supported.  To allow the use of basic word processing skills without additional training, a special document representation and a corresponding editing method had to be designed for selective translation.

•           To design and test a user interface for the pre- and post-processing.  The interface should be usable by the technical writer who develops the new version, or by the editor who reviews the new version (2).

•           To design and test software which does the pre- and post-processing, i.e., which prepares the document for selective translation and does any required post-processing.  In parallel, a document structure has to be designed for the old and new versions which supports the software functions in keeping the documents aligned.  This alignment is crucial in semi-automatically inserting the change and insertion requirements for the selective translation.

 

Scientific or technological advancement achieved

A computer-aided approach to translation was invented which has a novel method for preparing a large and complex document for translation.  The document must be updated from a previous version, with a translation for the previous version.  The associated software represents advances both in user interface design and in semi-automated text processing.

•           Windows95 or NT software (32 bit) with a novel user interface for preparing a document for selective translation.  A special text editor contains three windows.  The top window contains the old English version of the document. The middle window contains the new English version of the document, and the bottom window contains the old French version of the document, which is being edited and marked up for the selective translation.  The system automatically keeps the three documents aligned in equivalent places, and assists in marking changes and insertions in the new French document. Old to new English uses a character by character comparison, while Old English to old French compares and aligns on the tag structure (e.g., SGML, HTML, Folio tags).

•           A novel document-based user interface for selective translation.  All changes and insertions that are required to change the old French to the new French are marked in the old French document.  Insertions are marked with unique insertion tags, and contain the new English text that is to be inserted.  Changes are marked with special tags - including the old and new English as well as the old French text.  This marked up document can be edited in any standard word processing system.

•           A novel extension to text processing keeps the three documents aligned.  Aligning the old and new English documents uses both the internal tag structure and a character by character comparison.  Aligning the old English with the old French uses the core tag structure underlying the documents.  Not all tags can be used since the languages have different sentence structures and word sequences.  The richer the document, i.e., with embedded jump links and other features, the better for this comparison and alignment approach.  The program automatically flags differences in tag structure, in case that the old English and old French versions are not fully equivalent.  Documents have to be mapped into an SGML, HTML, or Folio tag structure to support document alignment and difference detection.  Other document tag structures will be explored in the future (e.g., RTF).
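The alignment-on-tag-structure approach described in the last point can be sketched as follows in Python. The tag set, documents, and flagging behaviour below are simplified illustrations of the idea, not the actual software.

    import difflib, re

    def tag_skeleton(text):
        """Reduce a document to its sequence of structural tags (headings, jump
        links, etc.); content between tags is language-dependent and is ignored."""
        return re.findall(r"<[^>]+>", text)

    def check_alignment(english_text, french_text):
        """Compare the tag skeletons of two language versions and flag places
        where the structures diverge (i.e., where the versions may not be equivalent)."""
        ops = difflib.SequenceMatcher(None, tag_skeleton(english_text),
                                      tag_skeleton(french_text)).get_opcodes()
        return [(tag, i1, i2, j1, j2) for tag, i1, i2, j1, j2 in ops if tag != "equal"]

    old_english = "<HD>Overtime</HD><p>Time and a half.</p><JL>See s.22</JL>"
    old_french  = "<HD>Heures supplementaires</HD><p>Taux majore de moitie.</p>"
    for mismatch in check_alignment(old_english, old_french):
        print("structure differs:", mismatch)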

 

Activities in the taxation year:

1. The 3 window comparison and editing component was designed and programmed in C: Loading relevant parts of the 3 documents into 3 buffers, moving through the 3 buffers while keeping track of tags to keep the documents aligned, and supporting the relevant semi-automatic editing functions for deletions, for marking insertions and changes while copying the relevant pieces from the old English and new English documents.  Managing the alignment and managing the buffer structure proved difficult for large and complex documents. A number of approaches were evaluated with a series of prototypes.

2. The document for the translators was designed with SGML-like tags which carried the instructions on what to do: <change><from><to> in the tags.  There was experimentation with tags so that they are easily understandable by the translators, yet sufficiently distinct from other embedded marks and tags.
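
For illustration, a minimal C sketch that emits one such change instruction into the marked-up old French document; the exact tag names, and in particular the element carrying the old French text, are assumptions rather than the tag set finally adopted:

/* Minimal sketch: write a <change> instruction carrying the old and new
 * English plus the old French text to be revised, as described above.
 * Tag names are illustrative only. */
#include <stdio.h>

static void emit_change(FILE *out, const char *old_en,
                        const char *new_en, const char *old_fr)
{
    fprintf(out, "<change>\n");
    fprintf(out, "  <from>%s</from>\n", old_en);   /* old English */
    fprintf(out, "  <to>%s</to>\n", new_en);       /* new English */
    fprintf(out, "  <text>%s</text>\n", old_fr);   /* old French to revise (assumed tag) */
    fprintf(out, "</change>\n");
}

int main(void)
{
    emit_change(stdout,
                "Fees are charged monthly.",
                "Fees are charged quarterly.",
                "Les frais sont factures mensuellement.");
    return 0;
}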

Systematic experimental investigation:

including analyses and experiments, interpretation, conclusions

The first set of experiments attempted to mark up an output document fully automatically, using previously developed computer-aided editing tools (developed in previous years with SR&ED support).  This experimentation failed as soon as large, complex documents were used as input materials.  It turned out that supposedly matching versions (old English and old French) do not fully match, since last-minute changes can be made to the English version while it is in translation.  As soon as the document structures do not match, human judgment is required.  It was decided to discontinue the fully automatic approach, since, on anecdotal evidence, these last-minute changes were quite prevalent.

The second set of experiments was based on the first prototypes of Comp_3, the 3 window system, and tried to produce snippets of text for selective translation, marking both the copy of the old French and the snippets for later insertion when the snippets would return from translation.  In discussions with the translation group for the Royal Bank in Montreal, it became clear that context was important for stylistic consistency, and that the snippet approach was not satisfactory.

The third set of experiments was performed with later prototypes of Comp_3.  A strategy was developed of making a copy of the old French version and marking it for deletions, insertions, and changes.  In the final version only content insertions and changes are marked to go to translation.  Content deletions and tag (document structure) changes are made during the editing process.

 

Progress made toward the research objectives:

A novel method and appropriate software were invented for marking up a previously translated version to indicate the insertions and changes required to convert it to the new version.

After completion of the research project, the method and the beta prototype software were used to translate a user manual for the Royal Bank payroll technologies system - Images(tm).  (Note: the consulting project to use the beta technology for the Royal Bank payroll application is not part of this project, but it reflects the progress.)  The client and the translation department seemed satisfied with the approach.  The resulting translation needed minimal clean-up to remove some of the editing marks.

Note: The general approach and the software system have not yet been successfully transferred to an in-house document developer or editor.  More research may need to be done on that part of the interface.

The Scoring System

The specific goal

To invent, design, and develop a scoring system for the Bahamas Beauty Pageant which collects the scores and displays them on the national TV network.  The system should be generic, i.e., usable for other competitions which are judged and where the scores are displayed in real time on TV.  The system must be robust and reliable, and rely only on computers that can readily be rented on location in developing countries.  The system must be so user-friendly that judges without computer experience can use it with no more than a few minutes of training or explanation.  The system must support multiple competitions, with score integrations and score summaries, and complex scoring mechanisms - such as dropping high & low scores, etc.

System architecture, software engineering:

•           The system must rely only on PCs running ordinary Windows (95 and up) connected through any type of LAN (most likely the lowest common denominator: peer to peer as supplied with Windows).

•           It must be possible to install and configure the system within an hour or two, as venues for such events are heavily booked and do not always allow for much set-up time.

•           The system must be installable from diskettes, as CDs are not universally provided in developing countries.

•           The system must be very robust and dependable, as these are one-shot events, and the performance (and possible failure) of the system is highly visible (on national TV).

•           The system must allow for real-time controls, since judging and display of scores on TV are tightly scheduled and choreographed.

User interface

•           The score entry interface must be extremely simple and intuitive

•           The scoring may be done by judges, some of whom have never used computers before, and who may be "computer-shy"

•           The event organizers generally allow little or no time for training the judges on site and in context.  There may be very little briefing as well.

•           As the judgment is "one-shot", the system must allow the judge to validate the score and to recover from entry errors.

•           The system must help the judges conform to the timing and scheduling imposed by the event organizers and TV producers.

•           Scores will be entered in distracting circumstances, with dim light etc. The judges may be drinking.

•           The system must be protected against accidental or malicious interference by either judges or bystanders.

Scientific or technological advancement

A distributed component approach was designed which is much simpler and more robust than the DCOM approach, but still supports real-time transactions

•           There are three components, a management component, a judges score entry component, and a TV output component.

•           Both the judges component and the TV output component are under full control from the management component

•           The judges component only uses the monitor and a mouse, but no keyboard - to reduce the chance of error and loss of control.

•           The judges component has timing alerts and a time countdown - The transaction (score entry) is aborted and defaulted if the score is not entered on time.  The timing cues are controlled from the management component and support the TV production scheduling.  Default scores are not shown on TV

•           The judge is warned with messages and count-down numbers

•           The TV output component shows individual scores and summary scores across candidates for completed competitions.

•           Countdown timing is done individually and independently by the judges component on each PC, but is triggered from the management component.  Timing is accurate to within fractions of a second and does not depend on the clock settings of each PC (which may vary significantly); a minimal sketch of this mechanism follows this list.

•           Only standard, locally obtainable PCs with basic network capabilities are used.  Configuration and installation are very quick; the management component takes slightly longer (approximately 2 hours) and can be configured ahead of time.
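
The countdown mechanism referred to above can be sketched as follows, assuming the local tick counter and a fixed entry window; the durations, the polling loop, and the missing score-entry path are illustrative only:

/* Minimal sketch of the per-PC countdown used by the judges component.
 * The countdown is started by a trigger from the management component and
 * measured with the local tick counter, so it does not depend on wall-clock
 * settings.  Durations are assumptions for illustration. */
#include <stdio.h>
#include <windows.h>

#define ENTRY_WINDOW_MS 20000   /* assumed time allowed to enter a score */
#define WARNING_MS       5000   /* assumed point at which warnings begin */

int main(void)
{
    DWORD start = GetTickCount();   /* set when the trigger arrives */
    int score_entered = 0;          /* set by the mouse-only entry panel in the real component */

    for (;;) {
        DWORD elapsed   = GetTickCount() - start;
        DWORD remaining = (elapsed < ENTRY_WINDOW_MS)
                              ? ENTRY_WINDOW_MS - elapsed : 0;

        if (score_entered)
            break;                  /* score arrived in time */

        if (remaining == 0) {
            printf("time out: score defaulted, display reset\n");
            break;
        }
        if (remaining <= WARNING_MS)
            printf("warning: %lu s left\n", (unsigned long)(remaining / 1000));

        Sleep(250);                 /* polling interval, well within the 1/4 s constraint */
    }
    return 0;
}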

 

Scientific or technological uncertainty

•           The greatest uncertainty arose from the one-shot nature of the event.  Everything had to work perfectly, with no chance to repeat or recover, as the judging for the pageant and the TV broadcast were live.

•           Within competitions, transaction timing is critical: judgments have to be collected as candidates walk across the stage, so that scores, with calculated averages, can be displayed on TV when the camera focuses on the candidates in their final stage position.

•           The user interface has to work properly in reminding judges to enter, verify, and approve their judgment in time, even though they are distracted by the show and would like to use extra time to evaluate the candidate.

•           The user interface for the judges has to time out and reset itself if the judge does not enter a score in time for a particular candidate, despite the reminders.  Remote controls for resetting and managing the score entry displays have to work properly over the network.

•           The management component has to work properly in waiting for the scores, applying the judgment algorithms (drop high & low, average, consider default scores ...) and sending the results to the TV output component.  Trade-offs in dealing with late judges and pressure from TV production have to be managed and supported.  Potential message and file timing conflicts have to be resolved correctly.

•           The TV output has to overlay properly, with proper genlock synchronization.

 

Description of the work

•           It was initially hypothesized that a traditional client-server approach with a data-base would be ideal for the job.  Some experimentation showed that this approach would not be satisfactory, either from a system perspective or a user interface perspective.

•           A three component design evolved after some experimentation.  Experimentation focused on the messaging and transaction management.  Shared file and shared memory approaches proved not sufficiently robust to handle timing conflicts.  A small-file information exchange plus semaphore system, combined with a state-transition polling approach, proved to be the most reliable.

•           There was some experimentation with the TV output.  The most robust approach proved to be a separate PC with a remotely controlled display state, and with all system windows and messages either suppressed or set to the overlay colour so that they would not interfere.

•           There was extensive experimentation with the user interface design for the judges.  Since real judges and the real event were not available, experimentation involved subjects trying to be difficult and trying to "beat the system".  Even so, they did not prove to be as difficult as some of the real judges turned out to be.

•           It was discovered very early that keyboards did not work in a dim room with slightly inebriated judges.  Touch panels would be nice but are not easily available for rent in these locations.  A remotely controlled, mouse-only interface proved to be the only solution, where the judge selects a score by clicking on large digits on the display.  (Even with this, one or two judges had to be shown how to slide the mouse rather than lifting it.)  Different displays have to handle different types of competitions - e.g., numeric scores and text-category scores.

•           There was more experimentation with reminding the judges when the scores have to be entered.  If the scores are not entered in time they cannot be used, as the TV display goes up while the candidate is still on stage.  Typically the next candidate is already walking onstage while the previous candidate is finishing.  Finally, a series of escalating reminders was devised, starting with text and icon reminders and ending with a bulls-eye countdown to defaulting the score and resetting the display.

•           There was some experimentation with the management component, since scores from previous, non-televised competitions need to be integrated into the summary judgments displayed on TV.

•           The system was used for the Miss Bahamas Beauty Pageant in mid-August.  Fortunately the system worked reasonably well, despite one or two difficult judges (inebriation was not anticipated), so that some scores were defaulted and dropped.  The main flaw of the system design was in the final selection.  The choreography script from a previous competition and the client specs both called for individual selections with a minimum time spacing between them.  The script was changed at the last minute, and the final selection had the top three finalists on stage side by side with a very fast sequential judgment.  The system proved awkward for this situation.

•           With more experimentation after the fact, a new type of scoring panel was devised to allow side by side comparisons.  This is quite tricky, since the finalists are not known until the last moment, and their relative positions are not pre-determined.  The management component and the messaging system had to be altered to allow for this type of comparative judgment.

 

Design objectives:

A low to medium speed general process control system using readily obtainable and easily replaceable components.  The system and the components must be easy and quick to set up and configure.  The system must be robust, reliable, and relatively fault tolerant for one-shot or mission critical applications.

Low to medium speed process control refers to processes where both the minimum and the maximum time resolution is in seconds or minutes rather than in microseconds or milliseconds. Time resolution applies to the sampling rate for data acquisition, the modification frequency of output controls, and the time synchronization between sampling points and control points.

Readily obtainable components refers to the use of general purpose PCs rather than PLCs (programmable logic controllers) and other specialized instrumentation.

Quick configuration and easily replaceable components refers to the time and to the specialized skill required to set up the process control system, or to replace and configure components.

Robust, reliable, and relatively fault tolerant:  Cheap components such as leased PCs can fail, or the power can be interrupted briefly.  The system as a whole should stay up even if a single part fails - and it should quickly come back on stream after appropriate replacement, without having to reset and reconfigure all the other components.

Application opportunities for such a design

The immediate opportunity was the request to develop a scoring system for a beauty pageant in the Bahamas.  For this application, the time constraint is approximately one quarter of a second.  The score is acquired from judges and displayed on the TV broadcast, overlaid with video of the contestant.  Timing constraints for the output arise both from synchronization with the show and from the timing of the broadcast, including advertising etc.  In general, the time synchronization does not have to be very close, i.e., not closer than within one quarter of a second.

A more long-range opportunity is the concept of delivering process control over the internet.  One possibility being explored is the unobtrusive monitoring of elderly persons living alone.  Other opportunities deal with more traditional process control applications, such as the remote monitoring of local process control applications.  The local process control application can be fast and provide the usual controls.  The remote process control monitors trends and tries to anticipate the requirement for human intervention.

Architecture

In general there are three types of stations:  a data acquisition station, a management/control station, and an output control station.  For the scoring system, data acquisition consisted of obtaining scores from the judges, so there was one separate station for each judge.  The output consisted of the signal to the TV network, so there was one output station.  There was also one management/control station.

The platform:  Hardware, OS, and Communication

For the initial platform, network enabled PCs were chosen, with Windows 95 or Windows 98, and with a peer to peer LAN network.  The LAN could also be based on Novell Netware or NT, or any other LAN that supports drive mapping.  NT is also fine as an alternative OS.

Alternatively, for remote (internet, WAN) applications, internet connected PCs can be used.  The same process control approach can be used with FTP, while keeping the present simplicity and robustness.  Time resolution and time synchronicity present more of a problem with this approach because of widely variable communication delays.

In the future, for remote (internet) applications, a browser-based approach will be explored, but there are still issues about integrating interrupt handling and timing loops into browser scripting.  If it proves to be robust, this approach reduces setup and configuration requirements to a near minimum.

Software components

For the beauty pageant scoring system there are 3 modules corresponding to the data acquisition component, the management/control component, and the output-control component.  The bottom level, the data exchange / timing synchronization layer, is essentially common across the three modules.  The business logic layer and the interface layer are distinct for each module.  The management/control module is the largest and most complex, with approx. 10,000 lines of code in C, including embedded documentation.

The data exchange - timing synchronization layer

Each of the modules uses timing intervals, using the system clock and interrupts.

•  the timing interval is approximately the same for each module

•  most state changes in the application are controlled with the timing interval

•  the timing intervals are not synchronous, since they are independent on each of the PCs, with different clocks, etc.

•  the timing intervals check for file flags to force synchronicity.  The duration of the interval thus determines the limits of synchronicity.

•  the timing interval, in conjunction with the small files, allows for external control of the application

Each of the modules uses small files in shared directories to communicate with the other modules.

•  some files are simply renamed, and the existence of the named file is used as a semaphore.

•  some files convey data, for configuration or other run-time controls, or for data processing

•  after some experimentation it was determined that it was faster and more robust to use multiple files with single uses rather than fewer files with multiple uses.

•  only a single module on a single station has write access - all others have read access only

This part of the system is relatively application independent, since it can be used for a variety of applications that do not have tight timing requirements but can advantageously use a process control approach.

This layer of the system represents the main technological advance, as well as the primary source of technological uncertainty.  To the best of my knowledge all commercially available process control systems have much tighter timing, but cannot meet the requirements for inexpensive components and fast configuration.
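
A minimal C sketch of this data exchange / timing synchronization layer, as seen from one station; the shared drive letter, file names, and polling interval are assumptions for illustration:

/* Minimal sketch of the data exchange / timing synchronization layer:
 * on each timing interval the module checks a shared directory for a
 * semaphore file, reads the associated data file, and writes its own
 * acknowledgement file.  Each file is written by exactly one station;
 * all other stations only read it.  Paths and the interval are assumed. */
#include <stdio.h>
#include <windows.h>

#define SHARED_DIR  "S:\\exchange"            /* mapped shared drive (assumed) */
#define SEMAPHORE   SHARED_DIR "\\go.sem"     /* written by the management station */
#define DATA_FILE   SHARED_DIR "\\score.txt"  /* written by the management station */
#define ACK_FILE    SHARED_DIR "\\judge1.ack" /* written only by this station */
#define INTERVAL_MS 500                       /* timing interval (assumed) */

static int file_exists(const char *path)
{
    return GetFileAttributesA(path) != INVALID_FILE_ATTRIBUTES;
}

int main(void)
{
    int handled = 0;

    for (;;) {                      /* state changes are tied to the timing interval */
        if (file_exists(SEMAPHORE)) {
            if (!handled) {
                char line[256] = "";
                FILE *f = fopen(DATA_FILE, "r");   /* read-only access */
                FILE *a;
                if (f != NULL) {
                    if (fgets(line, sizeof line, f) != NULL)
                        printf("received: %s", line);
                    fclose(f);
                }
                /* Acknowledge by creating a file this station owns. */
                a = fopen(ACK_FILE, "w");
                if (a != NULL) {
                    fputs("ok\n", a);
                    fclose(a);
                }
                handled = 1;
            }
        } else {
            handled = 0;            /* semaphore withdrawn: ready for the next cycle */
        }
        Sleep(INTERVAL_MS);
    }
    return 0;                       /* not reached; the real module runs until shutdown */
}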

The business logic layer:  data processing - data storage & retrieval

This part is application dependent.  The most interesting part, and the main technical advance of this component is the implementation of voting algorithms that can handle ties, and that avoid the intransitivities that are potential errors for adding / averaging algorithms.

Another challenge was the detection of voting inconsistencies by single judges.
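
For illustration, a minimal C sketch of the simplest scoring rule mentioned earlier (drop the highest and lowest scores, average the rest, and skip defaulted scores); the tie-handling voting algorithms and the inconsistency detection are not reproduced here, and the sample values are assumptions:

/* Minimal sketch of a drop-high/low averaging rule over judges' scores.
 * Defaulted (missing) scores are skipped; values are illustrative. */
#include <stdio.h>

#define DEFAULTED -1.0   /* marker for a score that timed out */

static double drop_high_low_average(const double *scores, int n)
{
    double sum = 0.0, hi = -1e9, lo = 1e9;
    int used = 0, i;

    for (i = 0; i < n; i++) {
        if (scores[i] == DEFAULTED)
            continue;                         /* defaulted scores are not counted */
        sum += scores[i];
        if (scores[i] > hi) hi = scores[i];
        if (scores[i] < lo) lo = scores[i];
        used++;
    }
    if (used <= 2)
        return used ? sum / used : 0.0;       /* too few scores to drop any */
    return (sum - hi - lo) / (used - 2);
}

int main(void)
{
    /* Seven judges; one score defaulted because it was not entered in time. */
    double scores[] = { 8.5, 9.0, 7.5, DEFAULTED, 8.0, 9.5, 8.5 };
    printf("candidate average: %.2f\n", drop_high_low_average(scores, 7));
    return 0;
}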

The interface layer:  data acquisition - management/control - output control

There are two user interfaces and one system interface.  The user interfaces dealt with data acquisition (collecting scores from judges) and management/control (synchronizing the judgments with the show, and synchronizing the output with the network requirements).  The system interface dealt with formatting and overlaying the scores, averages, rankings, etc. with the network TV signal for live broadcast.

The main technological advance relates to the time-driven nature of the user interface.  The collection of the scores could not be done at the convenience of the judges but had to be synchronized with the timing requirements of the live broadcast.  For this reason, data acquisition is a better description than user interface.

•  commercially available interface controls are passive, i.e., they wait for the response from the user

•  "active" controls may show pictures or play sounds, but they are not used to control the response timing from users

•  This application requires timing control over the judges.  The scores have to be delivered to suit network timing.  The judges, however, do not see the TV signal, they only see the show.  The user interface, therefore, had to control the timing of the responses of the judges.  The interrupt-driven time intervals were used to force the judges to synchronize their responses with network requirements.  The synchronization had to be controlled remotely, based on signals from the producers and on triggers and events on the live broadcast.  The technology of implementing this forced timing response constitutes the technological advance of the user interface component.

•  The technological uncertainty relates to the difficulty of using the user interface signals to control the timing of responses of the judges.  A contributing factor is that the judges are senior business leaders who volunteer for this role, may be computer illiterate, and do not receive any training.  If any judge does not score on time then fewer scores are displayed on TV, a very visible error.  While the judges cannot see the TV signals, and thus the scores, their friends and relatives do record the TV show, and these issues can lead to a debate on public forums such as radio, TV, and newspapers.

IDPC: Internet based wide area distributed process control

Scientific or technological objectives

System architecture, software engineering:

To identify, design, and develop the technology required to deliver distributed process control over the internet.

The general technological goal

To explore the feasibility of distributed process control over the internet, and to demonstrate this feasibility with small demonstration prototypes.  To identify and solve the technological challenges, and to identify the constraints and limits for this kind of use.

Distributed process control allows the integration of local controls into groups that support both trend detection and anticipative or feed-forward controls.  Such systems depend on LANs or other equivalent communication links.  The maturation of the internet promises to provide the communication base to support an extension of distributed process control to wide area distributed process control.

Application opportunities include device monitoring for home health care, and the unobtrusive monitoring of elderly persons living alone.  Other opportunities consist of more traditional process control applications, such as the remote monitoring of local process control applications.  The local process control application can be fast and can provide the usual controls.  The remote process control monitors trends and anticipates the requirement for human intervention.

The internet has primarily been used for applications which interact with users, who may have to wait for a response.  The rate of information flow changes in accordance with traffic and the time of day.  Process control, on the other hand, traditionally depends on fixed sampling periods with tightly controlled timing intervals.  A number of technological hurdles must be overcome to adapt the structure of process control and to adapt internet protocols for wide area distributed process control.

General technological constraints, to make the approach practical

•           The system should rely on PCs running Windows 2000 (preferably Windows 98), connected through any type of internet connection.

•           Ideally it should be possible to install and configure the system within several hours.  This process should be feasible for persons with a relatively low level of computer expertise who are following instructions.

•           The system should be self-diagnosing, and maintainable with help-desk support.

•           The system should be very robust and dependable, with automatic error and delay recovery, for applications such as health care.

 

Scientific or technological advancement sought

Distributed process controls (DPCs) connect sensors and controllers to central processing units via LANs or equivalent communication links.  DPCs almost always use fixed time sampling of sensor data assembled into database tables.  The data is analyzed and interpreted to check against thresholds and limits and to extract trends.  The control algorithms derive control parameters which are sent to controllers.  A standard fixed time sampling approach cannot be supported over the internet because of variable communication delays.  Alternative algorithms and associated technologies must be explored.  Additional layers may have to be added to the control architecture to deal with the risk of long delays and internet down times.

Time sampling and communications infrastructure

A new time sampling algorithm was designed, prototyped, and explored with different internet protocols.  The new approach separates sensor data time sampling, communication time intervals, and data representation in the database for the central controller.
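
A minimal C sketch of this separation, with assumed intervals and a placeholder sensor; it only illustrates that sampling, buffering, and communication run on independent schedules and that samples carry their own timestamps:

/* Minimal sketch: local sensor sampling and remote communication run on
 * separate intervals; samples carry their own timestamps so the central
 * controller can rebuild the time series despite variable internet delays.
 * Intervals, the sensor stub, and the transmit routine are illustrative. */
#include <stdio.h>
#include <time.h>
#include <windows.h>

#define SAMPLE_INTERVAL_MS 5000    /* local sampling rate (assumed)    */
#define SEND_INTERVAL_S      60    /* communication interval (assumed) */
#define BUF_MAX             256

typedef struct {
    time_t when;     /* timestamp recorded locally at sampling time */
    double value;    /* sensor reading */
} sample_t;

static double read_sensor(void) { return 21.5; }   /* placeholder reading */

/* In the real system this would be an FTP or HTTP transfer; here the
 * batch is printed to show that timestamps travel with the data. */
static void transmit(const sample_t *s, int n)
{
    int i;
    for (i = 0; i < n; i++)
        printf("send %ld %.2f\n", (long)s[i].when, s[i].value);
}

int main(void)
{
    sample_t buffer[BUF_MAX];
    int count = 0, i;
    time_t last_send = time(NULL);

    for (i = 0; i < 24; i++) {              /* bounded loop for the sketch */
        if (count < BUF_MAX) {
            buffer[count].when  = time(NULL);
            buffer[count].value = read_sensor();
            count++;
        }
        if (time(NULL) - last_send >= SEND_INTERVAL_S) {
            transmit(buffer, count);        /* batch goes out on its own interval */
            count = 0;
            last_send = time(NULL);
        }
        Sleep(SAMPLE_INTERVAL_MS);
    }
    transmit(buffer, count);                /* flush what is left */
    return 0;
}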

Control architecture with strategies and algorithms

Distributed process control allows the integration of local sensors and controls into groups that support both trend detection and anticipative or feed-forward controls.  Such systems depend on LANs or other equivalent communication links.  The maturation of the internet promises to provide the communication base to support an extension of distributed process control to wide area distributed process control.

A new control architecture was explored.  This architecture has a local and remote component to deal with the possibility of unreasonable communication delays and communication failure.  The work this year was very preliminary.  Much more research is needed in order to investigate different types of applications with different time rates and with different control parameters and control strategies.

 

Scientific or technological uncertainty

Time sampling and communications infrastructure

•           This concept, to my knowledge, is novel and has neither been implemented nor proven.  It is also not yet clear what the boundaries of usability are.

•           Part of the uncertainty arises from the risk of unreasonable communication delays and the need for acceptable solutions, e.g., using alternative communication channels, shutting down the process, just waiting, reinitializing communication, etc.

•           The primary uncertainty at present is feasibility.  Cost and performance concerns contribute to the uncertainty.

•           Complexity of the technology contributes to the uncertainty.  Distributed process controls are not operated by the engineers who design and implement them, but rather by operators.  One aspect of the applied research and feasibility study is to simplify the technology to make it stable and maintainable.

 

Description of the work

Time sampling and communications infrastructure

1. Experimentation with e-mail as a communication mechanism, with data embedded and with data attached in data files.

•           Several experiments were run, both with and without attachments, and through different mail providers and protocols.

•           The concept of using e-mail is attractive because many people are used to it, so that it is an easy extension to existing internet access arrangements with service providers.  Unfortunately mail is routed through mail exchanges and the delays in communication vary tremendously.  This approach was deemed not feasible.

2. Experimentation with hypertext transportation protocols as communication mechanisms, with the central process controller as web server, and Active Server Pages plus Visual Basic as mechanism on the server - either with browsers plus Java or J1 as mechanisms on the distributed systems, or by using Windows Internet APIs in integrated applications.

•           Several experiments were run, using NT4, MS-Proxy, and IIS as web server on an internet gateway, and Visual FoxPro on another NT4 connected to the gateway by LAN as central process controller.

•           These experiments were considered to be reasonably successful.  Automatic recovery from communication errors was identified as an additional technological challenge.  Also, integration with local sensor data acquisition and control loops is less than ideal on NT4 and difficult and limited on Windows98 clients.

3. Experimentation with file transfer protocols as communication mechanisms, with the central process controller as FTP server, and by using Windows Internet APIs for file copy operations in integrated applications on the clients.

•           Several experiments were run, using NT4, MS-Proxy, and IIS as FTP server on an internet gateway, and Visual FoxPro on another NT4 connected to the gateway by LAN as central process controller.

•           These experiments were considered to be more successful than the http approach.  Automatic recovery from communication errors was less problematic.  Also, the integration with local sensor data acquisition and control loops was found to be easier.

4. Experimentation with Routing and Remote Access (RRAS), as well as with Virtual Private Network (VPN).

•           This technology was quite immature in 1999, but it promises much better support for internet-based distributed process control.

•           Experimentation dealt primarily with trying to get it to work properly, and work is progressing into 2000.  When mature and stable, this promises to be the best approach for process control, but at the moment it is not stable or properly usable.  Once operational, it promises proper integration with local sensor data acquisition and control loops, so that only communication delays and failures remain as challenges.

Communications and control

1. Experimentation with WinInet in C as a communication mechanism via an IIS http server.

•           Several experiments were run, exploring coordination between a number of client machines, using single and multiple server pages, and http as well as https.

•           Both synchronous and asynchronous communication were explored to investigate the advantages and limitations of these approaches.

•           Communication via intranet, internet, and proxy were compared.

2. Experimentation with WinInet in C as a communication mechanism via an IIS ftp server.

•           Several experiments were run, exploring coordination between a number of client machines, using stub files, as well as single and multiple directories.

•           Communication via intranet, internet, and proxy were compared.
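
A minimal C sketch of the WinInet FTP exchange used in these experiments; the host name, credentials, and file paths are placeholders, and error handling is reduced to the bare minimum:

/* Minimal sketch of a WinInet FTP upload: the client station copies a
 * small data file onto the central process controller's FTP server.
 * Host name, credentials, and file names are placeholders. */
#include <stdio.h>
#include <windows.h>
#include <wininet.h>
#pragma comment(lib, "wininet.lib")

int main(void)
{
    HINTERNET session, ftp;

    session = InternetOpenA("idpc-client", INTERNET_OPEN_TYPE_PRECONFIG,
                            NULL, NULL, 0);
    if (session == NULL) {
        fprintf(stderr, "InternetOpen failed: %lu\n", GetLastError());
        return 1;
    }

    ftp = InternetConnectA(session, "ftp.example.com",
                           INTERNET_DEFAULT_FTP_PORT,
                           "sensor01", "secret",
                           INTERNET_SERVICE_FTP, INTERNET_FLAG_PASSIVE, 0);
    if (ftp == NULL) {
        fprintf(stderr, "InternetConnect failed: %lu\n", GetLastError());
        InternetCloseHandle(session);
        return 1;
    }

    /* Upload the locally buffered samples; the server side polls the
     * upload directory on its own timing interval. */
    if (!FtpPutFileA(ftp, "samples.txt", "/incoming/samples.txt",
                     FTP_TRANSFER_TYPE_ASCII, 0))
        fprintf(stderr, "FtpPutFile failed: %lu\n", GetLastError());
    else
        printf("samples uploaded\n");

    InternetCloseHandle(ftp);
    InternetCloseHandle(session);
    return 0;
}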

 

Control architecture with strategies and algorithms

1. Some analysis and design work focused on the separation of time sampling for sensor data collection, time delays in communication, and time sampling for trend detection.

•           data aggregation after delayed communication

•           limit detection and trends when data from different locations is not synchronous

2. Some very preliminary experimentation explored the design in a prototype using http, text file stubs for the data, and Visual FoxPro to accumulate the data.
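
A minimal C sketch of the aggregation idea explored here: timestamped samples arriving late and out of order are folded into fixed time buckets before limit detection; the bucket width, threshold, and sample values are assumptions:

/* Minimal sketch of data aggregation after delayed communication:
 * timestamped samples arriving out of order are folded into fixed time
 * buckets, after which limit detection runs per bucket.  Bucket width
 * and the alarm threshold are illustrative. */
#include <stdio.h>

#define BUCKET_S   60      /* one-minute buckets (assumed) */
#define N_BUCKETS  10
#define LIMIT      25.0    /* alarm threshold (assumed)    */

typedef struct { double sum; int n; } bucket_t;

int main(void)
{
    /* Samples as (timestamp offset in seconds, value); arrival order
     * differs from sampling order because of communication delays. */
    struct { long t; double v; } samples[] = {
        { 130, 24.1 }, { 10, 22.0 }, { 70, 23.5 },
        { 200, 26.2 }, { 65, 23.9 }, { 15, 22.4 },
    };
    bucket_t buckets[N_BUCKETS] = { 0 };
    int i;

    for (i = 0; i < (int)(sizeof samples / sizeof samples[0]); i++) {
        int b = (int)(samples[i].t / BUCKET_S);
        if (b >= 0 && b < N_BUCKETS) {
            buckets[b].sum += samples[i].v;   /* aggregate by sampling time */
            buckets[b].n++;
        }
    }

    for (i = 0; i < N_BUCKETS; i++) {
        double avg;
        if (buckets[i].n == 0)
            continue;                          /* data not yet arrived */
        avg = buckets[i].sum / buckets[i].n;
        printf("bucket %d: avg %.2f%s\n", i, avg,
               avg > LIMIT ? "  ** over limit **" : "");
    }
    return 0;
}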

 

Results and plans for the future

•           Results were mixed.  The general approach seems to be heading in the right direction.  The blackboard approach has some advantages in maintainability.  Timing is still very problematic, with noticeable internet 'rush-hours', so this approach holds more promise for overnight remote monitoring than for daytime monitoring.  There are still issues with ISPs, proxy servers, and security.